Postgres offers set operators that make it easy to query and filter the results of searches from your database. Set operators are used to join the results of two or more SELECT statements. These operators are UNION, UNION ALL, INTERSECT, and EXCEPT—each can be used to construct queries across mutliple tables and filter for the specific data that you need.
To return the combined results of two SELECT statements, we use the UNION set operator. This operator removes all the duplicates from the queried results—only listing one row for each duplicated result. To examine this behavior, the UNION ALL set operator can be used, as it retains duplicates in the final result. The INTERSECT set operator only lists records that are shared by both SELECT queries, and conversely, the EXCEPT set operator removes the results from the second SELECT query. Thus, the INTERSECT and EXCEPT set operators are used to produce unduplicated results.
All set operators initially share the same degree of precedence (except for INTERSECT, which we'll discuss later). Because parentheses are prioritized above dangling operators, they can cause the order to differ.
All SELECT statements that are used with a SET operator must select the same number of columns. The columns used in the display are obtained from the first query.
The column-list data types must be implicitly convertible by Postgres. Postgres does not perform implicit type conversion if the corresponding columns in the queries belong to different data types. If a column in the first query of type INT, and the corresponding column in the second query is of type CHAR, Postgres will not perform an implicit conversion—instead, it will raise a type error.
To sort the result set, positional ordering is used. Set operators do not allow individual result set ordering. At the end of the query, ORDER BY can only appear once.
The order of the queries in UNION and INTERSECT operators is not important, and doesn't change the final result - UNION and INTERSECT operators are commutative.
UNION ALL has better performance compared to the UNION operator because resources are not used in filtering duplicates and sorting the results.
- It is possible to use set operators as part of subqueries.
- It is not possible to use set operators in SELECT statements containing TABLE collection expressions.
Creating Postgres Tables for Our Examples
Before we continue exploring the set operators, we need to create tables that we will be using throught the tutorial:
DROP TABLE IF EXISTS top_rated_films; CREATE TABLE top_rated_films( title VARCHAR NOT NULL, release_year SMALLINT); DROP TABLE IF EXISTS most_popular_films; CREATE TABLE most_popular_films( title VARCHAR NOT NULL, release_year SMALLINT); INSERT INTO top_rated_films(title, release_year) VALUES ('Fast and furious 7', 2011), ('The redemption', 2015), ('Shooter', 2017); INSERT INTO most_popular_films(title, release_year) VALUES ('American snipper', 2015), ('Doulou continent',2018), ('Outpost',2019), ('Shooter', 2017);
Joining Multiple SELECT Queries with UNION
Postgres displays the combined results from all of the compounded SELECT queries after removal of duplicate values, and the results are sorted in ascending order when the UNION operator joins multiple SELECT queries:
SELECT * FROM top_rated_films UNION SELECT * FROM most_popular_films
the columns selected must be of the compatible data types, otherwise Postgres throws a type error:
SELECT title FROM top_rated_films UNION SELECT release_year FROM most_popular_films //error UNION types character varying and smallint cannot be matched
The Difference Between UNION and UNION ALL
The UNION and UNION ALL operators are similar—the difference being that UNION ALL returns the result set without removing duplicates and sorting the data.
Check the query in the UNION section below. Make sure to note the difference in the output that is generated without sorting and removal of duplicates:
SELECT * FROM top_rated_films UNION ALL SELECT * FROM most_popular_films
Find Overlapping Data from Multiple SELECT Queries Using INTERSECT
To display the common rows from both the select statements with no duplicates, and with data arranged in sorted order; the INTERSECT operator is used.
The shooter movie is returned because it is available in both top rated and most popular movie tables.
SELECT * FROM most_popular_films INTERSECT SELECT * FROM top_rated_films;
Returning Unique Rows with EXCEPT in Postgres
To display rows that are present in the first query but absent in the second one, the EXCEPT operator is used. It returns no duplicates, and the data is arranged in ascending order by default.
SELECT * FROM most_popular_films EXCEPT SELECT * FROM top_rated_films
Using the ORDER BY clause in SET operations
In a query containing a compound SELECT statement, the ORDER BY clause can only appear once at the end. This indicates that individual SELECT statements cannot contain the ORDER BY clause. Sorting is based on the columns that appear in the first SELECT query only. It is recommended to sort compound queries using column positions.
The query below sorts the results from two departments and unifies them:
SELECT * FROM most_popular_films UNION ALL SELECT * FROM top_rated_films ORDER BY title;
Now that you have learned how to use Postgres set operators, you can optimize your database queries with them to retrieve and compare data across different columns—allowing for a more streamlined data retrieval process.