Think that FROM only specifies tables in an SQL statement? Think again …
When I first began to work with SQL, I saw the
FROM clause as the simplest part of any query. You’d put only one table there, or may be two or more tables in the case of a join. That was all I thought about the
FROM clause in those days.
As I began to learn more about SQL, I discovered that there are a lot of techniques that can be used in a
FROM. Often, these techniques are the key that unlocks the solution of a complex query.
In this post, we’re going to consider everything that goes into a
FROM clause: one table, multiple tables (in a join), a view, a materialized view, a temporary table, a subquery, or a blend of any of the above.
We will use the following tables,
game, to illustrate our examples.
The Simplest FROM: Just One Table
The simplest way to use the
FROM clause is to put just one table in it. Suppose we want to obtain all the team names from the above table. We can use the following simple query:
SELECT name FROM team
Lions Penguins Blackbirds
Multiple Tables: A Join
We can also have more than one table in the
FROM clause. When we need to read data from two or more tables, we join these tables by listing them in the
FROM team, game
A join is a really important operation in SQL and it is one of the building blocks of the relational database model. Please note that when join tables we also need to include in a
ON clause to associate records from one table with their corresponding records in another table. In our example, we use the
WHERE clause to join records from the
game tables. This specifies a join based on the values in the
WHERE team.team_id = game.team_1
If we put two tables in the
FROM clause and we forget the join condition, we have a problem. The query will return a result set called a Cartesian product, which includes all the possible combinations of all matching records in both tables. Cartesian products should almost always be avoided, although under some specific situations they can be necessary.
Here is an example of a proper join. Notice the
SELECT * FROM team, game WHERE team.team_id = game.team_1
team_id name founded team_1 team_2 match_day stadium points_1 points_2 1 Lions 1900-01-01 1 2 2017-03-05 PenguinHouse 2 1 1 Lions 1900-01-01 1 3 2017-03-12 SuperArena 1 0 1 Lions 1900-01-01 1 2 2017-03-05 PenguinHouse 2 1
You can also use an alternative syntax to join tables. It looks different, but it returns the same results. This alternative syntax uses two reserved words:
ON. Here is the same query presented in the alternative syntax:
SELECT * FROM team JOIN game ON team.team_id = game.team_1
Both syntaxes are equivalent in terms of results. However, using the
JOIN clause gives you some extra options, like using an
OUTER clause to manage rows that have no corresponding records.
The Cartesian Product
Let’s go back to the Cartesian product. When would we want this in our result set? Suppose we are planning a tournament where every team must play against every other team. We can obtain a list of all possible matches (a Cartesian product) with the following SQL:
SELECT t1.name, t2.name FROM team t1, team t2
Lions Lions Lions Penguins Lions Blackbirds Blackbirds Lions Blackbirds Penguins Blackbirds Blackbirds Penguins Lions Penguins Penguins Penguins Blackbirds
This example is a pure Cartesian product. However, there are some pairs that we don’t want in the result – e.g. having a team play against itself (Lions vs. Lions) or having a redundant match (Lions vs. Penguins and Penguins vs. Lions). To exclude these pairs, we add a condition to the
SELECT t1.name, t2.name FROM team t1, team t2 WHERE t1.name > t2.name
Lions Blackbirds Penguins Blackbirds Penguins Lions
The condition in the
WHERE clause discards pairs of teams playing against itself or for duplicate matches to appear in our list.
A Self Join
A self join is a join where both tables are the same, mainly when a table have a foreign key pointing to the same table. Self joins can also be made by joining one table to itself. For example, you may use a self join to compare two columns in the same table against each other.
Suppose we have a table called
employee with columns called
manager_id. The following SQL uses a self join to compare these columns:
SELECT E1.name as employee, E2.name as boss FROM employee E1, employee E2 WHERE E1.manager_id = E2.employee_id
E2 are aliases for the
employee table. SQL does not allow you to use the same table name more than once per statement, so you must use table aliases to get around that rule. In this example, SQL is treating the
employee table as two different tables. The aliases differentiate between the “two tables”.
Using a View in the FROM Clause
Another option is to use a named view in the
FROM clause. A view appears to be a table but is not. A view is part of a table – it is actually a subset of records displayed by a
SELECT – but it doesn’t store records. As far as the
FROM clause is concerned, a view is treated like a table.
Let’s see an example. Suppose we need a view that shows any teams that have never lost a game. We will create a view titled
invictus to select the appropriate teams. The code is:
CREATE VIEW invictus AS SELECT name FROM team T WHERE NOT EXISTS ( SELECT * FROM game G WHERE (G.team_1 = T.team_id AND G.points_1 < G.points_2) OR (G.team_2 = T.team_id AND G.points_1 > G.points_2) )
Now we can use the view to obtain the unbeaten teams:
SELECT * FROM invictus
Using a Materialized View in the FROM Clause
A materialized view is similar to a view; it is also created using
SELECT statements on base tables. However, materialized views can store records. These records can be obtained or calculated from regular tables or even other views, so materialized views introduce a kind of redundancy. The records in materialized views are synchronized with the relevant base tables using the
We can use a materialized view in the
FROM clause, just as we did in the previous section. Let’s create a materialized view called
mvw_results using the columns
match_day. We will create a new column,
result, which will store a “WON”, “TIE” or “LOST” value for every game played by the team.
CREATE MATERIALIZED VIEW mvw_results AS -- won matches SELECT team_1 as team_id, match_day, ‘WON’ FROM game WHERE points_1 > points_2 UNION SELECT team_2 as team_id, match_day, ‘WON’ FROM game WHERE points_2 > points_2 -- tied matches UNION SELECT team_1 as team_id, match_day, ‘TIE’ FROM game WHERE points_1 = points_2 UNION SELECT team_2 as team_id, match_day, ‘TIE’ FROM game WHERE points_2 = points_2 -- tied matches UNION SELECT team_2 as team_id, match_day, ‘LOST’ FROM game WHERE points_1 > points_2 UNION SELECT team_1 as team_id, match_day, ‘LOST’ FROM game WHERE points_2 > points_2
Using a Subquery in the FROM Clause
We can also enclose a
SELECT in parenthesis, assign a name to it (thus making it a named subquery), and put it in the
FROM clause. The database will treat it as a regular table.
Let’s see an example. Suppose we want to rank every team in the tournament. To do this, we need to calculate how many points each team has. Two points are awarded for each win and one for each tie.
First, we will create two subqueries: one called
q_won that uses the columns
name of team and
quantity of won matches, and another called
q_tied with the same columns. Let’s use the previous materialized view when we create the two subqueries in the
SELECT T.name, (COALESCE(sq_won.quantity_won,0) * 2 + COALESCE(sq_tie.quantity_tied,0) * 1 ) points FROM (team T LEFT OUTER JOIN (SELECT team_id , count(*) quantity_won FROM mvw_results WHERE result ='WON' GROUP BY team_id) AS sq_won ON T.team_id = sq_won.team_id ) LEFT OUTER JOIN ( SELECT team_id , count(*) quantity_tied FROM mvw_results WHERE result ='TIE' GROUP BY team_id) AS sq_tie ON T.team_id = sq_tie.team_id ORDER BY points DESC
Using A Temporary Table in the FROM Clause
The last option we will review is the use of a temporary table in the
FROM clause. A temporary table is created during the current Postgres session; it belongs to the session, and other sessions cannot access it. Moreover, other sessions can have a temporary table with the same name.
Temporary tables are destroyed when the session ends or when a
DROP TABLE statement is executed. In the following example, we will do the same query as we did with the subquery example. The difference is that we will use two temporary tables instead of the subqueries. Here is the code:
CREATE TEMPORARY TABLE sq_won AS SELECT team_id , count(*) quantity_won FROM mvw_results WHERE result ='WON' GROUP BY team_id; CREATE TEMPORARY TABLE sq_tie AS SELECT team_id , count(*) quantity_tied FROM mvw_results WHERE result ='TIE' GROUP BY team_id; SELECT T.name, (coalesce(sq_won.quantity_won,0) * 2 + coalesce(sq_tie.quantity_tied,0) * 1 ) points FROM (team T LEFT OUTER JOIN sq_won ON T.team_id = sq_won.team_id ) LEFT OUTER JOIN sq_tie ON T.team_id = sq_tie.team_id ORDER BY points DESC;
So, we did the same query using subqueries and using temporary tables. Is there any difference between these approaches? Yes. We can create indexes on temporary tables; if we need to improve performance, the temporary table approach enables that.
Try It Yourself
In this article, we saw the many things we can use in the
FROM clause: one table, joined tables, views, materialized views, temporary tables, and subqueries. SQL is flexible and allows us to combine all these elements in a single
SQL offers several ways to write the same query. As a challenge to the reader, I’m going to mention that it is possible to obtain a materialized view using a
CASE statement instead of a
UNION. How would you do this?
For more SQL information, check out Vertabelo SQL Academy. It’s free to try and covers lots of good stuff!