SQL INNER JOIN Explained in Simple Words

TODO

In this article, I will cover perhaps the most important element of the SQL language. That's right! I'm talking about the SQL INNER JOIN clause.

As you know, in a database the data are stored in several tables. In my last article, I covered how to create SQL queries when the data you need is in only one table. But what if the data you need is in two tables? For this case, you need to use a JOIN clause to combine both tables, which is something similar to connecting both tables. In this article, I will explain the SQL INNER JOIN clause in simple words using some easy to understand examples.

A Travel Agency Database

Perhaps you already know that in a database the data are stored in tables. Suppose you work for a travel agency, and the agency database has two tables: TRAVEL_PACK and CITIES. In the following image you can see the database, of course not all data are being shown.

Sample of TRAVEL_PACK Table

PACK_NAMEBEST_SEASONPRICEDURATION
United States: Big CitiesAll year$3500.0010 days
United States: West CoastMarch to May$3700.0012 days
South American TourAugust to November$1850.0018 days
Beaches of BrazilDecember to March$2200.007 days

Sample of CITIES Table

NAMETYPE_OF_CITYPACK_NAMEDAYS_STAYING
San FranciscohistoricalUnited States: West Coast5
WashingtonhistoricalUnited States: Big Cities3
New YorkbusinessUnited States: Big Cities7
Rio de JaneirobeachBeaches of Brazil4
UshuaiamountainSouth American Tour3
Salvador de BahiabeachBeaches of Brazil3
Los AngelesbeachUnited States: West Coast7

Please note: In order to use a JOIN clause to combine two tables, there must be a field or column which appears in both tables.

In the sample database this field is the name of the travel packet. As you can see in the TRAVEL_PACK table, the name of the pack is under the column PACK_NAME, and in the CITIES table the name of the pack is in the column PACK_NAME too. It is easy to see that both columns have the same column name, however in other databases it can be different. In other words, when using an INNER JOIN clause the name of the common column can be different in both tables or not. Anyway, remember the column name PACK_NAME, because you will need it later to write the JOIN clause.

Your First SQL JOIN: Obtain the Travel Packets Including Beach Cities

I usually recommend starting out by thinking about where the data are located that you need for the query. In this case it is clear that you will need the TRAVEL_PACK table to obtain the names of the packs, but at the same time you will also need the CITIES table, because you need to check if the city is a beach city or not. So, the data needed for this query are in two tables, which is the reason that you need to use an SQL JOIN.

When you have to obtain data from two (or more) tables, you must use an SQL JOIN. The SQL JOIN acts as a connector between two tables, creating pairs of records. Basically it takes two records (one from each table) and joins them into a pair of records. This kind of join is called an INNER JOIN, and in SQL the terms JOIN or INNER JOIN are exactly the same. For those readers who want to go deeper, there are other SQL join types, for example the opposite of the INNER join is another join type called LEFT join or OUTER LEFT join, and you can also find other join types like RIGHT join, NATURAL join and LATERAL join among others. You can learn many of these SQL join types in the SQL JOINs online course. However, in this article I will focus on the INNER JOIN, which is really powerful!

The next image shows how the JOIN clause creates a pair of records:

A GIF representing how inner joining of tables works

(click to enlarge)

Now, I will show the syntax to implement an INNER JOIN, by using the next pattern

FROM table1 
INNER JOIN table2 ON common column in table1 = common column in table2

If you fill the pattern with your sample tables and column names, then you will have the next INNER JOIN clause:

FROM CITIES 
INNER JOIN TRAVEL_PACK ON CITIES.PACK_NAME = TRAVEL_PACK.PACK_NAME

FROM CITIES Once you complete the FROM clause with the INNER JOIN, you can continue working on the SELECT and WHERE clauses of the query. The SELECT clause is easy because you only need to return the column NAME, then the clause will be:

SELECT TRAVEL_PACK.PACK_NAME

Note the reference to the column PACK_NAME is done by adding a prefix with the table name (TRAVEL_PACK in this case). This is a good practice to apply when the query involves more than one table.

In the WHERE clause you will need to filter the results to include only beach cities. Then the clause will be:

WHERE CITIES.TYPE_OF_CITY = 'beach'

Finally if you put all the clauses together the query will be:

SELECT TRAVEL_PACK.PACK_NAME
FROM CITIES 
JOIN TRAVEL_PACK ON CITIES.PACK_NAME = TRAVEL_PACK.PACK_NAME
WHERE CITIES.TYPE_OF_CITY = 'beach'

Following is the result of the previous query:

NAME
Los Angeles
Salvador de Bahia
Rio de Janeiro

Your Next Join: the Cities Included in Travel Packs Cheaper Than $2500

Again, similar to the previous query, you can see you need to access two tables for this query, using a JOIN for the CITIES and TRAVEL_PACK tables again. Note I use the term JOIN instead of INNER JOIN because both are exactly the same in SQL.

From the previous query you already know how to create the JOIN clause. The changes are in the SELECT and WHERE clauses, as you can see in the following paragraphs.

The select clause is easy, because you only need the city name:

SELECT CITY.NAME

In the where clause you need to filter by price of the travel pack, then the WHERE clause will be:

WHERE TRAVEL_PACK.PRICE <= 2500

Then, the complete query will be the following:

SELECT CITY.NAME
FROM CITIES 
JOIN TRAVEL_PACK ON CITIES.PACK_NAME = TRAVEL_PACK.PACK_NAME
WHERE TRAVEL_PACK.PRICE <= 2500

The result of the previous query will be:

NAME
Rio de Janeiro
Ushuaia
Salvador de Bahia

Deep Diving: INNER JOIN for Three Tables

Is it possible to use an INNER JOIN with more than two tables? Sure it is. Moreover, you can use a JOIN clause with as many tables as you wish.

However, I must mention that each SQL JOIN clause is only for a single pair of tables. So if you have three tables to join, such as T1, T2, and T3, you will need two SQL JOIN clauses like you can see in the following example:

FROM T1 JOIN T2 ON …… 
				JOIN T3 ON …....

I left the ON clause empty because I am going to explain this part with the following example. Suppose you extend the agency database by adding a new table called STATES where you plan to store the states, regions, or provinces of each country. The database with three tables will be like the following picture:

TRAVEL_PACK Table

PACK_NAMEBEST_SEASONPRICEDURATION
United States: Big CitiesAll year$3500.0010 days
United States: West CoastMarch to May$3700.0012 days
South American TourAugust to November$1850.0018 days
Beaches of BrazilDecember to March$2200.007 days

STATES Table

NAMECOUNTRYPOPULATIONLANGUAGE
New YorkUnited States17000000english
Tierra del FuegoArgentina190000spanish
CaliforniaUnited States13400000english
Rio de JaneiroBrasil15000000portuguese
BahiaBrasil8000000portuguese

CITIES Table

NAMETYPE_OF CITYPACK_NAMEDAYS_STAYSTATE
San FranciscohistoricalUnited States: West Coast5California
WashingtonhistoricalUnited States: Big Cities3Washington
New YorkbusinessUnited States: Big Cities7New York
Rio de JaneirobeachBeaches of Brazil4Rio de Janeiro
UshuaiamountainSouth American Tour3Tierra del Fuego
Salvador de BahiabeachBeaches of Brazil3Bahia
Los AngelesbeachUnited States: West Coast7California

Suppose you need to obtain a report with a list of all cities with the state, the pack name, and the language used in the city in the states where the language is Spanish or Portuguese. It is clear that you need to JOIN the tables CITIES and STATES and then JOIN this result also with the TRAVEL_PACK table. You can start by applying what you learned in the previous examples.

First, use a JOIN clause to combine the table CITIES with the table STATES. It is clear you need to use the columns CITIES.STATE and STATE.NAME in the ON clause. Then the from clause will be the following:

FROM CITIES JOIN STATES ON CITIES.STATE = STATES.NAME 

Next, you need to add the third table to the JOIN clause, which is the table TRAVEL_PACK. You already learned how to JOIN this table in a previous example, so the complete JOIN will be:

FROM CITIES JOIN STATES ON CITIES.STATE = STATES.NAME
  JOIN TRAVEL_PACK ON CITIES.PACK_NAME = TRAVEL_PACK.PACK_NAME

Adding the SELECT and WHERE clauses, the complete query will be like the following:

SELECT CITIES.NAME, STATES.NAME, TRAVEL_PACK.PACK_NAME, STATES.LANGUAGE
FROM CITIES JOIN STATES ON CITIES.STATE = STATES.NAME
  JOIN TRAVEL_PACK ON CITIES.PACK_NAME = TRAVEL_PACK.PACK_NAME
WHERE STATES.LANGUAGE IN ('spanish','portuguese')

And the result will be:

NAMENAMEPACK_NAMELANGUAGE
SalvadorBahiaBeaches of Brazilportuguese
Rio de JaneiroRio de JaneiroBeaches of Brazilportuguese
UshuaiaTierra del FuegoSouth American Tourspanish

Next Steps to Continue Learning

In this article, I covered how to join tables in SQL. I explained by using examples of what an SQL INNER JOIN is and how the JOIN clause works. The JOIN clause opens up a huge world of new possibilities in SQL. Think that any non-technical person with the capacity to use the JOIN clause in SQL queries has extra value as an employee, regardless of which area of the company he or she works. Go deep with SQL JOINs and be a solid business contributor in your company!

Ignacio L. Bisso

Ignacio is a database consultant from Buenos Aires, Argentina. He’s worked for 15 years as a database consultant for IT companies like Informix and IBM. These days, he teaches databases at Sarmiento University and works as a PostgreSQL independent SQL consultant. A proud father of four kids with 54 years in his backpack, Ignacio plays soccer every Saturday afternoon, enjoying every match as if it’s his last one.

comments powered by Disqus