If you’re in the US, chances are you’ve been eagerly awaiting the approach of Black Friday just as much as Thanksgiving. Though the shopping frenzy takes hold of nearly everyone, some people have to stick to their budgets and shop prudently. In this article, we’ll take a look at how generating an SQL report can help you track how much your family spent shopping on Black Friday.
Storing Black Friday Purchases in a Database
Before we can create an SQL report, we first need some data we can use. Our example database consists of two tables:
person table is quite simple—it stores the names of family members who spent money on Black Friday, as well as their associated identifiers.
|2||Vertabelo “SQL Basics” course||29||40||1|
product table stores detailed information about all products the family purchased on Black Friday, including prices, discounts, and identifiers of the people who made the purchases.
Who Bought the Most Items?
Now that we’ve stored our information in a database, we can begin creating an SQL report to summarize our data. Let’s begin with the following simple query:
SELECT person_id, COUNT(id) FROM product GROUP BY person_id;
SELECT clause, we’ve chosen to work with the column named
person_id, which stores the identifiers of our family members. For each person, we need to compute the number of products they purchased, so we’ll use the
COUNT aggregate function. We’ll pass in the name of the column that stores product identifiers (
id) as the argument for this function. Essentially, this allows us to count the number of times a family member’s identifier appears in the table of purchases.
Since we listed the column
person_id in our
SELECT clause, we also need to pass it in as a parameter to the
GROUP BY clause. Remember: all columns outside aggregate functions must be listed in a
GROUP BY clause.
Here’s the result of executing that query:
The values stored in the column named
person_id are the identifiers of the family members who made the purchases, and the values stored in the column count indicate how many products they purchased.
There’s one problem, though. We’d rather not have to go back and forth between the result table and the original
person table to determine who’s who. Thus, we’ll modify our query so it displays the names of our family members instead of their identifiers.
SELECT ps.name, COUNT(p.id) FROM product p JOIN person ps ON p.person_id=ps.id GROUP BY ps.name ORDER BY COUNT(p.id) DESC;
In the query above, we’ve joined the tables
person so we can retrieve our family members’ names. Now, in our
SELECT clause, we specify the column that stores our family members’ names. With that set, we need to also modify the
GROUP BY clause so it contains the name column from the
person table. We’ll also use an
ORDER BY clause to sort the data in descending order (
DESC) in terms of the number of products purchased. Take a look at the result of our report:
Our initial report indicates that Ryan bought the most items on Black Friday. With three items, Emily came in second.
It wasn’t too difficult to prepare this simple report with SQL. If you’d like to extend your knowledge of SQL and learn about joins, aggregate functions, data ordering, and many other concepts, or if your adventure with SQL has just begun, we encourage you to take a look at our SQL Basics course. We’re offering a special Black Friday promotion— get 40% OFF the course!
What Exactly Did Ryan Purchase?
All right, we know that Ryan bought the most items, but what exactly did he purchase? The following SQL query can help us answer this question:
WITH record_products AS ( SELECT person_id, COUNT(id) FROM product GROUP BY person_id ORDER BY COUNT(id) DESC LIMIT 1 ) SELECT p.name FROM product p JOIN record_products rp ON rp.person_id=p.person_id ;
To simplify our query, we used a Common Table Extension (CTE). A CTE functions as a temporary result set that can be used elsewhere within the same query. A CTE begins with the WITH keyword and a temporary name, followed by a pair of parentheses. Inside those parentheses, we define our temporary result set. Here, our CTE is temporarily named
record_products and defines a query that returns the identifier of the person who purchased the most products. The result of this query is limited to the first record returned. Note that PostgreSQL uses LIMIT clauses with the number of rows to accomplish this.
After the CTE, we wrote a query that returns the name of the products that were purchased by this individual, using the information from the
product table. We joined this table with the result of the CTE. In the join condition, we supplied the ID of the person returned by the CTE and the ID of the person from the
product table. This is crucial to obtaining the correct number of rows.
CTEs are pretty cool, and if you’re interested in learning more about using them, go ahead and check out our Recursive Queries course. With our special Black Friday sitewide sale, you can save 40% on the course!
Here are the results of this new report:
|Vertabelo “SQL Basics” course|
Who Spent the Most Money?
Although we know who bought the most items, we don’t know who spent the most money. These two people may not necessarily be the same, so it’s worth investigating further.
Take a look at the query below:
SELECT ps.name, SUM(p.price-(p.price*p.discount/100.0)) AS sum_price FROM product p JOIN person ps ON p.person_id=ps.id GROUP BY ps.name ORDER BY sum_price DESC;
Here, we used data from the joined tables
person to retrieve the name of each family member, alongside the amount they spent on Black Friday. The precise amount they spent is calculated with the
SUM function. Inside the
SUM function, we used an expression that calculates the discounted price by calculating the discount reduction and then subtracting it from the product’s base price, using values from the respective columns of
GROUP BY clause, we listed the column containing each family member’s name. We also used an
ORDER BY clause to sort the records in descending order by the amount spent by each family member. Take a look at the result of the updated report:
As you can tell, Ava spent the most on Black Friday, even though she bought fewer items than Ryan.
In our final report, we’ll answer the following question: what was the average discount for each family member?
Who Got the Most Discounts?
We’ll create another report to determine who took advantage of discounts the most. Here’s our new query:
SELECT ps.name, AVG(p.discount) AS avg_discount FROM product p JOIN person ps ON p.person_id=ps.id GROUP BY ps.name ORDER BY avg_discount DESC;
Again, we joined the two tables
person in order to retrieve family members’ names and the average discounts they received on their purchases. We grouped our data by the family members’ names, hence why we passed in the name column from the
person table to both the
GROUP BY clauses. We also used the AVG function to find the average percentage discount, passing in the
discount column as its argument.
The resulting table was sorted in descending order by the average discount of products purchased on Black Friday.
It seems Ethan enjoyed the greatest average discount on his purchases, even though he didn’t buy the most items or spend the most on his purchases.
SQL is a handy skill in our economy-driven world, as it can help you track your personal finances and generate a variety of useful reports. If you’re interested in learning more about SQL, Vertabelo Academy is a great place to get started. Check out our interactive courses today, and take advantage of our Black Friday discounts!