Grouping Data in SQL Server

Happy woman learning SQL on the sofa at her home.

People who work with data know that data grouping can sometimes be a chore. If you struggle and waste hours with Excel, OpenOffice, or some other tool trying to group various sets of data, take a look at this article and learn how to do the work much quicker and easier by using SQL.

You often find yourself in a situation where you need to analyze data and present the results in a form other than the one in which the data is stored. For example, there are situations where you need to group individual data according to the level of aggregation defined by a user request. In this article, you will explore how SQL can help you when it comes to effective data manipulation in terms of grouping individual data.

Scenario

You work in the HR department of a startup company. The company has branches throughout Italy, and the CEO has decided to carry out workforce planning. The CEO has requested the following information:

  • Number of employees per location.
  • Number of employees per department at each location.

This information is internally stored in the company’s HR SQL Server database, which contains a table with data on employees. Your job is to analyze the data stored there and provide the requested information to the CEO.

Sample Data Table

The table below is named Employees and contains sample data that will be analyzed and manipulated throughout the article examples.

IDFIRST NAMELAST NAMESEXAGEDEPARTMENTLOCATION
1AndyMillerM25MarketingRome
2JohnJohnsonM38Business ManagementMilan
3MarioDavisM34ITTurin
4LisaBrownF22ITMilan
5StuartJonesM20ITFlorence
6MayaWilsonF30ITVerona
7MiaCookF40MarketingRome
8EmmaWardF32HRFlorence
9SteveFloresM44Business ManagementTurin
10BrandonFosterM23HRTurin

In order to graphically present the grouping process in the examples that follow, you will sort sample data by location and department. Such table data will be used as a starting point for explaining the data grouping process.

You will sort the data using the following query:

select *
from [dbo].[Employees]
order by location, department

Now, the sample data table looks like this:

IDFIRST NAMELAST NAMESEXAGEDEPARTMENTLOCATION
8EmmaWardF32HRFlorence
5StuartJonesM20ITFlorence
2JohnJohnsonM38Business ManagementMilan
4LisaBrownF22ITMilan
1AndyMillerM25MarketingRome
7MiaCookF40MarketingRome
9SteveFloresM44Business ManagementTurin
10BrandonFosterM23HRTurin
3MarioDavisM34ITTurin
6MayaWilsonF30ITVerona

As you can see, the data in the table is sorted in ascending order, first by location and then by department.

SQL GROUP BY Clause

The tasks defined by the CEO all require data grouping. In SQL, data grouping is performed using a GROUP BY clause.

The SQL GROUP BY clause allows us to group individual data based on defined criteria. You can group individual data by one or more table columns. In order to do the grouping properly, you often need to apply aggregate functions to the column(s) within the SQL SELECT statement.

Syntax of GROUP BY Clause

The syntax of the GROUP BY clause is the following:

select count(columnName1), columnName2, columnName3, ... , columnNameN
from tableName
where condition
GROUP BY columnName2, ... , columnNameN
order by columnName2, ... , columnNameN

Please note: When it comes to the syntax, it’s important to keep in mind the place of the GROUP BY clause within the SELECT statement. Always write a GROUP BY statement after the FROM and/or WHERE clauses and before the ORDER BY clause (if one exists).

As you can see, in order to use the GROUP BY SQL clause, you need to know the basics of querying data in SQL. You can brush up on those basics in our SQL Basics in MS SQL Server course.

If you’re already familiar with writing basic SQL SELECT statements, learning how to group data will be a piece of cake. Let's see how GROUP BY works in practice.

SQL GROUP BY Examples

You will solve the tasks presented in the scenario. First, you’ll write a basic GROUP BY query. Afterwards, you’ll go through a bit of a more complicated example in order to see how grouping can be performed on more than one table column.

Task #1: Get the Number of Employees per Location

SQL Query:

select location, count(location) as number_of_employees
from [dbo].[Employees]
group by location
order by location

Query Analysis:

Let’s take a closer look at the process of building the above query:

  1. Select the Location column; you will group your data using this column.

    select location
    from [dbo].[Employees]
    
  2. Add another column to the SELECT statement by applying the COUNT function on one of the table columns in order to get the number of employees. It doesn't matter which table column you apply the COUNT function to, since it is used to count the table rows.

    select location, count(location) as number_of_employees
    from [dbo].[Employees]
    
  3. 3. The previous query will not work without a GROUP BY clause since it contains an aggregate function. Insert a GROUP BY clause and use the Location column for grouping. Since the column contains five different values (Florence, Milan, Rome, Turin and Verona), the final query result will have five rows.

    select location, count(location) as number_of_employees
    from [dbo].[Employees]
    group by location
    

    The table below displays how the grouping is done: the data is grouped based on distinct values in the Location column, where each drawn rectangle contains data for a particular location. In other words, each rectangle represents one data group.The COUNT function will count the rows within the groups and return the number of employees within a location.

    GROUP BY clause
  4. Apply an ORDER BY clause on the Location column in order to sort the final query results in ascending order based on values in that column.

    select location, count(location) as number_of_employees
    from [dbo].[Employees]
    group by location
    order by location
    

Final Result:

LOCATIONNUMBER OF EMPLOYEES
Florence2
Milan2
Rome2
Turin3
Verona1

Task #2: Get the Number of Employees per Department at Each Location

SQL Query:

select location, department, count(department) as number_of_employees
from [dbo].[Employees]
group by location, department
order by location 
 

Query Analysis:

  1. Select the Location and Department columns; you will group your data using these columns.

    select location, department
    from [dbo].[Employees]
     
  2. Like in the previous example, add another column to the SELECT statement by applying the COUNT function to one of the table columns in order to get the number of employees.

    select location, department, count(department) as number_of_employees
    from [dbo].[Employees]
     
  3. Again, the query from the previous step won’t work without a GROUP BY clause since it contains an aggregate function. You now apply a GROUP BY clause on both the Location and Department columns. First, you group data by location; then you group those locations by department, effectively creating department subgroups within the location groups.

    select location, department, count(department) as number_of_employees
    from [dbo].[Employees]
    group by location, department
     

    The table below shows how this grouping is done: blue rectangles represent the groups based on Location, and red rectangles represent the subgroups based on Department. Notice how the department subgroups are made within the location groups. The COUNT function will count the rows within the subgroup of each group and return the number of employees per department at each location.

    GROUP BY clause on Location and Department
  4. Apply an ORDER BY clause on the Location column in order to sort the query results in ascending order by location.

    select location, department, count(department) as number_of_employees
    from [dbo].[Employees]
    group by location, department
    order by location
     

Final Result:

LOCATIONDEPARTMENTNUMBER OF EMPLOYEES
FlorenceHR1
FlorenceIT1
MilanBusiness Management1
MilanIT1
RomeMarketing2
TurinBusiness Management1
TurinHR1
TurinIT1
VeronaIT1

Summary

Grouping data is a common operation when it comes to data manipulation. Being able to group data easily is a very useful skill when working with datasets of different sizes. You have seen how the SQL GROUP BY clause makes grouping easy: all you need is basic knowledge of SQL and a couple of minutes to write the SQL query. In case you have it, it will be a pity if you don't upgrade it with GROUP BY functionality and get the most of it when it comes to data grouping.

This article reviewed the basics of data grouping in SQL Server, but if you’d like to explore more advanced SQL Server grouping options, be sure to check out our course on GROUP BY Extensions in SQL Server. However, if you’re new to SQL, take a look at some of Vertabelo Academy SQL Server courses to get started.

Belma Mesihovic

Belma is a software engineer with over a decade of hands-on experience in both database and application development. She has a strong background in statistical and financial software development and is especially passionate about data processing. In her free time, she is an insatiable traveler and a true nature lover, always planning her next adventure.

comments powered by Disqus