Do It in SQL: Recursive Tree Traversal

SQL Recursive Tree Traversal

In the previous article, I described how to use Common Table Expressions to find the shortest path in a directed graph. That example could be hard to follow, I admit. Let’s do something much more common, something that is implemented on almost every website – a menu. But… we’ll do as much as possible in a single SQL query instead of writing the code. We’ll use CTEs for PostgreSQL and the hierarchical query clause for Oracle.

So, let’s see what a simple menu looks like:

Vertabelo menu

What have we got here? There is a main menu and a drop-down menu which appears after we click the button from the main menu. Of course, there could be more submenus attached to another button on the main menu. What’s more, there could also be a sub-submenu, which would appear after clicking on one of the buttons of a submenu.

In general, there could be an unlimited number of menu levels. The data structure of such a menu is a SQL tree:

Menu tree

As you can see, it’s a typical SQL tree structure. There are menu nodes which are attached to parent nodes. The only node without a parent is the root node.

This is how we store such a structure in the database:

Every node has its own, unique id. It has a reference to the parent node. For every node in a submenu we need its sequence number for ordering purposes. The name column is just a label which will be shown on the website. The url_path column may require a little more explanation. Every node stores a part of the full URL path of the resource. Its entire path is a concatenation of the url_path of all nodes from the root to that node.

For example, for the following data:

> select * from menu_node;
 id | parent_node_id | seq |        name        | url_path  
----+----------------+-----+--------------------+----------- 
  1 |                |   1 | root               | 
  2 |              1 |   1 | Diagram            | diagram 
  3 |              1 |   2 | My models          | my_models 
  4 |              1 |   3 | Share requests     | share 
  5 |              1 |   4 | My account         | account 
  6 |              1 |   5 | Invite people      | invite 
  7 |              1 |   6 | Help               | help 
  8 |              7 |   1 | Documentation      | doc 
  9 |              7 |   2 | FAQ                | faq 
 10 |              7 |   3 | Ask a question     | ask 
 11 |              7 |   4 | Request a feature  | feature 
 12 |              7 |   5 | Report a problem   | problem 
 13 |              7 |   6 | Keyboard shortcuts | shortcuts 
 14 |              8 |   1 | Section 1          | section1 
 15 |              8 |   2 | Section 2          | section2 
 16 |              8 |   3 | Section 3          | section3 
(16 rows) 

the entire path of the node “Section 1” is: /help/doc/section1.

Having such a structure, we want to render the menu on the page. If we didn’t use a hierarchical query, we would have to either run multiple queries (for every node to get its children) or retrieve all the data and build the structure in the code. I’m not saying it’s a bad approach, but it can be done in an easier and smarter way.

PostgreSQL – recursive WITH clause

In order to render such a menu, we need to know the full URL path of each button, information about a level of the node (to appropriately style it in CSS) and where to attach it (it’s parent’s ID of course). So let’s start to build our select query with CTE:

WITH RECURSIVE menu_tree (id, name, url, level, parent_node_id, seq) 
AS (

At first, we need to get the root node:

  SELECT 
    id, 
    name, 
    '' || url_path, 
    0, 
    parent_node_id, 
    1 
  FROM menu_node 
  WHERE parent_node_id is NULL

Then, we recursively go deeper, concatenating the path and incrementing the level:

  UNION ALL 
  SELECT 
    mn.id, 
    mn.name, 
    mt.url || '/' || mn.url_path, 
    mt.level + 1, 
    mt.id, 
    mn.seq 
  FROM menu_node mn, menu_tree mt 
  WHERE mn.parent_node_id = mt.id

And in the end, we need to get all rows except for the root node (we don’t need it anymore) in the proper order. First, they should be ordered by level, then “grouped” by parent, and finally following seq order. So the query will look like this:

SELECT * FROM menu_tree 
WHERE level > 0 
ORDER BY level, parent_node_id, seq; 

The final query:

WITH RECURSIVE menu_tree (id, name, url, level, parent_node_id, seq) 
AS ( 
  SELECT 
    id, 
    name, 
    '' || url_path, 
    0, 
    parent_node_id, 
    1 
  FROM menu_node 
  WHERE parent_node_id is NULL 

  UNION ALL 
  SELECT 
    mn.id, 
    mn.name, 
    mt.url || '/' || mn.url_path, 
    mt.level + 1, 
    mt.id, 
    mn.seq 
  FROM menu_node mn, menu_tree mt 
  WHERE mn.parent_node_id = mt.id 
) 
SELECT * FROM menu_tree 
WHERE level > 0 
ORDER BY level, parent_node_id, seq;

Looks pretty easy, doesn’t it? As a result we get the data:

 id |        name        |        url         | level | parent_node_id | seq 
----+--------------------+--------------------+-------+----------------+----- 
  2 | Diagram            | /diagram           |     1 |              1 |   1 
  3 | My models          | /my_models         |     1 |              1 |   2 
  4 | Share requests     | /share             |     1 |              1 |   3 
  5 | My account         | /account           |     1 |              1 |   4 
  6 | Invite people      | /invite            |     1 |              1 |   5 
  7 | Help               | /help              |     1 |              1 |   6 
  8 | Documentation      | /help/doc          |     2 |              7 |   1 
  9 | FAQ                | /help/faq          |     2 |              7 |   2 
 10 | Ask a question     | /help/ask          |     2 |              7 |   3 
 11 | Request a feature  | /help/feature      |     2 |              7 |   4 
 12 | Report a problem   | /help/problem      |     2 |              7 |   5 
 13 | Keyboard shortcuts | /help/shortcuts    |     2 |              7 |   6 
 14 | Section 1          | /help/doc/section1 |     3 |              8 |   1 
 15 | Section 2          | /help/doc/section2 |     3 |              8 |   2 
 16 | Section 3          | /help/doc/section3 |     3 |              8 |   3 
(15 rows)

With this single query you can get all the data you need to render a simple multi-level menu.

Oracle – hierarchical queries

In Oracle you can use either the hierarchical query clause (also known as “CONNECT BY query”) or recursive subquery factoring (introduced in version 11g release 2).

The structure of the second one is almost the same as in the query for PostgreSQL. The only differences are:

  • lack of RECURSIVE keyword
  • “level” is a reserved word, so we must change it

The rest remains unchanged and the query works well.

With regards to the hierarchical query clause, its syntax is totally different. The query looks like this:

SELECT 
  id, 
  name,
  SYS_CONNECT_BY_PATH(url_path, '/') AS url, 
  LEVEL, 
  parent_node_id, 
  seq 
FROM menu_node 
START WITH id IN 
  (SELECT id FROM menu_node WHERE parent_node_id IS NULL) 
CONNECT BY PRIOR id = parent_node_id;

One thing worth noting is that this query will go truly recursively, that is – in a depth-first order. The “recursive” WITH clause in PostgreSQL and (by default) in Oracle traverse the structure in a breadth-first order.

As you can see, recursive traversal queries may save some of our precious time (and a few lines of code). It’s your choice if you use them or not, but it’s definitely worth while to know the alternative.

Michał Kołodziejski

Senior Software Engineer, Team Leader

comments powered by Disqus

GET ACCESS TO EXPERT CONTENT!

Over 85.000 happy students
and counting!