How to handle hierarchical data using advanced queries

Overview

Have you ever been faced with a hierarchical data tree where you need to get a specific path or make a validation before adding/editing a node? If you have, you probably have questioned yourself: “how can I go through the whole tree?” or “how do I know the depth of the tree?”.

We can build an algorithm to go through all the levels, but that way, we are not only making the task more complex and harder to maintain but also less performant.

In this post, we’ll show you how to do that by using CTEs or CONNECT BY on advanced queries on SQL and Oracle.

 

What is a CTE?

A CTE (Common Table Expression) is a named temporary result set for use within the statement that defines it. You may also know it as the WITH clause. It can be recursive or non-recursive. We will use the recursive CTE for hierarchical data as it allows joining all the hierarchy levels without knowing in advance how many levels there are.

Note: Avoid choosing CTE names that match the names of tables or views. If a query defines a CTE with a particular name, the CTE takes precedence over tables, views, etc. 

For more information about CTEs, check out the following link: 
Understanding SQL Server Recursive CTE By Practical Examples

What is CONNECT BY?

CONNECT BY is a hierarchical query clause in Oracle that specifies the relationship between parent rows and child rows of the hierarchy. The relationship condition must be qualified with the PRIOR operator to refer to the parent row.

Note: It can be any condition, and you can have multiple CONNECT BY. However, at least one must use the PRIOR operator to refer to the parent row.

For more information about CONNECT BY, you can check out the following link:

Hierarchical Queries

Terms to know when using CONNECT BY

START WITH - specifies the root row(s) of the hierarchy.

CONNECT BY - specifies the relationship between parent rows and child rows of the hierarchy.

PRIOR - evaluates the immediately following expression for the parent row of the current row in a hierarchical query.

LEVEL - pseudocolumn, returns 1 for a root row, 2 for a child of a root, and so on.

Example of a hierarchical data tree

 
Hierarchical Data Tree
 
 

Case Studies

Case study I - get the hierarchical tree

To retrieve the full hierarchical tree, we need to define the start point on the root node (1) and the connection condition (2) for the descendant nodes.

Using CTE (SQL or Oracle)

get the hierarchical tree - CTE - Query

CTE – Query

get the hierarchical tree - CTE - Output

CTE – Output

Using CONNECT BY (Oracle)

CONNECT BY – Query

CONNECT BY – Output

Case study II - get path to child

Imagine that you must have a dropdown to add a node where you can select a parent value and show the path for the selected value. That can be achieved by setting the start point to the selected node (1) and inverting the connection condition (2). Hence, it will read every node from the start point to the root.

 
Advanced Queries - get path to child - example

Example end goal

 

Using CTE (SQL or Oracle)

CTE – Query

CTE – Output

Using CONNECT BY (Oracle)

CONNECT BY – Query

Note: to invert the connection condition on the CONNECT BY, you can simply move the side where the PRIOR keyword is placed.

CONNECT BY – Output

Case study III - validate parent change

Picking up on the previous case study, now you must allow the user to edit the parent of a node.   

How can you prevent the user from creating a cycle between nodes and breaking the tree? The idea is the same as in the second case study, with a slight change to prevent this scenario. 

We will start from the target node (1), but on the connection condition, we just want to connect if the Id is different from the one we are editing (2). If we find the node we are editing on the path to the root, we stop because it will break the CONNECT BY condition.

We can also add a WHERE to retrieve only the node that is directly connected to the one being edited (3), and with this, return only one row as shown on the example output or none if there’s no conflict.

Using CTE (SQL or Oracle)

CTE – Query

CTE – Output

Using CONNECT BY (Oracle)

CONNECT BY – Query

CONNECT BY – Output


Tips and Reminders

Reminder: on Oracle CTE we can’t have a column named ‘level’ since it’s a reserved keyword of the CONNECT BY clause.

For more information about CTEs on Oracle and how you can implement equivalent functionalities of CONNECT BY on CTEs, you can check the following link:

ORACLE-BASE - Recursive Subquery Factoring : Hierarchical Queries Using Recursive WITH Clauses

For more information about CONNECT BY pseudocolumns, you can check the following link:

Hierarchical Query Pseudocolumns


Thank you for reading! 

I hope you enjoyed this article on handling hierarchical data using advanced queries and that it made your life easier.

 

Like this article? Share it:

Previous
Previous

Let’s talk indexes

Next
Next

Concepts for BIG OutSystems Projects (Part III)