How to handle hierarchical data using advanced queries
Overview
Have you ever been faced with a hierarchical data tree where you need to get a specific path or make a validation before adding/editing a node? If you have, you probably have questioned yourself: “how can I go through the whole tree?” or “how do I know the depth of the tree?”.
We can build an algorithm to go through all the levels, but that way, we are not only making the task more complex and harder to maintain but also less performant.
In this post, we’ll show you how to do that by using CTEs or CONNECT BY on advanced queries on SQL and Oracle.
What is a CTE?
A CTE (Common Table Expression) is a named temporary result set for use within the statement that defines it. You may also know it as the WITH clause. It can be recursive or non-recursive. We will use the recursive CTE for hierarchical data as it allows joining all the hierarchy levels without knowing in advance how many levels there are.
Note: Avoid choosing CTE names that match the names of tables or views. If a query defines a CTE with a particular name, the CTE takes precedence over tables, views, etc.
For more information about CTEs, check out the following link:
Understanding SQL Server Recursive CTE By Practical Examples
What is CONNECT BY?
CONNECT BY is a hierarchical query clause in Oracle that specifies the relationship between parent rows and child rows of the hierarchy. The relationship condition must be qualified with the PRIOR operator to refer to the parent row.
Note: It can be any condition, and you can have multiple CONNECT BY. However, at least one must use the PRIOR operator to refer to the parent row.
For more information about CONNECT BY, you can check out the following link:
Terms to know when using CONNECT BY
START WITH - specifies the root row(s) of the hierarchy.
CONNECT BY - specifies the relationship between parent rows and child rows of the hierarchy.
PRIOR - evaluates the immediately following expression for the parent row of the current row in a hierarchical query.
LEVEL - pseudocolumn, returns 1 for a root row, 2 for a child of a root, and so on.
Example of a hierarchical data tree
Case Studies
Case study I - get the hierarchical tree
To retrieve the full hierarchical tree, we need to define the start point on the root node (1) and the connection condition (2) for the descendant nodes.
Using CTE (SQL or Oracle)
Using CONNECT BY (Oracle)
Case study II - get path to child
Imagine that you must have a dropdown to add a node where you can select a parent value and show the path for the selected value. That can be achieved by setting the start point to the selected node (1) and inverting the connection condition (2). Hence, it will read every node from the start point to the root.
Using CTE (SQL or Oracle)
Using CONNECT BY (Oracle)
Case study III - validate parent change
Picking up on the previous case study, now you must allow the user to edit the parent of a node.
How can you prevent the user from creating a cycle between nodes and breaking the tree? The idea is the same as in the second case study, with a slight change to prevent this scenario.
We will start from the target node (1), but on the connection condition, we just want to connect if the Id is different from the one we are editing (2). If we find the node we are editing on the path to the root, we stop because it will break the CONNECT BY condition.
We can also add a WHERE to retrieve only the node that is directly connected to the one being edited (3), and with this, return only one row as shown on the example output or none if there’s no conflict.
Using CTE (SQL or Oracle)
Using CONNECT BY (Oracle)
Tips and Reminders
Reminder: on Oracle CTE we can’t have a column named ‘level’ since it’s a reserved keyword of the CONNECT BY clause.
For more information about CTEs on Oracle and how you can implement equivalent functionalities of CONNECT BY on CTEs, you can check the following link:
ORACLE-BASE - Recursive Subquery Factoring : Hierarchical Queries Using Recursive WITH Clauses
For more information about CONNECT BY pseudocolumns, you can check the following link:
Thank you for reading!
I hope you enjoyed this article on handling hierarchical data using advanced queries and that it made your life easier.
Like this article? Share it: