How to store tree structures in relational databases
A common task is to store a tree structure with all relationships between its nodes in a relational database. I was always looking for a solution to select and update entire subtrees without recursion. This can be achieved if the tree is saved as a nested set graph. I had to figure out by myself how the nested set representation works, because I couldn't find any clean and simple implementation, so hopefully this page will be googled.
Adjacency Lists
Usually trees are modeled using adjacency lists: each node in the tree has a unique ID and saves a foreign key for the ID of it's parent node. Adjacency lists are efficient for insert and delete operations. But operations to select/update/delete the nodes of a subtree require recursion. If the number of levels in the tree is large, this method will become increasingly slow.Nested Sets
The nested set model allows you to select and update the nodes of a subtree without recursion in one SQL query. Unfortunately insert and delete operations have higher costs, because the nodes of the right neightbour tree need to be updated, too. However, this could be done in two SQL queries, still without any recursion.In the nested set model, each node requires two additional attributes right visit and left visit. The tree is then traversed in a modified pre-order: starting with the root node, each node is visited twice. Whenever a node is entered or exited during the traversal, the sequence number of all visits is saved in the current node's right visit and left visit:
![]()
To fetch all nodes in the subtree of C in the sample tree above, you would simply have to select all nodes from the tree which have a right visit < 21 and a left visit > 14.
I will introduce the steps to insert and delete nodes in a nested set tree as pseudo code:
Insert a new node into the tree as a leave node:
Delete an exisiting leave node from the tree:
- select root.rightvisit/2 to get the total node count
- select parent.leftvisit and parent.rightvisit
- set rightvisit=rightvisit+2 where rightvisit>=parent.rightvisit and rightvisit<=nodecount*2
- set leftvisit=leftvisit+2 where leftvisit>parent.rightvisit and leftvisit<(nodecount+1)*2
- insert the new node with leftvisit=parent.rightvisit+1 and rightvisit=leftvisit+1
(Deleting a leave node is simply done in the reverse order of inserting a node!)Prove if a node is a leave node:
- select root.rightvisit/2 to get the total node count
- select node.leftvisit and node.rightvisit of the node that should be deleted
- set leftvisit=leftvisit-2 where leftvisit>node.rightvisit and leftvisit<(nodeCount*2)+1
- set rightvisit=rightvisit-2 where rightvisit>=node.rightvisit of the parent node and rightvisit<=nodecount*2
- delete the node
Select the total node count in the tree:
- prove if node.leftvisit - node.rightvisit = 1
Select the subtree of a node:
- select root.rightvisit/2
Select the path from a node up to the root node:
- select all nodes where child.leftvisit>=node.leftvisit and child.rightvisit<=node.rightvisit
- order by child.leftvisit gives tree view
- select all nodes where parent.leftvisit<=node.leftvisit and parent.rightvisit>=node.rightvisit
- order by parent.leftvisit descending gives bottom up path
Nested Set Java Sample Code
I did a sample implementation in Java which you can download. This code has further functions implemented to select and update nodes, subtrees and paths in the tree. I used MySQL as the database for my sample code, and MySQL Connector/J as the JDBC driver. The following SQL statements will setup the database schema for the sample code:
DROP TABLE IF EXISTS TREENODES; CREATE TABLE TREENODES ( # these attributes represent the adjacency list # to select child nodes ID MEDIUMINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, PARENT_ID MEDIUMINT UNSIGNED NOT NULL, # these attributes represent the nested set model LEFT_VISIT MEDIUMINT UNSIGNED NOT NULL, RIGHT_VISIT MEDIUMINT UNSIGNED NOT NULL, # misc. attribs. of the node LEVEL MEDIUMINT UNSIGNED NOT NULL, NAME VARCHAR(255) NOT NULL, VALUE VARCHAR(255) NOT NULL, UNIQUE (NAME), # the indices INDEX (NAME), INDEX (LEFT_VISIT), INDEX (RIGHT_VISIT) ); INSERT INTO TREENODES \ (NAME,VALUE,PARENT_ID,LEFT_VISIT,RIGHT_VISIT,LEVEL) \ VALUES ('root','none',0,1,2,0);Please note that the LEVEL attribute is not required by the nested set model. Its just required because I couldn't figure out yet how to calculate a node's level when I do selects for a subtree of a node.
All materials on these pages are sole property of the owner of the web site. Reproduction of any element is not allowed without prior agreement.
© 2000-2004 www.weckert.org - Disclaimer