docs: Complete AVL tree README
4ndrelim committed Feb 8, 2024
1 parent 076a6b6 commit c889b6b
Showing 6 changed files with 106 additions and 23 deletions.
Binary file added docs/assets/images/AvlTree.png
Binary file added docs/assets/images/BalancedProof.png
Binary file added docs/assets/images/TreeRotation.png
32 changes: 13 additions & 19 deletions src/main/java/dataStructures/avlTree/AVLTree.java
@@ -55,7 +55,7 @@ public int height(T key) {
     }

     /**
-     * Update height of node in avl tree during rebalancing.
+     * Update height of node in avl tree for re-balancing.
      *
      * @param n node whose height is to be updated
      */
@@ -372,6 +372,10 @@ private T successor(Node<T> node) {
         return null;
     }

+
+    // ---------------------------------------------- NOTE ------------------------------------------------------------
+    // METHODS BELOW ARE NOT NECESSARY; JUST FOR VISUALISATION PURPOSES
+
     /**
      * prints in order traversal of the entire tree.
      */
@@ -390,13 +394,9 @@ private void printInorder(Node<T> node) {
         if (node == null) {
             return;
         }
-        if (node.getLeft() != null) {
-            printInorder(node.getLeft());
-        }
+        printInorder(node.getLeft());
         System.out.print(node + " ");
-        if (node.getRight() != null) {
-            printInorder(node.getRight());
-        }
+        printInorder(node.getRight());
     }

     /**
@@ -408,7 +408,6 @@ public void printPreorder() {
         System.out.println();
     }

-
     /**
      * Prints out pre-order traversal of tree rooted at node
      *
@@ -419,12 +418,8 @@ private void printPreorder(Node<T> node) {
             return;
         }
         System.out.print(node + " ");
-        if (node.getLeft() != null) {
-            printPreorder(node.getLeft());
-        }
-        if (node.getRight() != null) {
-            printPreorder(node.getRight());
-        }
+        printPreorder(node.getLeft());
+        printPreorder(node.getRight());
     }

     /**
@@ -442,12 +437,11 @@ public void printPostorder() {
      * @param node node which the tree is rooted at
      */
     private void printPostorder(Node<T> node) {
-        if (node.getLeft() != null) {
-            printPostorder(node.getLeft());
-        }
-        if (node.getRight() != null) {
-            printPostorder(node.getRight());
+        if (node == null) {
+            return;
         }
+        printPostorder(node.getLeft());
+        printPostorder(node.getRight());
         System.out.print(node + " ");
     }

7 changes: 3 additions & 4 deletions src/main/java/dataStructures/avlTree/Node.java
@@ -15,10 +15,9 @@ public class Node<T extends Comparable<T>> {
     private Node<T> parent;
     private int height;
     /*
-     * Can insert more properties here.
-     * If key is not unique, introduce a value property
-     * so when nodes are being compared, a distinction
-     * can be made
+     * Can insert more properties here for augmentation
+     * e.g. If key is not unique, introduce a value property as a tie-breaker
+     * or weight property for order statistics
      */

     public Node(T key) {
90 changes: 90 additions & 0 deletions src/main/java/dataStructures/avlTree/README.md
@@ -0,0 +1,90 @@
# AVL Trees

## Background
Is the fastest way to search for data to store it in an array, sort it, and perform binary search? Not when the data
keeps changing. Sorting costs at least O(nlogn) up front, and every later insertion costs O(n) to maintain sorted order.
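
To make the insertion cost concrete, here is a minimal Java sketch. `SortedArrayDemo` and `insertSorted` are hypothetical names used only for illustration and are not part of this repository. Finding the slot is cheap, but making room for the new key is not.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of this repository.
class SortedArrayDemo {
    /** Inserts key into an already sorted list, keeping it sorted. */
    static void insertSorted(List<Integer> sorted, int key) {
        // Binary search locates the slot in O(log n) comparisons...
        int pos = Collections.binarySearch(sorted, key);
        if (pos < 0) {
            pos = -(pos + 1); // decode the insertion point for a missing key
        }
        // ...but add(index, e) shifts every later element: O(n) in the worst case.
        sorted.add(pos, key);
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(List.of(10, 20, 40));
        insertSorted(data, 30);
        insertSorted(data, 5);
        System.out.println(data);
        System.out.println(Collections.binarySearch(data, 40) >= 0);
    }
}
```

The search itself stays O(logn); it is the shifting inside `add` that makes each insertion linear.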

We have seen binary search trees (BSTs), which always maintain data in sorted order. This allows us to avoid the
overhead of sorting before we search. However, we also learnt that unbalanced BSTs can be incredibly inefficient:
insertion, deletion and search all take O(h) time, where h is the height of the tree, and for a degenerate tree
h can grow to O(n).
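
The degenerate case is easy to reproduce. The sketch below uses a hypothetical, stripped-down BST (`DegenerateBstDemo`, not the `AVLTree` class in this folder) and assumes height is counted in edges, so an empty tree has height -1. Inserting keys in sorted order produces a chain, while the same keys in a friendlier order produce a short tree.

```java
// Hypothetical minimal BST with no re-balancing; not the AVLTree class in this folder.
class DegenerateBstDemo {
    static final class BstNode {
        final int key;
        BstNode left;
        BstNode right;

        BstNode(int key) {
            this.key = key;
        }
    }

    /** Plain BST insert; duplicates go to the right subtree. Returns the root. */
    static BstNode insert(BstNode root, int key) {
        BstNode fresh = new BstNode(key);
        if (root == null) {
            return fresh;
        }
        BstNode cur = root;
        while (true) {
            if (key < cur.key) {
                if (cur.left == null) {
                    cur.left = fresh;
                    return root;
                }
                cur = cur.left;
            } else {
                if (cur.right == null) {
                    cur.right = fresh;
                    return root;
                }
                cur = cur.right;
            }
        }
    }

    /** Height in edges (assumed convention); an empty tree has height -1. */
    static int height(BstNode node) {
        if (node == null) {
            return -1;
        }
        return 1 + Math.max(height(node.left), height(node.right));
    }

    public static void main(String[] args) {
        BstNode sortedInput = null;
        for (int k = 1; k <= 7; k++) {
            sortedInput = insert(sortedInput, k);
        }
        BstNode mixedInput = null;
        for (int k : new int[] {4, 2, 6, 1, 3, 5, 7}) {
            mixedInput = insert(mixedInput, k);
        }
        System.out.println("sorted insert order: height " + height(sortedInput));
        System.out.println("mixed insert order:  height " + height(mixedInput));
    }
}
```

With 7 keys the sorted order gives height 6 (a linked list in disguise), while the mixed order gives height 2.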

Here we discuss a type of self-balancing BST, known as the AVL tree, that avoids the worst-case O(n) performance
across these operations by carefully restructuring the tree whenever it changes
(e.g. on insert or delete).

### Definition of Balanced Trees
Balanced trees are a special subset of trees with **height in the order of log(n)**, where n is the number of nodes.
This choice is not an arbitrary one. It can be mathematically shown that a binary tree of n nodes has height of at least
roughly log(n), with the minimum attained by a complete binary tree. So, it makes intuitive sense to give trees whose
heights are roughly in the order of log(n) the desirable 'balanced' label.

<div align="center">
<img src="../../../../../docs/assets/images/BalancedProof.png" width="40%">
<br>
Credits: CS2040s Lecture 9
</div>

### Height-Balanced Property of AVL Trees
There are several ways to achieve a balanced tree. Red-black trees, B-trees, scapegoat trees and AVL trees ensure
balance differently. Each of them relies on some underlying 'good' property to maintain balance: a careful segmenting
of nodes in the case of RB-trees, and a depth constraint in the case of B-trees. Go check them out in the other
folders! <br>
What is important is that this **'good' property holds even after every change** (insert/update/delete).

The 'good' property in AVL trees is the **height-balanced** property. A node is height-balanced if the
**heights of its left and right subtrees differ by at most 1**. <br>
We say the tree is height-balanced if every node in the tree is height-balanced. Be careful not to conflate
the concept of a "balanced tree" with the "height-balanced" property. They are not the same; the latter is used to
achieve the former.
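
The definition translates almost directly into a checker. This is only a sketch under a few assumptions: `HeightBalanceCheck` and its `TreeNode` are hypothetical helpers (keys are omitted because only the shape matters), and height is counted in edges with an empty subtree at -1.

```java
// Hypothetical helper for illustration; keys are omitted since only the shape matters.
class HeightBalanceCheck {
    static final class TreeNode {
        final TreeNode left;
        final TreeNode right;

        TreeNode(TreeNode left, TreeNode right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Sentinel returned when some node in the subtree violates the property. */
    static final int UNBALANCED = Integer.MIN_VALUE;

    /** Returns the subtree height in edges (empty = -1), or UNBALANCED. */
    static int checkedHeight(TreeNode node) {
        if (node == null) {
            return -1;
        }
        int leftHeight = checkedHeight(node.left);
        if (leftHeight == UNBALANCED) {
            return UNBALANCED;
        }
        int rightHeight = checkedHeight(node.right);
        if (rightHeight == UNBALANCED) {
            return UNBALANCED;
        }
        // The property itself: the two child heights differ by at most 1.
        if (Math.abs(leftHeight - rightHeight) > 1) {
            return UNBALANCED;
        }
        return 1 + Math.max(leftHeight, rightHeight);
    }

    /** A tree is height-balanced if every node in it is height-balanced. */
    static boolean isHeightBalanced(TreeNode root) {
        return checkedHeight(root) != UNBALANCED;
    }

    public static void main(String[] args) {
        TreeNode leaf = new TreeNode(null, null);
        TreeNode lopsidedButFine = new TreeNode(new TreeNode(leaf, null), leaf);
        TreeNode chain = new TreeNode(new TreeNode(leaf, null), null);
        System.out.println(isHeightBalanced(lopsidedButFine));
        System.out.println(isHeightBalanced(chain));
    }
}
```

Note that the check is applied at every node, not just the root, and each node is visited once, so the whole check runs in O(n).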

<details>
<summary> <b>Ponder..</b> </summary>
Consider any two nodes (need not have the same immediate parent node) in the tree. Is the difference in height
between the two nodes <= 1 too?
</details>

It can be mathematically shown that a **height-balanced tree with n nodes has height at most 2log(n)**. Therefore,
following the definition of a balanced tree, AVL trees are balanced.

<div align="center">
<img src="../../../../../docs/assets/images/AvlTree.png" width="40%">
<br>
Credits: CS2040s Lecture 9
</div>
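
One way to convince yourself of the 2log(n) bound is to count the fewest nodes a height-balanced tree of height h can have. The sketch below assumes height is counted in edges (a single node has height 0) and uses the recurrence N(h) = 1 + N(h-1) + N(h-2): a root, one subtree of height h-1, and the sparsest allowed sibling of height h-2. `MinAvlNodes` is a hypothetical class written for this illustration.

```java
// Hypothetical illustration of the height bound; not part of this repository.
class MinAvlNodes {
    /** Fewest nodes in a height-balanced tree of height h (edges; empty tree = -1). */
    static long minNodes(int h) {
        if (h < 0) {
            return 0; // empty tree
        }
        long twoBelow = 0; // N(-1)
        long oneBelow = 1; // N(0)
        for (int i = 1; i <= h; i++) {
            long current = 1 + oneBelow + twoBelow;
            twoBelow = oneBelow;
            oneBelow = current;
        }
        return oneBelow;
    }

    /** Checks h <= 2 * log2(n) without floating point, as 2^h <= n^2. */
    static boolean withinBound(int h, long n) {
        return (1L << h) <= n * n;
    }

    public static void main(String[] args) {
        for (int h = 0; h <= 10; h++) {
            long n = minNodes(h);
            System.out.println("h=" + h + " needs at least n=" + n
                    + ", bound holds: " + withinBound(h, n));
        }
    }
}
```

Since N(h) >= 2 * N(h-2), the minimum node count at least doubles every two levels, which gives n >= 2^(h/2) and hence h <= 2log(n). `withinBound` is only safe for small h (the test of this sketch stops at 40) because of `long` overflow.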

## Complexity Analysis
**Search, Insertion, Deletion, Predecessor & Successor queries Time**: O(height) = O(logn)

**Space**: O(n) <br>
where n is the number of elements (whatever the structure, it must store at least n nodes)

## Operations
At a minimum, an AVL tree implementation must support the standard **insert**, **delete**, and **search** operations.
**Update** can be simulated by searching for the old key, deleting it, and then inserting a node with the new key.

Naturally, with insertions and deletions, the structure of the tree will change, and it may no longer satisfy the
"height-balance" property of the AVL tree. Without this property, we may lose our O(log(n)) run-time guarantee.
Hence, we need some re-balancing operations. To do so, tree rotation operations are introduced. Below is one example.

<div align="center">
<img src="../../../../../docs/assets/images/TreeRotation.png" width="40%">
<br>
Credits: CS2040s Lecture 10
</div>
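
For a rough idea of what a rotation looks like in code, here is a minimal sketch. It is not the implementation in `AVLTree.java`: `RotationSketch` and `AvlNode` are hypothetical, parent pointers (which the `Node` class in this folder keeps) are left out for brevity, and height is assumed to be counted in edges.

```java
// Hypothetical sketch; the real Node class also tracks a parent pointer.
class RotationSketch {
    static final class AvlNode {
        final int key;
        AvlNode left;
        AvlNode right;
        int height; // in edges; a leaf has height 0

        AvlNode(int key) {
            this.key = key;
        }
    }

    static int height(AvlNode node) {
        return node == null ? -1 : node.height;
    }

    static void updateHeight(AvlNode node) {
        node.height = 1 + Math.max(height(node.left), height(node.right));
    }

    /** Right rotation: the left child becomes the new root of this subtree. */
    static AvlNode rotateRight(AvlNode root) {
        AvlNode pivot = root.left;
        root.left = pivot.right; // pivot's right subtree is re-attached under the old root
        pivot.right = root;
        updateHeight(root);      // the old root now sits lower, so fix it first
        updateHeight(pivot);
        return pivot;
    }

    /** Left rotation: mirror image of rotateRight. */
    static AvlNode rotateLeft(AvlNode root) {
        AvlNode pivot = root.right;
        root.right = pivot.left;
        pivot.left = root;
        updateHeight(root);
        updateHeight(pivot);
        return pivot;
    }

    public static void main(String[] args) {
        // Left-leaning chain 3 -> 2 -> 1, which violates height-balance at 3.
        AvlNode one = new AvlNode(1);
        AvlNode two = new AvlNode(2);
        AvlNode three = new AvlNode(3);
        two.left = one;
        three.left = two;
        updateHeight(one);
        updateHeight(two);
        updateHeight(three);

        AvlNode newRoot = rotateRight(three);
        System.out.println("root=" + newRoot.key + " left=" + newRoot.left.key
                + " right=" + newRoot.right.key + " height=" + newRoot.height);
    }
}
```

A rotation only re-links a constant number of pointers, so it runs in O(1) and preserves the BST ordering; the caller is responsible for attaching the returned node to the rest of the tree.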

Prof Seth explains it best! Go re-visit his slides (Lecture 10) for the operations :P <br>
Here is a [link](https://www.youtube.com/watch?v=dS02_IuZPes&list=PLgpwqdiEMkHA0pU_uspC6N88RwMpt9rC8&index=9)
for prof's lecture on trees. <br>
_We may add a summary in the near future._

## Application
While AVL trees offer excellent lookup, insertion, and deletion times due to their strict balancing,
the overhead of maintaining this balance can make them less preferred for applications
where insertions and deletions are significantly more frequent than lookups. As a result, AVL trees often find
themselves overshadowed in practical use by counterparts like RB-trees,
which boast a relatively simple implementation and lower overhead, or B-trees, which are ideal for optimizing disk
accesses in databases.

That said, the AVL tree is conceptually simple and is often used as the base template for further augmentation to
tackle niche problems. Orthogonal range searching and interval trees can be implemented with some minor augmentation
to an existing AVL tree.
