Chapter 12 – Binary Search Trees

 

Binary Search Tree Overview

·         Search trees are data structures that support many dynamic-set operations.

o        Can be used as both a dictionary and as a priority queue.

o        Basic operations take time proportional to the height of the tree…for

complete binary tree with n nodes: worst case (lg n).

o        For linear chain of n nodes: worst case (n).

·         Binary search trees are an important data structure for dynamic sets.

·         Different types of search trees include binary search trees, red-black trees (covered in Chapter 13), and B-trees (covered in Chapter 18).

·         We will cover binary search trees, tree walks, and operations on binary search trees now, then move to red-black trees and B-trees

 

Binary Search Tree Basics

·         Binary search trees accomplish many dynamic-set operations in O(h) time, where

h = height of tree.

·         As mentioned in previous lectures, we represent a binary tree by a linked data

structure in which each node is an object…root[T ] points to the root of tree T .

·         Each node contains the fields

o        key (and possibly other satellite data).

o        left: points to left child.

o        right: points to right child.

o        p: points to parent. p[root[T ]] = NIL.

·         Stored keys must satisfy the binary-search-tree property:

o        If y is in left subtree of x, then key[y] ≤ key[x].

o        If y is in right subtree of x, then key[y] ≥ key[x].

·         The root node is the only node in the tree whose parent field is NIL.

 

 

 

 

Inorder Tree Walk

·         The binary-search-tree property allows us to print keys in a binary search tree in

order, recursively, using an algorithm called an inorder tree walk. Elements are

printed in monotonically increasing order.

·         How INORDER-TREE-WALK works:

o        Check to make sure that x is not NIL.

o        Recursively, print the keys of the nodes in x’s left subtree.

o        Print x’s key.

o        Recursively, print the keys of the nodes in x’s right subtree.

 

 

INORDER-TREE-WALK(x)

if x NIL then

INORDER-TREE-WALK(left[x])

print key[x]

INORDER-TREE-WALK(right[x])

 

 

·         Example: Do the inorder tree walk on the example above, getting the output ABDFHK

 

·         Theorem: If x is the root of an n-node subtree, then the call INORDER-TREE-

WALK(x) takes Θ (n) time.

  • Correctness: Follows by induction directly from the binary-search-tree property.
  • Time: Intuitively, the walk takes (n) time for a tree with n nodes, because we visit

and print each node once. [Book has formal proof.]

 

Querying a Binary Search Tree

 

TREE-SEARCH(x, k)

if x = NIL or k = key[x]

then return x

if k < key[x]

then return TREE-SEARCH(left[x], k)

else return TREE-SEARCH(right[x], k)

 

Initial call is TREE-SEARCH(root[T ], k).

 

Running Time: The algorithm recurses, visiting nodes on a downward path from the

root.  Thus, running time is O(h), where h is the height of the tree.

 

 

 

 

 

Iterative Search:

To search from the root, the call is Tree-search(root(t),k).

 

Running Time: Like the recursive procedure, this algorithm operates by visiting nodes

on a downward path from the root.  Thus, running time is still O(h).  It is more efficient on

most computers, since the high cost of recursion isn’t involved.

 

Minimum and maximum

The binary-search-tree property guarantees that

·         the minimum key of a binary search tree is located at the leftmost node, and

·         the maximum key of a binary search tree is located at the rightmost node.

Traverse the appropriate pointers (left or right) until NIL is reached.

 

TREE-MINIMUM(x)

while left[x] ≠ NIL

do x left[x]

return x

 

TREE-MAXIMUM(x)

while right[x] ≠ NIL

do x right[x]

return x

 

Time: Both procedures visit nodes that form a downward path from the root to a leaf.

Both procedures run in O(h) time, where h is the height of the tree.

 

Successors and Predecessors in a Binary Search Tree

Assuming that all keys are distinct, the successor of a node x is the node y such that

key[y] is the smallest key > key[x]. (We can find x’s successor based entirely on the tree

structure. No key comparisons are necessary.) If x has the largest key in the binary

search tree, then we say that x’s successor is NIL.  Otherwise, there are two cases:

  1. If node x has a non-empty right subtree, then x’s successor is the minimum in x’s

right subtree.

  1. If node x has an empty right subtree, notice that:

·         As long as we move to the left up the tree (move up through right children),

we’re visiting smaller keys.

·         x’s successor y is the node that x is the predecessor of (x is the maximum in

y’s left subtree).


The first two lines handle the case where there is a right child (find the minimum on the

right side)

 

The last five lines walk up the tree, finding the first ancestor of x that has a left child that is also an ancestor of x

 

TREE-SUCCESSOR(x)

if right[x] ≠ NIL

then return TREE-MINIMUM(right[x])

y p[x]

while y ≠ NIL and x = right[y]

do x y

y p[y]

return y

 

TREE-PREDECESSOR is symmetric to TREE-SUCCESSOR.

 

 

Find the successor of the node with key value 15.

Find the successor of the node with key value 6.

Find the successor of the node with key value 4.

Find the predecessor of the node with key value 6.

 

Time: For both the TREE-SUCCESSOR and TREE-PREDECESSOR procedures, in both cases, we visit nodes on a path down the tree or up the tree. Thus, running time is O(h), where h is the height of the tree.

 

Inserting and Deleting Elements

Insertion and deletion allows the dynamic set represented by a binary search tree to

change. The binary-search-tree property must hold after the change. Insertion is more

straightforward than deletion.

 

 

·         To insert value v into the binary search tree, the procedure is given node z, with

      key[z] = v, left[z] = NIL, and right[z] = NIL.

·         Beginning at root of the tree, trace a downward path, maintaining two pointers.

·         Pointer x: traces the downward path.

·         Pointer y: “trailing pointer” to keep track of parent of x.

·         Traverse the tree downward by comparing the value of node at x with v, and

      move to the left or right child accordingly.

·         When x is NIL, it is at the correct position for node z.

·         Compare z’s value with y’s value, and insert z at either y’s left or right, appropriately.

 

·         Time: Same as TREE-SEARCH. On a tree of height h, procedure takes O(h) time.

·         TREE-INSERT can be used with INORDER-TREE-WALK to sort a given set of numbers.

 

Example: Run TREE-INSERT(C) on the first sample binary search tree.

 

Result:


Deletion

TREE-DELETE is broken into three cases.

Case 1: z has no children.

·         Delete z by making the parent of z point to NIL, instead of to z.

Case 2: z has one child.

·         Delete z by making the parent of z point to z’s child, instead of to z.

Case 3: z has two children.

·         z’s successor y has either no children or one child. (y is the minimum

node—with no left child—in z’s right subtree.)

·         Delete y from the tree (via Case 1 or 2).

·         Replace z’s key and satellite data with y’s.

 

 

Example: We can demonstrate on the above sample tree.

·         For Case 1, delete K.

·         For Case 2, delete H.

·         For Case 3, delete B, swapping it with C.

 

Time: O(h), on a tree of height h.

 

Minimizing Running Time

We’ve been analyzing running time in terms of h (the height of the binary search tree),

instead of n (the number of nodes in the tree).

·         Problem: Worst case for binary search tree is (n) – no better than linked list.

·         Solution: Guarantee small height (balanced tree) – h = O(lg n).

 

In later chapters, by varying the properties of binary search trees, we will

be able to analyze running time in terms of n.

·         Method: Restructure the tree if necessary. Nothing special is required for querying,

but there may be extra work when changing the structure of the tree (inserting or

deleting).

·         Red-black trees are a special class of binary trees that avoids the worst-case

behavior of O(n) like “plain” binary search trees.