Trees
Poll: What's wrong with this recursive function?
def find_max(nums: list[int], index: int = 0) -> int:
"""Returns the maximum of the numbers in nums, starting from the given index"""
if index == len(nums) - 1:
return nums[index]
else:
max_of_rest: int = find_max(nums, index)
if nums[index] < max_of_rest:
return max_of_rest
else:
return nums[index]
- It's missing a base case
- It's missing a recursive case
- The recursive case doesn't progress towards the base case
- It has an "off-by-one bug"
Poll: What's wrong with this recursive function?
def countdown(n: int) -> None:
"""Prints numbers from n down to 0, one on each line"""
if n > 0:
print(n)
countdown(n - 1)
- It's missing a base case
- It's missing a recursive case
- The recursive case doesn't progress towards the base case
- It has an "off-by-one bug"
Trees
It's a tree!
You may have seen trees like this decision tree:
Some tree terminology / rules:
- Each node may have data
- There is a root node (the start)
- Each node points to any number of child nodes
- Cycles are not permitted
- There are any number of leaf nodes (node with no children)
In computer science, we draw our trees with the root at the top and the leaves at the bottom.
Poll: Is this a tree?
Source: https://www.canva.com/graphs/decision-trees/
- Yes
- No
Poll: Is this a tree?
Source: https://xkcd.com/1924/
- Yes
- No
Poll: Which of these are trees?
- A
- B
- C
- D
Poll: Which of these are trees? (B and B' are equivalent.)
- A
- B and B'
- C
Let's write a tree class! For each node, we will give it two children: left
and right
.
from typing import Optional, TypeVar, Generic
T = TypeVar('T')
class Node(Generic[T]):
def __init__(self, data: T):
self.data = data
self.left: Optional[Node[T]] = None
self.right: Optional[Node[T]] = None
def __eq__(self, other: object) -> bool:
if not isinstance(other, Node):
raise NotImplementedError
return self.data.__eq__(other.data)
def __str__(self) -> str:
value: str = f'{self.data}'
if self.left is not None:
value += f' {self.left}'
else:
value += ' *'
if self.right is not None:
value += f' {self.right}'
else:
value += ' *'
return f'({value})'
class Tree(Generic[T]):
def __init__(self, root_data: Optional[T] = None) -> None:
if root_data is None:
self.root: Optional[Node[T]] = None
else:
self.root = Node[T](root_data)
def __str__(self) -> str:
return self.root.__str__()
tree: Tree[str] = Tree[str]('Entry way')
assert tree.root is not None
tree.root.left = Node[str]('Living room')
tree.root.left.right = Node[str]('Kitchen')
print(tree) # (Entry way (Living room * (Kitchen * *)) *)
If we split that printed output into multiple lines:
(Entry way
(Living room
*
(Kitchen * *))
*
)
Binary Search Trees
Binary Tree: A tree in which each node has at most 2 children
The first and second child of a node are called the left and right child, respectively.
Binary Search Tree: A binary tree in which each node's data is greater than everything in its left subtree and less than everything in its right subtree
Here is an example of a binary search tree:
Poll: This is a Binary Search Tree. Where should the 12 go?
- A
- B
- C
- D
Poll: Which of these are Binary Search Trees?
- A
- B
- C
Poll: Is this a BST?
- Yes
- No
SortedSet
What's a Binary Search Tree good for?
Let's say I want to check for whether the tree contains 12. How do we do that?
What about searching for 11 (with the same BST)?
Hey, this is more efficient than searching a list! It's not as efficient as hashing, but it does make a pretty good set. And it's more useful if you care about the order of things.
The Python type sortedcontainers.SortedSet
stores elements using a Binary Search Tree.
(You may need to pip install sortedcontainers
)
ss = SortedSet([3, 1, 4, 1, 5])
print(ss) # SortedSet([1, 3, 4, 5])
ss.add(2)
print(ss) # SortedSet([1, 2, 3, 4, 5])
And there is a corresponding SortedDict
:
sm = SortedDict({2: [1, 2, 3], 1: [0, 0, 0]})
print(sm) # SortedDict({1: [0, 0, 0], 2: [1, 2, 3]})
set | sortedcontainers.SortedSet |
---|---|
Stored as a hash table (list of lists) Index of each element is calculated using __hash__() | Stored as a Binary Search Tree Elements must implement Comparable protocol |
Constant time to look up / add / remove | Logarithmic time to look up / add / remove |
Use when care more about speed than order | Use when care more about order than speed |
Poll: Which are true?
- For a
SortedDict
, it is constant time to check whether it contains a key - For a
SortedDict
, it is constant time to check whether it contains a value set
s can store things which don't implement the Comparable protocolset
s can store things which aren't hashable- When we iterate over a
set
, the elements will be increasing in size - When we iterate over a
SortedSet
the elements will be increasing in size
Search Algorithms
Let's use recursion to search for an element in the Tree
we wrote earlier. (See slides at the end for images of the tree)
class Tree(Generic[T]):
def __init__(self, root_data: Optional[T] = None) -> None:
if root_data is None:
self.root: Optional[Node[T]] = None
else:
self.root = Node[T](root_data)
def __str__(self) -> str:
return self.root.__str__()
def __contains__(self, item: T) -> bool:
return self.contains(item, self.root)
def contains(self, item: T, node: Optional[Node[T]]) -> bool:
if node is None:
return False
elif node.data == item:
return True
else:
return self.contains(item, node.left) or self.contains(item, node.right)
tree: Tree[str] = Tree[str]('Entry way')
assert tree.root is not None
tree.root.left = Node[str]('Living room')
tree.root.left.right = Node[str]('Kitchen')
print('Kitchen' in tree) # True
print('Bathroom' in tree) # False
The algorithm we just wrote is called a Depth-First Search, since it explores "deep" in one area before moving on to other unexplored "shallow" places (closer to the root).
Poll: This is pseudocode for a Depth-First Search on a graph that is not a tree (because it has cycles). What's a good base case?
DFS(node):
Base case:
???
Recursive case:
For each child:
Add child to explored nodes
DFS(child)
- If the node is in the set of explored nodes, do nothing
- If the node is a leaf, add it to the set of explored nodes
- If the node is
None
, do nothing - If the node is
None
, add it to the set of explored nodes