Skip to main content

Recitation 9 -- Trees

Skills: 6

Reading: 7.1

Objectives

  • Understand hierarchical data and why tables fall short
  • Define recursive data types for tree structures
  • Write recursive functions to process trees
  • Implement tree traversal and transformation functions

I. Introduction (5 minutes)

Explain that trees are hierarchical data -- data that nest sub-entities within a larger entity. Introduce a real-world scenario: a file system where folders contain files and other folders. A table represents flat rows. While you can include a "parent folder" field in each row, you lose the nesting structure.

II. A File System Table Example (10 minutes)

Present a file system table:

filesystem-table = table: name, type, parent, size
row: "Documents", "folder", "", 0
row: "Photos", "folder", "", 0
row: "resume.pdf", "file", "Documents", 150
row: "notes.txt", "file", "Documents", 25
row: "School", "folder", "Documents", 0
row: "homework.doc", "file", "School", 75
row: "vacation.jpg", "file", "Photos", 500
row: "family.jpg", "file", "Photos", 300
end

Check Understanding: Ask, "What happens if we want to calculate the total size of the Documents folder including all subfolders?"

Explain that while tables are great for flat data, they make representing and processing hierarchical (tree) data hard. The recursive nature of trees (folders contain folders contain folders...) suggests we need a recursive data structure.

III. Defining a Recursive Datatype for File System Trees (15 minutes)

A. Data Definition

Present a new datatype called FileTree:

data FileTree:
| file(name :: String, size :: Number)
| folder(name :: String, contents :: List<FileTree>)
end

Explain that we have two variants: files (with name and size) and folders (with name and list of contents).

B. Example

Build a simple file system tree:

resume = file("resume.pdf", 150)
notes = file("notes.txt", 25)
homework = file("homework.doc", 75)
vacation-pic = file("vacation.jpg", 500)
family-pic = file("family.jpg", 300)

school-folder = folder("School", [list: homework])
docs-folder = folder("Documents", [list: resume, notes, school-folder])
photos-folder = folder("Photos", [list: vacation-pic, family-pic])

root = folder("Root", [list: docs-folder, photos-folder])

IV. Discovering Recursive Functions on Trees (10 minutes)

Notice that a FileTree can be either a file (base case) or a folder containing a list of other FileTrees (recursive case).

A. Example

  • For a file, count is 1
  • For a folder, count is 1 (for the folder itself) plus the count of each item in contents
fun count-items(ft :: FileTree) -> Number:
cases (FileTree) ft:
| file(n, s) => 1
| folder(n, contents) => 1 + count-items-list(contents)
end
where:
count-items(resume) is 1
count-items(school-folder) is 2
end

fun count-items-list(fts :: List<FileTree>) -> Number:
cases (List) fts:
| empty => 0
| link(first, rest) => count-items(first) + count-items-list(rest)
end
end

V. Traversing Trees Recursively (10 minutes)

Write a function that returns only file names (not folder names):

fun list-files-only(ft :: FileTree) -> List<String>:
cases (FileTree) ft:
| file(n, s) => [list: n]
| folder(n, contents) => list-files-only-list(contents)
end
where:
list-files-only(resume) is [list: "resume.pdf"]
list-files-only(docs-folder) is [list: "resume.pdf", "notes.txt", "homework.doc"]
end

fun list-files-only-list(fts :: List<FileTree>) -> List<String>:
cases (List) fts:
| empty => empty
| link(first, rest) => list-files-only(first) + list-files-only-list(rest)
end
end

VI. Transforming Trees (10 minutes)

Explain that for each loops require linear data (lists) because they process elements sequentially. Trees have branching structure, so you must reconstruct the structure recursively.

A. Example

Write a function that keeps only files larger than a certain size:

fun keep-large-files(ft :: FileTree, min-size :: Number) -> FileTree:
cases (FileTree) ft:
| file(n, s) =>
if s >= min-size:
file(n, s)
else:
folder("empty", empty)
end
| folder(n, contents) => folder(n, keep-large-files-list(contents, min-size))
end
end

fun keep-large-files-list(fts :: List<FileTree>, min-size :: Number) -> List<FileTree>:
cases (List) fts:
| empty => empty
| link(first, rest) =>
filtered-first = keep-large-files(first, min-size)
link(filtered-first, keep-large-files-list(rest, min-size))
end
end

Emphasize that recursive functions let us rebuild the tree's structure as we process each node.

VII. Recap and Wrap-Up (5 minutes)

Key Points Discussion:

  • Hierarchical Data: Trees naturally represent nested, branching structures
  • Recursive Data Types: Define tree types that refer to themselves
  • Recursive Functions: Process trees by handling base cases (leaves) and recursive cases (branches)
  • Transformation: Rebuild tree structure while modifying contents

Reflection Questions:

  • "How is processing trees different from processing lists?"
  • "Why can't we use for-each loops on trees?"