Skip to main content

Mutable Data Structures and the Heap

1. Introduction & Motivation (5 minutes)

A. Setting the Stage

  • Context:
    • Today we’ll explore mutable data structures—data that can change over time—and the underlying memory model that supports mutation.
  • Learning Goals:
    • Understand how both Python and Pyret represent mutable data.
    • Learn that in Pyret, mutable fields in structured data must be explicitly marked with ref.
    • Update our “program directory” to add a “heap”, that helps to explain aliasing and mutation.

B. The Program Directory & Heap

  • Program Directory:
    • Maps names to values (or, for structured data, to addresses in the heap).
  • The Heap:
    • Structured data (like our library book) are stored in the heap—a region of memory with many slots.
    • In our model, the directory holds a name’s address (e.g., book1 → @1001), and the heap holds the actual value (e.g., @1001: LibraryBook(...)).
  • Key Contrast:
    • In Pyret, immutable bindings are declared with = while mutable ones use var (and later updated with := or, for fields, with the special ! syntax).
    • In Python, the assignment operator = both declares and updates variables, and every field is mutable by default.

2. The Library Book Example (15 minutes)

A. Data Design

  • Scenario:
    • Each library book has an id, a title, and a number of available copies.
    • We want to update the available copies (e.g., when a book is borrowed).

B. Python Version

  • Python Code using dataclasses:
    from dataclasses import dataclass

    @dataclass
    class LibraryBook:
    id: int
    title: str
    copies: int

    # Create a book record.
    book1 = LibraryBook(101, "The Curious Incident", 5)
  • Mutation in Python:
    • To update copies, simply do:
      book1.copies = book1.copies - 1  # e.g., borrowing a copy
    • Discussion:
      • Python does not require extra syntax to mark a field mutable; all fields are mutable.

C. Pyret Version

  • Pyret Code using a datatype with mutable fields:

    # Define a LibraryBook type.
    data LibraryBook:
    library-book(id :: Number,
    title :: String,
    ref copies :: Number)
    end

    # Create a book record.
    book1 = library-book(101, "The Curious Incident", 5)
  • Accessing mutable fields

    • To access a mutable field, you must use !, i.e.,:
    book1!copies

    This is to highlight that the value that is returned is the current value, but depending on other parts of the program, it may change.

  • Mutation in Pyret:

    • To update copies, you must use the special update syntax (the !{...} syntax):
      book1!{copies: book1!copies - 1}
    • Discussion:
      • Pyret’s syntax requires mutable fields to be explicitly marked with ref, guarding against accidental mutation.
      • While updating one field is more verbose this way, many can be updated at once.

3. The Heap and Aliasing (20 minutes)

A. Understanding the Heap

  • Program Directory Recap:
    • In our program directory, names map to values. For structured (mutable) data, the directory maps names to addresses in the heap.
    • Example:
      • Program Directory:
        • book1 → @1001
      • Heap:
        • @1001: LibraryBook(101, "The Curious Incident", 5)
  • Contrast in Python vs. Pyret:
    • In both Python and Pyret, when you assign book1 = LibraryBook(101, "The Curious Incident", 5), the program directory holds a reference (address) to the object in the heap; however, Python makes no distinction between mutable or immutable fields, since all fields are mutable!
    • In Pyret, when you print book1, you see an indicator (e.g., a special marker like a “caution tape”) next to mutable fields to remind you that their value can change.

B. Aliasing

  • Definition:
    • Aliasing occurs when two different names refer to the same object (same heap address).
  • Demonstration in Python:
    book1 = LibraryBook(101, "The Curious Incident", 5)
    book2 = book1 # aliasing: both names refer to the same object
    book1.copies = book1.copies - 1
    print(book2.copies) # prints 4 because book2 is an alias for book1
  • Demonstration in Pyret:
    book1 = library-book(101, "The Curious Incident", 5)
    book2 = book1 # book2 now refers to the same object in the heap as book1
    book1!{copies: book1!copies - 1}
    book2!copies # Will now produce 4 since book2 is an alias to book1
  • Discussion:
    • Explain that in both languages, mutation through one alias affects all aliases.
    • Note that, in Pyret, caution tape & ! are both indicators to watch out for effects of aliasing.
    • Ask: “Why is it important to have a clear mental model (the program directory and heap) when working with mutable data?”

C. Implications for Testing

  • Testing Mutable Data:
    • Show that tests for functions that mutate data must consider state changes.
    • Example (briefly): If a test expects book1.copies to be 5 but a return or checkout operation was run, the test might fail unless state is reset.
  • Takeaway:
    • Mutable data and aliasing introduce “state” that must be managed carefully during testing.

4. Equality of Data: Concepts and the Program Directory/Heap (15 minutes)

A. Two Notions of Equality

  • Structural Equality:
    • Two data values are structurally equal if their contents (fields) are the same.
  • Reference Equality:
    • Two names refer to the same object (i.e., share the same address in the heap).

B. A Library Book Record Example

  • Suppose We Have:
    • In both languages:
      book1 = LibraryBook(101, "The Curious Incident", 5)
      book2 = LibraryBook(101, "The Curious Incident", 5)
      book3 = LibraryBook(101, "The Curious Incident", 3)
      book4 = book2 # Alias for book2
  • Program Directory / Heap Model:
    • Directory (mapping names to addresses):
      • book1 → @1001
      • book2 → @1002
      • book3 → @1003
      • book4 → @1002
    • Heap (values stored at addresses):
      • @1001: LibraryBook(101, "The Curious Incident", 5)
      • @1002: LibraryBook(101, "The Curious Incident", 5)
      • @1003: LibraryBook(101, "The Curious Incident", 3)
  • Question:
    • “Which books would you consider equal? How should equality behave if we update a mutable field?”

C. Mutation and Its Effect on Equality

  • Update Scenario:
    • Suppose we update book2’s copies:
      • In Python:
        book2.copies = book2.copies + 2
      • In Pyret:
        book2!{copies: book2!copies + 2}
    • New Heap State:
      • @1002 now holds LibraryBook(101, "The Curious Incident", 7)
    • Observations:
      • book1 and book2 now have different contents, so structurally they differ.
      • book2 and book4 remain aliases—both reflect the updated value.
  • Discussion:
    • Ask: “After the update, should we consider book1 and book2 equal?”
      • Expected answer: No, because their contents now differ.
    • Ask: “What about book2 and book4?”
      • Expected answer: Yes, because they refer to the same object.

5. Equality Operations in Python (10 minutes)

  • Operators:
    • == checks for structural equality.
    • is checks for reference equality.
  • Examples (using our library book records):
    # Before mutation:
    print(book1 == book2) # True, since contents are identical.
    print(book2 is book4) # True, because book2 and book4 alias the same object.

    # After updating book2:
    book2.copies = book2.copies + 2
    print(book1 == book2) # Now False, since book1.copies is 5 and book2.copies is 7.
    print(book2 is book4) # Remains True.
  • Interactive Exercise:
    • Ask students to predict outputs of such comparisons before and after mutation.

4. Wrap-Up and Reflection (5 minutes)

A. Recap Key Points

  • In Pyret:
    • Mutable fields in data must be marked with ref, are accessed with ! instead of ., and are updated using (!{field: new_value}).
    • The program directory maps names to addresses; the heap holds mutable structured data.
  • In Python:
    • All fields in objects (like those created with @dataclass) are mutable by default.
    • The same syntax to update variables = also serves to update fields.
    • Python’s namespace (program directory) works similarly, but the syntax does not distinguish between declaration and mutation.
  • Aliasing:
    • In both languages, multiple names may refer to the same mutable object, so changes via one name affect all.