
Day 14 - Operating on Lists

Skills: None

Pre-reading: 5.1.4 (aside from 5.1.4.1), 5.1.6

Reference: For all work with tables, refer to the Tables page in the menu at the top of the page!

Intro (20 mins)

  • Last time we saw operations on lists of numbers that come from math and statistics. Today we are going to dive into many more built-in operations that come from the lists library.
  • Imagine a list of discount codes used by a magazine:
    discount-codes = [list: "NEWYEAR", "student", "NONE", "student", "VIP", "none"]
  • A list like this may have come from a column of a table. As part of data cleaning, one of the first things you might want to do is figure out which distinct codes were used.
    import lists as L

    unique-codes = L.distinct(discount-codes)
  • We also have operations on lists that are similar to operations you saw previously on tables. For example, we can use L.filter to remove codes that represent no discount (in this case, any code that reads "none" once converted to lowercase).
    fun is-real-code(code :: String) -> Boolean:
      not(string-to-lower(code) == "none")
    end

    real-codes = L.filter(is-real-code, unique-codes)
  • Like how tables have row-n(), lists have a way of getting an element by position:
    first-code = real-codes.get(0)
  • For tables, we had a few ways of transforming the existing data: adding a new column with build-column, or transforming a single column with transform-column. Since a list is a single sequence of values with no columns, there is only one version, called L.map.
  • We can use this to, e.g., get lowercase versions of each discount code:
    lower-codes = L.map(string-to-lower, real-codes)
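
The steps above can be collected into one small program. This is a sketch that reuses the names from the intro; the check block is an addition for illustration. It avoids asserting the exact order that L.distinct produces, since these notes do not state it.

```pyret
import lists as L

discount-codes = [list: "NEWYEAR", "student", "NONE", "student", "VIP", "none"]

fun is-real-code(code :: String) -> Boolean:
  doc: "True unless the code is some capitalization of none"
  not(string-to-lower(code) == "none")
end

# Remove duplicates, drop the "no discount" codes, then lowercase the rest
unique-codes = L.distinct(discount-codes)
real-codes = L.filter(is-real-code, unique-codes)
lower-codes = L.map(string-to-lower, real-codes)

check:
  # "student" appears twice, so 6 codes become 5
  unique-codes.length() is 5
  # "NONE" and "none" are both removed
  real-codes.length() is 3
  real-codes.get(0) is "NEWYEAR"
  lower-codes.member("vip") is true
  lower-codes.member("none") is false
end
```

Note that L.distinct treats "NONE" and "none" as different values, which is why the filter lowercases each code before comparing it.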

Class Exercise (35 mins)

  • Using the discount codes from the intro, first apply L.distinct to remove duplicates, then use L.map with string-to-upper to convert all codes to uppercase. How many unique codes do you have after cleaning?
  • Given this list of survey responses: [list: "yes", "NO", "maybe", "Yes", "no", "Maybe"], use L.distinct and L.map to create a list of unique responses in lowercase. Then use L.filter to keep only definitive answers (filter out "maybe").
  • Create a table of product prices:
    products =
    table: name, price
    row: "laptop", 999.99
    row: "mouse", 25.50
    row: "keyboard", 75.00
    row: "monitor", 299.99
    end
    Extract the price column and use L.filter to find products under $100. Then use L.map to apply a 10% discount to those prices.
  • Take the list [list: "apple", "banana", "cherry", "date", "elderberry"] and use .get() to access the third item (remember: 0-indexed). Separately, use L.filter to keep only fruits with names longer than 5 characters.
  • Load the Boston employees dataset (as in previous days), extract the "NAME" column, and use L.filter to find names that contain "Smith".
  • Create a list of student names: [list: "alice", "Bob", "CHARLIE", "diana"]. Use L.map to normalize all names to proper case (first letter uppercase, rest lowercase). You'll need to write a helper function that uses string-substring to extract the first character and the rest of the name, then applies string-to-upper to the first character and string-to-lower to the rest.
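
Several exercises start from a table column, a step the intro does not show. The sketch below illustrates it on a made-up table (sample-inventory and in-stock are hypothetical names, not part of the exercises). It assumes the get-column table method described on the Tables page, and uses an anonymous lam function as one possible argument to L.map.

```pyret
import lists as L

# Hypothetical table, used only for illustration
sample-inventory =
  table: item, quantity
    row: "pens", 12
    row: "notebooks", 3
    row: "staplers", 0
  end

# get-column returns the column's values as a list
quantities = sample-inventory.get-column("quantity")

fun in-stock(n :: Number) -> Boolean:
  doc: "True when at least one unit is available"
  n > 0
end

# Once the column is a list, the list operations apply directly
stocked = L.filter(in-stock, quantities)
doubled = L.map(lam(n): n * 2 end, stocked)

check:
  quantities is [list: 12, 3, 0]
  stocked is [list: 12, 3]
  doubled is [list: 24, 6]
end
```

The extracted list holds only the values, not the rest of each row.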

Wrap-up (5 mins)