Day 14 - Operating on Lists
Skills: None
Pre-reading: 5.1.4 (aside from 5.1.4.1), 5.1.6
Reference: For all work with tables, refer to the Tables page in the menu at the top of the page!
Intro (20 mins)
- Last time we saw operations on lists of numbers from the math and statistics libraries. Today we are going to dive into many more built-in operations on lists, which come from the lists library.
- Imagine a list of discount codes used by a magazine:
discount-codes = [list: "NEWYEAR", "student", "NONE", "student", "VIP", "none"]
- This list may have come from a column of a table. As part of data cleaning,
one of the first things you might want to do is figure out which distinct
codes were used:
import lists as L
unique-codes = L.distinct(discount-codes)
- We also have operations on lists that are similar to operations you saw
previously on tables. For example, we can use L.filter to remove codes that
represent no discount (in this case, any code that normalizes to "none"):
fun is-real-code(code :: String) -> Boolean:
  not(string-to-lower(code) == "none")
end
real-codes = L.filter(is-real-code, unique-codes)
- Like how tables have row-n(), lists have a way of getting an element by
position:
first-code = real-codes.get(0)
- For tables, we had a few ways of transforming the existing data -- adding a
new column with build-column, or transforming a single column with
transform-column. Since a list has no columns, only a single sequence of
values, there is only one version, called L.map.
- We can use this to, e.g., get lowercase versions of each discount code:
lower-codes = L.map(string-to-lower, real-codes)
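The intro steps can be collected into one program to run. This is a minimal sketch: the check block and its name are additions for illustration, and the expected values were worked out by hand from the list above rather than taken from the course materials.

```pyret
import lists as L

discount-codes = [list: "NEWYEAR", "student", "NONE", "student", "VIP", "none"]

# A code counts as "real" unless it is some capitalization of "none".
fun is-real-code(code :: String) -> Boolean:
  not(string-to-lower(code) == "none")
end

unique-codes = L.distinct(discount-codes)
real-codes = L.filter(is-real-code, unique-codes)
lower-codes = L.map(string-to-lower, real-codes)

check "cleaning the discount codes":
  # "student" appears twice in the input, so one copy is dropped.
  L.length(unique-codes) is 5
  real-codes is [list: "NEWYEAR", "student", "VIP"]
  real-codes.get(0) is "NEWYEAR"
  lower-codes is [list: "newyear", "student", "vip"]
end
```

Note that "NONE" and "none" are different strings, so L.distinct keeps both; it is the filter that removes them.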
Class Exercise (35 mins)
- Using the discount codes from the intro, first apply L.distinct to remove
duplicates, then use L.map with string-to-upper to convert all codes to
uppercase. How many unique codes do you have after cleaning?
- Given this list of survey responses:
[list: "yes", "NO", "maybe", "Yes", "no", "Maybe"]
use L.distinct and L.map to create a list of unique responses in lowercase.
Then use L.filter to keep only definitive answers (filter out "maybe").
- Create a table of product prices:
products =
  table: name, price
    row: "laptop", 999.99
    row: "mouse", 25.50
    row: "keyboard", 75.00
    row: "monitor", 299.99
  end
Extract the price column and use L.filter to find products priced under $100.
Then use L.map to apply a 10% discount to those prices.
- Take the list
[list: "apple", "banana", "cherry", "date", "elderberry"]
and use .get() to access the third item (remember: 0-indexed). Separately, use
L.filter to keep only fruits with names longer than 5 characters.
- Load the Boston employees dataset (as in previous days), extract the "NAME"
column, and use L.filter to find names that contain "Smith".
- Create a list of student names:
[list: "alice", "Bob", "CHARLIE", "diana"]
Use L.map to normalize all names to proper case (first letter uppercase, rest
lowercase). You'll need to write a helper function that uses string-to-upper
on the first character and string-to-lower on the rest, and use
string-substring to extract the first character and the rest.
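As a hint for the last exercise, here is a small sketch of how string-substring splits a string into its first character and the remainder. The sample string and the names word, first-char and rest-chars are made up for illustration; the helper function itself is left for you to write.

```pyret
# Hypothetical sample string, not one of the student names.
word = "hello"

# string-substring takes a start index (inclusive) and an end index (exclusive).
first-char = string-substring(word, 0, 1)
rest-chars = string-substring(word, 1, string-length(word))

check "splitting a string":
  first-char is "h"
  rest-chars is "ello"
end
```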
Wrap-up (5 mins)
- Lists have many built-in operations -- we showed a few, but there are more at https://pyret.org/docs/latest/lists.html.