Day 11 - Task planning & visualization
TODO: This requires Options... which I had wanted to defer, but maybe it's impossible. Normalization involves the possibility of failure...
Skills: None
Pre-reading: 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.2.6
Intro (25 min)
- Often, what you want to do is big enough that jumping right into code is a mistake.
- Instead, good idea to develop a task plan -- as text, pictures, etc.
- A plan has concrete inputs & outputs, notes about what functions you might use, and a concrete sequence of steps and how the data is transformed through them.
- You might repeat this process at a smaller level if one of the steps still seems complex.
- Let's say we had the following table of weather data:
And we wanted to visualize the number of cold, mild, and hot days.
weather-data =
table: date, temperature, precipitation
row: "2025-01-01", 62, 0.1
row: "2025-01-02", "45", 3
row: "2025-01-03", 28, 0.2
row: "2025-01-04", 55, -1
row: "2025-01-05", 90, 0
end - The output is going to be a chart, which has bars for counts of cold, mild, and hot days. Let's say cold is below 40F, mild is 40F to 60F, and hot is above 60F.
- The steps we need to do are:
- Normalize the data -- in particular, the temperature column needs to be all numbers! While the precipitation column seems to have problems (a negative number) since our task doesn't have anything to do with precipitation, we don't need to deal with that.
- Create a new column with the three different bins -- cold, mild, or hot.
- Chart that new column.
- For the first step, we can use
is-string
andstring-to-number
with.or-else
to convert the errant values. First, let's define a helper that does the conversion:Then we can use that to transform the column:fun normalize-temp(v) -> Number:
doc: "given a number or a string that represents a number, converts to a number"
if is-string(t):
string-to-number(t).or-else(0)
else:
t
end
where:
normalize-temp(10) is 10
normalize-temp("13") is 13
endfixed-data = transform-column(weather-data, "temperature", normalize-temp)
- For the second column, we can define a helper function that does the bucket conversion:
Now we can use that to add a new column:
fun bucket-temp(t :: Number) -> String:
doc: "numbers < 40 turn into 'cold', >=40 and < 60 to 'mild' and >=60 to 'hot'"
if t < 40:
"cold"
else if t > 60:
"mild"
else:
"hot"
end
where:
bucket-temp(-10) is "cold"
bucket-temp(0) is "cold"
bucket-temp(39.9) is "cold"
bucket-temp(40) is "mild"
bucket-temp(58) is "mild"
bucket-temp(60) is "hot"
bucket-temp(100) is "hot"
endNote how I made my helper function just take a number, so it was easy to test, and then used awith-buckets = build-column(fixed-data, "temp-category", lam(r): bucket-temp(r["temperature"]) end)
lam
to pass tobuild-column
, which handled extracting the column value and passing it to the helper. I could have had the helper take in aRow
, but it would have make testing more complex. - Now, our last task is to visualize the results. We can do this with
freq-bar-chart
:freq-bar-chart(with-buckets, "temp-category")
Class Activity (30 min)
- Using the same data, visualize the number of days that were "dry" (no rain), "drizzly" (< 1" of rain), and "wet" (>= 1" of rain). It's up to you to decide how to handle the bad data! Think about what it might represent! First, make a plan!
- Consider you have the following table:
You want to have in your data, in addition to "full name", columns for given name (or first name) and family name (or last name).
employees =
table: full-name, department
row: "Jordan Smith", "Sales"
row: "Alexandra Lee", "Engineering"
row: "Sam", "Marketing"
row: "Ng, Alice", "Operations"
end
Wrap up (5 mins)
- Planning is useful for all programming, and all good programmers do it -- probably, as you gain experience, simpler things you will plan only in your head, and more complex things you will plan on paper (or on whiteboards, etc). But the process of planning remains the same, whatever the scale!