Day 12 - Task planning & visualization
Skills: None
Pre-reading: 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.2.6
Reference: For all work with tables, refer to the Tables page in the menu at the top of the page!
Intro (25 mins)
-
Often, what you want to do is big enough that jumping right into code is a mistake.
-
Instead, good idea to develop a task plan -- as text, pictures, etc.
-
A plan has concrete inputs & outputs, notes about what functions you might use, and a concrete sequence of steps and how the data is transformed through them.
-
You might repeat this process at a smaller level if one of the steps still seems complex.
-
Let's say we had the following table of weather data:
weather-data =
table: date, temperature, precipitation
row: "2025-01-01", 62, 0.1
row: "2025-01-02", "45", 3
row: "2025-01-03", 28, 0.2
row: "2025-01-04", 55, -1
row: "2025-01-05", 90, 0
endAnd we wanted to visualize the number of cold, mild, and hot days.
-
The output is going to be a chart, which has bars for counts of cold, mild, and hot days. Let's say cold is below 40F, mild is 40F to 60F, and hot is above 60F.
-
The steps we need to do are:
- Normalize the data -- in particular, the temperature column needs to be all numbers! While the precipitation column seems to have problems (a negative number) since our task doesn't have anything to do with precipitation, we don't need to deal with that.
- Create a new column with the three different bins -- cold, mild, or hot.
- Chart that new column.
-
For the first step, we can use
is-string
andstring-to-number
with.or-else
to convert the errant values. First, let's define a helper that does the conversion:fun normalize-temp(v) -> Number:
doc: "given a number or a string that represents a number, converts to a number"
if is-string(v):
string-to-number(v).or-else(0)
else:
v
end
where:
normalize-temp(10) is 10
normalize-temp("13") is 13
end -
string-to-number
returns a special type of value called anOption
, which either has a value in it, assome(N)
for some numberN
(indicating that, indeed, the conversion succeeded), or is the valuenone
, which means that there is no value -- in this case,none
means the string could not be successfully converted.Option
s are a special case of structured, conditional data which we will learn about in a few weeks, but here we use them in a particular way -- by accessing the.or-else(M)
functionality of theOption
so get back either the number inside thesome
(N
above), or what is passed to.or-else
(M
).Then we can use that to transform the column:
fixed-data = transform-column(weather-data, "temperature", normalize-temp)
-
For the second column, we can define a helper function that does the bucket conversion:
fun bucket-temp(t :: Number) -> String:
doc: "numbers < 40 turn into 'cold', >=40 and < 60 to 'mild' and >=60 to 'hot'"
if t < 40:
"cold"
else if t > 60:
"mild"
else:
"hot"
end
where:
bucket-temp(-10) is "cold"
bucket-temp(0) is "cold"
bucket-temp(39.9) is "cold"
bucket-temp(40) is "mild"
bucket-temp(58) is "mild"
bucket-temp(60) is "hot"
bucket-temp(100) is "hot"
endNow we can use that to add a new column:
with-buckets = build-column(fixed-data, "temp-category", lam(r): bucket-temp(r["temperature"]) end)
Note how I made my helper function just take a number, so it was easy to test, and then used a
lam
to pass tobuild-column
, which handled extracting the column value and passing it to the helper. I could have had the helper take in aRow
, but it would have make testing more complex. -
Now, our last task is to visualize the results. We can do this with
freq-bar-chart
:freq-bar-chart(with-buckets, "temp-category")
Class Activity (30 mins)
- Using the same data, visualize the number of days that were "dry" (no rain), "drizzly" (< 1" of rain), and "wet" (>= 1" of rain). It's up to you to decide how to handle the bad data! Think about what it might represent! First, make a plan!
- Consider you have the following table:
You want to have in your data, in addition to "full name", columns for given name (or first name) and family name (or last name). Make a plan for computing these two columns, thinking carefully about what different cases you might have to consider. Are all of the cases you thought of reflected in the example above? If not, add more examples to the data.
employees =
table: full-name, department
row: "Jordan Smith", "Sales"
row: "Alexandra Lee", "Engineering"
row: "Sam", "Marketing"
row: "Ng, Alice", "Operations"
end
Wrap up (5 mins)
- Planning is useful for all programming, and all good programmers do it -- probably, as you gain experience, simpler things you will plan only in your head, and more complex things you will plan on paper (or on whiteboards, etc). But the process of planning remains the same, whatever the scale!