Day 12 - Task planning & visualization

Skills: None

Pre-reading: 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.2.6

Reference: For all work with tables, refer to the Tables page in the menu at the top of the page!

Intro (25 mins)

  • Often, what you want to do is big enough that jumping right into code is a mistake.

  • Instead, it's a good idea to develop a task plan first, as text, pictures, etc.

  • A plan has concrete inputs & outputs, notes about what functions you might use, and a concrete sequence of steps that shows how the data is transformed at each one.

  • You might repeat this process at a smaller level if one of the steps still seems complex.

  • Let's say we had the following table of weather data:

    weather-data =
      table: date, temperature, precipitation
        row: "2025-01-01", 62, 0.1
        row: "2025-01-02", "45", 3
        row: "2025-01-03", 28, 0.2
        row: "2025-01-04", 55, -1
        row: "2025-01-05", 90, 0
      end

    And we wanted to visualize the number of cold, mild, and hot days.

  • The output is going to be a chart, which has bars for counts of cold, mild, and hot days. Let's say cold is below 40F, mild is 40F up to (but not including) 60F, and hot is 60F and above.

  • The steps we need to do are:

    1. Normalize the data: in particular, the temperature column needs to be all numbers! The precipitation column also seems to have problems (a negative number), but since our task doesn't have anything to do with precipitation, we don't need to deal with that.
    2. Create a new column with the three different bins -- cold, mild, or hot.
    3. Chart that new column.
  • For the first step, we can use is-string and string-to-number with .or-else to convert the errant values. First, let's define a helper that does the conversion:

    fun normalize-temp(v) -> Number:
      doc: "given a number or a string that represents a number, converts to a number"
      if is-string(v):
        string-to-number(v).or-else(0)
      else:
        v
      end
    where:
      normalize-temp(10) is 10
      normalize-temp("13") is 13
    end
  • string-to-number returns a special type of value called an Option. An Option is either some(N) for some number N, which indicates that the conversion succeeded, or the value none, which means there is no value; in this case, none means the string could not be converted to a number. Options are a special case of structured, conditional data, which we will learn about in a few weeks. Here we use them in one particular way: calling .or-else(M) on the Option gives back either the number inside the some (N above) or, if the Option is none, the value passed to .or-else (M).

    Then we can use that to transform the column:

    fixed-data = transform-column(weather-data, "temperature", normalize-temp)
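
    To make the Option behavior described above concrete, here is a small sketch written as a check block. The example strings are made up for illustration, and it assumes string-to-number is available in your context, as it is in the code above.

    ```pyret
    # Illustrative only: the example strings are not from the weather table.
    check "string-to-number returns an Option":
      # A string that represents a number converts to some(...)
      string-to-number("45") is some(45)
      # A string that does not represent a number converts to none
      string-to-number("forty-five") is none
      # .or-else unwraps the some, or falls back to its argument on none
      string-to-number("45").or-else(0) is 45
      string-to-number("forty-five").or-else(0) is 0
    end
    ```

    Note that with a fallback of 0, a temperature string that cannot be converted would become 0 and later be counted as a cold day. Whether that is the right choice depends on the task.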
  • For the second step, we can define a helper function that does the bucket conversion:

    fun bucket-temp(t :: Number) -> String:
      doc: "numbers < 40 turn into 'cold', >=40 and < 60 to 'mild' and >=60 to 'hot'"
      if t < 40:
        "cold"
      else if t < 60:
        "mild"
      else:
        "hot"
      end
    where:
      bucket-temp(-10) is "cold"
      bucket-temp(0) is "cold"
      bucket-temp(39.9) is "cold"
      bucket-temp(40) is "mild"
      bucket-temp(58) is "mild"
      bucket-temp(60) is "hot"
      bucket-temp(100) is "hot"
    end

    Now we can use that to add a new column:

    with-buckets = build-column(fixed-data, "temp-category", lam(r): bucket-temp(r["temperature"]) end)

    Note how I made my helper function just take a number, so it was easy to test, and then used a lam to pass to build-column, which handled extracting the column value and passing it to the helper. I could have had the helper take in a Row, but that would have made testing more complex.
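
    For comparison, here is a rough sketch of what the Row-taking alternative might look like. The name bucket-temp-row and the one-row sample table are hypothetical, and the `use context dcic2024` line is an assumption about your setup. bucket-temp is repeated, with the thresholds from its doc string, only so the block runs on its own.

    ```pyret
    use context dcic2024  # assumption: adjust to the context your course uses

    fun bucket-temp(t :: Number) -> String:
      doc: "numbers < 40 turn into 'cold', >=40 and < 60 to 'mild' and >=60 to 'hot'"
      if t < 40: "cold"
      else if t < 60: "mild"
      else: "hot"
      end
    end

    fun bucket-temp-row(r :: Row) -> String:
      doc: "hypothetical alternative: takes a whole Row instead of a Number"
      bucket-temp(r["temperature"])
    where:
      # Each test now needs a table, just so there is a Row to pass in
      sample =
        table: date, temperature
          row: "2025-01-03", 28
        end
      bucket-temp-row(sample.row-n(0)) is "cold"
    end
    ```

    Every test case for bucket-temp-row needs a table row built around the one number we care about, whereas bucket-temp(28) is "cold" fits on one line.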

  • Now, our last task is to visualize the results. We can do this with freq-bar-chart:

    freq-bar-chart(with-buckets, "temp-category")
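
    Before trusting a chart, it can help to check the counts it should show. The sketch below is one possible way to do that, under some assumptions: the `use context dcic2024` line, the helper name count-category, and the small stand-in table are all hypothetical. The stand-in lists the categories that the five example temperatures (62, 45, 28, 55, 90) fall into when cold is below 40 and hot is 60 and above.

    ```pyret
    use context dcic2024  # assumption: adjust to the context your course uses

    # Stand-in for with-buckets, keeping only the columns needed here
    sample-buckets =
      table: date, temp-category
        row: "2025-01-01", "hot"
        row: "2025-01-02", "mild"
        row: "2025-01-03", "cold"
        row: "2025-01-04", "mild"
        row: "2025-01-05", "hot"
      end

    fun count-category(t :: Table, category :: String) -> Number:
      doc: "hypothetical helper: number of rows whose temp-category equals category"
      filter-with(t, lam(r): r["temp-category"] == category end).length()
    end

    check "bar heights we expect in the chart":
      count-category(sample-buckets, "cold") is 1
      count-category(sample-buckets, "mild") is 2
      count-category(sample-buckets, "hot") is 2
    end
    ```

    If the bars in the chart do not match these counts, the problem is likely in the normalizing or bucketing step rather than in the chart itself.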

Class Activity (30 mins)

  • Using the same data, visualize the number of days that were "dry" (no rain), "drizzly" (< 1" of rain), and "wet" (>= 1" of rain). It's up to you to decide how to handle the bad data! Think about what it might represent! First, make a plan!
  • Suppose you have the following table:
    employees =
      table: full-name, department
        row: "Jordan Smith", "Sales"
        row: "Alexandra Lee", "Engineering"
        row: "Sam", "Marketing"
        row: "Ng, Alice", "Operations"
      end
    You want to have in your data, in addition to "full name", columns for given name (or first name) and family name (or last name). Make a plan for computing these two columns, thinking carefully about what different cases you might have to consider. Are all of the cases you thought of reflected in the example above? If not, add more examples to the data.

Wrap up (5 mins)

  • Planning is useful for all programming, and all good programmers do it. As you gain experience, you will probably plan simpler things only in your head and more complex things on paper (or on whiteboards, etc.). But the process of planning remains the same, whatever the scale!