Day 7 - Introduction to tables

Skills: 2

Pre-reading: 4.1.1 & 4.1.2

Reference: For all work with tables, refer to the Tables page in the menu at the top of the page!

Intro (15 mins)

Goal: Learn about tabular data: creating tables literally, importing data, and extracting rows and cell values.

  • Many everyday pieces of data (like a workout journal, recipe index, or library catalog) are naturally represented as tables: a type of data with many rows, where each row has the same set of attributes, called columns.

  • Tables are values, just like numbers, strings, images, and booleans, and small ones can be directly typed into Pyret as:

    workouts = table: date :: String, activity :: String, duration :: Number
      row: "2025-04-01", "Running", 30
      row: "2025-04-02", "Yoga", 45
      row: "2025-04-03", "Cycling", 60
    end
  • Note that after table: comes a list of columns, with optional type annotations. This is followed by a sequence of rows, each of which must have exactly the columns listed at the beginning.

  • Since tables are values, they can be the input and output of functions, and can be used in examples. An important detail: when comparing tables for equality (like in test cases) the order of rows matters!

  • We can use check: ... end to write a set of tests not associated with a function, and use that to confirm that two tables with the same rows in a different order are not equal:

    check:
      table: date :: String, activity :: String, duration :: Number
        row: "2025-04-01", "Running", 30
        row: "2025-04-02", "Yoga", 45
        row: "2025-04-03", "Cycling", 60
      end
      is-not
      table: date :: String, activity :: String, duration :: Number
        row: "2025-04-03", "Cycling", 60
        row: "2025-04-01", "Running", 30
        row: "2025-04-02", "Yoga", 45
      end
    end
  • To deal with external files, we first need to include a Pyret library that is not enabled by default: csv, which handles tables stored as "comma-separated values" (CSV) files.

  • Then we can use load-table: rather than table:. Instead of listing the rows, we specify that they come from a CSV file (in this case from a URL; in HW and lab it will often be a file in the same project, loaded with csv-table-file).

    include csv
    recipes = load-table:
        title :: String,
        servings :: Number,
        prep-time :: Number
      source: csv-table-url("https://pdi.run/f25-2000-recipes.csv", default-options)
    end
  • IMPORTANT: by default, all CSV data are loaded as strings; to convert numeric data to numbers, we use "sanitizers". We will talk more about data cleaning on Day 11, but for any CSV that has numeric columns, you should add clauses after the source: line of the form sanitize column-name using num-sanitizer (each on its own line, with no commas between them). So the above should be written as:

    include csv
    recipes = load-table:
        title :: String,
        servings :: Number,
        prep-time :: Number
      source: csv-table-url("https://pdi.run/f25-2000-recipes.csv", default-options)
      sanitize servings using num-sanitizer
      sanitize prep-time using num-sanitizer
    end
  • In addition to printing the whole table (or a prefix, if the table is long), you can extract a row from it by writing table-identifier.row-n(N) for some N. The first row is numbered 0, and the last is numbered one less than the number of rows in the table.

    second-workout = workouts.row-n(1)
    # -> Row: date = "2025-04-02", activity = "Yoga", duration = 45
  • From a row, you can extract a column value using row-identifier["column-name"], e.g.,

    second-workout["activity"]  # -> "Yoga"
    # or all at once:
    workouts.row-n(1)["duration"] # -> 45
  • There are also built-in functions that compute results over a whole column. For example, mean takes a table and a column name and returns the average of all values in that column. median, modes, sum, and stdev exist as well!

    mean(workouts, "duration") # -> 45
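
Since tables are values, the row and cell extraction above can be wrapped in a function and tested like any other. The following is a minimal sketch, assuming the workouts table from earlier; activity-at is a hypothetical helper name chosen for illustration, not a built-in.

```pyret
# Same workouts table as earlier in these notes, repeated so this runs on its own.
workouts = table: date :: String, activity :: String, duration :: Number
  row: "2025-04-01", "Running", 30
  row: "2025-04-02", "Yoga", 45
  row: "2025-04-03", "Cycling", 60
end

# Hypothetical helper (not a built-in): look up the activity stored in row n.
fun activity-at(t :: Table, n :: Number) -> String:
  doc: "Return the value in the activity column of row n (rows are numbered from 0)"
  t.row-n(n)["activity"]
where:
  activity-at(workouts, 0) is "Running"
  activity-at(workouts, 2) is "Cycling"
end
```

The where: block plays the same role as the check: block shown earlier, but is attached to the function it tests.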

Class Exercise (40 mins)

  • Find a dataset on https://data.london.gov.uk/dataset?format=csv, click "Export" and then select "API Endpoint" and switch the data format to "CSV". Copy the URL, and create a table from it.
  • The number of columns in your table declaration must match the CSV; if it doesn't, Pyret will report an error. Fix the declaration and re-run.
  • Use the interactions window, .row-n, and the column extraction to explore the data a little bit.
  • Check the total number of rows with table-identifier.length(). What happens if you try table-identifier.row-n(M) for an M that is greater than or equal to the total number of rows?
  • Similarly, try extracting a column that doesn't exist.
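
As a starting point for the exploration above, here is a small sketch of how length and row-n fit together. It uses the workouts table from the intro as a stand-in for your own dataset; last-index and last-workout are illustrative names only.

```pyret
# Stand-in table; in the exercise, use the table you loaded from your CSV instead.
workouts = table: date :: String, activity :: String, duration :: Number
  row: "2025-04-01", "Running", 30
  row: "2025-04-02", "Yoga", 45
  row: "2025-04-03", "Cycling", 60
end

# Rows are numbered from 0, so the last valid index is length() - 1.
last-index = workouts.length() - 1
last-workout = workouts.row-n(last-index)

check:
  workouts.length() is 3
  last-workout["activity"] is "Cycling"
end
```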

Wrap-up (5 mins)

  • Tables are an extremely common and powerful form of data. Today we just learned how to look at them; in the upcoming days we will see how to program with them!