Skip to main content

Day 32 - File I/O

Skills: None

Pre-reading: NEED TO WRITE! -- New section in DCIC about manual file I/O, to supplement existing content loading CSVs from files.

Intro (10 mins)

  • Today we’ll learn how to read and write CSV files "by hand" in Python, using only basic file I/O and string methods—no Pandas.
  • This helps you understand what's happening under the hood when you use high-level libraries.
  • Example: Reading a CSV file and filtering for dine-in orders.
    # Read the file
    # First download the file from the URL:
    # https://pdi.run/f25-restaurant-orders.csv
    # and save it as 'restaurant_orders.csv' in the same directory as this file.
    file = open('restaurant_orders.csv', 'r')
    lines = file.readlines()
    file.close()

    # Convert lines to list of lists
    data = []
    for line in lines:
    cells = line.strip().split(',')
    data.append(cells)

    # Separate header and data
    headers = data[0]
    orders = data[1:]

    # Convert quantity to int and filter for dine-in
    dine_in_orders = []
    for row in orders:
    row[2] = int(row[2])
    if row[3] == 'dine-in':
    dine_in_orders.append(row)

    # Write filtered data to a new CSV
    output = [headers] + dine_in_orders
    out_file = open('dine_in_orders.csv', 'w')
    for row in output:
    out_file.write(','.join([str(cell) for cell in row]) + '\n')
    out_file.close()
  • All of these steps—reading, splitting, cleaning, filtering, and writing—are done manually here, but are automated by Pandas.

Class Exercises (35 mins)

  • Filter the data to include only orders where order_type is "takeout". Print the filtered rows.
  • Write the filtered data (including the header) to a new CSV file called takeout_orders.csv.
  • What happens if you forget to close the file after writing? Try it and see if you get an error or warning.
  • Add a new column to each row called total_price (assume each dish costs $10). Write the updated data to a new CSV file.
  • Try reading a CSV file that is missing a column in one row. What happens? How could you handle this?
  • Change your code to skip rows that are missing data, and write only the complete rows to a new file.
  • Write code to count the total number of orders in the file (excluding the header).
  • Write code to count how many orders are for each unique order_type (e.g., "dine-in", "takeout").
  • Write code to find the dish with the highest total quantity ordered across all rows.
  • Write code to filter for orders where quantity is greater than 2, and write these rows (with the header) to a new CSV file called large_orders.csv.
  • Write code to compute the total revenue (sum of total_price) for all orders, assuming each dish costs $10.
  • Write code to find and print all unique dish names in the file.
  • Write code to sort the orders by quantity (largest first) and write the sorted data (with header) to a new CSV file.
  • Write code to handle a file where some rows are missing the order_type column: skip those rows and print a warning for each skipped row.
  • (Optional) Write code to add a new column called order_label that is "big" if quantity ≥ 3, "medium" if quantity is 2, and "small" otherwise, and write the updated data to a new CSV file.

Wrap-up (5 mins)

  • Manual file I/O gives you insight into how data is read, cleaned, filtered, and written in Python.
  • Libraries like Pandas automate these steps, but it’s important to understand what’s happening behind the scenes.