Day 32 - File I/O
Skills: None
Pre-reading: NEED TO WRITE! -- New section in DCIC about manual file I/O, to supplement existing content loading CSVs from files.
Intro (10 mins)
- Today we’ll learn how to read and write CSV files "by hand" in Python, using only basic file I/O and string methods—no Pandas.
- This helps you understand what's happening under the hood when you use high-level libraries.
- Example: Reading a CSV file and filtering for dine-in orders.
# Read the file
# First download the file from the URL:
# https://pdi.run/f25-restaurant-orders.csv
# and save it as 'restaurant_orders.csv' in the same directory as this file.
file = open('restaurant_orders.csv', 'r')
lines = file.readlines()
file.close()
# Convert lines to list of lists
data = []
for line in lines:
cells = line.strip().split(',')
data.append(cells)
# Separate header and data
headers = data[0]
orders = data[1:]
# Convert quantity to int and filter for dine-in
dine_in_orders = []
for row in orders:
row[2] = int(row[2])
if row[3] == 'dine-in':
dine_in_orders.append(row)
# Write filtered data to a new CSV
output = [headers] + dine_in_orders
out_file = open('dine_in_orders.csv', 'w')
for row in output:
out_file.write(','.join([str(cell) for cell in row]) + '\n')
out_file.close() - All of these steps—reading, splitting, cleaning, filtering, and writing—are done manually here, but are automated by Pandas.
Class Exercises (35 mins)
- Filter the data to include only orders where
order_type
is"takeout"
. Print the filtered rows. - Write the filtered data (including the header) to a new CSV file called
takeout_orders.csv
. - What happens if you forget to close the file after writing? Try it and see if you get an error or warning.
- Add a new column to each row called
total_price
(assume each dish costs $10). Write the updated data to a new CSV file. - Try reading a CSV file that is missing a column in one row. What happens? How could you handle this?
- Change your code to skip rows that are missing data, and write only the complete rows to a new file.
- Write code to count the total number of orders in the file (excluding the header).
- Write code to count how many orders are for each unique
order_type
(e.g., "dine-in", "takeout"). - Write code to find the dish with the highest total quantity ordered across all rows.
- Write code to filter for orders where
quantity
is greater than 2, and write these rows (with the header) to a new CSV file calledlarge_orders.csv
. - Write code to compute the total revenue (sum of
total_price
) for all orders, assuming each dish costs $10. - Write code to find and print all unique dish names in the file.
- Write code to sort the orders by
quantity
(largest first) and write the sorted data (with header) to a new CSV file. - Write code to handle a file where some rows are missing the
order_type
column: skip those rows and print a warning for each skipped row. - (Optional) Write code to add a new column called
order_label
that is"big"
ifquantity
≥ 3,"medium"
ifquantity
is 2, and"small"
otherwise, and write the updated data to a new CSV file.
Wrap-up (5 mins)
- Manual file I/O gives you insight into how data is read, cleaned, filtered, and written in Python.
- Libraries like Pandas automate these steps, but it’s important to understand what’s happening behind the scenes.