Homework 5 -- NYC Housing and School Analysis (Course Project Part 2)
Skills Practiced:
Introduction
Building on your work with housing data from Homework 4, you'll now analyze the relationship between NYC housing patterns and school demographics. You'll be working with two CSV files:
- housing.csv, which you worked with in Homework 4 and has the following columns:
  - borough: The NYC borough (Manhattan, Brooklyn, Queens, Bronx, Staten Island)
  - zipcode: ZIP code of the property
  - address: Street address of the property
  - landuse: Type of land use (residential, commercial, etc.)
  - ownertype: Category of property ownership
  - lotarea: Total lot area in square feet
  - bldgarea: Total building area in square feet
  - latitude: Latitude coordinate of the property
  - longitude: Longitude coordinate of the property
- schools.csv, which represents NYC's public schools and has the following columns:
  - schoolname: Name of the school
  - latitude: Latitude coordinate of the school
  - longitude: Longitude coordinate of the school
  - address: Street address of the school
  - city: City in which the school is located
  - zip: ZIP code of the school
  - percentasian, percentblack, percenthispanic, percentblackhispanic, percentwhite: Demographics (as percentage strings like "16%")
Problem 1
Load the NYC schools data from the CSV file into a table. Define a table called schools-table with the appropriate column names and types.
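As a rough illustration of what "column names and types" can mean here, the sketch below reads school rows and converts each column. It is written in Python purely for illustration, with snake_case stand-ins for the hyphenated names. The column types, the presence of a header row, the load_schools helper, and the sample row are all assumptions rather than part of the assignment.

```python
import csv
import io

# Column names come from the assignment; the chosen types are assumptions.
SCHOOL_COLUMNS = {
    "schoolname": str,
    "latitude": float,
    "longitude": float,
    "address": str,
    "city": str,
    "zip": str,  # kept as text so it compares cleanly with other ZIP codes
    "percentasian": str,  # percentages arrive as strings like "16%"
    "percentblack": str,
    "percenthispanic": str,
    "percentblackhispanic": str,
    "percentwhite": str,
}


def load_schools(source):
    """Read rows from an open CSV source (assumed to have a header row)."""
    reader = csv.DictReader(source)
    return [
        {name: convert(row[name]) for name, convert in SCHOOL_COLUMNS.items()}
        for row in reader
    ]


# Made-up sample row for illustration only; it is not real school data.
sample_csv = io.StringIO(
    "schoolname,latitude,longitude,address,city,zip,"
    "percentasian,percentblack,percenthispanic,percentblackhispanic,percentwhite\n"
    "Example School,40.70,-73.90,1 Example St,Brooklyn,11201,16%,20%,30%,50%,30%\n"
)
schools_table = load_schools(sample_csv)
```

With the real file, the same helper could be called on `open("schools.csv", newline="")` instead of the in-memory sample.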
Problem 2
Part A
Find which ZIP codes have both housing and schools to identify complete neighborhoods. Extract the "zipcode" column from housing-table into a list called housing-zips and the "zip" column from schools-table into a list called school-zips. Create unique-housing-zips and unique-school-zips containing only distinct ZIP codes from each dataset.
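The column extraction and de-duplication steps might look roughly like the following Python sketch. The distinct helper and the tiny in-memory tables are hypothetical stand-ins, and the ZIP codes are placeholders rather than values from the real files.

```python
def distinct(values):
    """Return the values with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Placeholder rows standing in for housing-table and schools-table.
housing_table = [{"zipcode": "11201"}, {"zipcode": "10027"}, {"zipcode": "11201"}]
schools_table = [{"zip": "10027"}, {"zip": "10453"}, {"zip": "10027"}]

# Extract one column from each table as a plain list.
housing_zips = [row["zipcode"] for row in housing_table]
school_zips = [row["zip"] for row in schools_table]

# Keep only one copy of each ZIP code.
unique_housing_zips = distinct(housing_zips)
unique_school_zips = distinct(school_zips)
```

Keeping first-seen order is a design choice in this sketch; the assignment only asks that the resulting lists contain distinct ZIP codes.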
Part B
Design a function has-both-housing-and-schools that takes a ZIP code and returns true if it appears in both ZIP code lists. Use this function with the appropriate list operation to create complete-neighborhoods containing ZIP codes that have both housing and schools.
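One way to picture the predicate-plus-filter pattern described here is the Python sketch below. The two ZIP code lists are placeholders, and filtering over the housing list (rather than the school list) is an assumption of this sketch, not something the assignment specifies.

```python
# Placeholder lists standing in for the Part A results.
unique_housing_zips = ["11201", "10027", "10002"]
unique_school_zips = ["10027", "10453", "10002"]


def has_both_housing_and_schools(zipcode):
    """Return True when the ZIP code appears in both ZIP code lists."""
    return zipcode in unique_housing_zips and zipcode in unique_school_zips


# Filter keeps only the elements for which the predicate returns True.
complete_neighborhoods = list(
    filter(has_both_housing_and_schools, unique_housing_zips)
)
```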
Problem 3
Analyze residential building sizes in areas with schools by filtering housing-table to create residential-properties containing only properties with landuse codes "1", "2", or "3" (residential buildings). Extract the ZIP codes from these residential properties into residential-zips. Use the appropriate list operation to create residential-zips-with-schools containing only residential ZIP codes that also appear in your school ZIP codes list (unique-school-zips) from Problem 2.
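The three steps in this problem (filter rows, extract a column, filter the list) could be sketched in Python as follows. The rows and ZIP codes are placeholders, and storing landuse as a string is an assumption based on the quoted codes in the problem text.

```python
# Placeholder rows standing in for housing-table.
housing_table = [
    {"zipcode": "11201", "landuse": "1"},
    {"zipcode": "10027", "landuse": "5"},
    {"zipcode": "10002", "landuse": "3"},
    {"zipcode": "10453", "landuse": "2"},
]
# Placeholder standing in for the Problem 2 result.
unique_school_zips = ["10002", "10453"]

RESIDENTIAL_CODES = {"1", "2", "3"}

# Step 1: keep only rows whose landuse code marks a residential building.
residential_properties = [
    row for row in housing_table if row["landuse"] in RESIDENTIAL_CODES
]

# Step 2: extract the ZIP code column from the filtered rows.
residential_zips = [row["zipcode"] for row in residential_properties]

# Step 3: keep only ZIP codes that also appear in the school ZIP list.
residential_zips_with_schools = [
    zipcode for zipcode in residential_zips if zipcode in unique_school_zips
]
```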
Problem 4
Part A
Calculate simple distances between housing and schools using coordinates by extracting the latitude and longitude coordinates from both datasets. Then, calculate the geographic centers:
- center-housing-lat: Average latitude of all housing
- center-school-lat: Average latitude of all schools
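As a small illustration of the averaging step, the Python sketch below computes the two centers from short placeholder latitude lists. The values are made up, and the lists stand in for the latitude columns extracted from each dataset.

```python
from statistics import mean

# Placeholder latitude lists; real values would come from the two tables.
housing_lats = [40.70, 40.80]
school_lats = [40.60, 40.90, 40.75]

# The geographic center in latitude is the arithmetic mean of the latitudes.
center_housing_lat = mean(housing_lats)
center_school_lat = mean(school_lats)
```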
Part B
Design a function simple-distance that takes two latitudes and returns their absolute difference. Use the appropriate list operation to create distances-from-housing-center showing how far each individual school is from the average housing location.
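The mapping step might be pictured with the Python sketch below, which applies a two-argument distance function to every school latitude. The center value and school latitudes are placeholders chosen so the arithmetic is easy to check by hand.

```python
def simple_distance(lat1, lat2):
    """Return the absolute difference between two latitudes."""
    return abs(lat1 - lat2)


# Placeholder values standing in for the Part A results.
center_housing_lat = 40.75
school_lats = [40.5, 41.0, 40.75]

# Map: apply simple_distance to each school latitude against the center.
distances_from_housing_center = [
    simple_distance(lat, center_housing_lat) for lat in school_lats
]
```

This measures distance in degrees of latitude only, matching the "simple distance" the problem describes; it is not a true geographic distance.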