Homework 5 -- NYC Housing and School Analysis (Course Project Part 2)


Introduction

Building on your work with housing data from Homework 4, you'll now analyze the relationship between NYC housing patterns and school demographics using two CSV files:

  1. housing.csv, which you worked with in Homework 4 and has the following columns:
    • borough: The NYC borough (Manhattan, Brooklyn, Queens, Bronx, Staten Island)
    • zipcode: ZIP code of the property
    • address: Street address of the property
    • landuse: Type of land use (residential, commercial, etc.)
    • ownertype: Category of property ownership
    • lotarea: Total lot area in square feet
    • bldgarea: Total building area in square feet
    • latitude: Latitude coordinate of the property
    • longitude: Longitude coordinate of the property
  2. schools.csv, which represents NYC's public schools and has the following columns:
    • schoolname: Name of the school
    • latitude: Latitude coordinate of the school
    • longitude: Longitude coordinate of the school
    • address: Street address of the school
    • city: City in which the school is located
    • zip: ZIP code of the school
    • percentasian, percentblack, percenthispanic, percentblackhispanic, percentwhite: Demographics (as percentage strings like "16%")
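
Because the demographic columns arrive as strings like "16%", they need converting before any arithmetic. The following is a minimal illustration in Python (not necessarily the language used in this course); the helper name parse_percent is hypothetical.

```python
def parse_percent(text: str) -> float:
    """Convert a percentage string such as "16%" into the number 16.0."""
    # Drop surrounding whitespace and the trailing percent sign, then parse.
    return float(text.strip().rstrip("%"))


print(parse_percent("16%"))
```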

Problem 1

Load the NYC schools data from schools.csv into a table. Define the table as schools-table, with the appropriate column names and types.
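
As a rough illustration of what "appropriate column names and types" can mean, here is a Python sketch (not necessarily this course's language). The rows in SAMPLE_CSV are made up for the example, only a subset of the columns is shown, and load_schools is a hypothetical helper. It keeps zip as a string and converts the coordinates to numbers.

```python
import csv
import io

# Made-up sample rows for illustration only; the real data lives in schools.csv.
SAMPLE_CSV = """schoolname,latitude,longitude,address,city,zip,percentasian
Example School A,40.70,-73.90,1 Example St,Brooklyn,11201,16%
Example School B,40.80,-73.95,2 Example Ave,New York,10027,4%
"""


def load_schools(source):
    """Read school rows from a file-like object, converting coordinates to floats."""
    rows = []
    for row in csv.DictReader(source):
        row["latitude"] = float(row["latitude"])
        row["longitude"] = float(row["longitude"])
        # "zip" stays a string so that it compares cleanly with other ZIP strings.
        rows.append(row)
    return rows


schools = load_schools(io.StringIO(SAMPLE_CSV))
print(len(schools), schools[0]["schoolname"])
```

With a real file, the same helper could be called on an opened schools.csv instead of the in-memory sample.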

Problem 2

Part A

Find which ZIP codes have both housing and schools to identify complete neighborhoods. Extract the "zipcode" column from housing-table into a list called housing-zips and the "zip" column from schools-table into a list called school-zips. Create unique-housing-zips and unique-school-zips containing only distinct ZIP codes from each dataset.
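
The idea of reducing a list to its distinct values can be sketched as follows in Python, for illustration only. The ZIP codes are made-up sample values, and the snake_case names are hypothetical stand-ins for housing-zips and unique-housing-zips.

```python
# Made-up sample values standing in for the extracted "zipcode" column.
housing_zips = ["11201", "10027", "11201", "10451", "10027"]

# dict.fromkeys keeps the first occurrence of each value and preserves order.
unique_housing_zips = list(dict.fromkeys(housing_zips))

print(unique_housing_zips)
```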

Part B

Design a function has-both-housing-and-schools that takes a ZIP code and returns true if it appears in both unique-housing-zips and unique-school-zips. Use this function with the appropriate list operation to create complete-neighborhoods, containing the ZIP codes that have both housing and schools.
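
One way to picture the predicate-plus-list-operation pattern is the Python sketch below. It is illustrative only: it assumes filter is the list operation in question, and the ZIP lists are made-up sample values.

```python
# Made-up sample values; in the assignment these come from Part A.
unique_housing_zips = ["10001", "11201", "10027"]
unique_school_zips = ["11201", "10027", "10451"]


def has_both_housing_and_schools(zipcode: str) -> bool:
    """Return True when the ZIP code appears in both ZIP code lists."""
    return zipcode in unique_housing_zips and zipcode in unique_school_zips


# Keep only the ZIP codes for which the predicate holds.
complete_neighborhoods = list(filter(has_both_housing_and_schools, unique_housing_zips))

print(complete_neighborhoods)
```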

Problem 3

Analyze residential building sizes in areas with schools. First, filter housing-table to create residential-properties, containing only properties whose landuse code is "1", "2", or "3" (residential buildings). Next, extract the ZIP codes of these residential properties into residential-zips. Finally, use the appropriate list operation to create residential-zips-with-schools, containing only the residential ZIP codes that also appear in your school ZIP codes list (unique-school-zips) from Problem 2.
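
The three steps (filter rows, extract a column, keep matching values) might look like this in a Python sketch. The rows and ZIP codes are made up for illustration, only two of the housing columns are shown, and the snake_case names are hypothetical.

```python
RESIDENTIAL_CODES = {"1", "2", "3"}

# Made-up sample rows standing in for housing-table.
housing = [
    {"zipcode": "11201", "landuse": "1"},
    {"zipcode": "10001", "landuse": "5"},
    {"zipcode": "10451", "landuse": "3"},
    {"zipcode": "10027", "landuse": "2"},
]
unique_school_zips = ["11201", "10027"]

# Step 1: keep only rows with a residential landuse code.
residential_properties = [row for row in housing if row["landuse"] in RESIDENTIAL_CODES]

# Step 2: extract the ZIP code column from the filtered rows.
residential_zips = [row["zipcode"] for row in residential_properties]

# Step 3: keep only ZIP codes that also appear in the school ZIP list.
residential_zips_with_schools = [z for z in residential_zips if z in unique_school_zips]

print(residential_zips_with_schools)
```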

Problem 4

Part A

Calculate simple distances between housing and schools using their coordinates. First, extract the latitude and longitude coordinates from both datasets. Then, calculate the geographic centers:

  • center-housing-lat: Average latitude of all housing
  • center-school-lat: Average latitude of all schools
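
Each center is just the mean of a list of numbers. Here is a minimal Python sketch for illustration; the latitudes are made-up sample values and average is a hypothetical helper.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("cannot average an empty list")
    return sum(values) / len(values)


# Made-up sample latitudes standing in for the housing latitude column.
housing_lats = [40.5, 40.75, 41.0]
center_housing_lat = average(housing_lats)

print(center_housing_lat)
```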

Part B

Design a function simple-distance that takes two latitudes and returns their absolute difference. Use the appropriate list operation to create distances-from-housing-center, showing how far each school's latitude is from the average housing latitude (center-housing-lat).
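
Applying one function to every element of a list is a map-style operation. The Python sketch below is illustrative only: it assumes map is the intended list operation, and the center value and school latitudes are made up.

```python
def simple_distance(lat1: float, lat2: float) -> float:
    """Return the absolute difference between two latitudes."""
    return abs(lat1 - lat2)


# Made-up sample values for illustration.
center_housing_lat = 40.75
school_lats = [40.5, 41.0, 40.75]

# Map each school latitude to its distance from the housing center.
distances_from_housing_center = [
    simple_distance(lat, center_housing_lat) for lat in school_lats
]

print(distances_from_housing_center)
```

Note that this compares latitudes only, so two schools at very different longitudes can still have the same distance value.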