Homework 10 -- Urban Planning Analytics (Course Project Part 3)
Skills Practiced:
*elaine note: this seems hella long but idk exactly which problems to cut off
Introduction
You'll analyze NYC housing development opportunities by working with 3 interconnected datasets to identify optimal locations for new housing. You'll evaluate these opportunities using key urban planning factors: proximity to schools for families and access to public transit for citywide connectivity.
You'll be working with 3 CSV files, 2 of which you've worked with before:
- housing.csv, which you worked with in Homework 4 and 5 and has the following columns:
- borough: The NYC borough (Manhattan, Brooklyn, Queens, Bronx, Staten Island)
- zipcode: ZIP code of the property
- address: Street address of the property
- landuse: Type of land use (residential, commercial, etc.)
- ownertype: Category of property ownership
- lotarea: Total lot area in square feet
- bldgarea: Total building area in square feet
- latitude: Latitude coordinate of the property
- longitude: Longitude coordinate of the property
- schools.csv, which you worked with in Homework 5 and has the following columns:
- schoolname: Name of the school
- latitude: Latitude coordinate of the school
- longitude: Longitude coordinate of the school
- address: Street address of the school
- city: City in which the school is located
- zip: ZIP code of the school
- percentasian, percentblack, percenthispanic, percentblackhispanic, percentwhite: Demographics (as percentage strings like "16%")
- stations.csv, which represents NYC subway stations:
- division: Transit division (BMT, IRT, IND)
- line: Subway line name
- stopname: Station name
- latitude: Latitude of the station
- longitude: Longitude of the station
Problem 1
Load the NYC stations data from the CSV file into a table. Define a table called stations-table
with the appropriate column names and types.
Problem 2
Create structured data types to represent development opportunities and analyze public property using recursion with transit accessibility.
Define a data type called PropertyOpportunity
that contains:
- address: String
- zipcode: String
- lot-area: Number
- building-area: Number
- distance-to-school: Number
- distance-to-transit: Number
- is-public: Boolean
Problem 3
Part A
Design a function called is-underutilized-public
that takes a row from the housing table and returns true if:
- The owner type is "C" (city-owned) OR "O" (other public), and
- The building area is less than 10% of the lot area (severely underutilized)
Part B
Design a function calculate-transit-distance
that takes property coordinates and returns the distance to the nearest subway station using the simple-distance
function you defined in Homework 5.
Part C
Create a function row-to-opportunity
that converts a housing table row into a PropertyOpportunity
, calculating distances to both schools and transit.
Part D
Create a table public-opportunities
that contains only the underutilized public properties from the housing table. Use the is-underutilized-public
function you created in Part A to filter the data. Then, transform the public-opportunities table by adding columns for distance calculations in a new table called opportunities-with-distances
that includes:
- All original columns from public-opportunities
- A school-distance column calculated using the distance to the center of schools
- A transit-distance column calculated using your
calculate-transit-distance
function
To visualize your data, create a scatterplot showing the relationship between school-distance (x-axis) and transit-distance (y-axis) for these underutilized public properties. This will help identify properties that are well-positioned relative to both schools and transit.
Problem 4
Part A
Define a data type called MultiFamily
with:
- address: String
- zipcode: String
- building-area: Number
- tax-value-ratio: Number (building area ÷ lot area)
- transit-accessible: Boolean (within reasonable distance of subway)
Part B
Design a function is-transit-accessible
that takes property coordinates and returns true if the property is within 0.01 coordinate units of any subway station.
Part C
Design a function is-multifamily-residential
that returns true for landuse codes "2" or "3" (multi-family buildings).
Part D
Design a recursive function count-undervalued-near-transit
that takes your list of MultiFamily records and counts how many properties have both a tax-value-ratio < 0.5 (undervalued) AND are transit accessible.