Homework 10 -- Urban Planning Analytics (Course Project Part 3)
Skills: 2, 4, 12
Due
Thursday, November 13, 2025 at 6PM (Oakland) or 9PM (Boston)
Introduction
You'll analyze NYC housing development opportunities by working with 3 interconnected datasets to identify optimal locations for new housing. You'll evaluate these opportunities using key urban planning factors: proximity to schools for families and access to public transit for citywide connectivity.
You'll be working with 3 CSV files, 2 of which you've worked with before:
- housing.csv, which you worked with in Homework 4 and 5 and has the following columns:
- borough: The NYC borough (Manhattan, Brooklyn, Queens, Bronx, Staten Island)
- zipcode: ZIP code of the property
- address: Street address of the property
- landuse: Type of land use (residential, commercial, etc.)
- ownertype: Category of property ownership
- lotarea: Total lot area in square feet
- bldgarea: Total building area in square feet
- latitude: Latitude coordinate of the property
- longitude: Longitude coordinate of the property
- schools.csv, which you worked with in Homework 5 and has the following columns:
- schoolname: Name of the school
- latitude: Latitude coordinate of the school
- longitude: Longitude coordinate of the school
- address: Street address of the school
- city: City in which the school is located
- zip: ZIP code of the school
- percentasian, percentblack, percenthispanic, percentblackhispanic, percentwhite: Demographics (as percentage strings like "16%")
- stations.csv, which represents NYC subway stations:
- division: Transit division (BMT, IRT, IND)
- line: Subway line name
- stopname: Station name
- latitude: Latitude of the station
- longitude: Longitude of the station
Problem 1
Load the NYC stations data from the CSV file into a table. Define a table called stations-table with the appropriate column names and types.
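As a rough illustration of what "appropriate column names and types" means, here is a minimal sketch in Python (an assumption, since the handout does not name a language or include starter code). It parses CSV text into rows with named columns and numeric coordinates. The helper `load_stations`, the snake_case name `stations_table`, and the sample rows are hypothetical; the coordinates are made up and not taken from stations.csv.

```python
import csv
import io

# Made-up sample in the stations.csv column layout (not real data).
SAMPLE_CSV = """division,line,stopname,latitude,longitude
BMT,Example Line,Example Stop A,40.70,-73.90
IRT,Example Line,Example Stop B,40.80,-73.95
"""

def load_stations(source):
    """Read station rows from a text stream, converting coordinates to floats."""
    rows = []
    for raw in csv.DictReader(source):
        rows.append({
            "division": raw["division"],            # text column
            "line": raw["line"],                    # text column
            "stopname": raw["stopname"],            # text column
            "latitude": float(raw["latitude"]),     # numeric column
            "longitude": float(raw["longitude"]),   # numeric column
        })
    return rows

# Hypothetical stand-in for the stations-table named in the problem.
stations_table = load_stations(io.StringIO(SAMPLE_CSV))
```

The point of the sketch is the column typing: the three descriptive columns stay as text, while latitude and longitude are converted to numbers so later distance calculations can use them directly.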
Problem 2
Create structured data types to represent development opportunities, then use them (together with recursion) to analyze public property and its transit accessibility.
Define a data type called PropertyOpportunity that contains:
- address: String
- zipcode: String
- lot-area: Number
- building-area: Number
- distance-to-school: Number
- distance-to-transit: Number
- is-public: Boolean
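The field list above maps onto a record type with one typed field per entry. A minimal sketch, assuming Python dataclasses and snake_case field names (Python identifiers cannot contain hyphens); the example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PropertyOpportunity:
    """One candidate property, mirroring the field list in the problem."""
    address: str
    zipcode: str
    lot_area: float
    building_area: float
    distance_to_school: float
    distance_to_transit: float
    is_public: bool

# Hypothetical example value, for illustration only.
example = PropertyOpportunity(
    address="1 Example St",
    zipcode="00001",
    lot_area=5000,
    building_area=400,
    distance_to_school=0.02,
    distance_to_transit=0.005,
    is_public=True,
)
```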
Problem 3
Part A
Design a function called is-underutilized-public that takes a row from the housing table and returns true if:
- The owner type is "C" (city-owned) OR "O" (other public), and
- The building area is less than 10% of the lot area (severely underutilized)
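The two conditions combine into a single predicate. The sketch below assumes, purely for illustration, that a row is a Python dictionary keyed by the housing.csv column names; the sample rows in the usage lines are made up.

```python
# Owner types the problem treats as public: "C" (city-owned), "O" (other public).
PUBLIC_OWNER_TYPES = {"C", "O"}

def is_underutilized_public(row):
    """True when the row is publicly owned AND built on under 10% of its lot."""
    is_public = row["ownertype"] in PUBLIC_OWNER_TYPES
    severely_underbuilt = row["bldgarea"] < 0.10 * row["lotarea"]
    return is_public and severely_underbuilt

# Hypothetical rows for illustration only.
city_lot = {"ownertype": "C", "lotarea": 10000, "bldgarea": 500}
private_lot = {"ownertype": "P", "lotarea": 10000, "bldgarea": 500}
built_out_lot = {"ownertype": "O", "lotarea": 10000, "bldgarea": 2000}
```

Note that both conditions must hold: a private lot with little building area fails the first test, and a public lot that is already built out fails the second.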
Part B
Design a function calculate-transit-distance that takes property coordinates and returns the distance to the nearest subway station using the simple-distance function you defined in Homework 5.
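Finding the nearest station amounts to taking the minimum distance over all stations. The sketch below is in Python and rests on a loud assumption: the Homework 5 definition of simple-distance is not shown here, so `simple_distance` is a stand-in that uses straight-line distance in coordinate units, which may differ from your own definition. The station list is passed as a parameter, and the coordinates in the usage lines are made up.

```python
import math

def simple_distance(lat1, lon1, lat2, lon2):
    """ASSUMED stand-in for the Homework 5 function: straight-line distance."""
    return math.hypot(lat1 - lat2, lon1 - lon2)

def calculate_transit_distance(lat, lon, stations):
    """Distance from (lat, lon) to the closest station in a non-empty list."""
    return min(
        simple_distance(lat, lon, s["latitude"], s["longitude"])
        for s in stations
    )

# Made-up coordinates chosen so the arithmetic is easy to check by hand.
sample_stations = [
    {"stopname": "Far", "latitude": 6.0, "longitude": 8.0},    # distance 10
    {"stopname": "Near", "latitude": 3.0, "longitude": 4.0},   # distance 5
]
nearest = calculate_transit_distance(0.0, 0.0, sample_stations)
```

With these sample values `nearest` is 5.0, the distance to the "Near" station. The sketch assumes at least one station exists; `min` raises an error on an empty list.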
Part C
Create a function row-to-opportunity that converts a housing table row into a PropertyOpportunity, calculating distances to both schools and transit.
Part D
Create a table public-opportunities that contains only the underutilized public properties from the housing table. Use the is-underutilized-public function you created in Part A to filter the data. Then, transform the public-opportunities table by adding columns for distance calculations in a new table called opportunities-with-distances that includes:
- All original columns from public-opportunities
- A school-distance column calculated using the distance to the center of schools
- A transit-distance column calculated using your calculate-transit-distance function
To visualize your data, create a scatterplot showing the relationship between school-distance (x-axis) and transit-distance (y-axis) for these underutilized public properties. This will help identify properties that are well-positioned relative to both schools and transit.
Problem 4
Part A
Define a data type called MultiFamily with:
- address: String
- zipcode: String
- building-area: Number
- tax-value-ratio: Number (building area ÷ lot area)
- transit-accessible: Boolean (within reasonable distance of subway)
Part B
Design a function is-transit-accessible that takes property coordinates and returns true if the property is within 0.01 coordinate units of any subway station.
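The "within 0.01 coordinate units of any subway station" test can be read as: at least one station lies no farther than the threshold. A minimal Python sketch follows, with two assumptions flagged: distance is measured as straight-line distance in coordinate units (your Homework 5 simple-distance may differ), and "within" is taken to include exactly 0.01. The station coordinates are made up.

```python
import math

ACCESS_THRESHOLD = 0.01  # coordinate units, from the problem statement

def is_transit_accessible(lat, lon, stations):
    """True if any station is at most ACCESS_THRESHOLD away from (lat, lon)."""
    return any(
        # ASSUMPTION: straight-line distance stands in for simple-distance.
        math.hypot(lat - s["latitude"], lon - s["longitude"]) <= ACCESS_THRESHOLD
        for s in stations
    )

# Made-up station coordinates for illustration only.
sample_stations = [
    {"stopname": "Example Stop A", "latitude": 40.705, "longitude": -73.90},
    {"stopname": "Example Stop B", "latitude": 40.800, "longitude": -73.95},
]
```

Because `any` stops at the first match, the function returns as soon as one close station is found; with an empty station list it returns false.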
Part C
Design a function is-multifamily-residential that returns true for landuse codes "2" or "3" (multi-family buildings).
Part D
Design a recursive function count-undervalued-near-transit that takes your list of MultiFamily records and counts how many properties have both a tax-value-ratio < 0.5 (undervalued) AND are transit accessible.
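The recursive shape here is the standard one for lists: an empty list contributes 0, and a non-empty list contributes 0 or 1 for its first record plus the count for the rest. The sketch below shows that shape in Python, with a dataclass standing in for MultiFamily and snake_case names as assumptions; the sample records are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MultiFamily:
    address: str
    zipcode: str
    building_area: float
    tax_value_ratio: float      # building area / lot area, per the assignment
    transit_accessible: bool

def count_undervalued_near_transit(records):
    """Recursively count records with ratio < 0.5 that are transit accessible."""
    if not records:                       # base case: empty list
        return 0
    first, rest = records[0], records[1:]
    matches = first.tax_value_ratio < 0.5 and first.transit_accessible
    return (1 if matches else 0) + count_undervalued_near_transit(rest)

# Hypothetical records for illustration only.
sample = [
    MultiFamily("1 Example St", "00001", 1000, 0.3, True),   # counts
    MultiFamily("2 Example St", "00001", 4000, 0.8, True),   # ratio too high
    MultiFamily("3 Example St", "00002", 900, 0.2, False),   # not near transit
    MultiFamily("4 Example St", "00002", 1200, 0.5, True),   # 0.5 is not < 0.5
]
```

Only the first sample record satisfies both conditions, so the count for `sample` is 1. The last record shows why the strict `<` matters: a ratio of exactly 0.5 does not count as undervalued.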
Problem 5
Assume that, based on your earlier analysis, you have identified certain lots that might be worth the city acquiring and developing: ones with good access to infrastructure, which not only benefits the residents but also puts less strain on the city (because it does not need to build new infrastructure).
Now, considering that you have a potential lot to develop into high-density housing, generate a stakeholder matrix. Identify 3 stakeholders who might be impacted by this development. Identify 2 interests for each of your stakeholders. Identify 2 conflicts in interest.
Write this out as comments. You do not need to format it as a table; you can list them separately, for example:
# first stakeholder: ...
# interests: ...
# second stakeholder: ...
...
# conflicts: ...