
Homework 10 -- Urban Planning Analytics (Course Project Part 3)

Skills Practiced:



Introduction

You'll analyze NYC housing development opportunities by working with 3 interconnected datasets to identify optimal locations for new housing. You'll evaluate these opportunities using key urban planning factors: proximity to schools for families and access to public transit for citywide connectivity.

You'll be working with 3 CSV files, 2 of which you've worked with before:

  1. housing.csv, which you worked with in Homeworks 4 and 5. It has the following columns:
    • borough: The NYC borough (Manhattan, Brooklyn, Queens, Bronx, Staten Island)
    • zipcode: ZIP code of the property
    • address: Street address of the property
    • landuse: Type of land use (residential, commercial, etc.)
    • ownertype: Category of property ownership
    • lotarea: Total lot area in square feet
    • bldgarea: Total building area in square feet
    • latitude: Latitude coordinate of the property
    • longitude: Longitude coordinate of the property
  2. schools.csv, which you worked with in Homework 5. It has the following columns:
    • schoolname: Name of the school
    • latitude: Latitude coordinate of the school
    • longitude: Longitude coordinate of the school
    • address: Street address of the school
    • city: City in which the school is located
    • zip: ZIP code of the school
    • percentasian, percentblack, percenthispanic, percentblackhispanic, percentwhite: Demographics (as percentage strings like "16%")
  3. stations.csv, which represents NYC subway stations:
    • division: Transit division (BMT, IRT, IND)
    • line: Subway line name
    • stopname: Station name
    • latitude: Latitude of the station
    • longitude: Longitude of the station

Problem 1

Load the NYC stations data from the CSV file into a table. Define a table called stations-table with the appropriate column names and types.
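
As a rough illustration of what "appropriate column names and types" means here, the sketch below reads station rows and converts the coordinate columns from text to numbers. It is written in Python for illustration only, not necessarily the language used in this course. The name load_stations and the two sample rows are made up, and stations_table stands in for stations-table.

```python
import csv
import io

# Hypothetical sample standing in for stations.csv; these rows are made up.
SAMPLE_CSV = """division,line,stopname,latitude,longitude
BMT,Example Line,Example Stop A,40.70,-73.99
IRT,Example Line,Example Stop B,40.75,-73.98
"""


def load_stations(source):
    """Read station rows, keeping text columns as strings and
    converting the coordinate columns to floats."""
    rows = []
    for raw in csv.DictReader(source):
        rows.append({
            "division": raw["division"],
            "line": raw["line"],
            "stopname": raw["stopname"],
            "latitude": float(raw["latitude"]),
            "longitude": float(raw["longitude"]),
        })
    return rows


# With a real file, pass open("stations.csv", newline="") instead.
stations_table = load_stations(io.StringIO(SAMPLE_CSV))
```

The key design point is that latitude and longitude become numbers at load time, so later distance calculations do not have to parse strings.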

Problem 2

Create a structured data type to represent development opportunities. Later problems use it to analyze public properties and their transit accessibility.

Define a data type called PropertyOpportunity that contains:

  • address: String
  • zipcode: String
  • lot-area: Number
  • building-area: Number
  • distance-to-school: Number
  • distance-to-transit: Number
  • is-public: Boolean
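
One way to picture this data type is as a record with seven named, typed fields. The sketch below uses a Python dataclass for illustration only; field names replace hyphens with underscores, and the example values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyOpportunity:
    """One candidate property; mirrors the seven fields listed above."""
    address: str
    zipcode: str
    lot_area: float
    building_area: float
    distance_to_school: float
    distance_to_transit: float
    is_public: bool


# Made-up example value, only to show how a record is constructed.
example = PropertyOpportunity(
    address="1 Example St",
    zipcode="10001",
    lot_area=5000.0,
    building_area=400.0,
    distance_to_school=0.012,
    distance_to_transit=0.004,
    is_public=True,
)
```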

Problem 3

Part A

Design a function called is-underutilized-public that takes a row from the housing table and returns true if both of the following hold:

  1. The owner type is "C" (city-owned) OR "O" (other public), and
  2. The building area is less than 10% of the lot area (severely underutilized)
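
The two conditions above can be sketched as a single boolean expression. This is a Python illustration under the assumption that a row is a dictionary keyed by the housing.csv column names, with lotarea and bldgarea already numeric.

```python
# Owner-type codes treated as public, taken from the problem statement.
PUBLIC_OWNER_TYPES = {"C", "O"}


def is_underutilized_public(row):
    """True when the row is publicly owned AND its building area is
    below 10% of its lot area."""
    is_public = row["ownertype"] in PUBLIC_OWNER_TYPES
    # Multiplying instead of dividing avoids a division by zero
    # when a lot area of 0 appears in the data.
    is_underused = row["bldgarea"] < 0.10 * row["lotarea"]
    return is_public and is_underused
```

For example, a city-owned lot of 1000 square feet with a 50 square foot building passes both checks, while the same lot with a 500 square foot building does not.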

Part B

Design a function calculate-transit-distance that takes property coordinates and returns the distance to the nearest subway station using the simple-distance function you defined in Homework 5.
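
The idea is to compute the distance from the property to every station and keep the smallest. The Python sketch below is illustrative only. In particular, simple_distance here is an assumed stand-in (straight-line distance in coordinate units) and may differ from the simple-distance function you wrote in Homework 5.

```python
import math


def simple_distance(lat1, lon1, lat2, lon2):
    # Assumption: straight-line distance in coordinate units.
    # Substitute your own Homework 5 definition if it differs.
    return math.hypot(lat1 - lat2, lon1 - lon2)


def calculate_transit_distance(lat, lon, stations):
    """Distance from (lat, lon) to the nearest station.

    `stations` is assumed to be a non-empty list of dictionaries with
    "latitude" and "longitude" keys; min() raises ValueError if empty.
    """
    return min(
        simple_distance(lat, lon, s["latitude"], s["longitude"])
        for s in stations
    )
```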

Part C

Create a function row-to-opportunity that converts a housing table row into a PropertyOpportunity, calculating distances to both schools and transit.
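
A conversion like this copies some fields directly and computes others. The Python sketch below is one possible shape, not the required solution. It assumes rows are dictionaries, that "distance to schools" means distance to the nearest school, and that is-public follows the same "C" or "O" owner codes as Part A. The helper nearest_distance is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyOpportunity:
    address: str
    zipcode: str
    lot_area: float
    building_area: float
    distance_to_school: float
    distance_to_transit: float
    is_public: bool


def nearest_distance(lat, lon, places):
    # Hypothetical helper: straight-line distance to the closest place.
    return min(
        math.hypot(lat - p["latitude"], lon - p["longitude"])
        for p in places
    )


def row_to_opportunity(row, schools, stations):
    """Build a PropertyOpportunity from one housing row."""
    lat, lon = row["latitude"], row["longitude"]
    return PropertyOpportunity(
        address=row["address"],
        zipcode=row["zipcode"],
        lot_area=row["lotarea"],
        building_area=row["bldgarea"],
        distance_to_school=nearest_distance(lat, lon, schools),
        distance_to_transit=nearest_distance(lat, lon, stations),
        # Assumption: same public-owner codes as Part A.
        is_public=row["ownertype"] in {"C", "O"},
    )
```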

Part D

Create a table public-opportunities that contains only the underutilized public properties from the housing table, using the is-underutilized-public function from Part A as the filter. Then create a new table called opportunities-with-distances that extends public-opportunities with distance columns. It includes:

  • All original columns from public-opportunities
  • A school-distance column calculated using the distance to the center of schools
  • A transit-distance column calculated using your calculate-transit-distance function

To visualize your data, create a scatterplot showing the relationship between school-distance (x-axis) and transit-distance (y-axis) for these underutilized public properties. This will help identify properties that are well-positioned relative to both schools and transit.
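
The filter-then-extend pipeline can be sketched as follows, in Python for illustration only. It assumes tables are lists of dictionaries and reads "center of schools" as the average latitude and longitude of all schools, which may not match the definition from Homework 5. All sample rows are invented. The final points list holds the (x, y) pairs a scatterplot would draw.

```python
import math


def is_underutilized_public(row):
    return row["ownertype"] in {"C", "O"} and row["bldgarea"] < 0.10 * row["lotarea"]


def center_of(places):
    # Assumption: "center" means the mean latitude and mean longitude.
    n = len(places)
    return (sum(p["latitude"] for p in places) / n,
            sum(p["longitude"] for p in places) / n)


def add_distances(rows, schools, stations):
    """Return copies of rows with the two distance columns added."""
    c_lat, c_lon = center_of(schools)
    result = []
    for row in rows:
        lat, lon = row["latitude"], row["longitude"]
        transit = min(math.hypot(lat - s["latitude"], lon - s["longitude"])
                      for s in stations)
        result.append({**row,
                       "school-distance": math.hypot(lat - c_lat, lon - c_lon),
                       "transit-distance": transit})
    return result


# Made-up data, chosen so the distances are easy to check by hand.
housing = [
    {"ownertype": "C", "lotarea": 1000, "bldgarea": 50, "latitude": 0.0, "longitude": 0.0},
    {"ownertype": "P", "lotarea": 1000, "bldgarea": 50, "latitude": 1.0, "longitude": 1.0},
]
schools = [{"latitude": 0.0, "longitude": 2.0}, {"latitude": 0.0, "longitude": 4.0}]
stations = [{"latitude": 4.0, "longitude": 3.0}]

public_opportunities = [r for r in housing if is_underutilized_public(r)]
opportunities_with_distances = add_distances(public_opportunities, schools, stations)
# (x, y) pairs for the scatterplot: school distance vs. transit distance.
points = [(r["school-distance"], r["transit-distance"])
          for r in opportunities_with_distances]
```

Points near the lower-left corner of such a plot are close to both the school center and a station.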

Problem 4

Part A

Define a data type called MultiFamily with:

  • address: String
  • zipcode: String
  • building-area: Number
  • tax-value-ratio: Number (building area ÷ lot area)
  • transit-accessible: Boolean (within reasonable distance of subway)

Part B

Design a function is-transit-accessible that takes property coordinates and returns true if the property is within 0.01 coordinate units of any subway station.
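
This check only needs to know whether at least one station is close enough, so it can stop at the first match. The Python sketch below is illustrative. It assumes straight-line distance in coordinate units and treats "within 0.01" as inclusive (less than or equal), which the problem statement does not settle.

```python
import math

# Threshold in coordinate units, taken from the problem statement.
TRANSIT_THRESHOLD = 0.01


def is_transit_accessible(lat, lon, stations):
    """True if any station lies within TRANSIT_THRESHOLD of (lat, lon).

    any() short-circuits on the first station that qualifies and
    returns False for an empty station list.
    """
    return any(
        math.hypot(lat - s["latitude"], lon - s["longitude"]) <= TRANSIT_THRESHOLD
        for s in stations
    )
```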

Part C

Design a function is-multifamily-residential that returns true for landuse codes "2" or "3" (multi-family buildings).

Part D

Design a recursive function count-undervalued-near-transit that takes your list of MultiFamily records and counts how many properties both have a tax-value-ratio below 0.5 (undervalued) and are transit accessible.
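
The recursion follows the shape of the list: an empty list contributes 0, and a non-empty list contributes 0 or 1 for its first record plus the count for the rest. The Python sketch below illustrates that structure only. The MultiFamily class mirrors the fields from Part A with underscores in place of hyphens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiFamily:
    address: str
    zipcode: str
    building_area: float
    tax_value_ratio: float
    transit_accessible: bool


def count_undervalued_near_transit(records):
    """Count records with tax_value_ratio < 0.5 that are transit accessible."""
    # Base case: nothing left to count.
    if not records:
        return 0
    first, rest = records[0], records[1:]
    matches = first.tax_value_ratio < 0.5 and first.transit_accessible
    # Recursive case: this record's contribution plus the rest of the list.
    return (1 if matches else 0) + count_undervalued_near_transit(rest)
```

In Python this recursive form is limited by the interpreter's recursion depth, so it suits short lists; it is shown here for its structure, not its efficiency.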