Homework 4 -- NYC Property Analysis (Course Project Part 1)

Skills: 1, 2, 12

Due

Thursday, October 2, 2025 at 6PM (Oakland) or 9PM (Boston)


Start of course project.

Introduction

You're working with New York City property data to analyze housing patterns and identify development opportunities across the five boroughs. The dataset contains property records with location, land use, ownership type, and property characteristics, which can help you understand urban development patterns.

For this assignment, you'll be working with a CSV file containing NYC housing data. The file housing.csv contains the following columns:

  • borough: The NYC borough (Manhattan, Brooklyn, Queens, Bronx, Staten Island)
  • zipcode: ZIP code of the property
  • address: Street address of the property
  • landuse: Code for the type of land use (residential, commercial, etc.)
  • ownertype: Category of property ownership
  • lotarea: Total lot area in square feet
  • bldgarea: Total building area in square feet
  • latitude: Latitude coordinate of the property
  • longitude: Longitude coordinate of the property

Problem 1

Define a table called housing-table with the appropriate column names and types using the NYC housing data from the CSV file.

Be sure to include sanitize clauses to convert columns that should be numeric into numbers; e.g., sanitize lotarea using num-sanitizer.
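
To illustrate what a numeric sanitizer accomplishes, here is a minimal Python sketch (not the assignment's own table syntax). Every CSV cell arrives as text, so numeric columns have to be converted before they can be compared or used in arithmetic. The sample rows, the choice of which columns count as numeric, and the helper name load_housing are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample rows; the real housing.csv is not reproduced here.
SAMPLE_CSV = """borough,zipcode,address,landuse,ownertype,lotarea,bldgarea,latitude,longitude
Brooklyn,11201,1 EXAMPLE ST,11,P,2500,0,40.69,-73.99
Queens,11101,2 EXAMPLE AVE,1,P,4000,1800,40.75,-73.94
"""

# Assumption: these columns hold numbers; codes such as landuse stay as text.
NUMERIC_COLUMNS = ["lotarea", "bldgarea", "latitude", "longitude"]


def load_housing(text):
    """Parse CSV text into a list of row dicts, converting numeric columns."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        for col in NUMERIC_COLUMNS:
            row[col] = float(row[col])  # text -> number
        rows.append(row)
    return rows


housing = load_housing(SAMPLE_CSV)
```

Without the conversion step, a comparison such as lotarea > 1000000 would be comparing text to a number rather than two numbers.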

Problem 2

Vacant land represents potential development opportunities. Create a new table called vacant-land-properties that contains only the properties with land use code "11" (Vacant Land).

Problem 3

There are a lot of vacant properties. To understand the table better, construct a table large-vacant-properties that contains only the large vacant properties: those whose lotarea (the lot size in square feet) is over 1 million (1000000).
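
The two filtering steps in Problems 2 and 3 can be pictured with a short Python sketch. It is an illustration of the idea only, not the assignment's table operations; the rows and the helper name keep_rows are hypothetical.

```python
# Hypothetical rows, already sanitized so that lotarea is a number.
housing = [
    {"address": "A", "landuse": "11", "lotarea": 2500.0},
    {"address": "B", "landuse": "1", "lotarea": 4000.0},
    {"address": "C", "landuse": "11", "lotarea": 1500000.0},
]


def keep_rows(rows, predicate):
    """Return a new list with only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


# Land use codes are text, so compare against the string "11", not the number 11.
vacant = keep_rows(housing, lambda r: r["landuse"] == "11")

# The second filter starts from the vacant rows, not from the full table.
large_vacant = keep_rows(vacant, lambda r: r["lotarea"] > 1000000)
```

Each filter produces a new collection and leaves its input unchanged, so the smaller table can be built from the result of the earlier one.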

Look at that table and pick two of the properties. Use their latitude/longitude coordinates to find them on Google Maps (you can search for "latitude,longitude"). In comments, write the coordinates, what you think each property might be, and why you think it might be vacant.

Problem 4

Now construct a table without-large-properties that has only those vacant properties whose lotarea is below 1 million square feet. Use histogram to graph the lotarea values in that table, trying several different bucket sizes. You might notice something odd about the data. Form a hypothesis about the cause, then look more closely at the data to check whether it is correct. Report your findings in a comment.
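
To see why bucket size matters, the following Python sketch counts values per bucket by hand. It only illustrates what a histogram computes and is not the histogram function the assignment asks you to use; the helper name bucket_counts and the lot areas are made up.

```python
from collections import Counter


def bucket_counts(values, width):
    """Count values per bucket; the bucket starting at s covers [s, s + width)."""
    counts = Counter(int(v // width) for v in values)
    return {k * width: counts[k] for k in sorted(counts)}


# Hypothetical lot areas in square feet.
lot_areas = [300, 450, 1200, 1800, 2500, 2600, 40000, 950000]

coarse = bucket_counts(lot_areas, 100000)  # wide buckets hide small-lot detail
fine = bucket_counts(lot_areas, 1000)      # narrow buckets separate small lots
```

With the wide buckets, seven of the eight values fall into the first bucket. The narrow buckets spread the same values across five buckets, so patterns among small lots become visible.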

Problem 5

NYC has various types of residential properties. Design a function housing-type-category that takes a row from the housing table and returns a string describing the residential category based on the land use code:

  • "single-family": land use code "1" (One & Two Family Buildings)
  • "walk-up": land use code "2" (Multi-Family Walk-Up Buildings)
  • "high-rise": land use code "3" (Multi-Family Elevator Buildings)
  • "mixed-use": land use code "4" (Mixed Residential & Commercial Buildings)
  • "non-residential": any other land use code

Then, create a new table called categorized-housing that adds a "housing-type" column to the original housing-table.
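
As a language-neutral picture of "a function from a row to a string, used to add a column", here is a hedged Python sketch. It is not the assignment's own syntax; the sample rows and the helper add_column are hypothetical, and the names use underscores only because Python does not allow hyphens in identifiers.

```python
# Codes and labels taken from the list above.
RESIDENTIAL_CATEGORIES = {
    "1": "single-family",
    "2": "walk-up",
    "3": "high-rise",
    "4": "mixed-use",
}


def housing_type_category(row):
    """Map a row's land use code to its residential category."""
    return RESIDENTIAL_CATEGORIES.get(row["landuse"], "non-residential")


def add_column(rows, name, compute):
    """Return new rows, each extended with name -> compute(row)."""
    return [{**row, name: compute(row)} for row in rows]


# Hypothetical rows for illustration.
housing = [
    {"address": "A", "landuse": "1"},
    {"address": "B", "landuse": "11"},
]

categorized = add_column(housing, "housing-type", housing_type_category)
```

The function receives the whole row and reads the land use code from it, and the original rows are left without the new column.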

Problem 6

Understanding property ownership patterns is crucial for housing policy. Design a function ownership-category that takes an owner type code and returns a more descriptive string:

  • "C" becomes "City-owned"
  • "M" becomes "Mixed ownership"
  • "O" becomes "Other public"
  • "P" becomes "Private"
  • "X" becomes "Tax-exempt"

Then, create a new table called table-with-ownership that replaces the values in the "ownertype" column in the original housing-table with the descriptive ownership categories.
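
In contrast to adding a column, this step replaces the values of an existing column, and the function takes a single code rather than a whole row. The Python sketch below illustrates that difference under stated assumptions: the rows and the helper transform_column are hypothetical, and passing unlisted codes through unchanged is a choice made here, since the assignment does not say how to treat them.

```python
# Codes and labels taken from the list above.
OWNERSHIP_LABELS = {
    "C": "City-owned",
    "M": "Mixed ownership",
    "O": "Other public",
    "P": "Private",
    "X": "Tax-exempt",
}


def ownership_category(code):
    """Map an owner type code to its label (assumption: unknown codes pass through)."""
    return OWNERSHIP_LABELS.get(code, code)


def transform_column(rows, name, update):
    """Return new rows with column `name` replaced by update(old value)."""
    return [{**row, name: update(row[name])} for row in rows]


# Hypothetical rows for illustration.
housing = [
    {"address": "A", "ownertype": "C"},
    {"address": "B", "ownertype": "P"},
]

with_ownership = transform_column(housing, "ownertype", ownership_category)
```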

Problem 7

Vacant land might immediately strike one as a great opportunity for development (housing, commercial, or otherwise). Leaving certain areas undeveloped, however, can promote other important environmental and human values. Can you think of two values that might be promoted when certain land is left undeveloped?

Answer as a comment.

Problem 8

NYC is defined as much by its natural landscapes as by its buildings. Wetlands are an important part of the natural landscape. These areas promote a number of different environmental and human values (read here: https://www.nycgovparks.org/learn/ecosystems/wetlands-in-new-york-city-parks). After reading that website, complete the following stakeholder matrix.

Stakeholders                          | Interests/Values
Wildlife                              |
People living in coastal communities  |
                                      | Reduce Urban Heat Island Effect
                                      | Recreation

You do not need to try to recreate a table in your file. You can just write comments like:

# wildlife: ...
# people living in coastal communities: ...
# ...