Data
For this project, we used three primary sources of data, detailed below.
The City of Chicago publishes the twelve most popular service requests on its open data portal:
- Tree trims
- Tree debris
- Graffiti removal
- Rodent baiting
- Garbage cart requests
- Abandoned vehicles
- Sanitation code complaints
- Potholes
- Alley lights out
- Street lights out (one or two)
- Street lights out (row of three or more)
The data is available starting from January 2011 and is updated daily with new entries. As of summer 2013, more than 4,000,000 311 service requests are available on the data portal.
Most service request types share the same data model, as illustrated by this example of a pothole filling request from 2012:
field | value |
---|---|
CREATION DATE | 07/17/2012 |
STATUS | Completed |
COMPLETION DATE | 07/30/2012 |
SERVICE REQUEST NUMBER | 12-01275916 |
TYPE OF SERVICE REQUEST | Pot Hole in Street |
CURRENT ACTIVITY | Final Outcome |
MOST RECENT ACTION | Pothole Patched |
NUMBER OF POTHOLES FILLED ON BLOCK | 1 |
STREET ADDRESS | 5100 S NATCHEZ AVE |
ZIP | 60638 |
X COORDINATE | 1133903.99013187 |
Y COORDINATE | 1870063.02512769 |
Ward | 23 |
Police District | 8 |
Community Area | 56 |
LATITUDE | 41.7996563192028 |
LONGITUDE | -87.78447500121634 |
LOCATION | (41.7996563192028°, -87.78447500121634°) |
Some of the fields are redundant (for example, location repeats latitude and longitude), but this redundancy lets the data be used without preprocessing in many situations. For instance, the X and Y coordinates contain the same information as longitude and latitude, but in a different coordinate reference system, one that is commonly used in local maps of the Chicago area.
Some fields are specific to the kind of service request, such as "number of potholes filled on block." We do not get into every detail of these ad-hoc fields, as they are self-explanatory.
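As an illustration of working with this data model, here is a minimal Python sketch that computes how long the example request took to complete. It assumes the fields arrive as strings keyed by the column names shown in the table and that dates use the MM/DD/YYYY format of the example. The helper names `days_to_complete` and `lat_lon` are hypothetical and not part of the project's code.

```python
from datetime import datetime

# Field names and values copied from the pothole example above.
# Assumption: every value is a string, as in a CSV export.
record = {
    "CREATION DATE": "07/17/2012",
    "COMPLETION DATE": "07/30/2012",
    "STATUS": "Completed",
    "LATITUDE": "41.7996563192028",
    "LONGITUDE": "-87.78447500121634",
}

DATE_FORMAT = "%m/%d/%Y"  # assumed from the example: month/day/year


def days_to_complete(rec):
    """Return the days between creation and completion, or None if still open."""
    if not rec.get("COMPLETION DATE"):
        return None
    created = datetime.strptime(rec["CREATION DATE"], DATE_FORMAT)
    completed = datetime.strptime(rec["COMPLETION DATE"], DATE_FORMAT)
    return (completed - created).days


def lat_lon(rec):
    """Return (latitude, longitude) as floats."""
    return float(rec["LATITUDE"]), float(rec["LONGITUDE"])


print(days_to_complete(record))  # 13
```

Reading latitude and longitude directly, rather than parsing the combined location field, is one way to take advantage of the redundant columns.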
The Python script munging/get_portal_311.py automatically downloads the most up-to-date request datasets in JSON format for all 12 types.
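For readers who want to see what such a download can look like, here is a rough sketch. It is not the project's script. It assumes the portal exposes each dataset through a Socrata-style `/resource/<dataset id>.json` endpoint with `$limit` and `$offset` paging parameters. The host constant, the function names, and the placeholder dataset ID `xxxx-xxxx` are all assumptions for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the portal host and the Socrata-style resource path.
PORTAL = "https://data.cityofchicago.org"


def resource_url(dataset_id, limit=1000, offset=0):
    """Build the JSON URL for one page of a dataset."""
    query = urlencode({"$limit": limit, "$offset": offset})
    return f"{PORTAL}/resource/{dataset_id}.json?{query}"


def fetch_page(dataset_id, limit=1000, offset=0):
    """Download one page of records (performs a network request)."""
    with urlopen(resource_url(dataset_id, limit, offset)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_all(dataset_id, page_size=1000, fetch=fetch_page):
    """Page through a dataset until a short page signals the end."""
    rows, offset = [], 0
    while True:
        page = fetch(dataset_id, page_size, offset)
        rows.extend(page)
        if len(page) < page_size:
            return rows
        offset += page_size


# "xxxx-xxxx" is a placeholder, not a real dataset ID.
print(resource_url("xxxx-xxxx", limit=10))
```

The `fetch` parameter exists so the paging logic can be exercised with a stand-in function instead of a live request.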
Chapin Hall has a much larger dataset of 311 service requests for academic research purposes. It dates back to 1998 and contains more than 600 request types, covering a much longer time period and a broader set of requests than the data portal, which makes this dataset a goldmine for researchers.
Like the portal datasets, this one is a subset of the City's official 311 database. Because of the way the data was extracted from the city's system, it actually has fewer fields than the open datasets. This could be a big drawback depending on the questions you're trying to answer.
field | value |
---|---|
Date | 01-JAN-08, |
Type code | WBT |
Type | Hydrant Open |
Ward | 3X |
Community Area | 2X |
Address | 2XXX N XXXXXX AVE CHICAGO, IL 606XX |
Location | (113330X.XXXXXXXX,191658X.XXXXXXXX) |
(Note: the X's are added to mask the real entry, since it is not part of the City's open data.)
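Because these records use different formats from the portal data, a little parsing is needed before the two sources can be compared. The sketch below assumes dates follow the DD-MON-YY pattern of the example and that the location is a parenthesized "x,y" pair. The function names and the coordinate values in the usage line are made up for illustration, since the real entry is masked.

```python
from datetime import datetime


def parse_date(text):
    """Parse a date such as '01-JAN-08' into a datetime.date.

    Stray whitespace and a trailing comma (as in the example) are dropped.
    Month abbreviations are matched using the current locale; English
    names are assumed here.
    """
    cleaned = text.strip().rstrip(",")
    return datetime.strptime(cleaned, "%d-%b-%y").date()


def parse_location(text):
    """Parse a location such as '(x,y)' into a pair of floats."""
    x, y = text.strip().strip("()").split(",")
    return float(x), float(y)


print(parse_date("01-JAN-08,"))
# Made-up coordinates, since the real values are masked in the example.
print(parse_location("(1133300.5,1916580.25)"))
```

Python's `%y` maps two-digit years 69 to 99 onto 1969 to 1999, so dates from 1998 onward parse into the intended century.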
Neither of these sources of 311 data differentiates between requests coming from citizens and requests entered as work orders by City of Chicago employees, and there's no obvious way to distinguish between the two. This means that any analysis of these requests will pick up on service request trends produced by both residents and city employees.
The demographic data we used is openly available. It was obtained from the U.S. Census Bureau and other indirect sources, and comes from the last two censuses (2000, 2010) as well as from American Community Surveys (2007-2012).
We used Census data at different levels of aggregation, as described below.
For each of the 77 Chicago community areas, we collected the following information:
- Number ID
- Name
- Total population
- Median household income
- Proportion of Hispanic population
- Proportion of Black population
- Proportion of White population
- Proportion of Asian Population
- Proportion of population of other ethnicity/race
This data comes from the 2010 Census and was downloaded from the website of Rob Paral and Associates.
For each of the 800 census tracts in the City of Chicago, we retrieved the following information from the American Community Survey (2007-2011).
- Tract ID
- Total population
- Proportion of males and females
- Proportion of population divided by age groups (0-14, 15-24, 25-64, 64+)
- Proportion of population divided by ethnicity/race (White, Black, Asian, Hispanic, Other)
- Median income
- Proportion of families below the poverty line
- Unemployment rate
This data was retrieved using the FactFinder interface on the Census Bureau website, and was organized and cleaned using the two scripts in the munging/ACS_Census directory.