jkaashoek/senior_thesis


Senior Thesis README

Code base for my Applied Math undergraduate thesis. All requirements are listed in requirements.txt and can be installed with pip install -r requirements.txt.

Code

The code is separated into three Jupyter notebooks.

  • dynamic_mode_decomp.ipynb: Chapter 2 code to perform dynamic mode decomposition.
  • clustering.ipynb: Chapter 3 code to run hierarchical and k-means clustering on county death series.
  • political_analysis.ipynb: the remaining Chapter 3 code, which performs the descriptive and regression analyses and generates the figures.
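For readers unfamiliar with dynamic mode decomposition, the following is a minimal sketch of the standard exact DMD algorithm in NumPy. It is not taken from dynamic_mode_decomp.ipynb; the function name exact_dmd, the rank argument, and the synthetic two-dimensional system are assumptions made for illustration only.

```python
import numpy as np

def exact_dmd(X, rank):
    """Exact DMD of a snapshot matrix X (rows = locations, columns = time steps).

    Hypothetical helper for illustration; not the notebook's implementation.
    """
    # Pair each snapshot with its successor: X2 is approximately A @ X1.
    X1, X2 = X[:, :-1], X[:, 1:]

    # Truncated SVD of the first snapshot matrix.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]

    # Project the best-fit linear operator onto the leading singular vectors.
    A_tilde = U.conj().T @ X2 @ Vh.conj().T @ np.diag(1.0 / s)

    # Eigenvalues give growth/decay and oscillation; modes give spatial structure.
    eigvals, W = np.linalg.eig(A_tilde)
    modes = X2 @ Vh.conj().T @ np.diag(1.0 / s) @ W
    return eigvals, modes

# Synthetic example: snapshots of a known linear system x_{t+1} = A x_t.
A = np.array([[0.9, -0.2],
              [0.2,  0.9]])
snapshots = [np.array([1.0, 0.0])]
for _ in range(20):
    snapshots.append(A @ snapshots[-1])
X = np.column_stack(snapshots)

eigvals, modes = exact_dmd(X, rank=2)
```

On this synthetic input the DMD eigenvalues should match the eigenvalues of A, which makes it a convenient sanity check before applying the method to real series.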

Data

The data folder contains the publicly available and self-generated datasets that I used in Chapter 3.

  • centers.csv: cluster centers, as generated by clustering.ipynb
  • cluster_assignment.csv: a mapping from county FIPS id to cluster assignment, as determined in clustering.ipynb.
  • governor_affiliation.csv: Self-generated party affiliations of governors. This dataset is no longer used in the code. Original data table.
  • incarceration_trends.csv: Incarceration data provided by the Vera Institute of Justice. Original data source.
  • International_Report_Passengers.csv: International travel data provided by the Department of Transportation. Original data source.
  • LND01.csv: Land area data provided by the Census Bureau. Original data source.
  • nursing.csv: Nursing home data provided by CMS. Original data source.
  • OxCGRT_US_latest.csv: COVID-19 government interventions in the US provided by OxCGRT. Original data source.
  • PLACES__Local_Data_for_Better_Health__County_Data_2020_release.csv: PLACES dataset containing obesity data provided by the CDC. Original data source.
  • SVI2018_US_COUNTY.csv: Social Vulnerability Index data provided by the CDC. Original data source.
  • time_series_covid19_confirmed_US.csv: US county-level COVID-19 cases from Johns Hopkins. Not currently used in the analysis. The last data pull was February 12, 2021. Original data source.
  • time_series_covid19_deaths_US.csv: US county-level COVID-19 deaths from Johns Hopkins. The last data pull was February 12, 2021. Original data source.
  • us census bureau regions and divisions.csv: Census Bureau regions and divisions, provided by Chris Halpert. Original data source.
  • vale_eligible_airports.csv: FAA airport emissions tracking, used to link airport codes to counties. Original data source.
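To illustrate how files such as centers.csv and cluster_assignment.csv relate to each other, here is a minimal k-means sketch in NumPy that produces both cluster centers and a county-to-cluster mapping. It is an assumption-laden illustration, not the code in clustering.ipynb (which may well use a library implementation): the kmeans function, the synthetic series, and the placeholder county ids are all hypothetical, and the real column layout of the CSV files is not shown here.

```python
import numpy as np

def kmeans(series, k, n_iter=100, seed=0):
    """Plain k-means (Lloyd's algorithm) on rows of `series`. Hypothetical helper."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct rows of the data.
    centers = series[rng.choice(len(series), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance from every series to every center: shape (n, k).
        dists = np.linalg.norm(series[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (keep it if empty).
        new_centers = np.array([
            series[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute labels so they are consistent with the returned centers.
    dists = np.linalg.norm(series[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)

# Synthetic "death series": three counties peak early, three peak late.
t = np.linspace(0, 1, 10)
early = np.exp(-((t - 0.25) ** 2) / 0.01)
late = np.exp(-((t - 0.75) ** 2) / 0.01)
series = np.vstack([early, 1.1 * early, 0.9 * early,
                    late, 1.1 * late, 0.9 * late])

# Placeholder ids standing in for county FIPS codes (not real counties).
county_ids = ["90001", "90002", "90003", "90004", "90005", "90006"]

centers, labels = kmeans(series, k=2)
cluster_assignment = dict(zip(county_ids, labels.tolist()))
```

In this sketch, centers plays the role of the cluster-center table and cluster_assignment plays the role of the id-to-cluster mapping; which cluster receives which numeric label depends on the random initialisation.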

Figures

Copies of the figures. Most figures are saved as .svg files because the paper figures were compiled using PowerPoint. Any figures that were included directly in the paper are saved as .pdf. Images are organized by chapter.

Shapefiles

Contains the county, state, and region shapefiles for the United States, as provided by the Census Bureau. These are used for plotting in both the dynamic mode decomposition code and the political analysis code. Original download links can be found here.
