GitHub: Project files related to Texas A&M University, Landscape Architecture and Urban Planning Course, Urban and Regional Science PhD, URSC 645 Urban and Regional Analytics.
Professor Nathanael Rosenheim
Files include Google Colab notebooks and Stata do-files for sharing and reference.
Course title and number: URSC 645 Urban and Regional Analytics
Course catalog description: Urban and regional administrative data management; data analysis; programming for replicable, systematic research; project workflow to support project collaboration
Course Purpose: The purpose of this course is to apply urban analytics tools, such as tools for data management and visualization, to publicly available data that relate to the development, structure, and functioning of urban and regional environments. The course introduces data workflow skills to obtain, scrub, explore, visualize, interpret and publish data.
The course focuses on computer coding skills to ensure that research is replicable, systematic and generalizable.
This course will familiarize Urban and Regional Science and Sociology PhD students with data management concepts for reproducible research. Computer coding or scripting is the basis for a data science workflow that leads to systematic research that can be replicated and generalized. Reproducible research means that other researchers can validate the results when provided with the data and software code used to generate them. This course will guide students through the challenges associated with reproducible research.
By the end of this class, students will be able to:
- Demonstrate that they understand basic applications for code and scripts.
- Adopt a scalable workflow for individual and team-based projects.
- Identify replicable research in sociology or urban and regional science journals.
- Use appropriate software to obtain, scrub, explore, visualize, interpret and publish data.
In the typical statistics course, students work with clean, orderly datasets. However, when students begin to do their own research, they are faced with raw, real-world data that is far from clean and orderly. Generating a clean dataset often requires a large time investment, and for many projects data cleaning takes more time than data modeling and interpretation. This course was motivated in part by seeing students struggle with real-world data.
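As a rough illustration of the scripted cleaning step described above, the following Python sketch scrubs a small raw extract and computes one summary statistic. It is not taken from the course materials: the data, the column names (`tract_id`, `county`, `median_income`), and the `scrub` function are all hypothetical and chosen only to show typical problems such as stray whitespace, inconsistent casing, duplicate rows, and missing values.

```python
import csv
import io
import statistics

# Hypothetical raw extract (not real data): stray whitespace, inconsistent
# casing, a repeated row, a missing value, and a number stored as text.
RAW_CSV = """tract_id,county,median_income
48201100000, Harris ,52000
48201100000, Harris ,52000
48201210400,HARRIS,
48041000101,brazos,"41,500"
"""


def scrub(raw_text):
    """Return cleaned rows: trimmed text, title-cased county, numeric income."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        tract = row["tract_id"].strip()
        if tract in seen:
            continue  # keep only the first row for each tract_id
        seen.add(tract)
        # Remove thousands separators before converting to a number.
        income_text = row["median_income"].replace(",", "").strip()
        cleaned.append({
            "tract_id": tract,
            "county": row["county"].strip().title(),
            # Keep missing values explicit instead of guessing a number.
            "median_income": float(income_text) if income_text else None,
        })
    return cleaned


rows = scrub(RAW_CSV)
incomes = [r["median_income"] for r in rows if r["median_income"] is not None]
mean_income = statistics.mean(incomes)
print(len(rows), "clean rows; mean income of non-missing values:", mean_income)
```

Because every cleaning decision is written in code rather than made by hand in a spreadsheet, another researcher can rerun the script on the same raw file and check each step.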
For many graduate students in the fields of sociology and urban and regional science, coding, scripting, and visualization tools are not introduced in undergraduate programs. However, many students and faculty discover that coding skills are essential for systematic, generalizable, and replicable research. This course attempts to provide the motivation and the foundation for building a strong workflow to support urban and regional analytic research.
Physical copy required: Long, J. S. (2009). The workflow of data analysis using Stata. College Station, TX: Stata Press. https://www.stata.com/bookstore/workflow-data-analysis-stata/
Munafò, M. R., Nosek, B. A., Bishop, D. V., Button, K. S., Chambers, C. D., du Sert, N. P., Simonsohn, U., Wagenmakers, E., Ware, J.J., & Ioannidis, J. P. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. https://doi.org/10.1038/s41562-016-0021
Lowndes, J. S. S., Best, B. D., Scarborough, C., Afflerbach, J. C., Frazier, M. R., O'Hara, C. C., Jiang, N., & Halpern, B. S. (2017). Our path to better science in less time using open data science tools. Nature Ecology & Evolution, 1(6), 160. https://doi.org/10.1038/s41559-017-0160
Freese, J. (2007). Replication standards for quantitative social science: Why not sociology? Sociological Methods & Research, 36(2), 153-172.
Arribas-Bel, D., de Graaff, T., & Rey, S. J. (2017). Looking at John Snow’s Cholera Map from the Twenty First Century: A Practical Primer on Reproducibility and Open Science. In Regional Research Frontiers-Vol. 2 (pp. 283-306). Springer International Publishing. https://doi.org/10.1007/978-3-319-50590-9_17
Gentzkow, M., & Shapiro, J. M. (2014). Code and data for the social sciences: A practitioner’s guide. University of Chicago mimeo. https://people.stanford.edu/gentzkow/sites/default/files/codeanddata.pdf
The Turing Way Community, Becky Arnold, Louise Bowler, Sarah Gibson, Patricia Herterich, Rosie Higman, … Kirstie Whitaker. (2021, Nov 10). The Turing Way: A Handbook for Reproducible Data Science (Version v1.0.1). Zenodo. http://doi.org/10.5281/zenodo.5671094 Ebook link: https://the-turing-way.netlify.app/welcome
Sheather, S. (2008). A Modern Approach to Regression with R. http://proxy.library.tamu.edu/login?url=https://dx.doi.org/10.1007/978-0-387-09608-7
Source code in SAS, Stata, and R: http://gattonweb.uky.edu/sheather/book/
Rosenheim, N. Peacock, W. Williams, A. Lane, G. Watson, M. Sullivan, E. Katare, A. Kastor, H. (2021) "Report of Applied Methods", in Food Access Impact Survey for Harris County and Southeast Texas after Hurricane Harvey in 2017. DesignSafe-CI. https://doi.org/10.17603/ds2-dh61-m731.
Rosenheim, Nathanael; Day, Wayne; Seong, Kijin (2021) “Automated Neighborhood Characteristics for Community Resilience Planning.” DesignSafe-CI. https://doi.org/10.17603/ds2-hj0p-bp40.
Roy, Malini; Rosenheim, Nathanael (2021) “Longitudinal Social Vulnerability Data Exploration for Harris County Census Tracts.” DesignSafe-CI. https://doi.org/10.17603/ds2-hn6r-dh03.
Rosenheim, Nathanael (2021) “Detailed Household and Housing Unit Characteristics: Data and Replication Code.” DesignSafe-CI. https://doi.org/10.17603/ds2-jwf6-s535.
Additional readings will be made available through Google Drive.
Vijayan, Lavanya. (2019). Python Quick Start. https://www.linkedin.com/learning/python-quick-start/
Davis, Annyce. (2020). Programming Foundations: Fundamentals. https://www.linkedin.com/learning/programming-foundations-fundamentals-3/
Buscha, Franz. (2019). Introduction to Stata 15. https://www.linkedin.com/learning/introduction-to-stata-15/