This repository contains data, Python scripts, and notebooks for Jinfei Zhu's ([email protected]) content analysis project studying personal finance concerns.
Download the scraped Reddit corpora from Google Drive (they exceed GitHub's 100 MB size limit):
https://drive.google.com/drive/folders/1KgXh2D3YdoO-CfWUviPcpfgESOI9hMNG?usp=sharing
Python LDA visualization https://github.com/bmabey/pyLDAvis
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022.
Blei, D., & Lafferty, J. (2006). Dynamic Topic Models (Vol. 2006). https://doi.org/10.1145/1143844.1143859
Evans, J., Desikan, B. S., & Kwon, H. (2021). Computational Content Analysis Notebooks and Slides.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning: with Applications in R. New York: Springer.
Reddit. (2021). Personal Finance Wiki https://www.reddit.com/r/personalfinance/wiki/index
Federal Reserve. (2018). Report on the Economic Well-Being of U.S. Households in 2017.
Scikit-learn. (2021). CountVectorizer https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html