Sentiment analysis and topic modeling on the LOTR corpus.
All significant work is located in LOTR.ipynb. I found Jupyter Notebook to be more effective for this project because I spent far less time re-running the whole program. This is usually the case for programs involving Natural Language Processing (and extensive use of lists of lists).
Corpus Details:
- The Silmarillion
- The Hobbit
- The Fellowship of the Ring
- The Two Towers
- The Return of the King
DO NOT move the text files (the .txt files and the '-chapters' files).
The program uses the following libraries: NLTK, scikit-learn (sklearn), mglearn, matplotlib, and collections (from the Python standard library), as well as several other independent packages.
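As a rough illustration of the "lists of lists" style of work mentioned above, here is a minimal sketch that counts word frequencies with collections.Counter. It is not taken from LOTR.ipynb: the toy `chapters` data, the `STOP_WORDS` set, and the `count_tokens` helper are all hypothetical names made up for this example, and the notebook may well use NLTK's tokenizer and stop-word list instead.

```python
from collections import Counter

# Hypothetical toy data: each inner list holds the tokens of one chapter.
chapters = [
    ["the", "ring", "was", "found", "by", "the", "hobbit"],
    ["the", "hobbit", "left", "the", "shire"],
]

# Hypothetical stop-word set (an assumption, not the notebook's actual list).
STOP_WORDS = {"the", "was", "by"}


def count_tokens(token_lists, stop_words=frozenset()):
    """Count tokens across a list of token lists, skipping stop words."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(t for t in tokens if t not in stop_words)
    return counts


counts = count_tokens(chapters, STOP_WORDS)
print(counts.most_common(3))
```

Keeping one token list per chapter makes it easy to count either per chapter (call the helper on a single-element list) or across a whole book.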
The exported dataframes were typically used with Excel or Tableau (with access to a 14-day trial).
http://help.sentiment140.com/for-students
https://github.com/JonathanReeve/chapterize
Introduction to Machine Learning with Python (student accessible): https://www.safaribooksonline.com/library/view/introduction-to-machine/9781449369880/ch01.html