
Classifying the Lord of the Rings Corpus using Latent Dirichlet Allocation and an API that employs the use of Naïve Bayes, Maximum Entropy, and Support Vector Machines.


jblazzy/LOTR


LOTR

Sentiment analysis and topic modeling on the LOTR corpus.
All significant work is located in LOTR.ipynb. I found Jupyter Notebook to be more effective for this project because I spent far less time compiling the program. This is usually the case for programs that involve Natural Language Processing (and extensive use of lists of lists).
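As an illustration of the "lists of lists" shape mentioned above, here is a minimal sketch (not the notebook's actual code) that tokenizes a few chapters into one token list per chapter and counts words with `collections.Counter`. The sample sentences and the helper names `tokenize`, `tokenize_chapters`, and `word_counts` are assumptions made for this example; the real notebook reads the corpus text files instead.

```python
import re
from collections import Counter

# Hypothetical stand-in text; the notebook itself loads the corpus files.
chapters = [
    "The hobbit walked out of the Shire at dawn.",
    "The ring was heavy, and the road was long.",
]


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def tokenize_chapters(chapter_texts):
    """Return a list of lists: one list of tokens per chapter."""
    return [tokenize(text) for text in chapter_texts]


def word_counts(tokenized_chapters):
    """Count how often each token occurs across all chapters."""
    counts = Counter()
    for tokens in tokenized_chapters:
        counts.update(tokens)
    return counts


tokenized = tokenize_chapters(chapters)
counts = word_counts(tokenized)
print(len(tokenized), "chapters tokenized")
print(counts.most_common(3))
```

In a notebook, each of these steps can live in its own cell, so an expensive step such as tokenizing the whole corpus only needs to run once.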

Corpus Details:

  • The Silmarillion
  • The Hobbit
  • The Fellowship of the Ring
  • The Two Towers
  • The Return of the King

DO NOT move the text files (the .txt files and the '-chapters' files).

The program uses packages from the following libraries: NLTK, sklearn, mglearn, matplotlib, and collections, as well as several other independent packages.

The exported dataframes were typically explored in Excel or Tableau (using a 14-day trial).

Documentation

http://help.sentiment140.com/for-students
https://github.com/JonathanReeve/chapterize
Introduction to Machine Learning with Python (student accessible): https://www.safaribooksonline.com/library/view/introduction-to-machine/9781449369880/ch01.html
