This project aims to extend the Joint Sentiment-Topic (JST) model with a dynamic time component, giving the dynamic JST (dJST) model. This allows sentiment-topics to be tracked over time for trend analysis.
To compile the project, run make in a shell. Alternatively, you can import the project into Visual Studio.
// TODO describe how to set the properties
To train on a single dataset, execute:
jst -est -config YOUR-PATH/train.properties
This is equivalent to running the original JST model.
To train continuously on new data, execute:
jst -djst -config YOUR-PATH/test.properties
The trained parameters/distributions will be written out in the following files:
- <iter>.others: Holds summaries of the corpus statistics, such as how many docs and words have been discovered. The training parameter details are also shown (how many iterations, topics, senti-labels).
- <iter>.pi: Contains the per-document sentiment distributions.
- <iter>.phi: Contains the sentiment-specific topic-word distributions.
- <iter>.theta: Contains the per-document sentiment-specific topic proportions. For each topic the sentiment distribution is shown.
- <iter>.tassign: Contains the sentiment label and topic assignments for words in the training data. After sampling, each word has been assigned a topic label and a sentiment label.
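As a quick reference, the output file names for one iteration can be enumerated with a small helper. This is only a sketch: the function name `output_files` and the example prefix "100" are hypothetical, and the layout inside each file is not covered here.

```python
from typing import Dict

# Suffixes and short descriptions taken from the list of output files above.
OUTPUT_SUFFIXES = {
    "others": "corpus statistics and training parameters",
    "pi": "per-document sentiment distributions",
    "phi": "sentiment-specific topic-word distributions",
    "theta": "per-document sentiment-specific topic proportions",
    "tassign": "sentiment label and topic assignments per word",
}


def output_files(iter_prefix: str) -> Dict[str, str]:
    """Map each expected output file name to a short description.

    iter_prefix stands in for the <iter> placeholder; its exact form
    (for example an iteration number) is an assumption of this sketch.
    """
    return {f"{iter_prefix}.{suffix}": desc for suffix, desc in OUTPUT_SUFFIXES.items()}


files = output_files("100")  # "100" is a made-up example prefix
for name, description in files.items():
    print(f"{name}: {description}")
```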
The data input format should be like the following:
[Doc_1 name] [token_1] [token_2] ... [token_N]
:
:
[Doc_M name] [token_1] [token_2] ... [token_N]
Each line is one document. It is assumed that data preprocessing (removal of stop words, stemming, ...) has already been performed.
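The following sketch shows one way to produce and read lines in this format. It assumes that fields are separated by whitespace and that the document name is a single token without spaces; the helper names and the sample documents are made up for illustration.

```python
from typing import List, Tuple


def format_corpus_line(doc_name: str, tokens: List[str]) -> str:
    """Build one input line: document name followed by space-separated tokens."""
    # Assumption: the name must be a single whitespace-free field.
    if not doc_name or any(ch.isspace() for ch in doc_name):
        raise ValueError("document name must be a non-empty token without whitespace")
    return " ".join([doc_name] + tokens)


def parse_corpus_line(line: str) -> Tuple[str, List[str]]:
    """Split one input line into (document name, list of tokens)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    return fields[0], fields[1:]


# Hypothetical, already preprocessed documents.
docs = {
    "doc_1": ["great", "phone", "battery"],
    "doc_2": ["poor", "screen"],
}
lines = [format_corpus_line(name, tokens) for name, tokens in docs.items()]
parsed = dict(parse_corpus_line(line) for line in lines)
```

Writing `lines` to a file, one entry per line, gives a corpus file in the layout described above.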
The algorithm needs prior information on the sentiment-polarity of words. The sentiment lexicon used here is MPQA and has the following format:
[word] [neu prior prob.] [pos prior prob.] [neg prior prob.]
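A lexicon in this layout could be read as sketched below. The sketch assumes whitespace-separated fields in the order word, neutral, positive, negative; the function names and the sample entries with their probabilities are invented for illustration and are not taken from MPQA.

```python
from typing import Dict, Iterable, Tuple

Priors = Tuple[float, float, float]  # (neutral, positive, negative)


def parse_lexicon_line(line: str) -> Tuple[str, Priors]:
    """Parse one lexicon line of the form: word neu pos neg."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    neu, pos, neg = (float(value) for value in fields[1:])
    return fields[0], (neu, pos, neg)


def load_lexicon(lines: Iterable[str]) -> Dict[str, Priors]:
    """Build a word -> priors mapping, skipping blank lines."""
    lexicon = {}
    for line in lines:
        if line.strip():
            word, priors = parse_lexicon_line(line)
            lexicon[word] = priors
    return lexicon


# Made-up sample entries, not real MPQA values.
sample = ["good 0.05 0.90 0.05", "", "bad 0.05 0.05 0.90"]
lexicon = load_lexicon(sample)
```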
- Chenghua Lin - Initial work JST - linron84
- Kolja Maier - Extension to dJST
This project is licensed under the GPL License - see the LICENSE.md file for details
- For further documentation on classes and functions, see Help
- He, Y., Lin, C., Gao, W. and Wong, K.-F. Dynamic Joint Sentiment-Topic Model, ACM Transactions on Intelligent Systems and Technology (TIST), 2013.
- Lin, C., He, Y., Everson, R. and Rüger, S. Weakly-supervised Joint Sentiment-Topic Detection from Text, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2011.
- Lin, C. and He, Y. Joint Sentiment/Topic Model for Sentiment Analysis, In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM), Hong Kong, China, 2009.