Lithuanian language processing tools for use in NLP, search, and other applications.
Folder: sentence-detect
OpenNLP model for Lithuanian sentence detection.
Scripts to help with building the model:
- add - append new text to the model (see the comment inside the script)
- train - build the model from example corpora
- evaluate - evaluate detection quality
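Sentence detection quality is commonly reported as precision, recall, and F1 over sentence boundary positions. The sketch below illustrates that idea only; whether the evaluate script reports exactly these figures is an assumption, and the function name `boundary_scores` and the offsets in the usage note are hypothetical.

```python
def boundary_scores(gold, predicted):
    """Return (precision, recall, f1) over sentence-end character offsets.

    `gold` holds the offsets marked in a reference corpus and `predicted`
    holds the offsets proposed by a detector. Both are illustrative inputs,
    not the format used by the scripts in this folder.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0

    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For example, `boundary_scores([10, 25, 40], [10, 25, 33])` compares three reference boundaries with three predicted ones, two of which match, so precision and recall are both two thirds.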
The Snowball version of the Porter stemmer for the Lithuanian language was moved to this page.
Folder: language-detect
N-grams for Lithuanian language detection. Used in Apache Tika; see https://issues.apache.org/jira/browse/TIKA-582
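As a rough illustration of how character n-grams can be used for language detection, the sketch below builds trigram frequency profiles and picks the language whose profile is most similar to the input. This shows the general technique only, not Apache Tika's implementation; the function names, the cosine similarity measure, and the tiny training sentences are all assumptions made for the example, and real profiles would be built from much larger corpora.

```python
from collections import Counter
import math


def trigram_profile(text):
    """Relative frequencies of character trigrams in lowercased text."""
    # Normalize whitespace and pad so word-initial and word-final
    # trigrams at the edges of the text are counted too.
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    total = sum(counts.values())
    return {gram: n / total for gram, n in counts.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse trigram profiles."""
    dot = sum(weight * b.get(gram, 0.0) for gram, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_language(text, profiles):
    """Return the key of the profile most similar to `text`."""
    sample = trigram_profile(text)
    return max(profiles, key=lambda lang: cosine_similarity(sample, profiles[lang]))


# Toy profiles built from two short sentences each (illustrative only).
PROFILES = {
    "lt": trigram_profile(
        "Lietuvių kalba yra viena iš baltų kalbų. "
        "Ji yra valstybinė Lietuvos kalba."
    ),
    "en": trigram_profile(
        "English is a West Germanic language. "
        "It is the official language of many countries."
    ),
}
```

With these toy profiles, `detect_language("Lietuvos kalba", PROFILES)` selects the Lithuanian profile because the input shares most of its trigrams with the Lithuanian sample text.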
Copyright (C) 2011 UAB TokenMill
Distributed under the Eclipse Public License.