TexTa

TexTa is a tagger that extracts contextual information from free text.

Given free text, the script extracts information in four categories: activities, emotions, interactions and places. Each category has its own dictionary, which lists its sub-categories.

The input text is parsed and then matched to the sub-categories by handwritten rules that take syntactic information into account (lemmas, parts of speech, dependency structure, ...).
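To make the idea of rule-based matching concrete, here is a minimal pure-Python sketch. It is not TexTa's actual rule set: the `Token` tuple, the `ACTIVITY_RULES` dictionary, and the helper names are hypothetical, and the lemma and part-of-speech annotations are written by hand instead of being produced by spaCy.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    lemma: str
    pos: str


# Hypothetical dictionary for one category: sub-category -> list of patterns.
# Each pattern is a sequence of per-token constraints on lemma and/or POS.
ACTIVITY_RULES = {
    "leisure": [
        [{"lemma": "play", "pos": "VERB"}],
        [{"lemma": "play", "pos": "VERB"}, {"lemma": "game", "pos": "NOUN"}],
        [{"lemma": "game", "pos": "NOUN"}],
    ],
}


def token_matches(token, constraint):
    """True if the token satisfies every attribute in the constraint."""
    return all(getattr(token, attr) == value for attr, value in constraint.items())


def find_matches(tokens, rules):
    """Return (sub_category, start, end) triples; end is exclusive."""
    matches = []
    for sub_category, patterns in rules.items():
        for pattern in patterns:
            width = len(pattern)
            # Slide the pattern over every window of the same width.
            for start in range(len(tokens) - width + 1):
                window = tokens[start:start + width]
                if all(token_matches(t, c) for t, c in zip(window, pattern)):
                    matches.append((sub_category, start, start + width))
    return sorted(matches, key=lambda m: (m[1], m[2]))


# Hand-annotated tokens for "We're playing games" (assumed, not spaCy output).
tokens = [
    Token("We", "we", "PRON"),
    Token("'re", "be", "AUX"),
    Token("playing", "play", "VERB"),
    Token("games", "game", "NOUN"),
]

matches = find_matches(tokens, ACTIVITY_RULES)
print(matches)
```

Matching on lemmas instead of surface forms lets one rule cover "play", "playing" and "played" without listing each variant.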

Requirements

Requires Python 3.x and the following Python libraries:
- spaCy v2.2.3
- spaCy language model 'en_core_web_sm' v2.2.5
- re (part of the Python standard library, no installation needed)

Installing spaCy and needed models

Install spaCy via pip or your preferred method (see the spaCy installation documentation for more details):

pip install -U spacy==2.2.3

Download the language model:

python -m spacy download en_core_web_sm

Input

text

[choose how to pass the text to the file and how to get the output]

Output

For each category, the tagger returns a list of matches. Each match contains:

- a numeric id for the matched sub-category
- the token index at which the match starts
- the token index at which the match ends

For example, "We're playing games" returns this output:

[(5133706519360878345, 2, 3), (5133706519360878345, 2, 4), (5133706519360878345, 3, 4)]

where:
- 5133706519360878345 is the id for the sub-category 'leisure'
- 2,3 is the span for 'playing'
- 2,4 is the span for 'playing games'
- 3,4 is the span for 'games'

Note that in each span the first number is inclusive and the second is exclusive.
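The spans can be read as Python slices over the token list. The sketch below decodes the example output back into text. The token split is inferred from the spans in the example, and `SUB_CATEGORY_NAMES` and `describe` are hypothetical names used only for illustration; this README gives just the one id and name pair.

```python
# Tokens for "We're playing games", assuming the split implied by the spans.
tokens = ["We", "'re", "playing", "games"]

# Output copied from the example in this README.
matches = [
    (5133706519360878345, 2, 3),
    (5133706519360878345, 2, 4),
    (5133706519360878345, 3, 4),
]

# Hypothetical lookup table from numeric id to sub-category name.
SUB_CATEGORY_NAMES = {5133706519360878345: "leisure"}


def describe(matches, tokens, names):
    """Turn (id, start, end) triples into (sub_category, matched_text) pairs."""
    results = []
    for match_id, start, end in matches:
        # start is included, end is excluded, exactly like a Python slice.
        phrase = " ".join(tokens[start:end])
        results.append((names.get(match_id, str(match_id)), phrase))
    return results


decoded = describe(matches, tokens, SUB_CATEGORY_NAMES)
for name, phrase in decoded:
    print(name, "->", phrase)
```

Because the end index is exclusive, `end - start` gives the number of tokens in the match.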
