This repository has been archived by the owner on Aug 10, 2021. It is now read-only.

Computational experiments with papers related to marine biology

sul-cidr/marinetext

Command-line script that parses plain text files and writes a CSV file identifying the marine species found in each file.

Install

Create a virtual environment and run:

pip install -r requirements.txt

Run

Make sure that all plain text files are in one directory.


  Small program to extract sets of named entities from texts
  based on a defined dictionary.

Options:
  --taxon_file TEXT  tsv file containing taxon data
  --text_dir TEXT    name of directory containing all plain text
                     files to be processed
  --output TEXT      filename for saving the filename and entity
                     data
  --help             Show this message and exit.

Sample command:

python ent_cli.py --taxon_file="WoRMS/taxon.txt" --text_dir=allpapers --output=entities.csv
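
For readers who want a feel for what dictionary-based extraction involves, here is a minimal sketch. It is not the code in ent_cli.py. It assumes the taxon file is tab-separated with a header column named scientificName, that input files end in .txt, and that the output has two columns, filename and entity. The function names and the exact whole-word matching rule are likewise illustrative choices.

```python
import csv
import re
from pathlib import Path


def load_taxon_names(taxon_path, column="scientificName"):
    """Read a tab-separated taxon file and return the set of names in `column`.

    The column name is an assumption; adjust it to match the actual header.
    """
    with open(taxon_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return {row[column].strip() for row in reader if row.get(column)}


def find_entities(text, names):
    """Return the sorted dictionary names that occur in `text` as whole words."""
    found = set()
    for name in names:
        # Lookarounds keep "Mola" from matching inside "Molar".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, text):
            found.add(name)
    return sorted(found)


def extract_to_csv(taxon_path, text_dir, output_path):
    """Write one (filename, entity) row per match for each .txt file in text_dir.

    The .txt extension and the two-column layout are assumptions of this sketch.
    """
    names = load_taxon_names(taxon_path)
    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["filename", "entity"])
        for path in sorted(Path(text_dir).glob("*.txt")):
            text = path.read_text(encoding="utf-8", errors="replace")
            for entity in find_entities(text, names):
                writer.writerow([path.name, entity])
```

Calling extract_to_csv("WoRMS/taxon.txt", "allpapers", "entities.csv") would mirror the sample command above. Running one regular expression per name is easy to read but slow for a large dictionary; a real run would likely need a faster lookup strategy.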
