Update documentation
Mike Bennett committed Apr 21, 2022
1 parent f24f217 commit f61f213
Showing 1 changed file with 59 additions and 6 deletions.
65 changes: 59 additions & 6 deletions README.md
@@ -9,17 +9,23 @@ NLP Server provides a simple API for non-python programming languages to access

The server is simple to set up and easy to integrate with your programming language of choice.

## DigitalDogsbody Fork
This fork adds a second language prediction service, using the [FastText](https://fasttext.cc/) library, in support of the fantastic work of [The Archipelago Team](https://github.com/esmero).

An extra endpoint has been added at `/fasttext` and is documented below. Additionally, the web interface, requirements file and license file have been updated. Everything else is left as upstream and the extra functionality of this fork is only for the Python version herein - ports to the PHP and Laravel versions are welcome :-)

## PHP & Laravel clients
A PHP library and a Laravel package are available:
* https://github.com/web64/php-nlp-client
* https://github.com/web64/laravel-nlp


## Step 1: Core Installation
The NLP Server has been tested on Ubuntu, but should work on other versions of Linux.
The upstream NLP Server project has been tested on Ubuntu, and this fork has been tested on Debian Buster (via the [Archipelago nlpserver Dockerfile](https://github.com/esmero/archipelago-docker-images/tree/main/nlpserver)), but it should work on other Linux distributions.

```bash
git clone https://github.com/web64/nlpserver.git
cd nlpserver
git clone https://github.com/digitaldogsbody/nlpserver-fasttext.git
cd nlpserver-fasttext

sudo apt-get install -y libicu-dev python3-pip
sudo apt-get install polyglot
@@ -50,7 +56,10 @@ python3 -m spacy download en
python3 -m spacy download es
python3 -m spacy download xx
```

### Step 4: Download the FastText language classification model
```bash
curl -L "https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin" -O
```
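
For reference, the downloaded model can also be queried directly with the `fasttext` Python package. This is a hedged sketch and not necessarily how the server's `/fasttext` endpoint is implemented: `clean_predictions` and `predict_languages` are hypothetical helper names, and the default model path assumes `lid.176.bin` sits in the current directory.

```python
def clean_predictions(labels, scores):
    """Pair labels with scores, stripping FastText's "__label__" prefix."""
    prefix = "__label__"
    return [
        (label[len(prefix):] if label.startswith(prefix) else label, float(score))
        for label, score in zip(labels, scores)
    ]


def predict_languages(text, k=1, model_path="lid.176.bin"):
    """Return up to k (language, score) pairs for a single line of text."""
    import fasttext  # imported lazily so clean_predictions works without it

    model = fasttext.load_model(model_path)
    # predict() rejects newlines, so collapse the text onto one line first.
    labels, scores = model.predict(text.replace("\n", " "), k=k)
    return clean_predictions(labels, scores)
```

Calling `predict_languages("what language is this", k=3)` would load the model and return the three most likely languages. A long-running service would normally load the model once and reuse it rather than reloading it on every call.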

### Detailed Installation
If you have any problems installing from requirements.txt you can instead install the libraries one by one.
@@ -73,6 +82,7 @@ sudo pip3 install readability-lxml
sudo pip3 install BeautifulSoup4
sudo pip3 install afinn
sudo pip3 install textblob
sudo pip3 install git+https://github.com/facebookresearch/fastText.git
```
The `/status` API endpoint lists missing Python modules: http://localhost:6400/status

@@ -126,10 +136,12 @@ Endpoint|Method|Parameters|Info|Library
/polyglot/entities|POST|text,lang|Entity extraction and sentiment analysis for provided text|polyglot
/polyglot/sentiment|POST|text,lang|Sentiment analysis for provided text|polyglot
/polyglot/neighbours|GET|word,lang|Embeddings: neighbouring words|polyglot
/langid|GET,POST|text|Language detection for provided text|langid
/langid|GET,POST|text|Language detection for provided text with langid|langid
/fasttext|GET,POST|text,predictions|Language detection for provided text with FastText|fasttext
/gensim/summarize|POST|text,word_count|Summarization of long text|gensim
/spacy/entities|POST|text,lang|Entity extraction for provided text in given language|SpaCy


## Usage
For API responses see /response_examples/ directory.

@@ -149,7 +161,7 @@ Example JSON response: https://raw.githubusercontent.com/web64/nlpserver/master/
curl -d "html=<html>...</html>" http://localhost:6400/newspaper
```

### Language Detection
### Language Detection with langid
`GET|POST /langid?text=what+language+is+this`

```bash
@@ -164,6 +176,47 @@ langid: {
}
```

### Language Detection with FastText
`GET|POST /fasttext?text=what+language+is+this`

```bash
curl "http://localhost:6400/fasttext?text=what+language+is+this"
```

Returns the language code and confidence score for the provided text:
```json
"fasttext": {
"language": "en",
"score": 0.9485139846801758
}
```

An optional parameter `predictions` allows more than one candidate language to be predicted by fasttext:
```bash
curl "http://localhost:6400/fasttext?text=what+language+is+this&predictions=3"
```

```json
"fasttext": {
"language": "en",
"score": 0.9485139846801758,
"results": [
[
"en",
0.9485139846801758
],
[
"bn",
0.009047050029039383
],
[
"ru",
0.005073812324553728
]
]
}
```
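
The same requests can be made from Python. The following is a minimal client sketch, assuming the server is reachable at `localhost:6400` and that the response is a JSON object with a top-level `fasttext` key, as the fragments above suggest. The helper names `build_fasttext_url`, `parse_fasttext` and `detect_language` are hypothetical and not part of this project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:6400"  # assumed host and port, taken from the curl examples


def build_fasttext_url(text, predictions=None, base_url=BASE_URL):
    """Build a GET URL for /fasttext; urlencode escapes the text safely."""
    params = {"text": text}
    if predictions is not None:
        params["predictions"] = predictions
    return f"{base_url}/fasttext?{urlencode(params)}"


def parse_fasttext(payload):
    """Return (language, score, candidates) from a decoded response.

    Assumes a dict with a top-level "fasttext" key. "results" may be
    absent, as in the single-prediction example, so it defaults to [].
    """
    data = payload["fasttext"]
    candidates = [(lang, score) for lang, score in data.get("results", [])]
    return data["language"], data["score"], candidates


def detect_language(text, predictions=None):
    """Query a running server; requires the NLP Server to be reachable."""
    with urlopen(build_fasttext_url(text, predictions)) as response:
        return parse_fasttext(json.load(response))
```

Building the URL with `urlencode` avoids the shell-quoting issues that `?` and `&` cause on the command line, and it escapes arbitrary input text.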

### Polyglot Entity Extraction & Sentiment Analysis
`POST /polyglot/entities [params: text]`
```bash