Nera.vim

Named Entities Recognition (Rasa-like syntax) Annotator for the vim editor.

🤔 what is it for?

This vim plugin helps you annotate named entities with a simple inline mark-up syntax, following the format used in RASA YAML files.

The final goal is to demonstrate how fast it can be to annotate, with a text editor (specifically vim), a text file containing the intent and entity examples of a training set.

What is the named entity [entity_value](entity_label) syntax format?

[entity_value](entity_label)
 ^             ^
 |             |
 |             entity name (label)
 |
 value (sequence of characters/words) for entity referenced with `entity_label`

Where:

  • entity_value

    • is any sequence of characters or words,
    • delimited by characters [ and ]
  • entity_label

    • is the entity name (label): a string in the style of a programming-language variable name, for example made of alphabet letters and the character _
    • the label is delimited by characters ( and )

For example, given the sentence

my name is Giorgio Robino and I live in Genova, corso Magenta 35/4

You want to annotate three entities (entity_label = entity_value):

  • person = Giorgio Robino
  • city = Genova
  • address = corso Magenta 35/4

Using the syntax described above, the annotated sentence is:

my name is [Giorgio Robino](person) and I live in [Genova](city), [corso Magenta 35/4](address)
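
To make the syntax concrete, here is a minimal Python sketch that reads annotated lines back. It is not part of the plugin: the regular expression and the helper names `extract_entities` and `strip_annotations` are assumptions made for illustration, and the pattern assumes that a value never contains `]` and that a label looks like a variable name.

```python
import re

# Assumed pattern (not taken from the plugin): the value is any text
# without ']' and the label is a variable-like name.
ENTITY_RE = re.compile(r"\[([^\]]+)\]\(([A-Za-z_][A-Za-z0-9_]*)\)")


def extract_entities(annotated):
    """Return (label, value) pairs in order of appearance."""
    return [(m.group(2), m.group(1)) for m in ENTITY_RE.finditer(annotated)]


def strip_annotations(annotated):
    """Return the plain sentence with the mark-up removed."""
    return ENTITY_RE.sub(r"\1", annotated)


sentence = (
    "my name is [Giorgio Robino](person) and I live in "
    "[Genova](city), [corso Magenta 35/4](address)"
)
print(extract_entities(sentence))
print(strip_annotations(sentence))
```

Running it on the annotated sentence above should give back the three entities (person, city, address) and the original plain sentence.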

What does the plugin do?

With the vim plugin command :NeraSet, you can map up to 12 function keys (<F1>,...,<F12>) to a syntax substitution/decoration "macro" that adds an entity label

  • to a visually selected text,
  • or to the current word plus a configurable number of adjacent words, after placing the cursor at the start of the word (entity) you want to tag.

👊 Commands

In vim command mode (:) these commands are available:

| command | description |
|---|---|
| :NeraSet functionKey label [contiguousWords] | maps the specified functionKey to a substitution macro with argument label and optional argument contiguousWords |

Where:

  • functionKey: valid values are the numbers 1 ... 12, the strings F1 ... F12, or the <F1> ... <F12> keys themselves (just press the key).
  • label: the entity name (a single word in camelCase or snake_case).
  • contiguousWords: the number of contiguous words to be selected; this argument is optional (default value is 1).

Utilities:

| command | description |
|---|---|
| :NeraMapping | shows the function key mappings |
| :NeraLoad command_script_file | loads and executes a script file containing Nera commands or any other vim : commands |
| :NeraLabels label ... [label] | sets a list of labels, to be used afterward with NeraSet |
| :NeraLabelsClear | clears the list of preset labels |

Usage

:NeraSet for current (single) word annotation

Given the sentence (line):

my name is Giorgio Robino and I live in Genova, corso Magenta 35/4

To map function key <F1> to a substitution that works on a single word (and also on a visual selection):

  • assign a new "macro" substitution to <F1>

    :NeraSet f1 person_name
    
  • put the cursor at the beginning of the word you want to annotate:

    my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
               ^
               |
               set the vim cursor here
    
  • press <F1>. The line is updated with the entity notation syntax decoration:

    my name is [Giorgio](person_name) Robino and I live in Genova, corso Magenta 35/4
    

:NeraSet for multiple contiguous words annotation

The example above may not be exactly what you want: a full person name usually consists of two consecutive words (Giorgio Robino), so you may prefer to map a function key (another one, or the same one) so that it annotates the current word together with the following one. To do so, set the contiguousWords argument to 2:

  • assign a new "macro" substitution to <F2>

    :NeraSet F2 person_name 2
    
  • again, put the cursor at the beginning of the word you want to annotate:

    my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
               ^
               |
               set the vim cursor here
    
  • press <F2>. The line is updated; in this case both words are annotated:

    my name is [Giorgio Robino](person_name) and I live in Genova, corso Magenta 35/4
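
The effect of the contiguousWords argument can be sketched in a few lines of Python. This is only an illustration of the substitution, not the plugin's implementation: the function name `annotate_words` is hypothetical, and it treats words as whitespace-separated tokens, which differs from vim's own word motion around punctuation (a trailing comma would be included here).

```python
def annotate_words(line, cursor, label, contiguous_words=1):
    """Wrap `contiguous_words` whitespace-separated tokens, starting at the
    0-based column `cursor`, as [tokens](label).

    Assumption: a "word" is a run of non-space characters; vim's word
    motion handles punctuation differently.
    """
    end = cursor
    for _ in range(contiguous_words):
        # Skip the spaces before the next token, then consume the token.
        while end < len(line) and line[end].isspace():
            end += 1
        while end < len(line) and not line[end].isspace():
            end += 1
    value = line[cursor:end]
    return f"{line[:cursor]}[{value}]({label}){line[end:]}"


line = "my name is Giorgio Robino and I live in Genova, corso Magenta 35/4"
cursor = line.index("Giorgio")  # column where the cursor would be placed

print(annotate_words(line, cursor, "person_name"))     # like <F1> above
print(annotate_words(line, cursor, "person_name", 2))  # like <F2> above
```

With the default of one word only Giorgio is wrapped, while passing 2 wraps Giorgio Robino, mirroring the two examples above.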
    

:NeraSet for visual mode annotation

Even if you do not specify the number-of-words argument, you can still annotate using visual selection mode:

  • assign a new "macro" substitution to <F3>

    :NeraSet <F3> address
    
  • go in vim visual mode (pressing v) and select the span you want to annotate:

    my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
                                            ^                        ^
                                            |                        |
                                            start visual selection   end visual selection
    
  • press Esc and then <F3>. The line is updated; in this case the whole selection is annotated:

    my name is [Giorgio Robino](person_name) and I live in [Genova, corso Magenta 35/4](address)
    

:NeraLabels :NeraLabelsClear

Suppose you want to prepare a precise (short) list of the labels you will use afterward to annotate with NeraSet:

:NeraLabels name surname address city age gender 

This list acts as the reference list used to validate the NeraSet label argument. For example,

:NeraSet <F4> job

generates a warning message, because you are setting a label that was not previously declared:

warning: label 'job' is not one of the configured labels: name surname address city age gender
functionKey: <F4>, label: job, contiguous words: 1
Press ENTER or type command to continue
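
The validation described above amounts to a membership check against the preset list. A minimal Python sketch follows; the function name `check_label` is hypothetical, and the choice to skip validation when no labels have been preset is an assumption, not documented plugin behavior.

```python
def check_label(label, configured_labels):
    """Return a warning message if `label` is not in the preset list,
    otherwise None.

    Assumption: an empty list (for example after clearing the labels)
    disables the check.
    """
    if configured_labels and label not in configured_labels:
        return (
            f"warning: label '{label}' is not one of the configured labels: "
            + " ".join(configured_labels)
        )
    return None


labels = ["name", "surname", "address", "city", "age", "gender"]
print(check_label("job", labels))   # a warning message
print(check_label("city", labels))  # None: the label is declared
```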

:NeraLoad

Executes all the Nera commands previously saved in the specified script file.

  1. Create your script file examples/my_project_configs.vim containing Nera or other vim commands, for example:

    "
    " my_project_configs.vim
    "
    
    " F1 - F4
    NeraSet <F1>  name 1
    NeraSet <F2>  address 1 
    NeraSet <F3>  company 1 
    NeraSet <F4>  location 1 
    
    " F5 - F8
    NeraSet <F5>  email 
    NeraSet <F6>  name 2 
    NeraSet <F7>  name 3
    NeraSet <F8>  address 1 
    
    " F9 - F12
    NeraSet <F9>  gender
    NeraSet <F10> address 3
    NeraSet <F12> company 2 
    
  2. Afterward you run the script from command mode:

    :NeraLoad examples/my_project_configs.vim
    

:NeraMapping

Suppose you run these commands:

:NeraSet <F1>  name 1
:NeraSet <F2>  address 1 
:NeraSet <F3>  company 1 
:NeraSet <F4>  location 1 
:NeraSet <F5>  email 

Afterward, you want to show the key mappings:

:NeraMapping
<F1>  c1w[<C-R><C-O>"](name)<Esc>
<F2>  c1w[<C-R><C-O>"](address)<Esc>
<F3>  c1w[<C-R><C-O>"](company)<Esc>
<F4>  c1w[<C-R><C-O>"](location)<Esc>
<F5>  c1w[<C-R><C-O>"](email)<Esc>
<F6>  
<F7>  
<F8>  
<F9>  
<F10> 
<F11>
<F12> 
Press ENTER or type command to continue

💡 Tips

  • Commands arguments auto completion

    When using the NeraSet command you can use argument auto completion (function key, labels, etc.). When using the NeraLoad command you can use file name auto completion.

  • Undo labeling

    If you are unhappy with your NeraSet labeling, just undo in vim as usual, pressing u in normal mode!

  • Visual mode is always on

    Any time you assign a key with NeraSet, you set the word mode for the specified number of contiguous words, but the same key also works in visual mode! You can either

    • set the cursor at the start of a word and then press the function key
    • select a span of words in visual mode and then press the function key

📦 Install

Using vim-plug, in your .vimrc file:

Plug 'solyarisoftware/nera.vim'

Live demo

Some example files are available in the examples directory of this repo. A live demo shows the plugin commands being used to annotate entities.

⭐️ Status / How to contribute

This project is a work-in-progress proof-of-concept.

I'm not a vimscript expert, so any coding contribution is welcome.

For proposals and issues (bugs, suggestions, etc.), please use GitHub issues. You can also contact me via email ([email protected]).

I'm especially interested in any markup-based entity syntax formats that are alternative to or different from RASA. Please let me know. Do not hesitate to open a 'change request' issue.

IF YOU LIKE THE PROJECT, PLEASE ⭐️STAR THIS REPOSITORY TO SHOW YOUR SUPPORT! 🙏

To do

Changelog

  • v. 0.5.0

    • new commands NeraLabels and NeraLabelsClear
    • NeraSet arguments auto-completion
  • v. 0.4.1

    • NeraLoad new command to load script of commands
    • NeraMapping has now a cleaner list of key mappings
    • NeraSet now accepts the function key argument by just pressing the corresponding function key!

👏 Acknowledgements

  • Thank you to biggybi, who helped me here and inspired me to build this plugin.

🤝 Related Project

License

MIT (c) Giorgio Robino
