Various utilities regarding Levenshtein transducers. (Shared files for testing, etc.)

universal-automata/liblevenshtein-shared

liblevenshtein

A library for generating Finite State Transducers based on Levenshtein Automata.

This particular module contains files that are shared among the supported languages, for testing, etc.

Levenshtein transducers accept a query term and return all terms in a dictionary that are within n spelling errors of it. They constitute a highly efficient (in both space and time) class of spelling correctors that work very well when you do not require context while making suggestions. Forget about performing a linear scan over your dictionary to find all terms that are sufficiently close to the user's query, using a quadratic implementation of the Levenshtein distance or Damerau-Levenshtein distance. These babies find all the matching terms from your dictionary in time linear in the length of the query term (not in the size of the dictionary).
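For comparison, here is a minimal sketch of the naive approach described above: a quadratic dynamic-programming Levenshtein distance applied to every dictionary term in a linear scan. This is not the library's API or its algorithm; the names `levenshtein`, `linearScan`, and the sample dictionary are made up for illustration. It only shows the baseline that a transducer is meant to replace.

```javascript
// Baseline only (NOT the transducer): classic dynamic-programming
// Levenshtein distance, O(|a| * |b|) time, keeping two rolling rows.
function levenshtein(a, b) {
  // prev[j] = distance between the empty prefix of a and the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance from the first i chars of a to the empty string
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Linear scan: one distance computation per dictionary term, so the total
// cost grows with the size of the dictionary.
function linearScan(dictionary, query, n) {
  return dictionary.filter((term) => levenshtein(query, term) <= n);
}

// Hypothetical sample data.
const dictionary = ["cat", "cart", "care", "dog", "coat"];
const matches = linearScan(dictionary, "cat", 1);
```

With the sample data, `matches` holds the terms within one edit of "cat". The scan touches every term in the dictionary, which is the cost the transducer approach is designed to avoid.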

If you need context, then take the candidates generated by the transducer as a starting place, and plug them into whatever model you're using for context (such as by selecting the sequence of terms that have the greatest probability of appearing together).
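As a rough sketch of that idea, the snippet below re-ranks a list of candidates by how often each one follows the previous word. The bigram counts, the `rankByContext` function, and the candidate list are all hypothetical; a real context model would be estimated from a corpus and would likely score longer sequences.

```javascript
// Hypothetical bigram counts ("previous word + candidate" -> frequency).
// A real model would be estimated from a corpus.
const bigramCounts = new Map([
  ["the cat", 120],
  ["the cart", 15],
  ["the coat", 40],
]);

// Order candidates by how often they follow previousWord.
// Unseen pairs get a score of 0 and sink to the end.
function rankByContext(previousWord, candidates, counts) {
  return candidates
    .map((term) => ({
      term,
      score: counts.get(`${previousWord} ${term}`) ?? 0,
    }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.term);
}

// The candidates stand in for whatever the transducer returned for the query.
const ranked = rankByContext("the", ["cart", "coat", "cat"], bigramCounts);
```

Here `ranked` puts "cat" first because the made-up counts say it follows "the" most often. The transducer narrows the dictionary down to plausible spellings, and the context model only has to choose among that short list.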

For a quick demonstration, please visit the GitHub Page.

The library is currently written only in CoffeeScript (and JavaScript), but I will be porting it to other languages soon. If you have a specific language you would like to see it in, or a package-management system you would like it deployed to, let me know.

This library is based largely on the work of Stoyan Mihov, Klaus Schulz, and Petar Nikolaev Mitankin: "Fast String Correction with Levenshtein-Automata". For more details, please see the wiki.
