Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trie compression via double-array tries #29

Open
dylon opened this issue Jul 23, 2014 · 2 comments
Open

Trie compression via double-array tries #29

dylon opened this issue Jul 23, 2014 · 2 comments
Assignees
Milestone

Comments

@dylon
Copy link
Member

dylon commented Jul 23, 2014

Here's one implementation: http://linux.thai.net/~thep/datrie/

For liblevenshtein, implement the algorithm described in, "Compression of double array structures for fixed length keywords", by Masao Fuketa, Hiroya Kitagawa, Takuki Ogawa, Kazuhiro Morita, and Jun-ichi Aoe. Have several implementations: one which uses a dense array as described by the paper and another that uses jagged arrays as a sparse matrix. Extend the algorithms in the paper to return the outgoing labels from the current state.

Some Papers:

  1. "Space-efficient Static Trees and Graphs", by Guy Jacobson (maybe: it has a high compression rate but slow query time)
  2. "An Efficient Implementation of Trie Structures", by Jun-Ichi Aoe, Katsushi Morimoto, and Takashi Sato
  3. "An efficient deletion method for a minimal prefix double array", by Susumu Yata, Masaki Oono, Kazuhiro Morita, Masao Fuketa, and Jun-ichi Aoe
  4. "A compact static double-array keeping character codes", by Susumu Yata, Masaki Oono, Kazuhiro Morita, Masao Fuketa, Toru Sumitomo, and Jun-ichi Aoe
  5. "Compression of double array structures for fixed length keywords", by Masao Fuketa, Hiroya Kitagawa, Takuki Ogawa, Kazuhiro Morita, and Jun-ichi Aoe
  6. "New methods for compression of MP double array by compact management of suffixes", by Tshering C. Dorji, El-sayed Atlam, Susumu Yata, Mahmoud Rokaya, Masao Fuketa, Kazuhiro Morita, and Jun-ichi Aoe
  7. "B-tries for disk-based string management", by Nikolas Askitis and Justin Zobel (disk-based storage of larger-than-memory tries)
@dylon dylon added this to the 2.1.0 milestone Jul 23, 2014
@dylon dylon self-assigned this Jul 23, 2014
@lygstate
Copy link

lygstate commented Jul 24, 2016

Ping for A new compression method for double-array structures by a hierarchical representation
http://www.inderscienceonline.com/doi/abs/10.1504/IJISTA.2015.074331

@dylon
Copy link
Member Author

dylon commented Jul 26, 2016

Nice! I'll check it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants