Skip to content

pwbh/inverted-index

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Inverted index

This is a concurrent implementation of the Inverted Index data structure using Rust programming language.

A quote about inverted index from Wikipedia

"In computer science, an inverted index (also referred to as a postings list, postings file, or inverted file) is a database index storing a mapping from content, such as words or numbers, to its locations in a table, or in a document or a set of documents (named in contrast to a forward index, which maps from documents to content).

The purpose of an inverted index is to allow fast full-text searches, at a cost of increased processing when a document is added to the database. The inverted file may be the database file itself, rather than its index.

It is the most popular data structure used in document retrieval systems, used on a large scale for example in search engines. Additionally, several significant general-purpose mainframe-based database management systems have used inverted list architectures, including ADABAS, DATACOM/DB, and Model 204."

Currently this implementation uses a plain resizable array for storing the indexes, I may change in the future for something a little more appropriate. Feel free to send in a PR.

Usage

fn main() {
    let file_path = "./my_files/texts/some_text.txt"
    let inverted_index = inverted_index::InvertedIndex::new(file_path);
    inverted_index.index(100).unwrap();
    println!("{}", inverted_index);
}

Performance

LoT Single Thread Multi Thread (100 worker threads)
80,000 3m 49.75s 3.686s

About

Multi threaded inverted index

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages