Blog with a homebrewed Search™ gem (./gems/search).
- Ruby on Rails 5
- PostgreSQL
- ReactJS
- ES6
The entry point of the app is app/controllers/posts_controller.rb. It implements an API for ReactJS and renders the main layout.
Search is implemented with the search gem. The app defines a search class (app/searches/posts_search.rb), which sets up callbacks on the model being indexed.
Search™ provides a tiny but extensible DSL for defining indexes. It also adds two callbacks to the given model (after_save and after_destroy).
# app/searches/posts_search.rb
class PostsSearch < Search::Index
  model Post

  text_field :title,
             char_filter: :phonetic,
             tokenizer: [:standard, :ngram],
             token_filter: [:lowercase, :stopword, :stemmer]

  text_field :body,
             char_filter: [:phonetic, :strip_html],
             tokenizer: :standard,
             token_filter: [:lowercase, :stopword, :stemmer, { length: { min: 2 } }]

  text_field :author_name, token_filter: :lowercase
end
Search™ stores tokens in a Search::Token model. Run the install task, which generates a migration that creates the table for it:
bundle exec rake search:install
Search™ runs the defined performers one by one in a fixed order: Tokenizers -> CharFilters -> TokenFilters.
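To make the ordering concrete, here is a minimal sketch in plain Ruby. It does not use the gem's classes: the three stage lists and the `analyze` method are hypothetical stand-ins, and each performer is assumed to be a callable that takes and returns an array of strings.

```ruby
# Hypothetical stand-ins for performers (not the gem's real classes).
# Each one takes an array of token strings and returns a new array.
TOKENIZERS    = [->(tokens) { tokens.flat_map { |t| t.split(/\s+/) } }].freeze
CHAR_FILTERS  = [->(tokens) { tokens.map { |t| t.gsub(/<[^>]*>/, '') } }].freeze
TOKEN_FILTERS = [
  ->(tokens) { tokens.map(&:downcase) },
  ->(tokens) { tokens.reject(&:empty?) }
].freeze

# Runs every stage in the documented order, feeding the output of one
# performer into the next.
def analyze(input)
  (TOKENIZERS + CHAR_FILTERS + TOKEN_FILTERS).reduce([input]) do |tokens, performer|
    performer.call(tokens)
  end
end

analyze('Fast <b>Red</b> Car') # => ["fast", "red", "car"]
```

Folding over a flat list of callables keeps the order explicit and makes it cheap to add a stage.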
Tokenizers split an input string into tokens.
The plain tokenizer does nothing: it stores the whole input string as a single token.
text_field :body, tokenizer: :plain
This is the default tokenizer.
The standard tokenizer splits an input string on whitespace and punctuation.
text_field :body, tokenizer: :standard
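As a rough illustration of splitting on whitespace and punctuation, the sketch below uses a single regular expression. The method name `standard_tokenize` is hypothetical, and the gem's own tokenizer may treat edge cases (apostrophes, hyphens) differently.

```ruby
# Splits on any run of whitespace or punctuation and drops empty pieces
# (a leading separator would otherwise yield an empty first element).
def standard_tokenize(input)
  input.split(/[\s[:punct:]]+/).reject(&:empty?)
end

standard_tokenize("Hello, world! It's fine.") # => ["Hello", "world", "It", "s", "fine"]
```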
The ngram tokenizer splits an input string into n-grams of a given length. It accepts additional parameters:
- size: size of the n-grams (default 3);
- word_separator: the input is split into words by this parameter before n-grams are built (default ' ');
- padchar: fills the remaining positions of an n-gram (default '').
text_field :body, tokenizer: :ngram
text_field :title, tokenizer: { ngram: { size: 5 } }
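The sketch below shows one plausible reading of these parameters in plain Ruby. The `ngrams` method is hypothetical, and two behaviours are assumptions rather than facts about the gem: words shorter than `size` are padded on the right with `padchar`, and with an empty `padchar` they are emitted unchanged.

```ruby
# Builds sliding-window n-grams for every word in the input.
def ngrams(input, size: 3, word_separator: ' ', padchar: '')
  input.split(word_separator).flat_map do |word|
    # Assumption: short words are right-padded; String#ljust rejects an
    # empty pad string, hence the guard.
    word = word.ljust(size, padchar) unless padchar.empty?
    next [word] if word.length <= size

    # Slide a window of `size` characters across the word.
    (0..word.length - size).map { |i| word[i, size] }
  end
end

ngrams('search me')          # => ["sea", "ear", "arc", "rch", "me"]
ngrams('me', padchar: '_')   # => ["me_"]
ngrams('search', size: 5)    # => ["searc", "earch"]
```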
Char filters mutate characters inside each token.
The phonetic char filter replaces common phonetic patterns.
text_field :body, char_filter: :phonetic
The strip_html char filter removes HTML tags from an input string.
text_field :body, char_filter: :strip_html
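For intuition, both char filters can be approximated with `String#gsub`. This is only a sketch: the replacement table is invented for illustration (the gem's actual phonetic patterns are not listed here), and the regex-based tag stripping is deliberately naive compared to a real HTML sanitizer.

```ruby
# Hypothetical replacement table; the real patterns may differ.
PHONETIC_PATTERNS = { 'ph' => 'f', 'ck' => 'k' }.freeze

# Replaces every occurrence of a known pattern with its substitute.
def phonetic(term)
  term.gsub(Regexp.union(PHONETIC_PATTERNS.keys), PHONETIC_PATTERNS)
end

# Naive tag stripping: removes anything that looks like <...>.
def strip_html(term)
  term.gsub(/<[^>]*>/, '')
end

phonetic('photo block')                 # => "foto blok"
strip_html('<p>Hello <b>world</b></p>') # => "Hello world"
```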
Token filters mutate and filter whole tokens.
The lowercase filter converts all characters to lowercase.
text_field :body, token_filter: :lowercase
The stopword filter removes stopwords (for example a/the/...) from tokens.
text_field :body, token_filter: :stopword
The stemmer filter stems tokens.
text_field :body, token_filter: :stemmer
The length filter keeps only tokens whose length falls within a given range. Possible parameters:
- min
- max
text_field :body, token_filter: :length
text_field :title, token_filter: { length: { min: 2, max: 10 } }
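The lowercase, stopword and length filters are simple enough to sketch in plain Ruby. The method names and the stopword list below are assumptions made for illustration. Stemming is left out because it needs a real stemming algorithm.

```ruby
# Assumed stopword list, for illustration only.
STOPWORDS = %w[a an the].freeze

def lowercase(tokens)
  tokens.map(&:downcase)
end

def stopword(tokens)
  tokens.reject { |token| STOPWORDS.include?(token) }
end

# Keeps tokens whose length lies within [min, max]; either bound is optional.
def length_filter(tokens, min: nil, max: nil)
  tokens.select do |token|
    (min.nil? || token.length >= min) && (max.nil? || token.length <= max)
  end
end

tokens = %w[The Car is a Lamborghini]
length_filter(stopword(lowercase(tokens)), min: 2, max: 10) # => ["car", "is"]
```

Note that the order matters: lowercasing first lets the stopword list stay lowercase-only.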
Each performer must define a perform block that returns an array of Search::Token (the flat_map_tokens helper takes care of that).
# lib/search/token_filter/bang.rb
class Search::TokenFilter::Bang
  include Search::Performing

  perform do |string_or_tokens|
    flat_map_tokens(string_or_tokens) do |token|
      token.term += '!'
    end
  end
end
PostsSearch.index! # indexes all posts
PostsSearch.index!(Post.first) # indexes the given record
PostsSearch.index!(Post.where(author_id: 1)) # indexes the given scope
PostsSearch.delete! # deletes all records from the index
PostsSearch.delete!(Post.first) # deletes only the given record from the index
PostsSearch.delete!(Post.where(author_id: 1)) # deletes all records in the given scope from the index
You can also pass the same arguments to Search.index! and Search.delete! to update every index that is defined for the record's model.
# app/searches/posts_search.rb
class PostsSearch < Search::Index
  model Post
end

# app/searches/autocomplete_search.rb
class AutocompleteSearch < Search::Index
  model Post
end
Search.index!(Post.first) # updates the record in both indexes
Search.delete!(Post.first) # deletes the record from both indexes
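One way such a fan-out could work is a registry that maps each model class to the indexes declared for it. The sketch below is plain Ruby and not the gem's implementation: `SearchSketch`, `IndexSketch`, the `indexed` reader and the `Struct`-based `Post` are all hypothetical names used only to show the idea.

```ruby
# Hypothetical registry: model class => index classes declared for it.
module SearchSketch
  @indexes = Hash.new { |hash, model| hash[model] = [] }

  def self.register(model, index)
    @indexes[model] << index
  end

  # Forwards the record to every index registered for its class.
  def self.index!(record)
    @indexes[record.class].each { |index| index.index!(record) }
  end
end

# Hypothetical base class: `model` registers the subclass in the registry.
class IndexSketch
  class << self
    attr_reader :indexed

    def model(klass)
      SearchSketch.register(klass, self)
    end

    # Stand-in for real indexing: just remembers the record.
    def index!(record)
      (@indexed ||= []) << record
    end
  end
end

Post = Struct.new(:title) # stand-in for the ActiveRecord model

class PostsIndex < IndexSketch
  model Post
end

class AutocompleteIndex < IndexSketch
  model Post
end

SearchSketch.index!(Post.new('Car'))
PostsIndex.indexed.size        # => 1
AutocompleteIndex.indexed.size # => 1
```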
PostsSearch.search('Car') # returns ActiveRecord::Relation with matched posts
PostsSearch.search('Car', highlight: true) # assigns `highlights` attribute with matched terms to every post
cd gems/search
cp spec/internal/config/database.yml{.example,}
rake
Add require 'search/rspec' to your own spec_helper.rb to enable some useful matchers:
expect { Post.create }.to index
expect { Post.destroy_all }.to delete_from_index
let(:post) { Post.create }
expect { post }.to index(post)
let!(:post) { Post.create }
expect { post.destroy }.to delete_from_index(post)