Understanding how Simila works

Mehran Davoudi edited this page Oct 3, 2015 · 1 revision

How Simila Works

How Simila Structures a String

When Simila starts comparing two strings, it breaks each of them into phrases.

Phrase: A string consists of one or more phrases separated by the '.' character.

For example:

String: Today is a good day. Tomorrow will be better.

Phrases: [Today is a good day] + [Tomorrow will be better]

After that, Simila breaks phrases into words.

Word: A phrase consists of one or more words separated by non-alphanumeric characters.

Phrase: It is a good day

Words: [It]+[is]+[a]+[good]+[day]

Finally, Simila breaks words into characters.

Character: A word consists of one or more characters.

Word: Tomorrow

Characters: [T]+[o]+[m]+[o]+[r]+[r]+[o]+[w]
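The three-level breakdown above can be sketched in a few lines of C#. This is only an illustration of the splitting rules as described on this page, not Simila's actual code: the `StructureSketch` class and its method names are hypothetical, and plain `string` and `char` values stand in for Simila's own `Phrase` and `Word` types. It also assumes that "alphanumeric" means Unicode letters and digits, and that empty pieces (such as the one after a trailing '.') are discarded.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Simila: mirrors the splitting rules described above.
static class StructureSketch
{
    // A string is split into phrases on the '.' character.
    // Surrounding whitespace is trimmed and empty pieces are dropped (assumption).
    public static string[] SplitPhrases(string text) =>
        text.Split('.')
            .Select(p => p.Trim())
            .Where(p => p.Length > 0)
            .ToArray();

    // A phrase is split into words on runs of non-alphanumeric characters.
    // "Alphanumeric" is taken here to mean Unicode letters and numbers (assumption).
    public static string[] SplitWords(string phrase) =>
        Regex.Split(phrase, @"[^\p{L}\p{N}]+")
             .Where(w => w.Length > 0)
             .ToArray();

    // A word is simply its sequence of characters.
    public static char[] SplitCharacters(string word) => word.ToCharArray();
}
```

Under these assumptions, `StructureSketch.SplitPhrases("Today is a good day. Tomorrow will be better.")` yields the two phrases from the example above, and `StructureSketch.SplitWords("It is a good day")` yields the five words.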

How Simila Compares Everything

In Simila, there is a very basic interface:

interface ISimilarityResolver<T>
{
    float GetSimilarity(T left, T right);
}
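To make the contract concrete, here is a minimal sketch of one possible implementation for characters. It is not Simila's default resolver: the class name `CaseInsensitiveCharResolver` is hypothetical, the 0.9 score for a case-only difference is an arbitrary illustrative value, and the sketch assumes that similarity is expressed as a number between 0 (nothing in common) and 1 (identical), which this page does not state. The interface is repeated so the snippet compiles on its own.

```csharp
// The interface from the text above, repeated so this sketch is self-contained.
interface ISimilarityResolver<T>
{
    float GetSimilarity(T left, T right);
}

// Hypothetical resolver, not Simila's default implementation.
// Assumes scores range from 0 (different) to 1 (identical).
class CaseInsensitiveCharResolver : ISimilarityResolver<char>
{
    public float GetSimilarity(char left, char right)
    {
        // Identical characters are a perfect match.
        if (left == right) return 1f;

        // Same letter in a different case: close, but not identical.
        // The value 0.9 is arbitrary and chosen only for illustration.
        if (char.ToUpperInvariant(left) == char.ToUpperInvariant(right)) return 0.9f;

        // Anything else is treated as completely different.
        return 0f;
    }
}
```

With this sketch, comparing 'a' with 'a' returns 1, 'a' with 'A' returns 0.9, and 'a' with 'b' returns 0.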

Simila provides default implementations for each type:

class CharacterSimilarityResolverDefault : ISimilarityResolver<char>
{
    // Some implementation for characters.
}

class WordSimilarityResolverDefault : ISimilarityResolver<Word>
{
    // Some implementation for words.
}

class PhraseSimilarityResolverDefault : ISimilarityResolver<Phrase>
{
    // Some implementation for phrases.
}

The interesting point is that all of these classes are only Simila's default implementations. You can configure Simila to use your own ISimilarityResolver implementations for each type when it runs its algorithm.
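Because every level is compared through the same interface, a resolver for one level can be built on top of a resolver for the level below, and either one can be swapped independently. The sketch below illustrates that idea only. Both classes are hypothetical, `string` stands in for Simila's `Word` type (whose members are not shown on this page), the position-by-position averaging is an illustrative technique rather than Simila's algorithm, and how a custom resolver is actually registered with Simila is not covered here.

```csharp
using System;

// The interface from the text above, repeated so this sketch is self-contained.
interface ISimilarityResolver<T>
{
    float GetSimilarity(T left, T right);
}

// Hypothetical character resolver: exact match only.
class ExactCharResolver : ISimilarityResolver<char>
{
    public float GetSimilarity(char left, char right) => left == right ? 1f : 0f;
}

// Hypothetical word resolver; uses string in place of Simila's Word type.
// It delegates every character comparison to whichever character resolver it is given.
class PositionalWordResolver : ISimilarityResolver<string>
{
    private readonly ISimilarityResolver<char> _chars;

    public PositionalWordResolver(ISimilarityResolver<char> chars)
    {
        _chars = chars;
    }

    public float GetSimilarity(string left, string right)
    {
        int longer = Math.Max(left.Length, right.Length);
        if (longer == 0) return 1f; // two empty words are treated as identical

        int shorter = Math.Min(left.Length, right.Length);
        float total = 0f;

        // Compare characters at the same position; unmatched trailing
        // characters of the longer word contribute nothing to the total.
        for (int i = 0; i < shorter; i++)
            total += _chars.GetSimilarity(left[i], right[i]);

        // Average over the longer word so extra characters lower the score.
        return total / longer;
    }
}
```

For example, `new PositionalWordResolver(new ExactCharResolver()).GetSimilarity("day", "dam")` scores two matching positions out of three. Passing a different `ISimilarityResolver<char>` to the constructor changes how words are scored without touching the word-level code.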