Understanding how Simila works
When Simila compares two strings, it first breaks each of them into phrases.
Phrase: A string consists of one or more phrases separated by the '.' character.
For example:
String: Today is a good day. Tomorrow will be better.
Phrases:
[Today is a good day] + [Tomorrow will be better]
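The phrase split described above can be sketched in a few lines of C#. This is only an illustration, not Simila's actual code: `PhraseSplitter` and `SplitIntoPhrases` are hypothetical names, and trimming whitespace and dropping empty pieces are assumptions the text does not spell out.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Simila's API.
static class PhraseSplitter
{
    // Splits on '.', trims surrounding whitespace, and drops empty pieces
    // (for example the one that follows a trailing '.').
    public static string[] SplitIntoPhrases(string text) =>
        text.Split('.')
            .Select(piece => piece.Trim())
            .Where(piece => piece.Length > 0)
            .ToArray();
}
```

Calling `PhraseSplitter.SplitIntoPhrases("Today is a good day. Tomorrow will be better.")` returns the two phrases from the example above.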
Next, Simila breaks phrases into words.
Word: A phrase consists of one or more words separated by non-alphanumeric characters.
Phrase: It is a good day
Words:
[It] + [is] + [a] + [good] + [day]
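As a rough sketch of the word split, the helper below treats every character that is not a letter or digit as a separator. `WordSplitter` and `SplitIntoWords` are hypothetical names, and using `char.IsLetterOrDigit` as the definition of "alphanumeric" is an assumption; Simila itself may draw the line differently.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of Simila's API.
static class WordSplitter
{
    // Collects runs of letters and digits; any other character ends the current word.
    public static List<string> SplitIntoWords(string phrase)
    {
        var words = new List<string>();
        var current = new StringBuilder();
        foreach (char c in phrase)
        {
            if (char.IsLetterOrDigit(c))
            {
                current.Append(c);
            }
            else if (current.Length > 0)
            {
                words.Add(current.ToString());
                current.Clear();
            }
        }
        if (current.Length > 0)
        {
            words.Add(current.ToString());
        }
        return words;
    }
}
```

Under this rule an apostrophe is also a separator, so a phrase such as "It's good" would yield the words It, s and good.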
Finally, Simila breaks words into characters.
Character: A word consists of one or more characters.
Word: Tomorrow
Characters: [T] + [o] + [m] + [o] + [r] + [r] + [o] + [w]
In Simila, there is a very basic interface:
interface ISimilarityResolver<T>
{
float GetSimilarity(T left, T right);
}
Simila provides a default implementation for each type:
class CharacterSimilarityResolverDefault : ISimilarityResolver<char>
{
// Some implementation for characters.
}
class WordSimilarityResolverDefault : ISimilarityResolver<Word>
{
// Some implementation for words.
}
class PhraseSimilarityResolverDefault : ISimilarityResolver<Phrase>
{
// Some implementation for phrases.
}
The interesting point is that all of these classes are only the default implementations. You can configure Simila to use your own similarity resolvers for each type when it runs its comparison.
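To make the idea of a custom resolver concrete, here is a minimal sketch of one for characters. The interface is repeated from the text so the snippet compiles on its own. `CaseInsensitiveCharacterResolver` is a hypothetical class, and the assumption that similarity runs from 0 (different) to 1 (identical) is not stated in this section. How a resolver is registered with Simila is not covered here, so the sketch stops at the implementation.

```csharp
using System;

// Repeated from the text above so the sketch is self-contained.
interface ISimilarityResolver<T>
{
    float GetSimilarity(T left, T right);
}

// Hypothetical custom resolver: characters that differ only by case count as identical.
// Assumes scores range from 0 (no similarity) to 1 (identical).
class CaseInsensitiveCharacterResolver : ISimilarityResolver<char>
{
    public float GetSimilarity(char left, char right) =>
        char.ToUpperInvariant(left) == char.ToUpperInvariant(right) ? 1f : 0f;
}
```

With this resolver, `GetSimilarity('a', 'A')` yields 1 and `GetSimilarity('a', 'b')` yields 0. A word-level or phrase-level resolver would follow the same pattern with `Word` or `Phrase` as the type argument.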