Example: minimal cost is not returned #24

Open
taylorpetty opened this issue Dec 9, 2020 · 1 comment

taylorpetty commented Dec 9, 2020

Consider the following scenario:

Inserting a C costs 3. Everything else costs 1. The function returns lev('','C') = 3, but the minimal cost is actually 2, since it's cheaper to insert an A (+1) and then substitute C for that A (+1).

I haven't been able to think of a solution yet, but one workaround, at least, is to require that the single-character costs passed in as weights are already minimal.

Deletion has the same issue, but substitution does not.
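To make the cause concrete, here is a minimal pure-Python sketch of the textbook weighted edit-distance recurrence. It is an assumption that the library follows this recurrence; `weighted_lev` and its dict-based cost arguments are hypothetical names used only for illustration. Each cell takes the cheapest of one deletion, one insertion, or one substitution, so an insertion is always charged at its table value and never as "insert something cheaper, then substitute".

```python
def weighted_lev(src, dst, ins, dele, sub):
    """Textbook weighted edit distance (illustrative, not the library's code).

    ins/dele map a character to its cost, sub maps (from, to) to a cost.
    Any cost that is not listed defaults to 1.
    """
    rows, cols = len(src) + 1, len(dst) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):          # delete every character of src
        d[i][0] = d[i - 1][0] + dele.get(src[i - 1], 1.0)
    for j in range(1, cols):          # insert every character of dst
        d[0][j] = d[0][j - 1] + ins.get(dst[j - 1], 1.0)
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = src[i - 1], dst[j - 1]
            s = 0.0 if a == b else sub.get((a, b), 1.0)
            d[i][j] = min(d[i - 1][j] + dele.get(a, 1.0),   # delete a
                          d[i][j - 1] + ins.get(b, 1.0),    # insert b
                          d[i - 1][j - 1] + s)              # substitute a -> b
    return d[-1][-1]


ins = {'C': 3.0}
print(weighted_lev('', 'C', ins, {}, {}))                                     # 3.0
print(weighted_lev('', 'A', ins, {}, {}) + weighted_lev('A', 'C', ins, {}, {}))  # 2.0
```

Substitution is not affected in the same way with these costs, because the recurrence can already replace an expensive substitution with a deletion followed by an insertion.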

Here's an example:

```python
from weighted_levenshtein import lev
import numpy as np

# Every insertion, deletion, and substitution costs 1 by default
icosts = np.ones(128, dtype=np.float64)
dcosts = np.ones(128, dtype=np.float64)
scosts = np.ones((128, 128), dtype=np.float64)

# Make one operation of each kind expensive
icosts[ord('C')] = 4
dcosts[ord('G')] = 10
scosts[ord('A'), ord('T')] = 7

# Insertion: the direct insert is charged, not the cheaper two-step route
print(lev('', 'C', icosts, dcosts, scosts))  # returns 4, should return 2
print(lev('', 'A', icosts, dcosts, scosts) + lev('A', 'C', icosts, dcosts, scosts))  # returns 2

# Deletion: same problem
print(lev('G', '', icosts, dcosts, scosts))  # returns 10, should return 2
print(lev('G', 'T', icosts, dcosts, scosts) + lev('T', '', icosts, dcosts, scosts))  # returns 2

# Substitution: already minimal (delete A, then insert T)
print(lev('A', 'T', icosts, dcosts, scosts))  # returns 2
```
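One possible way to apply the workaround mentioned above is to pre-process the cost tables so that every single-character cost is already minimal before calling `lev`. The sketch below is untested against the library and assumes all costs are non-negative; `minimal_costs` is a hypothetical helper, not part of `weighted_levenshtein`. It first finds the cheapest substitution chain between every pair of characters (Floyd-Warshall), then lowers each insertion cost to "insert the cheapest character, then substitute", and each deletion cost to "substitute, then delete the cheapest character".

```python
def minimal_costs(icosts, dcosts, scosts):
    """Return (ins, dele, sub) cost tables in which each entry is already minimal.

    Hypothetical helper: accepts any indexable sequences (lists or NumPy
    arrays) and returns plain Python lists. Assumes non-negative costs.
    """
    n = len(icosts)
    # Cheapest substitution chain a -> b; keeping a character is free.
    sub = [[0.0 if a == b else float(scosts[a][b]) for b in range(n)]
           for a in range(n)]
    for k in range(n):                      # Floyd-Warshall over intermediates
        row_k = sub[k]
        for a in range(n):
            row_a = sub[a]
            via = row_a[k]
            for b in range(n):
                if via + row_k[b] < row_a[b]:
                    row_a[b] = via + row_k[b]
    # Insert b: insert some a, then turn a into b.
    ins = [min(float(icosts[a]) + sub[a][b] for a in range(n)) for b in range(n)]
    # Delete a: turn a into some b, then delete b.
    dele = [min(sub[a][b] + float(dcosts[b]) for b in range(n)) for a in range(n)]
    return ins, dele, sub


# Toy 4-letter alphabet: A=0, C=1, G=2, T=3, with the same costs as above.
A, C, G, T = range(4)
icosts = [1, 4, 1, 1]
dcosts = [1, 1, 10, 1]
scosts = [[1] * 4 for _ in range(4)]
scosts[A][T] = 7
ins, dele, sub = minimal_costs(icosts, dcosts, scosts)
print(ins[C], dele[G], sub[A][T])  # 2.0 2.0 2.0
```

To use the result with `lev`, the returned lists would presumably need converting back to `np.float64` arrays of the original shapes. On the full 128-character tables the triple loop runs about two million iterations, which is slow but workable as a one-off step.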
@pachewise (Contributor)

Hi @tmpetty - thanks for reporting this issue. I was able to reproduce it. If you do come up with a solution, please open a PR and we'll review and approve it. (Right now, we do not have engineers who can focus on fixing this.)
