--mincov filter applied after culling_limit #59
I am trying to get calls for the different plasmidMLST (pmlst) genes. I have made a database for the schemes in Warwick. When I run abricate, I get hits that I believe are not optimal: a longer hit with lower identity is preferred over a shorter hit with 100/100 cov/id. I checked this for the hit that abricate reports as FIA_17, but that I would expect to be FIA_2. I then made a database containing only the FIA_2 sequence, and with that database the 100/100 cov/id hit is called. The results from the example:
When using the full pmlst database:
and applying the minid filter removes the hits, although the 100/100 gene is present:
abricate version 0.9.8
Any suggestions on how to tackle this problem would be appreciated!
@pepijnhuizinga Would Can you provide a link to the Warwick plasmidMLST download page? I think the problem is that I use
I chose abricate over mlst because my goal was to get all the alleles that are present, and with plasmid MLST it's quite common for one isolate to have multiple variants of the same gene. For example, the following would not be uncommon:
I was not sure that I could overcome this problem in
And I must apologize: I wrote Warwick as the source of the database, which was incorrect. It is the pubmlst website. The download page I used was the following: https://pubmlst.org/bigsdb?db=pubmlst_plasmid_seqdef&page=downloadAlleles
When looking for aminoglycoside resistance genes, Norelle and I noticed that using the ncbi database, we weren't finding the genes we expected for some of our genomes. For example, for this genome, we expected to find the gene aac(6')-Ib_1, which is in both the resfinder and ncbi databases:

Instead, using the ncbi database, we get this:

This identifies a partial hit to a bifunctional gene ANT(3'')-Ih/AAC(6')-Iid instead. Applying the --mincov filter in abricate also doesn't work:

This just seems to apply this coverage filter after the blastn -culling_limit 1 filter has been applied. When I make a blast database with the aac(6') genes, the bifunctional gene is ranked higher due to a higher bitscore (longer alignment length). I think it has something to do with where you apply the --mincov filter in abricate:

You obviously still want to just report the top hit, but could you perhaps set -culling_limit 100, and then filter manually?
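To illustrate why the order of the two steps matters, here is a minimal Python sketch. It is not abricate's code (abricate is written in Perl) and it does not reproduce BLAST's exact culling algorithm: the `cull` function is a simplified greedy stand-in that drops any hit overlapping a higher-scoring hit. The hit records, gene names, and all numbers are made up for illustration.

```python
from typing import Any, Dict, List

Hit = Dict[str, Any]

# Made-up hits on one contig: a long partial match with the higher bitscore,
# and a shorter full-length match at the same locus.
HITS: List[Hit] = [
    {"gene": "long_partial", "contig": "c1", "start": 100, "end": 700,
     "cov": 45.0, "id": 85.0, "bitscore": 600.0},
    {"gene": "short_full", "contig": "c1", "start": 150, "end": 650,
     "cov": 100.0, "id": 100.0, "bitscore": 550.0},
]


def overlaps(a: Hit, b: Hit) -> bool:
    """True if two hits share at least one position on the same contig."""
    return (a["contig"] == b["contig"]
            and a["start"] <= b["end"]
            and b["start"] <= a["end"])


def cull(hits: List[Hit], limit: int = 1) -> List[Hit]:
    """Simplified culling: visit hits by descending bitscore and drop a hit
    if `limit` or more already-kept hits overlap it."""
    kept: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h["bitscore"], reverse=True):
        if sum(1 for k in kept if overlaps(hit, k)) < limit:
            kept.append(hit)
    return kept


def passes(hit: Hit, mincov: float, minid: float) -> bool:
    """Coverage and identity thresholds, both in percent."""
    return hit["cov"] >= mincov and hit["id"] >= minid


def cull_then_filter(hits: List[Hit], mincov: float, minid: float) -> List[Hit]:
    """Order described in this issue: culling first, thresholds second."""
    return [h for h in cull(hits) if passes(h, mincov, minid)]


def filter_then_cull(hits: List[Hit], mincov: float, minid: float) -> List[Hit]:
    """Proposed order: thresholds first, then keep the top hit per locus."""
    return cull([h for h in hits if passes(h, mincov, minid)])


if __name__ == "__main__":
    print([h["gene"] for h in cull_then_filter(HITS, 80.0, 80.0)])
    print([h["gene"] for h in filter_then_cull(HITS, 80.0, 80.0)])
```

With these made-up hits, culling first keeps only the long partial match, which then fails the coverage threshold, so nothing is reported. Filtering first discards the partial match before culling, so the full-length hit survives as the top hit for that locus.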