gffutils error at stage "Populating features" #7

Open
jolbi opened this issue May 30, 2024 · 2 comments

jolbi commented May 30, 2024

Hi, thank you for the tool. I am very excited to try it and compare its results with those from Liftoff.

However, I am getting a gffutils error at the stage after miniprot:

>> Creating liftoff annotation database : /path_to_dir/lifton_output/liftoff/liftoff.gff3_polished
2024-05-30 12:16:44,480 - INFO - Populating features
gffutils database build failed with UNIQUE constraint failed: features.id

My command was:

lifton \
-g $ref_gff \
-o $out_dir/$asm_name."$gff_name"_lifton.gff3 \
-u $out_dir/$asm_name."$gff_name"_lifton_unmapped.txt \
-chroms $in_dir/$ref_name.chroms.txt \
-copies -polish -cds -sc 0.96 -flank 0.1 \
-t $threads \
$asm \
$ref

The error suggests that some GFF features don't have unique IDs. I checked the input GFF and it does not contain any duplicated IDs. I also ran it through AGAT, which did not find any errors. Liftoff runs well on this GFF. I suspect the problem is with the output features.
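For readers unfamiliar with the message, here is a minimal sketch of how a SQLite table keyed on a feature ID produces this kind of error when two rows share an ID. The table below is a simplified stand-in invented for illustration (the column names `start_pos` and `end_pos` are assumptions), not the schema gffutils actually uses.

```python
import sqlite3


def insert_features(rows):
    """Insert rows into a toy feature table; return "ok" or the SQLite error text."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified table: the ID column is the primary key,
    # so every inserted feature must carry a distinct ID.
    conn.execute(
        "CREATE TABLE features ("
        "id TEXT PRIMARY KEY, featuretype TEXT, start_pos INTEGER, end_pos INTEGER)"
    )
    try:
        conn.executemany("INSERT INTO features VALUES (?, ?, ?, ?)", rows)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()


# Two CDS segments that share one ID, as GFF3 permits for multi-line features.
rows = [("cds1", "CDS", 100, 200), ("cds1", "CDS", 300, 400)]
message = insert_features(rows)
print(message)
```

If the printed message mentions a failed UNIQUE constraint on the ID column, the second row was rejected only because its ID was already present, regardless of its coordinates.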

@Kuanhao-Chao (Owner)

Hi @jolbi,

Thanks for reporting this. Did the error occur when building the Liftoff annotation into the gffutils database? I believe the error might be due to some ID issues with exons and CDSs. Could you please check if there are any ID duplications for exons and CDSs?
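One way to run that check is sketched below. It reads GFF3 lines, collects the `ID` attribute from column 9, and reports every ID that appears on more than one line together with the feature types involved. The function name `find_duplicated_ids` and the example records are made up for illustration; this is not part of LiftOn or gffutils.

```python
from collections import Counter, defaultdict


def find_duplicated_ids(gff_lines):
    """Return {ID: {feature_type: count}} for IDs used on more than one line."""
    seen = defaultdict(Counter)
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        for field in cols[8].split(";"):
            key, _, value = field.strip().partition("=")
            if key == "ID" and value:
                seen[value][cols[2]] += 1
    return {
        fid: dict(types)
        for fid, types in seen.items()
        if sum(types.values()) > 1
    }


# Hypothetical records: one CDS split over two lines that share an ID.
example = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t1\t1000\t.\t+\t.\tID=gene1",
    "chr1\tsrc\tmRNA\t1\t1000\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\tsrc\tCDS\t100\t200\t.\t+\t0\tID=cds1;Parent=mrna1",
    "chr1\tsrc\tCDS\t300\t400\t.\t+\t0\tID=cds1;Parent=mrna1",
]
duplicates = find_duplicated_ids(example)
print(duplicates)
```

To check a real file, pass an open file handle instead of the list, for example `find_duplicated_ids(open(path))` with `path` pointing at the input GFF or at the Liftoff output.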

I have pushed another commit so that if the error occurs, gffutils will attempt to build the database again using the merge_strategy "warning" instead of the original "create_unique". Could you please help us test if this fixes your error? You can download LiftOn again through Git. Clone the directory and run python setup.py install. This should install the latest version of LiftOn. For more details, please visit: https://ccb.jhu.edu/lifton/content/installation.html
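The retry described above can be sketched roughly as follows. This is an illustration of the general pattern, not the code in the LiftOn commit: the helper `build_db_with_fallback` is hypothetical, and the database builder is passed in as a parameter so the sketch runs without gffutils installed. It assumes the builder accepts `merge_strategy` and `force` keyword arguments, as `gffutils.create_db` does.

```python
def build_db_with_fallback(gff_path, db_path, create_db,
                           strategies=("create_unique", "warning")):
    """Try each merge strategy in order; return (database, strategy_used)."""
    if not strategies:
        raise ValueError("at least one merge strategy is required")
    last_error = None
    for strategy in strategies:
        try:
            # force=True overwrites a partial database left by a failed attempt.
            db = create_db(gff_path, db_path, merge_strategy=strategy, force=True)
            return db, strategy
        except Exception as exc:  # broad on purpose: the failure type may vary
            print(f"build with merge_strategy={strategy!r} failed: {exc}")
            last_error = exc
    # Every strategy failed, so surface the most recent error to the caller.
    raise last_error
```

With gffutils installed, a call might look like `build_db_with_fallback(gff_path, db_path, gffutils.create_db)`, where `gff_path` and `db_path` are placeholders for the Liftoff output file and the database file to create.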

Feel free to send me any of your files ([email protected]), and I can take a look as well.

Best,
Kuan-Hao


Kuanhao-Chao added the bug label ("Something isn't working") Jun 1, 2024

jolbi commented Jun 6, 2024

> Did the error occur when building the Liftoff annotation into the gffutils database?

I am not sure if you are referring to standalone Liftoff or Liftoff as part of LiftOn?

  1. Standalone Liftoff runs fine with this GFF.
  2. When running LiftOn, the error occurs after the miniprot stage finishes (so the Liftoff part runs fine).

> Could you please check if there are any ID duplications for exons and CDSs?

I checked for duplicates again (this time manually), and there are indeed some CDS and UTR features with duplicated IDs. They seem to be non-overlapping, though I did not check all of them. I am not sure how to interpret these features, but I see that the GFF3 specification allows duplicated IDs for discontinuous features, so it may be good to allow them in LiftOn as well.
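A rough way to check all of them at once is sketched below. It groups feature lines by `ID` and labels each shared ID as "discontinuous" when every line has the same sequence, feature type, and strand and the intervals do not overlap, or as "conflict" otherwise. This is a heuristic written for illustration, not a full reading of the GFF3 specification, and the function names and example records are assumptions.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Parse a GFF3 attribute column into a dict of key -> value."""
    attrs = {}
    for field in column9.split(";"):
        key, sep, value = field.strip().partition("=")
        if sep:
            attrs[key] = value
    return attrs


def classify_shared_ids(gff_lines):
    """Map each ID used on several lines to 'discontinuous' or 'conflict'."""
    groups = defaultdict(list)
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        feature_id = parse_attributes(cols[8]).get("ID")
        if feature_id:
            groups[feature_id].append(
                (cols[0], cols[2], cols[6], int(cols[3]), int(cols[4]))
            )
    result = {}
    for feature_id, parts in groups.items():
        if len(parts) < 2:
            continue  # the ID is unique, nothing to classify
        contexts = {(seqid, ftype, strand) for seqid, ftype, strand, _, _ in parts}
        spans = sorted((start, end) for _, _, _, start, end in parts)
        # After sorting by start, segments overlap if one ends at or after
        # the position where the next one begins.
        no_overlap = all(
            prev_end < next_start
            for (_, prev_end), (next_start, _) in zip(spans, spans[1:])
        )
        if len(contexts) == 1 and no_overlap:
            result[feature_id] = "discontinuous"
        else:
            result[feature_id] = "conflict"
    return result


# Hypothetical records: a split CDS and two overlapping UTR lines.
example = [
    "chr1\tsrc\tmRNA\t1\t1000\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\tsrc\tCDS\t100\t200\t.\t+\t0\tID=cds1;Parent=mrna1",
    "chr1\tsrc\tCDS\t300\t400\t.\t+\t0\tID=cds1;Parent=mrna1",
    "chr1\tsrc\tfive_prime_UTR\t10\t60\t.\t+\t.\tID=utr1;Parent=mrna1",
    "chr1\tsrc\tfive_prime_UTR\t50\t99\t.\t+\t.\tID=utr1;Parent=mrna1",
]
labels = classify_shared_ids(example)
print(labels)
```

IDs labelled "conflict" are the ones worth inspecting by hand, since they do not look like a single feature split across several lines.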

> I have pushed another commit so that if the error occurs, gffutils will attempt to build the database again using the merge_strategy "warning" instead of the original "create_unique". Could you please help us test if this fixes your error? You can download LiftOn again through Git. Clone the directory and run python setup.py install. This should install the latest version of LiftOn. For more details, please visit: https://ccb.jhu.edu/lifton/content/installation.html

I tried to install the latest commit, but when running python setup.py install I get the error: Couldn't find a setup script in /tmp/easy_install-liu654lb/numpy-2.0.0rc2.tar.gz
I don't have much experience with installing from source, so it may be some stupid mistake.

I sent the files to your email.

Cheers,
Tim
