Only parse biological_process #2

Open
JasonTan-code opened this issue Aug 15, 2020 · 1 comment

Comments

@JasonTan-code

Hi,

Thanks for the code. It seems that obo_parser only works on biological_process and ignores cellular_component and molecular_function.

@allenbaron

obo_parser is designed to extract records descending from a single root node. If no root is specified with the root_id argument, obo_parser identifies a root node from the records and then extracts all of its descendants. The three gene ontology terms you mention do not share a parent term; each serves as its own root node. That's why it only ever downloads biological_process. You can deal with this in one of 3 ways:

  1. Add a record to the .obo file that represents an arbitrary parent node for the 3 terms and add that as the parent for each of the GO root nodes (biological_process, cellular_component, and molecular_function).
  2. Run obo_parser 3 times, specifying each of biological_process (GO:0008150), cellular_component (GO:0005575), and molecular_function (GO:0003674) as the root_id once. This will generate 3 files, dropping "obsolete" entries, which you could then combine.
  3. Use a fork of obo_parser that I modified to extract all the records; it is pretty much the same but accepts a --return_all flag. This will keep all records, including "obsolete" entries. Note: I haven't made any updates to correct URL errors, so you'll have to download the file first.
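
For option 1, a rough sketch of how the .obo text could be patched before parsing is below. It is not part of obo_parser. The function name `add_synthetic_root`, the placeholder ID `GO:9999999`, and the name `synthetic_root` are all made up for illustration. It also assumes stanzas are separated by blank lines and that the file uses `\n` line endings.

```python
# The three GO root term IDs discussed above.
GO_ROOTS = ("GO:0008150", "GO:0005575", "GO:0003674")


def add_synthetic_root(obo_text, root_ids=GO_ROOTS,
                       new_id="GO:9999999", new_name="synthetic_root"):
    """Return OBO text in which every term in root_ids gets an is_a link
    to a new, artificial parent term (new_id is a hypothetical placeholder)."""
    stanzas = obo_text.strip().split("\n\n")  # header + one stanza per record
    patched = []
    for stanza in stanzas:
        lines = stanza.split("\n")
        is_term = lines[0].strip() == "[Term]"
        is_root = any(line.strip() == f"id: {rid}"
                      for rid in root_ids for line in lines)
        if is_term and is_root:
            # Point the existing root at the artificial parent.
            lines.append(f"is_a: {new_id} ! {new_name}")
        patched.append("\n".join(lines))
    # Append the artificial parent record itself.
    patched.append(f"[Term]\nid: {new_id}\nname: {new_name}")
    return "\n\n".join(patched) + "\n"


if __name__ == "__main__":
    sample = (
        "format-version: 1.2\n\n"
        "[Term]\nid: GO:0008150\nname: biological_process\n\n"
        "[Term]\nid: GO:0000001\nname: mitochondrion inheritance\n"
        "is_a: GO:0008150 ! biological_process\n"
    )
    print(add_synthetic_root(sample))
```

The patched file would then have a single root. Whether obo_parser picks it up automatically or needs it passed as root_id is something to check against the tool itself.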
