Update README.md
StevenCannon-USDA authored Sep 23, 2024
1 parent e301239 commit ff0a61e
Showing 1 changed file with 5 additions and 2 deletions.
7 changes: 5 additions & 2 deletions README.md
@@ -9,14 +9,17 @@ Those files are the following - with "gensp" being the abbreviation for the pres

A traits file corresponds to a publication, is named with the pattern `Author_Author_YEAR.yml`, is produced by a curator, and represents the minimal essential information about a gene and its function as described by the literature cited in the file.
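
As an illustration of the naming pattern, the following is a minimal sketch of a filename check. It assumes the pattern means one or more author surnames joined by underscores, followed by a four-digit year and the `.yml` extension; that reading, the function name `parse_study_filename`, and the example filename are assumptions for illustration, not part of the repository.

```python
import re

# Assumed reading of `Author_Author_YEAR.yml`: one or more author names
# (letters and hyphens) joined by underscores, then a four-digit year.
STUDY_FILE_RE = re.compile(
    r"^(?P<authors>[A-Za-z-]+(?:_[A-Za-z-]+)*)_(?P<year>\d{4})\.yml$"
)


def parse_study_filename(name):
    """Return (list_of_authors, year) or None if the name does not match."""
    match = STUDY_FILE_RE.match(name)
    if match is None:
        return None
    return match.group("authors").split("_"), int(match.group("year"))


# Hypothetical filename, used only to show the call.
result = parse_study_filename("Smith_Jones_2020.yml")
```

A check like this could be run over a studies directory before review to flag files whose names do not fit the expected shape.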

Periodically, the collection of yaml files in a Genus/species/studies directory will be combined and processed to produce a **gensp.traits.yml** file that will go into the datastore, for example into `Glycine/max/gene_functions/`. The processing for addition of gene function information to the datastore is, however, separate from the basic curation process.

<details>
<summary>More about generation of files for the datastore (advanced) ...</summary>

The **gensp.citations.txt** file is generated by the script **get_citations.pl** (in the [scripts directory](https://github.com/legumeinfo/gene-function-registry/tree/main/scripts) of the gene-function-registry repository), which takes gensp.traits.yml as input. This file has five fields: DOI, PubMedID, PubMedCentralID, Author-Author-Year, and full citation. (\*Note that the **get_citations.pl** script can help fill in reference elements in gensp.traits.yml -- specifically, adding doi given the pmid, or the pmid given the doi.)
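
To make the five-field layout concrete, here is a minimal sketch of reading one line of such a file. The field delimiter is not stated above, so the sketch assumes tab-separated values; the `Citation` type, the function name, and the sample values are hypothetical.

```python
from typing import NamedTuple


class Citation(NamedTuple):
    """The five fields described for gensp.citations.txt, in order."""
    doi: str
    pmid: str
    pmcid: str
    author_year: str
    citation: str


def parse_citation_line(line, sep="\t"):
    """Split one line into five fields (tab delimiter is an assumption)."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, found {len(fields)}")
    return Citation(*fields)


# Placeholder values only; not a real citation.
sample = "10.0000/example\t00000000\tPMC0000000\tSmith_Jones_2020\tSmith and Jones (2020) Example title."
record = parse_citation_line(sample)
```

Naming the fields makes it easy to spot records where the DOI or PubMed ID is empty and still needs to be filled in.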

The **gensp.references.txt** file is generated by the script **get_references.pl**, which takes gensp.citations.txt as input. This file holds the [MEDLINE-format](https://www.nlm.nih.gov/bsd/mms/medlineelements.html) publication information (authors, title, abstract, etc.) for the citations in gensp.citations.txt.

The traits.yml file contains one or more yaml "documents", indicated by three leading dashes (`---`) at the top of each document. Each holds information about one gene with experimentally-established function or trait association. A document might also be thought of as a "function card", with information about one gene for which a phenotypic effect has been established.
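
The multi-document layout can be sketched with a small splitter that separates a file into its individual "function cards". This is a simplified illustration using only the standard library: it treats any line consisting solely of `---` as the start of a new document and does not parse the YAML itself. The function name and the sample keys are hypothetical.

```python
def split_yaml_documents(text):
    """Return the raw text of each document that begins with a '---' line."""
    documents = []
    current = None  # None until the first '---' marker is seen
    for line in text.splitlines():
        if line.rstrip() == "---":
            if current is not None:
                documents.append("\n".join(current))
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        documents.append("\n".join(current))
    return documents


# Placeholder content; the keys are not taken from the real schema.
sample = "---\nkey: first\n---\nkey: second\n"
cards = split_yaml_documents(sample)
```

For actual processing, a YAML parser such as PyYAML's `yaml.safe_load_all` would be the more robust choice, since it handles the document markers and the content of each document together.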

</details>

## Curation and review process