2. Getting Started
This section describes the patient data and knowledge data used to drive the operations. You can experiment with predefined queries (see the Postman collection) or create your own queries based on the available data.
Representative patients are listed in the following table. Not all patients have genetic testing data. Several patients (e.g. HG00403, HG00406) have whole exome sequencing data; many patients (e.g. ABC456, HCC1143) have been studied for structural variants; some patients (e.g. HCC1143) have somatic data; there are patients with PGx star alleles (e.g. XYZ123) and HLA haplotypes (e.g. NB6TK328). Patient NA19238 is the mother, and patient NA19239 is the father of patient NA19240. Some patient data is based on build37 (e.g. HG02657), and some is based on build38 (e.g. CA12345). The find-study-metadata operation can be used to see what types of testing a patient has had.
patientID | Sex | patientID | Sex | patientID | Sex |
---|---|---|---|---|---|
ABC123 | M | m123 | M | NA19240 | F |
ABC456 | M | NA18498 | M | NA19247 | F |
ABC789 | F | NA18499 | F | NA19256 | M |
CA12345 | M | NA18870 | F | NB6TK328 | F |
HCC1143 | F | NA18871 | M | NB6TK329 | F |
HG00403 | M | NA19190 | F | XYZ123 | F |
HG00406 | M | NA19210 | M | XYZ234 | F |
HG02657 | M | NA19238 | F | XYZ345 | F |
huC30902 | M | NA19239 | M | | |
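As a minimal sketch of checking which tests a patient has had, the snippet below builds a request URL for the find-study-metadata operation. The base URL is a placeholder, and the operation path and the `subject` parameter name are assumptions based on common FHIR Genomics Operations conventions; confirm both against your deployment before use.

```python
from urllib.parse import urlencode

# Assumption: placeholder base URL; replace with the address of your server.
BASE_URL = "https://example.org/fhir-genomics"


def study_metadata_url(patient_id: str, base_url: str = BASE_URL) -> str:
    """Build a GET URL for the find-study-metadata operation for one patient."""
    # Assumption: the operation path and the 'subject' parameter name are not
    # taken from this page; verify them against the server you are calling.
    query = urlencode({"subject": patient_id})
    return f"{base_url}/subject-operations/metadata/$find-study-metadata?{query}"


# HG00403 is one of the representative patients with whole exome sequencing data.
print(study_metadata_url("HG00403"))
```

The resulting URL can be pasted into a browser or passed to any HTTP client.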
Knowledge data is used to dynamically compute diagnostic and therapeutic implications of genetic variants. This reference implementation is piloting the draft GA4GH Variant Annotation (VA) knowledge structures distributed as part of the GA4GH Genomic Knowledge Pilot. In the future, we anticipate using GA4GH VA-encoded knowledge to drive automated knowledge updates of the reference implementation.
ClinVar knowledge is based on an Aug 2022 extract, using both variant summary data and submission summary data. The ClinVar snapshot is limited to ACMG genes. Conditions are coded with MedGen codes (codeSystem='https://www.ncbi.nlm.nih.gov/medgen').
PharmGKB knowledge is based on a Dec 2021 extract. The PharmGKB snapshot is limited to CPIC Level A star alleles in CYP2B6, CYP2C9, CYP2C19, CYP2D6, CYP3A5, NUDT15, SLCO1B1, TPMT, UGT1A1. Medications are coded with RxNorm ingredient codes (codeSystem='http://www.nlm.nih.gov/research/umls/rxnorm').
CIViC knowledge is based on a Sep 2022 extract. The CIViC snapshot is limited to simple variants. Conditions are coded with Disease Ontology codes (codeSystem='https://disease-ontology.org'). Medications are coded with RxNorm ingredient codes (codeSystem='http://www.nlm.nih.gov/research/umls/rxnorm').
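The code systems listed above can be collected into a small lookup table when building or validating queries. The sketch below is illustrative only: the `CODE_SYSTEMS` table and the `make_coding` helper are hypothetical names, not part of the reference implementation, and the code value used in the example is a placeholder rather than a real MedGen code.

```python
# Code system URLs as stated for each knowledge source.
CODE_SYSTEMS = {
    ("ClinVar", "condition"): "https://www.ncbi.nlm.nih.gov/medgen",
    ("PharmGKB", "medication"): "http://www.nlm.nih.gov/research/umls/rxnorm",
    ("CIViC", "condition"): "https://disease-ontology.org",
    ("CIViC", "medication"): "http://www.nlm.nih.gov/research/umls/rxnorm",
}


def make_coding(source: str, kind: str, code: str) -> dict:
    """Return a FHIR-style coding dict for a knowledge source and concept kind.

    Hypothetical helper: raises KeyError if the source does not supply that
    kind of concept (for example, PharmGKB conditions are not listed).
    """
    return {"system": CODE_SYSTEMS[(source, kind)], "code": code}


# "PLACEHOLDER" stands in for a real code value.
print(make_coding("ClinVar", "condition", "PLACEHOLDER"))
```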
Variants in the reference implementation are enhanced with population allele frequencies and predicted molecular consequences. A software utility 'vcfPrepper' that implements our molecular consequence pipeline can be found here.
Population allele frequency data is obtained from gnomAD. gnomAD v2.1.1 contains data from 125,748 exomes, mapped to GRCh37; gnomAD v2 liftover contains gnomAD v2.1.1 data lifted over to GRCh38. Population allele frequencies are returned in the FHIR Genomics Variant profile, in component population-allele-frequency (LOINC 92821-8).
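To illustrate how a client might read the frequency back out, the sketch below scans a variant Observation (parsed from JSON into a Python dict) for the component coded with LOINC 92821-8. It assumes the value is carried as `valueQuantity`; the helper name and the sample observation, including its 0.25 value, are made up for illustration and are not output from the reference implementation.

```python
from typing import Optional

POP_AF_LOINC = "92821-8"  # population-allele-frequency component code


def population_allele_frequency(observation: dict) -> Optional[float]:
    """Return the population allele frequency from a variant Observation, if present."""
    for component in observation.get("component", []):
        codings = component.get("code", {}).get("coding", [])
        if any(c.get("code") == POP_AF_LOINC for c in codings):
            # Assumption: the frequency is stored as valueQuantity.value.
            quantity = component.get("valueQuantity")
            if quantity is not None and "value" in quantity:
                return float(quantity["value"])
    return None


# Illustrative input only; the value 0.25 is invented for this example.
example_observation = {
    "resourceType": "Observation",
    "component": [
        {
            "code": {"coding": [{"system": "http://loinc.org", "code": POP_AF_LOINC}]},
            "valueQuantity": {"value": 0.25},
        }
    ],
}

print(population_allele_frequency(example_observation))
```

Returning `None` when the component is absent lets callers distinguish a missing frequency from a frequency of zero.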
This section describes additional APIs provided as part of the reference implementation that are not part of FHIR Genomics Operations.
This utility returns genomic feature coordinates and other annotations. All data are from NCBI Human Genome Resources. For chromosomes, build 37 and build 38 reference sequences are returned. For genes, genomic coordinates are returned, along with a list of transcripts; the MANE transcript is flagged. For transcripts, genomic coordinates are returned, along with the gene name and the composite exons with their coordinates. For proteins, the corresponding transcript is returned.
This utility returns all genes that intersect with a provided genomic region. Gene locations are from NCBI Human Genome Resources.
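Conceptually, finding the genes that intersect a region is an interval-overlap test. The sketch below shows that idea under stated assumptions: coordinates are treated as closed intervals, and the `Gene` type, `genes_in_region` function, and gene entries are hypothetical placeholders. It does not describe how the utility itself is implemented.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    symbol: str
    chrom: str
    start: int  # assumption: inclusive start coordinate
    end: int    # assumption: inclusive end coordinate


def genes_in_region(genes: List[Gene], chrom: str, start: int, end: int) -> List[str]:
    """Return symbols of genes on `chrom` that overlap the closed interval [start, end]."""
    return [
        g.symbol
        for g in genes
        # Two closed intervals overlap when each starts no later than the other ends.
        if g.chrom == chrom and g.start <= end and g.end >= start
    ]


# Hypothetical genes and coordinates, for illustration only.
EXAMPLE_GENES = [
    Gene("GENE_A", "chr1", 100, 200),
    Gene("GENE_B", "chr1", 150, 300),
    Gene("GENE_C", "chr2", 100, 200),
]

print(genes_in_region(EXAMPLE_GENES, "chr1", 180, 190))
```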