Add priority to dataset extension
On branch 1.2.0
Your branch is up to date with 'origin/1.2.0'.

Changes to be committed:
	modified:   README.md
	modified:   dataset/README.md
	modified:   dataset/dataset_extension.json
	modified:   fhir/README.md
	modified:   fhir/fhir_extension.json
	modified:   galaxy/README.md
	modified:   galaxy/galaxy_extension.json
	modified:   license/README.md
	modified:   license/license_extension.json
	modified:   scm/README.md
	modified:   scm/scm_extension.json
HadleyKing committed Feb 7, 2023
1 parent a61ab84 commit 1764153
Showing 11 changed files with 252 additions and 318 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -1,2 +1,3 @@
# extension_schema

extension schema for BCO
51 changes: 16 additions & 35 deletions dataset/README.md
@@ -1,8 +1,9 @@
# Extension to External References: Additional Licenses and Dataset Categories

The external references example extension to list additional licenses and dataset categories.

```json

"extension_domain": [
{
"extension_schema": "http://www.w3id.org/biocompute/extension_domain/1.1.0/dataset/dataset_extension.json",
@@ -43,56 +44,36 @@ The external references example extension to list additional licenses and datase
```

## **additional_license**

The additional license property contains the details about the licenses applied to the dataset and the script or tool/software used to process the given dataset. Licenses make clear what permissions other users have to use, modify, and share the work.

**data_license**: The license applied to the data or the dataset by the author. Usually, most organizations and academic settings widely use Creative Commons licenses that allow open access to knowledge with no or limited restrictions. The most popular license is the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) license that allows users to use, share, transform, and distribute the data as per their requirement provided attribution is given to the original author.
eg. ```
"data_license": "https://creativecommons.org/licenses/by/4.0/" ```

**script_license**: The license applied to the computational script or the tool/software developed to process (parse, QC, align) the input dataset for a final output dataset. There are 5 types of commonly used software licenses for publishing the code. Four of these licenses are open-source allowing reuse of the code to some extent while proprietary licenses are the most restrictive license in terms of code reuse. For most dataset scripts, open-source license such as [GNU General Public License v3](http://www.gnu.org/licenses/gpl-3.0.html) is used.
eg. ``` "script_license": "https://www.gnu.org/licenses/gpl-3.0.en.html"```
**data_license**: The license applied to the data or the dataset by the author. Usually, most organizations and academic settings widely use Creative Commons licenses that allow open access to knowledge with no or limited restrictions. The most popular license is the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) license that allows users to use, share, transform, and distribute the data as per their requirement provided attribution is given to the original author.
eg. ` "data_license": "https://creativecommons.org/licenses/by/4.0/"`

**script_license**: The license applied to the computational script or the tool/software developed to process (parse, QC, align) the input dataset for a final output dataset. There are 5 types of commonly used software licenses for publishing the code. Four of these licenses are open-source allowing reuse of the code to some extent while proprietary licenses are the most restrictive license in terms of code reuse. For most dataset scripts, open-source license such as [GNU General Public License v3](http://www.gnu.org/licenses/gpl-3.0.html) is used.
eg. ` "script_license": "https://www.gnu.org/licenses/gpl-3.0.en.html"`

## dataset_categories

Dataset categories describe and provide more information about the dataset which can be used to classify, group, sort and filter datasets. Currently, there are six categories for describing a dataset: species, molecule, tags, file_type, status, and scope. Each category has a distinct value.

**species**: This category provides information about the species to which the data belongs. The value can be a single scientific name of any species in NCBI taxonomy. When a dataset contains data from multiple species, the category can be repeated, once per species.

eg. ```{
"category_value": "Homo sapiens",
"category_name": "species"
} ```

eg. `{ "category_value": "Homo sapiens", "category_name": "species" } `

**molecule**: This category provides information about the biological macromolecules to which the data belongs. The value of the category can be protein, glycan, nucleic acid, lipid, or proteoform. Other values can be used if the molecule does not fit any of these.
eg. ``` {
"category_value": "Protein",
"category_name": "molecule"
}```
eg. ` { "category_value": "Protein", "category_name": "molecule" }`

**tag**: This category adds a tag to a dataset. The values for the tag category can be a dataset name, resource name, data types, etc.
eg. ```{
"category_value": "Protein Canonical Accessions",
"category_name": "tag"
} ```
**tag**: This category adds a tag to a dataset. The values for the tag category can be a dataset name, resource name, data types, etc.
eg. `{ "category_value": "Protein Canonical Accessions", "category_name": "tag" } `

**file_type**: This category describes the file format type of the dataset. The values of the file_type category can be csv, txt, fasta, tsv, nt, gpff, etc.
eg. ```{
"category_value": "csv",
"category_name": "file_type"
} ```
eg. `{ "category_value": "csv", "category_name": "file_type" } `

**status**: This category provides information about the current status of the dataset. The value of the status category can be reviewed or retired.
eg. ``` {
"category_value": "reviewed",
"category_name": "status"
}```
eg. ` { "category_value": "reviewed", "category_name": "status" }`

**scope**: This category provides information about the usage scope of the dataset. The values of the scope category are internal and external. Datasets with the internal scope are used for internal purposes whereas datasets with the external scope are custom processed for use by external people or resources.
eg. ```{
"category_value": "external",
"category_name": "scope"
} ```
eg. `{ "category_value": "external", "category_name": "scope" } `
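
The category entries described above can be sketched in Python as a list of `category_name`/`category_value` pairs that is then grouped for filtering. This is an illustration only: the helper names `make_category` and `group_by_name` are hypothetical and not part of the BCO tooling, the allowed names are copied from the `category_name` enum in this commit's schema, and "Mus musculus" is just a second example species.

```python
from collections import defaultdict

# Category names taken from the category_name enum in dataset_extension.json
# as modified by this commit (assumption: the enum is the authoritative list).
ALLOWED_CATEGORY_NAMES = {
    "species", "molecule", "tag", "tags",
    "priority", "file_type", "status", "scope",
}


def make_category(name, value):
    """Build one dataset_categories entry (hypothetical helper)."""
    if name not in ALLOWED_CATEGORY_NAMES:
        raise ValueError(f"unknown category_name: {name}")
    return {"category_value": value, "category_name": name}


def group_by_name(categories):
    """Collect category values under their category name, keeping order."""
    grouped = defaultdict(list)
    for entry in categories:
        grouped[entry["category_name"]].append(entry["category_value"])
    return dict(grouped)


# A dataset covering two species repeats the species category.
categories = [
    make_category("species", "Homo sapiens"),
    make_category("species", "Mus musculus"),
    make_category("molecule", "Protein"),
    make_category("file_type", "csv"),
]
grouped = group_by_name(categories)
print(grouped["species"])
```

Grouping by name is one way to support the classify, group, sort, and filter use cases mentioned above; a real consumer might index by value instead.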

The external references **example** extension to list additional licenses and dataset categories.


152 changes: 72 additions & 80 deletions dataset/dataset_extension.json
@@ -1,86 +1,78 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://www.w3id.org/biocompute/extension_domain/1.1.0/dataset/dataset_extension.json",
"title": "dataset_extension",
"type": "object",
"description": "The external references extension to list additional licenses and dataset catagories for a dataset BCO (dsBCO).",
"required": ["dataset_extension", "extension_schema"],
"additionalProperties": false,
"properties": {
"dataset_extension": {
"type": "object",
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://www.w3id.org/biocompute/extension_domain/1.1.0/dataset/dataset_extension.json",
"title": "dataset_extension",
"type": "object",
"description": "The external references extension to list additional licenses and dataset catagories for a dataset BCO (dsBCO).",
"required": ["dataset_extension", "extension_schema"],
"additionalProperties": false,
"properties": {
"dataset_extension": {
"type": "object",
"additionalProperties": false,
"required": ["dataset_categories"],
"properties": {
"additional_license": {
"type": "object",
"description": "The additional license property contains the details about the licenses applied to the dataset and the script or tool/software used to process the given dataset.",
"additionalProperties": false,
"properties": {
"data_license": {
"title": "data_license",
"type": "string",
"description": "The license applied to the data or the dataset by the author",
"examples": ["https://creativecommons.org/licenses/by/4.0/"]
},
"script_license": {
"title": "script_license",
"type": "string",
"description": "The license applied to the computational script or the tool/software developed to process (parse, QC, align) the input dataset for a final output dataset.",
"examples": ["https://www.gnu.org/licenses/gpl-3.0.en.html"]
}
}
},
"dataset_categories": {
"title": "dataset_categories",
"type": "array",
"description": "Dataset categories describe and provide more information about the dataset which can be used to classify, group, sort and filter datasets.",
"items": {
"required": ["category_value", "category_name"],
"additionalProperties": false,
"required": ["dataset_categories"],
"properties": {
"additional_license": {
"type": "object",
"description": "The additional license property contains the details about the licenses applied to the dataset and the script or tool/software used to process the given dataset.",
"additionalProperties": false,
"properties": {
"data_license": {
"title": "data_license",
"type": "string",
"description": "The license applied to the data or the dataset by the author",
"examples": [
"https://creativecommons.org/licenses/by/4.0/"
]
},
"script_license": {
"title": "script_license",
"type": "string",
"description": "The license applied to the computational script or the tool/software developed to process (parse, QC, align) the input dataset for a final output dataset.",
"examples": [
"https://www.gnu.org/licenses/gpl-3.0.en.html"
]
}
}
},
"dataset_categories": {
"title": "dataset_categories",
"type": "array",
"description": "Dataset categories describe and provide more information about the dataset which can be used to classify, group, sort and filter datasets.",
"items": {
"required": [
"category_value",
"category_name"
],
"additionalProperties": false,
"properties": {
"category_value": {
"type": "string",
"title": "category_value",
"description": "An explanation about the purpose of this instance.",
"examples": [
"Homo sapiens"
]
},
"category_name": {
"type": "string",
"title": "category_name",
"description": "An explanation about the purpose of this instance.",
"enum": [
"species",
"molecule",
"tag",
"file_type",
"status",
"scope"
]
}
}
}
}
"category_value": {
"type": "string",
"title": "category_value",
"description": "An explanation about the purpose of this instance.",
"examples": ["Homo sapiens"]
},
"category_name": {
"type": "string",
"title": "category_name",
"description": "An explanation about the purpose of this instance.",
"enum": [
"species",
"molecule",
"tag",
"tags",
"priority",
"file_type",
"status",
"scope"
]
}
}
},
"extension_schema": {
"title": "extension_schema",
"type": "string",
"format": "uri",
"description": "The schema applied to the extension object",
"examples": [
"http://www.w3id.org/biocompute/extension_domain/example.json"
]

}
}
}
}
},
"extension_schema": {
"title": "extension_schema",
"type": "string",
"format": "uri",
"description": "The schema applied to the extension object",
"examples": [
"http://www.w3id.org/biocompute/extension_domain/example.json"
]
}
}
}
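
The constraints in the schema above (the `required` lists and the `category_name` enum, which now includes `priority`) can be mirrored in a small standard-library check. This is a minimal sketch and not a JSON Schema validator: the function name `check_dataset_extension` is hypothetical, it ignores `additionalProperties` and type checks, and the `priority` value `"1"` is an illustrative assumption, since the schema only requires a string. A full validation would normally use a dedicated JSON Schema library.

```python
# Names copied from the category_name enum in the schema above.
CATEGORY_NAMES = {
    "species", "molecule", "tag", "tags",
    "priority", "file_type", "status", "scope",
}


def check_dataset_extension(entry):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    # Top-level "required": ["dataset_extension", "extension_schema"]
    for key in ("dataset_extension", "extension_schema"):
        if key not in entry:
            problems.append(f"missing required property: {key}")
    if "dataset_extension" not in entry:
        return problems

    extension = entry["dataset_extension"]
    # Nested "required": ["dataset_categories"]
    if "dataset_categories" not in extension:
        problems.append("missing required property: dataset_categories")

    for index, item in enumerate(extension.get("dataset_categories", [])):
        # Each item requires both category_value and category_name.
        for key in ("category_value", "category_name"):
            if key not in item:
                problems.append(f"dataset_categories[{index}] is missing {key}")
        name = item.get("category_name")
        if name is not None and name not in CATEGORY_NAMES:
            problems.append(
                f"dataset_categories[{index}] has unknown category_name: {name}"
            )
    return problems


example = {
    "extension_schema": "http://www.w3id.org/biocompute/extension_domain/1.1.0/dataset/dataset_extension.json",
    "dataset_extension": {
        "additional_license": {
            "data_license": "https://creativecommons.org/licenses/by/4.0/",
            "script_license": "https://www.gnu.org/licenses/gpl-3.0.en.html",
        },
        "dataset_categories": [
            {"category_value": "Homo sapiens", "category_name": "species"},
            # Illustrative value only; the schema does not define priority values.
            {"category_value": "1", "category_name": "priority"},
        ],
    },
}
print(check_dataset_extension(example))
```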
9 changes: 4 additions & 5 deletions fhir/README.md
@@ -1,11 +1,10 @@

# Extension to External References: SMART on FHIR Genomics

The external references **example** extension to FHIR resource demonstrates how specific data elements can be extracted from EHR systems or other secure FHIR endpoints via technologies such as [SMART on FHIR Genomics](https://www.ncbi.nlm.nih.gov/pubmed/26198304) without compromising patient or provider information. This is because the portions being transferred contain no identifiable information about the patient. Instead, there is a reference to the actual resource instance (via FHIR URL) through which all data is accessed.

The link to FHIR can also be added to the usability domain. More on FHIR Genomics in release 3 of FHIR can be found here: https://www.hl7.org/fhir/genomics.html
The link to FHIR can also be added to the usability domain. More on FHIR Genomics in release 3 of FHIR can be found here: https://www.hl7.org/fhir/genomics.html

SMART on FHIR Genomics provides a framework for EHR-based apps built on FHIR that integrate clinical and genomic information. For more information on how to use the SMART on FHIR Genomics apps, please visit http://projects.iq.harvard.edu/smartgenomics/.
SMART on FHIR Genomics provides a framework for EHR-based apps built on FHIR that integrate clinical and genomic information. For more information on how to use the SMART on FHIR Genomics apps, please visit http://projects.iq.harvard.edu/smartgenomics/.

```json
"extension_domain":[
@@ -45,12 +44,12 @@ SMART on FHIR Genomics provides a framework for EHR-based apps built on FHIR tha

## FHIR Extension "fhir_extension"

The `fhir_extension` is defined as an array of endpoints from which to fetch resources.
The `fhir_extension` is defined as an array of endpoints from which to fetch resources.

## FHIR Endpoint "fhir_endpoint"

`fhir_endpoint` is a string containing the URL of the endpoint of the FHIR server containing the resource. `fhir_version` must be present showing the FHIR version used.

## FHIR Resources "fhir_resources"

`fhir_resources` is an array of resources to fetch from the endpoint, where `fhir_resource` is a string containing the type of resource used according to the specified version. (a full list of permitted FHIR 3 resources is available at http://hl7.org/fhir/STU3/resourcelist.html) `fhir_id` is a string containing the server-specific identifier for the resource instance.
`fhir_resources` is an array of resources to fetch from the endpoint, where `fhir_resource` is a string containing the type of resource used according to the specified version. (a full list of permitted FHIR 3 resources is available at http://hl7.org/fhir/STU3/resourcelist.html) `fhir_id` is a string containing the server-specific identifier for the resource instance.