Merge pull request #28 from monarch-initiative/update_docs
cleaned up docs and adapted to the two use cases
pnrobinson authored Jun 20, 2024
2 parents 8c5bba8 + 7d41b74 commit dbd3959
Showing 3 changed files with 31 additions and 31 deletions.
31 changes: 17 additions & 14 deletions docs/index.md
Original file line number Diff line number Diff line change
@@ -1,28 +1,34 @@
# Phenopacket2Prompt


phenopacket2prompt is a Java 17 application that creates prompts intended for use with GPT starting from
GA4GH phenopackets.

*phenopacket2prompt* is a [Java](https://www.java.com/){:target="_blank"} application that creates prompts for Large
Language Models (LLMs) on the basis of clinical data that has been encoded using
the [Global Alliance for Genomics and Health (GA4GH)](https://www.ga4gh.org/){:target="_blank"}
[Phenopacket Schema](https://pubmed.ncbi.nlm.nih.gov/35705716/){:target="_blank"}.

Optionally, the prompts can be generated in Czech, Dutch, German, Italian, and Spanish.

## Running phenopacket2prompt

There are currently two use cases:
1. The creation of prompts (in several languages!), starting from phenopackets, intended for use with a Large Language Model (LLM) which is asked for a differential diagnosis.
2. The creation of phenopackets from case reports via text mining using the [fenominal](https://pubmed.ncbi.nlm.nih.gov/38001031/){:target="_blank"} library.

### Running with Phenopackets

For this use case, follow the instructions in [Set-up](setup.md) and [Batch](batch.md).

## Running phenopacket2prompt
### Running with case reports


Assuming the hp.json file has been downloaded as described above and all of the case report text files
Assuming the hp.json file has been downloaded as described in [Set-up](setup.md) and all the case report text files
are available in a directory at ``some/path/gptdocs``, run


```shell title="running the app"
java -jar phenopacket2prompt.jar gpt -g some/path/gptdocs
java -jar target/phenopacket2prompt.jar gpt -g some/path/gptdocs
```



This command will create a new directory called ``gptOut`` (this can be adjusted using the -o option).
It will contain four subdirectories

@@ -32,11 +38,8 @@ It will contain four subdirectories
4. txt_with_differential. Text that starts with the presentation by the first discussant up to and including the differential. This was used to check parsing but was not used in our analysis.
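
The case-report run described above can be wrapped in a small pre-flight check. The following is a minimal sketch, not part of phenopacket2prompt itself: the helper names `check_gpt_inputs` and `list_output_subdirs` are hypothetical, and the paths in the commented example are the placeholders used in the docs. Only the `gpt -g` invocation is taken from the documentation.

```shell
#!/usr/bin/env bash

# Hypothetical helper: confirm that the jar file and the case report
# directory exist before the gpt command is started.
check_gpt_inputs() {
  local jar="$1" docs_dir="$2"
  [ -f "$jar" ] || { echo "jar not found: $jar" >&2; return 1; }
  [ -d "$docs_dir" ] || { echo "input directory not found: $docs_dir" >&2; return 1; }
}

# Hypothetical helper: print the names of the immediate subdirectories
# of the output directory, one per line.
list_output_subdirs() {
  local out_dir="$1" d
  for d in "$out_dir"/*/; do
    [ -d "$d" ] || continue   # skip the unexpanded pattern if nothing matches
    basename "$d"
  done
}

# Example (placeholder paths; gptOut is the default output directory):
# check_gpt_inputs target/phenopacket2prompt.jar some/path/gptdocs \
#   && java -jar target/phenopacket2prompt.jar gpt -g some/path/gptdocs \
#   && list_output_subdirs gptOut
```

Checking the inputs first makes a wrong path fail with a clear message instead of a Java stack trace, and listing the subdirectories afterwards is a quick way to confirm that the run produced the expected output layout.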





### Feedback


The best place to leave feedback, ask questions, and report bugs is the [phenopacket2prompt Issue Tracker](https://github.com/monarch-initiative/phenopacket2prompt/issues).
The best place to leave feedback, ask questions, and report bugs is the
[phenopacket2prompt Issue Tracker](https://github.com/monarch-initiative/phenopacket2prompt/issues){:target="_blank"}.

7 changes: 5 additions & 2 deletions docs/languages.md
@@ -1,7 +1,10 @@
# Languages

phenopacket2prompt creates phenopackets in various languages using a template system.
(TODO explain HPO translations).
*phenopacket2prompt* creates prompts for Large Language Models starting from HPO terms and other data contained in a
GA4GH phenopacket.

Additionally, the HPO translation files (see [Gargano et al., 2024](https://pubmed.ncbi.nlm.nih.gov/37953324/){:target="_blank"})
are leveraged to create prompts in several languages other than English.


## The template
24 changes: 9 additions & 15 deletions docs/setup.md
@@ -1,42 +1,36 @@
# Set-up

phenopacket2prompt requires at least Java 17. To build it from scratch, maven is also required.

## Download command
Before running the batch command, run the download command to get the necessary files

```
java -jar target/phenopacket2prompt.jar download
```


*phenopacket2prompt* requires at least Java 17. To build it from scratch, [Apache Maven](https://maven.apache.org/){:target="_blank"} is also required.
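
The Java requirement can be checked from the command line before building or running the jar. The sketch below is an illustration under stated assumptions: the function names `java_major_from_banner` and `java_is_at_least` are hypothetical, and the parsing assumes the usual `java -version` banner with a quoted version string (for example `openjdk version "17.0.9"`).

```shell
#!/usr/bin/env bash

# Hypothetical helper: extract the major version from a `java -version`
# banner. Handles the legacy scheme, where "1.8.0_292" means Java 8.
java_major_from_banner() {
  local banner="$1" version
  # Pull the first quoted version string, e.g. 17.0.9 or 1.8.0_292.
  version=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$version" in
    "") return 1 ;;                 # no version string found
    1.*) version=${version#1.} ;;   # legacy scheme: drop the leading "1."
  esac
  # Keep only the leading digits (the major version).
  printf '%s\n' "${version%%[!0-9]*}"
}

# Hypothetical helper: succeed if the banner reports at least the
# required major version.
java_is_at_least() {
  local required="$1" banner="$2" major
  major=$(java_major_from_banner "$banner") || return 1
  [ "$major" -ge "$required" ]
}

# Example: `java -version` writes its banner to stderr, hence 2>&1.
# java_is_at_least 17 "$(java -version 2>&1)" || echo "Java 17 or newer is required" >&2
```

Passing the banner in as a string keeps the parsing separate from the `java` call, so the check can be tried out on sample banners even on a machine without Java installed.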

## Installation


Most users should download the prebuilt executable file from the
[Releases](https://github.com/monarch-initiative/phenopacket2prompt/releases) page of the GitHub repository.
[Releases](https://github.com/monarch-initiative/phenopacket2prompt/releases){:target="_blank"} page of the GitHub repository.

It is also possible to build the application from source using standard Maven and Java tools.

```shell title="building the app"
git clone https://github.com/monarch-initiative/phenopacket2prompt.git
cd phenopacket2prompt
mvn package
java -jar target/phenopacket2prompt.jar
java -jar target/phenopacket2prompt.jar [OPTIONS]
```

## Setup


First download the latest copy of the [Human Phenotype Ontology](https://hpo.jax.org/app/) hp.json file. This file is
used for text mining of clinical signs and symptoms. For more information about the HPO, see
[Koehler et al. (2021)](https://pubmed.ncbi.nlm.nih.gov/33264411/). Adjust the path to the `phenopacket2prompt.jar`
file as necessary.
[Koehler et al. (2021)](https://pubmed.ncbi.nlm.nih.gov/33264411/).

## Download command
Before running the [batch](batch.md) command, run the download command to get the necessary files. Adjust the path to the `phenopacket2prompt.jar`
file as necessary; the default is `target/phenopacket2prompt.jar`.



```shell title="download"
java -jar phenopacket2prompt.jar download
java -jar target/phenopacket2prompt.jar download
```
