Commit
chore: take gene-peaks file as stdin to allow custom files
davidlougheed committed Jan 10, 2024
1 parent fc62d5f commit 8edd533
Showing 3 changed files with 9 additions and 13 deletions.
2 changes: 1 addition & 1 deletion docs/setting_up_a_node.md
@@ -187,7 +187,7 @@ docker compose up -d
First, import the assembly gene list and gene-peak association data into the database using the following command:

```bash
- docker compose exec epivar-server node ./scripts/import-genes.mjs
+ docker compose exec -i epivar-server node ./scripts/import-genes.mjs < ./input-files/flu-infection-gene-peaks.csv
```

Then, import peaks and pre-computed peak matrix values into the database using the following command:
7 changes: 2 additions & 5 deletions readme.md
@@ -205,15 +205,12 @@ The different data sources to generate/prepare are:
  - **Genes:** lists of gene names mapped to their characteristics, and features
  associated with specific genes.
- - **Import with:** `node ./scripts/import-genes.mjs`
- - **Input:** `./input-files/flu-infection-genes.txt` and
+ - **Import with:** e.g., `node ./scripts/import-genes.mjs < ./input-files/flu-infection-gene-peaks.csv`
+ - **Input:** A pre-computed gene list for the assembly specified in `config.js` and a file resembling
  `./input-files/flu-infection-gene-peaks.csv`
  - Examples for these files / the versions used for the Aracena *et al.* instance
  of the portal are already [in the repository](./input-files).
  - **Format:**
- - `flu-infection-genes.txt`: TSV file with *no header row*. Columns are:
- gene name, chromosome with `chr` prefix, start coordinate, end coordinate,
- strand (`+` or `-`).
  - `flu-infection-gene-peaks.csv`: CSV *with header row*:
  `"symbol","peak_ids","feature_type"` where `symbol` is gene name, `peak_ids`
  is a feature string (e.g., `chr1_9998_11177`), and `feature_type` is the name
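For reviewers preparing a custom gene-peaks file, the header layout described in the readme hunk can be illustrated with a small sketch. Everything below is an assumption made for illustration: the sample row (gene symbol `GENE1`, feature type `RNA-seq`) is invented, and `parseSimpleCSV` is a hypothetical stand-in for the `csv-parse` call the import script uses, not the script's own code.

```javascript
// Hypothetical sample in the documented layout. The gene symbol and
// feature type values are made up; only the header and the peak ID
// example come from the readme text.
const sample = [
  '"symbol","peak_ids","feature_type"',
  '"GENE1","chr1_9998_11177","RNA-seq"',
].join("\n");

// Naive parser for this sketch only: assumes every field is double-quoted
// and contains no commas, quotes, or newlines. Real data should go through
// a proper CSV library.
function parseSimpleCSV(text) {
  const rows = text
    .trim()
    .split("\n")
    .map((line) => line.split(",").map((field) => field.replace(/^"|"$/g, "")));
  const [header, ...records] = rows;
  // Turn each record into an object keyed by the header names.
  return records.map((r) => Object.fromEntries(header.map((h, i) => [h, r[i]])));
}

const genePeaks = parseSimpleCSV(sample);
console.log(genePeaks);
```

A file with this shape is what would be redirected into the import command shown in the updated **Import with** line.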
13 changes: 6 additions & 7 deletions scripts/import-genes.mjs
@@ -1,16 +1,15 @@
import fs from "node:fs";
import path from "node:path";

import process from "node:process";
import parseCSVSync from "csv-parse/lib/sync";

import envConfig from "../envConfig.js";
import config from "../config.js";

const ASSAY_NAME_RNASEQ = "RNA-seq";
import {genePathsByAssemblyID} from "../data/assemblies/index.mjs";

// TODO: should be dependent on assembly + a pre-computed list
const genesPath = path.join(envConfig.INPUT_FILES_DIR, 'flu-infection-genes.txt');
const ASSAY_NAME_RNASEQ = "RNA-seq";

const genesFeaturesPath = path.join(envConfig.INPUT_FILES_DIR, 'flu-infection-gene-peaks.csv');
const genesPath = genePathsByAssemblyID[config.assembly];
const genesFeaturesPath = process.argv[2] || "/dev/stdin";

import {precomputedPoints} from "./_common.mjs";
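
The fallback on the new `genesFeaturesPath` line can be sketched in isolation as follows. This is a minimal illustration rather than the script's actual code: `resolveGenePeaksPath` is a hypothetical helper name, `./custom-gene-peaks.csv` is a made-up path, and the sketch assumes a POSIX-like system where `/dev/stdin` exists.

```javascript
// Hypothetical helper (not in the repository): mirrors the
// `process.argv[2] || "/dev/stdin"` fallback from the diff.
// argv follows Node's convention: [nodeBinary, scriptPath, ...userArgs].
function resolveGenePeaksPath(argv) {
  return argv[2] || "/dev/stdin";
}

// With a shell redirect (`node script.mjs < file.csv`) there is no third
// argv entry, so the helper falls back to standard input.
console.log(resolveGenePeaksPath(["node", "import-genes.mjs"]));

// With an explicit path argument, that path is used instead.
console.log(resolveGenePeaksPath(["node", "import-genes.mjs", "./custom-gene-peaks.csv"]));
```

Because the fallback is itself a path, code that reads the file by path can treat both cases the same way, which appears to be why the commit needs no separate stdin-handling branch.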

