This repository has been archived by the owner on Nov 16, 2021. It is now read-only.

firt
PengJia6 committed Jun 4, 2020
1 parent 9a8beab commit cad33fe
Showing 14 changed files with 183 additions and 442 deletions.
11 changes: 0 additions & 11 deletions .idea/MSHunter.iml

This file was deleted.

15 changes: 0 additions & 15 deletions .idea/inspectionProfiles/Project_Default.xml

This file was deleted.

4 changes: 0 additions & 4 deletions .idea/misc.xml

This file was deleted.

8 changes: 0 additions & 8 deletions .idea/modules.xml

This file was deleted.

6 changes: 0 additions & 6 deletions .idea/vcs.xml

This file was deleted.

262 changes: 0 additions & 262 deletions .idea/workspace.xml

This file was deleted.

47 changes: 2 additions & 45 deletions README.md
@@ -1,45 +1,2 @@
# MSIhunter
## General introduction
MSIhunter is a Python program for microsatellite instability (MSI) evaluation that uses only tumor next-generation sequencing data. It accepts whole-genome sequencing, whole-exome sequencing and target-region sequencing data.

First, MSIhunter scans the whole genome, or the genome regions you are interested in, to get the location, repeat unit, repeat length and other information for each microsatellite. Second, you need some cancer cases with known MSI status and sequencing data to select a set of discriminative microsatellites and their thresholds for a specific kind of cancer. (The MSI status can be evaluated by PCR; click [here](https://www.ncbi.nlm.nih.gov/gtr/tests/514558/overview/) for more detailed information.) Then you can evaluate MSI using sequencing data and the discriminative microsatellites you selected.

## Documentation
See [Wiki](https://github.com/PengJia6/MSIHunter/wiki) for documentation

## Requirement
To use MSIhunter, you need python3 with pysam, numpy, pandas, numba and scipy.

## Quick start

**Step 1: Prepare the environment**

Make sure you have python3 with pysam, numpy, pandas, numba and scipy in your environment. If not, install these packages using pip or conda:

Using pip: pip install pysam numpy scipy numba pandas
Using conda: conda install pysam numpy scipy numba pandas
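
As a quick sanity check after installation, the short script below reports which of the required packages cannot be found in the current environment. It is a minimal sketch, not part of MSIhunter; the helper name `missing_packages` is hypothetical and the package list is taken from the Requirement section above.

```python
import importlib.util

# Packages listed in the Requirement section of this README.
REQUIRED = ["pysam", "numpy", "scipy", "numba", "pandas"]


def missing_packages(names):
    """Return the names that cannot be located as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```
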
**Step 2: Download the python script and data from the github**

git clone https://github.com/PengJia6/MSIHunter.git

**Step 3: Scan the microsatellites from reference genome**

First, scan the reference genome to find all microsatellites in the whole genome or in your sequencing regions. This gives you the information for each microsatellite. If you use the GRCh38.d1.vd1 reference genome, you can download its microsatellite information directly from our github.

python ScanMicosatellites.py -r GRCh38.d1.vd1.fa -m GRCh38.d1.vd1.fa.microsatellites
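
To illustrate what scanning for microsatellites means, here is a minimal regex-based sketch that finds tandem repeats in a DNA string and reports the start position, repeat unit and repeat count. It is an assumption-laden illustration, not the algorithm used by the script above: the function name `scan_microsatellites`, the 1 to 6 bp unit range and the default minimum of 5 repeats are all hypothetical choices.

```python
import re


def scan_microsatellites(sequence, min_repeats=5, max_unit_len=6):
    """Find tandem repeats in a DNA string.

    Returns a list of (start, unit, repeat_count) tuples, where start is
    0-based. The unit length is matched lazily, so the shortest repeat
    unit is preferred (e.g. "A" x 8 rather than "AA" x 4).
    """
    # One copy of the unit, followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit_len, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        repeat_count = len(match.group(0)) // len(unit)
        results.append((match.start(), unit, repeat_count))
    return results


if __name__ == "__main__":
    for start, unit, count in scan_microsatellites("GGTACACACACACTT"):
        print(start, unit, count)
```

A real scan would read the reference FASTA chromosome by chromosome and record the flanking sequence as well; this sketch only shows the repeat-detection idea on an in-memory string.
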

**Step 4: Select the discriminative Microsatellites for each cancer type.**

For a given cancer type, this step only needs to be done once. It requires cases with known MSI status. We recommend using more than 20 MSI-positive (MSI-H) cases and more than 20 MSI-negative (MSI-L/MSS) cases, which gives more reliable results.

python SelectDiscriminativeMS.py -i trainConfigure.csv -m GRCh38.d1.vd1.fa.microsatellites -o MSIHunterTrain

If you don't have enough data for this step, you can use the discriminative microsatellites we selected. We provide results for colorectal, gastric and endometrial cancer, trained on data from TCGA.

**Step 5: MSI Evaluation**

python MSIhunter.py -mc mirosatellitesConfig.csv -i inputConfig.csv -o MSIHunterResult

For more details about the MSIHunter input and output formats, please visit the [Input and output](https://github.com/PengJia6/MSIHunter/wiki/Input-and-Output) page.


# MShunter
## Please see the help of each command
Binary file modified src/__pycache__/errEval.cpython-37.pyc
Binary file not shown.
Binary file modified src/__pycache__/global_dict.cpython-37.pyc
Binary file not shown.
Binary file added src/__pycache__/units.cpython-37.pyc
Binary file not shown.