DCC Analysis Summary files

Files here provide a unified summary of all Year 2 and later CPTAC3 genomic analysis results and their location on CPTAC-DCC.

Format

DCC Analysis Summary files have the following initial columns:

 1. case
 2. disease
 3. pipeline_name
 4. pipeline_version
 5. timestamp
 6. C3Y 
 7. DCC_path
 8. filesize
 9. file_format
10. md5sum

Column C3Y indicates "CPTAC3 Year" and takes values Y1, Y2, etc. It is used for administrative purposes.

DCC_path is the path of each result file on the CPTAC DCC. The same results are also available on WUSTL RIS storage1, at the same path relative to the following root:

/storage1/fs1/dinglab/Active/Projects/CPTAC3/Common/CPTAC3-DCC-Staging/DCC_STAGE_ROOT
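
As a rough sketch of how a DCC_path value might be mapped onto the storage1 copy and checked against the filesize and md5sum columns, consider the Python below. The helper names `local_path` and `verify_file` are hypothetical. The sketch assumes that DCC_path is joined directly beneath the staging root and that filesize is given in bytes; neither assumption is confirmed by the summary files themselves.

```python
import hashlib
import os
import posixpath

# Staging root quoted above; DCC_path values are assumed to be relative to it.
DCC_STAGE_ROOT = (
    "/storage1/fs1/dinglab/Active/Projects/CPTAC3/Common/"
    "CPTAC3-DCC-Staging/DCC_STAGE_ROOT"
)


def local_path(dcc_path, root=DCC_STAGE_ROOT):
    """Join a DCC_path onto the staging root (hypothetical helper)."""
    # Strip leading slashes so the join does not discard the root.
    return posixpath.join(root, dcc_path.lstrip("/"))


def verify_file(path, expected_size, expected_md5, chunk_size=1 << 20):
    """Return True if the file's size and MD5 digest match the summary row."""
    # Assumption: the filesize column holds a byte count.
    if os.path.getsize(path) != int(expected_size):
        return False
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large result files are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.strip().lower()
```

Checking the size first is a cheap way to reject truncated copies before computing a digest over a large file.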

Additional columns are specific to individual pipelines and will typically indicate the input data associated with this analysis. Pipelines which generate multiple result files per case will have multiple entries in the analysis summary file.
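
The layout described above could be read along the following lines. This is a minimal sketch, not the project's own tooling: it assumes the .dat files are tab-separated with a single header row (possibly prefixed with `#`), which is not stated here, and the function names are made up for illustration.

```python
import csv
from collections import defaultdict

# The ten initial columns listed above, in order.
CORE_COLUMNS = [
    "case", "disease", "pipeline_name", "pipeline_version", "timestamp",
    "C3Y", "DCC_path", "filesize", "file_format", "md5sum",
]


def read_summary(handle):
    """Parse a summary file into a list of dicts (assumes TSV with a header)."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    # Tolerate a header line that begins with '#' and stray whitespace.
    reader.fieldnames = [name.lstrip("# ").strip() for name in header]
    return list(reader)


def extra_columns(rows):
    """Return pipeline-specific column names, i.e. those beyond the core ten."""
    if not rows:
        return []
    return [name for name in rows[0] if name not in CORE_COLUMNS]


def results_per_case(rows):
    """Group DCC_path values by case; some pipelines emit several per case."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["case"]].append(row["DCC_path"])
    return dict(grouped)
```

`read_summary` accepts any iterable of lines, so it can be given an open file, for example `read_summary(open("WGS_SV.DCC_analysis_summary.dat", newline=""))`.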

Analysis Summaries

Counts of unique cases processed per disease and pipeline. Last updated 9/21/22.

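| Pipeline | AML | CCRCC | CM | GBM | HNSCC | LSCC | LUAD | PDA | SAR | UCEC | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Methylation Array | 172 | 260 | 8 | 239 | 111 | 202 | 229 | 164 | 19 | 249 | 1653 |
| miRNA-Seq | 172 | 261 | 8 | 249 | 111 | 202 | 229 | 164 | 19 | 250 | 1665 |
| RNA-Seq Expression | 172 | 261 | 8 | 243 | 111 | 202 | 229 | 186 | 19 | 252 | 1683 |
| RNA-Seq Fusion | 172 | 222 | 8 | 243 | 111 | 202 | 229 | 164 | 19 | 246 | 1616 |
| RNA-Seq Transcript + Splicing | 172 | 261 | 3 | 244 | 111 | 202 | 218 | 186 | 0 | 246 | 1643 |
| WGS CNV Somatic | 133 | 258 | 0 | 218 | 111 | 202 | 229 | 166 | 0 | 243 | 1560 |
| WGS SV | 139 | 258 | 0 | 218 | 111 | 201 | 219 | 166 | 0 | 243 | 1646 |
| WXS MSI | 160 | 259 | 0 | 228 | 111 | 202 | 219 | 166 | 0 | 247 | 1543 |
| WXS Somatic TD | 160 | 259 | 0 | 225 | 111 | 202 | 218 | 181 | 0 | 233 | 1601 |
| WXS Somatic SW | 160 | 260 | 0 | 225 | 111 | 202 | 218 | 181 | 0 | 247 | 1604 |
| WXS Germline | 160 | 259 | 0 | 230 | 111 | 199 | 229 | 166 | 0 | 233 | 1587 |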
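
Counts like those in the table can in principle be derived from a single pipeline's summary file. The sketch below is one way to do it and is not necessarily how the table was produced: it assumes each row is a dict keyed by the column names above, and it computes the total as the sum of per-disease counts, which presumes a case belongs to only one disease.

```python
from collections import defaultdict


def unique_case_counts(rows):
    """Count distinct cases per disease, plus a Total across diseases."""
    cases_by_disease = defaultdict(set)
    for row in rows:
        # A set collapses multiple result files for the same case.
        cases_by_disease[row["disease"]].add(row["case"])
    counts = {
        disease: len(cases)
        for disease, cases in sorted(cases_by_disease.items())
    }
    # Assumption: Total is the sum of the per-disease unique counts.
    counts["Total"] = sum(counts.values())
    return counts
```

Because cases are collected into sets, pipelines that write several result files per case are still counted once per case.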

Year 1

Processing performed during CPTAC3 Year 1 consisted of analyses for the CCRCC, LUAD, and UCEC discovery cohorts; a visual summary of processing per batch can be found in this processing update description. A subset of Year 1 calls is included in the DCC analysis summaries here, mainly those whose pipeline versions are consistent with those used in Year 2. Year 1 analyses can be identified by "Y1" in the C3Y column, and do not have details about input data.
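
Since Year 1 entries lack input-data details, it may be convenient to separate them before inspecting pipeline-specific columns. A minimal sketch, assuming rows are dicts keyed by column name and using a hypothetical function name:

```python
def split_by_year(rows):
    """Partition rows into Year 1 and later-year entries using the C3Y column."""
    year1 = [row for row in rows if row["C3Y"].strip() == "Y1"]
    later = [row for row in rows if row["C3Y"].strip() != "Y1"]
    return year1, later
```

The `strip()` call guards against stray whitespace around the C3Y value; whether such whitespace occurs in practice is not known from this description.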

Pipeline details

Details and notes about pipelines and processing status are given below. More complete pipeline details are provided in the documentation included with data files on DCC.

Methylation_Array

Details in Methylation_Array.DCC_analysis_summary.dat

CPTAC3 Methylation pipeline details

miRNA-Seq

miRNA-Seq analysis

Analysis details miRNA-Seq.DCC_analysis_summary.dat. Note that each sample has results for mature miRNA, precursor miRNA, and total miRNA.

miRNA-Seq pipeline documentation and processing description.

RNA-Seq

RNA-Seq Expression

Analysis details RNA-Seq_Expression.DCC_analysis_summary.dat

CPTAC3 RNA-Seq Expression pipeline

RNA-Seq Fusion

Analysis details RNA-Seq_Fusion.DCC_analysis_summary.dat, and pipeline documentation on GitHub

RNA-Seq Transcript + Splicing

Analysis details RNA-Seq_Transcript.DCC_analysis_summary.dat

Pipeline documentation on GitHub

WGS

WGS SV

Analysis details WGS_SV.DCC_analysis_summary.dat

CPTAC3 SomaticSV pipeline on GitHub

WGS CNV Somatic

Analysis details WGS_CNV_Somatic.DCC_analysis_summary.dat

WGS CNV pipeline

WXS

WXS MSI

Analysis details WXS_MSI.DCC_analysis_summary.dat

WXS MSI pipeline on GitHub

WXS Normal Adjacent

Analysis details WXS_Normal_Adjacent.DCC_analysis_summary.dat

WXS Normal Adjacent analysis generated using TinDaisy pipeline

WXS Somatic TD

WXS Somatic analysis generated using TinDaisy variant caller v2.1

Analysis details WXS_Somatic_Variant_TD.DCC_analysis_summary.dat

WXS Somatic SW

Analysis details WXS_Somatic_Variant_SW.DCC_analysis_summary.dat

Generated by SomaticWrapper.