Files here provide a unified summary of all Year 2 and later CPTAC3 genomic analysis results and their location on CPTAC-DCC.
DCC Analysis Summary files have the following initial columns:
1. case
2. disease
3. pipeline_name
4. pipeline_version
5. timestamp
6. C3Y
7. DCC_path
8. filesize
9. file_format
10. md5sum
Column C3Y indicates "CPTAC3 Year" and takes values Y1, Y2, etc. It is used for administrative purposes.
DCC_path is the path of each result at the CPTAC DCC.
Such results are also available relative to the following path on WUSTL RIS storage1:
/storage1/fs1/dinglab/Active/Projects/CPTAC3/Common/CPTAC3-DCC-Staging/DCC_STAGE_ROOT
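As a minimal sketch of how a DCC_path value might be mapped to a local file and checked against the md5sum column, consider the following. The helper names `local_path` and `md5_matches` are hypothetical, and the code assumes that DCC_path can be appended directly to the staging root and that md5sum holds a hexadecimal MD5 digest; neither assumption is confirmed here.

```python
import hashlib
import posixpath

# Staging root on WUSTL RIS storage1, as given above.
DCC_STAGE_ROOT = (
    "/storage1/fs1/dinglab/Active/Projects/CPTAC3/Common/"
    "CPTAC3-DCC-Staging/DCC_STAGE_ROOT"
)


def local_path(dcc_path, root=DCC_STAGE_ROOT):
    """Map a DCC_path value to its assumed location under the staging root."""
    # Strip leading slashes so that join() does not discard the root.
    return posixpath.join(root, dcc_path.lstrip("/"))


def md5_matches(path, expected_md5, chunk_size=1 << 20):
    """Return True if the file's MD5 hex digest equals the md5sum value."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large result files are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.strip().lower()
```

For a row of an analysis summary file, `md5_matches(local_path(row["DCC_path"]), row["md5sum"])` would then report whether the staged copy matches the recorded checksum.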
Additional columns are specific to individual pipelines and will typically indicate the input data associated with this analysis. Pipelines which generate multiple result files per case will have multiple entries in the analysis summary file.
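The column layout above can be read with a short script such as the sketch below. It assumes the summary files are tab-delimited with a single header row that may begin with a `#` marker; that file layout is an assumption, not something stated here. The function names `read_summary` and `results_by_case` are hypothetical.

```python
import csv

# The ten initial columns listed above; later columns are pipeline-specific.
CORE_COLUMNS = [
    "case", "disease", "pipeline_name", "pipeline_version", "timestamp",
    "C3Y", "DCC_path", "filesize", "file_format", "md5sum",
]


def read_summary(handle):
    """Return one dict per result file from an analysis summary file."""
    reader = csv.reader(handle, delimiter="\t")
    # Assumption: the header row may start with a '#' comment marker.
    header = [name.lstrip("# ").strip() for name in next(reader)]
    if header[:len(CORE_COLUMNS)] != CORE_COLUMNS:
        raise ValueError("unexpected initial columns: %r" % header[:10])
    # Pipeline-specific columns are kept under whatever names the file uses.
    return [dict(zip(header, row)) for row in reader if row]


def results_by_case(rows):
    """Group rows by case; a case may have several result files."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["case"], []).append(row)
    return grouped
```

Grouping by case matters for pipelines that write several result files per case, since each of those files appears as its own row.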
Counts of unique cases processed per disease and pipeline. Last updated 9/21/22.
Pipeline | AML | CCRCC | CM | GBM | HNSCC | LSCC | LUAD | PDA | SAR | UCEC | Total |
---|---|---|---|---|---|---|---|---|---|---|---|
Methylation Array | 172 | 260 | 8 | 239 | 111 | 202 | 229 | 164 | 19 | 249 | 1653 |
miRNA-Seq | 172 | 261 | 8 | 249 | 111 | 202 | 229 | 164 | 19 | 250 | 1665 |
RNA-Seq Expression | 172 | 261 | 8 | 243 | 111 | 202 | 229 | 186 | 19 | 252 | 1683 |
RNA-Seq Fusion | 172 | 222 | 8 | 243 | 111 | 202 | 229 | 164 | 19 | 246 | 1616 |
RNA-Seq Transcript + Splicing | 172 | 261 | 3 | 244 | 111 | 202 | 218 | 186 | 0 | 246 | 1643 |
WGS CNV Somatic | 133 | 258 | 0 | 218 | 111 | 202 | 229 | 166 | 0 | 243 | 1560 |
WGS SV | 139 | 258 | 0 | 218 | 111 | 201 | 219 | 166 | 0 | 243 | 1646 |
WXS MSI | 160 | 259 | 0 | 228 | 111 | 202 | 219 | 166 | 0 | 247 | 1543 |
WXS Somatic TD | 160 | 259 | 0 | 225 | 111 | 202 | 218 | 181 | 0 | 233 | 1601 |
WXS Somatic SW | 160 | 260 | 0 | 225 | 111 | 202 | 218 | 181 | 0 | 247 | 1604 |
WXS Germline | 160 | 259 | 0 | 230 | 111 | 199 | 229 | 166 | 0 | 233 | 1587 |
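
Per-disease counts of unique cases like those in the table can be derived from the rows of one analysis summary file. The sketch below is illustrative only: it assumes each row is a dict with `case` and `disease` keys, and the function name `unique_cases_per_disease` is hypothetical. It is not necessarily how the table above was produced.

```python
from collections import defaultdict


def unique_cases_per_disease(rows):
    """Count distinct cases per disease for one pipeline's summary rows."""
    cases = defaultdict(set)
    for row in rows:
        # A set removes repeats from pipelines with several files per case.
        cases[row["disease"]].add(row["case"])
    counts = {disease: len(ids) for disease, ids in cases.items()}
    # Assumption: the total is the sum of the per-disease counts.
    counts["Total"] = sum(counts.values())
    return counts
```

Counting distinct case identifiers rather than rows keeps pipelines with multiple result files per case from being over-counted.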
Processing performed during CPTAC3 Year 1 consisted of analyses for the CCRCC, LUAD, and UCEC discovery cohorts; a visual summary of processing per batch can be found in this processing update description. A subset of Year 1 calls is included in the DCC analysis summaries here, mainly those whose pipeline versions are consistent with those in Year 2. Year 1 analyses can be identified by "Y1" in the C3Y column, and do not have details about input data.
Details and notes about pipelines and processing status are given below. More complete pipeline details are in the documentation distributed with the data files on the DCC.
Details in Methylation_Array.DCC_analysis_summary.dat
CPTAC3 Methylation pipeline details
Analysis details miRNA-Seq.DCC_analysis_summary.dat
Note that each sample has results for mature miRNA, precursor miRNA, and total miRNA.
miRNA-Seq pipeline documentation and processing description.
Analysis details RNA-Seq_Expression.DCC_analysis_summary.dat
CPTAC3 RNA-Seq Expression pipeline
Analysis details RNA-Seq_Fusion.DCC_analysis_summary.dat, and pipeline documentation on GitHub
Analysis details RNA-Seq_Transcript.DCC_analysis_summary.dat
Pipeline documentation on GitHub
Analysis details WGS_SV.DCC_analysis_summary.dat
CPTAC3 SomaticSV pipeline on GitHub
Analysis details WGS_CNV_Somatic.DCC_analysis_summary.dat
Analysis details WXS_MSI.DCC_analysis_summary.dat
Analysis details WXS_Normal_Adjacent.DCC_analysis_summary.dat
WXS Normal Adjacent analysis generated using TinDaisy pipeline
WXS Somatic analysis TinDaisy variant caller v2.1
Analysis details WXS_Somatic_Variant_TD.DCC_analysis_summary.dat
Analysis details WXS_Somatic_Variant_SW.DCC_analysis_summary.dat. Generated by SomaticWrapper.