Commit

Merge branch 'gsea_4_2_0_bugfix'
davideby committed Dec 17, 2021
2 parents 99c9ec8 + 051c707 commit 1b50c87
Showing 23 changed files with 333 additions and 88 deletions.
26 changes: 15 additions & 11 deletions docs/v20/index.html
@@ -1,7 +1,7 @@
<!DOCTYPE html>
<!-- saved from url=(0083)http://software.broadinstitute.org/cancer/software/genepattern/modules/docs/GSEA/18 -->
<html class=""><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
-<title>GSEA (v20.2.x)</title>
+<title>GSEA (v20.3.x)</title>
<link href="./application.css" media="all" rel="stylesheet">
<script src="./application.js"></script><style>.cke{visibility:hidden;}</style><style type="text/css"></style>
<meta http-equiv="X-UA-Compatible" content="IE=edge">
@@ -15,7 +15,7 @@

<div class="gp-content-header fluid">
<div class="container">
-<h1>GSEA (v20.2.x) <a style="float: right" href="https://www.genepattern.org"><img alt="GP Logo" src="gplogo.png" /></a></h1>
+<h1>GSEA (v20.3.x) <a style="float: right" href="https://www.genepattern.org"><img alt="GP Logo" src="gplogo.png" /></a></h1>
</div>
</div>
<div class="container">
@@ -33,7 +33,7 @@ <h1>GSEA (v20.2.x) <a style="float: right" href="https://www.genepattern.org"><i
<p></p>
</div>
<div class="col-sm-4">
-<p><strong>GSEA Version: </strong> 4.1.0</p>
+<p><strong>GSEA Version: </strong> 4.2.0</p>
</div>
</div>

@@ -245,6 +245,9 @@ <h2>Parameters</h2>
<li>Median_of_probes: For each sample, use the median expression value for the probe set.</li>
<li>Mean_of_probes: For each sample, use the mean expression value for the probe set.</li>
<li>Sum_of_probes: For each sample, sum all the expression values of the probe set.</li>
+<li>Abs_max_of_probes: For each sample, use the expression value for the probe set with the maximum **absolute value**. Note that each value retains its original sign but is chosen based on absolute value.
+In other words, the largest magnitude value is used. While this method is useful with computational-based input datasets it is generally **not recommended** for use with quantification-based expression
+measures such as counts or microarray fluorescence.</li>
</ul>
</td>
</tr>
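
The Abs_max_of_probes rule described in this hunk can be illustrated with a minimal Python sketch. This is not the GSEA Desktop implementation; the function name `abs_max_of_probes` and the sample values are hypothetical, and the sketch only shows the stated rule: the value is chosen by magnitude and keeps its original sign.

```python
from typing import Sequence


def abs_max_of_probes(values: Sequence[float]) -> float:
    """Return the value with the largest magnitude, keeping its sign.

    `values` holds the expression values of one sample for every probe
    that maps to the same gene (hypothetical input layout).
    """
    if not values:
        raise ValueError("probe set has no values")
    # key=abs ranks by magnitude; max() returns the original signed value.
    return max(values, key=abs)


# Three probes for one gene in one sample.
print(abs_max_of_probes([0.4, -2.5, 1.9]))  # -2.5
```

Selecting by magnitude makes sense for signed inputs such as fold changes or model scores, which matches the note that the mode is not recommended for counts or fluorescence intensities.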
@@ -318,10 +321,6 @@ <h2>Parameters</h2>
<td valign="top">create gcts&nbsp;<span style="color:red;">*</span></td>
<td valign="top">Whether to save the dataset subsets backing the GSEA report heatmaps as GCT files; these will be subsets of your original dataset corresponding only to the genes of the heatmap.&nbsp;</td>
</tr>
-<tr>
-<td valign="top">create zip&nbsp;<span style="color:red;">*</span></td>
-<td valign="top">Create a ZIP bundle of the output files. This is true by default, matching the former behavior where a ZIP bundle was always created.</td>
-</tr>
</tbody>
</table>

@@ -358,11 +357,10 @@ <h2>Input Files</h2>

<h2>Output Files</h2>

-<p>1. Optional Enrichment Report archive: ZIP</p>
+<p>1. Enrichment Report archive: ZIP</p>

<p style="margin-left: 40px;">ZIP file containing the result files. &nbsp;For more information on interpreting these results, see <a href="http://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideTEXT.htm#_Interpreting_GSEA_Results">Interpreting GSEA Results</a> in the GSEA User Guide.
-Note that in prior versions the ZIP bundle was created as the only output file. This behavior has been changed to give direct access to the results without the need for a download. The default is to create the ZIP bundle, matching the former behavior, but the report files
-will always be created directly.</p>
+Note that in prior versions the ZIP bundle was created as the only output file. This behavior has been changed to give direct access to the results without the need for a download.</p>

<p>2. Enrichment Report: HTML and PNG images</p>

@@ -374,7 +372,8 @@ <h2>Output Files</h2>

<p>3. Optional GCTs</p>

-<p style="margin-left: 40px;">The datasets backing all the heatmap images from the Enrichment Report for use in external visualizers or analysis tools. These will have the same name as the corresponding image but instead with a GCT extension.</p>
+<p style="margin-left: 40px;">The datasets backing all the heatmap images from the Enrichment Report for use in external visualizers or analysis tools. These will have the same name as the corresponding image but instead with a GCT extension.
+When Collapse or Remap_Only is set, the collapsed dataset is also saved as a GCT. These files will be created if the Create GCTs option is true.</p>

</div>
</div>
@@ -410,6 +409,11 @@ <h2>Version Comments</h2>
</tr>
</thead>
<tbody>
+<tr>
+<td>20.3.0</td>
+<td>2021-12-17</td>
+<td>Updated with the GSEA Desktop 4.2.0 code base with numerous bug fixes. Adds the Abs_max_of_probes collapse mode. Fixed some issues handling datasets with missing values. Added the Spearman metric. Fixed issue with the min-sample check with gene_set permutation mode. Improved warnings and logging. Changed the FDR q-value scale on the NES vs Significance plot. Fixed bugs in weighted_p1.5 scoring.</td>
+</tr>
<tr>
<td>20.2.4</td>
<td>2021-4-22</td>
24 changes: 14 additions & 10 deletions docs/v20/test.md
@@ -1,4 +1,4 @@
-# GSEA (v20.2.x)
+# GSEA (v20.3.x)

Gene Set Enrichment Analysis

@@ -14,7 +14,7 @@ for GSEA questions.
team](http://software.broadinstitute.org/cancer/software/genepattern/contact)
for GenePattern issues.

-**GSEA Version:** 4.1.0
+**GSEA Version:** 4.2.0

## Description

@@ -301,6 +301,9 @@ For descriptions of the ranking metrics, see <a href="http://www.gsea-msigdb.org
<li>Median_of_probes: For each sample, use the median expression value for the probe set.</li>
<li>Mean_of_probes: For each sample, use the mean expression value for the probe set.</li>
<li>Sum_of_probes: For each sample, sum all the expression values of the probe set.</li>
+<li>Abs_max_of_probes: For each sample, use the expression value for the probe set with the maximum **absolute value**. Note that each value retains its original sign but is chosen based on absolute value.
+In other words, the largest magnitude value is used. While this method is useful with computational-based input datasets it is generally **not recommended** for use with quantification-based expression
+measures such as counts or microarray fluorescence.</li>
</ul></td>
</tr>
<tr class="even">
@@ -367,10 +370,6 @@ For descriptions of the ranking metrics, see <a href="http://www.gsea-msigdb.org
<td align="left">create gcts <span style="color:red;">*</span></td>
<td align="left">Whether to save the dataset subsets backing the GSEA report heatmaps as GCT files; these will be subsets of your original dataset corresponding only to the genes of the heatmap. </td>
</tr>
-<tr class="even">
-<td align="left">create zip <span style="color:red;">*</span></td>
-<td align="left">Create a ZIP bundle of the output files. This is true by default, matching the former behavior where a ZIP bundle was always created.</td>
-</tr>
</tbody>
</table>

@@ -423,16 +422,14 @@ drop-down

## Output Files

-1\. Optional Enrichment Report archive: ZIP
+1\. Enrichment Report archive: ZIP

ZIP file containing the result files.  For more information on
interpreting these results, see [Interpreting GSEA
Results](http://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideTEXT.htm#_Interpreting_GSEA_Results)
in the GSEA User Guide. Note that in prior versions the ZIP bundle was
created as the only output file. This behavior has been changed to give
-direct access to the results without the need for a download. The
-default is to create the ZIP bundle, matching the former behavior, but
-the report files will always be created directly.
+direct access to the results without the need for a download.

2\. Enrichment Report: HTML and PNG images

@@ -450,6 +447,8 @@ can be decompressed using 'gunzip' on Mac or Linux and 7-Zip on Windows
The datasets backing all the heatmap images from the Enrichment Report
for use in external visualizers or analysis tools. These will have the
same name as the corresponding image but instead with a GCT extension.
+When Collapse or Remap_Only is set, the collapsed dataset is also saved
+as a GCT. These files will be created if the Create GCTs option is true.

## Platform Dependencies

@@ -476,6 +475,11 @@ Java
</tr>
</thead>
<tbody>
+<tr class="even">
+<td align="left">20.3.0</td>
+<td align="left">2021-12-17</td>
+<td align="left">Updated with the GSEA Desktop 4.2.0 code base with numerous bug fixes. Adds the Abs_max_of_probes collapse mode. Fixed some issues handling datasets with missing values. Added the Spearman metric. Fixed issue with the min-sample check with gene_set permutation mode. Improved warnings and logging. Changed the FDR q-value scale on the NES vs Significance plot. Fixed bugs in weighted_p1.5 scoring.</td>
+</tr>
<tr class="odd">
<td align="left">20.2.4</td>
<td align="left">2021-4-22</td>
6 changes: 3 additions & 3 deletions gpunit_breadthTest/other/Dressman_81.2_test.yml
@@ -13,7 +13,7 @@ params:
collapse.dataset: "No_Collapse"
#chip.platform.file:
# Renaming result file for ease of testing.
-output.file.name: "Dressman_81.zip"
+output.file.name: "Dressman_81.2.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
@@ -40,6 +40,6 @@ params:
assertions:
jobStatus: success
files:
-"Dressman_81.zip":
+"Dressman_81.2.zip":
diffCmd: ../diffGseaResults.sh
-diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/Dressman_81.2_test/Dressman_81.zip"
+diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/Dressman_81.2_test/Dressman_81.2.zip"
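
Each breadth-test change in this commit renames three strings in step: `output.file.name`, the key under `assertions.files`, and the final component of the `diff` path. A small consistency check along the following lines could catch a partial rename. It is only a sketch: the function `check_zip_name` is hypothetical and not part of gpunit, and the dictionaries stand in for an already parsed test file.

```python
import posixpath


def check_zip_name(test_case: dict) -> list:
    """Return human-readable problems for one parsed test case (empty if none)."""
    problems = []
    output_name = test_case["params"]["output.file.name"]
    files = test_case["assertions"]["files"]

    # The result ZIP named in params must have a matching assertion entry.
    if output_name not in files:
        problems.append(f"{output_name} has no entry under assertions.files")
        return problems

    # The expected-result path should end in the same file name.
    expected_path = files[output_name]["diff"]
    if posixpath.basename(expected_path) != output_name:
        problems.append(f"diff path {expected_path} does not end in {output_name}")
    return problems


# Values copied from the updated Dressman_81.2_test.yml.
dressman = {
    "params": {"output.file.name": "Dressman_81.2.zip"},
    "assertions": {
        "files": {
            "Dressman_81.2.zip": {
                "diffCmd": "../diffGseaResults.sh",
                "diff": "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/Dressman_81.2_test/Dressman_81.2.zip",
            }
        }
    },
}
print(check_zip_name(dressman))  # []

# Hypothetical partial edit: only output.file.name was renamed.
stale = {
    "params": {"output.file.name": "Dressman_81.2.zip"},
    "assertions": {
        "files": {
            "Dressman_81.zip": {
                "diffCmd": "../diffGseaResults.sh",
                "diff": "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/Dressman_81.2_test/Dressman_81.zip",
            }
        }
    },
}
print(check_zip_name(stale))
```

The check deliberately ignores other asserted files, so entries whose expected-result file has a different name are not flagged.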
6 changes: 3 additions & 3 deletions gpunit_breadthTest/other/Lin.et.al.2008.2_test.yml
@@ -13,7 +13,7 @@ params:
collapse.dataset: "No_Collapse"
#chip.platform.file:
# Renaming result file for ease of testing.
-output.file.name: "Lin.et.al.2008.zip"
+output.file.name: "Lin.et.al.2008.2.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
@@ -40,6 +40,6 @@ params:
assertions:
jobStatus: success
files:
-"Lin.et.al.2008.zip":
+"Lin.et.al.2008.2.zip":
diffCmd: ../diffGseaResults.sh
-diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/Lin.et.al.2008.2_test/Lin.et.al.2008.zip"
+diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/Lin.et.al.2008.2_test/Lin.et.al.2008.2.zip"
6 changes: 3 additions & 3 deletions gpunit_breadthTest/other/MD.outcome.2_test.yml
@@ -13,7 +13,7 @@ params:
collapse.dataset: "No_Collapse"
#chip.platform.file:
# Renaming result file for ease of testing.
-output.file.name: "MD.MD.outcome.2.cls_0_versus_NA.zip"
+output.file.name: "MD.outcome.MD.outcome.2.cls_0_versus_NA.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
@@ -40,6 +40,6 @@ params:
assertions:
jobStatus: success
files:
-"MD.MD.outcome.2.cls_0_versus_NA.zip":
+"MD.outcome.MD.outcome.2.cls_0_versus_NA.zip":
diffCmd: ../diffGseaResults.sh
-diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/MD.outcome.2_test/MD.MD.outcome.2.cls_0_versus_NA.zip"
+diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/MD.outcome.2_test/MD.outcome.MD.outcome.2.cls_0_versus_NA.zip"
6 changes: 3 additions & 3 deletions gpunit_breadthTest/other/MD.outcome.BCAT_test.yml
@@ -13,7 +13,7 @@ params:
collapse.dataset: "No_Collapse"
#chip.platform.file:
# Renaming result file for ease of testing.
-output.file.name: "MD.outcome.zip"
+output.file.name: "MD.outcome.BCAT.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
@@ -40,6 +40,6 @@ params:
assertions:
jobStatus: success
files:
-"MD.outcome.zip":
+"MD.outcome.BCAT.zip":
diffCmd: ../diffGseaResults.sh
-diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/MD.outcome.BCAT_test/MD.outcome.zip"
+diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/MD.outcome.BCAT_test/MD.outcome.BCAT.zip"
6 changes: 3 additions & 3 deletions gpunit_breadthTest/other/Ross_et_al.3-class_test.yml
@@ -13,7 +13,7 @@ params:
collapse.dataset: "No_Collapse"
#chip.platform.file:
# Renaming result file for ease of testing.
-output.file.name: "Ross_et_al.zip"
+output.file.name: "Ross_et_al.3-class.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
@@ -40,6 +40,6 @@ params:
assertions:
jobStatus: success
files:
-"Ross_et_al.zip":
+"Ross_et_al.3-class.zip":
diffCmd: ../diffGseaResults.sh
-diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/Ross_et_al.3-class_test/Ross_et_al.zip"
+diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/Ross_et_al.3-class_test/Ross_et_al.3-class.zip"
6 changes: 3 additions & 3 deletions gpunit_breadthTest/other/dfci.subset_test.yml
@@ -13,7 +13,7 @@ params:
collapse.dataset: "No_Collapse"
#chip.platform.file:
# Renaming result file for ease of testing.
-output.file.name: "dfci.zip"
+output.file.name: "dfci.subset.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
@@ -40,6 +40,6 @@ params:
assertions:
jobStatus: success
files:
-"dfci.zip":
+"dfci.subset.zip":
diffCmd: ../diffGseaResults.sh
-diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/dfci.subset_test/dfci.zip"
+diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/dfci.subset_test/dfci.subset.zip"
6 changes: 3 additions & 3 deletions gpunit_breadthTest/other/meta.1.2_test.yml
@@ -13,7 +13,7 @@ params:
collapse.dataset: "No_Collapse"
#chip.platform.file:
# Renaming result file for ease of testing.
-output.file.name: "meta.meta.1.2.cls_breast_versus_colon.zip"
+output.file.name: "meta.1.meta.1.2.cls_breast_versus_colon.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
@@ -40,6 +40,6 @@ params:
assertions:
jobStatus: success
files:
-"meta.meta.1.2.cls_breast_versus_colon.zip":
+"meta.1.meta.1.2.cls_breast_versus_colon.zip":
diffCmd: ../diffGseaResults.sh
-diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/meta.1.2_test/meta.meta.1.2.cls_breast_versus_colon.zip"
+diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/meta.1.2_test/meta.1.meta.1.2.cls_breast_versus_colon.zip"
6 changes: 3 additions & 3 deletions gpunit_breadthTest/other/primet1.2_test.yml
@@ -13,7 +13,7 @@ params:
collapse.dataset: "No_Collapse"
#chip.platform.file:
# Renaming result file for ease of testing.
-output.file.name: "primet1.zip"
+output.file.name: "primet1.2.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
@@ -40,6 +40,6 @@ params:
assertions:
jobStatus: success
files:
-"primet1.zip":
+"primet1.2.zip":
diffCmd: ../diffGseaResults.sh
-diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/primet1.2_test/primet1.zip"
+diff: "<%gpunit.resultData%>gpunit/GSEA/breadthTest/output/other/primet1.2_test/primet1.2.zip"
@@ -0,0 +1,50 @@
# Copyright (c) 2003-2021 Broad Institute, Inc., Massachusetts Institute of Technology, and Regents of the University of California. All rights reserved.
#module: urn:lsid:broad.mit.edu:cancer.software.genepattern.module.analysis:00072:20
module: GSEA
name: GSEA collapse_NaN_Missing_abs_max_test
description: Test the GSEA 'collapse dataset' function handling Infinite, NaN and Missing values, collapse to absolute max of probes. Tests are centered on HTR4, HTR6, FLJ22639, HTR7, NPAL2, NPAL3, GSTK1, BCR.
params:
expression.dataset: "<%gpunit.testData%>gpunit/GSEA/v20/input/Diabetes_hgu133a_NaN_missing_vals.gct"
gene.sets.database: [ "<%gpunit.testData%>gpunit/GSEA/v20/input/c1.symbols.reduced.gmt" ]
number.of.permutations: "10"
# Uses P53_6samples.cls because it happens to have a reasonable class template for this use
phenotype.labels: "<%gpunit.testData%>gpunit/GSEA/v20/input/P53_6samples.cls"
target.profile: ""
permutation.type: "phenotype"
collapse.dataset: "Collapse"
chip.platform.file: "<%gpunit.testData%>gpunit/GSEA/v20/input/HG_U133A.chip"
# Modifying the ZIP name here so that the diffCmd can find the RNK inside. We could modify the diffCmd
# to be able to find it, but that makes the code somewhat complicated.
output.file.name: "Diabetes_hgu133a_NaN_missing_vals_collapsed_to_symbols.zip"
scoring.scheme: "weighted"
metric.for.ranking.genes: "Signal2Noise"
gene.list.sorting.mode: "real"
gene.list.ordering.mode: "descending"
max.gene.set.size: "500"
min.gene.set.size: "15"
collapsing.mode.for.probe.sets.with.more.than.one.match: "Abs_max_of_probes"
normalization.mode: "meandiv"
randomization.mode: "no_balance"
omit.features.with.no.symbol.match: "true"
median.for.class.metrics: "false"
number.of.markers: "100"
# Note that we use a fixed random seed rather than the 'timestamp' default so that we'll have reproducible test results
random.seed: "149"
create.svgs: "false"
create.gcts: "true"
save.random.ranked.lists: "false"
plot.graphs.for.the.top.sets.of.each.phenotype: "20"
make.detailed.gene.set.report: "false"
selected.gene.sets: ""
dev.mode: "true"
alt.delim: ""
create.zip: "true"
assertions:
jobStatus: success
files:
"Diabetes_hgu133a_NaN_missing_vals_collapsed_to_symbols.zip":
diffCmd: ../diffGseaResults.sh
diff: "<%gpunit.resultData%>gpunit/GSEA/v20/output/collapse/collapse_NaN_Missing_abs_max_test/Diabetes_hgu133a_NaN_missing_vals_collapsed_to_symbols.zip"
"stdout.txt":
diffCmd: ../grepMessages.sh
diff: "<%gpunit.resultData%>gpunit/GSEA/v20/output/collapse/collapse_NaN_Missing_abs_max_test/stdoutMatches.txt"
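
The new test exercises the absolute-max collapse on a dataset containing NaN and missing values. The commit does not say how GSEA treats such values, so the sketch below is only an illustration of why a policy is needed: every comparison against NaN is false, so a naive `max(values, key=abs)` returns a result that depends on where the NaN sits. The function `abs_max_skipping_nan` and its skip-NaN policy are assumptions for illustration, not GSEA behavior.

```python
import math


def abs_max_skipping_nan(values):
    """Largest-magnitude value with NaN entries ignored (assumed policy).

    Returns NaN when every entry is NaN. Infinite values are kept, so
    -inf or +inf wins whenever one is present.
    """
    usable = [v for v in values if not math.isnan(v)]
    if not usable:
        return math.nan
    return max(usable, key=abs)


nan = float("nan")

# Naive selection is order-dependent once NaN is present.
print(max([nan, 1.5, -3.0], key=abs))  # nan
print(max([1.5, -3.0, nan], key=abs))  # -3.0

# With the assumed skip-NaN policy the order no longer matters.
print(abs_max_skipping_nan([nan, 1.5, -3.0]))  # -3.0
print(abs_max_skipping_nan([1.5, -3.0, nan]))  # -3.0
```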