Errors in MultiQC ("quantms" module) #186

hendrikweisser · 2022-03-28T12:08:19Z

I've encountered two errors in the "quantms" module for MultiQC during the "pmultiqc" step:

If the experimental design file ("--input" parameter) uses the one-table OpenMS format (see https://abibuilder.informatik.uni-tuebingen.de/archive/openms/Documentation/release/latest/html/classOpenMS_1_1ExperimentalDesign.html#details), I get the error below. The reason seems to be that the "Sample" column in the experimental design table is expected by "quantms", but is not used in the one-table format. (If I use the two-tables format, the error goes away.)

  Parsing out csv file...
  ╭──────────────── Oops! The 'quantms' MultiQC module broke... ─────────────────╮
  │ Please copy this log and report it at                                        │
  │ https://github.com/ewels/MultiQC/issues                                      │
  │ Please attach a file that triggers the error. The last file found was:       │
  │ ./proteomicslfq/out.mzTab                                                    │
  │                                                                              │
  │ Traceback (most recent call last):                                           │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     return self._engine.get_loc(casted_key)                                  │
  │   File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine │
  │   File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine │
  │   File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs │
  │   File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs │
  │ KeyError: 'Sample'                                                           │
  │                                                                              │
  │ The above exception was the direct cause of the following exception:         │
  │                                                                              │
  │ Traceback (most recent call last):                                           │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     output = mod()                                                           │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     self.parse_out_csv()                                                     │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     Sample = list(exp_data[exp_data['Spectra_Filepath'] == i]['Sample'])[0]  │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     indexer = self.columns.get_loc(key)                                      │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     raise KeyError(key) from err                                             │
  │ KeyError: 'Sample'                                                           │
  │                                                                              │
  ╰──────────────────────────────────────────────────────────────────────────────╯
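A minimal sketch of one way to tolerate a missing "Sample" column, assuming it is acceptable to fall back to a default when the one-table format is used. The helper names `read_design_rows` and `sample_for_file` and the example table contents are hypothetical and are not part of pmultiqc. The sketch uses plain dictionaries to stay dependency-free; with pandas the equivalent guard would be a check such as `"Sample" in exp_data.columns` before indexing.

```python
import csv
import io

# Made-up one-table design for illustration: there is no "Sample" column.
ONE_TABLE = (
    "Fraction_Group\tFraction\tSpectra_Filepath\tLabel\n"
    "1\t1\ta.mzML\t1\n"
    "2\t1\tb.mzML\t1\n"
)


def read_design_rows(text):
    """Parse a tab-separated design table into a list of dicts (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def sample_for_file(rows, spectra_path, default=None):
    """Return the 'Sample' value for a spectra file, or `default` if the
    column (or the file) is not present, instead of raising KeyError."""
    for row in rows:
        if row.get("Spectra_Filepath") == spectra_path:
            return row.get("Sample", default)
    return default


rows = read_design_rows(ONE_TABLE)
print(sample_for_file(rows, "a.mzML"))  # prints None rather than failing
```

Whether `None` is a sensible fallback depends on how the module uses the value downstream; that choice is an assumption of this sketch.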

If both Comet and MS-GF+ are used as search engines ("--search_engines comet,msgf"), with results combined using ConsensusID, I get the error below. The reason seems to be that "quantms" checks for the presence of "msgf" or "comet" in the names of input idXML files, but in my case the files are named "..._consensus_fdr_filter.idXML". As a consequence the mzML_name variable is not initialised in the Python code (see https://github.com/bigbio/pmultiqc/blob/main/pmultiqc/modules/quantms/quantms.py#L1154-L1175).

  Parsing 20220223d_JR_METTL1_SILAC_SST_01_consensus_fdr_filter.idXML...
  ╭──────────────── Oops! The 'quantms' MultiQC module broke... ─────────────────╮
  │ Please copy this log and report it at                                        │
  │ https://github.com/ewels/MultiQC/issues                                      │
  │ Please attach a file that triggers the error. The last file found was:       │
  │ ./proteomicslfq/out.mzTab                                                    │
  │                                                                              │
  │ Traceback (most recent call last):                                           │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     output = mod()                                                           │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     self.parse_mzml_idx()                                                    │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     mzml_table[mzML_name]['Final result of spectra'] = self.mL_spec_ident_fi │
  │ UnboundLocalError: local variable 'mzML_name' referenced before assignment   │
  │                                                                              │
  ╰──────────────────────────────────────────────────────────────────────────────╯
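A sketch of how the run name could be derived so that it is always defined, whichever tool produced the idXML file. The suffix list and the function name `run_name_from_idxml` are assumptions made for illustration; the actual naming scheme used by the pipeline and by pmultiqc may differ. Only the "_consensus_fdr_filter" file name is taken from the log above.

```python
import os

# Hypothetical list of suffixes appended to the run name by upstream steps.
KNOWN_SUFFIXES = ("_consensus_fdr_filter", "_comet", "_msgf")


def run_name_from_idxml(path):
    """Strip the directory, the extension and a known suffix from an idXML path.

    Falls back to the bare file stem, so the result is never undefined.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    for suffix in KNOWN_SUFFIXES:
        if stem.endswith(suffix):
            return stem[: -len(suffix)]
    return stem


name = run_name_from_idxml(
    "20220223d_JR_METTL1_SILAC_SST_01_consensus_fdr_filter.idXML"
)
print(name)  # 20220223d_JR_METTL1_SILAC_SST_01
```

The important part is the final `return stem`: an unrecognised file name yields a usable value instead of leaving the variable unassigned.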


hendrikweisser · 2022-03-29T10:31:03Z

One more problem related to the two-tables experimental design file:
If I edit the .tsv file in LibreOffice Calc and save it, every line with fewer columns than the first table is padded with tabs up to that table's width. (Not sure if Excel would do the same.) That includes the empty line between the tables, which afterwards contains four tab characters. This breaks "quantms", which looks for an exactly empty line (https://github.com/bigbio/pmultiqc/blob/main/pmultiqc/modules/quantms/quantms.py#L202):

  ╭──────────────── Oops! The 'quantms' MultiQC module broke... ─────────────────╮
  │ Please copy this log and report it at                                        │
  │ https://github.com/ewels/MultiQC/issues                                      │
  │ Please attach a file that triggers the error. The last file found was:       │
  │ ./proteomicslfq/out.mzTab                                                    │
  │                                                                              │
  │ Traceback (most recent call last):                                           │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     output = mod()                                                           │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     self.draw_exp_design()                                                   │
  │   File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-pa │
  │     empty_row = data.index('\n')                                             │
  │ ValueError: '\n' is not in list                                              │
  │                                                                              │
  ╰──────────────────────────────────────────────────────────────────────────────╯

I know technically the input file isn't in the correct format, but presumably the code could be made more robust by stripping whitespace when reading the file.
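The whitespace-stripping idea could look roughly like this. It is a sketch, not the module's code: `find_separator` is a hypothetical replacement for the `data.index('\n')` lookup shown in the traceback, and the example lines are invented.

```python
def find_separator(lines):
    """Return the index of the first line that is empty once whitespace
    (including tabs added by a spreadsheet program) is stripped."""
    for i, line in enumerate(lines):
        if not line.strip():
            return i
    raise ValueError("no blank separator line found between the two tables")


# Invented two-table design as saved by a spreadsheet: the separator line
# contains four tabs and the second table is padded with trailing tabs.
lines = [
    "Fraction_Group\tFraction\tSpectra_Filepath\tLabel\tSample\n",
    "1\t1\ta.mzML\t1\t1\n",
    "\t\t\t\t\n",
    "Sample\tMSstats_Condition\tMSstats_BioReplicate\t\t\n",
    "1\tctrl\t1\t\t\n",
]

print(find_separator(lines))  # 2
```

With these lines, `lines.index('\n')` raises the `ValueError` from the log, while the stripped comparison finds the separator.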

fabianegli · 2022-04-12T14:31:27Z

I see some other problems in that experimental design parsing, mostly related to sanity checks. For example, what if someone mistakenly has an empty line at the beginning of the file? Then the whole experimental design will not be read in. See https://github.com/bigbio/pmultiqc/blob/13208f07545a0bd67b329ad1f9ff3e3f728e7996/pmultiqc/modules/quantms/quantms.py#L201-L203
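One possible shape for such a sanity check, sketched under the assumption that a design file holds either one table or two tables separated by blank lines. The function name `split_design_tables` and the example input are hypothetical. Leading and trailing blank lines are ignored, and anything other than one or two tables is reported explicitly.

```python
from itertools import groupby


def split_design_tables(lines):
    """Group consecutive non-blank lines into tables.

    Trailing whitespace is removed from every line first, so lines that
    consist only of tabs count as blank. Note that this also drops empty
    trailing cells, which is an assumption of this sketch.
    """
    cleaned = (line.rstrip() for line in lines)
    tables = [list(group) for has_text, group in groupby(cleaned, key=bool) if has_text]
    if len(tables) not in (1, 2):
        raise ValueError(f"expected 1 or 2 tables, found {len(tables)}")
    return tables


# Invented input with a stray empty first line and a tab-only separator.
lines = ["\n", "A\tB\n", "1\t2\n", "\t\t\n", "S\tC\t\n", "1\tx\t\n"]
tables = split_design_tables(lines)
print(len(tables))  # 2
```

Grouping by blank versus non-blank lines avoids depending on the position of any single empty line, which is what makes a leading empty line harmless here.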

jpfeuffer mentioned this issue Mar 28, 2022

More errors from usage in proteomicslfq bigbio/pmultiqc#37

Closed

hendrikweisser mentioned this issue Apr 21, 2022

Problems with OpenMS-style experimental design input nf-core/quantms#19

Closed
