Generate an identification extension to track changes in taxonomic assignment #120

CecSve · 2024-08-15T08:32:42Z

Tool users supply a taxonomy file when the tool processes their data to generate a DwC-A. Ideally, the scientificName is a BIN, an SH, or similar, and Linnaean ranks can be included for further taxonomic identification.

Would it make sense to add a verbatimIdentification field, and perhaps an identificationRemarks field, to the generated archive, where the original identification (maybe already used in scientific publications) can be recorded? More fields might be relevant and could be packaged as an extension file, although the fields mentioned could also simply be added to the occurrence core file.

It could allow data users to track the changes in taxonomic identification.

thomasstjerne · 2024-08-15T08:51:30Z

Actually, verbatimIdentification and identificationRemarks are both in the default list of fields in the taxonomy mapping. People use the fields in slightly different ways, but verbatimIdentification is often used for the full taxonomy string retrieved from BLAST or whatever assignment tool you use, e.g. k__Stramenopila;p__Ochrophyta;c__Phaeophyceae;o__Fucales;f__Sargassaceae;g__Sargassum;s__Sargassum_sp
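For illustration only, here is a minimal Python sketch of how a taxonomy string in that format could be split into named ranks while the original string is kept untouched in verbatimIdentification. The function name `parse_taxonomy_string` and the prefix-to-rank mapping are assumptions made for this example, not part of the tool; the rank labels are plain labels rather than a claim about which Darwin Core terms the tool writes.

```python
# Hypothetical mapping from rank prefixes to rank labels (an assumption for this sketch).
RANK_PREFIXES = {
    "k": "kingdom",
    "p": "phylum",
    "c": "class",
    "o": "order",
    "f": "family",
    "g": "genus",
    "s": "species",
}


def parse_taxonomy_string(taxonomy: str) -> dict[str, str]:
    """Split a 'k__X;p__Y;...' string into a {rank: name} dictionary.

    Parts without a known prefix or without a name are skipped.
    """
    ranks: dict[str, str] = {}
    for part in taxonomy.split(";"):
        prefix, sep, name = part.strip().partition("__")
        if not sep or not name or prefix not in RANK_PREFIXES:
            continue
        ranks[RANK_PREFIXES[prefix]] = name
    return ranks


verbatim = (
    "k__Stramenopila;p__Ochrophyta;c__Phaeophyceae;o__Fucales;"
    "f__Sargassaceae;g__Sargassum;s__Sargassum_sp"
)
parsed = parse_taxonomy_string(verbatim)
# The original string stays available as the verbatimIdentification value.
record = {"verbatimIdentification": verbatim, **parsed}
```

Keeping the unparsed string alongside the parsed ranks means a later re-identification can overwrite the ranks without losing what the assignment tool originally returned.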

Also, any field in Occurrence Core or DNA Derived data can be added by a user even though they are not in the default list.

CecSve · 2024-08-15T09:02:33Z

Oh great - that makes sense. I was just wondering if the tool should automatically fill the verbatimIdentification field based on the input from the publisher? It could be used as the original identification to track changes.

CecSve · 2024-08-15T09:07:31Z

And the identificationRemarks could include information about the values and refDB if users opt to use the seqID tool to assign taxonomy, for example:

bitScore: 111 | expectValue: 4.03e-24 | queryCoverage: 100 | matchType: BLAST_EXACT_MATCH | queried against a 99% clustered version of the BOLD Public Database v2024-01-06 public data (COI-5P sequences)
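As a rough sketch of how such a remark could be assembled, assuming the match values are available as a simple dictionary: the function `build_identification_remarks` and the shape of its input are hypothetical and do not describe the seqID tool's actual output or API.

```python
# Order in which match values appear in the remark (taken from the example above).
REMARK_FIELDS = ["bitScore", "expectValue", "queryCoverage", "matchType"]


def build_identification_remarks(match: dict, reference_db: str) -> str:
    """Join the available match values and the reference database into one string."""
    parts = [f"{key}: {match[key]}" for key in REMARK_FIELDS if key in match]
    parts.append(f"queried against {reference_db}")
    return " | ".join(parts)


# Hypothetical input; expectValue is kept as a string to preserve its formatting.
example_match = {
    "bitScore": 111,
    "expectValue": "4.03e-24",
    "queryCoverage": 100,
    "matchType": "BLAST_EXACT_MATCH",
}
remarks = build_identification_remarks(
    example_match,
    "a 99% clustered version of the BOLD Public Database v2024-01-06 "
    "public data (COI-5P sequences)",
)
```

A fixed field order and separator would keep the remark machine-readable, so data users could split it back into its parts if needed.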

thomasstjerne · 2024-08-15T09:35:43Z

> And the identificationRemarks could include information about the values and refDB if users opt to use the seqID tool to assign taxonomy, for example:
>
> bitScore: 111 | expectValue: 4.03e-24 | queryCoverage: 100 | matchType: BLAST_EXACT_MATCH | queried against a 99% clustered version of the BOLD Public Database v2024-01-06 public data (COI-5P sequences)

Yes - exactly
