Create python notebook to explore processed NOM data #68

samobermiller · 2024-09-09T17:21:26Z

All Submissions:

[Y] Have you followed the guidelines in our Contributing document?
[Y] Have you checked to ensure there aren't other open Pull Requests for the same update/change?
[Y] Does your PR link to an issue?
[Y] Have you described the changes this PR will make?

New Notebook Submissions:

[Y] Have you included a summary of the notebook in the README.md, including updated links to the notebook?
[Y] Does your PR include links to the new notebook (in the branch) for review using nbviewer, Colab, and reviewnb? These three are the preferred ways to review changes and additions to notebooks during review.

review-notebook-app · 2024-09-09T17:21:32Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

review-notebook-app · 2024-09-10T20:19:15Z

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:14Z
----------------------------------------------------------------

Define NOM and all other acronyms at the start.

review-notebook-app · 2024-09-10T20:19:16Z

kheal commented on 2024-09-10T20:19:15Z
----------------------------------------------------------------

Line #8.    #import nom_functions.py in second_omics_type folder as module to utilize its functions

I think this points to the incorrect location now. Can you separate out this import to a second chunk and explain/point to the other notebook that describes these functions more fully?

review-notebook-app · 2024-09-10T20:19:16Z

kheal commented on 2024-09-10T20:19:16Z
----------------------------------------------------------------

Line #9.    import nom_functions as func

Can we import/save as nmdc_api or something similar? The functions are not NOM-specific.

review-notebook-app · 2024-09-10T20:19:17Z

kheal commented on 2024-09-10T20:19:17Z
----------------------------------------------------------------

For explicitness, can we change "object ID" to "processed_nom_id"?

review-notebook-app · 2024-09-10T20:19:18Z

kheal commented on 2024-09-10T20:19:17Z
----------------------------------------------------------------

ID should be id

review-notebook-app · 2024-09-10T20:19:19Z

kheal commented on 2024-09-10T20:19:18Z
----------------------------------------------------------------

Can you add a bit of narrative here about why/how you queried all the fields for this step? Then you can use that to say you'll use the Envo information on the biosamples to label each as a type of sample (which leads into the next step)

review-notebook-app · 2024-09-10T20:19:19Z

kheal commented on 2024-09-10T20:19:19Z
----------------------------------------------------------------

How about

"Create final data frame of relevant metadata and NMDC schema information for each NOM processed data object"

review-notebook-app · 2024-09-10T20:19:20Z

kheal commented on 2024-09-10T20:19:20Z
----------------------------------------------------------------

I would move this up one chunk (before the "Create final dataframe" step)

review-notebook-app · 2024-09-10T20:19:28Z

kheal commented on 2024-09-10T20:19:27Z
----------------------------------------------------------------

Since you're getting warnings with this, I would try to downselect until you no longer get warnings, or change your settings so warnings are not displayed. We don't want the warning messages in the notebook.
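
One way to keep warning output out of a rendered cell, if downselecting does not remove it, is Python's standard `warnings` module. This is a minimal sketch only; `noisy_step` is a hypothetical stand-in for whichever notebook call emits the warning, not a function from this PR.

```python
import warnings


def noisy_step():
    # Hypothetical stand-in for a plotting or clustering call that warns.
    warnings.warn("example warning", UserWarning)
    return "done"


def run_quietly(func):
    # Suppress warnings only while func runs, so other cells still show them.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return func()


result = run_quietly(noisy_step)
```

Scoping the filter to a single call, rather than silencing warnings for the whole notebook, keeps unrelated warnings visible during development.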

review-notebook-app · 2024-09-10T20:19:29Z

kheal commented on 2024-09-10T20:19:28Z
----------------------------------------------------------------

I would add a bit of narrative here about what a Van Krevelen plot is and how it is used. Nothing extensive - think Wikipedia-level (https://en.wikipedia.org/wiki/Van_Krevelen_diagram).
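
For reference, a Van Krevelen diagram places each assigned molecular formula by its hydrogen-to-carbon (H/C) ratio against its oxygen-to-carbon (O/C) ratio. Below is a minimal sketch of computing those coordinates, assuming formulas are plain strings such as `C6H12O6`; the `element_counts` and `van_krevelen_point` helpers are illustrative and are not part of `nom_functions.py`.

```python
import re


def element_counts(formula):
    # Parse a simple formula like "C6H12O6" into {"C": 6, "H": 12, "O": 6}.
    # Assumes no parentheses, charges, or isotope labels.
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def van_krevelen_point(formula):
    # Return (O/C, H/C), the x and y coordinates on a Van Krevelen diagram.
    counts = element_counts(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        raise ValueError(f"{formula} has no carbon; ratios are undefined")
    return counts.get("O", 0) / carbon, counts.get("H", 0) / carbon


oc, hc = van_krevelen_point("C6H12O6")
```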

review-notebook-app · 2024-09-10T20:19:29Z

kheal commented on 2024-09-10T20:19:29Z
----------------------------------------------------------------

For the marginal density plot, if the same O/C and H/C ratios occur in both soil and sand, how does that appear? Consider giving the overlapping formulas a separate color (black?), which would highlight formulas that were seen in only a subset of the sample types.
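
The suggestion above could be sketched as a labeling step before plotting: tag each formula by whether it appears in one sample type or in several, then map the shared tag to its own color. The sample types and formulas below are made-up placeholders, and `label_formulas` is a hypothetical helper rather than anything in the notebook.

```python
def label_formulas(formulas_by_type):
    # Map each formula to its sample type, or "shared" if seen in more than one.
    seen = {}
    for sample_type, formulas in formulas_by_type.items():
        for formula in set(formulas):
            seen.setdefault(formula, set()).add(sample_type)
    return {
        formula: next(iter(types)) if len(types) == 1 else "shared"
        for formula, types in seen.items()
    }


# Placeholder data, not taken from the notebook.
example = {
    "soil": ["C6H12O6", "C10H14O4"],
    "sand": ["C6H12O6", "C8H10O3"],
}
labels = label_formulas(example)
# A plotting palette could then reserve black for the "shared" label.
palette = {"soil": "tab:brown", "sand": "tab:orange", "shared": "black"}
```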

review-notebook-app · 2024-09-10T20:19:30Z

kheal commented on 2024-09-10T20:19:30Z
----------------------------------------------------------------

Line #9.    g.figure.subplots_adjust(top=0.8)

I would split this into two different chunks! One for the top plot, a second with the bottom.

review-notebook-app · 2024-09-10T20:19:31Z

kheal commented on 2024-09-10T20:19:30Z
----------------------------------------------------------------

I think these next parts are really cool and the visualizations look awesome, but they make the notebook very long, so I'd vote to cut them.

samobermiller · 2024-09-15T00:49:34Z

I thought the random state setting was the seed?

kheal · 2024-09-16T17:12:50Z

Aha! You're right :)
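
For readers following the thread: a fixed random state acts as the seed, so repeated runs draw the same values. The notebook's own call is not shown in this conversation, so this sketch uses the standard library's `random.Random` as a stand-in for a library's `random_state` argument; `sample_rows` is a hypothetical helper.

```python
import random


def sample_rows(rows, k, seed):
    # A dedicated generator seeded with `seed` gives the same sample each run.
    rng = random.Random(seed)
    return rng.sample(rows, k)


rows = list(range(100))
first = sample_rows(rows, 5, seed=42)
second = sample_rows(rows, 5, seed=42)
# first and second are identical because both generators start from seed 42.
```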

NOM_visualizations/python/nom_data.ipynb

samobermiller added 16 commits July 17, 2024 15:00

added python script containing functions, finished API calls, started…

537e56a

… analysis

not all changes pushed

1b95ec7

removal of duplicated nom url from analysis. averaging the two overla…

1669888

…p percentages for single value per nom comparison, visualized in heatmap

added code for heatmap and made up some sample sources as placeholder…

e2252ef

…s for goldids etc. removed processed noms from analysis that had same url

new heatmap, bea's sample types added in, md5_checksum for unique pro…

fd23e76

…cessed nom files, added multiple peak match check and confidence score choice

stats no longer on just test set

19204bb

issue with file editor when trying to merge

c1b0219

latest heatmap with quality control filtering

2d57efb

van krevlen plots, expanded comments and notebook blurbs

2908022

updated requirements file, updated visualizations, updated comments

0e70a62

rendered

1fa78da

rendering of jupyter notebook and putting everything into python subf…

0318c58

…older

axes labels and post bea comparison

982d1f3

ran through fully before generating nbviewer and google colab links

1ed5c28

added rendered links

0db2958

added reviewnb link

5aa08a1

samobermiller requested review from kheal and brynnz22 September 9, 2024 17:21

samobermiller self-assigned this Sep 9, 2024

samobermiller linked an issue Sep 9, 2024 that may be closed by this pull request

Create python notebook to explore processed data for a second 'omics data type #33

Closed

5 tasks

kheal requested changes Sep 10, 2024

View reviewed changes

NOM_visualizations/README.md (resolved)

NOM_visualizations/python/__pycache__/nom_functions.cpython-311.pyc (outdated, resolved)

samobermiller added 2 commits September 17, 2024 13:20

changes made based on kheal feedback

0d9b834

final .ipynb run through for rendering

0145aa0

brynnz22 reviewed Sep 17, 2024

View reviewed changes

samobermiller added 2 commits September 18, 2024 15:14

feedback incorporation from Brynn

809870a

retrigger checks

ac27aab

This comment was marked as resolved.

samobermiller added 2 commits September 18, 2024 17:33

updates after meeting today

60923f9

added rendered links

94b2258

samobermiller changed the title ~~33 create python notebook to explore processed data for a second omics data type (FEEDBACK REQUEST)~~ Create python notebook to explore processed NOM data Sep 19, 2024

samobermiller added 2 commits September 18, 2024 17:44

retrigger checks

c4569a5

manual kernel name adjustment

0b3546f

This comment was marked as resolved.

samobermiller added 6 commits September 19, 2024 15:48

kheal updates from 9/19/24

2ce2c36

gitignore for pycache (delete local, push, rerender, change kernel ag…

59c19f5

…ain, push

without fastcluster, heatmap appears to not produce warnings even wit…

d5c4028

…hout it now

updates for google colab .py module

0f8c318

no f'' allowed around url

543ab93

remove fastcluster from dependencies

6773e48

kheal approved these changes Sep 20, 2024

View reviewed changes

changed .py path to main

503ab73

samobermiller merged commit c578bdf into main Sep 20, 2024
2 checks passed

kheal mentioned this pull request Sep 26, 2024

Milestone - Sample Jupyter and RStudio notebooks available that highlight NMDC data and metadata (2.26) microbiomedata/issues#512

Closed

bmeluch deleted the 33-create-python-notebook-to-explore-processed-data-for-a-second-omics-data-type branch September 26, 2024 21:39
