Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create python notebook to explore processed NOM data #68

Conversation

samobermiller
Copy link
Collaborator

All Submissions:

  • [ Y] Have you followed the guidelines in our Contributing document?
  • [Y] Have you checked to ensure there aren't other open Pull Requests for the same update/change?
  • [Y] Does your PR link to an issue?
  • [ Y] Have you described the changes this PR will make?

New Notebook Submissions:

  • [ Y] Have you included a summary of the notebook in the README.md included updated links to the notebook?
  • [Y ] Does your PR include links to the new notebook (in the branch) for review using nbviewer, Colab, and reviewnb? These three are the preferred ways to review changes and additions to notebooks during review.

@samobermiller samobermiller self-assigned this Sep 9, 2024
@samobermiller samobermiller linked an issue Sep 9, 2024 that may be closed by this pull request
5 tasks
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:14Z
----------------------------------------------------------------

Define NOM and all other acronyms at the start.


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:15Z
----------------------------------------------------------------

Line #8.    #import nom_functions.py in second_omics_type folder as module to utilize its functions

I think this points to the incorrect location now. Can you separate out this import to a second chunk and explain/point to the other notebook that describes these functions more fully?


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:16Z
----------------------------------------------------------------

Line #9.    import nom_functions as func

Can we import/save as nmdc_api or something similar? The functions are not NOM-specific.


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:17Z
----------------------------------------------------------------

For explicitness, can we change " object ID" to " processed_nom_id "?


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:17Z
----------------------------------------------------------------

ID should be id


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:18Z
----------------------------------------------------------------

Can you add a bit of narrative here about why/how you queried all the fields for this step? Then you can use that to say you'll use the Envo information on the biosamples to label each as a type of sample (which leads into the next step)


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:19Z
----------------------------------------------------------------

How about

"Create final data frame of relevant metadata and NMDC schema information for each NOM processed data object"


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:20Z
----------------------------------------------------------------

I would move this up one chunk (before the "Create final dataframe" step)


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:27Z
----------------------------------------------------------------

Since you're getting warnings with this, I would try to downselect until you do not get warnings or set your settings so warnings are not displayed. We don't want the warnings messages in the notebook.


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:28Z
----------------------------------------------------------------

I would add a bit of narrative here about what a Van Krevelen plot is and how they are used. Nothing extensive - think wikipedia-level (https://en.wikipedia.org/wiki/Van_Krevelen_diagram).


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:29Z
----------------------------------------------------------------

For the Marginal Density Plot, if the OC/HC ratio is in both soil and sand, how does that appear? Consider making the overlapping formula a separate color (black?) which would highlight formula that were only seen in a subset of the sample types.


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:30Z
----------------------------------------------------------------

Line #9.    g.figure.subplots_adjust(top=0.8)

I would split this into two different chunks! One for the top plot, a second with the bottom.


Copy link

review-notebook-app bot commented Sep 10, 2024

View / edit / reply to this conversation on ReviewNB

kheal commented on 2024-09-10T20:19:30Z
----------------------------------------------------------------

I think these next parts are really cool and the visualizations look awesome, but they make the notebook very long, so I'd vote to cut them.


Copy link
Collaborator Author

I thought the random state setting was the seed?


View entire conversation on ReviewNB

Copy link
Collaborator

kheal commented Sep 16, 2024

Aha! You're right :)


View entire conversation on ReviewNB

@kheal

This comment was marked as resolved.

@samobermiller samobermiller changed the title 33 create python notebook to explore processed data for a second omics data type (FEEDBACK REQUEST) Create python notebook to explore processed NOM data Sep 19, 2024
kheal

This comment was marked as resolved.

@samobermiller samobermiller merged commit c578bdf into main Sep 20, 2024
2 checks passed
@bmeluch bmeluch deleted the 33-create-python-notebook-to-explore-processed-data-for-a-second-omics-data-type branch September 26, 2024 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create python notebook to explore processed data for a second 'omics data type
3 participants