Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functionality to pull sample type #129

Merged
merged 13 commits into from
May 8, 2024
Merged

Add functionality to pull sample type #129

merged 13 commits into from
May 8, 2024

Conversation

chantelwetzel-noaa
Copy link
Contributor

These changes do the following:

  1. Add a function argument to pull_catch and PullCatch.fn that allows users to pull all partition sample type data. The argument documentation notes that this primarily pertains to Pacific hake.
  2. Create a new function that allows users to combine catch data from the same tow which can occur if there are multiple partition sample types on a single tow. The pull_catch function adds a warning comment but does not augment the data.
  3. Add tests for pulling data with all partition sample types.

Chantel Wetzel added 7 commits April 30, 2024 16:35
Function that can be used if sample types have been used and multiple records are returned for the same trawl_id.  This function would be up to the user to apply but a message has been added to the pull_catch function that prints if trawl_ids are not uniuqe.
@chantelwetzel-noaa
Copy link
Contributor Author

@kellijohnson-NOAA I wanted to check in to make sure you received an email for review. This pull request handles pulling all sample-type data that is particularly specific to Pacific hake. Let me know if you have questions or do not have time for review.

Copy link
Contributor

@kellijohnson-NOAA kellijohnson-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay @chantelwetzel-noaa. I made some rough comments and would like to review it again after you respond. Two major things, I think that there should be more test written and that there should not be changes made to the deprecated function.

R/PullCatch.fn.R Outdated Show resolved Hide resolved
tests/testthat/test-data.R Outdated Show resolved Hide resolved
tests/testthat/test-data.R Outdated Show resolved Hide resolved
R/pull_catch.R Outdated Show resolved Hide resolved
R/combine_tows.R Outdated Show resolved Hide resolved
R/combine_tows.R Outdated Show resolved Hide resolved
R/combine_tows.R Outdated Show resolved Hide resolved
R/combine_tows.R Outdated Show resolved Hide resolved
man-roxygen/sample_types.R Outdated Show resolved Hide resolved
tests/testthat/test-data.R Show resolved Hide resolved
Chantel Wetzel added 5 commits May 7, 2024 10:32
Removed the set.seed that was used throughout this function when these tests were created.
1. Remove partition samples that should not be used for index creation.
2. Add code to save the combined tows data
3. Add spaces for code style
4. Revise function documentation
@chantelwetzel-noaa
Copy link
Contributor Author

@kellijohnson-NOAA I think I have addressed all the items that you identified in your initial review. I also added code to the combine_tows function to pull out samples that should not be used for index creation and code to save the modified data frame if dir != NULL.

@kellijohnson-NOAA
Copy link
Contributor

Great, thank you so much @chantelwetzel-noaa for being so speedy. I will start this tonight but I probably won't have time to finish. But, tomorrow I have time.

R/combine_tows.R Outdated

find <- grep("trawl_id", colnames(data), ignore.case = TRUE)
n_id <- table(data[, find])
if (sum(n_id == 2) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there ever be an instance where there are three or more? I think that a more robust test would be

if (all(n_id == 1)) {

R/pull_catch.R Outdated
@@ -262,17 +259,25 @@ pull_catch <- function(common_name = NULL,
catch$cpue_kg_km2 <- catch$cpue_kg_per_ha_der * 100
colnames(catch)[which(colnames(catch) == "area_swept_ha_der")] <- "area_swept_ha"

if(sum(c("Life Stage", "Size") %in% sample_types) == 2) {
n_id <- table(catch$trawl_id)
if(sum(n_id == 2) > 0){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above regarding a more robust test

if (any(n_id > 1)) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a great suggestion. Thank you!

Copy link
Contributor

@kellijohnson-NOAA kellijohnson-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made two additional comments but you can merge the PR in without addressing them if you want.

@chantelwetzel-noaa chantelwetzel-noaa merged commit 7e54dcf into main May 8, 2024
8 checks passed
@chantelwetzel-noaa chantelwetzel-noaa deleted the sample_type branch May 8, 2024 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants