For steps applied several times, personalized brief is only accessible for the first column #478

Open
mayeulk opened this issue May 9, 2023 · 2 comments · May be fixed by #564

Comments

@mayeulk
Contributor

mayeulk commented May 9, 2023

When a validation step is applied to several columns, the personalized brief is only accessible for the first column. The steps generated for the remaining columns have an NA brief.

library(pointblank)

c_agent <- create_agent( tbl= small_table) %>%
  col_vals_regex(columns = vars(b, f), 
                 #col_vals_regex(columns = vars(gnad_fin_url), # CORRECT          
                 regex = "[a-z]",
                 step_id = "my_regex",
                 label = "This is my regex",
                 brief = "The current field must have at least one letter"
  )  %>%
  interrogate()

c_agent # the tooltip over "This is my regex" (first column of the report table) is only visible for the first step of the series

c_agent$validation_set$step_id[1]
c_agent$validation_set$label[1]
c_agent$validation_set$brief[1] # not NA (this is correct)

c_agent$validation_set$step_id[2]
c_agent$validation_set$label[2]
c_agent$validation_set$brief[2] # NA (this is the bug)

get_agent_x_list(c_agent)$briefs # second value is NA (same bug)
c_agent$validation_set$brief # second value is NA (same bug)

c_agent$validation_set$label # No NAs

@mayeulk
Contributor Author

mayeulk commented May 9, 2023

The following is a short extended example with three groups of rules, plus a temporary workaround for the bug:

library(pointblank)

c_agent <- create_agent( tbl= small_table) %>%
  col_vals_regex(columns = vars(b, f), 
                 #col_vals_regex(columns = vars(gnad_fin_url), # CORRECT          
                 regex = "[a-z]",
                 step_id = "my_regex",
                 label = "This is my regex",
                 brief = "The current field must have at least one letter"
  )  %>%
  col_vals_regex(columns = vars(b), 
                 #col_vals_regex(columns = vars(gnad_fin_url), # CORRECT          
                 regex = "\\d",
                 step_id = "anoter_regex_1_column",
                 label = "Another regex for just 1 column",
                 brief = "The b field must have at least one number"
  )  %>%
  col_vals_regex(columns = vars(b, f), 
                 #col_vals_regex(columns = vars(gnad_fin_url), # CORRECT          
                 regex = "\\d",
                 step_id = "anoter_regex",
                 label = "Another regex",
                 brief = "The current field must have at least one number"
  )  %>%
  interrogate()

c_agent # the tooltip over "This is my regex" (first column of the report table) is only visible for the first step of the series

c_agent$validation_set$step_id[1]
c_agent$validation_set$label[1]
c_agent$validation_set$brief[1] # not NA (this is correct)

c_agent$validation_set$step_id[2]
c_agent$validation_set$label[2]
c_agent$validation_set$brief[2] # NA (this is the bug)

get_agent_x_list(c_agent)$briefs # the second value is NA in each step with several "sub-steps" (same bug)
c_agent$validation_set$brief # same bug

c_agent$validation_set$label # No NAs

# Start of the workaround
steps_labels_briefs <- data.frame(
  step_id = c_agent$validation_set$step_id,
  label = c_agent$validation_set$label,
  old_brief = c_agent$validation_set$brief,
  new_brief = NA,
  stringsAsFactors = FALSE # keep briefs as character on R < 4.0
)

# Group name of each step: the step_id without its ".NNNN" suffix
steps_labels_briefs$step_group <- 
  sub(pattern = "^(.+)(\\.\\d{4})$", replacement = "\\1", steps_labels_briefs$step_id, perl = TRUE)

# First row (step) of each group of steps
steps_labels_briefs_0001 <- 
  steps_labels_briefs[grepl(pattern = "^(.+)(\\.0001)$", steps_labels_briefs$step_id),]

# Assign the correct brief value (workaround)
steps_labels_briefs$new_brief <- steps_labels_briefs$old_brief
for (c_step_group in steps_labels_briefs_0001$step_group){
  steps_labels_briefs$new_brief[steps_labels_briefs$step_group == c_step_group] <-
    steps_labels_briefs_0001$old_brief[steps_labels_briefs_0001$step_group  == c_step_group]
}
c_agent$validation_set$brief <- steps_labels_briefs$new_brief

# Check that the workaround worked
get_agent_x_list(c_agent)$briefs # correct
c_agent$validation_set$brief #  correct (side note: much faster than previous line of code)
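
A more compact variant of the same workaround, as a minimal base R sketch. It does not call pointblank: `vs` is a hypothetical stand-in for `c_agent$validation_set`, and the step IDs in it are assumptions based on the `.0001` suffix pattern that the workaround above relies on (the unsuffixed ID for the single-column step is also assumed).

```r
# Hypothetical stand-in for c_agent$validation_set (only the columns used here).
vs <- data.frame(
  step_id = c("my_regex.0001", "my_regex.0002",
              "anoter_regex_1_column",
              "anoter_regex.0001", "anoter_regex.0002"),
  brief = c("The current field must have at least one letter", NA,
            "The b field must have at least one number",
            "The current field must have at least one number", NA),
  stringsAsFactors = FALSE
)

# Strip the assumed four-digit suffix to get the group each step belongs to.
step_group <- sub("[.][0-9]{4}$", "", vs$step_id)

# For every row, the index of the first row of its group.
first_row <- match(step_group, step_group)

# Copy the brief of the first step of each group to all steps of that group.
vs$brief <- vs$brief[first_row]
```

With a real agent, the same three lines would presumably be applied to `c_agent$validation_set$step_id` and `c_agent$validation_set$brief` instead of `vs`.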

@rich-iannone
Member

Thanks for catching this. This can definitely be fixed!
