table_wide fails with more complex error structures in aov objects #556

Open
IndrajeetPatil opened this issue Jul 16, 2021 · 8 comments · May be fixed by #1061
Labels
Bug 🐛 Something isn't working

Comments

@IndrajeetPatil
Member

simple error structure

set.seed(123)
library(parameters)
options(error = traceback)

# simple error structure
mod1 <- aov(yield ~ N * P * K + Error(block), npk)

# works
model_parameters(
  mod1,
  eta_squared = "partial",
  ci = 0.95,
  table_wide = TRUE
)
#> # block
#> 
#> Parameter |    F | df | df (error) |     p | Sum_Squares | Mean_Square | Eta2 (partial) |  Eta2 95% CI | Sum_Squares_Error | Mean_Square_Error
#> ----------------------------------------------------------------------------------------------------------------------------------------------
#> N:P:K     | 0.48 |  1 |          4 | 0.525 |       37.00 |       37.00 |           0.11 | [0.00, 0.64] |            306.29 |            306.29
#> 
#> # Within
#> 
#> Parameter |     F | df | df (error) |     p | Sum_Squares | Mean_Square | Eta2 (partial) |  Eta2 95% CI | Sum_Squares_Error | Mean_Square_Error
#> -----------------------------------------------------------------------------------------------------------------------------------------------
#> N         | 12.26 |  1 |         12 | 0.004 |      189.28 |      189.28 |           0.51 | [0.08, 0.74] |            185.29 |            185.29
#> P         |  0.54 |  1 |         12 | 0.475 |        8.40 |        8.40 |           0.04 | [0.00, 0.38] |            185.29 |            185.29
#> K         |  6.17 |  1 |         12 | 0.029 |       95.20 |       95.20 |           0.34 | [0.00, 0.64] |            185.29 |            185.29
#> N:P       |  1.38 |  1 |         12 | 0.263 |       21.28 |       21.28 |           0.10 | [0.00, 0.45] |            185.29 |            185.29
#> N:K       |  2.15 |  1 |         12 | 0.169 |       33.14 |       33.14 |           0.15 | [0.00, 0.50] |            185.29 |            185.29
#> P:K       |  0.03 |  1 |         12 | 0.863 |        0.48 |        0.48 |       2.59e-03 | [0.00, 0.22] |            185.29 |            185.29
#> 
#> Anova Table (Type 1 tests)

more complex error structure

library(ggstatsplot) # for data

mod2 <- stats::aov(
  formula = value ~ attribute * measure + Error(id / (attribute * measure)),
  data = iris_long
)

# {parameters} does return output
model_parameters(
  mod2,
  eta_squared = "partial",
  ci = 0.95
)
#> # id
#> 
#> Parameter | Sum_Squares | df | Mean_Square
#> ------------------------------------------
#> Residuals |      264.01 |  1 |      264.01
#> 
#> # id:attribute
#> 
#> Parameter | Sum_Squares | df | Mean_Square
#> ------------------------------------------
#> attribute |      237.25 |  1 |      237.25
#> 
#> # id:measure
#> 
#> Parameter | Sum_Squares | df | Mean_Square
#> ------------------------------------------
#> measure   |     1113.81 |  1 |     1113.81
#> 
#> # id:attribute:measure
#> 
#> Parameter         | Sum_Squares | df | Mean_Square
#> --------------------------------------------------
#> attribute:measure |        0.80 |  1 |        0.80
#> 
#> # Within
#> 
#> Parameter         | Sum_Squares |  df | Mean_Square |       F |      p | Eta2 (partial) |  Eta2 95% CI
#> ------------------------------------------------------------------------------------------------------
#> attribute         |      470.08 |   1 |      470.08 | 1445.99 | < .001 |           0.71 | [0.68, 0.74]
#> measure           |       57.69 |   1 |       57.69 |  177.46 | < .001 |           0.23 | [0.18, 0.29]
#> attribute:measure |        1.54 |   1 |        1.54 |    4.72 | 0.030  |       7.92e-03 | [0.00, 0.03]
#> Residuals         |      192.46 | 592 |        0.33 |         |        |                |             
#> 
#> Anova Table (Type 1 tests)

# but pivoting it to wider doesn't work
model_parameters(
  mod2,
  eta_squared = "partial",
  ci = 0.95,
  table_wide = TRUE
)
#> Error in `$<-.data.frame`(`*tmp*`, "df_error", value = numeric(0)): replacement has 0 rows, data has 1

Created on 2021-07-16 by the reprex package (v2.0.0)
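The error above can be reproduced in isolation, which may help narrow down the cause. The sketch below rests on an assumption that is not confirmed from the {parameters} source: that the wide conversion looks up the df of a "Residuals" row within each stratum, and that strata such as id:attribute have no such row. The data frame `stratum` is typed in by hand from the output above.

```r
# Hand-typed copy of the "id:attribute" stratum shown above (no Residuals row).
stratum <- data.frame(
  Parameter = "attribute",
  Sum_Squares = 237.25,
  df = 1,
  Mean_Square = 237.25
)

# Looking up the residual df yields a zero-length vector for this stratum.
resid_df <- stratum$df[stratum$Parameter == "Residuals"]
length(resid_df)

# Assigning it as a new column fails in the same way as the reported error.
msg <- tryCatch(
  {
    stratum$df_error <- resid_df
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

# One possible guard (an assumption, not the package's fix):
# only add the column when exactly one residual row exists.
if (length(resid_df) == 1L) {
  stratum$df_error <- resid_df
}
names(stratum)
```

With the guard in place, strata without a residual row are simply left in their long form instead of raising an error.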

@IndrajeetPatil IndrajeetPatil added the Bug 🐛 Something isn't working label Jul 16, 2021
@bwiernik
Contributor

bwiernik commented Aug 9, 2021

Do we just want to not support table_wide in these cases?

@IndrajeetPatil
Member Author

In the tidyverse version of the function I had posted here, it worked.

library(magrittr)

pretty_aov <- function(x) {
  df <- parameters::model_parameters(x, eta_squared = "partial")
  
  if (class(x)[[1]] %in% c("aov", "aovlist", "anova", "Gam", "manova", "maov")) {
    # creating numerator and denominator degrees of freedom
    if (dim(dplyr::filter(df, Parameter == "Residuals"))[[1]] > 0L) {
      df$df_error <- df$df[nrow(df)]
      df$Sum_Squares_Error <- df$Sum_Squares[nrow(df)]
    }
  }
  
  # final cleanup
  dplyr::filter(df, !is.na(p)) %>%
    dplyr::select(Parameter, Sum_Squares, Sum_Squares_Error, df, df_error, dplyr::everything())
}

set.seed(123)
mod1 <- aov(yield ~ N * P * K + Error(block), npk)

pretty_aov(mod1)
#> # block
#> 
#> Parameter | Sum_Squares | Sum_Squares_Error | df | df (error) | Mean_Square |    F |     p | Eta2 (partial)
#> -----------------------------------------------------------------------------------------------------------
#> N:P:K     |       37.00 |            185.29 |  1 |         12 |       37.00 | 0.48 | 0.525 |           0.11
#> 
#> # Within
#> 
#> Parameter | Sum_Squares | Sum_Squares_Error | df | df (error) | Mean_Square |     F |     p | Eta2 (partial)
#> ------------------------------------------------------------------------------------------------------------
#> N         |      189.28 |            185.29 |  1 |         12 |      189.28 | 12.26 | 0.004 |           0.51
#> P         |        8.40 |            185.29 |  1 |         12 |        8.40 |  0.54 | 0.475 |           0.04
#> K         |       95.20 |            185.29 |  1 |         12 |       95.20 |  6.17 | 0.029 |           0.34
#> N:P       |       21.28 |            185.29 |  1 |         12 |       21.28 |  1.38 | 0.263 |           0.10
#> N:K       |       33.14 |            185.29 |  1 |         12 |       33.14 |  2.15 | 0.169 |           0.15
#> P:K       |        0.48 |            185.29 |  1 |         12 |        0.48 |  0.03 | 0.863 |       2.59e-03

Created on 2021-08-11 by the reprex package (v2.0.1)

But the base-R version doesn't handle this case. We just need to find a way to make the base-R implementation work as well.
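As a rough idea of what a base-R version could look like, here is a minimal sketch that widens each stratum separately. `widen_stratum()` is a hypothetical helper, not a function from {parameters}, and the `long` data frame (including its `Group` column) is typed in by hand from the long-format npk output rather than produced by `model_parameters()`.

```r
# Hypothetical helper: move the Residuals row of one stratum into *_Error columns.
widen_stratum <- function(tab) {
  is_resid <- tab$Parameter == "Residuals"
  out <- tab[!is_resid, , drop = FALSE]
  # Only widen when there is exactly one residual row and at least one effect row;
  # otherwise return the effect rows unchanged instead of failing.
  if (sum(is_resid) == 1L && nrow(out) > 0L) {
    out$df_error <- tab$df[is_resid]
    out$Sum_Squares_Error <- tab$Sum_Squares[is_resid]
    out$Mean_Square_Error <- tab$Mean_Square[is_resid]
  }
  rownames(out) <- NULL
  out
}

# Hand-typed long table for aov(yield ~ N * P * K + Error(block), npk).
long <- data.frame(
  Group = c("block", "block", rep("Within", 7)),
  Parameter = c("N:P:K", "Residuals", "N", "P", "K", "N:P", "N:K", "P:K", "Residuals"),
  Sum_Squares = c(37.00, 306.29, 189.28, 8.40, 95.20, 21.28, 33.14, 0.48, 185.29),
  df = c(1, 4, 1, 1, 1, 1, 1, 1, 12),
  Mean_Square = c(37.00, 76.57, 189.28, 8.40, 95.20, 21.28, 33.14, 0.48, 15.44)
)

# Widen each error stratum on its own, so every stratum uses its own residuals.
wide <- lapply(split(long, long$Group), widen_stratum)
wide$block
wide$Within

# A stratum without a Residuals row passes through without error.
no_resid <- data.frame(
  Group = "id:attribute", Parameter = "attribute",
  Sum_Squares = 237.25, df = 1, Mean_Square = 237.25
)
widen_stratum(no_resid)
```

Splitting by stratum first means the block row picks up the block residuals (df = 4) rather than the last row of the whole table, and strata that only contain a single effect row are returned as they are.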

@strengejacke
Member

I'm not sure the conversion to wide format works properly at all:

mod1 <- aov(yield ~ N * P * K + Error(block), npk)

parameters::model_parameters(
  mod1,
  eta_squared = "partial",
  ci = 0.95,
  table_wide = FALSE
)
#> # block
#> 
#> Parameter | Sum_Squares | df | Mean_Square |    F |     p
#> ---------------------------------------------------------
#> N:P:K     |       37.00 |  1 |       37.00 | 0.48 | 0.525
#> Residuals |      306.29 |  4 |       76.57 |      |      
#> 
#> # Within
#> 
#> Parameter | Sum_Squares | df | Mean_Square |     F |     p
#> ----------------------------------------------------------
#> N         |      189.28 |  1 |      189.28 | 12.26 | 0.004
#> P         |        8.40 |  1 |        8.40 |  0.54 | 0.475
#> K         |       95.20 |  1 |       95.20 |  6.17 | 0.029
#> N:P       |       21.28 |  1 |       21.28 |  1.38 | 0.263
#> N:K       |       33.14 |  1 |       33.14 |  2.15 | 0.169
#> P:K       |        0.48 |  1 |        0.48 |  0.03 | 0.863
#> Residuals |      185.29 | 12 |       15.44 |       |      
#> 
#> Anova Table (Type 1 tests)

parameters::model_parameters(
  mod1,
  eta_squared = "partial",
  ci = 0.95,
  table_wide = TRUE
)
#> # block
#> 
#> Parameter |    F | df | df (error) |     p | Sum_Squares | Mean_Square
#> ----------------------------------------------------------------------
#> N:P:K     | 0.48 |  1 |          4 | 0.525 |       37.00 |       37.00
#> 
#> Parameter | Sum_Squares_Error | Mean_Square_Error
#> -------------------------------------------------
#> N:P:K     |            306.29 |            306.29
#> 
#> # Within
#> 
#> Parameter |     F | df | df (error) |     p | Sum_Squares | Mean_Square
#> -----------------------------------------------------------------------
#> N         | 12.26 |  1 |         12 | 0.004 |      189.28 |      189.28
#> P         |  0.54 |  1 |         12 | 0.475 |        8.40 |        8.40
#> K         |  6.17 |  1 |         12 | 0.029 |       95.20 |       95.20
#> N:P       |  1.38 |  1 |         12 | 0.263 |       21.28 |       21.28
#> N:K       |  2.15 |  1 |         12 | 0.169 |       33.14 |       33.14
#> P:K       |  0.03 |  1 |         12 | 0.863 |        0.48 |        0.48
#> 
#> Parameter | Sum_Squares_Error | Mean_Square_Error
#> -------------------------------------------------
#> N         |            185.29 |            185.29
#> P         |            185.29 |            185.29
#> K         |            185.29 |            185.29
#> N:P       |            185.29 |            185.29
#> N:K       |            185.29 |            185.29
#> P:K       |            185.29 |            185.29
#> 
#> Anova Table (Type 1 tests)

Created on 2025-01-07 with reprex v2.1.1

Shouldn't mean square error be 15.44 in the wide table?
And considering that the table_wide argument just removes the last row and adds it as extra columns, isn't that an almost redundant feature? I would also vote for removing it.

@mattansb you often use anova, wdyt?
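The expected values can be checked by hand from the long table above. This is plain arithmetic on the printed (rounded) numbers, so it only needs base R; the variable names are made up for illustration.

```r
# Residual rows from the long table above: sum of squares and df per stratum.
ss_error <- c(block = 306.29, Within = 185.29)
df_error <- c(block = 4, Within = 12)

# Mean square error = residual sum of squares / residual df.
ms_error <- ss_error / df_error
round(ms_error, 2)

# Cross-check against the reported F values: F = Mean_Square / Mean_Square_Error.
f_block <- round(37.00 / ms_error[["block"]], 2)     # N:P:K in the block stratum
f_within <- round(189.28 / ms_error[["Within"]], 2)  # N in the Within stratum
c(f_block, f_within)
```

This gives 76.57 and 15.44 for the two strata, matching the Residuals rows of the long table, and the resulting F values (0.48 and 12.26) agree with the printed ones. The wide table instead repeats the sum of squares (306.29 and 185.29) in the Mean_Square_Error column.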

mattansb added a commit that referenced this issue Jan 8, 2025
@mattansb
Member

mattansb commented Jan 8, 2025

What looks wrong to you? Seems fine to me (made the window wide to avoid the row snipping):

mod1 <- aov(yield ~ N * P * K + Error(block), npk)

parameters::model_parameters(
  mod1,
  eta_squared = "partial",
  ci = 0.95,
  table_wide = TRUE
)
#> # block 
#> 
#> Parameter |    F | df | df (error) |     p | Sum_Squares | Mean_Square | Sum_Squares_Error | Mean_Square_Error
#> --------------------------------------------------------------------------------------------------------------
#> N:P:K     | 0.48 |  1 |          4 | 0.525 |       37.00 |       37.00 |            306.29 |            306.29
#> 
#> # Within 
#> 
#> Parameter |     F | df | df (error) |     p | Sum_Squares | Mean_Square | Sum_Squares_Error | Mean_Square_Error
#> ---------------------------------------------------------------------------------------------------------------
#> N         | 12.26 |  1 |         12 | 0.004 |      189.28 |      189.28 |            185.29 |            185.29
#> P         |  0.54 |  1 |         12 | 0.475 |        8.40 |        8.40 |            185.29 |            185.29
#> K         |  6.17 |  1 |         12 | 0.029 |       95.20 |       95.20 |            185.29 |            185.29
#> N:P       |  1.38 |  1 |         12 | 0.263 |       21.28 |       21.28 |            185.29 |            185.29
#> N:K       |  2.15 |  1 |         12 | 0.169 |       33.14 |       33.14 |            185.29 |            185.29
#> P:K       |  0.03 |  1 |         12 | 0.863 |        0.48 |        0.48 |            185.29 |            185.29
#> 
#> Anova Table (Type 1 tests)

Anyway, I've fixed the issue in #1061:

iris_long <-
  iris |> 
  datawizard::data_modify(id = 1:length(Species)) |> 
  datawizard::data_to_long(select = colnames(iris)[1:4]) |> 
  datawizard::data_separate(select = "name", separator = "\\.",
                            new_columns = c("attribute", "measure"))


mod1 <- stats::aov(
  formula = value ~ attribute * measure + Error(id),
  data = iris_long
)

parameters::model_parameters(
  mod1,
  eta_squared = "partial",
  ci = 0.95,
  table_wide = TRUE
)
#> # Fixed Effects
#> 
#> Parameter         |       F | df | df (error) |      p |  Group | Sum_Squares | Mean_Square | Sum_Squares_Error | Mean_Square_Error
#> -----------------------------------------------------------------------------------------------------------------------------------
#> attribute         |  831.32 |  1 |        595 | < .001 | Within |      583.12 |      583.12 |            417.36 |            417.36
#> measure           | 1527.15 |  1 |        595 | < .001 | Within |     1071.20 |     1071.20 |            417.36 |            417.36
#> attribute:measure |    2.76 |  1 |        595 | 0.097  | Within |        1.94 |        1.94 |            417.36 |            417.36
#> 
#> Anova Table (Type 1 tests)


mod2 <- stats::aov(
  formula = value ~ attribute * measure + Error(id / (attribute * measure)),
  data = iris_long
)

parameters::model_parameters(
  mod2,
  eta_squared = "partial",
  ci = 0.95,
  table_wide = TRUE
)
#> # id:attribute 
#> 
#> Parameter | df | Sum_Squares | Mean_Square
#> ------------------------------------------
#> attribute |  1 |      237.25 |      237.25
#> 
#> # id:attribute:measure 
#> 
#> Parameter         | df | Sum_Squares | Mean_Square
#> --------------------------------------------------
#> attribute:measure |  1 |        0.80 |        0.80
#> 
#> # id:measure 
#> 
#> Parameter | df | Sum_Squares | Mean_Square
#> ------------------------------------------
#> measure   |  1 |     1113.81 |     1113.81
#> 
#> # Within 
#> 
#> Parameter         |       F | df | df (error) |      p | Sum_Squares | Mean_Square | Sum_Squares_Error | Mean_Square_Error
#> --------------------------------------------------------------------------------------------------------------------------
#> attribute         | 1445.99 |  1 |        592 | < .001 |      470.08 |      470.08 |            192.46 |            192.46
#> measure           |  177.46 |  1 |        592 | < .001 |       57.69 |       57.69 |            192.46 |            192.46
#> attribute:measure |    4.72 |  1 |        592 | 0.030  |        1.54 |        1.54 |            192.46 |            192.46
#> 
#> Anova Table (Type 1 tests)

@strengejacke
Member

What looks wrong to you?

The values in the column Mean_Square_Error.

@strengejacke strengejacke linked a pull request Jan 8, 2025 that will close this issue
@mattansb
Member

mattansb commented Jan 8, 2025

In these cases it should be equal to Sum_Squares because we are dividing by df=1 🤷

@strengejacke
Member

Ok, I thought table_wide was just a reshaping of the table, not a new calculation.

@mattansb
Member

mattansb commented Jan 8, 2025

Oops, sharp eye!
