text_short() etc. and current API thoughts? #153

bwiernik · 2021-04-09T02:34:53Z

It's not exactly clear to me what the current status of text_short() etc. are. They still appear in the pkgdown site, but are commented out in various places in the source. Has that API been abandoned?

One API thought I had looking at the various functions is that format() might be a good generic to use here because it is so extensible. Table/text/numbers only, as well as choosing specific parameters or quantities could controlled with arguments. Not saying to discard other existing approaches, but might be useful in lieu of an ever expanding list of specific functions?

The text was updated successfully, but these errors were encountered:

DominiqueMakowski · 2021-04-09T03:45:08Z

text_short was rather deprecated in favour of the summary() generic report() %>% summary(). Where you would see format fitting? In lieu of summary?

One nice API in theory would be to have report() and then pipe it to format(interpet_p = FALSE, display_intercept = TRUE, show_this = TRUE) that would format the report according to specifications, but that seems impossible as there's hardly any modularity possible at the report level, currently the workaround for the summary() is to have both versions (long and short) prepared upfront by report() and then summary() just extracts the short version but doesn't do anything in itself.

Basically, one of the issue with report (and why it had several entire rewrites, see the discussion around the last one here) is the balance between modularity and specificity. In some cases, it would be easier to just have a report() function for each model that does everything specifically for it, but it's also hard to test and a leads to lot of semi-redundant and non-optimized code. But full-modularity with pieces that can be manipulated afterwards seems also impossible.

So currently we split the reports into several submodules (report_parameters, report_intercept, report_performance, etc.) [that we also export] which seem to make sense for a lot of models and share a lot of commonalities, but once the main report() assembles them that's it it's hard to preserve the piece-wise information. In fact, going back to the issue, each of this piece also makes the long and "summarized" version, which is arguably not the most efficient way of doing it (even though the pieces are relatively lightweight since there are just some text strings or tables) but I didn't find any better way of doing it.

One possibility would be to drop the summarized version altogether, but I realized (early on) when using proto-versions of reports that in my own work for instance if I have several similar models, having all the full info for all of them made the files super verbose and not digest with a lot of redundant info, so it was nice to have like one time the full model and then just the critical information (although I agree that what is "critical" is highly debatable)

~~in any case~~ _{I forgot what I wanted to say}

bwiernik · 2021-04-09T03:55:44Z

The main things I was considering were (1) interpret_p or similar arguments to exclude p values and/or their interpretation and (2) a format to report only formatted numeric values without accompanying text (ala #86 and #108).

bwiernik · 2021-04-09T04:01:53Z

text_short was rather deprecated in favour of the summary() generic report() %>% summary().

The pkgdown site still has these functions documented: https://easystats.github.io/report/reference/text_long.html

bwiernik · 2021-04-12T14:47:32Z

I'm wondering what the purpose of storing static text_full and text values is, versus keeping the components as a list until print/format time.

So, for example:

format_report_coefficient <- function(rr, digits = 3, down_with_nhst = FALSE) {
	if (down_with_nhst) {
		rr$interpret_significance <- NA
		rr$p_value <- NA
	}
	with(
		rr,
		glue::glue(
			"The effect of {term} was",
			if (!is.na(interpret_significance)) " {interpret_significance}" else "",
			if (all(!is.na(interpret_significance), !is.na(interpret_es))) " and" else "",
			if (!is.na(interpret_es)) " {interpret_es}" else "",
			" (β = {format(es, digits = digits)}",
			if (!is.na(ci)) ", [{ci}% CI {format(ci_low, digits = digits)}, {format(ci_high, digits = digits)}]" else "",
			if (!is.na(p_value)) ", p = {format(p_value, digits = digits)}" else "",
			")",
			"."
		)
	)
}

rr <- list(term = "predictor", interpret_significance = "statistically significant", interpret_es = "large", es = .90, ci = 99, ci_low = .85, ci_high = .95, p_value = .001)

format_report_coefficient(rr)
# The effect of predictor was statistically significant and large (ß = 0.9, [99% CI 0.85, 0.95], p = 0.001).

format_report_coefficient(rr, down_with_nhst = TRUE)
# The effect of predictor was large (ß = 0.9, [99% CI 0.85, 0.95]).

Here, the idea is that report() generates a list of components and then this function is what would glue the pieces together according to the specified parameters. This enables modularity for things like omitting significance or p values, reporting just formatted numeric results without textual interpretations, or even adjustment of things like number of digits, CI width, etc. at print time.

(Using glue() here is mostly to make things easier to write, but it also is what permits easy use of digit formatting, etc.)

bwiernik · 2021-04-12T14:52:16Z

Other formatting bits could include things like localization (e.g., US English "we fit a model" vs UK English "we fitted a model", perhaps eventually non-English translation).

DominiqueMakowski · 2021-04-13T00:36:28Z

That seems like a good idea, we could start experimenting/implementing such increased modularity little-by-little, starting in places where it's easy and "stable"?

though we might want to keep not having "glue" as a dependency (or we add it, after all, report does some heavy gluing throughout... @strengejacke ?).

Also, I don't think we'll be able to have ß symbols, since CRAN only allows ASCII symbols in the package (hence that discussion #97 and my proposal to spell out the symbol as beta as a replacement).

strengejacke · 2021-04-13T06:53:03Z

though we might want to keep not having "glue" as a dependency (or we add it, after all, report does some heavy gluing throughout...

Although glue has no further dependencies, I don't see the benefit, i.e. if using glue outweigh potential issues. I always have the example of "withr" and "rstanarm" in mind, which totally broke the package. Unless we really need specific functions, I'd say we stick to base R functions.

strengejacke · 2021-04-13T06:53:51Z

have ß symbols

Yes, I think so, too. I'm not sure if expression() or bquote() help, or if these are only working for graphics?

bwiernik · 2021-04-13T09:52:31Z

The way to use the beta symbol is to Unicode escape it. "\u03B2"

DominiqueMakowski · 2021-04-13T09:55:50Z

I'm worried that it will introduce a lot of \u03B2 = 42 when formatting is not done. Isn't easier to keep, at its core, beta = 42 and then eventually replace all of the beta = by ``\u03B2 =` during postprocessing/formatting/displaying?

bwiernik · 2021-04-13T10:03:38Z

If the printing environment can’t support unicode (such as R GUI on Windows with a US code page), it prints fallback characters, which for Greek letters are English characters (b, etc). There are some tests we could run to check the environment and unicode support, but they probably aren’t necessary.

If report is going to be used in, for example, latex or markdown report generation, it really needs to be able to include correct character formatting.

DominiqueMakowski · 2021-04-13T10:06:10Z

cool! Indeed we should test it a bit first but otherwise it seems like a good idea

This comment has been minimized.

Sign in to view

IndrajeetPatil added the docs 📚 Something to be adressed in docs and/or vignettes label Apr 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

text_short() etc. and current API thoughts? #153

text_short() etc. and current API thoughts? #153

bwiernik commented Apr 9, 2021

DominiqueMakowski commented Apr 9, 2021 •

edited

Loading

bwiernik commented Apr 9, 2021

bwiernik commented Apr 9, 2021

This comment has been minimized.

bwiernik commented Apr 12, 2021

bwiernik commented Apr 12, 2021

DominiqueMakowski commented Apr 13, 2021

strengejacke commented Apr 13, 2021

strengejacke commented Apr 13, 2021

bwiernik commented Apr 13, 2021

DominiqueMakowski commented Apr 13, 2021

bwiernik commented Apr 13, 2021

DominiqueMakowski commented Apr 13, 2021

text_short() etc. and current API thoughts? #153

text_short() etc. and current API thoughts? #153

Comments

bwiernik commented Apr 9, 2021

DominiqueMakowski commented Apr 9, 2021 • edited Loading

bwiernik commented Apr 9, 2021

bwiernik commented Apr 9, 2021

This comment has been minimized.

bwiernik commented Apr 12, 2021

bwiernik commented Apr 12, 2021

DominiqueMakowski commented Apr 13, 2021

strengejacke commented Apr 13, 2021

strengejacke commented Apr 13, 2021

bwiernik commented Apr 13, 2021

DominiqueMakowski commented Apr 13, 2021

bwiernik commented Apr 13, 2021

DominiqueMakowski commented Apr 13, 2021

DominiqueMakowski commented Apr 9, 2021 •

edited

Loading