The mcc package allows calculating and visualizing metacognitive performance judgment data. For a theoretical elaboration of the methodological approach and for the results from an empirical application, please read the following paper:
Tobler, S. & Kapur, M. (2023). Metacognitive calibration: a methodological expansion and empirical application. https://doi.org/10.3929/ethz-b-000600979
In this paper, the following functions are described:
- Overconfidence
- Underconfidence
- Calibration accuracy
- Miscalibration
These functions can be applied to performance judgments assessed on a 4-point Likert scale (yes / rather yes / rather no / no) or on a binary scale (yes / no).
Functions for calibration accuracy measures commonly used in the literature are available as well. These correspond to:
- d'
- gamma
- G-index
To use the functions, the data need to be prepared first. The functions require:
- a data frame with the performance data (participants x questions), in which questions are rated 1 if correct and 0 if incorrect
- a data frame with the performance-judgment values, either numerically designated (e.g., 0-3) or alphabetically with the already correctly assigned letters (see Table 1 in the paper).
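As a minimal sketch of the two inputs described above, the following base R code builds a performance data frame and a judgment data frame for three participants and four questions. The object names (`performance`, `judgment`), the column names, and all values are made up for illustration, and the 0-3 coding of the judgments is an assumption; Table 1 in the paper defines the coding the package expects.

```r
# Hypothetical example data: 3 participants (rows) x 4 questions (columns).
# Performance: 1 = correct, 0 = incorrect.
performance <- data.frame(
  q1 = c(1, 0, 1),
  q2 = c(1, 1, 0),
  q3 = c(0, 0, 1),
  q4 = c(1, 0, 0)
)

# Judgments on a 4-point scale, coded 0-3 here (assumed coding).
judgment <- data.frame(
  q1 = c(3, 1, 2),
  q2 = c(2, 3, 0),
  q3 = c(0, 1, 3),
  q4 = c(3, 0, 1)
)

# Both data frames should have the same shape so that each judgment
# lines up with the corresponding answer.
stopifnot(identical(dim(performance), dim(judgment)))
```

Keeping participants in rows and questions in columns in both data frames makes it easy to check that the two inputs match before passing them on.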
The assigned letters in the data preparation correspond to those depicted in Table 1 of the paper mentioned above. A step-by-step instruction in R is shown below.
Ideally, the performance judgment is assessed using a 4-point Likert scale. If the performance judgments have been assessed on a binary scale, you first need to transform them using the btof function. This function only works if the confidence judgments are already transformed to the letters a to d.
- btof: transforms the binary judgment data so that they can be used in the various functions. This step is necessary for all functions described here except d', gamma, and G-index.
In case the judgment data is assessed on a 4-point Likert scale, and the values are numerical, steps 1 and 2 have to be performed. If the values are already transformed into letters according to Table 1 in the paper, step 1 can be skipped.
Step 1: Transforming Likert data to letters a-h
- letterassignment: requires the input of performance and judgment data
Step 2: Count different letters per participant
- participant_summary: requires either the results from Step 1 or the letter-based data. If binary data is assessed, first use the btof function and then proceed here.
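To illustrate what counting letters per participant means, here is a minimal base R sketch. It does not call the package and is not its implementation; the data frame `letter_data` and its letters are arbitrary placeholders that do not reflect the assignment in Table 1 of the paper.

```r
# Hypothetical letter-coded data: 3 participants (rows) x 4 questions (columns).
# The letters are arbitrary and do not follow Table 1 of the paper.
letter_data <- data.frame(
  q1 = c("a", "e", "b"),
  q2 = c("a", "c", "h"),
  q3 = c("g", "e", "a"),
  q4 = c("b", "h", "f"),
  stringsAsFactors = FALSE
)

all_letters <- letters[1:8]  # "a" to "h"

# For each participant (row), count how often each letter occurs.
# A factor with fixed levels keeps letters with a count of zero.
letter_counts <- t(vapply(
  seq_len(nrow(letter_data)),
  function(i) {
    row <- unlist(letter_data[i, ], use.names = FALSE)
    as.integer(table(factor(row, levels = all_letters)))
  },
  integer(length(all_letters))
))
colnames(letter_counts) <- all_letters

letter_counts
```

The result has one row per participant and one column per letter, and each row sums to the number of questions.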
In case you need to use the functions d', gamma, or G-index, but you collected the data on a 4-point scale, you can transform them to binary values using this function:
- binarization: transforms 4-level to 2-level data
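The idea of collapsing four judgment levels into two can be sketched in base R as follows. This is only an illustration under an assumed coding (0 = no, 1 = rather no, 2 = rather yes, 3 = yes); the helper `collapse_to_binary` is hypothetical and is not the package's binarization function, whose expected input may differ.

```r
# Hypothetical 4-level judgments coded 0-3
# (assumed: 0 = no, 1 = rather no, 2 = rather yes, 3 = yes).
judgment_4 <- data.frame(
  q1 = c(3, 1, 2),
  q2 = c(2, 3, 0),
  q3 = c(0, 1, 3)
)

# Collapse to two levels: 0 and 1 become 0 ("no"), 2 and 3 become 1 ("yes").
collapse_to_binary <- function(x) as.integer(x >= 2)

judgment_2 <- as.data.frame(lapply(judgment_4, collapse_to_binary))
judgment_2
```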
To analyze overconfidence, underconfidence, calibration accuracy, and miscalibration, the following functions can be used:
- overconfidence: requires data with the counted letters per participant (i.e., the result from the participant_summary function)
- underconfidence: requires data with the counted letters per participant (i.e., the result from the participant_summary function)
- calibrationaccuracy: requires data with the counted letters per participant (i.e., the result from the participant_summary function)
- miscalibration: requires data with the counted letters per participant (i.e., the result from the participant_summary function)
Alternatively, one can use the function conf.stats to skip all these steps and get the summary directly. This function works only when the judgment data has been assessed on a 4-point Likert scale.
- conf.stats: requires performance values (0 / 1) and judgment values (on a numerical scale)
To visualize the findings, one can either look at the confidence accuracy ratings and the miscalibration individually or directly visualize both in one plot. The functions to do so are:
- confidence_plot: requires the calibration accuracy values
- miscalibration_plot: requires the miscalibration values
- combined_plot: requires both calibration and miscalibration values
Additionally, the following functions can be used to compare two or more groups visually:
- confidence_plot.groups: additionally requires a group value
- miscalibration_plot.group: additionally requires a group value
- overconfidence_plot.groups: visualizes overconfidence values in different groups
- underconfidence_plot.groups: visualizes underconfidence values in different groups
The calibration accuracy measures commonly used in the literature can be calculated with:
- d_apostrophe: calculates d' values
- gamma: calculates gamma values
- g_index: calculates G-index values
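For orientation, the three measures listed above can be computed by hand from a 2 x 2 table of judgment by performance. The sketch below uses the standard textbook definitions in base R and made-up counts; it does not call the package, the variable names are hypothetical, and the package's own functions may handle inputs and edge cases differently.

```r
# Hypothetical counts for one participant from a 2 x 2 table of
# judgment (yes / no) by performance (correct / incorrect).
hits               <- 6  # judged "yes", answer correct
false_alarms       <- 2  # judged "yes", answer incorrect
misses             <- 1  # judged "no",  answer correct
correct_rejections <- 3  # judged "no",  answer incorrect
n_total <- hits + false_alarms + misses + correct_rejections

# Goodman-Kruskal gamma for a 2 x 2 table.
gamma_value <- (hits * correct_rejections - false_alarms * misses) /
  (hits * correct_rejections + false_alarms * misses)

# G-index: matching judgments minus mismatching judgments, divided by N.
g_index_value <- ((hits + correct_rejections) - (false_alarms + misses)) / n_total

# d': difference between the z-transformed hit rate and false-alarm rate.
hit_rate         <- hits / (hits + misses)
false_alarm_rate <- false_alarms / (false_alarms + correct_rejections)
d_prime_value    <- qnorm(hit_rate) - qnorm(false_alarm_rate)

c(gamma = gamma_value, g_index = g_index_value, d_prime = d_prime_value)
```

With these counts, gamma is 0.8 and the G-index is 0.5. Note that d' is not finite when the hit rate or false-alarm rate is exactly 0 or 1, and gamma is undefined when both cross-products are zero.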
Some auxiliary scripts and functions are added to allow faster processing. These include:
- colors.R: a set of predefined colors
- letters.R: a vector of the letters in the performance-judgment matrix
The package can be installed from GitHub using devtools:
library(devtools)
devtools::install_github("samueltobler/mcc", force = TRUE)
library(mcc)
To cite the mcc package in publications, please use:
Tobler, S. & Kapur, M. (2023). Metacognitive calibration: a methodological expansion and empirical application. Proceedings of the 17th International Conference of the Learning Sciences (ICLS 2023). Montréal, Canada. https://doi.org/10.3929/ethz-b-000600979
Some of the functions require previously published R packages. The references for these packages are listed below in alphabetical order:
- Auguie B (2017). gridExtra: Miscellaneous Functions for "Grid" Graphics. R package version 2.3, https://CRAN.R-project.org/package=gridExtra.
- Kassambara A (2020). ggpubr: 'ggplot2' Based Publication Ready Plots. R package version 0.4.0, https://CRAN.R-project.org/package=ggpubr.
- Neuwirth E (2022). RColorBrewer: ColorBrewer Palettes. R package version 1.1-3, https://CRAN.R-project.org/package=RColorBrewer.
- Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.