R/Silicosis_diagnostic_rule.qmd

---
title: "Silicosis Diagnostic Rule"
subtitle: "Description of diagnostic rule and impact of misclassification error"
author: 
  - "Javier Mancilla Galindo, junior researcher" 
  - "Dr. Lützen Portengen, supervisor      " 
  - "Dr. Susan Peters, supervisor          "
date: today
execute: 
  echo: false
  warning: false
format:
  html:
    toc: true
    toc_float: true
    embed-resources: true
    code-links:
        - text: "GitHub"
          href: https://github.com/UtrechtUniversity/lexces-silicosis-predict
          icon: github
  pdf: 
    toc: true
    documentclass: scrartcl
bibliography: ../docs/manuscript/lexces-silicosis-predict.bib
csl: ../docs/manuscript/american-medical-association.csl
editor: source
---

\pagebreak

# Summary

**Introduction**: A diagnostic prediction model was developed to rule out pneumoconiosis in Dutch construction workers (surveyed in 1998) and published in 2007. The diagnostic rule identifies workers at high risk of pneumoconiosis who are referred for medical examination and diagnostic imaging with chest X-ray (CXR). Recently, concerns have been raised about the poor diagnostic performance of CXR compared to high resolution computed tomography (HRCT) for the diagnosis of silicosis, especially for detecting early cases. With the ultimate intention of recommending whether the diagnostic prediction rule should be incorporated into a health surveillance program for silicosis, this work provides an overview of the diagnostic prediction rule and estimates the extent, impact, and potential implications of outcome misclassification from its use.

**Methods**: Data were simulated to replicate the summary characteristics and outcome probability of the original diagnostic rule development study. A total of 5000 samples were used to estimate the potential impact of outcome misclassification over the diagnostic rule's accuracy, by using combinations of false positive and negative rates (FPR and FNR) of CXR (index test) compared to HRCT (reference test), to estimate the adjusted area under the curve (AUC) assuming non-differential outcome misclassification. Simulated observations were categorized into low (\<5 points) and high (>=5 points) risk categories according to the pneumoconiosis diagnostic rule scoring system. The true outcome that could have been observed had HRCT been performed was obtained with a reverse-misclassification function using diagnostic performance estimates. Two scenarios are presented by using possible combinations of sensitivity and specificity diagnostic performance parameters of CXR against HRCT within the uncertainty boundaries of their 95% confidence intervals (95%CI) and a negative inverse correlation between them of -0.7 and -0.8.

**Results**: Incorporation of uncertainties of the diagnostic performance of CXR against HRCT, assuming that outcome misclassification is non-differential, reveals that the AUC of the diagnostic prediction rule is underestimated when using CXR (absolute difference: 0.039 to 0.04). Using a cut-off score of 5 points results in 17.35% (224 out of 1291) of workers identified as being in the high-risk category. On average, outcome misclassification leads to 1 case less detected with HRCT than CXR in the high-risk group (median prevalence 7.73% vs 6.98%), whereas its impact is greater in the low-risk category, with 19 cases detected with CXR and 40-41 with HRCT (median prevalence 3.78% vs 1.76%). However, scenarios are widely variable, reflecting the uncertainties in the diagnostic performance of CXR against HRCT.

**Discussion**: The diagnostic prediction rule for silicosis has been used with a higher threshold (5 points) than optimal (3.75 points in the original study), likely due to the need of minimizing the number of workers who undergo diagnostic imaging studies. Under that threshold, 17.35% of participants are identified as high risk individuals to undergo screening. Incorporating current knowledge and uncertainties of the diagnostic performance of CXR against HRCT reveals that outcome misclassification has a greater impact in workers in the low risk category than those classified as high-risk. Scenarios in which differential misclassification exists remain to be explored as well as those with different disease prevalence or varying cut-off points.  

\pagebreak

```{r}
#| include: false  

# Setup 

# Create directories for sub-folders  
inputfolder <- "../data/raw"
psfolder <- "../data/processed"
tempfolder <- "../data/temp"
figfolder <- "../results/output_figures"
tabfolder <- "../results/output_tables"

dir.create(inputfolder, showWarnings = FALSE)
dir.create(psfolder, showWarnings = FALSE)
dir.create(tempfolder, showWarnings = FALSE)
dir.create(figfolder, showWarnings = FALSE)
dir.create(tabfolder, showWarnings = FALSE)
```

```{r}
#| include: false  

# Packages used 

if (!require("pacman", quietly = TRUE)) {
  install.packages("pacman")
}

pacman::p_load(
  tidyverse,  # Used for basic data handling and visualization.
  MASS,       # Used for multivariate normal distribution.
  gt,         # Used to print html tables.  
  report      # Used to cite packages used in this session.   
)
```

# Background

A diagnostic prediction model to **rule out** pneumoconiosis in construction workers was developed and published in 2007.[@Suarthana2007] The study population consisted of Dutch natural stone construction workers age 30 years and older. Lexces partners are currently designing a health surveillance program (HSP) for respiratory occupational diseases, including silicosis. The diagnostic prediction rule could be incorporated into the HSP to determine which workers exposed to silica dust should undergo further diagnostic workup for silicosis. However, concerns have been raised about the prediction rule not detecting early cases of silicosis. Thus, the objective of this work is to provide an overview of the diagnostic prediction rule and to estimate the extent, impact, and implications of outcome misclassification with its use.

## Outcome

To develop the prediction rule, the diagnosis of pneumoconiosis was defined as a chest x ray (CXR) indicative of pneumoconiosis (ILO profusion category >=1/1), for which the ILO international classification of radiographs of pneumoconioses 2000 version was used. The most up-to-date version of this guideline is the 2022 revised edition.[@ILO2022] The ILO score is assigned upon examination of small opacities on CXR, in comparison to standardized CXR images. The range of possible values are integers between 0 and 3, which are assigned to a major category, followed by a subcategory (see **Box 1** for a simple example). For instance, a score of 1/0 means that 1 was assigned as the major category, while 0 (subcategory) was strongly considered as the alternative. Conversely, a score of 0/1 means that the radiologist assigned 0 as the major category, but strongly considered 1 as suitable. A score 1/1 means that the CXR is consistent with the standard CXR graded as 1 in the ILO classification report.

As mentioned earlier, an ILO score **>=1/1** was considered as the reference standard for pneumoconiosis to develop the diagnostic prediction rule.[@Suarthana2007] This contrasts with standard recommendations at the time mentioning that an ILO category **1/0** or higher should be considered consistent with the presence of pneumoconiosis.[@Wagner1996] This decision was made under the rationale that a 1/0 cutoff could lead to greater misclassification, resulting in more unnecessary chest x-rays. Out of the 1291 workers included for analysis, a total of 37 (2.9%) had a score >=1/1, whereas 131 (10.1%) were graded >=1/0.

Noteworthy, three different radiologists examined the CXR and provided a score. Radiologists were **blinded** to patient characteristics, except for the fact that all participants worked on the construction industry. The median score was used for analysis.

## Predictors

Lung function measured with a pneumotacometer on the same day of CXR obtention and worker questionnaire variables were assessed as potential predictors of pneumoconiosis. Seven candidate predictors were identified in univariable analysis:

-   Age
-   Smoking status
-   Job title
-   Time working in the construction industry
-   Feeling unhealthy
-   Cumulative exposure to silica index
-   Standardized residual FEV1

Continuous variables were dichotomized and modeled separately, as continuous and binary. Since there were no differences in the AUC of a prediction model with continuous vs binary predictors, the latter were kept to simplify the diagnostic rule usage.

The final model included five predictors:

+----------------------------------------+-------------------------+-------------+-------------+
| Predictor                              | Value                   | Score       | Beta        |
+========================================+=========================+=============+=============+
| Age                                    | greater/equal 40 years  | 1.0         | 0.72        |
+----------------------------------------+-------------------------+-------------+-------------+
| Smoking habit                          | Current smoker          | 1.0         | 0.70        |
+----------------------------------------+-------------------------+-------------+-------------+
| Job title                              | High exposure job title | 1.5         | 1.14        |
+----------------------------------------+-------------------------+-------------+-------------+
| Work duration in construction industry | greater/equal 15 years  | 1.5         | 1.00        |
+----------------------------------------+-------------------------+-------------+-------------+
| Self-related health                    | Feeling unhealthy       | 1.25        | 0.84        |
+----------------------------------------+-------------------------+-------------+-------------+
| Standardized residual FEV1             | lower/equal -1.0        | 1.25        | 0.91        |
+----------------------------------------+-------------------------+-------------+-------------+

The uncorrected AUC of the model was **0.81 (95%CI: 0.75 to 0.86)**. The corrected AUC was 0.76.

\pagebreak

## Model Validation

In the original Suarthana study,[@Suarthana2007] the prediction model was only internally validated. A formal external validation procedure was not performed as currently recommended in TRIPOD+AI guidelines.[@Collins2024]

To scope for studies reporting the use of the diagnostic prediction rule and any posterior external validation studies, the citations of the diagnostic rule development model were [retreived from Google Scholar](https://scholar.google.com/scholar?cites=1797392544233043996&as_sdt=2005&sciodt=0,5&hl=es&inst=7240083048524121927) on 10/09/2024 and screened for title and abstract. Google Scholar was chosen due to its wide coverage of literature sources. A total of **59** records citing the paper were found. In comparison, other databases retrieved less results: PubMed-MEDLINE (n = 11), Web of Science (n = 22), Scopus (n = 32), semantic scholar (n = 34), and dimensions (n = 26). All documents were reviewed, including those in other languages, for which automatic translations were obtained to screen for any calculations of the probability of silicosis according to the diagnostic prediction rule. Out of **59** records citing the paper, **5 studies**[@Nicol2015; @Meijer2011; @Mets2012; @Stigter2011; @Rooijackers2016] reported having used the diagnostic prediction rule to calculate workers' risk of pneumoconiosis. These studies are summarized in the following subheadings:

#### Nicol, et al.[@Nicol2015]

In a case series of 6 young stonemasons from the UK who were diagnosed with silicosis after performing a high-resolution computed tomography (HRCT) (three of them with progressive massive fibrosis), the diagnostic rule was applied and all 6 cases had a probability of having silicosis of 0%.[@Nicol2015] All these 6 cases would have not been referred for further chest x-ray investigation based solely on the diagnostic prediction rule score.

#### Meijer, et al.[@Meijer2011]

A subset of 180 participants enrolled in the study used for the development of the diagnostic prediction rule were invited for further examination with chest HRCT, of which a total of **n=79** ultimately underwent HRCT.[@Meijer2011] Participants invited were intended to be representative of the different risk score categories of the diagnostic prediction rule. A definite diagnosis of silicosis was not made. The study reports HRCT findings for different ILO thresholds (0/0, 1/0, and >=1/1), agreement between individual HRCT features between radiologists, and associations between the cumulative exposure index to silica and HRCT findings, controlling for smoking.

In participants with a normal CXR (ILO 0/0), only 34.9% had a normal HRCT. In these patients, findings suggestive of silicosis such as well-defined round opacities (8%) and parietal pleural abnormalities (24%) were frequent on HRCT. Emphysema was also frequent (41%), as well as irregular and/or linear opacities (22%).

#### Mets, et al.[@Mets2012]

This was a case-control study in which workers in the construction industry with a high-risk of silicosis based on the diagnostic prediction rule (score 5 or higher) were invited to undergo diagnostic workup, including chest CT, pulmonary function test, and medical examination by a pulmonologist. A total of 398 workers out of 42,150 (0.9%) were in the high risk category and invited to participate. The proportion of high-risk participants was lower than in the original Suarthana paper, possibly due to the ARBOUW database including a large fraction of administrative workers and not only construction workers. Ultimately, 54 participated as cases (high-risk), whereas controls were patients from a cancer screening cohort. The study reports micronodules found on chest CT. 

#### Rooijackers, et al.[@Rooijackers2016]

This is a congress abstract which also used the ARBOW database to identify high-risk participants with a threshold of 5 points in the diagnostic prediction rule. Out of 75,000 employees, 1123 (1.5%) were high-risk participants. A total of 295 workers ultimately participated and underwent chest CT. Silicosis was found in 64 workers (22%), 37 (13%) in an early stage. 

#### Stigter, et al.[@Stigter2011]

This is a congress abstract reporting the use of the diagnostic prediction rule to identify high-risk workers (cut-off: 5 points) in a ceramic tile prodiction plant. Out of 353 employees, 52 (15%) were in the high-risk category and underwent chest CT. Silicosis was found in 8 workers (17%). 

\pagebreak

## Cut-off points of the diagnostic prediction rule

A cutoff point of **3.75** is suggested as optimal, with the following classification measures:

|        | CXR + | CXR - |      |
|--------|-------|-------|------|
| Rule + | 33    | 534   | 567  |
| Rule - | 4     | 720   | 724  |
|        | 37    | 1254  | 1291 |

-   Sensitivity: 89.2%,
-   Specificity: 57.4%,
-   Negative Predictive Value: 99.4%,
-   Positive Predictive Value: 85.2%

Nonetheless, a **higher cut-off point of 5** has been used in practice.[@Stigter2011; @Rooijackers2016; @Mets2012] The summary data for this exact cut-off point is not provided in the original diagnostic rule paper, so the cut-off point of **5.25** is used here to provide an impression of its classification properties reported in the original study (note that this may differ from the actual diagnostic performance characteristics):

|        | CXR + | CXR - |      |
|--------|-------|-------|------|
| Rule + | 13    | 106   | 119  |
| Rule - | 24    | 1148  | 1178 |
|        | 37    | 1254  | 1291 |

-   Sensitivity: 35.1%,
-   Specificity: 91.5%,
-   Negative Predictive Value: 98.0%,
-   Positive Predictive Value: 10.9%

The decision to use a higher cut-off point than the optimal is likely due to the large number of individuals that should undergo CXR with a 3.75 cut-off (43.9%) vs 5.25 (9.2%).

\pagebreak

# Misclassification of chest X-Ray vs HRCT

## CXR based on ILO 1/1 cut-off from Hoy, et al.[@Hoy2024]

Using the summary data reported by Hoy, et al.[@Hoy2024] for different ILO scores, a 2x2 table can be recreated for the 1/1 ILO threshold:

|       | HRCT + | HRCT - |     |
|-------|--------|--------|-----|
| CXR + | 23     | 2      | 25  |
| CXR - | 17     | 68     | 85  |
|       | 40     | 70     | 110 |

```{r}
# Input values from 2x2 tables 
TP <- 23 # True positives
TN <- 68 # True negatives 
FP <- 2  # False positives
FN <- 17 # False negatives 

# Calculate point estimate 
Sn <- TP/(TP+FN)
Sp <- TN/(FP+TN)
FPR <- 1-Sp
FNR <- 1-Sn

# Confidence intervals
ci_Sn <- prop.test(TP, TP+FN, conf.level = 0.95)$conf.int
ci_Sp <- prop.test(TN, TN+FP, conf.level = 0.95)$conf.int
ci_FPR <- 1 - ci_Sp
ci_FNR <- 1 - ci_Sn

# Additional diagnostic performance measures 
PPV <- TP/(TP+FP)
NPV <- TN/(FN+TN)
LRpos <- Sn/(1-Sp)
LRneg <- (1-Sn)/Sp
Accuracy <- (TP+TN)/(TP+FP+FN+TN)
DOR <- (TP*TN)/(FP*FN)

# Print results 
cat("Sensitivity (%):", round((Sn*100),1), "(95%CI:", round(ci_Sn[1]*100,1), "-", round(ci_Sn[2]*100,1), ")", "\n")
cat("Specificity (%):", round((Sp*100),1), "(95%CI:", round(ci_Sp[1]*100,1), "-", round(ci_Sp[2]*100,1), ")", "\n")
cat("False Positive Rate (%):", round((1-Sp)*100,1), "(95%CI:", round(ci_FPR[2]*100,1), "-", round(ci_FPR[1]*100,1), ")", "\n")
cat("False Negative Rate (%):", round((1-Sn)*100,1), "(95%CI:", round(ci_FNR[2]*100,1), "-", round(ci_FNR[1]*100,1), ")", "\n")
cat("Positive Predictive Value; (%):", round((PPV*100),1), "\n")
cat("Negative Predictive Value; (%):", round((NPV*100),1), "\n")
cat("Likelihood Ratio (+):", round(LRpos,2), "\n")
cat("Likelihood Ratio (-):", round(LRneg,2), "\n")
cat("Accuracy (%):", round((Accuracy*100),1), "\n")
cat("Diagnostic Odds Ratio:", round(DOR,2), "\n")
```

\pagebreak

# Accounting for misclassification error

Corrected ROC curve analysis of prediction models can be done by taking into account misclassification error for binary outcomes, provided that disease prevalence and misclassification rates are known.[@Zawistowski2017] Zawistowski, et al. simulate the value of the true outcome and then introduce different misclassification rates to understand the impact of misclassification on the prediction models' AUC.

## Non-differential misclassification 

In the case of the diagnostic prediction rule, we do not know the value of the true outcome, which would have been determined with HRCT. Instead, the diagnostic prediction rule used CXR as the reference test, which means that only the value of the misclassified outcome is know. Zawistowski's[@Zawistowski2017] procedure can be adapted to obtain the reverse-misclassified outcome instead, by using the information from Hoy, et al.[@Hoy2024] to estimate what the diagnostic rule AUC would have been had HRCT been used instead of CXR. The original functions, as well as the adapted reverse-misclassification function are found in the following script which is sourced into this document:

```{r}
#| echo: true
source("scripts/Zawistowski_misclassification_functions.R")
```

```{r}
#| echo: true
#| eval: false
source("scripts/Diagnostic_rule_AUC_misclassification.R")
```

Simulated data with a sample size of 1291 participants is used to replicate samples with a similar size as the original diagnostic rule development study, by using the summary data reported in the paper and assigning the outcome based on the outcome probability from the diagnostic rule equation. A total of 5000 different samples are drawn to perform estimations of the potential impact of misclassification from the diagnostic prediction rule. Furthermore, scores for every fictitious participant are calculated based on the diagnostic prediction rule scoring system and a cut-off value of 5 is used to classify on high-risk (>=5 points) and low-risk (\<5) of silicosis, since this is the cut-off value that has been used in practice.[@Stigter2011; @Rooijackers2016]

\pagebreak

### Scenario 1

Sensitivity and specificity confidence intervals from Hoy, et al.[@Hoy2024] are used, simulating combinations of Sn and Sp from a bivariate normal distribution and a **correlation of -0.8** between them. 

-   Sensitivity 95%CI: 41.0 - 72.6
-   Specificity 95%CI: 89.1 - 99.5

The distribution of low risk and high risk participants is as follows:

```{r}
ROC_results <- read.csv(file.path(tabfolder, "ROC_results_scenario_1_corr0.8.csv"))

columns <- c(
  "sum_low_risk",
  "sum_high_risk"
  )

# Summary statistics function 
summary_stats <- function(column) {
  c(Median = round(median(ROC_results[[column]]), 3),
    P = round(quantile(ROC_results[[column]], 0.25), 3),
    P = round(quantile(ROC_results[[column]], 0.75), 3),
    Min = round(min(ROC_results[[column]]), 3),
    Max = round(max(ROC_results[[column]]), 3))
}

table_results <- data.frame(
  Risk = c(
    "Low (<5 points)",
    "High (>=5 points)"
    ),
  t(sapply(columns, summary_stats))
) 

table_results %>% gt() %>% 
  gtsave(filename = "Risk_categories_count_corr0.8.rtf", path = tabfolder)

table_results %>% gt()
```

The following table shows the distribution of outcome occurrence:

```{r}
columns <- c(
  "sum_outcome_observed",
  "sum_outcome_true",
  "sum_high_risk_observed",
  "sum_low_risk_observed",
  "sum_high_risk_true",
  "sum_low_risk_true"
  )

table_results <- data.frame(
  Characteristic = c(
    "Silicosis (CXR)","Silicosis (HRCT)","Silicosis (CXR) | high-risk",
    "Silicosis (CXR) | low-risk","Silicosis (HRCT) | high-risk",
    "Silicosis (HRCT) | low-risk"
    ),
  t(sapply(columns, summary_stats))
)

table_results %>% gt() %>% 
  gtsave(filename = "Cases_count_corr0.8.rtf", path = tabfolder)

table_results %>% gt()
```

##### Prevalence of silicosis in high and low risk groups

```{r}
ROC_results <- 
  ROC_results %>% 
  mutate(
    prevalence_high_risk_observed = sum_high_risk_observed/sum_high_risk*100,
    prevalence_low_risk_observed = sum_low_risk_observed/sum_low_risk*100, 
    prevalence_high_risk_true = sum_high_risk_true/sum_high_risk*100,
    prevalence_low_risk_true = sum_low_risk_true/sum_low_risk*100,
    ID = 1:nrow(ROC_results)
  )
```

```{r}
columns <- c(
  "prevalence_high_risk_observed",
  "prevalence_low_risk_observed",
  "prevalence_high_risk_true",
  "prevalence_low_risk_true"
  )

table_results <- data.frame(
  Characteristic = c(
    "Prevalence (CXR) | high-risk",
    "Prevalence (CXR) | low-risk",
    "Prevalence (HRCT) | high-risk",
    "Prevalence (HRCT) | low-risk"
    ),
  t(sapply(columns, summary_stats) %>% round(2))
)

table_results %>% gt() %>% 
  gtsave(filename = "Prevalence_corr0.8.rtf", path = tabfolder)

table_results %>% gt()
```


```{r}
ROC_results_random <- ROC_results %>% 
  mutate(ID = 1:nrow(ROC_results)) %>% 
  sample_n(100)
```

\pagebreak

High risk group 
```{r}
ROC_results_random %>%
  pivot_longer(cols = c(prevalence_high_risk_observed, prevalence_high_risk_true), names_to = "Outcome", values_to = "Prevalence") %>%
  mutate(Outcome = factor(case_when(
    Outcome == "prevalence_high_risk_observed" ~ "Silicosis (CXR) | high-risk",
    Outcome == "prevalence_high_risk_true" ~ "Silicosis (HRCT) | high-risk"
  ))) %>% 
  ggplot(aes(Outcome, Prevalence, group = ID, color = ID)) +
  geom_point() +
  geom_line() +
  stat_summary(fun = mean, geom = "line", lwd = 2, aes(group=1)) +
  labs(x = "Outcome", y = "Prevalence (%)") +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.line = element_line(colour = "black"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.margin = margin(5, 5, 5, 5, "mm"),
    axis.title.x = element_text(margin = margin(t = 10)),
    axis.title.y = element_text(margin = margin(r = 10)),
    ) + 
  expand_limits(x = c(0.5,2.5), y = c(0,25)) + 
  coord_cartesian(expand = FALSE) +
# Add a single data point at y = 22, when x = "Silicosis (HRCT) | high-risk" with a label "Rooijackers, et al. 2016"
  geom_point(aes(y = 22, x = "Silicosis (HRCT) | high-risk"), color = "red") +
  geom_text(aes(y = 22,  x = "Silicosis (HRCT) | high-risk", label = "Rooijackers, et al. 2016"), vjust = -1) +
# Add a single data point at y = 22, when x = "Silicosis (HRCT) | high-risk" with a label "Rooijackers, et al. 2016"
  geom_point(aes(y = 17, x = "Silicosis (HRCT) | high-risk"), color = "red") +
  geom_text(aes(y = 17,  x = "Silicosis (HRCT) | high-risk", label = "Stigter, et al. 2011"), vjust = -1) 
  
# Save in figfolder with white background 
ggsave(
  file.path(figfolder, "prevalence_high_risk_corr0.8.png"),
  width = 6, 
  height = 4,
  dpi = 300,
  bg = "white"
  )
```

Low risk group

```{r}
ROC_results_random %>%
  pivot_longer(cols = c(prevalence_low_risk_observed, prevalence_low_risk_true), names_to = "Outcome", values_to = "Prevalence") %>%
  mutate(Outcome = factor(case_when(
    Outcome == "prevalence_low_risk_observed" ~ "Silicosis (CXR) | low-risk",
    Outcome == "prevalence_low_risk_true" ~ "Silicosis (HRCT) | low-risk"
  ))) %>% 
  ggplot(aes(Outcome, Prevalence, group = ID, color = ID)) +
  geom_point() +
  geom_line() +
  stat_summary(fun = mean, geom = "line", lwd = 2, aes(group=1)) +
  labs(x = "Outcome", y = "Prevalence (%)") +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.line = element_line(colour = "black"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.margin = margin(5, 5, 5, 5, "mm"),
    axis.title.x = element_text(margin = margin(t = 10)),
    axis.title.y = element_text(margin = margin(r = 10)),
    ) + 
  expand_limits(x = c(0.5,2.5), y = c(0,25)) + 
  coord_cartesian(expand = FALSE)
  
# Save in figfolder with white background 
ggsave(
  file.path(figfolder, "prevalence_low_risk_corr0.8.png"),
  width = 6, 
  height = 4,
  dpi = 300,
  bg = "white"
  )
```

\pagebreak

##### ROC curve analysis

```{r}
columns <- c(
  "ROC_observed_outcome", 
  "ROC_observed_corrected"
  )

table_results <- data.frame(
  Outcome = c("CXR", "CXR-corrected"),
  t(sapply(columns, summary_stats))
)

table_results %>% gt() %>% 
  gtsave(filename = "Corrected_ROC_corr0.8.rtf", path = tabfolder)

table_results %>% gt()
```

```{r}
bias_absolute <- abs(table_results$Median[1] - table_results$Median[2])

cat("Absolute difference in AUC (CXR-corrected):", bias_absolute, "\n") 
```

```{r}
ROC_results %>%
  pivot_longer(cols = c(ROC_observed_outcome, ROC_observed_corrected), names_to = "Outcome", values_to = "AUC") %>%
  mutate(Outcome = factor(case_when(
    Outcome == "ROC_observed_outcome" ~ "Observed",
    Outcome == "ROC_observed_corrected" ~ "Corrected"
  )) %>% fct_relevel(c("Observed", "Corrected"))) %>%
  ggplot(aes(x = Outcome, y = AUC, fill = Outcome)) +
  geom_boxplot() +
  ylim(0.5, 1) +
  labs(x = "Outcome", y = "AUC", caption = "Correlation Sn ~ Sp = -0.8") +
  scale_fill_manual(values = c("Observed" = "lightblue", "Corrected" = "skyblue3")) +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.line = element_line(colour = "black"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.margin = margin(5, 5, 5, 5, "mm"),
    axis.title.x = element_text(margin = margin(t = 10)),
    axis.title.y = element_text(margin = margin(r = 10)),
    ) + 
  expand_limits(x = c(0.5,2.5), y = 0) + 
  coord_cartesian(expand = FALSE)

ggsave(
  file.path(figfolder, "AUC_corrected_corr0.8.png"),
  width = 5, 
  height = 4,
  dpi = 300,
  bg = "white"
  )
```

\pagebreak

### Scenario 2

Sensitivity and specificity confidence intervals from Hoy, et al.[@Hoy2024] are used, simulating combinations of Sn and Sp from a bivariate normal distribution and a **correlation of -0.7** between them. 

-   Sensitivity 95%CI: 41.0 - 72.6
-   Specificity 95%CI: 89.1 - 99.5

The distribution of low risk and high risk participants is as follows:

```{r}
ROC_results <- read.csv(file.path(tabfolder, "ROC_results_scenario_1_corr0.7.csv"))

columns <- c(
  "sum_low_risk",
  "sum_high_risk"
  )

# Summary statistics function 
summary_stats <- function(column) {
  c(Median = round(median(ROC_results[[column]]), 3),
    P = round(quantile(ROC_results[[column]], 0.25), 3),
    P = round(quantile(ROC_results[[column]], 0.75), 3),
    Min = round(min(ROC_results[[column]]), 3),
    Max = round(max(ROC_results[[column]]), 3))
}

table_results <- data.frame(
  Risk = c(
    "Low (<5 points)",
    "High (>=5 points)"
    ),
  t(sapply(columns, summary_stats))
) 

table_results %>% gt() %>% 
  gtsave(filename = "Risk_categories_count_corr0.7.rtf", path = tabfolder)

table_results %>% gt()
```

The following table shows the distribution of outcome occurrence:

```{r}
columns <- c(
  "sum_outcome_observed",
  "sum_outcome_true",
  "sum_high_risk_observed",
  "sum_low_risk_observed",
  "sum_high_risk_true",
  "sum_low_risk_true"
  )

table_results <- data.frame(
  Characteristic = c(
    "Silicosis (CXR)","Silicosis (HRCT)","Silicosis (CXR) | high-risk",
    "Silicosis (CXR) | low-risk","Silicosis (HRCT) | high-risk",
    "Silicosis (HRCT) | low-risk"
    ),
  t(sapply(columns, summary_stats))
)

table_results %>% gt() %>% 
  gtsave(filename = "Cases_count_corr0.7.rtf", path = tabfolder)

table_results %>% gt()
```

##### Prevalence of silicosis in high and low risk groups

```{r}
ROC_results <- 
  ROC_results %>% 
  mutate(
    prevalence_high_risk_observed = sum_high_risk_observed/sum_high_risk*100,
    prevalence_low_risk_observed = sum_low_risk_observed/sum_low_risk*100, 
    prevalence_high_risk_true = sum_high_risk_true/sum_high_risk*100,
    prevalence_low_risk_true = sum_low_risk_true/sum_low_risk*100,
    ID = 1:nrow(ROC_results)
  )
```

```{r}
columns <- c(
  "prevalence_high_risk_observed",
  "prevalence_low_risk_observed",
  "prevalence_high_risk_true",
  "prevalence_low_risk_true"
  )

table_results <- data.frame(
  Characteristic = c(
    "Prevalence (CXR) | high-risk",
    "Prevalence (CXR) | low-risk",
    "Prevalence (HRCT) | high-risk",
    "Prevalence (HRCT) | low-risk"
    ),
  t(sapply(columns, summary_stats) %>% round(2))
)

table_results %>% gt() %>% 
  gtsave(filename = "Prevalence_corr0.7.rtf", path = tabfolder)

table_results %>% gt()
```


```{r}
ROC_results_random <- ROC_results %>% 
  mutate(ID = 1:nrow(ROC_results)) %>% 
  sample_n(100)
```

\pagebreak

High risk group 
```{r}
ROC_results_random %>%
  pivot_longer(cols = c(prevalence_high_risk_observed, prevalence_high_risk_true), names_to = "Outcome", values_to = "Prevalence") %>%
  mutate(Outcome = factor(case_when(
    Outcome == "prevalence_high_risk_observed" ~ "Silicosis (CXR) | high-risk",
    Outcome == "prevalence_high_risk_true" ~ "Silicosis (HRCT) | high-risk"
  ))) %>% 
  ggplot(aes(Outcome, Prevalence, group = ID, color = ID)) +
  geom_point() +
  geom_line() +
  stat_summary(fun = mean, geom = "line", lwd = 2, aes(group=1)) +
  labs(x = "Outcome", y = "Prevalence (%)") +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.line = element_line(colour = "black"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.margin = margin(5, 5, 5, 5, "mm"),
    axis.title.x = element_text(margin = margin(t = 10)),
    axis.title.y = element_text(margin = margin(r = 10)),
    ) + 
  expand_limits(x = c(0.5,2.5), y = c(0,25)) + 
  coord_cartesian(expand = FALSE) +
# Add a single data point at y = 22, when x = "Silicosis (HRCT) | high-risk" with a label "Rooijackers, et al. 2016"
  geom_point(aes(y = 22, x = "Silicosis (HRCT) | high-risk"), color = "red") +
  geom_text(aes(y = 22,  x = "Silicosis (HRCT) | high-risk", label = "Rooijackers, et al. 2016"), vjust = -1) +
# Add a single data point at y = 22, when x = "Silicosis (HRCT) | high-risk" with a label "Rooijackers, et al. 2016"
  geom_point(aes(y = 17, x = "Silicosis (HRCT) | high-risk"), color = "red") +
  geom_text(aes(y = 17,  x = "Silicosis (HRCT) | high-risk", label = "Stigter, et al. 2011"), vjust = -1) 
  
# Save in figfolder with white background 
ggsave(
  file.path(figfolder, "prevalence_high_risk_corr0.7.png"),
  width = 6, 
  height = 4,
  dpi = 300,
  bg = "white"
  )
```

Low risk group

```{r}
ROC_results_random %>%
  pivot_longer(cols = c(prevalence_low_risk_observed, prevalence_low_risk_true), names_to = "Outcome", values_to = "Prevalence") %>%
  mutate(Outcome = factor(case_when(
    Outcome == "prevalence_low_risk_observed" ~ "Silicosis (CXR) | low-risk",
    Outcome == "prevalence_low_risk_true" ~ "Silicosis (HRCT) | low-risk"
  ))) %>% 
  ggplot(aes(Outcome, Prevalence, group = ID, color = ID)) +
  geom_point() +
  geom_line() +
  stat_summary(fun = mean, geom = "line", lwd = 2, aes(group=1)) +
  labs(x = "Outcome", y = "Prevalence (%)") +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.line = element_line(colour = "black"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.margin = margin(5, 5, 5, 5, "mm"),
    axis.title.x = element_text(margin = margin(t = 10)),
    axis.title.y = element_text(margin = margin(r = 10)),
    ) + 
  expand_limits(x = c(0.5,2.5), y = c(0,25)) + 
  coord_cartesian(expand = FALSE)
  
# Save in figfolder with white background 
ggsave(
  file.path(figfolder, "prevalence_low_risk_corr0.7.png"),
  width = 6, 
  height = 4,
  dpi = 300,
  bg = "white"
  )
```

\pagebreak

##### ROC curve analysis

```{r}
columns <- c(
  "ROC_observed_outcome", 
  "ROC_observed_corrected"
  )

table_results <- data.frame(
  Outcome = c("CXR", "CXR-corrected"),
  t(sapply(columns, summary_stats))
)

table_results %>% gt() %>% 
  gtsave(filename = "Corrected_ROC_corr0.7.rtf", path = tabfolder)

table_results %>% gt()
```

```{r}
bias_absolute <- abs(table_results$Median[1] - table_results$Median[2])

cat("Absolute difference in AUC (CXR-corrected):", bias_absolute, "\n") 
```

```{r}
ROC_results %>%
  pivot_longer(cols = c(ROC_observed_outcome, ROC_observed_corrected), names_to = "Outcome", values_to = "AUC") %>%
  mutate(Outcome = factor(case_when(
    Outcome == "ROC_observed_outcome" ~ "Observed",
    Outcome == "ROC_observed_corrected" ~ "Corrected"
  )) %>% fct_relevel(c("Observed", "Corrected"))) %>%
  ggplot(aes(x = Outcome, y = AUC, fill = Outcome)) +
  geom_boxplot() +
  ylim(0.5, 1) +
  labs(x = "Outcome", y = "AUC", caption = "Correlation Sn ~ Sp = -0.7") +
  scale_fill_manual(values = c("Observed" = "lightblue", "Corrected" = "skyblue3")) +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.line = element_line(colour = "black"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.margin = margin(5, 5, 5, 5, "mm"),
    axis.title.x = element_text(margin = margin(t = 10)),
    axis.title.y = element_text(margin = margin(r = 10)),
    ) + 
  expand_limits(x = c(0.5,2.5), y = 0) + 
  coord_cartesian(expand = FALSE)

ggsave(
  file.path(figfolder, "AUC_corrected_corr0.7.png"),
  width = 5, 
  height = 4,
  dpi = 300,
  bg = "white"
  )
```
\pagebreak


## Differential outcome misclassification

Prior analyses assumed that outcome misclassification is non-differential. However, differential outcome misclassification is conceivable. The sources and mechanisms of differential misclassification are summarized in a mind-map ([link to resource - in progress)](https://mm.tt/app/map/3433596501?t=S3mwy4V8Vu)). Here, the focus is on how the main candidate predictors of the diagnostic prediction model could have led to differential outcome misclassification through a mechanism that systematically increases the FPR with the probability of being a case and/or increases the FNR with the probability of being a control, as these are the two mechanisms that could have led AUC overestimation in the original diagnostic rule development study. Only **age** and **smoking** are thought to potentially lead to differential outcome misclassification through plausible mechanisms, because radiologists were blinded to participant characteristics, thereby blocking the sources of differential outcome misclassification for the other predictors.

... Work in progress ...

\pagebreak

# Extended Data

#### Box 1

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Box 1. Understanding the ILO chest X-ray classification scheme                                                                                                                                                                                                                                                                                                            |
+===========================================================================================================================================================================================================================================================================================================================================================================+
| The ILO CXR classification scheme may be unintuitive at first. An analogy can be made with a daily life situation to simplify its understanding. Suppose that a radiologist goes to the supermarket to buy chocolate. The radiologist finds 4 options on the shelve:                                                                                                      |
|                                                                                                                                                                                                                                                                                                                                                                           |
| -   Sweet chocolate (30% cocoa) = **ILO 0**                                                                                                                                                                                                                                                                                                                               |
|                                                                                                                                                                                                                                                                                                                                                                           |
| -   Semi-sweet chocolate (50% cocoa) = **ILO 1**                                                                                                                                                                                                                                                                                                                          |
|                                                                                                                                                                                                                                                                                                                                                                           |
| -   Semi-dark chocolate (70% cocoa) = **ILO 2**                                                                                                                                                                                                                                                                                                                           |
|                                                                                                                                                                                                                                                                                                                                                                           |
| -   Dark chocolate (95% cocoa) = **ILO 3**                                                                                                                                                                                                                                                                                                                                |
|                                                                                                                                                                                                                                                                                                                                                                           |
| Radiologist number 1 (R1) has a hard time deciding between 30% (ILO 0) and 50% (ILO 1) cocoa, but does not even consider buying a 70% (ILO 2) or 95% (ILO 3) cocoa bar. In the end, R1 picks the 50% cocoa bar. Thus, the final score is **1/0** because they payed for semi-sweet chocolate (ILO 1), but strongly considered sweet chocolate (ILO 0) as the alternative. |
|                                                                                                                                                                                                                                                                                                                                                                           |
| On the contrary, radiologist number 2 (R2) is convinced that semi-sweet chocolate (ILO 1) is the right choice as soon as they see the shelve and does not even consider other options. Thus, the final score for R2 is **1/1**.                                                                                                                                           |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

\pagebreak

#### Session and package dependencies

```{r}
#| echo: false  

# remove clutter
session <- sessionInfo()
session$BLAS <- NULL
session$LAPACK <- NULL
session$loadedOnly <- NULL
# write log file
writeLines(
  capture.output(print(session, locale = FALSE)),
  paste0("sessions/",lubridate::today(), "_session_Silicosis_diagnostic_rule.txt")
)                 

session
```

### Package References

```{r}
#| output: asis
report::cite_packages(session)
```

\pagebreak

# References

```{r}
#| include: false

# Run this chunk if you wish to clear your environment.
rm(list = ls())
```