Statistical modeling formalizes our assumptions about the data
mathematically and lets us estimate parameters, such as variances and
means, that we can later use for statistical inference. We will use
linear mixed effects models, as they are the simplest models that can
capture the relationships and sources of variation in our experimental
design. Once fit, we will assess the models for violations of their
assumptions. Once we have verified that our models fit the data well, we
can use them for statistical inference, which we will do in the next
vignette, 4-Statistical-Inference.
library(tidyverse)
library(ggrepel)
library(ggpubr)
library(data.table)
library(lmerTest)
library(emmeans)
library(Cardinal)
library(broom.mixed)
source("Utilities.R")
zero_to_na <- function(x, mz) {
x[x == 0] <- NA_real_
x
}
bp_param <- function(workers) {
if (interactive()) {
SnowfastParam(workers = as.integer(workers), force.GC = TRUE)
} else {
NULL
}
}
setCardinalVerbose(verbose = FALSE)
gcinfo(verbose = FALSE)
DATA_PATH <- file.path("Data","Filtering-Aggregation","Preprocessed-NSFiltered-Aggregated-CartilageFiltered25.imzML")
mse <- readMSIData(DATA_PATH)
mse <- mse |>
addProcessing(zero_to_na, label="zero_to_na_before_modeling") |>
process(spectra="intensity", index="mz", BPPARAM = bp_param(3L))
Models are specified based on the experimental design. In our case,
we have two fixed effects, condition (Osteoarthritis vs Control) and
tissue (Medial vs Lateral), and one random effect, subject. We will also
include the interaction between condition and tissue. Modeling can be
done within or outside of Cardinal. We will demonstrate two methods:
using Cardinal::meansTest and using lmer
directly.
Cardinal offers statistical modeling via meansTest.
Under the hood, meansTest uses nlme::lme or
lme4::lmer to fit models with random effects and
stats::lm for models without. meansTest
requires a few steps to prepare the dataset for modeling.
To confine statistical testing to the region of interest, we will subset the pixels to only those in the cartilage. This is the longest step and may take a couple of minutes.
pData(mse)$subject <- as.factor(pData(mse)$subject)
pData(mse)$tissue <- as.factor(pData(mse)$tissue)
pData(mse)$condition <- as.factor(pData(mse)$condition)
pData(mse)$tissue_zone <- as.factor(pData(mse)$tissue_zone)
pData(mse)$run <- as.factor(pData(mse)$run)
mse <- subsetPixels(mse, tissue_zone == "cartilage")
We are ready to fit the models. The fixed effects are specified in
the second argument. condition * tissue is shorthand for
condition + tissue + condition:tissue, which includes the
main effects and their interaction. The random effects are specified in
the third argument. ~ 1|subject indicates that we want a
random intercept for each subject. Specification of random effects for
lmer can be found here.
For compatibility with previous versions of Cardinal and
nlme::lme(), the fixed and random effects are separated.
The samples argument specifies the variable that delineates
which pixels are included in each sample or experimental unit.
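For readers less familiar with lme4 formula syntax, here is a minimal sketch of how meansTest's two separate formula arguments correspond to a single combined lmer formula. The data are simulated toy data (with condition crossed within subject for simplicity, unlike the real between-subject design); only the formula correspondence is the point.

```r
# Hypothetical illustration: meansTest(mse, ~ condition * tissue, ~ 1 | subject)
# corresponds to a single combined lme4 formula. Data below are simulated.
library(lme4)

set.seed(1)
toy <- expand.grid(subject   = factor(1:8),
                   condition = factor(c("Control", "Osteoarthritis")),
                   tissue    = factor(c("Lateral", "Medial")))
toy$intensity <- rnorm(nrow(toy)) + rep(rnorm(8, sd = 0.5), times = 4)

# Fixed effects (~ condition * tissue) and random intercept (~ 1 | subject)
# combined into one lmer formula:
fit <- lmer(intensity ~ condition * tissue + (1 | subject), data = toy)
fixef(fit)  # intercept, two main effects, and the interaction term
```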
For mixed effects models, we can run meansTest with the default
implementation, nlme::lme(), or select
lme4::lmer() by passing use_lmer=TRUE in newer
Cardinal releases. For basic random effects structures, the two
implementations are equivalent. lme4::lmer(), extended via
lmerTest, enables the Satterthwaite approximation for
calculating degrees of freedom. Because we have opted to replace zero
values with NAs, we must pass na.rm = TRUE to ensure NA
values are dropped during summarization. meansTest
summarizes all pixels in the object; to restrict analysis to the
cartilage ROI, we used subsetPixels to eliminate
non-cartilage pixels.
nlme::lme() in Cardinal
nlme_means_test <- meansTest(mse, ~ condition * tissue, ~ 1|subject, samples = run(mse), na.rm = TRUE)
nlme_means_test
## MeansTest of length 124
## model: lme
## DataFrame with 10 rows and 6 columns
## i mz fixed random statistic pvalue
## <integer> <numeric> <character> <character> <numeric> <numeric>
## 1 1 1126.52 intensity ~ conditio.. ~1 | subject 0.711532 0.870488
## 2 2 1143.53 intensity ~ conditio.. ~1 | subject 0.894013 0.826872
## 3 3 1302.63 intensity ~ conditio.. ~1 | subject 5.194768 0.158078
## 4 4 1304.61 intensity ~ conditio.. ~1 | subject 3.057369 0.382864
## 5 5 1325.60 intensity ~ conditio.. ~1 | subject 0.599056 0.896648
## 6 6 1343.65 intensity ~ conditio.. ~1 | subject 3.927915 0.269354
## 7 7 1366.62 intensity ~ conditio.. ~1 | subject 2.447975 0.484767
## 8 8 1367.62 intensity ~ conditio.. ~1 | subject 1.370194 0.712536
## 9 9 1381.63 intensity ~ conditio.. ~1 | subject 1.482458 0.686325
## 10 10 1388.60 intensity ~ conditio.. ~1 | subject 2.039564 0.564236
## ... and 114 more results
Note the P-value column. This is the P value of a likelihood-ratio
test (LRT), which assesses goodness of fit by comparing our model to a
reduced model. When no reduced model is given and random effects are
specified, the reduced model defaults to one with only the random
effects. The P value indicates whether the full model explains the
variation significantly better than the reduced model. In this context
the LRT is mostly a diagnostic test of model fit. When specified
correctly, LRTs can measure the variance-explaining capacity of a
specific term in the model, but they do not tell us about that term's
relevance to the outcome of interest. To get that information, we must
run contrasts, as we will do in the next vignette,
4-Statistical-Inference.
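The LRT that meansTest performs can be reproduced by hand with anova(), which refits both lmer models with ML before comparing them. A minimal sketch on simulated data (the variable names are illustrative, not taken from our dataset):

```r
library(lme4)

set.seed(1)
toy <- data.frame(subject   = factor(rep(1:8, each = 4)),
                  condition = factor(rep(c("Control", "OA"), each = 16)),
                  tissue    = factor(rep(c("Lateral", "Medial"), times = 16)),
                  intensity = rnorm(32))

full    <- lmer(intensity ~ condition * tissue + (1 | subject), data = toy)
reduced <- lmer(intensity ~ 1 + (1 | subject), data = toy)

# anova() on two lmer fits refits with ML and runs the likelihood-ratio test
anova(full, reduced)
```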
lme4::lmer() in Cardinal
lmer_means_test <- meansTest(
mse, ~ condition * tissue, ~ 1 | subject,
samples = run(mse),
na.rm = TRUE,
use_lmer = TRUE)
lmer_means_test
## MeansTest of length 124
## model: lmerModLmerTest
## DataFrame with 10 rows and 7 columns
## i mz fixed random statistic pvalue
## <integer> <numeric> <character> <character> <numeric> <numeric>
## 1 1 1126.52 intensity ~ conditio.. ~1 | subject NA NA
## 2 2 1143.53 intensity ~ conditio.. ~1 | subject NA NA
## 3 3 1302.63 intensity ~ conditio.. ~1 | subject NA NA
## 4 4 1304.61 intensity ~ conditio.. ~1 | subject NA NA
## 5 5 1325.60 intensity ~ conditio.. ~1 | subject NA NA
## 6 6 1343.65 intensity ~ conditio.. ~1 | subject NA NA
## 7 7 1366.62 intensity ~ conditio.. ~1 | subject NA NA
## 8 8 1367.62 intensity ~ conditio.. ~1 | subject NA NA
## 9 9 1381.63 intensity ~ conditio.. ~1 | subject NA NA
## 10 10 1388.60 intensity ~ conditio.. ~1 | subject NA NA
## singular
## <logical>
## 1 TRUE
## 2 FALSE
## 3 FALSE
## 4 FALSE
## 5 TRUE
## 6 FALSE
## 7 FALSE
## 8 TRUE
## 9 TRUE
## 10 FALSE
## ... and 114 more results
Note the singular column - Cardinal automatically
reports singularity for lme4::lmer() models. Models can be
singular for many reasons; in our case, this is likely due to low
biological variance relative to technical variance. More on this later.
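Outside Cardinal, the same check is available directly via lme4::isSingular(). A sketch on simulated data with no subject-level signal, where the random-intercept variance is likely to hit the zero boundary:

```r
library(lme4)

set.seed(2)
toy <- data.frame(subject = factor(rep(1:8, each = 4)),
                  y       = rnorm(32))  # pure noise: no between-subject signal

fit <- lmer(y ~ 1 + (1 | subject), data = toy)
isSingular(fit)  # often TRUE here: the subject variance is estimated near zero
VarCorr(fit)     # inspect the (near-zero) random-intercept standard deviation
```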
If we print the results, we can see information about the models. In
Cardinal, models fit with lme4::lmer() (via
use_lmer=TRUE) use restricted maximum likelihood
estimation. Means testing with nlme::lme() in Cardinal uses
maximum likelihood estimation to be compatible with LRTs.
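Outside Cardinal the REML/ML distinction is easy to see: lmer() fits with REML by default, and refitML() switches a fitted model to ML, which is what LRT-based comparisons require. A minimal sketch on simulated data:

```r
library(lme4)

set.seed(3)
toy <- data.frame(subject = factor(rep(1:8, each = 2)),
                  y       = rnorm(16))

fit_reml <- lmer(y ~ 1 + (1 | subject), data = toy)  # REML by default
fit_ml   <- refitML(fit_reml)                        # refit with ML

c(REML = isREML(fit_reml), ML = isREML(fit_ml))  # TRUE, FALSE
```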
Note: meansTest does not support LRTs for
lme4::lmer() models, so the statistic and
pvalue columns in the output will be NA.
Let’s save the combined image and lme4::lmer() results
for later.
save(lmer_means_test, file = file.path("Outputs", "meansTest_cartilage_lmer.RData"))
lmer
Alternatively, we can extract the data from the
MSImagingExperiment objects and run the models ourselves.
This allows for greater flexibility than using
Cardinal::meansTest.
First, let us create a helper function to run the models and extract the results tidily. In addition, we will calculate the intraclass correlation coefficient (ICC) for each model: the proportion of total variance attributable to between-subject (random-intercept) variance, with greater values (> 0.5) indicating more between-subject variance than residual variance. We will also extract whether the model is singular, along with its convergence code. Our function will return three objects: a list of models, a wide tibble of model results, and a long tibble of model results that is better suited for plotting.
# Helper function to model and catch errors
model_and_extract <- function(mse_frame, form) {
models <- list()
for (f in unique(mse_frame$feature_id)) {
data <- mse_frame |> filter(feature_id == f)
print(paste0("Running model for: ", f))
models[[as.character(f)]] <- lmer(form, data = data)
}
model_results <- bind_rows(lapply(models, FUN = function(m){
tidy(m, "fixed") |> mutate(model_id = "means",
isSingular = isSingular(m),
convergenceCode = get_convergence_code(m),
icc = as.data.frame(VarCorr(m))$vcov[1]/sum(as.data.frame(VarCorr(m))$vcov),
total_var = sum(as.data.frame(VarCorr(m))$vcov)) |>
pivot_wider(id_cols = c(isSingular, convergenceCode, icc, total_var),
names_from = term,
values_from = c(estimate, std.error, statistic, df, p.value))
}), .id = "feature_id")
# good for plotting
model_results_long <- model_results |>
pivot_longer(
cols = starts_with("estimate_") | starts_with("std.error_") | starts_with("statistic_") | starts_with("df_") | starts_with("p.value_"),
names_to = c("model_item", "parameter"),
names_pattern = "(.*)_(.*)",
values_to = "value")
return(list(models = models,
model_results = model_results,
model_results_long = model_results_long))
}
Now let’s summarize the pixel means per sample by region of interest (ROI).
# Summarize pixel means per sample by ROI
mse <- summarizeFeatures(mse,
    stat = c("mean", "nnzero"),
    na.rm = TRUE,
    verbose = FALSE)
mse <- summarizeFeatures(mse,
    stat = c("mean", "nnzero"),
    groups = mse$run,
    na.rm = TRUE,
    verbose = FALSE)
Next, we will extract the feature data from the MSE objects (roughly,
a data frame of each sample’s mean feature intensity by ROI) and add the
experimental variables to create a modeling data frame. It is important
to ensure that the sample IDs in the data frame match those in the
MSImagingExperiment object, and that the experimental
variables are factors. The models will still run if the variables are
numeric, but they will be treated as continuous rather than categorical
variables, which changes their interpretation.
# Extract means for modeling and specify experimental vars
mse_frame <- as_tibble(fData(mse)) |>
mutate(feature_id = row_number()) |>
pivot_longer(
X6891.OA.lateral.mean:X6954.OA.medial.mean,
names_to = "run",
values_to = "cartilage.mean",
names_pattern = "X(.+)\\.mean$",
names_transform = list(run = ~ str_replace_all(.x, "\\.", "-"))
) |>
left_join(name_tibble, by = join_by(run)) |>
mutate_at(vars(feature_id, subject, condition, tissue), factor) |>
arrange(feature_id, condition, tissue, subject) |>
select(feature_id, mz, run, cartilage.mean, subject, condition, tissue)
Now we can run the models, using the same formula as before.
Cardinal::meansTest takes fixed and random effects as
separate arguments, while lme4::lmer combines them into a
single formula. We will also save the models and results. We can access
a wide table of results with cartilage_res[[2]] and a list
of models with cartilage_res[[1]].
cartilage_res <- model_and_extract(mse_frame, "cartilage.mean ~ condition * tissue + (1|subject)")
save(cartilage_res, file = file.path("Outputs", "cartilage_results.RData"))
rmarkdown::paged_table(head(cartilage_res[[2]]))
As above, many of the models are singular. Let’s explore the ICC values of these models.
knitr::kable(cartilage_res[[2]] |>
summarize(`% models with ICC at or near zero` = sum(icc <= 0.0001)/n()*100, .by = isSingular))
| isSingular | % models with ICC at or near zero |
|---|---|
| TRUE | 100 |
| FALSE | 0 |
As we can see, most of the singular models have ICC values near zero
(less than 0.0001), indicating that the models are not able to
(meaningfully) partition out within-subject variance from residual
variance. Aside from the theoretical issues implied by singularity
(overly complex random effects structure, low biological variance
relative to technical variance, etc.), singularity also interferes with
calculation of degrees of freedom via the Satterthwaite approximation.
This method uses mean squares in the approximation, and when the model
is singular, the mean squares for the random effect will be zero.
lmer is aware of this problem and simply omits the random
effect if singularity is detected, making it equivalent to a fixed
effects model. Because we have the simplest random effects structure
already, this solution is acceptable. However, for more complex random
effects structures, one would instead iteratively remove random effect
components until the model is no longer singular.
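A sketch of that fallback on simulated data: fit the mixed model, and if it is singular, drop the random effect and refit as a plain fixed effects model.

```r
library(lme4)

set.seed(5)
toy <- data.frame(subject   = factor(rep(1:8, each = 2)),
                  condition = factor(rep(c("Control", "OA"), each = 8)),
                  y         = rnorm(16))

fit <- lmer(y ~ condition + (1 | subject), data = toy)
if (isSingular(fit)) {
  # simplest random effects structure already: fall back to fixed effects only
  fit <- lm(y ~ condition, data = toy)
}
```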
Linear mixed effects models make several assumptions that must be checked to ensure statistical validity. There are many ways to express these assumptions, but in general they are that random effects and residuals are independent, are Normally distributed, and have constant variance. We can assess these assumptions with residual quantile-quantile (QQ) plots.
Simple QQ plots can be generated with the following base R functions.
qqnorm(resid(cartilage_res[[1]][[1]]), main = "Q-Q Plot of Residuals")
qqline(resid(cartilage_res[[1]][[1]]), col = "red")
We will generate more informative QQ plots, as well as some other diagnostic plots. This code creates a tibble of residuals and fitted values from each model.
qq_plot_frame <- NULL
for (fid in names(cartilage_res[[1]])) {
model <- cartilage_res[[1]][[fid]]
qq_plot_frame <- bind_rows(qq_plot_frame,
cbind(tibble(residuals = residuals(model),
fitted = fitted(model),
feature_id = fid),
model.frame(model)))
}
Then we can compute theoretical normal quantiles for each model.
qq_plot_frame <-
qq_plot_frame |>
mutate(feature_id = as.numeric(feature_id)) |>
left_join(
tibble(mz = mz(mse)) |>
mutate(feature_id = row_number(),
mz = round(mz, 3)),
by = join_by("feature_id")
) |>
group_by(mz) |>
arrange(residuals) |>
mutate(
samp = residuals,
theo = qnorm(ppoints(n()))
) |>
ungroup()
qq_plot_frame$condition = factor(qq_plot_frame$condition,
levels = c("Osteoarthritis","Control"))
qq_plot_frame$tissue = factor(qq_plot_frame$tissue,
levels = c("Medial","Lateral"))
fids <- unique(qq_plot_frame$feature_id)
Let’s generate interaction plots that will help us see the signal and effect being modeled.
manage_directory("QQ-Itx-SII-Plots")
m <- readMSIData(DATA_PATH) |>
addProcessing(zero_to_na, label="zero_to_na_before_modeling") |>
process(spectra="intensity", index="mz", BPPARAM = bp_param(8L))
feature_ids <- fids
selected_mz <- fData(m)[feature_ids, , drop = FALSE]
m <- subsetFeatures(m, mz %in% selected_mz$mz)
k <- as.data.table(t(as.matrix(intensity(m))))
setnames(k, paste0("feature_", feature_ids))
pdata <- as.data.table(pData(m))[, .(x, y, tissue_zone, run, subject, condition, tissue)]
result <- cbind(pdata, k)
melted <- melt(result, id.vars = c("x", "y", "tissue_zone", "run", "subject", "condition", "tissue"),
measure.vars = patterns("^feature_"),
variable.name = "feature",
value.name = "intensity")
melted[, feature := sub("^feature_", "", feature)]
sii <- function(fid, bone_alpha = 0.3, background_alpha = 1){
df <- melted |>
filter(feature == as.character(fid)) |>
left_join(name_tibble |> select(run, tid), by = c("run")) |>
mutate(
intensity = ifelse(tissue_zone == "background", NA, intensity),
alpha_val = dplyr::case_when(
tissue_zone == "cartilage" ~ 1,
tissue_zone == "bone" ~ bone_alpha,
TRUE ~ background_alpha
),
tid = factor(tid, levels = c("6891-OM",
"6891-OL",
"6919-CM",
"6919-CL",
"6908-OM",
"6908-OL",
"6924-CM",
"6924-CL",
"6949-OM",
"6949-OL",
"6938-CM",
"6938-CL",
"6954-OM",
"6954-OL",
"6944-CM",
"6944-CL"))
)
df|>
arrange(subject, desc(tissue)) |>
ggplot(aes(x = x, y = -1 * y, fill = intensity, alpha = alpha_val)) +
geom_tile() +
coord_equal() +
theme_void() +
xlab(NULL) +
ylab(NULL) +
facet_wrap(~ tid, ncol = 4) +
scale_fill_viridis_c(
option = "turbo",
begin = 0,
end = 1,
na.value = "#bdbdbd"
) +
scale_alpha_identity(guide = "none") +
theme(
legend.position = "bottom",
panel.spacing.x = unit(0, "pt"), # gap between columns
panel.spacing.y = unit(1, "pt") # gap between rows
) +
labs(fill = "Intensity")
}
itx_fancy <- function(tmp_df){
tmp_df_alt <- tmp_df |>
left_join(name_tibble |>
select(condition, tissue, subject, tid) |>
mutate(subject = as.factor(subject)),
by = join_by(condition, tissue, subject))|>
mutate(tissue = factor(tissue, levels = c("Medial","Lateral")),
condition = factor(condition, levels = c("Osteoarthritis","Control")))
p <- tmp_df_alt |>
ggplot(aes(x = tissue, y = cartilage.mean, group = subject, color = condition, fill = condition)) +
ylab("Mean Intensity") +
xlab(NULL) +
#ylim(global_floor, global_ceiling) +
theme_classic() +
theme(legend.position = 'bottom') +
stat_summary(aes(group = condition, color = condition),fun = mean, geom = "point",
shape = 21, size = 3.5, fill = "white", stroke = 1.5, alpha = .6) +
stat_summary(aes(group = condition, color = condition), fun = mean, geom = "line",
linewidth = 3.25, alpha = 0.3, lineend = "round") +
geom_point() +
geom_line()
p +
geom_text_repel(data = bind_rows(tmp_df_alt |>
filter(tissue == "Lateral"),
tmp_df_alt |>
filter(tissue == "Lateral") |>
mutate(suffix = ifelse(condition == "Control", "CL", "OL")) |>
group_by(condition) |>
summarise(cartilage.mean = mean(cartilage.mean, na.rm = TRUE),
tissue = "Lateral",
suffix = dplyr::first(suffix),
tid = paste0("Mean ", suffix))|>
ungroup()),
aes(label = tid),
color = "black",
hjust = -0.4,
vjust = 0.5,
size = 2.5,
segment.alpha = 0) +
geom_text_repel(data = bind_rows(tmp_df_alt |>
filter(tissue == "Medial"),
tmp_df_alt |>
filter(tissue == "Medial") |>
mutate(suffix = ifelse(condition == "Control", "CM", "OM")) |>
group_by(condition) |>
summarise(cartilage.mean = mean(cartilage.mean, na.rm = TRUE),
tissue = "Medial",
suffix = dplyr::first(suffix),
tid = paste0("Mean ", suffix)) |>
ungroup()),
aes(label = tid),
color = "black",
hjust = 1.4,
vjust = 0.5,
size = 2.5,
segment.alpha = 0) +
labs(color = "Condition", fill = "Condition")
}
This code will generate a set of plots for every model, saving them
in the ./QQ-Itx-SII-Plots directory.
for (f in fids) {
tmp_df <- qq_plot_frame |> filter(feature_id == f)
qqs <- ggplot(tmp_df, aes(theo, samp, shape = tissue, color = condition)) +
geom_point(size = 3, alpha = 0.7) +
geom_abline(slope = 1, intercept = 0) +
labs(
x = "Theoretical (Normal) Quantiles",
y = "Residual Quantiles",
shape = "Tissue",
color = "Condition"
) +
theme_linedraw() +
theme(
legend.position = "bottom",
legend.box = "horizontal" # color + shape guides side-by-side
) +
guides(
color = guide_legend(
direction = "vertical", # entries stacked
ncol = 1,
byrow = TRUE,
title.position = "left" # title left of entries
),
shape = guide_legend(
direction = "vertical",
ncol = 1,
byrow = TRUE,
title.position = "left"
)
) +
theme(legend.key.width = unit(0.4, "cm"),
legend.spacing.x = unit(0.1, "cm"))
var_condition <- ggplot(tmp_df, aes(y = condition, x = residuals)) +
geom_boxplot(aes(group = condition), outlier.shape = NA, color = "black") +
geom_point(aes(shape = tissue, color = condition), size = 3, alpha = 0.7) +
labs(
x = NULL,
y = "Condition",
shape = "Tissue",
color = "Condition"
) +
theme_linedraw() +
theme(
legend.position = "none",
axis.text.y = element_text(angle = 90, hjust = 0.5, vjust = 0.5))
var_tissue <- ggplot(tmp_df, aes(y = tissue, x = residuals)) +
geom_boxplot(aes(group = tissue), outlier.shape = NA, color = "black") +
geom_point(aes(shape = tissue, color = condition), size = 3, alpha = 0.7) +
labs(
x = NULL,
y = "Tissue",
shape = "Tissue",
color = "Condition"
) +
theme_linedraw() +
theme(
legend.position = "none",
axis.text.y = element_text(angle = 90, hjust = 0.5, vjust = 0.5))
var_boxplots <- ggarrange(var_condition, var_tissue, ncol = 2) |>
annotate_figure(top = "Distribution of residuals by independent variable",
bottom = "Residuals",
fig.lab = "D",
fig.lab.size = 20,
fig.lab.face = "bold") +
theme(panel.border = element_rect(colour = "#999999", fill=NA, linewidth=0.25))
qqs <- annotate_figure(qqs, top = "Quantile-quantile plots of residuals",
fig.lab = "A",
fig.lab.size = 20,
fig.lab.face = "bold") +
theme(panel.border = element_rect(colour = "#999999", fill=NA, linewidth=0.25))
boxps <- itx_fancy(tmp_df)
boxps <- annotate_figure(boxps, top = "Mean intensity by tissue and condition",
fig.lab = "B",
fig.lab.size = 20,
fig.lab.face = "bold")+
theme(panel.border = element_rect(colour = "#999999", fill=NA, linewidth=0.25))
sii_plot <- annotate_figure(sii(f), top = "Single ion images with cartilage highlighted",
fig.lab = "C",
fig.lab.size = 20,
fig.lab.face = "bold")+
theme(panel.border = element_rect(colour = "#999999", fill=NA, linewidth=0.25))
ggarr <- ggarrange(ggarrange(qqs,
boxps,
nrow = 2),
ggarrange(sii_plot, var_boxplots, nrow = 2, heights = c(5,3)),
ncol = 2,
widths = c(1,2)) |>
annotate_figure(bottom = paste0("Feature: ", round(unique(tmp_df$mz)[1], 3), " m/z" ))
ggsave(file.path("QQ-Itx-SII-Plots",paste0("QQ-", f ,".png")), plot = ggarr, height = 9, width = 11, units = "in")
if (round(unique(tmp_df$mz)[1], 2) == 1438.66) {
ggsave(file.path("Recreated-Figures","Figure-Statistical-Modeling-Good-Fit.png"), plot = ggarr, height = 9, width = 11, units = "in")
}
}
Let’s take a look at the last one:
ggarr
Each plot can offer us some understanding of the fit of the model to our data.
Here is the manual part: since we only have 124 models, we can
manually diagnose assumption fit for each one using the set of plots in
the ./QQ-Itx-SII-Plots directory. I (the author) have done
this manual classification and added my determinations to
Outputs/feature-model-fit-evaluation.csv. We will refrain
from testing our hypotheses on models with poor fit, as we can no longer
be sure those tests would yield accurate results.
To demonstrate common QQ plot shapes, and especially what is considered “Normal” with only 16 samples, we can simulate some examples below.
set.seed(2025)
n <- 16
normal <- rnorm(n, mean = 0, sd = 1)
exp_dist <- rchisq(n, df = .5) # right-skewed (chi-squared) draws
normal_with_outlier <- c(rnorm(n - 1, mean = 0, sd = 1), 5)
qq_data <- function(x) {
qq <- qqnorm(x, plot.it = FALSE)
data.frame(
sample = qq$y,
theoretical = qq$x
)
}
df <- bind_rows(
qq_data(normal) |> mutate(scenario = "QQ plot from Normally distributed data"),
qq_data(exp_dist) |> mutate(scenario = "QQ plot from Right-skewed data"),
qq_data(log2(exp_dist)) |> mutate(scenario = "QQ plot from Right-skewed data\nafter log transformation")
) |>
mutate(scenario = factor(scenario, levels = c("QQ plot from Normally distributed data",
"QQ plot from Right-skewed data",
"QQ plot from Right-skewed data\nafter log transformation")))
ggplot(df, aes(sample = sample)) +
stat_qq() +
stat_qq_line(distribution = stats::qnorm) +
facet_wrap(~ scenario, scales = "free") +
labs(x = "Theoretical (Normal) Quantiles",
y = "Residual Quantiles")
ggsave(file.path("Recreated-Figures", "SI-Figure-Statistical-Modeling-QQ-Examples.png"), width = 8, height = 4)
Let us also look at Normality when we only have 8 biological replicates. Our models assume that the distribution of random effects is normal overall, and within each experimental variable. (In this example, we only have condition.)
set.seed(2025)
df <- NULL
n = 8
for (s in 1:8) {
df <- bind_rows(df,
tibble(sample = rnorm(n, mean = 0, sd = 1)) |>
mutate(subject = s,
condition = ifelse(s <= 4, "Control", "OA")))
}
ggarrange(
ggplot(df, aes(y = sample, x = condition)) +
geom_boxplot(alpha = 0.5, outliers = FALSE) +
geom_jitter(width = 0.1, alpha = 0.2) +
labs(y = "Residuals",
x = "Condition") +
ylim(-2,3),
ggplot(df, aes(y = sample, x = as.factor(subject))) +
geom_boxplot(alpha = 0.5, outliers = FALSE) +
geom_jitter(width = 0.1, alpha = 0.2) +
labs(x = "Subject",
y = "Residuals")+
ylim(-2,3),
nrow = 1, labels = c("A", "B"), widths = c(.35, 1))
ggsave(file.path("Recreated-Figures", "SI-Figure-Statistical-Modeling-Constant-Variance.png"), width = 6, height = 4)
As we can see, it is generally simple to diagnose serious deviations from model assumptions. However, even when we sample data from a genuinely Normal distribution, small sample size and expected sampling variation can inject uncertainty and apparent abnormality.
Additionally, Normality of a sample is subjective and is better described as a continuous scale from clearly abnormal to clearly normal. If we accept that all sampled data show some variation, and that model fit is not dichotomously good or bad but a continuous spectrum from well fit to poorly fit, then so long as deviations are not serious, we can rely on the robustness of linear models to handle small departures from their assumptions.