Tutorial 9: Mixed models - part 1

March 18, 2025 (9th week of classes)
Data hierarchy & dependency, and the Linear Mixed Effects ANOVA (or mixed model ANOVA)

Remember: The tutorials were produced in such a way that you should be able to easily adapt the codes to your own data!

Simple simulations to help understanding the nature of data hierarchy & dependency

Let’s begin by simulating data for a scenario that does not involve hierarchical structure—such as data typically analyzed using a simple one-way ANOVA. This exercise will help build a clearer understanding of what hierarchical data means.

Consider an experiment designed to assess the effects of temperature on fish growth, as discussed in class. Suppose we have four different temperature levels, with three tanks per temperature level (serving as replicates), and each tank containing 10 fish.

To add realism to our simulated data, let’s assume that fish growth is measured as the difference in weight before and after one week of exposure to the treatment temperatures.

Now, let’s define the key characteristics of this experiment before generating the data.

n.treatments <- 4 
n.replicates.per.treatment <- 3 # Number of tanks per treatment
n.individuals.per.replicate <- 10

The total number of data points for the entire experiment N is then:

N <- n.treatments * n.replicates.per.treatment * n.individuals.per.replicate
N

Let’s generate data in which the null hypothesis is rejected—that is, where mean fish growth differs among treatments. To achieve this, we will assign distinct mean growth values to each of the four temperature treatments as follows:

lowest temperature (treatment 1) fish grew in average 200mg. sub-intermediate temperature (treatment 2) fish grew in average 220mg. intermediate temperature (treatment 3) fish grew in average 250mg. high temperature (treatment 4) fish grew in average 400mg.

We can set these values as follows:

treatment.means <- c(200,220,250,400)

Let’s set the residual error (i.e., standard deviation of residuals σ) to 3 as an example. Since we assume homoscedasticity, this means the residual standard deviation remains constant across the entire experiment—across all replicates (tanks) and temperature levels.

sigma <- 3

Let’s simulate residual variation for each individual fish. This variation represents the random error added to the true temperature effects for each treatment (i.e., the mean effects specified above for each temperature level). If the error is small (i.e., a lower σ), the statistical power to detect true differences among treatments increases.

residual.values <- rnorm(N, 0, sigma)  
residual.values

Now, we need to create a vector of labels that assigns a temperature treatment to each individual fish, just as we would in real data. The command below will repeat the treatment numbers (1 to 4, or 1 to n.treatments) according to the number of replicates (tanks) per treatment (3 tanks per temperature level) and the number of individuals per replicate (10 fish per tank).

treatments <- rep(1:n.treatments, rep(n.replicates.per.treatment * n.individuals.per.replicate, n.treatments)) 
treatments

There are multiple ways to simulate data, including more complex approaches using design matrices. However, for simplicity, we will use a straightforward method by adding residuals to the mean growth values for each treatment.

Before doing so, we first need to create a vector of mean growth values per treatment, ensuring that each of the 120 individual fish is assigned the appropriate mean based on its respective treatment.

means <- rep(treatment.means, rep(n.replicates.per.treatment * n.individuals.per.replicate, n.treatments)) means

Now, let’s generate the response variable Y (i.e., fish growth) as a function of the assigned treatment means and residual variation. Since we are simulating data for an ANOVA design, there is no need to explicitly add a constant—the overall mean of all treatments (267.5, calculated as mean(c(200, 220, 250, 400))) naturally represents the grand mean.

Y is a random sample, but its mean should be close to the true mean of all treatments combined (267.5) due to the way the data is generated.

Y <- means + residual.values
Y
mean(Y)

Let’s look at the simulated data (scroll down to see all values):

View(cbind(Y,treatments))

Let’s plot the data:

boxplot(Y~treatments, col="firebrick", xlab= "Treatments", ylab ="Growth", main= "", las =1, outline=FALSE)

Let’s calculate the mean value per treatment:

tapply(Y, treatments, mean)

Note that the mean growth values within each temperature treatment across the three replicates (tanks) do not differ significantly. Therefore, it is reasonable to group all individual fish together within each treatment, treating them as independent observations regardless of the tank they were in. However, this assumption can be formally tested later in the tutorial.

separate.replicates <- rep(1:12, each=n.individuals.per.replicate) 
separate.replicates

Now, let’s visualize the data by plotting each replicate (tank) separately for each treatment.

Notice that the mean growth per replicate within treatments is very similar, indicating that variation within treatments is random. In contrast, variation among treatments is not random, as we deliberately set different treatment means—leading to the rejection of the null hypothesis (i.e., that growth is equal among treatments).

This aligns with the assumptions of a fixed-effect ANOVA, where differences among treatment means are tested while assuming that within-group variation is due to random noise.

boxplot(Y~separate.replicates, col="firebrick", xlab= "Replicates per treatment", ylab ="Growth", main= "", las =1, outline=FALSE)
abline(v=c(3.5,6.5,9.5))

Vertical lines in the boxplot separate each temperature treatment (and their 3 replicates, i.e., tanks).

Let’s calculate the mean value per treatment per replicate:

matrix(tapply(Y, separate.replicates, mean),3,4)

Note that the mean growth values within each temperature treatment across the three replicates do not differ significantly. Therefore, it is reasonable to treat individual fish as independent observations and analyze them collectively within each treatment.

For these data, we can aggregate individual fish within treatments and proceed with the analysis accordingly. The lm function in R automatically constructs the necessary design matrix for modeling the relationship between treatment groups and the response variable.

anova(lm(Y~treatments))

As expected—given the way the data were simulated—we reject the null hypothesis that all treatments have the same effect on fish growth. This provides evidence that temperature influences fish growth, with growth increasing as temperature rises.

Since we generated the data while adhering to ANOVA assumptions (normality and homoscedasticity), there is no need to formally check these assumptions in this case. However, in real-world scenarios, verifying these assumptions is essential before drawing conclusions from an ANOVA analysis.

Simulating hierarchical data

To help you better understand the nature of hierarchical data and random effects, we will use the same experimental setup as in the fixed-effects ANOVA design above. That is, we will maintain the same structure with four temperature treatments, three replicate tanks per treatment, and ten fish per tank. However, instead of treating individual fish as independent observations within treatments, we will now account for hierarchical structure by considering tanks as a random effect.

n.treatments <- 4 # Number of treatments
n.replicates.per.treatment <- 3 # Number of tanks per treatment
n.individuals.per.replicate <- 10

The total number of data points N is (as before) then:

N <- n.treatments * n.replicates.per.treatment * n.individuals.per.replicate
N

In hierarchical data, replicates within treatments (i.e., growth variation within the same temperature treatment) exhibit differences in their mean values—meaning each aquarium has its own mean. This structure arises when individuals within the same replicate (aquarium) are more similar to each other than to individuals in different aquaria.

In the previous boxplot, we saw that the mean growth values within each temperature treatment were very similar across replicates, with variation driven solely by residual error. However, when data have a strong hierarchical structure, this will no longer be the case.

Now, let’s introduce variation in the mean growth values across replicates within each temperature treatment. In simple mixed-effects models, this variation is typically assumed to be constant across treatments—though in more complex models, variance can differ between treatments. For now, we’ll keep it simple, as this assumption is commonly used in biological studies.

within.treatment.sd <- 5

Generate random variation around the means within each temperature treatment:

treatment.means.rnd <- replicate(3, rnorm(4, mean=treatment.means, sd=within.treatment.sd))
treatment.means.rnd

Note that random variation around mean values within each treatment are in rows of the matrix treatment.means.rnd.

We need know to put this matrix into a format in which we can add random residuals later on to generate random variation within and between treatments:

treatment.means.rnd.t <- c(t(treatment.means.rnd))
treatment.means.rnd.t

Unlike fixed effects, where each treatment has a single, fixed mean, random effects introduce variation at the replicate level. In this hierarchical structure, the mean for each replicate (tank) is a random sample from the overall treatment effect, allowing for natural variation between replicates within the same treatment.

Now, we will generate these random replicate-level means within treatments and then add residual variation to simulate individual fish growth values. This approach captures the hierarchical nature of the data, where replicates are not identical but instead vary around the treatment-level mean.

means <- rep(treatment.means.rnd.t, each=10) 
means

As before, let’s set the residual standard deviation sigma at 1 for example. Note that we set sigma to be much smaller than within.treatment.sd (see below for consequences):

sigma <- 1

Generate residuals:

residual.values <- rnorm(N, 0, sigma) 
residual.values

Generate the response variable Y:

Y <- means + residual.values
Y

Looking at the simulated data

Before doing so, we need to create a label for each replicate (aquarium) across treatments. Since we have 4 temperature treatments, with 3 aquaria per treatment (resulting in 12 aquaria total, labeled 1 to 12), and 10 individual fish per aquarium, we need to ensure that each fish is correctly assigned to its respective aquarium.

Take a moment to review the data structure in View() below to ensure you fully understand how the hierarchical structure is represented.

separate.replicates <- rep(1:12, each=n.individuals.per.replicate) 
View(cbind(Y,treatments,separate.replicates))

Replicates (aquaria) 1 to 3 belong to treatment 1 (low temperatures), replicates 4 to 6 to treatment 2 (sub-intermediate temperatures), and so on. Now, let’s examine the distributions of the simulated data. While treatments still differ in their mean growth, there is now variation in mean growth among aquaria within each treatment—unlike in the first simulation.

boxplot(Y~separate.replicates, col="firebrick", xlab= "Replicates per treatment", ylab ="Growth", main= "", las =1, outline=FALSE)
abline(v=c(3.5,6.5,9.5))

Let’s calculate the mean value per treatment per replicate:

matrix(tapply(Y, separate.replicates, mean),3,4)

Note that, in these data, the mean growth values within each temperature treatment differ somewhat across the three replicates. While there is variation among treatments (temperature levels), the variation within treatments is not entirely random—individuals within the same replicate (tank) tend to be more similar in growth than individuals in other tanks within the same temperature treatment.

This hierarchical structure indicates that a mixed model is needed to assess differences in mean growth among treatments while accounting for variation within treatments (i.e., between aquaria within the same treatment).

Using random and one-factorial mixed-effects models

One of the most widely used and powerful packages for conducting mixed models (incorporating both random and fixed effects) is nlme. We have already used this package in a previous tutorial on Generalized Linear Models (GLM). Now, let’s analyze the simulated data.

If you need to install the nlme package on your computer, simply run:

install.packages("nlme")
library(nlme)

We need first to set each replicate (tank) as a factor which will be then consider as the random factor in the model:

tanks <- as.factor(separate.replicates)
tanks

The intraclass correlation (as seen in class) can be calculated by the gls function in package nlme. It’s called Rho in the output of the function:

gls(Y ~ 1, correlation=corCompSymm(form= ~ 1 | tanks))

As you can see, Rho is very close to one, indicating a strong hierarchical structure in these simulated data. This is because the standard deviation for the population means (between replicates within treatments) was set much higher (within.treatment.sd = 5) than the residual variation within populations (sigma = 1).

To explore this further, try re-running the simulation with a much smaller within.treatment.sd relative to sigma. This would result in a lower intraclass correlation, representing a weaker hierarchical structure in the data.

Now, let’s start by analyzing the random effect only (i.e., variation among replicates within and across treatments) to assess its strength. The model specification ~1 | tanks reflects the hierarchical nature of the data, where tanks are nested within the grand mean (which is represented by the model’s intercept, coded as 1 in nlme).

lme.fit <- lme(Y ~ tanks,random = ~1|tanks)
lme.fit

Since tank (replicate) is treated as a random effect, each replicate is assumed to have its own effect, meaning the model estimates the variation attributed to tanks.

The nlme package does not provide statistical tests for random effects models, and there is ongoing debate in the literature about whether p-values should be estimated for random effects (as discussed in class). However, we can use the car package to perform significance testing. In this case, the random effects model is tested using the chi-square distribution rather than the F-distribution, which is typically used in fixed-effects ANOVAs.

library(car)
Anova(lme.fit)

The random effect is clearly strong and statistically significant, as indicated by the high intraclass correlation.

These data clearly require a model that accounts for variation within treatments when comparing treatments, making a one-factorial mixed-effects ANOVA the appropriate approach.

Before running the mixed-effects model, we first need to set the treatment label as a fixed effect. Since we already have the treatment labels from the initial simulation, we just need to ensure they align with the updated data structure:

treatments <- rep(1:n.treatments, rep(n.replicates.per.treatment * n.individuals.per.replicate, n.treatments)) 
treatments

Let’s now run the mixed model effect with treatments (temperature classes) as a fixed factor and tanks as random:

data.df <- data.frame(Y=Y,treatments=as.factor(treatments),tanks=tanks)
mixed.anova <- lme(Y ~ treatments, random = ~1|tanks,data=data.df)
anova(mixed.anova)

As you can see, despite the strong hierarchical structure in the data (with significant variation in mean growth across aquaria within the same treatment), temperature still has a strong effect on fish growth.

Model diagnostics & surprises

fixed effect ANOVA

One of the challenges with hierarchical data is that they are often heteroscedastic. In this ANOVA design, since mean values per aquarium vary within each treatment level, the residuals from a standard fixed-effects ANOVA may exhibit heteroscedasticity.

Let’s begin by examining the residuals of the original (fixed-effects) ANOVA:

fixed.anova <- lm(Y ~ treatments,data=data.df)
plot(fixed.anova,which=3)

Perhaps is easier to observe this using a boxplot:

boxplot(residuals(fixed.anova) ~ treatments,data=data.df)

Indeed, treatement 2 seems to have greater variance in residuals than the othe treatments and treatment 3 smaller residuals.

As we can see there is variation that differs across treatments. Let’s run a formal statistical test to assess heteroscedasticity:

library(car)
leveneTest(residuals(fixed.anova) ~ treatments,data=data.df)

Levene’s test confirms our initial suspicion—residuals cannot be assumed to be homoscedastic. Interestingly, residuals should still be homoscedastic within tanks, as we assumed a common error term (sigma) for all tanks, even though mean growth varied among tanks within each treatment.

Since our primary interest is the effect of temperature (not tanks, which serve as replicates within treatments), the key concern is heteroscedasticity at the treatment level. With that in mind, let’s examine the variance of residuals within tanks:

boxplot(residuals(fixed.anova) ~ tanks,data=data.df)
abline(v=c(3.5,6.5,9.5))

Notice that residual variation across tanks appears relatively similar. However, the mean residuals per tank vary within treatments, which is what led to heteroscedasticity in the fixed-effects ANOVA when examining residual variation across treatment levels.

leveneTest(residuals(fixed.anova) ~ as.factor(tanks))

When assessing residual variance among tanks, the residuals can indeed be considered homoscedastic. However, since tanks are merely replicates within treatments, our primary interest lies in the treatment effect rather than tank variation. This analysis serves to deepen your understanding of random effects and why they are necessary in hierarchical data modeling.

Let’s assess the normality of the fixed model:

qqnorm(residuals(fixed.anova))
qqline(resid(fixed.anova), main = "Residuals")

The residuals do seem to depart from normality. Remember though that ANOVAs are quite robust against normality but not heteroscedasticity.

mixed effect ANOVA

Let’s run the diagnostics now for the mixed model ANOVA we conducted earlier.

Let’s start with the Q-Q plot to assess normality:

qqnorm(resid(mixed.anova), main = "Residuals")
qqline(resid(mixed.anova), main = "Residuals")

Notice that the residuals for the mixed model appear to be quite normally distributed. This is because incorporating the random effect tanks allowed the model to account for heteroscedasticity among tanks within treatments. As a result, the mixed model reduces the perturbation of residuals that we previously observed in the fixed-effects ANOVA, leading to a more balanced distribution of residuals.

Now, let’s check for heteroscedasticity in the mixed model. Unlike fixed-effect ANOVAs, the lme package does not provide a residual plot based on the square root of the absolute values of residuals, which is commonly used for diagnosing heteroscedasticity.

Instead, lme offers residual plots based on standardized residuals, known as Tukey-Anscombe residual plots. These plots allow us to assess the overall structure of residuals and detect potential patterns of heteroscedasticity in the mixed model.

plot(mixed.anova)

The residual values show no clear pattern with respect to the fitted values, indicating that heteroscedasticity among temperature treatments is not an issue in the mixed model.

If we want to reproduce the same plot based on the square root of the absolute residuals (as traditionally used in fixed-effect ANOVA diagnostics), we need to use another package, as shown below:

install.packages("lattice")
library(lattice)
xyplot(sqrt(abs(resid(mixed.anova))) ~ fitted(mixed.anova), type = c("p", "smooth", "g"),
    col.line = "black")

Let’s do the boxplot:

boxplot(residuals(mixed.anova) ~ as.factor(treatments))

The variation of residuals among treatments do seem much more similar now.

And finally a formal test for residual homoscedasticity:

leveneTest(residuals(mixed.anova) ~ as.factor(treatments))

We can indeed assume that residuals for the mixed model are homoscedastic.

This tutorial was designed to help you understand:

What hierarchical data are.
How hierarchical data often lead to heteroscedasticity in residuals for the main factor.
How mixed models account for variation within treatments across replicates (e.g., tanks) to ensure that model assumptions hold when testing the main effect of interest (in this case, temperature).