MANOVA (Multivariate Analysis of Variance) and MANCOVA (Multivariate Analysis of Covariance) are statistical techniques used to compare group means on multiple dependent variables.
Introduction
MANOVA is an extension of ANOVA that handles multiple outcome variables simultaneously.
MANCOVA extends MANOVA by including one or more continuous covariates, allowing us to control for their influence.
Use case
For example, we might want to explore whether different types of training programs (Control vs. Aerobic vs. Strength) influence two performance measures (SkillTest1 for accuracy and SkillTest2 for agility).
With > 1 DVs, this is a case for MANOVA.
Use case
But, we might also want to explore how the inclusion of a covariate (e.g., Age) might alter the group comparisons.
That’s a case for MANCOVA.
Exploring the Data
Basic Inspection
Show code
# Quick look at data structurehead(df)
ID TrainingGroup Age SkillTest1 SkillTest2
1 1 Control 32 77 36
2 2 Control 31 77 33
3 3 Control 20 68 34
4 4 Control 27 68 38
5 5 Control 35 80 35
6 6 Control 28 79 47
Show code
summary(df)
ID TrainingGroup Age SkillTest1 SkillTest2
Min. : 1.00 Aerobic :20 Min. :18.00 Min. :49.00 Min. :27.0
1st Qu.:15.75 Control :20 1st Qu.:23.75 1st Qu.:60.00 1st Qu.:38.0
Median :30.50 Strength:20 Median :27.00 Median :64.00 Median :41.0
Mean :30.50 Mean :27.05 Mean :65.27 Mean :41.6
3rd Qu.:45.25 3rd Qu.:31.00 3rd Qu.:71.00 3rd Qu.:45.0
Max. :60.00 Max. :35.00 Max. :83.00 Max. :62.0
Notes
We have 3 groups (“Control”, “Aerobic”, “Strength”) of participants. There are 60 total participants (20 in each group).
Each participant has two performance measures (DVs): SkillTest1 and SkillTest2.
Age serves as a continuous covariate that may affect performance.
Visualising Distributions
We can create boxplots to compare the distribution of SkillTest1 and SkillTest2 across the three training groups…
Visualising Distributions
Show code
par(mfrow =c(1, 2))boxplot(SkillTest1 ~ TrainingGroup, data = df,main ="SkillTest1 by TrainingGroup",ylab ="SkillTest1 Score", xlab ="Training Group")boxplot(SkillTest2 ~ TrainingGroup, data = df,main ="SkillTest2 by TrainingGroup",ylab ="SkillTest2 Score", xlab ="Training Group")
We might also want to see how Age relates to each skill test.
Show code
plot(df$Age, df$SkillTest1, col = df$TrainingGroup, pch =19, xlab ="Age", ylab ="SkillTest1 Score", main ="Age vs. SkillTest1")legend("topleft", legend =levels(df$TrainingGroup), col =1:3, pch =19, bty ="n")
MANOVA
Conceptual Overview
MANOVA evaluates whether there is a statistically significant difference in the mean vector of multiple dependent variables across groups.
In our example, the dependent variables are SkillTest1 and SkillTest2, and the factor is TrainingGroup.
Fitting the MANOVA Model
We use the base R function manova() to fit a MANOVA model.
Note that we place the two outcome variables in cbind() and specify TrainingGroup as the predictor.
Show code
# Fit the MANOVAmanova_model <-manova(cbind(SkillTest1, SkillTest2) ~ TrainingGroup, data = df)
Look at the p-values to see if the effect of TrainingGroup is significant on both dependent variables jointly.
A small p-value (e.g., < 0.05) suggests at least one group differs from the others in the joint distribution of SkillTest1 and SkillTest2.
Post-hoc Investigation
But where do the differences lie?
Using post-hoc tests
If the MANOVA is significant, we should do post-hoc checks to understand where the differences lie.
Univariate ANOVAs for each skill test
summary.aov(manova_model)
Response SkillTest1 :
Df Sum Sq Mean Sq F value Pr(>F)
TrainingGroup 2 2698.5 1349.27 48.212 5.552e-13 ***
Residuals 57 1595.2 27.99
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Response SkillTest2 :
Df Sum Sq Mean Sq F value Pr(>F)
TrainingGroup 2 775.6 387.80 15.624 3.889e-06 ***
Residuals 57 1414.8 24.82
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This gives us separate ANOVAs for SkillTest1 and SkillTest2 by TrainingGroup.
Pairwise Comparisons (e.g., using TukeyHSD()) on each ANOVA model let us see which groups differ significantly for each skill test…
Show code
# Tukey's HSD for each dependent variable# 1. Fit separate aov() models for each outcomeaov_skill1 <-aov(SkillTest1 ~ TrainingGroup, data = df)aov_skill2 <-aov(SkillTest2 ~ TrainingGroup, data = df)# Apply TukeyHSD on each model, specifying factor of interesttukey_skill1 <-TukeyHSD(aov_skill1, "TrainingGroup")tukey_skill2 <-TukeyHSD(aov_skill2, "TrainingGroup")# Inspect pairwise comparisons and p-valuestukey_skill1
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = SkillTest1 ~ TrainingGroup, data = df)
$TrainingGroup
diff lwr upr p adj
Control-Aerobic 13.3 9.274302 17.325698 0.0000000
Strength-Aerobic -1.7 -5.725698 2.325698 0.5697806
Strength-Control -15.0 -19.025698 -10.974302 0.0000000
Show code
tukey_skill2
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = SkillTest2 ~ TrainingGroup, data = df)
$TrainingGroup
diff lwr upr p adj
Control-Aerobic -4.1 -7.8912393 -0.3087607 0.0311497
Strength-Aerobic 4.7 0.9087607 8.4912393 0.0115073
Strength-Control 8.8 5.0087607 12.5912393 0.0000020
Notes
We re-fit (or explicitly fit) the univariate models:
aov_skill1 for SkillTest1 ~ TrainingGroup
aov_skill2 for SkillTest2 ~ TrainingGroup
We run TukeyHSD() on each model, specifying “TrainingGroup” as the factor.
We review the results (tukey_skill1 and tukey_skill2) to identify which group means differ significantly for each outcome measure.
MANCOVA
Including a covariate…
Overview
MANCOVA is similar to MANOVA but includes one or more covariates.
By including a covariate (e.g., Age), we can:
Control for the effect of age on the dependent variables.
See whether group differences remain after adjusting for age.
What’s a covariate?
A variable that is not the primary focus of a study but may influence the outcome
Include it to help to control for variability in statistical models
e.g., age, BMI, previous injury history, home vs. away, temperature
Fitting the MANCOVA Model
We add Age as an additional predictor in our formula:
Show code
mancova_model <-manova(cbind(SkillTest1, SkillTest2) ~ TrainingGroup + Age, data = df)summary(mancova_model, test ="Wilks")
Now the model tests whether TrainingGroupstill has a significant effect on the combined outcomes after controlling for Age.
We can also see if Age itself is significant in the multivariate model.
Post-hoc for MANCOVA
As before, we can do post-hoc tests to check where the differences lie:
Show code
summary.aov(mancova_model)
Response SkillTest1 :
Df Sum Sq Mean Sq F value Pr(>F)
TrainingGroup 2 2698.53 1349.27 58.743 1.778e-14 ***
Age 1 308.95 308.95 13.451 0.0005466 ***
Residuals 56 1286.25 22.97
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Response SkillTest2 :
Df Sum Sq Mean Sq F value Pr(>F)
TrainingGroup 2 775.60 387.80 15.7384 3.769e-06 ***
Age 1 34.94 34.94 1.4181 0.2387
Residuals 56 1379.86 24.64
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This provides separate univariate ANCOVA models for SkillTest1 and SkillTest2.
It shows the effect of TrainingGroup while controlling for Age.
It shows the effect of Age.
(If an interaction TrainingGroup:Age is theoretically relevant, you could include it in the model.)
Reporting the MANCOVA
A multivariate analysis of variance (MANOVA) was conducted to examine the effects of Training Group and Age on performance in two skill tests.
Results indicated a significant main effect of Training Group on both SkillTest1 (F(2, 56) = 58.74, p < .001) and SkillTest2 (F(2, 56) = 15.74, p < .001), suggesting that training group membership significantly influenced performance on both measures.
Age was also a significant predictor for SkillTest1 (F(1, 56) = 13.45, p = .001), but not for SkillTest2 (F(1, 56) = 1.42, p = .24).
These findings suggest that while training group consistently affects skill performance, age may play a role in some but not all skill outcomes.
Checking Assumptions
Assumptions
Both MANOVA and MANCOVA share a set of assumptions:
Multivariate normality of dependent variables (within each group).
Homogeneity of covariance matrices (the covariance structure of the dependent variables is the same across groups).
Linearity and Independence
For MANCOVA specifically, linear relationship between the covariate(s) and each dependent variable.
In practice, diagnostic plots or tests (e.g., Box’s M test from the heplots package) help verify the homogeneity of covariance matrices.
Visualising Results
Mean Profiles by Group
We can visualise how the groups differ on each skill test (adjusted or unadjusted for age) in various ways.
One simple approach is to look at group means with error bars…
Show code
library(dplyr)library(ggplot2)df_summary <- df %>%group_by(TrainingGroup) %>%summarise(mean_SkillTest1 =mean(SkillTest1),sd_SkillTest1 =sd(SkillTest1),mean_SkillTest2 =mean(SkillTest2),sd_SkillTest2 =sd(SkillTest2) )# Plot SkillTest1 meansp1 <-ggplot(df, aes(x = TrainingGroup, y = SkillTest1)) +stat_summary(fun.data = mean_sdl, fun.args =list(mult =1), geom ="errorbar", width =0.2) +stat_summary(fun = mean, geom ="point", size =3, color ="blue") +labs(title ="Group Means for SkillTest1", y ="Mean SkillTest1") +theme_minimal()# Plot SkillTest2 meansp2 <-ggplot(df, aes(x = TrainingGroup, y = SkillTest2)) +stat_summary(fun.data = mean_sdl, fun.args =list(mult =1), geom ="errorbar", width =0.2) +stat_summary(fun = mean, geom ="point", size =3, color ="red") +labs(title ="Group Means for SkillTest2", y ="Mean SkillTest2") +theme_minimal()p1
Notes
Blue points: Mean SkillTest1 per group with ±1 SD error bars.
Red points: Mean SkillTest2 per group with ±1 SD error bars.
For MANCOVA results, we might prefer plotting adjusted means (or least-squares means) using packages like emmeans for more advanced visuals.
Emmeans Example
Code
library(emmeans)# Fit linear model (ANCOVA) for SkillTest1 with TrainingGroup as a factor and Age as covariatemod_skill1 <-lm(SkillTest1 ~ TrainingGroup + Age, data = df)# Use emmeans to get estimated marginal means (adjusted means)emm_skill1 <-emmeans(mod_skill1, specs ="TrainingGroup")# Plot the emmeans, including comparisons (i.e., confidence intervals)plot(emm_skill1, comparison =TRUE)
Summary
MANOVA answers: “Do the training groups differ on the combined outcome of SkillTest1 and SkillTest2?”
If significant, inspect univariate ANOVAs to determine which outcome(s) drive the difference.
MANCOVA adds a covariate (e.g., Age):
Determines whether group differences persist after adjusting for a continuous variable.
Reveals whether Age itself has a significant effect on the combined outcomes.
Post-hoc Tests…
If a group effect is found, consider pairwise comparisons (e.g., Control vs. Aerobic, etc.) for each skill test to see which groups differ significantly.