B1700: Week Two
By the end of this session, you should be able to use ggplot2 to create effective visualisations.
You’ve had a week to explore RStudio. Let’s check:
With a partner, open RStudio and do the following:
Load the ggplot2 package
Open the mpg dataset
Save your script as week02_intro.R
What does library(ggplot2) do?
What is the difference between install.packages() and library()?
Data visualisation is how you communicate your findings.
Code style is how you make your logic clear and your work reproducible.
Doing these things well early on will pay off later, especially in collaborative or professional environments.
We’ll now build a plot together with ggplot2, step by step.
Each row represents a car model
Try View(mpg) to open it in a spreadsheet-like window
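For a first look at the data from the console, something like the sketch below should work, assuming ggplot2 is already installed. View() needs an interactive RStudio session, so it is left commented out here.

```r
library(ggplot2)  # mpg ships with ggplot2

# How many rows (car models) and columns (variables)?
dim(mpg)

# First six rows, printed in the console
head(mpg)

# In an interactive RStudio session, this opens the spreadsheet-style viewer
# View(mpg)
```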
displ: engine size in litres
hwy: highway miles per gallon
Each dot = one car model
What’s Happening Here?
ggplot(data = mpg) creates the canvas
geom_point() adds a layer of points
aes() (short for aesthetics) maps variables to axes
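Putting those three pieces together, a minimal version of the plot might look like this. The object name basic_plot is just an illustrative choice, not something fixed by ggplot2.

```r
library(ggplot2)

# ggplot() creates the canvas, geom_point() adds the points,
# and aes() maps displ to the x-axis and hwy to the y-axis
basic_plot <- ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy))

print(basic_plot)
```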
Now each point is coloured by car class (SUV, compact, etc.)
This lets us spot patterns across categories
ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy, color = class)) +
  labs(
    title = "Fuel Efficiency vs Engine Size",
    x = "Engine Displacement (L)",
    y = "Highway MPG",
    color = "Car Type"
  ) +
  theme_minimal()
This version is:
Easier to read
Better labelled
Presentation-ready
Here’s a version that’s technically correct but hard to read:
ggplot(data=mpg)+geom_point(mapping=aes(x=displ,y=hwy,color=class))+labs(title="Fuel vs Engine",x="Engine",y="MPG",color="Type")+theme_minimal()
And here’s a version that’s better:
ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy, color = class)) +
  labs(
    title = "Fuel Efficiency vs Engine Size",
    x = "Engine Displacement (L)",
    y = "Highway MPG",
    color = "Car Type"
  ) +
  theme_minimal()
Why does it matter?
Code should be readable by humans — not just the computer.
Change color = class to color = drv (front-wheel, rear-wheel, or four-wheel drive)
Add size = cyl inside aes() to show engine cylinders
Use facet_wrap(~ class) to split the plot by car type
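The three suggestions above could be tried roughly as follows. This is a sketch, and the object names (drv_plot, size_plot, facet_plot) are made up for illustration.

```r
library(ggplot2)

# 1. Colour by drive type instead of car class
drv_plot <- ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy, color = drv))

# 2. Also map point size to the number of cylinders
size_plot <- ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy, color = drv, size = cyl))

# 3. One small panel per car class
facet_plot <- ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy)) +
  facet_wrap(~ class)

print(drv_plot)
print(size_plot)
print(facet_plot)
```

Try them one at a time first, so it is clear what each change does to the plot.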
Work in pairs. One laptop between two is fine.
Your task: build a simple plot using mpg, then refactor it for clarity.
Use this structure to start:
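A minimal starting structure might look like the sketch below. The object name p and the choice of cty and hwy are assumptions for illustration, so swap in whichever mpg variables you want to compare.

```r
library(ggplot2)

# Starter: replace cty and hwy with any two mpg variables
p <- ggplot(data = mpg) +
  geom_point(aes(x = cty, y = hwy))

print(p)
```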
🎯 Extension options:
Add color = class or shape = drv
Add labs() with a meaningful title
Use theme_minimal()
Now rewrite the same plot, but with these rules in mind:
One function per line
Use clear spacing and indentation
Add a short comment at the top describing the purpose
💡 You can also rename the object to fuel_plot and use print(fuel_plot) at the end.
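Following those rules, a tidied script could look something like this. It is one possible layout rather than the only acceptable one, and the title text is only an example.

```r
# Purpose: show how highway fuel efficiency changes with engine size (mpg data)

library(ggplot2)

fuel_plot <- ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy, color = class)) +
  labs(title = "Fuel Efficiency vs Engine Size") +
  theme_minimal()

print(fuel_plot)
```

Storing the plot in a named object makes it easy to print, tweak, or save later without rebuilding it.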
Swap scripts with another pair. Use these questions to guide your review:
Is it easy to read and follow the logic?
Are variable names and formatting consistent?
Does the plot title clearly describe what we’re looking at?
Add a small note of feedback at the end of their script.
🧠 Code Style Mini-Rubric
Use this informal 3-point guide during feedback:
✅ Clarity – Clean layout, logical structure
✅ Naming – Object names are meaningful and consistent
✅ Comments – Brief but helpful explanations
Create a new object that filters the dataset before plotting:
Then plot displ vs hwy just for compact cars.
Can you highlight a pattern or insight in the title?
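One way to approach this, sketched with base R's subset() (dplyr::filter() would work just as well if dplyr is installed). The names compact_cars and compact_plot are illustrative, and the title is a neutral placeholder to replace with the pattern you actually find.

```r
library(ggplot2)

# Keep only the rows where class is "compact"
compact_cars <- subset(mpg, class == "compact")

compact_plot <- ggplot(data = compact_cars) +
  geom_point(aes(x = displ, y = hwy)) +
  labs(
    title = "Compact Cars: Highway MPG vs Engine Size",  # placeholder title
    x = "Engine Displacement (L)",
    y = "Highway MPG"
  )

print(compact_plot)
```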
This time, you’ll design a new plot that explores a question or pattern of your choice.
Use one of the following datasets:
mpg (from ggplot2)
diamonds (from ggplot2)
penguins (from palmerpenguins, if installed)
Choose a dataset and run:
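A few standard ways to get a feel for a dataset are sketched below, using diamonds as the example. Which functions you prefer is a matter of taste, and the penguins step only runs if palmerpenguins happens to be installed.

```r
library(ggplot2)  # provides mpg and diamonds

# Structure: column names, types, and a few example values
str(diamonds)

# Quick summaries of every column
summary(diamonds)

# penguins is only available if the palmerpenguins package is installed
if (requireNamespace("palmerpenguins", quietly = TRUE)) {
  str(palmerpenguins::penguins)
}
```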
Then ask yourself:
What relationships or trends might be interesting to visualise?
What would someone else find easy to understand?
Construct your plot using this checklist:
✅ Two meaningful variables (e.g. x = numeric, y = numeric or categorical)
✅ At least one aesthetic (color, size, or shape)
✅ Title, axis labels, and a caption using labs()
✅ Clean and consistent code formatting
✅ Optional: use facet_wrap() to break into subplots
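As an illustration of a plot that ticks every box on the checklist, here is a sketch using diamonds. The question it explores (price against carat, split by cut), the object name checklist_plot, and the label wording are all just example choices.

```r
library(ggplot2)

checklist_plot <- ggplot(data = diamonds) +
  # Two numeric variables, plus cut mapped to colour
  geom_point(aes(x = carat, y = price, color = cut), alpha = 0.3) +
  # Optional: one panel per cut
  facet_wrap(~ cut) +
  labs(
    title = "Diamond Price vs Carat, by Cut",
    x = "Carat",
    y = "Price (US dollars)",
    color = "Cut",
    caption = "Data: diamonds dataset from ggplot2"
  ) +
  theme_minimal()

print(checklist_plot)
```

The alpha setting makes the points semi-transparent, which helps when tens of thousands of points overlap.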
Once your plot is complete:
Pair up with another group
Show them your plot — but don’t explain it yet
Ask:
What do you think this plot is trying to show?
Is it visually clear?
Is the code readable and well-structured?
Then share your intention and compare interpretations.
If you finish early:
Try a new dataset
Use geom_boxplot() or geom_smooth()
Save your plot as an image:
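The last two ideas could be combined along these lines: a boxplot built with geom_boxplot(), then written to disk with ggsave(). The file name and the width, height, and dpi values are arbitrary examples.

```r
library(ggplot2)

# Boxplot of highway MPG for each car class
box_plot <- ggplot(data = mpg) +
  geom_boxplot(aes(x = class, y = hwy)) +
  labs(
    title = "Highway MPG by Car Class",
    x = "Car Type",
    y = "Highway MPG"
  )

# ggsave() picks the file type from the extension; sizes are in inches by default
output_file <- "hwy_by_class.png"  # hypothetical file name
ggsave(output_file, plot = box_plot, width = 7, height = 5, dpi = 300)
```

The image is written to your current working directory unless you give a fuller path.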
Now return to your completed script from the previous task.
This is a professional coding clean-up. Treat your code as something you’re submitting or sharing with a team.
Make the code:
Then re-run the code to ensure the output is unchanged.
Messy version:
ggplot(data=mpg)+geom_point(mapping=aes(x=displ,y=hwy,color=class))+labs(title="fuel vs engine",x="engine",y="mpg")
Refactored version:
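Here is one way the refactored version could look, as a sketch rather than the only correct answer. The object names messy_plot and refactored_plot are made up for this example, and the labels are deliberately left unchanged so the output stays the same.

```r
library(ggplot2)

# Messy original, kept for comparison
messy_plot <- ggplot(data=mpg)+geom_point(mapping=aes(x=displ,y=hwy,color=class))+labs(title="fuel vs engine",x="engine",y="mpg")

# Refactored: same plot, one function per line, consistent spacing
refactored_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class)) +
  labs(
    title = "fuel vs engine",
    x = "engine",
    y = "mpg"
  )

# Compare the computed point data behind both plots
identical(layer_data(messy_plot), layer_data(refactored_plot))
```

Comparing the layer data is a quick sanity check that the clean-up changed only the formatting, not the plot itself.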
In groups of 3–4, discuss:
What helped make your final plot readable and meaningful?
What surprised you about how ggplot2 worked?
In what ways does clean code help when collaborating or revisiting code later?
✍️ Optional Individual Prompt
If you’d rather reflect privately, write 3–4 bullet points on the same questions. You’ll be asked to revisit these later in the semester.
Today’s session brought together:
Data Visualisation (R4DS Ch. 1)
ggplot2: ggplot(), aes(), and geoms
Workflow and Code Style (R4DS Ch. 2)
Week One Recap (Consolidated)
library(), ?, help(), and View()