Visualising Data & Writing Clear Code

B1700: Week Two

Overview

  • Using ggplot2 to create effective visualisations
  • Writing clean, consistent R code
  • Developing good programming habits from the start
  • Consolidating your skills from Week One

🎯 Learning Objectives

By the end of this session, you should be able to:

  • Construct a basic visualisation using ggplot2
  • Apply consistent and readable code styling
  • Critically evaluate code for clarity and structure
  • Understand how R visualisation compares to tools like Python or Excel

Week One Recap

You’ve had a week to explore RStudio. Let’s check:

  • Can you create a new R script?
  • Can you install and load a package like ggplot2?
  • Do you know how to run code from the script, not just the Console?
  • Have you opened an R Project?

Mini-Challenge (5 mins)

With a partner, open RStudio and do the following:

  1. Open or create a new R Project
  2. Create a new script
  3. Load the ggplot2 package
  4. Create a minimal scatterplot using the mpg dataset
  5. Save the script as week02_intro.R

Quick Review Questions

  1. What does library(ggplot2) do?
  2. What’s the difference between install.packages() and library()?
  3. What does this line produce?
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
  4. What’s the advantage of writing code in a script file rather than directly in the Console?

Why Visualisation Matters

  • Data visualisation is how you communicate your findings.

  • Code style is how you make your logic clear and your work reproducible.

  • Doing these things well early on will pay off later, especially in collaborative or professional environments.

Demo: Building a Plot with ggplot2

We’ll now build a plot together, step by step.

Step 1: Load the Package

# Load the ggplot2 package (install it once beforehand with install.packages("ggplot2"))
library(ggplot2)
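
Installing and loading are separate steps: you install a package once per machine and load it once per session. As a minimal sketch, assuming a CRAN mirror is reachable, the pattern below installs ggplot2 only when it is missing. The requireNamespace() check is an addition for illustration and is not part of the demo itself.

```r
# Install ggplot2 only if it is not already available (a one-off step per machine).
# Assumption: a CRAN mirror is reachable, since install.packages() needs internet access.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load the package (needed once per R session).
library(ggplot2)
```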

Step 2: Start with the Dataset

# Look at the built-in 'mpg' dataset
head(mpg)
  • Each row represents a car model

  • Try View(mpg) to open it in a spreadsheet-like window
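
Alongside head() and View(), a few base R functions give a quick overview of a dataset. This is a small sketch of one possible way to inspect mpg, not a required part of the demo.

```r
library(ggplot2)  # provides the mpg dataset

dim(mpg)    # number of rows and columns
names(mpg)  # column names
str(mpg)    # type and first few values of each column

# Run this interactively to open the help page describing each variable:
# ?mpg
```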

Step 3: Create a Basic Plot

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
  • displ: engine size in litres

  • hwy: highway miles per gallon

  • Each dot = one car model

What’s Happening Here?

  • ggplot(data = mpg) creates the canvas

  • geom_point() adds a layer of points

  • aes() (short for aesthetics) maps variables in the data to visual properties, here the x and y axes
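
The three points above can be sketched by building the plot in two stages and storing each stage in an object. The names canvas and scatter are chosen here for illustration only.

```r
library(ggplot2)

# Stage 1: the canvas only. Printing this shows an empty panel with no points.
canvas <- ggplot(data = mpg)

# Stage 2: add a layer of points, mapping displ to x and hwy to y.
scatter <- canvas +
  geom_point(mapping = aes(x = displ, y = hwy))

# Display each stage to compare them.
print(canvas)
print(scatter)
```

Printing canvas on its own is a useful check that the + really is adding a layer rather than changing the data.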

Step 4: Add Colour by Category

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))
  • Now each point is coloured by car class (SUV, compact, etc.)

  • This lets us spot patterns across categories
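
A common point of confusion at this step is the difference between mapping colour to a variable and setting one fixed colour. As a hedged sketch (the object names and the choice of "blue" are illustrative), the two versions look like this:

```r
library(ggplot2)

# Mapping: colour depends on a variable, so it goes inside aes().
mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Setting: one fixed colour for every point, so it goes outside aes().
fixed <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

print(mapped)
print(fixed)
```

Only the mapped version produces a legend, because only there does colour carry information from the data.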

Step 5: Add Labels and Theme

ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy, color = class)) +
  labs(
    title = "Fuel Efficiency vs Engine Size",
    x = "Engine Displacement (L)",
    y = "Highway MPG",
    color = "Car Type"
  ) +
  theme_minimal()

This version is:

  • Easier to read

  • Better labelled

  • Presentation-ready

Let’s Talk About Code Style

Here’s a version that’s technically correct but hard to read:

ggplot(data=mpg)+geom_point(mapping=aes(x=displ,y=hwy,color=class))+labs(title="Fuel vs Engine",x="Engine",y="MPG",color="Type")+theme_minimal()

And here’s a version that’s better:

ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy, color = class)) +
  labs(
    title = "Fuel Efficiency vs Engine Size",
    x = "Engine Displacement (L)",
    y = "Highway MPG",
    color = "Car Type"
  ) +
  theme_minimal()

Why does it matter?

Code should be readable by humans — not just the computer.

What You Can Try Now

  • Change color = class to color = drv (drive type: front-wheel, rear-wheel, or four-wheel)
  • Add size = cyl inside aes() to show the number of engine cylinders
  • Use facet_wrap(~ class) to split the plot by car type
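
One way these suggestions might be combined is sketched below. The object name faceted is illustrative, and you may prefer to try each change separately first.

```r
library(ggplot2)

faceted <- ggplot(data = mpg) +
  # colour by drive type and size by number of cylinders
  geom_point(aes(x = displ, y = hwy, color = drv, size = cyl)) +
  # one small panel per car class
  facet_wrap(~ class)

print(faceted)
```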

Paired Practice 1: Basic Plot + Code Style

Work in pairs. One laptop between two is fine.

Your task: build a simple plot using mpg, then refactor it for clarity.

Task 1: Build a Scatterplot

Use this structure to start:

ggplot(data = mpg) +
  geom_point(aes(x = cty, y = hwy))

🎯 Extension options:

  • Add color = class or shape = drv
  • Add labs() with a meaningful title
  • Use theme_minimal()

Task 2: Improve the Code Style

Now rewrite the same plot, but with these rules in mind:

  • One function per line
  • Use clear spacing and indentation
  • Add a short comment at the top describing the purpose

💡 You can also assign the plot to an object called fuel_plot and display it with print(fuel_plot) at the end.
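
As a rough sketch of what a tidied script could look like under these rules (the title and axis labels are illustrative choices, not a model answer):

```r
# Purpose: compare city and highway fuel economy for the cars in mpg.
library(ggplot2)

fuel_plot <- ggplot(data = mpg) +
  geom_point(aes(x = cty, y = hwy, color = class)) +
  labs(
    title = "Highway vs City Fuel Economy",
    x = "City MPG",
    y = "Highway MPG",
    color = "Car Type"
  ) +
  theme_minimal()

# Storing the plot in an object means it is only drawn when you ask for it.
print(fuel_plot)
```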

Task 3: Style Reflection

Swap scripts with another pair. Use these questions to guide your review:

  • Is it easy to read and follow the logic?
  • Are variable names and formatting consistent?
  • Does the plot title clearly describe what we’re looking at?

Add a small note of feedback at the end of their script.

🧠 Code Style Mini-Rubric

Use this informal 3-point guide during feedback:

  ✅ Clarity – Clean layout, logical structure
  ✅ Naming – Object names are meaningful and consistent
  ✅ Comments – Brief but helpful explanations

Bonus Challenge (Optional)

Create a new object that filters the dataset before plotting:

compact <- mpg[mpg$class == "compact", ]

Then plot displ vs hwy just for compact cars.

Can you highlight a pattern or insight in the title?
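
A possible starting point for this challenge is sketched below. The title is a neutral placeholder: replace it with the pattern you actually see in the plot.

```r
library(ggplot2)

# Keep only the rows where class is "compact" (base R subsetting).
compact <- mpg[mpg$class == "compact", ]

compact_plot <- ggplot(data = compact) +
  geom_point(aes(x = displ, y = hwy)) +
  labs(
    title = "Compact Cars: Highway MPG by Engine Size",  # placeholder title
    x = "Engine Displacement (L)",
    y = "Highway MPG"
  ) +
  theme_minimal()

print(compact_plot)
```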

Paired Practice 2: Build a Plot from Scratch

This time, you’ll design a new plot that explores a question or pattern of your choice.

Use one of the following datasets:

  • mpg (from ggplot2)
  • diamonds (from ggplot2)
  • penguins (from palmerpenguins, if installed)

Task 1: Explore the Data

Choose a dataset and run:

head(dataset_name)
summary(dataset_name)
View(dataset_name)

Then ask yourself:

What relationships or trends might be interesting to visualise?

What would someone else find easy to understand?

Task 2: Build a Purposeful Plot

Construct your plot using this checklist:

  ✅ Two meaningful variables (e.g. x = numeric, y = numeric or categorical)
  ✅ At least one additional aesthetic (color, size, or shape)
  ✅ Title, axis labels, and a caption using labs()
  ✅ Clean and consistent code formatting
  ✅ Optional: use facet_wrap() to break into subplots

Sample Starter Code

ggplot(data = diamonds) +
  geom_point(aes(x = carat, y = price, color = cut)) +
  labs(
    title = "Diamond Price by Carat and Cut",
    x = "Carat Weight",
    y = "Price (USD)",
    color = "Cut Quality"
  ) +
  theme_minimal()
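
The checklist also asks for a caption, which the starter code leaves out. A minimal sketch of adding one through the caption argument of labs() follows. The caption wording and the object name diamond_plot are illustrative.

```r
library(ggplot2)

diamond_plot <- ggplot(data = diamonds) +
  geom_point(aes(x = carat, y = price, color = cut)) +
  labs(
    title = "Diamond Price by Carat and Cut",
    x = "Carat Weight",
    y = "Price (USD)",
    color = "Cut Quality",
    caption = "Source: diamonds dataset (ggplot2)"  # illustrative wording
  ) +
  theme_minimal()

print(diamond_plot)
```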

Task 3: Peer Sharing and Feedback

Once your plot is complete:

  1. Pair up with another group
  2. Show them your plot, but don’t explain it yet
  3. Ask:
       • What do you think this plot is trying to show?
       • Is it visually clear?
       • Is the code readable and well-structured?
  4. Then share your intention and compare interpretations.

Bonus Challenge (Optional)

If you finish early:

  • Try a new dataset
  • Use geom_boxplot() or geom_smooth()
  • Save your plot as an image:

ggsave("myplot.png", width = 7, height = 5)
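
Without a plot argument, ggsave() saves the most recently displayed plot. The sketch below instead passes a stored plot explicitly and adds a geom_smooth() layer. It makes three illustrative choices: a temporary output file, a linear trend line (method = "lm"), and aes() placed inside ggplot() so that both layers share the same mapping.

```r
library(ggplot2)

trend_plot <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  # straight-line trend with a shaded confidence band
  geom_smooth(method = "lm", formula = y ~ x) +
  theme_minimal()

# Assumption: a temporary file is used so the example does not overwrite anything.
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = trend_plot, width = 7, height = 5)
```

In your own script you would normally replace out_file with a file name inside your R Project, such as the "myplot.png" used above.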

Code Style Refactoring

Now return to your completed script from the previous task.

This is a professional coding clean-up. Treat your code as something you’re submitting or sharing with a team.

Task: Refactor Your Plot Script

Make the code:

  • Readable: One function per line, consistent spacing
  • Commented: Add a short explanation at the top
  • Styled: Use meaningful object names, indentation, and section breaks

Then re-run the code to ensure the output is unchanged.

Before vs After (Example)

Messy version:

ggplot(data=mpg)+geom_point(mapping=aes(x=displ,y=hwy,color=class))+labs(title="fuel vs engine",x="engine",y="mpg")

Refactored version:

# A scatterplot showing how engine size affects fuel efficiency
ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy, color = class)) +
  labs(
    title = "Fuel Efficiency by Engine Size",
    x = "Engine Displacement (L)",
    y = "Highway MPG",
    color = "Vehicle Class"
  ) +
  theme_minimal()

Group Discussion – Reflecting on Code and Clarity

In groups of 3–4, discuss:

  • What helped make your final plot readable and meaningful?
  • What surprised you about how ggplot2 worked?
  • In what ways does clean code help when collaborating or revisiting code later?

✍️ Optional Individual Prompt

If you’d rather reflect privately, write 3–4 bullet points on the same questions. You’ll be asked to revisit these later in the semester.

Final Thoughts

Summary: What We Learned Today

Today’s session brought together:

Data Visualisation (R4DS Ch. 1)

  • Introduced the grammar of graphics using ggplot2
  • Built layered plots using ggplot(), aes(), and geoms
  • Explored real datasets to uncover patterns and relationships
  • Used labels, themes, and facets to improve clarity and impact

Workflow and Code Style (R4DS Ch. 2)

  • Practised writing clean, readable, and professional code
  • Used meaningful names, spacing, and comments
  • Learned how good style improves collaboration and future usability
  • Reviewed and refactored each other’s scripts

Week One Recap (Consolidated)

  • Script vs Console: writing code in reusable scripts
  • Loading and managing packages with library()
  • Getting help with ? and help(), and inspecting data with View()
  • Saving your work and using R projects to stay organised

Final Thought

  • Clear code isn’t just about style — it’s about thinking clearly.
  • Every good plot begins with a good question.