Plot of mean with exact numbers using ggplot2

Reading time ~2 minutes

Often, both in academic research and more business-driven data analysis, we want to compare some (two in many cases) means. We will not discuss here that friends should not let friends plot barplots. Following the advise of Cleveland’s seminal book we will plot the means using dots, not bars.

However, at times we do not simply want the diagram, but we (or someone) is interested in the bare, plain, naked, exact numbers too. So we would like to put the numbers right into the diagram. One way to achieve this is the following:

First, let’s load some data and some packages (in R):

data(tips, package = "reshape2")  # load some data

library(dplyr)
library(tidyr)
library(ggplot2)

Then, summarize the variables (ie., compute means per group). Note that for ggplot (and many other graphing systems) it is necessary that the the variable depicted at (say) the X-axis conforms to one column in the data set. Thus, we often have to change the structure of the data set (but here not…).

tips %>% 
  group_by(sex, smoker) %>% 
  summarise(mean_group = mean(tip)) -> tips2

OK; now let’s plot it with ggplot2:

tips2 %>% 
    ggplot(aes(x = smoker, y = mean_group, 
             color = sex, shape = smoker,
             group = sex,
             label = round(mean_group, 2))) + 
  geom_point() +
  geom_line() +
  geom_text(aes(x = smoker, y = mean_group + 0.03))

The whole syntax can be accessed at Github.

Crashkurs Datenanalyse mit R

# Willkommen zum R-CrashkursNicht jeder liebt Datenanalyse und Statistik... in gleichem Maße! Das ist zumindest meine Erfahrung aus dem U...… Continue reading

Different ways to count NAs over multiple columns

Published on September 08, 2017

Different ways to present summaries in ggplot2

Published on September 08, 2017