New bar stacking with ggplot 2.2.0

Reading time ~3 minutes

Recently, ggplot2 2.2.0 was released. Among other news, stacking bar plot was improved. Here is a short demonstration.

Load libraries

library(tidyverse)
library(htmlTable)

… and load data:

data <- read.csv("https://osf.io/meyhp/?action=download")

DOI for this piece of data is 10.17605/OSF.IO/4KGZH.

The data consists of results of a survey on extraversion and associated behavior.

Say, we would like to visualize the responsed to the extraversion items (there are 10 of them).

So, let’s see. First, compute summary of the responses.

data %>% 
  select(i01:i10) %>% 
  gather %>% 
  dplyr::count(key, value) %>% 
  ungroup -> data2
  
htmlTable(head(data2))
key value n
1 i01 1 7
2 i01 2 39
3 i01 3 230
4 i01 4 223
5 i01 2
6 i02 1 176

Ok, so now let’s plot.

ggplot(data2) +
  geom_col(aes(x = key, fill = value, y = n))

plot of chunk unnamed-chunk-4

Note that geom_col is short-hand vor geom_bar(stat = "identity").

Of interest, “negative” stacking is now possible.

data %>% 
  select(i01:i10) %>% 
  gather %>% 
  na.omit %>% 
  count(key, value) %>% 
  ungroup %>% 
  group_by(key) %>% 
  mutate(rel_n = round(n - mean(n), 2)) %>% 
  ungroup -> data3



htmlTable(head(data3))
key value n rel_n
1 i01 1 7 -117.75
2 i01 2 39 -85.75
3 i01 3 230 105.25
4 i01 4 223 98.25
5 i02 1 176 51.5
6 i02 2 224 99.5
ggplot(data3) +
  geom_col(aes(x = key, fill = value, y = rel_n))

plot of chunk unnamed-chunk-6

Hm, at it comes to the color, we need to change to discrete colors. It appears that the color scheme is not able (out of the box) to adapt to this “negative stacking”.

So:

ggplot(data3) +
  geom_col(aes(x = key, fill = factor(value), y = rel_n))

plot of chunk unnamed-chunk-7

sessionInfo()
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: macOS Sierra 10.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] purrr_0.2.2.9000 readr_1.0.0      tibble_1.2       tidyverse_1.0.0 
## [5] knitr_1.15       htmlTable_1.7    ggplot2_2.2.0    tidyr_0.6.0     
## [9] dplyr_0.5.0     
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.7         magrittr_1.5        munsell_0.4.3      
##  [4] colorspace_1.3-0    R6_2.2.0            highr_0.6          
##  [7] stringr_1.1.0       plyr_1.8.4          tools_3.3.2        
## [10] grid_3.3.2          gtable_0.2.0        DBI_0.5-1          
## [13] htmltools_0.3.5     yaml_2.1.13         lazyeval_0.2.0.9000
## [16] assertthat_0.1      digest_0.6.10       rprojroot_1.1      
## [19] rsconnect_0.5       evaluate_0.10       rmarkdown_1.1.9016 
## [22] labeling_0.3        stringi_1.1.2       scales_0.4.1       
## [25] backports_1.0.4

Wie gut schätzt eine Stichprobe die Grundgesamtheit?

# DatenSie arbeiten bei der Flughafenaufsicht von NYC. Cooler Job.```rlibrary(nycflights13)data(flights)```## Pakete laden```rlibrary(mos...… Continue reading

Some thoughts on tidyveal and environments in R

Published on November 16, 2017

Yart - Yet Another Markdown Report Template

Published on November 15, 2017