Convert list to dataframe

March 08, 2017

A handy function for iterating is purrr::map. It takes a function and applies it to every element of a given vector. This vector can be a data frame (which is, technically, a list) or some other sort of list (normal atomic vectors are fine, too).
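As a minimal sketch of this behaviour, here is map applied to two columns of the built-in mtcars data. The dataset and the column choice are assumptions made only for illustration; they are not the data used in this post.

```r
library(purrr)

# mtcars ships with R; the two columns are picked only for illustration
col_means <- map(mtcars[c("mpg", "cyl")], mean)

is.list(col_means)   # map hands back a list, one element per column
names(col_means)     # the element names are taken from the column names
```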

However, purrr::map is designed to return lists (not dataframes). As an example, take mosaic::favstats, which gives you some favorite statistics for a variable:

library(mosaic)
library(tidyverse)

data(tips, package = "reshape2")
favstats(tips$tip)

##  min Q1 median     Q3 max     mean       sd   n missing
##    1  2    2.9 3.5625  10 2.998279 1.383638 244       0

Note that favstats does not accept several columns/variables as parameters; only one at a time is permitted. Ok, let’s apply favstats to each numeric column of our dataframe:

tips %>%
  select_if(is.numeric) %>%
  map(mosaic::favstats)

## $total_bill
##   min      Q1 median      Q3   max     mean       sd   n missing
##  3.07 13.3475 17.795 24.1275 50.81 19.78594 8.902412 244       0
##
## $tip
##  min Q1 median     Q3 max     mean       sd   n missing
##    1  2    2.9 3.5625  10 2.998279 1.383638 244       0
##
## $size
##  min Q1 median Q3 max     mean        sd   n missing
##    1  2      2  3   6 2.569672 0.9510998 244       0


Quite nice, but we are given back a list with several elements; each element is a “row” of our to-be dataframe. How can we turn this list into a regular dataframe?

This problem can be solved by some sort of repeated rbind. rbind binds rows together, hence the name. But rbind expects the objects to bind as separate arguments, not bundled in one list. This is where do.call comes to help: it calls a function with the elements of a list as its arguments. Effectively, do.call does something like: rbind(my_list[[1]], my_list[[2]], my_list[[3]], ...) (this is pseudo code).
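The pseudo code can be made concrete with a small base R sketch. Note that one_row_summary is a hypothetical helper written for this illustration (it is not part of any package), and the built-in mtcars data stands in for the favstats results.

```r
# Hypothetical helper, not part of any package: one-row summary of a vector
one_row_summary <- function(x) {
  data.frame(min = min(x), mean = mean(x), max = max(x))
}

# A named list of one-row data frames, similar in shape to the map() result
my_list <- lapply(mtcars[c("mpg", "hp", "wt")], one_row_summary)

# Spelling out every list element by hand ...
by_hand <- rbind(my_list[[1]], my_list[[2]], my_list[[3]])

# ... versus letting do.call pass the list elements as arguments
via_do_call <- do.call(rbind, my_list)

dim(via_do_call)        # three rows, one per list element
rownames(via_do_call)   # row names are taken from the list names
```

Both versions should hold the same values. The do.call version has two advantages: it works for a list of any length, and because the list is named, the names are carried over as row names. Back to the tips data, where the same call does the work: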

tips %>%
  select_if(is.numeric) %>%
  map(mosaic::favstats) %>%
  do.call(rbind, .)

##             min      Q1 median      Q3   max      mean        sd   n
## total_bill 3.07 13.3475 17.795 24.1275 50.81 19.785943 8.9024120 244
## tip        1.00  2.0000  2.900  3.5625 10.00  2.998279 1.3836382 244
## size       1.00  2.0000  2.000  3.0000  6.00  2.569672 0.9510998 244
##            missing
## total_bill       0
## tip              0
## size             0


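Note the little dot in do.call(rbind, .): it is the placeholder for whatever comes in from the pipe, here the list returned by map.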
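To see what the dot does, here is a small sketch with two made-up data frames (the names a, b, x and y are invented for illustration). It compares the piped call with the explicit one:

```r
library(magrittr)

# Two small data frames, made up for illustration
my_list <- list(a = data.frame(x = 1, y = 2),
                b = data.frame(x = 3, y = 4))

# Without the dot, the pipe would put my_list into the first argument slot,
# i.e. do.call(my_list, rbind), which is not what we want.
piped <- my_list %>% do.call(rbind, .)

# The explicit call that the piped version stands for
direct <- do.call(rbind, my_list)

identical(piped, direct)
```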

Now be happy with the dataframe :-)