
A comparison of Base R, Tidyverse, data.table, and Python's pandas for data manipulation


DenverRUG/2022-07-27-Data-Manipulation

 
 


Data Manipulation

A comparison of Base R, Tidyverse, and data.table.

Each directory contains examples for common data manipulation tasks using different dialects.
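To give a feel for what "different dialects" means, here is a minimal sketch of one task (a grouped mean) written three ways. The toy data frame and its column names (`region`, `kwh`) are made up for illustration and are not taken from the repository's scripts. The Tidyverse and data.table versions run only if those packages are installed.

```r
# Toy data, invented for this sketch (not one of the repository's data sets).
df <- data.frame(
  region = c("west", "east", "west", "east"),
  kwh    = c(10, 20, 30, 40)
)

# Base R: formula interface to aggregate().
base_result <- aggregate(kwh ~ region, data = df, FUN = mean)

# Tidyverse (dplyr): group, then summarise.
if (requireNamespace("dplyr", quietly = TRUE)) {
  tidy_result <- dplyr::summarise(dplyr::group_by(df, region), kwh = mean(kwh))
}

# data.table: compute in j, group with by.
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::as.data.table(df)
  dt_result <- dt[, list(kwh = mean(kwh)), by = region]
}
```

All three produce one row per region; they differ in syntax, in row order, and in the class of the returned object.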

NOTE: all the example scripts are expected to be evaluated from the project root directory.
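One way to confirm the working directory before running a script is sketched below. The helper `is_project_root()` is hypothetical (it is not part of the repository) and it assumes that the presence of the 000_data_sets directory is enough to identify the project root.

```r
# Hypothetical helper: TRUE when `dir` contains the 000_data_sets directory,
# which this sketch treats as the marker of the project root.
is_project_root <- function(dir = getwd()) {
  dir.exists(file.path(dir, "000_data_sets"))
}

if (!is_project_root()) {
  message("Working directory does not look like the project root: ", getwd())
}
```

If the message appears, change the working directory with `setwd()` or start R from the project root.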

R Dependencies

You will need the following packages to reproduce the examples here:

  • tidyverse
  • data.table
  • microbenchmark
  • profmem

Check for, and install if needed, the packages via the following:

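# Packages required by the examples
pkgs <- c("tidyverse", "data.table", "microbenchmark", "profmem")
# installed.packages() returns a matrix; package names are its row names
for (p in pkgs[!(pkgs %in% rownames(installed.packages()))]) {
  install.packages(p)
}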

Data Sets

Data sets used in the examples are provided in the 000_data_sets directory.

Please refer to the noted sources for additional detail on the data sets.

| File | Rows | Columns |
|------|------|---------|
| ./000_data_sets/RECS/2009/recs2009_public.csv | 12,083 | 940 |
| ./000_data_sets/RECS/2015/recs2015_public_v4.csv | 5,686 | 759 |
| ./000_data_sets/RECS/2020/recs2020_public_v1.csv | 18,496 | 601 |
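As a quick sanity check, the dimensions of a loaded file can be compared against the table above. The sketch below uses base R's `read.csv()` and a hypothetical helper, `read_if_present()`, which is not part of the repository. It assumes the script is run from the project root.

```r
# Hypothetical helper: read a CSV if it exists, otherwise return NULL.
read_if_present <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(NULL)
  }
  read.csv(path)
}

recs2009 <- read_if_present("./000_data_sets/RECS/2009/recs2009_public.csv")
if (!is.null(recs2009)) {
  # Rows and columns, to compare with the table of data sets.
  print(dim(recs2009))
}
```

The same check applies to the 2015 and 2020 files by swapping in their paths.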
