In mtennekes/tabplot: Tableplot, a Visualization of Large Datasets

library(knitr)
library(tabplot)

Timings of Big Data visualization with `tabplot`

We test the speed of the tabplot package on a dataset with over 10,000,000 records. For this purpose we replicate the diamonds dataset from the ggplot2 package 200 times, the multiplier used in the code below (200 × 53,940 = 10,788,000 rows). The diamonds dataset contains 53,940 records and 10 variables.
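As a quick check of the dataset size, the arithmetic can be reproduced in base R. This is only a sketch: it assumes the multiplier of 200 that appears in the test data code (`N <- 200L * n`) and the 53,940 rows quoted above, and the variable names are illustrative rather than part of the vignette.

```r
# Size of the replicated dataset, using the multiplier from the listing (200L).
n_rows <- 53940L        # rows in ggplot2's diamonds dataset, as stated in the text
multiplier <- 200L      # replication factor from `N <- 200L * n`
total_rows <- n_rows * multiplier

total_rows                            # 10788000
format(total_rows, big.mark = ",")    # "10,788,000"
```

The product stays well within R's integer range, so the `L` suffixes in the original code are safe here.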

Create test data

library(ggplot2)
data(diamonds)
## introduce missing values: price is NA for all "Ideal" cuts,
## and cut is NA for a random ~20% of the rows
is.na(diamonds$price) <- diamonds$cut == "Ideal"
is.na(diamonds$cut) <- (runif(nrow(diamonds)) > 0.8)

n <- nrow(diamonds)
N <- 200L * n

## convert to ff format (not enough memory otherwise)
library(ffbase)
diamondsff <- as.ffdf(diamonds)
nrow(diamondsff) <- N

## fill each chunk of n rows with a copy of the original data
for (i in chunk(diamondsff, by=n)){
  diamondsff[i,] <- as.data.frame(diamonds)
}
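The chunked fill above can be illustrated without ff. The following base R sketch is a stand-in, not the vignette's code: an ordinary data frame replaces the `ffdf`, plain integer index ranges replace the ranges that `chunk()` yields, and the toy data and the names `small` and `big` are hypothetical.

```r
# Base R stand-in for the chunked fill: plain data frames instead of ff objects,
# integer index ranges instead of the ranges returned by chunk().
small <- data.frame(x = 1:3, g = factor(c("a", "b", "a")))  # toy "diamonds"
n <- nrow(small)
N <- 4L * n                                                 # replicate 4 times

# Preallocate N rows of missing values (the listing resizes the ffdf instead)
big <- small[rep(NA_integer_, N), , drop = FALSE]
rownames(big) <- NULL

# Fill one block of n consecutive rows per iteration
for (s in seq(1L, N, by = n)) {
  idx <- s:(s + n - 1L)
  big[idx, ] <- small
}

nrow(big)   # 12
big$x       # 1 2 3 1 2 3 1 2 3 1 2 3
```

Filling block by block keeps each assignment the size of the original dataset, which is the reason the original loop steps through the large object in chunks of `n` rows.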

Prepare data

The preparation step is the most time-consuming one: for each column, the rank order is determined.

system.time(
    p <- tablePrepare(diamondsff)
)
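To make "rank order per column" concrete, here is a minimal base R sketch on a toy data frame. It is not how `tablePrepare` is implemented (the source does not show its internals, and it works on ff data); it only shows the idea of computing each column's sort order once so it can be reused. The data and the name `col_order` are hypothetical.

```r
# Sketch: precompute the sort order of every column once, then reuse it.
df <- data.frame(
  price = c(300, 150, NA, 900),
  cut   = factor(c("Good", "Ideal", "Fair", NA),
                 levels = c("Fair", "Good", "Ideal"))
)

# order() sorts a factor by its integer codes; NAs are placed last
col_order <- lapply(df, order, decreasing = TRUE, na.last = TRUE)

col_order$price   # 4 1 2 3
col_order$cut     # 2 1 3 4

# Sorting on any column is now an index lookup rather than a fresh sort
df[col_order$price, ]
```

Paying the sorting cost up front is what makes this step slow and the later steps comparatively cheap.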

Create tableplots

To focus on the processing time of the `tableplot` function, the `plot` argument is set to `FALSE`.

system.time(
    tab <- tableplot(p, plot=FALSE)
)
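The processing that remains once the data are prepared is essentially aggregation: rows are sorted on one column, split into bins, and summarised per bin. The base R sketch below illustrates that idea under stated assumptions. It is not tabplot's implementation, the toy data are random, and `n_bins` is an illustrative value chosen to keep the example small.

```r
# Sketch: aggregate sorted rows into equally sized bins.
set.seed(1)
df <- data.frame(
  carat = round(runif(20, 0.2, 2), 2),
  cut   = factor(sample(c("Fair", "Good", "Ideal"), 20, replace = TRUE))
)

n_bins <- 4L                                # illustrative; real plots use more bins
ord <- order(df$carat, decreasing = TRUE)   # sort on one column
sorted <- df[ord, ]

# Assign consecutive sorted rows to bins of equal size (5 rows each here)
bin <- ceiling(seq_len(nrow(sorted)) * n_bins / nrow(sorted))

bin_means <- tapply(sorted$carat, bin, mean)                 # numeric: mean per bin
bin_frac  <- prop.table(table(bin, sorted$cut), margin = 1)  # categorical: shares per bin
```

Because the rows are sorted in decreasing order of `carat`, `bin_means` cannot increase from one bin to the next, and each row of `bin_frac` sums to 1.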

The following tableplots are built from samples of 100, 1,000, and 10,000 objects per bin, respectively.

system.time(
    tab <- tableplot(p, sample=TRUE, sampleBinSize=1e2, plot=FALSE)
)

system.time(
    tab <- tableplot(p, sample=TRUE, sampleBinSize=1e3, plot=FALSE)
)

system.time(
    tab <- tableplot(p, sample=TRUE, sampleBinSize=1e4, plot=FALSE)
)
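To put the three sample sizes in perspective, the sketch below estimates how many rows each call needs relative to the full dataset. It rests on two assumptions that the source does not state: that the plot has 100 bins, and that the total sample is simply the number of bins times `sampleBinSize`. The row count uses the multiplier of 200 from the test data code.

```r
# Rough sample sizes for the three sampled tableplots (assumptions in comments).
n_bins <- 100L                        # assumption: number of bins in the plot
sample_bin_size <- c(1e2, 1e3, 1e4)   # values used in the calls above
total_rows <- 200 * 53940             # rows in the replicated dataset

sampled <- n_bins * sample_bin_size   # objects needed to fill every bin

data.frame(
  sampleBinSize = sample_bin_size,
  sampled_rows  = sampled,
  share_of_data = sampled / total_rows
)
```

Under these assumptions even the largest sample touches under a tenth of the rows, which gives a sense of why sampling may shorten the processing time.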

mtennekes/tabplot documentation built on March 8, 2021, 6:11 p.m.
