multitable-package: Simultaneous manipulation of multiple arrays of data, with...

Description

Data frames are integral to R. They provide a standard format for passing data to model-fitting and plotting functions, and this standard makes it easier for experienced users to learn new functions that accept data as a single data frame. Still, many data sets do not easily fit into a single data frame; data sets in ecology with a so-called fourth-corner problem provide important examples. Manipulating such inherently multiple-table data using several data frames can result in long and difficult-to-read workflows. We introduce the R multitable package to provide new data storage objects called data.list objects, which extend the data.frame concept to explicitly multiple-table settings. Like data frames, data lists are lists of variables stored as vectors; what is new is that these vectors have dimension attributes that make accessing and manipulating them easier. As data.list objects can be coerced to data.frame objects, they can be used with all R functions that accept an object that is coercible to a data.frame.

Details

Package: multitable
Type: Package
Version: 1.5
Date: 2012-11-09
Suggests: MASS, lattice, testthat, arm, ggplot2, rbenchmark, scales, vegan
License: GPL-2
URL: http://multitable.r-forge.r-project.org/
LazyLoad: yes

Note

Using multitable does not require any packages other than those that typically come with R. The suggested packages are needed only for the examples in the help files and vignette, and for package development (i.e., testthat).

Author(s)

Maintainer: Steve Walker <steve.walker@utoronto.ca>

References

Steven C Walker, Guillaume Guenard, Peter Solymos, Pierre Legendre (2012). Multiple-Table Data in R with the multitable Package. Journal of Statistical Software, 51(8), 1-38. URL http://www.jstatsoft.org/v51/i08/

Examples

######################################################
# The package vignette (Walker et al. 2012 JSS) is a
# useful place to start
######################################################
vignette("multitable")


######################################################
# The structure of data lists
######################################################

# load the example data set in data list form.
data(fake.community)
fake.community

# print a summary of the relational structure
# of the data set.
summary(fake.community)
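
# a base-R sketch of the idea behind this structure,
# using only a plain list (an illustration, NOT
# multitable's internal representation; the toy
# names and values are hypothetical).  each variable
# is an array whose named dimensions say which table
# it belongs to.
toy.sites <- c("arctic", "tropical")
toy.species <- c("capybara", "moss", "vampire")
toy <- list(
	abundance=array(c(0, 4, 5, 9, 0, 1), dim=c(2, 3),
		dimnames=list(sites=toy.sites, species=toy.species)),
	temperature=array(c(-10, 30), dim=2,
		dimnames=list(sites=toy.sites)),
	body.size=array(c(140, 5, 190), dim=3,
		dimnames=list(species=toy.species))
)

# which dimensions does each variable vary along?
lapply(toy, function(x) names(dimnames(x)))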

######################################################
# Subscripting data lists
######################################################

# extract two years of data.
fake.community[, c("2008", "2009"), ]

# extraction using both numerical and character
# vectors.
fake.community[1:3, "1537", 1]

# subscripting data lists is designed to be as
# intuitive as possible to R users.  above are
# the examples covered in the manuscript, but
# see the help file for more examples and 
# explanation.
?`[.data.list`
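
# a base-R sketch of the idea behind the subscripts
# above (an illustration, not the `[.data.list`
# implementation; the toy.* and keep.* objects are
# hypothetical).  there is one subscript per
# dimension, and each variable is subset only along
# the dimensions it actually has.
toy.abundance <- matrix(1:6, nrow=2,
	dimnames=list(
		sites=c("arctic", "tropical"),
		species=c("capybara", "moss", "vampire")
	)
)
toy.temperature <- c(arctic=-10, tropical=30)
toy.body.size <- c(capybara=140, moss=5, vampire=190)
keep.sites <- "tropical"
keep.species <- c("moss", "vampire")
toy.subset <- list(
	abundance=toy.abundance[keep.sites, keep.species, drop=FALSE],
	temperature=toy.temperature[keep.sites],
	body.size=toy.body.size[keep.species]
)
toy.subset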

######################################################
# Transforming variables in data lists
######################################################

# transformation occurs much like it would with
# data frames.
fake.community$abundance <- log1p(fake.community$abundance)
fake.community$temperature[, "1537"] <- c(5, 10, 30, 20, -80, -10)	
fake.community$precipitation[, "1537"] <- c(5, 50, 75, 50, 2, 7)
fake.community$body.size["moss"] <- 1
fake.community

######################################################
# Simple analysis functions
######################################################

# we can pass data lists to lm just as we would pass
# data frames.
lm(abundance ~ body.size*temperature, data = fake.community)
lm(abundance ~ homeotherm*temperature, data = fake.community)

# this works for any function that tries to coerce 
# data to a data frame, such as the robust linear
# model function from MASS.
library("MASS")
rlm(abundance ~ body.size*temperature, data = fake.community)

######################################################
# Coercing data lists to data frames
######################################################

# data lists are easily coerced to data frames via
# the as.data.frame method for data.list objects.
fake.community.df <- as.data.frame(fake.community)
fake.community.df[, -6]
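
# a base-R sketch of what such a coercion has to do
# (an illustration of the idea, not multitable's
# implementation; the toy.* objects are hypothetical).
# variables that live on fewer dimensions are
# replicated so that every row of the data frame is
# one site-by-species combination.
toy.abundance <- matrix(c(0, 4, 5, 9), nrow=2,
	dimnames=list(
		sites=c("arctic", "tropical"),
		species=c("capybara", "moss")
	)
)
toy.temperature <- c(arctic=-10, tropical=30)
toy.body.size <- c(capybara=140, moss=5)
toy.df <- as.data.frame(as.table(toy.abundance),
	responseName="abundance", stringsAsFactors=FALSE
)
toy.df$temperature <- toy.temperature[toy.df$sites]
toy.df$body.size <- toy.body.size[toy.df$species]
toy.df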

# therefore, data list objects can easily be passed to
# any R function accepting a data frame, after they
# have been converted to a data frame.
library(lattice)
xyplot(abundance ~ temperature | body.size, data = fake.community.df)

# for further information about coercing in multitable:
?as.data.list

######################################################
# How data lists are made
######################################################

# here are three example objects to be combined into
# a data list.
abundance <- data.frame(
	sites=c(
		"midlatitude", "subtropical", "tropical", "equatorial",
		"arctic", "midlatitude", "tropical", "equatorial",
		"subtropical"
	),
	species=c(rep("capybara", 4),rep("moss", 4), "vampire"),
	abundance=c(4, 10, 8, 7, 5, 6, 9, 3, 1)
)
environment <- data.frame(
	sites=c(
		"subarctic", "midlatitude", "subtropical",
		"tropical", "equatorial"
	),
	temperature=c(0, 10, 20, 50, 30),
	precipitation=c(40, 20, 100, 150, 200)
)
trait <- data.frame(
	species=c("capybara", "moss", "vampire"),
	body.size=c(140, 5, 190),
	metabolic.rate=c(20, 5, 0)
)
abundance
environment
trait

# we use the dlcast function to combine them.
# the dimids argument tells dlcast what dimensions
# (or columns as they are in 'long' format) are 
# shared among tables.  the fill argument tells 
# dlcast how to fill in any structural missing 
# values.
dl <- dlcast(list(abundance, environment, trait),
	dimids=c("sites", "species"),
	fill=c(0, NA, NA)
)
dl
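
# a base-R sketch of the casting step for a single
# long-format table (an illustration of the idea, not
# dlcast's implementation; toy.long and toy.wide are
# hypothetical).  site-species combinations that are
# absent from the long table come back as NA from
# tapply and are then filled with 0, which is our
# reading of the role that the first fill value plays
# for the abundance table above.
toy.long <- data.frame(
	sites=c("arctic", "tropical", "tropical"),
	species=c("moss", "moss", "capybara"),
	abundance=c(5, 9, 8)
)
toy.wide <- with(toy.long,
	tapply(abundance, list(sites=sites, species=species), sum)
)
toy.wide[is.na(toy.wide)] <- 0
toy.wide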

# for other ways to create data list objects, see:
?data.list
?as.data.list
?read.multitable
?variable

stevencarlislewalker/multitable documentation built on May 30, 2019, 4:44 p.m.