resample.data.frame: Create Replicate Data Sets by Stratified Sampling


Description

resample is generic. A method is defined for data.frame; convenience wrappers are provided for passing the names of files to be read and then resampled.

Usage

## S3 method for class 'character'
as.csv.filename(x, ...)
## S3 method for class 'csv.filename'
resample(x, ...)
## S3 method for class 'filename'
resample(x, ...)
## S3 method for class 'data.frame'
resample(
	x,
	names,
	key = NULL,
	rekey = FALSE,
	out = NULL,
	stratify = NULL,
	ext = '.csv',
	row.names = FALSE,
	quote = FALSE,
	sep = ',',
	replace = TRUE,
	...
)

Arguments

x

a data.frame or, for the csv.filename and filename methods, the name of a file to read

names

a list of names for replicate data sets; can be a simple vector

key

a scalar character value naming the column in x that distinguishes unique individuals (the resampling targets); defaults to row names

rekey

If TRUE, the unique values of key in each resampled data set are replaced with consecutive integers, starting at 1.

out

a (path and) directory in which to write resulting data sets

stratify

A list of factors whose interactions define the levels of stratification; each factor must have length nrow(x). Alternatively, a character vector of column names in names(x).

ext

a file extension

row.names

passed to write.table

quote

passed to write.table

sep

passed to write.table

replace

passed to sample

...

extra arguments, passed to sample and write.table

Details

Typical usages are shown under Examples.

The argument key gives the name of the column in x to identify unique experimental units (individuals). If not supplied, a temporary key is constructed from the row names, and sampling occurs at the row level.
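The difference between key-level and row-level sampling can be sketched in base R. This is an illustration of the idea only, not the package's implementation; the object names (ids, draw, boot, rowboot) are made up for the example, and Theoph is the data set shipped with R.

```r
# Sketch only: base-R illustration of sampling at the level of `key`.
theoph <- as.data.frame(Theoph)              # drop the groupedData classes
theoph$Subject <- as.character(theoph$Subject)

set.seed(1)                                  # reproducible draw
ids  <- unique(theoph$Subject)               # one entry per individual
draw <- sample(ids, size = length(ids), replace = TRUE)

# Pull every row of each drawn individual; a subject drawn twice
# contributes all of its rows twice.
rows <- unlist(lapply(draw, function(id) which(theoph$Subject == id)))
boot <- theoph[rows, ]

# Row-level analogue, as when no key is supplied: each row is its own unit.
rowboot <- theoph[sample(nrow(theoph), replace = TRUE), ]
```

In the key-level draw, records belonging to one subject always travel together; in the row-level draw they do not.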

The number of resamplings is controlled by the length of names. names is coerced to character, and each value is used to name a ‘*.csv’ file, if out is supplied. If out is omitted, a list of data.frames is returned.
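A rough base-R picture of this contract follows: one replicate per element of names, either kept in a list or written to a file named after the element. It is a sketch under assumptions, not the package's code; the names reps, sets and written are hypothetical, and tempdir() stands in for a user-chosen out directory. The write.table arguments mirror the defaults listed under Usage.

```r
# Sketch only: mimics the documented behaviour with base R.
theoph <- as.data.frame(Theoph)
reps   <- as.character(1:3)        # names are coerced to character

set.seed(2)
# Without an output directory: a list with one data.frame per name.
sets <- lapply(reps, function(nm) {
  theoph[sample(nrow(theoph), replace = TRUE), ]
})
names(sets) <- reps

# With an output directory: write '<name>.csv' and keep only row counts.
out <- tempdir()                   # hypothetical output location
written <- vapply(reps, function(nm) {
  write.table(sets[[nm]], file = file.path(out, paste0(nm, ".csv")),
              row.names = FALSE, quote = FALSE, sep = ",")
  nrow(sets[[nm]])
}, integer(1))
```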

stratify is a list of factors, or items that can be coerced to factors. Currently stratify is coerced to a data.frame for convenient manipulation. Empty levels are dropped. If stratify is not supplied, the whole data set is treated as a single level. Otherwise, each resulting data set has as many keys in each level as the original. An error results if key is not nested within stratify.
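The stratification rules above (interaction of factors, dropped empty levels, nesting of key, equal key counts per level) can be sketched in base R as follows. This is one possible illustration, not the package's implementation; the two stratifying conditions and all object names are chosen for the example.

```r
# Sketch only: base-R illustration of stratified, key-level resampling.
theoph <- as.data.frame(Theoph)
theoph$Subject <- as.character(theoph$Subject)

# Two subject-level factors; their interaction defines the strata.
# drop = TRUE discards empty combinations.
strata <- interaction(
  theoph$Dose < mean(theoph$Dose),
  theoph$Wt > median(theoph$Wt),
  drop = TRUE
)

# Nesting check: every key must belong to exactly one stratum.
per_key <- tapply(as.character(strata), theoph$Subject,
                  function(s) length(unique(s)))
stopifnot(all(per_key == 1))

# One row per key, then resample keys separately within each stratum,
# so each stratum keeps its original number of keys.
key_stratum <- unique(data.frame(key = theoph$Subject, stratum = strata,
                                 stringsAsFactors = FALSE))
set.seed(3)
draws <- lapply(
  split(key_stratum$key, key_stratum$stratum, drop = TRUE),
  function(k) k[sample.int(length(k), replace = TRUE)]
)

# Assemble the replicate from all rows of each drawn key.
rows <- unlist(lapply(unlist(draws, use.names = FALSE),
                      function(id) which(theoph$Subject == id)))
boot <- theoph[rows, ]
```

If a key spanned two strata (for example, when stratifying on a column that varies within a subject), the nesting check would fail, which corresponds to the error described above.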

The default behavior is to sample with replacement (replace=TRUE). This and other arguments to sample can be changed by the caller.
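The effect of the replace setting can be seen directly with base sample. The snippet below is only an illustration on the subject identifiers of Theoph; the variable names are made up for the example.

```r
# Sketch only: what replace does to a draw of keys.
ids <- unique(as.character(Theoph$Subject))

set.seed(4)
with_repl    <- sample(ids, replace = TRUE)   # some keys may repeat, others drop out
without_repl <- sample(ids, replace = FALSE)  # every key exactly once, reordered
```

Without replacement and at full size, each replicate contains the same individuals as the original, only in a different order.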

Value

A list of data.frames, or if out is supplied, an invisible list of the numbers of rows of each data.frame written to file.

Author(s)

Tim Bergsma

References

http://metrumrg.googlecode.com

See Also

Examples

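# Three replicates, resampling whole subjects; returns a list of data.frames.
b <- resample(Theoph, key='Subject', names=1:3)

# As above, but rekeyed, stratified by low vs. high dose, and written to
# ./1.csv, ./2.csv and ./3.csv; d holds the row counts (returned invisibly).
d <- resample(
	Theoph,
	key='Subject',
	rekey=TRUE,
	names=1:3,
	out='.',
	stratify=Theoph$Dose < mean(Theoph$Dose)
)

# Read '1.csv' (written by the previous call) and resample it once.
e <- resample(as.csv.filename('1.csv'), names='theoph')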

metrumrg documentation built on May 2, 2019, 5:55 p.m.