folder: Folder of data sets

View source: R/folder.R

folder    R Documentation

Folder of data sets

Description

Creates an object of class "folder" (called folder below), which is a list of data frames with the same column names. These data sets therefore describe the same variables. They may or may not concern the same individuals.

Usage

folder(x1, x2 = NULL, ..., cols.select = "intersect", rows.select = "")

Arguments

x1

data frame (can also be a tibble) or list of data frames.

  • If x1 is a data frame, x2 must be provided.

  • If x1 is a list of data frames, its elements are the datasets of the folder. In this case, there is no x2 argument.

x2

data frame. Must be provided if x1 is a data frame.

...

optional. One or more additional data frames, used when x1 and x2 are data frames.

cols.select

string or character vector. Gives the method used to choose the column names of the data frames of the folder. This argument can be:

"intersect"

(default) the column names of the data frames in the folder are the intersection of the column names of all the data frames given as arguments.

"union"

the column names of the data frames in the folder are the union of the column names of all the data frames given as arguments. When necessary, the rows of the returned data frames are filled with NA.

If cols.select is a character vector, it gives the column names selected in the data frames given as arguments. The corresponding columns constitute the columns of the elements of the returned folder. Note that when a column name is not present in all the data frames given as arguments, the missing data are filled with NA.

rows.select

string. Gives the method used to choose the row names of the data frames of the folder. This argument can be:

""

(default) the data frames of the folder have the same rows as those which were passed as arguments.

"intersect"

the row names of the data frames in the folder are the intersection of the row names of all the data frames given as arguments.

"union"

the row names of the data frames in the folder are the union of the row names of all the data frames given as arguments. When necessary, the columns of the returned data frames are filled with NA.

Details

An object of class folder has a logical attribute attr(,"same.rows").

The data frames in the returned folder all have the same column names. This means that the same variables are observed in every data set.
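The column selection described above can be illustrated with a minimal base R sketch. It is not the package's implementation: select_cols, df_a and df_b are hypothetical names introduced only for this illustration, and the sketch assumes that columns missing from a data frame are simply added as NA.

```r
# Two data frames that share only the column "x".
df_a <- data.frame(x = c(1.5, 2.5, 3.5), y = 1:3)
df_b <- data.frame(x = c(4.5, 5.5, 6.5), z = c(10, 20, 30))
dfs <- list(df_a = df_a, df_b = df_b)

# Hypothetical helper: keep the columns named in `cols`,
# adding an NA column for each name the data frame lacks.
select_cols <- function(df, cols) {
  for (nm in setdiff(cols, names(df))) df[[nm]] <- NA
  df[, cols, drop = FALSE]
}

# Column names common to all data frames, and column names found in any of them.
common_cols <- Reduce(intersect, lapply(dfs, names))
all_cols    <- Reduce(union, lapply(dfs, names))

# Analogue of cols.select = "intersect" and cols.select = "union".
by_intersect <- lapply(dfs, select_cols, cols = common_cols)
by_union     <- lapply(dfs, select_cols, cols = all_cols)

print(by_intersect)
print(by_union)
```

In both results every data frame ends up with identical column names, which is the property a folder relies on.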

If the rows.select argument is "union" or "intersect", the elements of the returned folder have the same rows. This means that the same individuals are present in every data set, which makes it possible to follow the evolution of each individual over time.
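The row selection can be sketched the same way in base R. Again, this is only an illustration under assumptions, not the code used by the package: select_rows, t1 and t2 are hypothetical names, and individuals are identified by row names.

```r
# Two measurement occasions on partly overlapping individuals (row names are the IDs).
t1 <- data.frame(height = c(10, 12, 14), row.names = c("a", "b", "c"))
t2 <- data.frame(height = c(13, 15, 17), row.names = c("b", "c", "d"))
dfs <- list(t1 = t1, t2 = t2)

# Hypothetical helper: reindex a data frame on `rows`.
# Individuals absent from the data frame get a row of NA.
select_rows <- function(df, rows) {
  idx <- match(rows, rownames(df))  # NA where the individual is absent
  out <- df[idx, , drop = FALSE]
  rownames(out) <- rows
  out
}

# Individuals present in every data frame, and individuals present in any of them.
common_rows <- Reduce(intersect, lapply(dfs, rownames))
all_rows    <- Reduce(union, lapply(dfs, rownames))

# Analogue of rows.select = "intersect" and rows.select = "union".
by_intersect <- lapply(dfs, select_rows, rows = common_rows)
by_union     <- lapply(dfs, select_rows, rows = all_rows)

print(by_intersect)
print(by_union)
```

After either selection, all data frames carry the same row names in the same order, so a given individual can be looked up by name in each of them.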

If rows.select is "", all the rows of the folder are considered different, and the row names are made unique by adding the name of the data frame to the row names. In this case, the individuals of the data sets are assumed to be all different, or at least the user does not mind whether they are the same or not.

Value

Returns an object of class "folder", which is a list of data frames.

Author(s)

Rachid Boumaza, Pierre Santagostini, Smail Yousfi, Gilles Hunault, Sabine Demotes-Mainard

See Also

is.folder to test if an object is of class folder. folderh to build a folder of several data frames with a hierarchical relation between each pair of consecutive data frames.

Examples

# First example
x1 <- data.frame(x = rnorm(10), y = 1:10)
x2 <- data.frame(x = rnorm(10), z = runif(10, 1, 10))
f1 <- folder(x1, x2)
print(f1)

f2 <- folder(x1, x2, cols.select = "union")
print(f2)

# Second example
data(iris)
iris.set <- iris[iris$Species == "setosa", 1:4]
iris.ver <- iris[iris$Species == "versicolor", 1:4]
iris.vir <- iris[iris$Species == "virginica", 1:4]
irisf1 <- folder(iris.set, iris.ver, iris.vir)
print(irisf1)

listofdf <- list(df1 = iris.set, df2 = iris.ver, df3 = iris.vir)
irisf2 <- folder(listofdf, x2 = NULL)
print(irisf2)
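
# Third example (a hedged sketch added for illustration; d1 and d2 are
# made-up data, not taken from the package). Individuals "b" and "c" appear
# in both data frames, so rows.select = "intersect" should keep only them.
library(dad)
d1 <- data.frame(h = c(10, 12, 14), w = c(1, 2, 3), row.names = c("a", "b", "c"))
d2 <- data.frame(h = c(13, 15, 17), w = c(2, 3, 4), row.names = c("b", "c", "d"))
f3 <- folder(d1, d2, rows.select = "intersect")
print(f3)
# Check the class and inspect the logical attribute described in Details.
is.folder(f3)
attr(f3, "same.rows")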

dad documentation built on Aug. 30, 2023, 5:06 p.m.