GetData: Extracting a data.frame from a data.frame

View source: R/GetData.R

GetData    R Documentation


Description

Extracting a data.frame from a data.frame

Usage

GetData(data, ..., removeNULL = TRUE, returnAsDataFrame = TRUE)

GetData1(...)

GetData2me(...)

Arguments

data

A data frame

...

Input specifying how to extract data (see examples).

removeNULL

When TRUE (default), variables specified as NULL are removed completely. Otherwise, zero-column matrices are embedded. removeNULL can also be given as a vector with one element for each variable.

returnAsDataFrame

When TRUE (default), a data.frame is returned. Otherwise, a list is returned.
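
To picture what an embedded zero-column matrix looks like (the removeNULL = FALSE case), here is a minimal base R sketch. It does not call GetData; the object names `d` and `x_empty` are made up for illustration.

```r
# Base R illustration only; GetData is not used here.
y <- c(4.4, 6.6, 2.2, 3.2, 8.8, 9.9)

# A matrix with six rows and zero columns
x_empty <- matrix(numeric(0), nrow = 6, ncol = 0)

# Store it as a single variable next to an ordinary column
d <- data.frame(y = y)
d$x <- x_empty

dim(d$x)      # dimensions of the embedded matrix
names(d)      # the variable x is present even though it holds no columns
```

The variable still exists by name, which can be convenient when later code expects a fixed set of variables.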

Details

GetData returns a data frame with extra attributes (see examples).
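
As a rough picture of a data frame that carries extra attributes, the base R sketch below stores a matrix as a single variable and attaches an attribute. It does not call GetData, and the attribute name `myNote` is hypothetical; the attributes GetData actually sets are shown in the examples.

```r
# Base R illustration only; GetData is not used here.
d <- data.frame(y = c(2014, 2015, 2016))

# A two-column matrix stored as one variable of the data frame
d$x <- cbind(kari = c(10, 20, 30), ola = c(4.4, 6.6, 2.2))

# Attach an arbitrary attribute (hypothetical name, for illustration)
attr(d, "myNote") <- c(x = "kari, ola", y = "aar")

length(d)          # number of variables; the matrix counts as one
dim(d$x)           # dimensions of the embedded matrix
attr(d, "myNote")  # retrieve the attribute again
```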

To create data according to id matching: The id variable must be the first variable. This variable must be specified using the list type input. As opposed to other variables, the first element of this list must be named, and this name must be "id". See the examples. (In newer versions it might not need to be the first variable??)
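
One way to picture id matching is to align rows to a common vector of id values. The sketch below uses base R's match() for this; it is an illustration under that assumption, not GetData's internal code, and the small data set is made up.

```r
# Base R sketch of id-based alignment; not GetData's internal code.
z2 <- data.frame(ID   = c(1, 2, 3, 1, 2, 4),
                 aar  = c(2014, 2014, 2014, 2015, 2015, 2015),
                 kari = c(10, 20, 30, 40, 50, 60))

ids <- sort(unique(z2$ID))   # all id values present in the data

# Split by year, then pick the row matching each id (NA when the id is absent)
k2014 <- z2[z2$aar == 2014, ]
k2015 <- z2[z2$aar == 2015, ]
out <- data.frame(id = ids,
                  x  = k2014$kari[match(ids, k2014$ID)],
                  y  = k2015$kari[match(ids, k2015$ID)])
print(out)
```

Each output row then refers to one id, so x and y can be compared row by row. match() returns the first matching position only, which resembles the "only first value used" behaviour noted in the examples.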

GetData1 is a single variable variant which returns the variable instead of a data.frame.

GetData2me returns only NULL, but breaks the rules for ordinary functions: each variable is written directly to the caller's environment (no data.frame is created).
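
The general mechanism of writing variables into the caller's environment can be sketched in base R with parent.frame() and assign(). The helper `WriteToCaller` below is hypothetical and is not part of the package; how GetData2me is actually implemented is not stated here.

```r
# Base R sketch; WriteToCaller is a hypothetical helper, not part of the package.
WriteToCaller <- function(...) {
  vars <- list(...)
  caller <- parent.frame()              # environment the function was called from
  for (nm in names(vars)) {
    assign(nm, vars[[nm]], envir = caller)
  }
  invisible(NULL)                       # no useful return value
}

res <- WriteToCaller(cCc = 1:3, dDd = 4:6)
cCc + dDd      # both variables now exist in the calling environment
is.null(res)   # the function itself returned NULL
```

Because such a function has side effects on the caller's workspace, it can silently overwrite existing variables with the same names.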

Value

GetData returns a data frame by default (see details).

Examples

### Example data
z <- data.frame(aar = c(2014, 2015, 2016),
                ola = c(4.4, 6.6, 2.2, 3.2, 8.8, 9.9),
                kari = 10 * (1:6),
                tull = c("A", "A", "B", "B", "C", "C"))
print(z)

### Ordinary use: names or numbers
GetData(z, x = "kari", y = "ola")
GetData(z, A = 3, B  = 2, C = 1)

### With matrix embedded in one variable
a = GetData(z, x = c("kari","ola"), y = "aar")
print(a)
print(as.list(a)[-99]) # the [-99] trick avoids printing of attributes

### Looking at attributes stored in output
attr(a,"origVars") # Original names corresponding to variables
attr(a,"origCols") # Original names corresponding to columns

### Using a named list to specify equality
GetData(z, x = list("kari",aar=2014), y = list("ola", aar=2015))
GetData(z, x = list("kari",aar=2016, tull="B"),  y = list("kari",aar=2014, tull="B"))

### With matrix input to obtain matrix embedded in output
a = GetData(z, x = list("kari",aar=t(c(2014,2015))), y = list("ola", aar=t(2015:2016)))
print(a);
print(as.list(a)[-99])
GetData(z, x = list("kari",aar=t(1:3000))) # Impossible values ignored, warning produced

### Effect of removeNULL
a = GetData(z, x = NULL, y = "ola")
print(a);
print(as.list(a)[-99])
a = GetData(z, x = NULL, y = "ola", removeNULL = FALSE)
print(a);
print(as.list(a)[-99])  # x is a 6x0 matrix

### Using "expression"
GetData(z, x = list("kari",expression(aar>2014)), y = list("kari", expression(tull != "B")))
GetData(z, x = list("kari",expression(aar>2014 & tull=="B" | tull=="C" )))
GetData(z, x = list("kari",expression(aar==min(aar))), y = list("kari", expression(aar==max(aar)-1)))

### Using names as list elements instead of named list
GetData(z, x = list("kari","aar", "2014"), y = list("ola", "aar", "2015"))
GetData(z, x = list("kari","aar", "2016", "tull", "B"))

### Using function to be run on each variable
GetData(z, x = list("kari",aar=2014:2015,function(x)(x+1))) # One function
GetData(z, x = list("kari",aar=2014:2015,function(x)(x+1),function(x)(x*10))) # Two functions
GetData(z, x = list(c("kari","ola"),function(x)apply(x,1,paste,collapse="-")), y = "aar")

### Advanced examples
GetData(z, x = list(c("kari","ola"),aar=t(2014:2015)), y = list("ola", aar=2015))
GetData(z, x = list("kari",expression(aar==max(aar)),tull=t(c("B","C"))))
GetData(z, x = list("kari",expression(eval(as.symbol("aar"))>2014 & eval(as.symbol("tull"))=="B")))
GetData(z, x = list("ola",aar=cbind(2014:2015,2015:2016)))
GetData(z, x = list("kari",aar=2014:2015,function(x)(cbind(a=x,b=1000))))
GetData(z, x = list("kari",aar=2014:2015,function(x)(cbind(x=x,tid=date()))))

### GetData1
aAa <- GetData1(z, "kari")
bBb <- GetData1(z, x = c("kari","ola"))

### GetData2me
GetData2me(z, cCc = "kari", dDd = "ola")
cCc + dDd
GetData2me(z, eEe = list(c("kari","ola"),aar=t(2014:2015)))
print(eEe)

######  Using id  #######

#### Make new example data
z2 <- rbind(z,z)
z2$ola <- c(z$ola,2*z$ola)
z2 <- SortRows(z2)[1:11,]
rownames(z2) <- NULL
z2$ID=c(1:3,4,1:3,5,1:2,6)
print(z2)

# All possible ID-values in data
GetData(z2, iD = list(id="ID"), x = list("kari",aar=2014), y = list("ola", aar=2015))

# ID-values in union of 2014 and 2015
GetData(z2, iD = list(id="ID",aar=c(2014,2015)), x = list("kari",aar=2014), y = list("ola", aar=2015))

# ID-values in intersection of 2014 and 2015
# (matrix input similar to above but no matrix in output, instead intersection "of columns" created)
GetData(z2, iD = list(id="ID",aar=t(c(2014,2015))), x = list("kari",aar=2014), y = list("ola", aar=2015))

# Only ID-values in 2016
GetData(z2, iD = list(id="ID",aar=2016), x = list("kari",aar=2014), y = list("ola", aar=2015))

# ID-values in 2016 + intersection of 2014 and 2015
GetData(z2, iD = list(id="ID",aar=cbind(c(2014,2016),c(2015,2016))), x = list("kari",aar=2014), y = list("ola", aar=2015))

# Only first value used when multiple id
GetData(z2, iD = list(id="ID"), x = "kari", y = "ola")

# Construct a single id from two variables
GetData(z2, iD = list(id=c("tull", "ID")), x = "kari", y = "ola")



statisticsnorway/Kostra documentation built on Sept. 25, 2024, 10:37 a.m.