codeplan: Describe structure of Data Sets and Importers

codeplanR Documentation

Describe structure of Data Sets and Importers

Description

The function codeplan() creates a data frame that describes the structure of an item list (a data.set object or an importer object), so that this structure can be stored and and recovered. The resulting data frame has a particular print method that delimits the output to one line per variable.

With setCodeplan an item list structure (as returned by codeplan()) can be applied to a data frame or data set. It is also possible to use an assignment like codeplan(x) <- value to a similar effect.

Usage

codeplan(x)
## S4 method for signature 'item.list'
codeplan(x)
## S4 method for signature 'item'
codeplan(x)
setCodeplan(x,value)
## S4 method for signature 'data.frame,codeplan'
setCodeplan(x,value)
## S4 method for signature 'data.frame,NULL'
setCodeplan(x,value)
## S4 method for signature 'data.set,codeplan'
setCodeplan(x,value)
## S4 method for signature 'data.set,NULL'
setCodeplan(x,value)
## S4 method for signature 'item,codeplan'
setCodeplan(x,value)
## S4 method for signature 'item,NULL'
setCodeplan(x,value)
## S4 method for signature 'atomic,codeplan'
setCodeplan(x,value)
## S4 method for signature 'atomic,NULL'
setCodeplan(x,value)
codeplan(x) <- value
read_codeplan(filename,type)
write_codeplan(x,filename,type,pretty)

Arguments

x

for codeplan(x) an object that inherits from class "item.list", i.e. can be a "data.set" object or an "importer" object, it can also be an object that inherits from class "item". For write_codeplan an object from class "codeplan".

value

an object as it would be returned by codeplan(x) or NULL.

filename

a character string, the name of the file that is to be read or to be written.

type

a character string (either "yaml" or "json") oder NULL (the default), gives the type of the file into which the codeplan is written or from which it is read. If type is NULL then the file type is inferred from the file name ending (".yaml" or ",yml" for "yaml", ".json" for "json").

pretty

a logical value, whether the JSON output created by write_codeplan(...) should be prettified.

Value

If applicable, codeplan returns a list with additional S3 class attribute "codeplan". For arguments for which the relevant information does not exist, the function returns NULL.

The list has at least one element or several elements, named after the variable in the "item.list" or "data.set" x. Each list element is a list itself with the following elements:

annotation

a named character vector,

labels

a named list of labels and labelled values

value.filter

a list with at least two elements named "class" and "filter", and optionally another element named "range". The "class" element determines the class of the value filter and equals either "missing.values", "valid.values", or "valid.range". An element named "range" may only be needed if "class" is "missing.values", as it is possible (like in SPSS) to have both individual missing values and a range of missing values.

mode

a character string that describes storage mode, such as "character", "integer", or "numeric".

measurement

a character string with the measurement level, "nominal", "ordinal", "interval", or "ratio".

If codeplan(x)<-value or setCodeplan(x,value) is used and value is NULL, all the special information about annotation, labels, value filters, etc. is removed from the resulting object, which then is usually a mere atomic vector or data frame.

Examples

Data1 <- data.set(
          vote = sample(c(1,2,3,8,9,97,99),size=300,replace=TRUE),
          region = sample(c(rep(1,3),rep(2,2),3,99),size=300,replace=TRUE),
          income = exp(rnorm(300,sd=.7))*2000
          )

Data1 <- within(Data1,{
  description(vote) <- "Vote intention"
  description(region) <- "Region of residence"
  description(income) <- "Household income"
  foreach(x=c(vote,region),{
    measurement(x) <- "nominal"
    })
  measurement(income) <- "ratio"
  labels(vote) <- c(
                    Conservatives         =  1,
                    Labour                =  2,
                    "Liberal Democrats"   =  3,
                    "Don't know"          =  8,
                    "Answer refused"      =  9,
                    "Not applicable"      = 97,
                    "Not asked in survey" = 99)
  labels(region) <- c(
                    England               =  1,
                    Scotland              =  2,
                    Wales                 =  3,
                    "Not applicable"      = 97,
                    "Not asked in survey" = 99)
  foreach(x=c(vote,region,income),{
    annotation(x)["Remark"] <- "This is not a real survey item, of course ..."
    })
  missing.values(vote) <- c(8,9,97,99)
  missing.values(region) <- c(97,99)
})
cpData1 <- codeplan(Data1)

Data2 <- data.frame(
          vote = sample(c(1,2,3,8,9,97,99),size=300,replace=TRUE),
          region = sample(c(rep(1,3),rep(2,2),3,99),size=300,replace=TRUE),
          income = exp(rnorm(300,sd=.7))*2000
          )
codeplan(Data2) <- cpData1
codeplan(Data2)
codebook(Data2)

# Note the difference between 'as.data.frame' and setting
# the codeplan to NULL:
Data2df <- as.data.frame(Data2)
codeplan(Data2) <- NULL
str(Data2)
str(Data2df)
codeplan(Data2) <- NULL # Does not change anything

# Codeplans of survey items can also be inquired and manipulated:
vote <- Data1$vote
str(vote)
cp.vote <- codeplan(vote)
codeplan(vote) <- NULL
str(vote)
codeplan(vote) <- cp.vote
vote

fn.json <- paste0(tempfile(),".json")
write_codeplan(codeplan(Data1),filename=fn.json)
codeplan(Data2) <- read_codeplan(fn.json)
codeplan(Data2)

memisc documentation built on Oct. 24, 2024, 5:07 p.m.