createDB: Create a relational data base file.

View source: R/02_DB_creation.R

Description

Creates a relational data base from a list of data.frames (dfList). The list structure, including the names of dfList, pkList and fkList, needs to be exactly the same. Keys (pkList and fkList$Keys) are character vectors containing either a single variable name or multiple variable names. Primary keys (pkList) have to be unique within a single data.frame. Foreign keys (fkList) have to consist of a list with the referenced data frame (fkList$References) and the referencing keys (fkList$Keys). If a single data frame is to be converted to a data base, fkList can be dropped. Otherwise, both elements of fkList need to be set to NULL for the first data frame.

Usage

createDB(dfList, pkList, fkList = NULL, metaData = NULL, filePath)

Arguments

dfList

Named list of data frames. The order of the data.frames determines the merge order.

pkList

Named list of the primary keys corresponding to the data.frames.

fkList

Named list containing one list per data.frame, each with the referenced data frame (fkList$References) and the corresponding keys (fkList$Keys). Default is NULL, which should be used if only a single data frame is supplied. For multiple data.frames, fkList$References and fkList$Keys should be NULL for the first data.frame.

metaData

[optional] Data.frame containing meta data about the other data.frames.

filePath

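Path to the db file to write (including the file name); has to end in .db. The example in the Examples section passes ":memory:" instead to create an in-memory data base.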
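The structural rules for dfList, pkList and fkList can be checked before calling createDB. The following is a minimal base R sketch of such a check, assuming the rules exactly as described above; checkKeyLists is a hypothetical helper and not part of eatDB, and createDB may perform different or additional checks internally.

```r
# Hypothetical helper (not part of eatDB): checks the documented input rules.
checkKeyLists <- function(dfList, pkList, fkList) {
  # All three lists must carry identical names in identical order
  stopifnot(identical(names(dfList), names(pkList)),
            identical(names(dfList), names(fkList)))
  for (nam in names(dfList)) {
    df <- dfList[[nam]]
    pk <- pkList[[nam]]
    # Primary key variables must exist and identify each row uniquely
    if (!all(pk %in% names(df))) stop("Primary key not found in ", nam)
    if (anyDuplicated(df[, pk, drop = FALSE]) > 0) {
      stop("Primary key is not unique in ", nam)
    }
    ref <- fkList[[nam]]$References
    fk <- fkList[[nam]]$Keys
    if (is.null(ref)) next  # first data frame: no foreign key defined
    # Foreign key variables must exist in the referencing and the referenced data frame
    if (!ref %in% names(dfList)) stop("Unknown referenced data frame for ", nam)
    if (!all(fk %in% names(df)) || !all(fk %in% names(dfList[[ref]]))) {
      stop("Foreign key not found for ", nam)
    }
  }
  invisible(TRUE)
}

# Small illustrative input
dfList <- list(NoImp = data.frame(ID = 1:3, age = c(12, 14, 13)),
               Imp = data.frame(ID = rep(1:3, 2), imp = rep(1:2, each = 3),
                                noBooks = c(10, 20, 30, 15, 25, 35)))
pkList <- list(NoImp = "ID", Imp = c("ID", "imp"))
fkList <- list(NoImp = list(References = NULL, Keys = NULL),
               Imp = list(References = "NoImp", Keys = "ID"))
checkKeyLists(dfList, pkList, fkList)
```

In this input, ID alone would not be a valid primary key for Imp because each ID occurs once per imputation; the combination of ID and imp is unique.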

Details

Primary keys guarantee uniqueness of cases within a single data.frame, and are single variables or combinations of variables. Foreign keys are used to merge data.frames. The foreign key for the first data set always has to be set to list(References = NULL, Keys = NULL). The order in which the data.frames are supplied determines the merge order. Currently, left joins are performed when merging data.frames. However, data.frames are stored separately in the relational data base and are only merged when pulled from the data base.

Conventions for naming variables (columns) follow the naming conventions of SQLite3: '.' and sqlite_keywords are prohibited in variable names. Two additional tables are created within the SQLite3 data base: Meta_Information, which contains a single character string with the merge order that is used by dbPull, and Meta_Data, which contains the meta data.frame supplied to the argument metaData.
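The effect of the merge order and the left joins can be illustrated in base R. This is only a sketch of the join logic described above, using merge() with all.x = TRUE on small made-up data frames; it is not how eatDB queries the SQLite3 data base internally.

```r
# Illustrative data: ID 4 has no imputed values
NoImp <- data.frame(ID = 1:4, age = c(12, 14, 13, 15))
Imp <- data.frame(ID = rep(1:3, 2), imp = rep(1:2, each = 3),
                  noBooks = c(10, 20, 30, 15, 25, 35))
dfList <- list(NoImp = NoImp, Imp = Imp)
fkList <- list(NoImp = list(References = NULL, Keys = NULL),
               Imp = list(References = "NoImp", Keys = "ID"))

# Start with the first data frame, then left join each following
# data frame on its foreign key, in the order given by dfList
merged <- dfList[[1]]
for (nam in names(dfList)[-1]) {
  merged <- merge(merged, dfList[[nam]], by = fkList[[nam]]$Keys, all.x = TRUE)
}
merged
```

Because a left join keeps every row of the left-hand data frame, the case with ID 4 stays in the result with missing values for imp and noBooks. Supplying the data frames in a different order would change which cases are retained.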

Value

Creates a data base at the given path and returns NULL.

Examples

# Set up data frames
NoImp <- data.frame(ID = 1:5,
                   age = sample(12:17, size = 5, replace = TRUE),
                   weight = sample(40:60, size = 5, replace = TRUE))
Imp <- data.frame(ID = rep(1:5, 3),
                 imp = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                 noBooks = sample(1:200, 15, replace = TRUE))
PVs <- data.frame(ID = rep(rep(1:5, 3), 2),
                 imp = rep(c(rep(1, 5), rep(2, 5), rep(3, 5)), 2),
                 subject = c(rep("math", 15), rep("reading", 15)),
                 pv = sample(seq(from = -1.75, to = 1.75, by = 0.05), 30, replace = TRUE),
                 stringsAsFactors = FALSE)

# Combine into named list
dfList <- list(NoImp = NoImp, Imp = Imp, PVs = PVs)

# Define primary and foreign keys accordingly
pkList <- list(NoImp = "ID",
               Imp = c("ID", "imp"),
               PVs = c("ID", "imp", "subject"))
fkList <- list(NoImp = list(References = NULL, Keys = NULL),
               Imp = list(References = "NoImp", Keys = "ID"),
               PVs = list(References = "Imp", Keys = c("ID", "imp")))

# Optional metaData
metaData <- data.frame(varName = c("ID", "age", "weight", "imp", "noBooks", "subject", "pv"),
                      varLabel = c("ID variable", "Age in years", "Body weight in kilogram",
                                   "Multiple Imputation number",
                                   "Number of books at home (self reported)",
                                   "Competence domain (Mathematical Literacy/Reading Literacy)",
                                   "Plausible value"),
                      data_table = c(rep("NoImp", 3), rep("Imp", 2), rep("PVs", 2)),
                      stringsAsFactors = FALSE)

# Create in-memory data base
createDB(dfList = dfList, pkList = pkList, fkList = fkList, metaData = metaData,
         filePath = ":memory:")

eatDB documentation built on Oct. 5, 2021, 5:06 p.m.