View source: R/database_schema.r
database_schema | R Documentation |
Enhances a relation_schema object with foreign key reference information.
database_schema(relation_schemas, references)
relation_schemas | a relation_schema object. |
references | a list of references, each represented by a list containing four character elements. In order, the elements are a scalar giving the name of the child (referrer) schema, a vector giving the child attribute names, a scalar giving the name of the parent (referee) schema, and a vector giving the parent attribute names. The vectors must be of the same length and contain names for attributes present in their respective schemas, and the parent attributes must form a key. |
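As a rough illustration of the structure described for references, the following base-R sketch builds one reference and checks its shape. The helper check_reference_shape is hypothetical and not part of the package; it checks only the element types and the matching vector lengths, not whether the attributes exist in the schemas or whether the parent attributes form a key.

```r
# Hypothetical helper, not part of the package: checks only the shape of a
# single reference, as described for the `references` argument.
check_reference_shape <- function(ref) {
  is.list(ref) &&
    length(ref) == 4L &&
    all(vapply(ref, is.character, logical(1))) &&
    length(ref[[1]]) == 1L &&            # child (referrer) schema name
    length(ref[[3]]) == 1L &&            # parent (referee) schema name
    length(ref[[2]]) == length(ref[[4]]) # paired child/parent attributes
}

# child schema, child attributes, parent schema, parent attributes
ref <- list("b.1", c("c", "d"), "c_d", c("c", "d"))
check_reference_shape(ref)

# attribute vectors of different lengths fail the check
bad_ref <- list("b.1", c("c", "d"), "c_d", "c")
check_reference_shape(bad_ref)
```

The checks that need the schemas themselves (attribute presence, parent attributes forming a key) are left to database_schema.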
Unlike functional_dependency and relation_schema, database_schema is not
designed to be vector-like: it only holds a single database schema. This
adheres to the usual package use case, where a single data frame is being
analysed at a time. However, it inherits from relation_schema, so is
vectorised with respect to its relation schemas.
As with relation_schema, duplicate relation schemas, after ordering by
attribute, are allowed, and can be removed with unique.
References, i.e. foreign key references, are allowed to have different
attribute names in the child and parent relations; this can't occur in the
output for autoref and normalise.
Subsetting removes any references that involve removed relation schemas.
Removing duplicates with unique changes references involving duplicates to
involve the kept equivalent schemas instead. Renaming relation schemas with
names<- also changes their names in the references.
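The effect of subsetting and renaming on references can be sketched with plain lists in base R. The helpers drop_removed and rename_schemas below are hypothetical stand-ins that only mimic the behaviour described above; they are not the package's implementation and do not operate on database_schema objects.

```r
# Plain-list stand-ins for references: child schema, child attributes,
# parent schema, parent attributes.
refs <- list(
  list("a", "b", "b", "b"),
  list("b", "c", "c", "c")
)

# Hypothetical: keep only references whose child and parent schemas both
# remain after subsetting.
drop_removed <- function(refs, kept) {
  Filter(function(r) r[[1]] %in% kept && r[[3]] %in% kept, refs)
}

# Hypothetical: rename the child and parent schemas in each reference.
# `new_names` is a named character vector mapping old names to new names.
rename_schemas <- function(refs, new_names) {
  lapply(refs, function(r) {
    r[[1]] <- unname(new_names[r[[1]]])
    r[[3]] <- unname(new_names[r[[3]]])
    r
  })
}

# Schema "c" is removed, so the reference from "b" to "c" is dropped.
kept_refs <- drop_removed(refs, kept = c("a", "b"))

# Schema names change in the references; attribute names do not.
renamed <- rename_schemas(kept_refs, c(a = "a_long", b = "b_long"))
```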
A database_schema object, containing relation_schemas with references stored
in an attribute of the same name. References are stored with their attributes
in the order they appear in their respective relation schemas.
attrs, keys, attrs_order, and references for extracting parts of the
information in a database_schema; create for creating a database object that
uses the given schema; gv for converting the schema into Graphviz code;
rename_attrs for renaming the attributes in attrs_order; reduce for filtering
a schema's relations to those connected to a given relation by foreign key
references; subschemas to return the relation_schema that the given schema
contains; merge_empty_keys for combining relations with an empty key;
merge_schemas for combining relations with matching sets of keys.
rs <- relation_schema(
  list(
    a = list(c("a", "b"), list("a")),
    b = list(c("b", "c"), list("b", "c"))
  ),
  attrs_order = c("a", "b", "c", "d")
)
ds <- database_schema(
  rs,
  list(list("a", "b", "b", "b"))
)
print(ds)
attrs(ds)
keys(ds)
attrs_order(ds)
names(ds)
references(ds)
# relations can't reference themselves
## Not run:
database_schema(
  relation_schema(
    list(a = list("a", list("a"))),
    c("a", "b")
  ),
  list(list("a", "a", "a", "a"))
)
database_schema(
  relation_schema(
    list(a = list(c("a", "b"), list("a"))),
    c("a", "b")
  ),
  list(list("a", "b", "a", "a"))
)
## End(Not run)
# an example with references between differently-named attributes
print(database_schema(
  relation_schema(
    list(
      citation = list(c("citer", "citee"), list(c("citer", "citee"))),
      article = list("article", list("article"))
    ),
    c("citer", "citee", "article")
  ),
  list(
    list("citation", "citer", "article", "article"),
    list("citation", "citee", "article", "article")
  )
))
# vector operations
ds2 <- database_schema(
  relation_schema(
    list(
      e = list(c("a", "e"), list("e"))
    ),
    attrs_order = c("a", "e")
  ),
  list()
)
c(ds, ds2) # attrs_order attributes are merged
unique(c(ds, ds))
# subsetting
ds[1]
stopifnot(identical(ds[[1]], ds[1]))
ds[c(1, 2, 1, 2)] # replicates the foreign key references
c(ds[c(1, 2)], ds[c(1, 2)]) # doesn't reference between separate copies of ds
unique(ds[c(1, 2, 1, 2)]) # unique() also merges references
# another example of unique() merging references
ds_merge <- database_schema(
  relation_schema(
    list(
      a = list(c("a", "b"), list("a")),
      b = list(c("b", "c", "d"), list("b")),
      c_d = list(c("c", "d", "e"), list(c("c", "d"))),
      a.1 = list(c("a", "b"), list("a")),
      b.1 = list(c("b", "c", "d"), list("b"))
    ),
    c("a", "b", "c", "d", "e")
  ),
  list(
    list("a", "b", "b", "b"),
    list("b.1", c("c", "d"), "c_d", c("c", "d"))
  )
)
print(ds_merge)
unique(ds_merge)
# reassignment
# can't change keys included in references
## Not run: keys(ds)[[2]] <- list("c")
# can't remove attributes included in keys
## Not run: attrs(ds)[[2]] <- list("c", "d")
# can't remove attributes included in references
## Not run: attrs(ds)[[1]] <- c("a", "d")
ds3 <- ds
# can change subset of schema, but loses references between altered and
# non-altered subsets
ds3[2] <- database_schema(
  relation_schema(
    list(d = list(c("d", "c"), list("d"))),
    attrs_order(ds3)
  ),
  list()
)
print(ds3) # note the schema's name doesn't change
# names(ds3)[2] <- "d" # this would change the name
keys(ds3)[[2]] <- list(character()) # removing keys first...
attrs(ds3)[[2]] <- c("b", "c") # so we can change the attrs legally
keys(ds3)[[2]] <- list("b", "c") # add the new keys
# add the reference lost during subset replacement
references(ds3) <- c(references(ds3), list(list("a", "b", "b", "b")))
stopifnot(identical(ds3, ds))
# changing appearance priority for attributes
attrs_order(ds3) <- c("d", "c", "b", "a")
print(ds3)
# changing relation schema names changes them in references
names(ds3) <- paste0(names(ds3), "_long")
print(ds3)
# reconstructing from components
ds_recon <- database_schema(
  relation_schema(
    Map(list, attrs(ds), keys(ds)),
    attrs_order(ds)
  ),
  references(ds)
)
stopifnot(identical(ds_recon, ds))
ds_recon2 <- database_schema(
  subschemas(ds),
  references(ds)
)
stopifnot(identical(ds_recon2, ds))
# can be a data frame column
data.frame(id = 1:2, schema = ds)