View source: R/database_schema.r
| database_schema | R Documentation |
Enhances a relation_schema object with foreign key reference
information.
database_schema(relation_schemas, references)
relation_schemas | a relation_schema object. |
references | a list of references, each represented by a list containing four character elements. In order, the elements are: a scalar giving the name of the child (referrer) schema; a vector giving the child attribute names; a scalar giving the name of the parent (referee) schema; and a vector giving the parent attribute names. The two vectors must be of the same length and contain names of attributes present in their respective schemas, and the parent attributes must form a key. |
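The four-element layout of a single reference can be sketched in base R. The helper is_reference_shaped below is hypothetical and not part of the package: it checks only the shape described above, not whether the named schemas and attributes exist or whether the parent attributes form a key.

```r
# Hypothetical helper (not part of the package): checks only the shape of a
# single reference, i.e. four character elements with scalar schema names
# and attribute vectors of equal length.
is_reference_shaped <- function(ref) {
  is.list(ref) &&
    length(ref) == 4L &&
    all(vapply(ref, is.character, logical(1))) &&
    length(ref[[1]]) == 1L &&             # child schema name is a scalar
    length(ref[[3]]) == 1L &&             # parent schema name is a scalar
    length(ref[[2]]) == length(ref[[4]])  # attribute vectors pair up
}

single <- list("a", "b", "b", "b")
compound <- list("b.1", c("c", "d"), "c_d", c("c", "d"))
mismatched <- list("a", c("a", "b"), "b", "b")  # two child attrs, one parent attr

shape_ok <- vapply(
  list(single, compound, mismatched),
  is_reference_shaped,
  logical(1)
)
print(shape_ok)
```

The first two lists have the expected shape; the third fails only because its attribute vectors differ in length.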
Unlike functional_dependency and relation_schema,
database_schema is not designed to be vector-like: it only holds a
single database schema. This matches the usual package use case, where a
single data frame is analysed at a time. However, it inherits from
relation_schema, so it is vectorised with respect to its relation
schemas.
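As a small illustration of that vectorisation, the sketch below assumes the autodb package is installed and reuses the schemas from the examples on this page. It assumes that length() and names() act on the contained relation schemas, as they would for a relation_schema.

```r
library(autodb)

ds <- database_schema(
  relation_schema(
    list(
      a = list(c("a", "b"), list("a")),
      b = list(c("b", "c"), list("b", "c"))
    ),
    attrs_order = c("a", "b", "c", "d")
  ),
  list(list("a", "b", "b", "b"))
)
ds2 <- database_schema(
  relation_schema(
    list(e = list(c("a", "e"), list("e"))),
    attrs_order = c("a", "e")
  ),
  list()
)

# Assumption: these operate per relation schema, not per database schema.
n_schemas <- length(ds)
schema_names <- names(ds)

# Concatenation joins the relation schemas of both inputs.
combined <- c(ds, ds2)

print(n_schemas)
print(schema_names)
print(length(combined))
```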
As with relation_schema, duplicate relation schemas, after
ordering by attribute, are allowed, and can be removed with
unique.
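A minimal sketch of duplicate removal, assuming the autodb package is installed. The duplicates are produced by repeating indices when subsetting, as in the examples on this page.

```r
library(autodb)

ds <- database_schema(
  relation_schema(
    list(
      a = list(c("a", "b"), list("a")),
      b = list(c("b", "c"), list("b", "c"))
    ),
    attrs_order = c("a", "b", "c", "d")
  ),
  list(list("a", "b", "b", "b"))
)

# Repeating indices yields a schema holding each relation schema twice.
with_duplicates <- ds[c(1, 2, 1, 2)]

# unique() keeps one copy of each duplicated relation schema.
deduplicated <- unique(with_duplicates)

print(length(with_duplicates))
print(length(deduplicated))
```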
References, i.e. foreign key references, are allowed to have different
attribute names in the child and parent relations; this cannot occur in the
output of autoref and normalise.
Subsetting removes any references that involve removed relation schemas.
Removing duplicates with unique changes references involving
duplicates to involve the kept equivalent schemas instead. Renaming relation
schemas with names<- also changes their names in
the references.
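The effect of subsetting and renaming on references can be sketched as follows, assuming the autodb package is installed. The new names "child" and "parent" are arbitrary choices for illustration.

```r
library(autodb)

ds <- database_schema(
  relation_schema(
    list(
      a = list(c("a", "b"), list("a")),
      b = list(c("b", "c"), list("b", "c"))
    ),
    attrs_order = c("a", "b", "c", "d")
  ),
  list(list("a", "b", "b", "b"))
)

# Keeping only schema "a" removes schema "b", so the reference from
# "a" to "b" has no parent left and should be dropped.
child_only <- ds[1]
n_refs_kept <- length(references(child_only))

# Renaming the relation schemas should also rename them inside the reference.
renamed <- ds
names(renamed) <- c("child", "parent")
renamed_ref <- references(renamed)[[1]]

print(n_refs_kept)
print(renamed_ref)
```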
A database_schema object, containing relation_schemas,
with references stored in an attribute of the same name, i.e. "references".
References are stored with their attributes in the order they appear in
their respective relation schemas.
attrs, keys, attrs_order,
and references for extracting parts of the information in a
database_schema; create for creating a
database object that uses the given schema; gv
for converting the schema into Graphviz code; rename_attrs
for renaming the attributes in attrs_order; reduce for
filtering a schema's relations to those connected to a given relation by
foreign key references; subschemas to return the
relation_schema that the given schema contains;
merge_empty_keys for combining relations with an empty key;
merge_schemas for combining relations with matching sets of
keys.
rs <- relation_schema(
  list(
    a = list(c("a", "b"), list("a")),
    b = list(c("b", "c"), list("b", "c"))
  ),
  attrs_order = c("a", "b", "c", "d")
)
ds <- database_schema(
  rs,
  list(list("a", "b", "b", "b"))
)
print(ds)
attrs(ds)
keys(ds)
attrs_order(ds)
names(ds)
references(ds)
# relations can't reference themselves
## Not run:
database_schema(
  relation_schema(
    list(a = list("a", list("a"))),
    c("a", "b")
  ),
  list(list("a", "a", "a", "a"))
)
database_schema(
  relation_schema(
    list(a = list(c("a", "b"), list("a"))),
    c("a", "b")
  ),
  list(list("a", "b", "a", "a"))
)
## End(Not run)
# an example with references between differently-named attributes
print(database_schema(
  relation_schema(
    list(
      citation = list(c("citer", "citee"), list(c("citer", "citee"))),
      article = list("article", list("article"))
    ),
    c("citer", "citee", "article")
  ),
  list(
    list("citation", "citer", "article", "article"),
    list("citation", "citee", "article", "article")
  )
))
# vector operations
ds2 <- database_schema(
  relation_schema(
    list(
      e = list(c("a", "e"), list("e"))
    ),
    attrs_order = c("a", "e")
  ),
  list()
)
c(ds, ds2) # attrs_order attributes are merged
unique(c(ds, ds))
# subsetting
ds[1]
stopifnot(identical(ds[[1]], ds[1]))
ds[c(1, 2, 1, 2)] # replicates the foreign key references
c(ds[c(1, 2)], ds[c(1, 2)]) # doesn't reference between separate copies of ds
unique(ds[c(1, 2, 1, 2)]) # unique() also merges references
# another example of unique() merging references
ds_merge <- database_schema(
  relation_schema(
    list(
      a = list(c("a", "b"), list("a")),
      b = list(c("b", "c", "d"), list("b")),
      c_d = list(c("c", "d", "e"), list(c("c", "d"))),
      a.1 = list(c("a", "b"), list("a")),
      b.1 = list(c("b", "c", "d"), list("b"))
    ),
    c("a", "b", "c", "d", "e")
  ),
  list(
    list("a", "b", "b", "b"),
    list("b.1", c("c", "d"), "c_d", c("c", "d"))
  )
)
print(ds_merge)
unique(ds_merge)
# reassignment
# can't change keys included in references
## Not run: keys(ds)[[2]] <- list("c")
# can't remove attributes included in keys
## Not run: attrs(ds)[[2]] <- list("c", "d")
# can't remove attributes included in references
## Not run: attrs(ds)[[1]] <- c("a", "d")
ds3 <- ds
# can change subset of schema, but loses references between altered and
# non-altered subsets
ds3[2] <- database_schema(
  relation_schema(
    list(d = list(c("d", "c"), list("d"))),
    attrs_order(ds3)
  ),
  list()
)
print(ds3) # note the schema's name doesn't change
# names(ds3)[2] <- "d" # this would change the name
keys(ds3)[[2]] <- list(character()) # removing keys first...
attrs(ds3)[[2]] <- c("b", "c") # so we can change the attrs legally
keys(ds3)[[2]] <- list("b", "c") # add the new keys
# add the reference lost during subset replacement
references(ds3) <- c(references(ds3), list(list("a", "b", "b", "b")))
stopifnot(identical(ds3, ds))
# changing appearance priority for attributes
attrs_order(ds3) <- c("d", "c", "b", "a")
print(ds3)
# changing relation schema names changes them in references
names(ds3) <- paste0(names(ds3), "_long")
print(ds3)
# reconstructing from components
ds_recon <- database_schema(
  relation_schema(
    Map(list, attrs(ds), keys(ds)),
    attrs_order(ds)
  ),
  references(ds)
)
stopifnot(identical(ds_recon, ds))
ds_recon2 <- database_schema(
  subschemas(ds),
  references(ds)
)
stopifnot(identical(ds_recon2, ds))
# can be a data frame column
data.frame(id = 1:2, schema = ds)