tpc | R Documentation |
Like [pcalg::pc()], but takes into account a user-specified partial
ordering of the nodes/variables. This has two effects:
1) The conditional independence between x and y given S is not tested if any variable in S lies in the future of both x and y;
2) edges cannot be oriented from a higher-order to a lower-order node. In addition,
the user may specify individual forbidden edges and context variables.
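The first effect can be illustrated with a few lines of base R. This is only a sketch of the rule as worded above, not the package's internal code: the helper name is_tested and the example tier vector are hypothetical, and "in the future of both x and y" is read here as "in a tier later than the later of the two".

```r
# Hypothetical helper (not part of tpc): returns TRUE if the test of
# x and y given S would be carried out under the rule described above.
is_tested <- function(x, y, S, tiers) {
  # A conditioning variable lies "in the future of both" x and y
  # if its tier exceeds the later of the two tiers (assumed reading).
  latest <- max(tiers[x], tiers[y])
  !any(tiers[S] > latest)
}

# Made-up tier assignment for five variables
tiers <- c(A1 = 1, B1 = 1, A2 = 2, B2 = 2, A3 = 3)

ok_past   <- is_tested("A1", "B2", c("B1", "A2"), tiers)  # S within tiers 1-2
ok_future <- is_tested("A1", "B2", "A3", tiers)           # S contains a tier-3 variable
ok_empty  <- is_tested("A1", "B2", character(0), tiers)   # marginal test
```

Under this reading, only the second call is skipped, because A3 is measured after both A1 and B2.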
tpc(
  suffStat,
  indepTest,
  alpha,
  labels,
  p,
  skel.method = c("stable", "stable.parallel"),
  forbEdges = NULL,
  m.max = Inf,
  conservative = FALSE,
  maj.rule = TRUE,
  tiers = NULL,
  context.all = NULL,
  context.tier = NULL,
  verbose = FALSE,
  numCores = NULL,
  cl.type = "PSOCK",
  clusterexport = NULL
)
suffStat |
A [base::list()] of sufficient statistics, containing all necessary elements for the conditional independence decisions in the function [indepTest()]. |
indepTest |
A function for testing conditional independence. It is internally
called as |
alpha |
Significance level (a number in (0, 1)) for the individual conditional independence tests. |
labels |
(optional) character vector of variable (or "node") names.
Typically preferred to specifying |
p |
(optional) number of variables (or nodes). May be specified if |
skel.method |
Character string specifying the method; the default, "stable", provides an order-independent skeleton, see [tpc::tskeleton()]. |
forbEdges |
A logical matrix of dimension p*p. If |
m.max |
Maximal size of the conditioning sets that are considered in the conditional independence tests. |
conservative |
Logical indicating if conservative PC should be used. Defaults to FALSE. See [pcalg::pc()] for details. |
maj.rule |
Logical indicating if the majority rule should be used. Defaults to TRUE. See [pcalg::pc()] for details. |
tiers |
Numeric vector specifying the tier / time point for each variable. Must be of length 'p', if specified, or have the same length as 'labels', if specified. A smaller number corresponds to an earlier tier / time point. |
context.all |
Numeric or character vector. Specifies the positions or names of global context variables. Global context variables have no incoming edges, i.e. no parents, and are themselves parents of all non-context variables in the graph. |
context.tier |
Numeric or character vector. Specifies the positions or names of tier-specific context variables. Tier-specific context variables have no incoming edges, i.e. no parents, and are themselves parents of all non-context variables in the same tier. |
verbose |
if |
numCores |
The number of CPU cores to be used. |
cl.type |
The cluster type. Default value is |
clusterexport |
Character vector. Lists functions to be exported to nodes if numCores > 1. |
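As an illustration of how suffStat and indepTest fit together, the following base R sketch writes a Fisher z-test for partial correlation in the (x, y, S, suffStat) calling convention that pcalg-style tests such as gaussCItest follow, returning a p-value. The name my_gauss_test and the simulated data are hypothetical; in practice gaussCItest would normally be used, and this sketch is not claimed to reproduce its output exactly.

```r
# Hypothetical test function (not from tpc or pcalg).
# x, y: column positions; S: positions of the conditioning variables;
# suffStat: list with the correlation matrix C and the sample size n.
my_gauss_test <- function(x, y, S, suffStat) {
  idx <- c(x, y, S)
  # Partial correlation of x and y given S, obtained from the inverse
  # of the correlation submatrix
  P <- solve(suffStat$C[idx, idx, drop = FALSE])
  r <- -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
  # Fisher z-transform, scaled by sqrt(n - |S| - 3)
  z <- sqrt(suffStat$n - length(S) - 3) * atanh(r)
  2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-value
}

# Simulated data: w is a common cause of x and y
set.seed(1)
n <- 500
w <- rnorm(n)
dat <- cbind(x = w + rnorm(n), y = w + rnorm(n), w = w)
suffStat <- list(C = cor(dat), n = n)

p_marg <- my_gauss_test(1, 2, integer(0), suffStat)  # x vs. y, no conditioning
p_cond <- my_gauss_test(1, 2, 3, suffStat)           # x vs. y given w
```

A function of this shape could be passed as indepTest together with the matching suffStat list; the marginal p-value should be very small here, while the conditional one reflects that x and y are independent given w.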
See pcalg::pc for further information on the PC algorithm.
The PC algorithm is named after its developers Peter Spirtes and Clark Glymour (Spirtes et al., 2000).
Specifying a tier for each variable using the tiers argument has the following effects:
1) In the skeleton and v-structure learning phases, conditional independence testing is restricted such that if x is in tier t(x) and y is in t(y), only those variables are allowed in the conditioning set whose tier is not larger than t(x).
2) Following the v-structure phase, all edges that were found between two tiers are oriented toward the higher-order tier. If context variables are specified using context.all and/or context.tier, the corresponding orientations are added in this step.
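The orientation step can be sketched on a plain logical adjacency matrix. This is an illustration under stated assumptions rather than the package's implementation: the helpers orient_by_tier and add_global_context are hypothetical, and the encoding (adj[i, j] is TRUE when the mark i -> j is present, with undirected edges set in both directions) is chosen for this example only and is not the pcalg matrix convention.

```r
# Hypothetical helper: orient every edge between two tiers toward the later tier.
orient_by_tier <- function(adj, tiers) {
  # later_first[i, j] is TRUE when node i is in a later tier than node j
  later_first <- outer(tiers, tiers, ">")
  adj[later_first] <- FALSE  # remove the direction "later -> earlier"
  adj
}

# Hypothetical helper: global context variables get no parents and
# become parents of all remaining variables.
add_global_context <- function(adj, ctx) {
  others <- setdiff(colnames(adj), ctx)
  adj[, ctx] <- FALSE       # no incoming edges into context variables
  adj[ctx, others] <- TRUE  # edges from context variables to all others
  adj
}

lab <- c("A1", "B1", "A2")
tiers <- c(1, 1, 2)
adj <- matrix(FALSE, 3, 3, dimnames = list(lab, lab))
adj["A1", "B1"] <- adj["B1", "A1"] <- TRUE  # undirected edge within tier 1
adj["A1", "A2"] <- adj["A2", "A1"] <- TRUE  # undirected edge across tiers

oriented <- orient_by_tier(adj, tiers)            # A1 - A2 becomes A1 -> A2
with_ctx <- add_global_context(oriented, "A1")    # A1 -> B1 and A1 -> A2
```

The within-tier edge between A1 and B1 stays undirected after orient_by_tier and is only directed once A1 is declared a context variable.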
An object of class "pcAlgo" (see [pcalg::pcalgo]) containing an estimate of the equivalence class of the underlying DAG.
Original code by Markus Kalisch, Martin Maechler, and Diego Colombo. Modifications by Janine Witte (Kalisch et al., 2012).
M. Kalisch, M. Maechler, D. Colombo, M.H. Maathuis and P. Buehlmann (2012). Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software 47(11): 1–26.
P. Spirtes, C. Glymour and R. Scheines (2000). Causation, Prediction, and Search, 2nd edition. The MIT Press. https://philarchive.org/archive/SPICPA-2.
# load simulated cohort data
data(dat_sim)
n <- nrow(dat_sim)
lab <- colnames(dat_sim)

# estimate skeleton without taking background information into account
tpc.fit <- tpc(suffStat = list(C = cor(dat_sim), n = n),
               indepTest = gaussCItest, alpha = 0.01, labels = lab)
pc.fit <- pcalg::pc(suffStat = list(C = cor(dat_sim), n = n),
                    indepTest = gaussCItest, alpha = 0.01, labels = lab,
                    maj.rule = TRUE, solve.conf = TRUE)

identical(pc.fit@graph, tpc.fit@graph) # TRUE

# estimate skeleton with temporal ordering as background information
tiers <- rep(c(1,2,3), times=c(3,3,3))
tpc.fit2 <- tpc(suffStat = list(C = cor(dat_sim), n = n),
                indepTest = gaussCItest, alpha = 0.01, labels = lab,
                tiers = tiers)

# same estimate, with the skeleton phase run in parallel on two cores
tpc.fit3 <- tpc(suffStat = list(C = cor(dat_sim), n = n),
                indepTest = gaussCItest, alpha = 0.01, labels = lab,
                tiers = tiers, skel.method = "stable.parallel",
                numCores = 2, clusterexport = c("cor", "ecdf"))

if (requireNamespace("Rgraphviz", quietly = TRUE)) {
  data("true_sim")
  oldpar <- par(mfrow = c(1,3))
  plot(true_sim, main = "True DAG")
  plot(tpc.fit, main = "PC estimate")
  plot(tpc.fit2, main = "tPC estimate")
  par(oldpar)
}

# require that there is no edge between A1 and A3, and that any edge
# between A2 and B2 or A2 and C2 is directed away from A2
forb <- matrix(FALSE, nrow=9, ncol=9)
rownames(forb) <- colnames(forb) <- lab
forb["A1","A3"] <- forb["A3","A1"] <- TRUE
forb["B2","A2"] <- TRUE
forb["C2","A2"] <- TRUE

# note: this overwrites the parallel fit stored in tpc.fit3 above
tpc.fit3 <- tpc(suffStat = list(C = cor(dat_sim), n = n),
                indepTest = gaussCItest, alpha = 0.01, labels = lab,
                forbEdges = forb, tiers = tiers)

if (requireNamespace("Rgraphviz", quietly = TRUE)) {
  # compare estimated CPDAGs
  data("true_sim")
  oldpar <- par(mfrow = c(1,2))
  plot(tpc.fit2, main = "old tPC estimate")
  plot(tpc.fit3, main = "new tPC estimate")
  par(oldpar)
}

# force edge from A1 to all other nodes measured at time 1
# into the graph (note that the edge from A1 to A2 is then
# forbidden)
tpc.fit4 <- tpc(suffStat = list(C = cor(dat_sim), n = n),
                indepTest = gaussCItest, alpha = 0.01, labels = lab,
                tiers = tiers, context.tier = "A1")

if (requireNamespace("Rgraphviz", quietly = TRUE)) {
  # compare estimated CPDAGs
  data("true_sim")
  plot(tpc.fit4, main = "alternative tPC estimate")
}

# force edge from A1 to all other nodes into the graph
tpc.fit5 <- tpc(suffStat = list(C = cor(dat_sim), n = n),
                indepTest = gaussCItest, alpha = 0.01, labels = lab,
                tiers = tiers, context.all = "A1")

if (requireNamespace("Rgraphviz", quietly = TRUE)) {
  # compare estimated CPDAGs
  data("true_sim")
  plot(tpc.fit5, main = "alternative tPC estimate")
}