CPF: Representation of a conditional probability table as a data...

CPF R Documentation

Representation of a conditional probability table as a data frame.

Description

A conditional probability table for a node can be represented as a data frame in which a number of factor variables represent the parent variables and the remaining numeric variables give the conditional probabilities of the states of the node given the parent configuration. Each row represents one parent configuration and the corresponding conditional probabilities. A CPF is a special data.frame object which represents a conditional probability table.

Usage

is.CPF(x)
as.CPF(x)

Arguments

x

Object to be tested or coerced into a CPF.

Details

One way to store a conditional probability table is as a table in which the first several columns indicate the states of the parent variables, and the last several columns give the probabilities of the states of the child variable. Consider the following example:

     A  B  C.c1 C.c2 C.c3 C.c4
[1,] a1 b1 0.03 0.17 0.33 0.47
[2,] a2 b1 0.05 0.18 0.32 0.45
[3,] a1 b2 0.06 0.19 0.31 0.44
[4,] a2 b2 0.08 0.19 0.31 0.42
[5,] a1 b3 0.09 0.20 0.30 0.41
[6,] a2 b3 0.10 0.20 0.30 0.40

In this case the first two columns correspond to the parent variables A and B. The variable A has two possible states and the variable B has three. The child variable is C, and it has four possible states. The numbers in each row give the conditional probabilities for those states given the states of the parent variables.
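
As a minimal sketch in base R (no CPTtools functions are used, and the name cpt is only illustrative), the table above can be written as an ordinary data frame. Because each row holds a full conditional distribution for C, the numeric columns in every row should sum to one.

```r
# Hypothetical name `cpt`; base R only, mirrors the example table above.
cpt <- data.frame(
  A = factor(c("a1", "a2", "a1", "a2", "a1", "a2")),
  B = factor(c("b1", "b1", "b2", "b2", "b3", "b3")),
  C.c1 = c(0.03, 0.05, 0.06, 0.08, 0.09, 0.10),
  C.c2 = c(0.17, 0.18, 0.19, 0.19, 0.20, 0.20),
  C.c3 = c(0.33, 0.32, 0.31, 0.31, 0.30, 0.30),
  C.c4 = c(0.47, 0.45, 0.44, 0.42, 0.41, 0.40)
)

# Factor columns index the parents; numeric columns hold P(C | A, B).
prob_cols <- vapply(cpt, is.numeric, logical(1))

# One total per parent configuration (row).
row_totals <- unname(rowSums(cpt[prob_cols]))
```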

The class CPF is a subclass of data.frame (formally, it is class c("CPF","data.frame")). Although the intended interpretation is that of a conditional probability table, the normalization constraint is not enforced. Thus a CPF object could be used to store likelihoods, probability potentials, contingency table counts, or other similarly shaped objects. The function normalize scales the numeric values of CPF so that each row is normalized.
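
To illustrate what row normalization means for a table of this shape, here is a base R sketch. The helper normalize_rows is hypothetical and is not the package's normalize() implementation; it only shows the arithmetic of dividing each numeric entry by its row total, here applied to a small table of counts.

```r
# Hypothetical helper, not the package's normalize().
normalize_rows <- function(df) {
  num <- vapply(df, is.numeric, logical(1))
  totals <- unname(rowSums(df[num]))
  # Divide every numeric column elementwise by the row totals.
  df[num] <- lapply(df[num], function(col) col / totals)
  df
}

# Contingency-table counts stored in the same shape as a CPF.
counts <- data.frame(
  A = factor(c("a1", "a2")),
  C.c1 = c(1, 3),
  C.c2 = c(3, 1)
)
probs <- normalize_rows(counts)
```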

The [ method for a NeticaNode returns a CPF (if the node is not deterministic).

The function as.CPF() is designed to convert between CPAs (that is, conditional probability tables stored as arrays) and CPFs. In particular, as.CPF is designed to work with the output of NodeProbs() or a similarly formatted array. It assumes that names(dimnames(x)) are the names of the variables, and dimnames(x) is a list of character vectors giving the names of the states of the variables. (See CPA for details.) This general method should work with any numeric array for which both dimnames(x) and names(dimnames(x)) are specified.
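
The requirements on the array can be checked before conversion. The following base R sketch is an assumption about what a sensible pre-check looks like, not code taken from as.CPF(); the names var_names, has_names and has_states are illustrative.

```r
arr <- array(1:24, c(2, 3, 4),
             dimnames = list(A = c("a1", "a2"),
                             B = c("b1", "b2", "b3"),
                             C = c("c1", "c2", "c3", "c4")))

# Variable names come from names(dimnames(x)).
var_names <- names(dimnames(arr))
has_names <- !is.null(var_names) && all(nzchar(var_names))

# State names come from the elements of dimnames(x).
has_states <- all(vapply(dimnames(arr), is.character, logical(1)))
```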

The argument x of as.CPF() could also be a data frame, in which case its columns are permuted so that the factor variables come first and the class tag "CPF" is added to its class.
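
The column permutation for a data frame argument can be sketched in base R as follows. This is an illustration of the described behavior under the assumption that factor and non-factor columns each keep their relative order; it is not the package's code, and the class tag is applied by hand here.

```r
mixed <- data.frame(
  C.c1 = c(0.2, 0.6),
  A = factor(c("a1", "a2")),
  C.c2 = c(0.8, 0.4)
)

# Move factor (parent) columns to the front, numeric columns after them.
is_fac <- vapply(mixed, is.factor, logical(1))
reordered <- mixed[c(which(is_fac), which(!is_fac))]

# Hand-applied class tag, mimicking the documented class vector.
class(reordered) <- c("CPF", "data.frame")
```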

Value

The function is.CPF() returns a logical value indicating whether or not is(x,"CPF") is true.

The function as.CPF() returns an object of class c("CPF","data.frame"), which is essentially a data frame with the first several columns representing the parent variables, and the remaining columns representing the states of the child variable.

Note

The parent variable list is created with a call to expand.grid(dimnames(x)[1:(p-1)]), where p is the number of dimensions of x. This produces conditional probability tables in which the first parent variable varies fastest. The Netica GUI displays tables in which the last parent variable varies fastest.
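
The row ordering can be seen in a small base R sketch. It uses the expand.grid() call quoted above, but the surrounding steps (flattening the array with matrix() and naming the columns with paste()) are assumptions about one way to do the conversion, not the package's code. The final reordering shows how to obtain the last-parent-fastest layout for this two-parent case.

```r
arr <- array(1:24, c(2, 3, 4),
             dimnames = list(A = c("a1", "a2"),
                             B = c("b1", "b2", "b3"),
                             C = c("c1", "c2", "c3", "c4")))
p <- length(dim(arr))  # last dimension is the child variable

# First parent (A) varies fastest, matching column-major array storage.
parents <- expand.grid(dimnames(arr)[1:(p - 1)])

# One row per parent configuration, one column per child state.
probs <- matrix(as.vector(arr), nrow = nrow(parents), ncol = dim(arr)[p])
colnames(probs) <- paste(names(dimnames(arr))[p], dimnames(arr)[[p]],
                         sep = ".")
flat <- cbind(parents, as.data.frame(probs))

# Sort by A, then B, so the last parent (B) varies fastest.
last_fastest <- flat[order(flat$A, flat$B), ]
```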

Note, this is an S3 class, as it is basically a data.frame with special structure.

Change in R 4.0. Note that as of R 4.0, character vectors are no longer automatically coerced into factors when added to data frames. This is probably a good thing, as the code can assume that a column that is a factor is an index for a variable, and one that is a character is a comment or some other data.
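
A short base R sketch of the distinction: stringsAsFactors = FALSE is passed explicitly so the result does not depend on the R version, and the column name note is only illustrative.

```r
raw <- data.frame(
  A = c("a1", "a2", "a1"),
  note = c("checked", "draft", "checked"),
  stringsAsFactors = FALSE
)

# Convert only the column meant to index a variable.
raw$A <- factor(raw$A)

# Factor columns are treated as parent variables; character columns are not.
parent_cols <- names(raw)[vapply(raw, is.factor, logical(1))]
```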

Author(s)

Russell Almond

See Also

NodeProbs(), Extract.NeticaNode, CPA, normalize()

Examples

# Note:  in R 4.0, the factor() call is required.
arf <- data.frame(A=factor(rep(c("a1","a2"),each=3)),
                  B=factor(rep(c("b1","b2","b3"),2)),
                  C.c1=1:6, C.c2=7:12, C.c3=13:18, C.c4=19:24)
arf <- as.CPF(arf)


arr <- array(1:24,c(2,3,4),
            dimnames=list(A=c("a1","a2"),B=c("b1","b2","b3"),
                          C=c("c1","c2","c3","c4")))
arrf <- as.CPF(arr)
stopifnot(
  is.CPF(arrf),
  all(levels(arrf$A)==c("a1","a2")),
  all(levels(arrf$B)==c("b1","b2","b3")),
  nrow(arrf)==6, ncol(arrf)==6
)

##Warning, this is not the same as arf, rows are permuted.
as.CPF(as.CPA(arf))

## Not run: 
  ## Requires RNetica
  as.CPF(NodeProbs(node))

## End(Not run)

ralmond/CPTtools documentation built on Dec. 27, 2024, 7:15 a.m.