OCP.CPF: Plots a conditional probability distribuiton with data...

View source: R/OCP.R

OCP.CPFR Documentation

Plots a conditional probability distribuiton with data overlayed

Description

The observable characteristic plot is a plot that compares a conditional probabiltiy distribution with a set of data nominally from that distribution. The exp argument is the conditional distribution in CPF format. This is plotted as a barchart. The obs argument is the data in CPF (rows not normalized). The data are plotted as cumulative probability estimates with error bars overlaid on the barplot. The function OCP2.CPF does the same thing, but contrasts two different data sets.

Usage

OCP.CPF(obs, exp, ..., baseCol = "chocolate",
        limits = c(lower = 0.025, upper = 0.975),
        Parents = getTableParents(exp),
        States = getTableStates(exp),
        nstates=length(States),
        key=list(points=FALSE,rectangles=TRUE,text=States),
        par.settings = list(),
        point.pch = paste((nstates-1):0),
        point.col = "black", point.cex = 1.25,
        line.col = "black", line.type=1)
OCP2.CPF(obs1, obs2, exp, ..., baseCol = "chocolate",
        limits = c(lower = 0.025, upper = 0.975),
        Parents = getTableParents(exp),
        States = getTableStates(exp),
        nstates=length(States),
        key=list(points=FALSE,rectangles=TRUE,text=States),
        par.settings = list(),
        point1.pch = paste((nstates-1):0),
        point1.col = "black", point1.cex = 1.25,
        line1.col = "black", line1.type=1,
        point2.pch = paste((nstates-1):0),
        point2.col = "cyan", point2.cex = 1.25,
        line2.col = "cyan", line2.type=2)

Arguments

obs, obs1, obs2

A CPF which contains the observed data. The 1 and 2 variants specify the upper and lower plotted data data for OCP2.CPF.

exp

A CPF which contains the candidate distribution.

...

Other arguments passed to barchart.

baseCol

The base color to use for the color gradients in the bars.

limits

A vector of two probabilities representing the quantiles of the beta distribution to be used in error bars.

Parents

A character vector giving the names of the parent variables.

States

A character vector giving the names of the child states.

nstates

An integer giving the nubmer of states.

par.settings

Setting passed to barchart.

key

Settings passed to simpleKey to construct the key.

point.pch, point1.pch, point2.pch

Characters used to plot data points.

point.col, point1.col, point2.col

The colour to be used for the overlay points.

point.cex, point1.cex, point2.cex

The size (cex) to be used for the overlay points.

line.col, line1.col, line2.col

The colour to be used for the overlay lines.

line.type, line1.type, line2.type

The type (lty) for the overlay lines.

Details

The cannonical use for this plot is to test the conditional probability model used in particular Bayesian network. The exp argument is the proposed table. If the parents are fully observed, then the obs argument would be a contingency table with the rows representing parent configurations and the columns the data. If the parents are not fully observed, then obs can be replaced by an expected contingincy table (see Almond, et al, 2015).

The bars are coloured using an intesity scale starting from a base colour. The baseColor argument gives the darkest colour in the scale. The base plot is very similar to barchart.CPF.

Point and interval estimates are made for the cumulative probabilities using a simple Bayesian estimator (this avoids potential issues with 0 cell counts). Basically, the expected values (exp) are added to the data as a prior. The intervals are formued using quantiles of the beta distribution. The limits argument gives the quantiles used.

The point and interval estimaters are plotted over the barchart using a numeric label (going from 0 smallest to k-1, were k is the number of states of the child variable) and line running from the lower to upper bound. If the line overlaps the corresponding bar in the barchart, then the data are consistent with the model (at least in that cell).

The parameters point.pch point.col, point.cex, line.col, and line.type control the appearance of the points and lines. These may either be scalars or vectors which match the number of states of the child variable.

The function OCP2.CPF is similar but it meant for situations in which two different data distributions are to be compared. In this one (obs1, point1.*, line1.*) is plotted in the upper half of the bar, and the other (obs2, point2.*, line2.*) in the lower half. Two important applications of this particular variation:

  • Two different data sets are related to the presence/absence of an additional parent variable. Thus, this is a test of whether or not the additional parent is relevant.

  • Two different data sets represent a focal and reference demographic group. This is a test of whether or not the item behaves similarly for both groups.

Value

Returns an object of class lattice::trellis.object. The default print method will plot this function.

Author(s)

Russell Almond

References

Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 10.

Sinharay, S. and Almond, R.G. (2006). Assessing Fit of Cognitively Diagnostic Models: A case study. Educational and Psychological Measurement. 67(2), 239–257.

Sinharay, S., Almond, R. G. and Yan, D. (2004). Assessing fit of models with discrete proficiency variables in educational assessment. ETS Research Report. http://www.ets.org/research/researcher/RR-04-07.html

See Also

betaci, OCP, barchart.CPF, cptChi2

Examples



  skill1l <- c("High","Medium","Low") 
  skill2l <- c("High","Medium","Low","LowerYet") 
  correctL <- c("Correct","Incorrect") 
  pcreditL <- c("Full","Partial","None")
  gradeL <- c("A","B","C","D","E") 

  cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory",
                           link="normalLink",linkScale=.5)

  ## Compatible data  
  datTheta <- cpfTheta ## Copy to get the shape
  datTheta[1,] <- rmultinom(1,25,cpfTheta[1,])

  OCP.CPF(datTheta,cpfTheta)

  ## Incompatible data
  datThetaX <- cpfTheta ## Copy to get the shape
  datThetaX[1,] <- rmultinom(1,100,c(.05,.45,.5))

  OCP.CPF(datThetaX,cpfTheta)
  OCP2.CPF(datTheta,datThetaX,cpfTheta)

  cptComp <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL,
                          lnAlphas=log(c(1.2,.8)), betas=0,
                          rule="Compensatory")

  cptConj <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL,
                          lnAlphas=log(c(1.2,.8)), betas=0,
                          rule="Conjunctive")

  datComp <- cptComp
  for (i in 1:nrow(datComp))
      ## Each row has a random sample size.
      datComp[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1),cptComp[i,3:4])

  datConj <- cptConj
  for (i in 1:nrow(datConj))
      ## Each row has a random sample size.
      datConj[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1),cptConj[i,3:4])

  ## Compatible
  OCP.CPF(datConj,cptConj)
  ## Incompatible
  OCP.CPF(datComp,cptConj)
  OCP2.CPF(datConj,datComp,cptConj)

  cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL,
                         lnAlphas=log(1),
                         betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)),
                         rule="OffsetDisjunctive")
  cptPC2 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL,
                         lnAlphas=log(1),
                         betas=list(full=c(S1=0,S2=999),partial=c(S2=1,S2=0)),
                         rule="OffsetDisjunctive")


  datPC1 <- cptPC1
  for (i in 1:nrow(datPC1))
      ## Each row has a random sample size.
      datPC1[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC1[i,3:5])

  datPC2 <- cptPC2
  for (i in 1:nrow(datPC2))
      ## Each row has a random sample size.
      datPC2[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC2[i,3:5])

  ## Compatible
  OCP.CPF(datPC1,cptPC1)
  ## Incompatible
  OCP.CPF(datPC2,cptPC1)
  OCP2.CPF(datPC1,datPC2,cptPC1)




ralmond/CPTtools documentation built on Dec. 27, 2024, 7:15 a.m.