LabApplStat: Miscellaneous Scripts from the Data Science Laboratory (UCPH)

Documented in plot.designDiagram print.designDiagram summary.designDiagram update.designDiagram

#' @title The \code{designDiagram} class and some basic methods
#'
#' @name designDiagram-class
#' @description
#' Objects of class \code{designDiagram} as generated by \code{\link{DD}} is a list with entries as specified below.
#' \describe{
#'   \item{\code{response}}{Logical stating whether a response variable was present.}
#'   \item{\code{terms}}{Named vector with all terms in the design.}
#'   \item{\code{random.terms}}{Vector with the random terms in the design.}
#'   \item{\code{relations}}{Named matrix with relations between variables with the following interpretation: "0"=linear indepent, "<"=row term is a subspace of column, "<-"=row term is a subspace of column term and no other terms are inbetween, ">" and "->" the similar interpretatioin between columns and rows, name=name of minimum between row and column term.}
#'   \item{\code{inner}}{Named matrix of squared inner products of subspaces with nesting subspaces removed. Rounded at order of \code{eps} in the call to \code{link{DD}}. Used to decide orthogonality of the design.}
#'   \item{\code{Nparm}}{Named vector with the number of parameters for the terms.}
#'   \item{\code{df}}{Named vector with the degrees of freedom for the terms.}
#'   \item{\code{SS}}{Named matrix with Sum-of-Squares if a response variable was specified.}
#'   \item{\code{MSS}}{Named matrix with Mean-Sum-of-Squares if a response variable was specified.}
#'   \item{\code{pvalue}}{Named matrix with p-values for Type-I F-tests. p-values are stated at the collapsed nesting, but F-test are done against the most coarse nested random effect.}
#'   \item{\code{sigma2}}{Named vector of random effects variance estimates.}
#'   \item{\code{varcov}}{Named list of variance-covariance matrix for fixed effects relative to each of the random effects. Rounded at order of \code{eps}.}
#'   \item{\code{coordinates}}{Data frame with node coordinates of the terms. Initialized in Sugiyama layout.}
#' }
#' 
#' @param x object of class \code{designDiagram}
#' @param object object of class \code{designDiagram}
#' @param circle character specifying which circles to draw at the terms: \code{"none"}=no circles, \code{"SS"}=a circle with area proportional to the associated Sum-of-Squares, \code{"MSS"}=a circle with area proportional to the associated Mean-Sum-of-Squares, \code{"I"}=a circle with area proportional to average information, \code{"I2"}=a circle with area proportional to average information of the parameter contrasts. Defaults to \code{"none"}.
#' @param pvalue boolean specifying whether p-values should be inserted on the graphs. This is only possible if a response variable was specified. Defaults to \code{TRUE} is \code{circle="MSS"} and \code{FALSE} otherwise.
#' @param sigma2 vector of random effects variances. Defaults to \code{NULL}, in which case the estimates are used (if present), otherwise all variances are set to 1.
#' @param kill formula specifying which cirlces not to plot. Defaults to \code{~1} corresponding to not plotting the intercept term (that otherwise may overweight the remaining terms).
#' @param ca boolean deciding whether collinearity analysis is visualized. If \code{NULL} then set \code{TRUE} for non-orthogonal designs, and to \code{FALSE} for orthogonal designs. Defaults to \code{FALSE}.
#' @param max.area numeric specifying the used maximal area of circles. If \code{NULL} then \code{max.area} is derived from \code{SS}, \code{MSS} or \code{I} according to value of \code{circle}. Defaults to \code{NULL}.
#' @param relative positive numeric, which specifies needed relative increase for an area to be visualized in the collinearity analysis. Defaults to \code{0.01}.
#' @param color color of circles when \code{ca=FALSE}. Defaults to \code{NULL} corresponding to preassigned choice of colors (see details below).
#' @param circle.scaling numeric specifying size scaling of circles. Defaults to \code{1}, which corresponds to the largest circle having a radius that is half of the shortest distance between two nodes.
#' @param arrow.type specifying arrow heads via \code{\link[grid]{arrow}}. Defaults to \code{arrow(angle=20,length=unit(4,"mm"))}.
#' @param xlim x-range of diagram plot. Defaults to \code{c(0,1)}.
#' @param ylim y-range of diagram plot. Defaults to \code{c(0,1)}.
#' @param horizontal boolean specifying if the design diagram should be drawn horizontally or vertically. Defaults to \code{TRUE}.
#' @param ... not used.
#' 
#' @details For \code{plot.designDiagram} the options \code{circle="SS"} and \code{circle="MSS"} are only available if a response variable was specified for the design. 
#' For \code{circle="I"} and \code{circle="I2"} the color of the circles visualize the coefficient of variation of the informations. For the computation of the informations the variances of the random effects are either estimated (if a response variable is present), all set to 1 (otherwise), or given via the option \code{sigma2}.
#' 
#' If \code{color=NULL} and \code{ca=FALSE}, then the defaults colors are \code{"lightgreen"} for Sum-of-Squares, \code{"lightblue"} for Mean-Sum-of-Squares, and a gradient from \code{"limegreen"} to \code{"orange"} for information spread. To specify a different color gradient in the latter case, then give a vector of two colors.
#' 
#' For \code{update.designDiagram} the second argument should be a data frame with new \code{coordinates}. This can be usefull for manually setting the coordinates for plotting.
#' 
#' @seealso \code{\link{DD}}
#' 
#' @rdname designDiagram-class
#' 
#' @export
print.designDiagram <- function(x,...) {
  # Table of dimensions
  if (any(x$inner[upper.tri(x$inner)]!=0)) {
    cat("Non-orthogonal design with dimensions:\n")
  } else {
    cat("Orthogonal design with dimensions:\n")
  }
  print(rbind(Nparm=x$Nparm,df=x$df))
}


#' @rdname designDiagram-class
#' @export
summary.designDiagram <- function(object,...) {
  # Table of dimensions
  if (any(object$inner[upper.tri(object$inner)]!=0)) {
    cat("Non-orthogonal design with specifications:\n")
  } else {
    cat("Orthogonal design with specifications:\n")
  }
  if (!object$response) {
    print(rbind(Nparm=object$Nparm,df=object$df))
  } else {
    print(rbind(Nparm=object$Nparm,df=object$df,SS=object$SS[1,],MSS=object$MSS[1,]))
  }
  
  # Table of inner products
  if (any(object$inner[upper.tri(object$inner)]!=0)) {
    cat("\n")
    cat("Note: Sum-of-Squares and p-values may depend on order of terms in an non-orthogonal design.\n")
    cat("\n")
    cat("Total inner products between subspaces (used to decide orthogonality):\n")
    print(object$inner)
  }
  
  # Additional output if response is present
  cat("\n")
  if (!object$response) {
    cat("No response variable specified.\n")
  } else {
    cat("\n")
    cat("P-values for F-tests against nested random effect:\n")
    print(signif(object$pvalue,6))
  }
  
  # Table of relations
  cat("\n")
  cat("Table of relations:\n")
  print(object$relations)
}

# Hack: Define generic functions by stealing from stats-package.
# setGeneric("update",stats::update)

#' @importFrom stats update

#' @rdname designDiagram-class
#' @export
update.designDiagram <- function(object,...) {
  # take variable name from call
  new = list(...)
  # take rownames for object
  ii <- match(rownames(object$coordinates),
              rownames(new$coordinates))
  if (any(is.na(ii))) stop(paste("Terms (",paste0(rownames(object$coordinate)[is.na(ii)],collapse=","),") do not appear in reference object",sep=""))
  object$coordinates <- new$coordinates[ii,]
  return(object)
}


#' @rdname designDiagram-class
#' @export
plot.designDiagram <- function(x,circle="none",pvalue=(circle=="MSS"),sigma2=NULL,
                               kill=~1,ca=FALSE,max.area=NULL,relative=0.01,
                               color=NULL,circle.scaling=1,
                               arrow.type=arrow(angle=20,length=unit(4,"mm")),
                               xlim=c(0,1),ylim=c(0,1),
                               horizontal=TRUE,
                               ...) {
  # Hack of issue "no visible binding for global variable":
  node1.text0 <- node2.text0 <- r <- text0 <- variable <- xend <- y <- yend <- NULL
  
  # possibly auto-choose ca
  if (is.null(ca)) ca <- any(x$inner[upper.tri(x$inner)]!=0)
  
  # data frame with edges
  from <- 1+(which(x$relations=="<-")-1)%/%length(x$terms)
  to   <- 1+(which(x$relations=="<-")-1)%%length(x$terms)
  ii   <- order(from,to,decreasing=TRUE)
  g.df <- data.frame(from=as.character(from[ii]),to=as.character(to[ii]),
                     pvalue=paste0("p=",signif(x$pvalue[x$relations=="<-"][ii],digits=3)))
  g.df$pvalue[g.df$pvalue=="p=NA"] <- NA 
  
  # Layout scaled in box specified by xlim and ylim
  g <- ggraph::create_layout(g.df,x$coordinates)
  if (!horizontal) {
    tmp <- g$y
    g$y <- max(g$x)-g$x
    g$x <- tmp
  }
  if (max(g$x)>min(g$x)) {
    g$x <- xlim[1] + (xlim[2]-xlim[1])*(g$x-min(g$x))/(max(g$x)-min(g$x))
  } else {
    g$x <- mean(xlim)
  }
  if (max(g$y)>min(g$y)) {
    g$y <- ylim[1] + (ylim[2]-ylim[1])*(g$y-min(g$y))/(max(g$y)-min(g$y))
  } else {
    g$y <- mean(ylim)
  }
  
  # Text labels
  g$text  <- paste0('"',x$terms,'"[',x$df,']^',x$Nparm)[as.numeric(g$name)]
  g$text0 <- paste0(x$terms,x$Nparm)[as.numeric(g$name)]

  # Text label sizes
  tmp <- ggraph::label_rect(g$text0,fontsize=18)
  text_width  <- grid::convertX(unit(grid::unit(vctrs::field(tmp, 'width'), vctrs::field(tmp, 'width_unit')),"cm"),"npc",valueOnly = TRUE)
  text_height <- grid::convertX(unit(grid::unit(vctrs::field(tmp, 'height'), vctrs::field(tmp, 'height_unit')),"cm"),"npc",valueOnly = TRUE)
  
  # make empty graph with space for text labels inside the plot
  p <- ggraph::ggraph(g) + 
    geom_blank(aes(x=x-0.5*diff(xlim)*text_width,y=y-0.5*diff(ylim)*text_height)) +
    geom_blank(aes(x=x+0.5*diff(xlim)*text_width,y=y+0.5*diff(ylim)*text_height))

  # Radii of circles
  if (is.element(circle,c("SS","MSS","I","I2"))) {
    # switch off collinearity analysis when information is visualized
    if (is.element(circle,c("I","I2"))) ca <- FALSE
    
    # use default colors?
    if (is.null(color)) color <- switch(circle,
                                        MSS="lightblue",
                                        SS="lightgreen",
                                        c("limegreen","orange"))

    # terms that will have circles
    myterms <- setdiff(x$terms,attr(terms(kill,keep.order=TRUE),"term.labels"))
    if (attr(terms(kill),"intercept")==1) myterms <- setdiff(myterms,"1")
    if (is.element(circle,c("I","I2"))) myterms <- intersect(myterms,names(x$varcov))
    if (circle=="I2") myterms <- intersect(myterms,names(x$Nparm[x$Nparm>1]))

    # set variance parameters
    if (is.null(sigma2))    sigma2 <- x$sigma2
    if (any(is.na(sigma2))) sigma2[] <- 1
    
    # choose maximal radius
    max.r <- circle.scaling*0.5*sqrt(outer(g$x,g$x,"-")^2 + outer(g$y,g$y,"-")^2)
    max.r <- min(max.r[upper.tri(max.r)])
    
    # visualize collinearity analysis?
    if (ca) {
      # with collinearity analysis
      if (circle=="SS") {SS <- x$SS[,myterms,drop=FALSE]} else {SS <- x$MSS[,myterms,drop=FALSE]}
      ii <- apply(SS,2,function(y){min(which(diff(c(y[!is.na(y)],-1))<0))})
      # make legend
      variables <- setdiff(x$terms,setdiff(setdiff(x$terms,myterms),rownames(SS)))
      p <- p + geom_blank(aes(col=variable,fill=variable),data.frame(variable=factor(variables,levels=variables)))
      # filled bullseye
      for (i in max(ii):1) {
        # which variables should be shown?
        if (i==1) {
          jj <- (ii>=i) 
        } else {
          jj <- (ii>=i) & ((SS[i,]-SS[1,])/SS[1,] > relative)
        }
        # find radii
        # TO DO: g$name appears to be a character! Is this always true?
        mydf <- g[order(as.numeric(g$name)),]
        mydf <- mydf[is.element(x$terms,myterms),]
        mydf <- mydf[jj,]
        area <- SS[i,jj]
        mydf$r <- max.r*sqrt(area/ifelse(is.null(max.area),max(SS,na.rm=TRUE),max.area))
        if (i > 1) {
          mydf$variable <- factor(rownames(SS)[i],levels=variables)
        } else {
          mydf$variable <- factor(colnames(SS),levels=variables)
        }
        p <- p + ggraph::geom_node_circle(aes(col=variable,fill=variable),data=mydf)
      }
      # circumference bullseye
      for (i in nrow(SS):2) {
        # which variables should be shown?
        # TO DO: What is nrow(x$SS)=1 ??
        jj <- (i>ii) & ((SS[i,]-SS[cbind(ii,1:length(ii))])/SS[cbind(ii,1:length(ii))] > relative)
        jj <- jj & (!is.na(jj))
        # find radii
        mydf <- g[order(as.numeric(g$name)),]
        mydf <- mydf[is.element(x$terms,myterms),]
        mydf <- mydf[jj,]
        if (nrow(mydf)>0) {
          area <- SS[i,jj]
          mydf$r <- max.r*sqrt(area/ifelse(is.null(max.area),max(SS,na.rm=TRUE),max.area))
          mydf$variable <- factor(rownames(SS)[i],levels=variables)
          p <- p + ggraph::geom_node_circle(aes(col=variable),data=mydf,lty=2,lwd=1.2)
        }
      }
      # coord fixed
      p <- p + coord_fixed()
    } else {
      # help function for I2: informations for pairwise contrasts
      myfct <- function(y) 1/apply(do.call("rbind",sapply(2:ncol(y),function(i) cbind(i,1:(i-1)),simplify=FALSE)),
                                   1,function(z) c(1,-1)%*%y[z,z]%*%c(1,-1))
      # without collinearity analysis
      area <- switch(circle,
                     SS={x$SS[1,myterms]},
                     MSS={x$MSS[1,myterms]},
                     I={unlist(lapply(lapply(lapply(x$varcov[myterms],
                               function(y) Map("*",sigma2,y)),
                               function(y) Reduce("+",y)),
                               function(y) mean(1/diag(y))))},
                     I2={unlist(lapply(lapply(lapply(lapply(x$varcov[myterms],
                                function(y) Map("*",sigma2,y)),
                                function(y) Reduce("+",y)),
                                myfct),
                                mean))})
      area <- ifelse(is.nan(area),0,area)

      mydf <- g[order(as.numeric(g$name)),]
      mydf <- mydf[is.element(x$terms,myterms),]
      mydf$r <- max.r*sqrt(area/ifelse(is.null(max.area),max(area,na.rm=TRUE),max.area))

      # add circles
      if (is.element(circle,c("I","I2"))) {
        # compute var(information)
        tmp <- switch(circle,
                      I={unlist(lapply(lapply(lapply(lapply(x$varcov[myterms],
                                function(y) Map("*",sigma2,y)),
                                function(y) Reduce("+",y)),
                                function(y) 1/diag(y)),
                                function(y) mean((y-mean(y))^2)))},
                      I2={unlist(lapply(lapply(lapply(lapply(x$varcov[myterms],
                                 function(y) Map("*",sigma2,y)),
                                 function(y) Reduce("+",y)),
                                 myfct),
                                 function(y) mean((y-mean(y))^2)))})
        cv <- ifelse(is.nan(tmp),0,sqrt(tmp)/area)
        mydf$cv <- cv
        # if completely balanced information, then only use 1 color
        if (max(cv)==0) color <- rep(color[1],2)
        # guide name
        tmp <- ifelse(circle=="I","CV(main)","CV(pairs)")
        # add circles
        p <- p + ggraph::geom_node_circle(aes(r=r,color=cv,fill=cv),data=mydf) +
          coord_fixed() + 
          ggplot2::scale_color_gradient(low=color[1],high=color[2],limits=c(0,NA)) + 
          ggplot2::scale_fill_gradient(low=color[1],high=color[2],limits=c(0,NA)) +
          ggplot2::labs(color=tmp,fill=tmp)
      } else {
        p <- p + ggraph::geom_node_circle(aes(r=r),data=mydf,col=color,fill=color) +
          coord_fixed()
      }
    }
  }

  # add term names, number of parameters and degrees of freedom
  p <- p + ggraph::geom_node_text(aes(x=x,y=y,label=text),parse=TRUE)
  
  # add arrows
  p <- p + ggraph::geom_edge_link(aes(start_cap=ggraph::label_rect(node1.text0),
                                  end_cap=ggraph::label_rect(node2.text0)),
                                  arrow=arrow.type)
  
  # add p-values
  if (pvalue) {
    p <- p + geom_label(aes(x=(xend+x)/2, 
                            y=(yend+y)/2 - diff(ylim)*grid::convertY(unit(4,"mm"),"npc",valueOnly=TRUE)*(yend==y), label=pvalue), 
                        ggraph::get_edges(),label.padding = unit(0.15,"lines"))
  }
  
  # return graph
  return(p)
}
bomarkussen/LabApplStat documentation built on Oct. 29, 2022, 1:24 p.m.
rdrr.io home R language documentation Run R code online
CRAN packages Bioconductor packages R-Forge packages GitHub packages
Note that we can't provide technical support on individual packages. You should contact the package authors for that.
bomarkussen/LabApplStat
Miscellaneous Scripts from the Data Science Laboratory (UCPH)

R/methods_designDiagram.R
In bomarkussen/LabApplStat: Miscellaneous Scripts from the Data Science Laboratory (UCPH)

Defines functions plot.designDiagram update.designDiagram summary.designDiagram print.designDiagram

Documented in plot.designDiagram print.designDiagram summary.designDiagram update.designDiagram

R Package Documentation

Browse R Packages

We want your feedback!

bomarkussen/LabApplStat Miscellaneous Scripts from the Data Science Laboratory (UCPH)

R/methods_designDiagram.R In bomarkussen/LabApplStat: Miscellaneous Scripts from the Data Science Laboratory (UCPH)

Defines functions plot.designDiagram update.designDiagram summary.designDiagram print.designDiagram

Documented in plot.designDiagram print.designDiagram summary.designDiagram update.designDiagram

R Package Documentation

Browse R Packages

We want your feedback!

bomarkussen/LabApplStat
Miscellaneous Scripts from the Data Science Laboratory (UCPH)

R/methods_designDiagram.R
In bomarkussen/LabApplStat: Miscellaneous Scripts from the Data Science Laboratory (UCPH)