#' @title Optimal Exact Tests for Multiple Binary Endpoints
#'
#' @description Calculates global tests and multiple testing procedures to compare two groups with respect to multiple binary endpoints based on optimal rejection regions.
#'    
#' @param x a data frame of binary response vectors, or an array of numbers of failures and successes in the treatment group, or a list of marginal \emph{2 by 2} tables, see details.
#' @param y a vector of group allocations, or an array of numbers of failures and successes in the reference group, see details.
#' @param method a character variable indicating which optimization procedure to use.
#'	This can be one of \code{"alpha.greedy"}, \code{"alpha"}, \code{"number"}, \code{"power"} or \code{"bonferroni.greedy"}, see details.
#' @param alpha nominal significance level, the default is 0.025. Note that the test is one-sided.
#' @param p1 an array of assumed probabilities for failure and success in the treatment group, see details.
#' @param p0 an array of assumed probabilities for failure and success in the reference group, see details.
#' @param max.iter the maximal number of iterations in the branch and bound optimization algorithm. Defaults to 10^5.
#' @param limit the value below which contributions to alpha are set to zero (and alpha is lowered accordingly) to speed up computation. Defaults to 0.
#' @param show.region logical, if \code{TRUE} a data frame indicating which possible outcomes are elements of the rejection region of the global test is added to the output. Defaults to \code{FALSE}.
#' @param closed.test logical, if \code{TRUE} adjusted p-values for the elementary null hypotheses are calculated by applying the specified test to all intersection hypotheses in a closed testing scheme. This can be 
#'	computationally intensive, depending on the number of endpoints. 
#' @param consonant logical indicating if the global test should be constrained such that the resulting closed test is consonant. This option is only available for two endpoints. Note that the
#' Bonferroni greedy method is always consonant by construction.
#'
#' @details The null hypothesis for the global test is an identical multidimensional distribution of successes and failures in both groups.
#' The alternative hypothesis is a larger success proportion in the treatment group in at least one endpoint.
#'
#' \code{x} can be a data frame with one row per subject and one column for each endpoint. Only values of 0 or 1 are allowed,
#' with 0 indicating failure and 1 indicating success of the subject for the particular endpoint. In that case \code{y} needs to be a vector of group assignments with values 0 and 1,
#' where 0 is the reference group and 1 the treatment group.
#' Alternatively, \code{x} and \code{y} can be contingency tables in terms of \emph{2 by 2 by ... by 2} arrays. Each dimension of the array corresponds to one endpoint, the first coordinate position
#' in each dimension refers to failure in that endpoint, the second coordinate position refers to success. The array contains the number of subjects that were observed
#' for each possible combination of failures and successes. 
#' If \code{x} is a list of marginal \emph{2 by 2} tables, the Bonferroni greedy method is used. Matching the other input
#' variants, the \emph{2 by 2} tables are assumed to have the number of failures in the first row and the number of successes in the second row, and the first column to correspond to
#' the reference group, the second column to the treatment group.
#'
#' The methods \code{"alpha.greedy"}, \code{"alpha"}, \code{"number"} and \code{"power"} are based on the multivariate permutation distribution of the data conditional 
#' on the observed numbers of successes and failures across both groups. The method \code{"alpha.greedy"} uses a greedy algorithm aiming to exhaust the nominal significance level.
#' The methods \code{"alpha"}, \code{"number"} and \code{"power"} use a branch and bound algorithm to find rejection regions with, respectively,
#' maximal exhaustion of the nominal significance level, maximal number of elements or maximal power for the alternative given by \code{p1} and \code{p0}. 
#' The method \code{"bonferroni.greedy"} uses a greedy algorithm aiming to exhaust the nominal significance level of a weighted Bonferroni adjustment of multiple Fisher's exact tests.
#' See reference for further details.
#'
#' \code{p1} and \code{p0} are \emph{2 by 2 by ... by 2} arrays. Each dimension of the array corresponds to one endpoint, the first coordinate position
#' in each dimension refers to failure in that endpoint, the second coordinate position refers to success. 
#' The array contains the assumed true probabilities for each possible combination of failures and successes. 
#'
#' @return A list with class \code{multfisher} containing the following components:
#' \describe{
#'	\item{\code{call}}{the function call.}
#'	\item{\code{data}}{a data frame showing the aggregated input data. If \code{p1} and \code{p0} are provided they are included in vectorized form.}
#'	\item{\code{alpha}}{the value of \code{alpha}.}
#'	\item{\code{method}}{the chosen method, as determined by argument matching on \code{method}.}
#'	\item{\code{statistic}}{the vector of test statistics; these are the marginal numbers of successes in the treatment group.}
#'	\item{\code{p.value}}{the p-value of the global test. See reference for details on the calculation.}
#'	\item{\code{conditional.properties}}{a list of the actual significance level, the number of elements and the power of the global test. The values are calculated from the permutation 
#'		distribution of the data and are conditional on the observed total numbers of successes and failures. The power is calculated for the alternative defined through
#'		\code{p1} and \code{p0}. If \code{p1} and \code{p0} are not specified, the value for power is \code{NA}.}
#'	\item{\code{rej.region}}{Provided if \code{show.region} is \code{TRUE} and method is in \code{c("alpha","number","power","alpha.greedy")}. A data frame showing in the column rejection.region
#'		whether a multidimensional test statistic, indicated by the previous columns, is an element of the rejection region (value of 1) or not (value of 0) for the global level alpha test.
#'		The column alpha gives the probability of observing the particular vector of test statistics under the null hypothesis and conditional on the observed total numbers of
#'		successes and failures. Values of 0 occur if a combination of test statistics is not possible in the conditional distribution. The column power shows the conditional probability
#'		under the alternative defined through \code{p1} and \code{p0}. If \code{p1} and \code{p0} are not specified, the values for power are \code{NA}.}
#'	\item{\code{elementary.tests}}{a data frame showing for each endpoint the marginal odds ratio, the unadjusted one-sided p-value of Fisher's exact test and the adjusted 
#'		p-value resulting from application of the optimal exact test in a closed testing procedure.}
#'	\item{\code{closed.test}}{a data frame indicating all intersection hypotheses in the closed test and giving their p-values.}
#'	\item{\code{consonant.constraint}}{logical indicating whether the consonance constraint was used.}
#'	\item{\code{OPT}}{a list summarizing the optimization success, if applicable. The number of iterations of the branch and bound algorithm is given, as well as the 
#'		specified maximal iteration number and a logical variable indicating whether the optimization (in all steps of the closed test, if applicable) was finished.
#'		The number of iterations may be 0, which indicates that the optimization problem was solved in a pre-processing step.}
#' }
#'
#' @author Robin Ristl, \email{robin.ristl@@meduniwien.ac.at}
#' @references Robin Ristl, Dong Xi, Ekkehard Glimm, Martin Posch (2018), Optimal exact tests for multiple binary endpoints. 
#'     \emph{Computational Statistics and Data Analysis}, \strong{122}, 1-17. doi: 10.1016/j.csda.2018.01.001 (open access)
#'
#' @seealso \code{\link{print.multfisher}}, \code{\link{plot.multfisher}}
#'
#' @examples
#' ## Examples with two endpoints
#' data<-data.frame(endpoint1=c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1, 0,0,1,0,0,1,1,1,1,1,1,1,1,1,1),
#'			  endpoint2=c(0,0,0,0,0,1,1,1,1,1,1,1,1,1,1, 0,0,0,1,1,1,1,1,1,1,1,1,1,1,1),
#'			  group=rep(c(0,1),each=15))
#' ## maximal power under a specified alternative
#' p1<-matrix(c(0.1,0.2,0.2,0.5),2,2)
#' p0<-matrix(c(0.75,0.1,0.1,0.05),2,2)
#' rownames(p1)<-rownames(p0)<-c("EP1_failure","EP1_success")
#' colnames(p1)<-colnames(p0)<-c("EP2_failure","EP2_success")
#' testpower<-mfisher.test(x=data[,c(1:2)],y=data$group,method="power",
#'		p1=p1,p0=p0,closed.test=TRUE,show.region=TRUE)
#' print(testpower)
#' plot(testpower,cex=2)
#' str(testpower)
#'
#' ## maximal alpha with consonance constraint and using aggregated data as input
#' tab1<-table(data$endpoint1[data$group==1],data$endpoint2[data$group==1])
#' tab0<-table(data$endpoint1[data$group==0],data$endpoint2[data$group==0])
#' testalpha<-mfisher.test(x=tab1,y=tab0,method="alpha",closed.test=TRUE,
#'		show.region=TRUE,consonant=TRUE)
#' print(testalpha)
#' plot(testalpha,cex=2)
#'
#' ## Examples with three endpoints
#' data3EP<-data.frame(endpoint1=c(0,0,0,0,0,1,1,0,0,0, 0,0,0,0,1,1,1,1,1,1),
#'			     endpoint2=c(0,0,0,0,0,1,0,1,0,0, 0,0,1,1,1,1,1,1,1,1),
#'			     endpoint3=c(0,0,0,0,0,0,0,0,1,1, 0,0,0,1,1,1,1,1,1,1),
#'			     group=rep(c(0,1),each=10))
#'
#' ## greedy alpha exhaustion
#' testgreedy3EP<-mfisher.test(x=data3EP[,1:3],y=data3EP$group,method="alpha.greedy",
#'		show.region=TRUE,closed.test=TRUE)
#' print(testgreedy3EP)
#' par(mfrow=c(3,3))
#' for(i in 1:9) {
#'	plot(testgreedy3EP,dim=c(1,2),slice=list(T3=i),show.titles=FALSE,cex=2,xlim=c(0,8),ylim=c(0,10))
#'	title(paste("T3 =",i))
#' }
#'
#' ## Bonferroni greedy
#' mfisher.test(x=data3EP[,1:3],y=data3EP$group,method="bonferroni.greedy",closed.test=TRUE)
#' ## Bonferroni greedy with alternative input of marginal tables
#' mfisher.test(x=list(table(data3EP$endpoint1,data3EP$group),
#'		table(data3EP$endpoint2,data3EP$group),table(data3EP$endpoint3,data3EP$group)),
#'		method="bonferroni.greedy",closed.test=TRUE)
#'
#' @export
mfisher.test <- function(x,y=NULL,method=c("alpha.greedy","alpha","number","power","bonferroni.greedy"),alpha=0.025,p1=NULL,p0=NULL,max.iter=10^5,limit=0,show.region=FALSE,closed.test=FALSE,consonant=FALSE) {

	method<-match.arg(method,choices=c("alpha.greedy","alpha","number","power","bonferroni.greedy"))

	if(!method%in%c("alpha","number","power","alpha.greedy","bonferroni.greedy")) stop('method must be one of "alpha", "number", "power", "alpha.greedy", "bonferroni.greedy".')
	if(!is.logical(consonant)) stop('consonant must be logical.')
	
	if(is.list(x) & !is.data.frame(x)) {
		if(method!="bonferroni.greedy") stop(paste("Method",method,"not applicable to the provided data."))
		if(any(is.na(x))) stop("No missing values allowed if passing aggregated data.")
		anz.dim<-length(x)
		if(is.null(names(x)) | any(names(x)=="")) EPnames<-paste("EP",1:anz.dim,sep="") else EPnames<-names(x)
		dataTab<-x
		names(dataTab)<-EPnames	
	} else {
		if(is.vector(y) & !is.matrix(y) & !is.array(y)) {
			x<-as.matrix(x)
			if(length(y)!=dim(x)[1]) stop("the number of rows in x must be identical to the length of y.")
			keep<-rowSums(cbind(is.na(x),is.na(y)))==0
			if(any(keep==FALSE)) {
				x<-x[keep,]
				y<-y[keep]
				warning(paste(sum(!keep),"observation(s) removed due to missing values."))
			}
			if(any(x!=1 & x!=0)) stop("x must contain 0 and 1 only.")
			if(any(y!=1 & y!=0)) stop("y must contain 0 and 1 only.")
			anz.dim<-dim(x)[2]
			if(is.null(colnames(x))) EPnames<-paste("EP",1:anz.dim,sep="") else EPnames<-colnames(x)
			basis<-2^c(0:(anz.dim-1))
			category<-factor(x%*%basis,levels=0:(2^anz.dim-1))
			tab<-matrix(table(category,y),ncol=2)
		} else {
			if(any(is.na(x)) | any(is.na(y))) stop("No missing values allowed if passing aggregated data.")
			if(length(dim(x))!=length(dim(y))) stop("x and y must be of same dimension.")
			anz.dim<-length(dim(x))
			tab<-matrix(c(as.vector(y),as.vector(x)),ncol=2)
			x.names<-names(dimnames(x))
			y.names<-names(dimnames(y))
			#if(any(x.names!=y.names)) warning("Names of x and y do not match. Is the data correct?")
			#if(any(x.names=="")) 
			EPnames<-paste("EP",1:anz.dim,sep="") #else EPnames<-x.names
		}
		M<-namen.fn(anz.dim)
		dataTab<-cbind(M,tab)
		colnames(dataTab)<-c(paste("Success_",EPnames,sep=""),"CountGroup0","CountGroup1")
		rownames(tab)<-rownames(dataTab)<-apply(M,1,paste,collapse="")
		
		if(method=="bonferroni.greedy") {
			#marginal table list
			#x as a list of 2x2 tables. Layout: rows: failure, success; columns: control, treatment
			x<-vector(mode="list",length=anz.dim)
			for(i in 1:anz.dim) {
				set<-rep(FALSE,anz.dim)
				set[i]<-TRUE
				P<-subset.matrix.fn(set,anz.dim)
				x[[i]]<-P%*%tab
			}			
		}

		tab<-tab[,c(2,1)]
		
		#Elementary test statistics
		M<-t(namen.fn(anz.dim))
		T<-as.numeric(M%*%tab[,1])	

		if(method=="power") if(is.null(p1) | is.null(p0)) stop('p1 and p0 must be provided for method "power".')
		if(!is.null(p1) | !is.null(p0)) {
			#if(length(p1)!=(2^anz.dim) | length(p0)!=(2^anz.dim)) stop("Incorrect length of p1 or p0.")
			if( any( c(dim(p1),dim(p0))!=2) | length(dim(p1))!=anz.dim | length(dim(p0))!=anz.dim) stop("Incorrect dimension of p1 or p0.")
			p1<-as.vector(p1)
			p0<-as.vector(p0)
		}

		if(method%in%c("alpha","number","power","alpha.greedy")) {
			if(consonant==TRUE & anz.dim!=2) {
				consonant<-FALSE
				warning('Consonance constraint is only available for two endpoints.')
			}
			#Permutation distribution
			A<-mult.fisher.2J.fn(tab=tab,p1=p1,p2=p0,calc.marg.p=consonant)
		}
	}

	#Rejection region
	if(method%in%c("alpha","number","power")) {
		R<-optimal.2J.fn(V=A$vek.out,objective=method,alpha=alpha,algorithm="BB",limit=limit,maxiter=max.iter,return.array=FALSE,elementary.consonance=consonant)
		V<-p.val.fun(V=R$V,T=T,anz.dim=anz.dim,consonant=FALSE,alpha=alpha)
		if(R$opt.status==-1) {
			OPT<-list(iterations=0,maxiter=max.iter,optimization.finished=TRUE)
		} else {
			OPT<-list(iterations=R$OPT$iterations,maxiter=R$OPT$maxiter,optimization.finished=R$OPT$optimization.finished)
		}
	}
	if(method=="alpha.greedy") {
		R<-list(V=A$vek.out,anz.dim=anz.dim)
		if(consonant==FALSE) {
			V<-p.val.fun(V=R$V,T=T,anz.dim=anz.dim,all.p.values=show.region,alpha=alpha)
			if(show.region) V$V$solution[!is.na(V$V$p.val) & V$V$p.val<=alpha]<-1
		} else {
			Vtemp<-p.val.fun(V=R$V,T=T,anz.dim=anz.dim,all.p.values=TRUE,consonant=TRUE,alpha=alpha)
			Vtemp$V$solution[!is.na(Vtemp$V$p.val) & Vtemp$V$p.val<=alpha & (Vtemp$V$pval1<=alpha | Vtemp$V$pval2<=alpha)]<-1
			V<-p.val.fun(V=Vtemp$V,T=T,anz.dim=anz.dim,consonant=FALSE,alpha=alpha)
			#if(show.region) V$V$solution[!is.na(V$V$p.val) & V$V$p.val<=alpha]<-1
		}
		OPT<-list(iterations=NA,maxiter=NA,optimization.finished=NA)
	}
	if(method=="bonferroni.greedy") {
		test<-bonferroni.greedy.fn(x)
		V<-list(p.value=test$p)
		OPT<-list(iterations=NA,maxiter=NA,optimization.finished=NA)
		T<-test$stat
	}

	#closed test
	if(closed.test==TRUE & anz.dim==1) {
		warning("Just one endpoint defined. No closed test is calculated.")
		closed.test<-FALSE	
	}
	if(closed.test==TRUE) {
		S<-namen.fn(anz.dim)[-1,] == 1
		k<-dim(S)[1]
		p.intersect<-rep(NA,k)
		opt.intersect<-rep(FALSE,k)
		if(method=="bonferroni.greedy") OR<-test$OR else OR<-rep(NA,anz.dim)
		
		p.intersect[k]<-V$p.value
		opt.intersect[k]<-OPT$optimization.finished

		#layer<-1+rowSums(!S) #there are as many layers as dimensions (Endpoints)
		#with k we get the full table and hence the global intersection test
		for(i in (k-1):1) {

			if(method%in%c("alpha","number","power","alpha.greedy")) {

				#Projection matrix
				P<-subset.matrix.fn(S[i,],anz.dim)
				tab.lokal<-P%*%tab
				if(dim(tab.lokal)[1]==2) {
					p<-fisher.test(tab.lokal[c(2,1),],alternative="greater")$p.value
					p.intersect[i]<-p
					opt.intersect[i]<-TRUE
					OR[(1:anz.dim)[S[i,]]]<-tab.lokal[2,1]/tab.lokal[1,1]*tab.lokal[1,2]/tab.lokal[2,2]
				} else {
					if(!is.null(p1)) p1.lokal<-as.numeric(P%*%p1) else p1.lokal<-NULL
					if(!is.null(p0)) p2.lokal<-as.numeric(P%*%p0) else p2.lokal<-NULL
					A.lokal<-mult.fisher.2J.fn(tab=tab.lokal,p1=p1.lokal,p2=p2.lokal)
					if(method%in%c("alpha","number","power")) {
						R.lokal<-optimal.2J.fn(V=A.lokal$vek.out,objective=method,alpha=alpha,algorithm="BB",limit=limit,maxiter=max.iter,return.array=FALSE)
						opt.intersect[i]<-R.lokal$opt.status
					} 
					if(method=="alpha.greedy") {
						R.lokal<-list(V=A.lokal$vek.out,anz.dim=A.lokal$anz.dim)
						opt.intersect[i]<-NA
					}
					V.lokal<-p.val.fun(V=R.lokal$V,T=T[S[i,]],anz.dim=R.lokal$anz.dim,alpha=alpha)
					p.intersect[i]<-V.lokal$p.value
				}
			}
			if(method=="bonferroni.greedy") {
				test.intersect<-bonferroni.greedy.fn(x,subset=(1:anz.dim)[S[i,]])
				p.intersect[i]<-test.intersect$p
				opt.intersect[i]<-NA
			}
		}
		
		#unadjusted p-values
		p<-p.intersect[rowSums(S)==1]
		#adjusted p-values
		p.adj<-rep(NA,anz.dim)
		for(i in 1:anz.dim) p.adj[i]<-max(p.intersect[S[,i]])
		out.CT<-data.frame(OR,p,p.adj)
		rownames(out.CT)<-EPnames
		CT.details<-as.data.frame(cbind(S,p.intersect))
		names(CT.details)[1:anz.dim]<-EPnames
		if(method%in%c("alpha","number","power")) OPT$optimization.finished<-all(opt.intersect!=0)
	} else {
		out.CT<-NA
		CT.details<-NA
	}

	if(method%in%c("alpha","number","power")) {
		if(OPT$optimization.finished==FALSE) warning("Maximal number of iterations reached, optimization was not finished.")
	}
	
	names(T)<-paste("T",1:anz.dim,sep="")
	if(method%in%c("alpha","number","power","alpha.greedy")) {
		dataTab<-as.data.frame(dataTab)
		dataTab$p0<-p0
		dataTab$p1<-p1
	}
	
	out<-list(call=match.call(),data=dataTab,alpha=alpha,method=method,statistic=T,p.value=V$p.value,conditional.properties=NA,rej.region=NA,elementary.tests=out.CT,closed.test=CT.details,consonance.constraint=consonant,OPT=OPT)
	#Properties of global test conditional on observed data
	if(method%in%c("alpha","number","power","alpha.greedy")) {
		out$conditional.properties<-list(alpha=sum(V$V$alpha[V$V$solution==1]),number=sum(V$V$solution),power=sum(V$V$power[V$V$solution==1]))
	} else {
		out$conditional.properties<-NULL
	}
	if(show.region) {
		if(method=="bonferroni.greedy") {
			warning("show.region ignored when method is bonferroni.greedy.")
		} else {
			out$rej.region<-V$V[c(paste("T",1:anz.dim,sep=""),"solution","alpha","power")]
			colnames(out$rej.region)[anz.dim+1]<-"rejection.region"
		}
	} else {
		out$rej.region<-NULL
	}
	if(closed.test==FALSE) {
		out$elementary.tests<-NULL
		out$closed.test<-NULL
	}
	class(out)<-"multfisher"
	return(out)
} 

#' @title Print Values from a \code{multfisher} Object
#'
#' @description Print the test results.
#'    
#' @param x an object of class \code{multfisher}
#' @param ... further arguments passed to other methods. Not used.
#' @author Robin Ristl, \email{robin.ristl@@meduniwien.ac.at}
#' @seealso \code{\link{mfisher.test}}, \code{\link{plot.multfisher}}
#' @export
print.multfisher<-function(x,...) {
	cat("\n    Optimal exact test for multiple binary endpoints\n\n")
	cat("Aggregated data:\n")
	print(x$data)
	cat("\n")
	if(x$method%in%c("alpha","number","power")) {
		if(x$consonance.constraint==TRUE) constext<-"       with consonance constraint\n" else constext<-""
		cat(paste('Method: Optimization of',paste('"',x$method,'"',sep=""),"for nominal alpha =",x$alpha,"\n",constext))
	} else {
		if(x$consonance.constraint==TRUE) constext<-"with consonance constraint\n" else constext<-"\n"
		cat(paste('Method:',paste('"',x$method,'"',sep=""),constext))
	}
	cat("\n")
	cat(paste("Global test p-value =",round(x$p.value,4),"\n"))
	cat("\n")
	cat("Null hypothesis: Equal success distribution in both groups.\n")
	cat("Alternative hypothesis: Larger success proportion in the treatment group in at least one endpoint.\n")
	if(!is.null(x$elementary.tests)) {
		cat("\nMarginal odds ratios and adjusted p-values:\n")
		x$elementary.tests[,1]<-round(x$elementary.tests[,1],2)
		x$elementary.tests[,2:3]<-round(x$elementary.tests[,2:3],4)
		print(x$elementary.tests)
	}
	cat("\n")
}

#' @title Plot Rejection Region from a \code{multfisher} Object
#'
#' @description Plot two dimensions of the rejection region.
#'    
#' @param x an object of class \code{multfisher}
#' @param dim a vector of length two, indicating which two dimensions of the rejection region to plot. The default is \code{c(1,2)}.
#'	The argument is ignored if \code{x} was calculated for only one endpoint.
#' @param slice a named list of numeric values at which the test statistics for the dimensions not included in \code{dim} are held constant, see details.
#' @param show.titles logical indicating whether the plot title and explanatory subtitles are shown.
#' @param ... further arguments passed to the generic \code{\link[graphics]{plot}} function.
#' @details The function produces plots of the multivariate rejection regions calculated by \code{\link{mfisher.test}}. 
#'	For more than two dimensions, the default \code{slice=NULL} shows a
#'	projection of the rejection region on the two dimensions indicated in \code{dim}. \code{slice} may be specified to produce plots of slices through the 
#'	multivariate rejection region. In that case \code{slice} must be a named list of numeric objects. The names must be of the form \code{Ti}, where \code{i} is replaced
#'	by the number of the dimension to be indicated. The numeric value defines at which value the test statistic for the indicated dimension is held constant.
#'	(If there are dimensions that are neither indicated in \code{dim} nor in \code{slice}, the plot is still a projection.)
#' @author Robin Ristl, \email{robin.ristl@@meduniwien.ac.at}
#' @seealso \code{\link{mfisher.test}}, \code{\link{print.multfisher}}
#'	
#' @examples
#' ## Example with two endpoints
#' data<-data.frame(endpoint1=c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1, 0,0,1,0,0,1,1,1,1,1,1,1,1,1,1),
#'			endpoint2=  c(0,0,0,0,0,1,1,1,1,1,1,1,1,1,1, 0,0,0,1,1,1,1,1,1,1,1,1,1,1,1),
#'			group=rep(c(0,1),each=15))
#' plot(mfisher.test(x=data[,c(1:2)],y=data$group,show.region=TRUE),cex=2)
#'
#' ## Example with three endpoints
#' data3EP<-data.frame(endpoint1=  c(0,0,0,0,0,1,1,0,0,0, 0,0,0,0,1,1,1,1,1,1),
#'			  endpoint2=  c(0,0,0,0,0,1,0,1,0,0, 0,0,1,1,1,1,1,1,1,1),
#'			  endpoint3=  c(0,0,0,0,0,0,0,0,1,1, 0,0,0,1,1,1,1,1,1,1),
#'			  group=rep(c(0,1),each=10))
#' testgreedy3EP<-mfisher.test(x=data3EP[,1:3],y=data3EP$group,method="alpha.greedy",
#'		show.region=TRUE,closed.test=TRUE)
#' ## Projection on the first two dimensions
#' plot(testgreedy3EP,dim=c(1,2),cex=2)
#' ## Slice at a value of 5 for the third dimension
#' plot(testgreedy3EP,dim=c(1,2),slice=list(T3=5),cex=2)
#' ## Show all slices through the third dimension
#' par(mfrow=c(3,3))
#' for(i in 1:9) {
#'	plot(testgreedy3EP,dim=c(1,2),slice=list(T3=i),show.titles=FALSE,cex=2,xlim=c(0,8),ylim=c(0,10))
#'	title(paste("T3 =",i))
#' }
#'
#' @export
plot.multfisher<-function(x,dim=c(1,2),slice=NULL,show.titles=TRUE,...) {
	if(is.null(x$rej.region)) stop("Rejection region required. Use mfisher.test with show.region=TRUE.")
	R<-x$rej.region[x$rej.region$alpha>0,]
	pch.region<-ifelse(R$rejection.region==1,19,1)
	if(length(x$statistic)>1) {
		if(length(dim)!=2) stop("dim must be a vector of length two.")
	if(max(dim)>length(x$statistic)) stop("Values in dim must be less than or equal to the number of dimensions of the rejection region.")
		if(!is.null(slice)) {
			if(!is.list(slice)) stop("slice must be a list.")
			set<-colSums(t(R[,names(slice),drop=FALSE])==unlist(slice))==length(slice)
			if(sum(set)==0) stop("Provided value for slice results in an empty set.")
			R<-R[set,]
			pch.region<-pch.region[set]
		}
		plot(R[,dim], xlab=paste("Statistic endpoint",dim[1]),ylab=paste("Statistic endpoint",dim[2]),pch=pch.region,...)
		abline(v=unique(R[,dim[1]]),col="grey")
		abline(h=unique(R[,dim[2]]),col="grey")
		if(!is.null(slice) & show.titles==TRUE) mtext(side=1,text=paste("Other statistics held constant at:",paste(names(slice),slice,sep="=",collapse=", ")),line=4,adj=0)
	} else {
		plot(R[,1],rep(1,dim(R)[1]), xlab=paste("Statistic endpoint",dim[1]),ylab="",pch=pch.region,axes=FALSE,...)
		box()
		axis(1)
		abline(v=unique(R[,dim[1]]),col="grey")
		abline(h=1,col="grey")
	}
	if(show.titles==TRUE) {
		title("Optimal exact test rejection region")
		mtext(side=3,text="Solid: rejection region. Empty: not in rejection region. No dot: not in permutation distribution.")
	}
}


C.ii.fn<-function(ko,sort.it=FALSE, keep.empty=FALSE) {
	#Creates the constraint matrix for criterion (ii) ("no holes").
	#Set keep.empty=TRUE to keep empty constraints. This is necessary if the branch and bound algorithm is used,
	#because it uses the n x n constraint matrix as a lookup table for cond(ii).
	if(!is.matrix(ko) & !is.data.frame(ko)) ko<-matrix(ko,ncol=1)
	if(sort.it==TRUE) for(i in 1:dim(ko)[2]) ko<-ko[order(ko[,i]),] #so it also works for one-dimensional problems
	n<-dim(ko)[1]
	C.ii<-matrix(0,n,n) 
	if(n > 0) {
		not.empty<-rep(TRUE,n)
		for(i in 1:n) {
			for(j in 1:n) if(i!=j & all(ko[j,]>=ko[i,])) C.ii[i,j]<- -1
			summe<- -sum(C.ii[i,])
			
			if(summe>0)	{
				#C.ii.t[,i]<-C.ii.t[,i]/summe
				#C.ii.t[i,i]<-1
				C.ii[i,i]<- summe

			} else {
				if(keep.empty==TRUE) C.ii[i,i]<- -1 else not.empty[i]<-FALSE
				# any number <= 0 is possible,
				#better to remove empty constraints
			}
		}
		C.out<-C.ii[not.empty,] 
	} else {
		C.out<-C.ii
	}
	if(!is.matrix(C.out)) C.out<-matrix(C.out,nrow=n)
	C.out
}
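
#Illustrative sketch (not run): the cond(ii) constraint matrix for the four
#points (0,0), (1,0), (0,1), (1,1). A 0/1 region vector x is free of holes
#exactly if C.ii.fn(ko) %*% x <= 0 holds componentwise.
if(FALSE) {
	ko<-as.matrix(expand.grid(T1=0:1,T2=0:1))
	C.ii.fn(ko)
	#The row for (0,0) reads 3*x1 - x2 - x3 - x4 <= 0, i.e. (0,0) may only
	#be included if all three componentwise larger points are included.
	#The maximal point (1,1) yields no constraint and its row is dropped.
}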

C.ii.t.fn<-function(ko,sort.it=FALSE, keep.empty=FALSE) {
	#As C.ii.fn, but calculates the transposed constraint matrix for criterion (ii) ("no holes").
	#Set keep.empty=TRUE to keep empty constraints. This is necessary if the branch and bound algorithm is used,
	#because it uses the n x n constraint matrix as a lookup table for cond(ii).
	if(!is.matrix(ko) & !is.data.frame(ko)) ko<-matrix(ko,ncol=1)
	if(sort.it==TRUE) for(i in 1:dim(ko)[2]) ko<-ko[order(ko[,i]),] #so it also works for one-dimensional problems
	n<-dim(ko)[1]
	C.ii.t<-matrix(0,n,n) 
	if(n > 0) {
		not.empty<-rep(TRUE,n)
		for(i in 1:n) {
			for(j in 1:n) if(i!=j & all(ko[j,]>=ko[i,])) C.ii.t[j,i]<- -1
			summe<- -sum(C.ii.t[,i])
			
			if(summe>0)	{
				#C.ii.t[,i]<-C.ii.t[,i]/summe
				#C.ii.t[i,i]<-1
				C.ii.t[i,i]<- summe

			} else {
				if(keep.empty==TRUE) C.ii.t[i,i]<- -1 else not.empty[i]<-FALSE
				# any number <= 0 is possible,
				#better remove empty constraints
			}
		}
		C.out<-C.ii.t[,not.empty] #we calculate and return the transposed constraint matrix 
	} else {
		C.out<-C.ii.t
	}
	if(!is.matrix(C.out)) C.out<-matrix(C.out,nrow=n)
	C.out
}

vek2array.fn<-function(vek,d,name.shift=0) { #previously shift=-1, but we don't need this option anymore
	anz.dim<-length(d)
	dim.namen<-vector(length=anz.dim,mode="list")
	for(i in 1:anz.dim) dim.namen[[i]]<-(1:d[i])+name.shift
	array(vek,dim=d,dimnames=dim.namen)
}
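
#Minimal sketch (not run): reshape a vectorized distribution back into array
#form, here a 2 x 3 array with dimnames 1:2 and 1:3 (offset by name.shift).
if(FALSE) {
	vek2array.fn(1:6,d=c(2,3))
}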

mult.fisher.2J.fn<-function(tab=NULL,n=NULL,m=NULL,M=NULL,p1=NULL,p2=NULL,margin.prob=FALSE,calc.marg.p=FALSE) {
	#either a full table must be provided via argument tab,
	#or the marginal sums must be provided as n and m
	if(!is.null(tab)) {
		m<-rowSums(tab)
		n<-colSums(tab)
	}
	#m...row margins
	#n...column margins (number of observations in the two groups)
	
	#Matrix M to get test statistics as sums of the appropriate cell entries
	#M%*%tab[,1] gives the test statistics
	#e.g. for two dimensions M=matrix(c(0,1,0,1,0,0,1,1),byrow=TRUE,nrow=2)
	if(is.null(M)) {
		anz.dim<-round(log2(length(m))) #round just in case there are numerical issues
		M<-t(namen.fn(anz.dim))
	} else {
		anz.dim<-dim(M)[1]
	}
	
	k<-length(m)
	x<-rep(NA,k)
	N<-sum(m)
	cur.lim<-matrix(NA,nrow=k,ncol=2)
	stop<-FALSE
	j<-1:k
	u<-0 #so that we start with the full loop
	ws<-NULL

	#Upper bound for the test statistics
	#this should be improved, so that we look only at the possible range of values for the test statistics
	m.marg<-as.numeric(M%*%m) #marginal sum of successes (succ A, succ B, ...)
	#now use the usual formula for 2x2 tables:
	low.marg<-up.marg<-rep(NA,anz.dim)
	for(i in 1:anz.dim) {
		#lowest value the i-th test stat can take:
		low.marg[i]<-max(0,m.marg[i]-n[2])
		#largest value the i-th test stat can take:
		up.marg[i]<-min(m.marg[i],n[1]) #maximum
	}

	#Coordinates and test statistics have this relation:
	#T_i = X_i + low.marg_i - 1 
	#X_i = T_i + 1 - low.marg_i
	#Tmax<-as.numeric(M%*%m) #this is also the marginal number of successes
	#d<-Tmax+1 #in dimension i, the array ranges from 0 to d[i]
	d<- up.marg-low.marg + 1 #this is the size of the nonzero elements of the distribution in each dimension
	#anz.dim<-length(d) #number of dimensions
	#prepare vectorized output
	WA.vek<-Z.vek<-F.vek<-rep(0,prod(d))   # array(0,dim=d) #+1 because 0 is in general a possible value for the test stats
	cp<-cumprod(d[-length(d)]) #required to set indices
	WS<-NA
	WS.vek<-NA
	if(margin.prob==TRUE) ws.obs.margins<-0 else ws.obs.margins<-NA #probability of observing a table with the provided margins; may be used when calculating unconditional power
	#log.total.perm<-lchoose(sum(n),n[1])
	total.perm<-choose(sum(n),n[1])

	while(!stop) {
		#for i>u: calculate cur.lims anew and set x[i] to the lower limit. x are the observed frequencies for group 1 in the current permutation
		if(u+1<=k-1) {
			for(i in (u+1):(k-1)) {
				x.sum<-ifelse(i==1,0,sum(x[1:(i-1)])) #x sum set so far #sum(c(0,x[0:(i-1)]))
				m.sum<-sum(m[(i+1):k]) #remaining margin sum
				low<-max(n[1]-x.sum-m.sum,0) #c(0,x[0])=0
				up<-min(n[1]-x.sum,m[i])
				cur.lim[i,]<-c(low,up)
				x[i]<-low
			}
		}
		x[k]<-n[1]-sum(x[1:(k-1)])
		cur.lim[k,]<-c(x[k],x[k]) #here, only one possibility remains
		#Calculate probability for current permutation
		mgl<-prod(choose(m,x))
		#ws<-c(ws,mgl)

		#using logs also works, but it seems to be less exact. Is it faster?
		#log.mgl<-sum(lchoose(m,x))
		#ws<-c(ws,log.mgl)

		#Current vector of test statistics (not plus 1 here, because the coordinates are calculated below)
		T.cur<-M%*%x # + 1
		#Current vector of coordinates X_i = T_i + 1 - low.marg_i
		X.cur<-T.cur + 1 - low.marg
		
		#Coordinates in the vectorized matrix (i.e. in the vector)
		ko<-sum(cp*(X.cur[-1]-1))+X.cur[1]
		
		Z.vek[ko]<-Z.vek[ko]+mgl
		#F.vek[ko]<-F.vek[ko]+exp(log.mgl-log.total.perm)
		#F.vek[ko]<-F.vek[ko]+mgl/total.perm

		#old version using matrix: #Z[T.A+1,T.B+1]<-Z[T.A+1,T.B+1]+mgl
		
		#WS, Probability under the alternative
			if(!is.null(p1) & !is.null(p2) ) {
				x2<-m-x
				p.u.log<-sum(x*(log(p1)-log(p2)))-sum(lfactorial(x))-sum(lfactorial(x2))
				WA.vek[ko]<-WA.vek[ko]+exp(p.u.log)

				#unconditional probability of these table margins under given alternative p1 and p2:
				if(margin.prob==TRUE) ws.obs.margins<-ws.obs.margins+exp(dmultinom(x=x,prob=p1,log=TRUE)+dmultinom(x=x2,prob=p2,log=TRUE))
			
			}

		#Stop or next iteration
		if(all(x==cur.lim[,2])) {
			stop<-TRUE
		} else {
			#increase x[i] by 1, where i=max{j=1,...,k : x[j]<cur.lim[j,2]}
			u<-max(j[x<cur.lim[,2]])
			x[u]<-x[u]+1
		}
	}

		
	#Distribution under H0
	F.vek<-Z.vek/sum(Z.vek)
	#F.vek<-Z.vek/total.perm

	#done above, using log/log.total

	#make dimnames, and in the same loop, make the coordinate vectors X1, X2, ... for the vectorized output
	dim.namen<-vector(length=anz.dim,mode="list")
	koord<-matrix(NA,ncol=anz.dim,nrow=prod(d))
	colnames(koord)<-paste("X",1:anz.dim,sep="")

	#calculate marginal p-values (one-sided), these are used for the p-value based rejection regions
	marg.pval<-matrix(NA,ncol=anz.dim,nrow=prod(d))
	colnames(marg.pval)<-paste("pval",1:anz.dim,sep="")

	for(i in 1:anz.dim) {
		dim.namen[[i]]<-low.marg[i]:up.marg[i]     #0:(d[i]-1)

		#Coordinates, use recycling of vectors in R
		koord[,i]<-rep(1:d[i],each=prod( c(1,d)[1:i] ) )
			#c(1,d), so that in the first dimension each entry is repeated once, and otherwise the number of repeats is the number of values that occur in the previous dimension
		
		#and the marginal p-values:
		if(calc.marg.p==TRUE) {		
			t.temp<-low.marg[i]:up.marg[i]    #0:Tmax[i], i.e. 0:(d[i]-1); Tmax also equals the marginal number of successes
			p.temp<-phyper(q=t.temp,m=n[1],n=n[2],k=m.marg[i],lower.tail=FALSE)+
					dhyper(x=t.temp,m=n[1],n=n[2],k=m.marg[i])
			marg.pval[,i]<-rep(p.temp,each=prod( c(1,d)[1:i] ) )   #c(1,d), as above
		}
	}

	#Distributions under H0 and alternative in array form:
	#null distribution
	F<-array(F.vek,dim=d,dimnames=dim.namen)
	#Joint distribution of T under the alternative p1,p2
	if(!is.null(p1) & !is.null(p2) ) {
		WS.vek<-WA.vek/sum(WA.vek)
		WS<-array(WS.vek,dim=d,dimnames=dim.namen)
	}
	
	#Vectorized output; the LP requires this type of input anyway and it works for arbitrary dimensions
	#Teststatistics from coordinates
	#T_i = X_i + low.marg_i - 1 
	T<-matrix(NA,nrow=dim(koord)[1],ncol=dim(koord)[2])
	for(i in 1:dim(koord)[2]) T[,i]<-koord[,i] + low.marg[i] -1

	colnames(T)<-paste("T",1:anz.dim,sep="")
	vek.out<-data.frame(koord,T,index=1:prod(d),alpha=F.vek,power=WS.vek,solution=0,E=0,marg.pval)

	if(!is.null(tab)) obs.prob<-prod(choose(m,tab[,1])) else obs.prob<-NA #number of permutations that give the observed table (if provided)
	list(vek.out=vek.out,d=d,anz.dim=anz.dim,M=M,n=n,m=m,marginal.succ=m.marg,F=F,WS=WS,p1=p1,p2=p2,ws.obs.margins=ws.obs.margins,total.perm=total.perm,sumZ=sum(Z.vek),sum(Z.vek)==total.perm,dim.out=dim(vek.out),dim.koord=dim(koord),l.F.vek=length(F.vek))
}
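
#Illustrative sketch (not run, with made-up counts): the joint conditional
#permutation distribution for two endpoints. The rows of tab are the outcome
#combinations (00, 10, 01, 11), column 1 holds the counts of the treatment
#group and column 2 those of the reference group.
if(FALSE) {
	tab<-matrix(c(1,2,3,4, 4,3,2,1),ncol=2)
	A<-mult.fisher.2J.fn(tab=tab)
	A$F              #conditional null distribution of (T1,T2) in array form
	head(A$vek.out)  #the same distribution in vectorized form
}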

optimal.2J.fn<-function(V,preproc.results=NULL,report.pp=TRUE,elementary.consonance=FALSE,objective=c("alpha","power","area","specified")[3],obj.spec=NULL,true.power=NULL,anz.dim=sum(grepl("X",names(V))),alpha=0.025,limit=0,limit.obj=0,allowed=NULL,koo.cur=NULL,subset=NULL,return.array=FALSE,rescale.obj=10000,algorithm=c("LP","Rglpk","BB")[1],keep.empty=(algorithm=="BB"),presolution=NULL,maxiter=100000,...) {
	if(objective=="number") objective<-"area"
	#If preproc.results (which contains all results of the pre-processing) is supplied, the function skips the pre-processing
	#and directly starts the optimization.
	#This is much faster, because the pre-processing can take a second or so.
	#Also any parts referring to "allowed" are skipped, as we assume that the supplied X3 and C.mat
	#already match any such restrictions.
	#Take care: the setting for limit must be the same as in the calculation of the pre-processing.

	#rescale.obj is a factor by which the objective contributions are multiplied when the objective is
	#alpha or power. Otherwise the LP solver sometimes runs into problems.
	#limit is the limit under which contributions to alpha are set to zero (and alpha is lowered accordingly) to speed up computation.
	#limit.obj is the limit under which contributions to the objective function are considered too small to allow stable computation;
	#all such values are set to the value of limit.
	
	#koo.cur ... current coordinates and solutions of lower level tests in the closed test
	#points that are not contained in any one-step-lower level rejection region are set to "not allowed",
	#this enforces consonance
	#is used after pre-processing step 1, to make it faster
	
	#With objective "specified", an objective vector other than that in V (e.g. a different power) can be used.
	#We could also use true.power, but take care: if preproc.results is used, V will be supplied from there,
	#so V must match the optimization aim.
	if(objective=="specified") {
		if(is.null(obj.spec)) stop("Objective vector obj.spec is required.")
		V$specified<-obj.spec
	}

   if(is.null(preproc.results)) {
	preproc.results<-NA

	#check general argument "allowed"
 	if(is.null(allowed)) allowed<-rep(TRUE,dim(V)[1]) #"allowed" can be used to exclude points and enforce consonance
	if(length(allowed)!=dim(V)[1]) stop("Vector 'allowed' not of same length as data.")

	#Add information on the level of the subset test in the closed test
	if(!is.null(subset)) {
		dim.sub<-sum(subset)
		V$subset.level<-dim.sub
	}
	
	#Check consonance constraint due to solution of lower level tests (the locally optimal bottom up approach).
	#This overrides the argument "allowed".
	#if(!is.null(koo.cur) & !is.null(subset)) {
	#	anz.dim.total<-sum(grepl("X",names(koo.cur)))
	#	#these are the points in the relevant dimensions that are lower level solutions
	#	G<-koo.cur[dim.sub-koo.cur$subset.level==1 & koo.cur$solution==1 , (1:anz.dim.total)[subset]]
	#	allowed<-rep(FALSE,dim(V)[1])
	#	#check for each relevant dimension, which points have lower level solutions
	#	if(dim(G)[1]>0 & dim(V)[1]>0) {  #otherwise there are no lower level solution points or no points in V
	#		for(i in 1:dim(V)[1]) {
	#			for(j in 1:dim(G)[1]) {
	#				delta<-V[i,1:anz.dim] - G[j,]
	#				if( all(delta[!is.na(delta)] )==0 ) allowed[i]<-TRUE
	#			}
	#		}
	#	}
	#
	#	#for(i in 1:dim.sub) allowed<-allowed & (V[,i]%in%G[,i]  )  #all subdimension coordinates must match
	#}

	#Check if consonance due to elementary tests is demanded. This uses the p-values supplied with V.
	#This step does not override the argument koo.cur. It may be used in addition to koo.cur, because then X is smaller
	#when it comes to checking the consonance with koo.cur
	if(elementary.consonance==TRUE) {
		allowed<-rep(TRUE,dim(V)[1])
		#only works for EP>1, otherwise PVALS will be a vector, not a matrix
		PVALS<-V[,grepl("pval",names(V))]
		for(i in 1:dim(V)[1]) if(all(PVALS[i,]>alpha)) allowed[i]<-FALSE
	}
	
	V$allowed<-allowed
	#plot.muf.fn(F=A$F,region=vek2array.fn(V$allowed,d=A$d),gr=1.25,r=1,main="Optimal power cons",print.area="prob.area",gr.label=1,label.line=4,gr.ax=1.25)


	#1) remove lines with prob 0 and lines that are excluded because of some external constraint (e.g. to enforce consonance)
	keep<-V$alpha>0 & V$allowed==TRUE
	#all((V$alpha>0) == (V$alpha!=0)) #ok
	X<-V[keep,]
	
    #check, if X is empty. If so, skip remaining parts and report V. We need to check this twice, to avoid a conflict with the preproc.results jump
    if(dim(X)[1]>0) {

	#2) actual pre-processing, cond (i) and (ii)
	K<-dim(X)[1] #number of points (rows)
	#condition (i)
	for(i in 1:K) {
		set<-rep(TRUE,K)
		for(j in 1:anz.dim) set<-set&(X[,j]>=X[i,j]) #the coordinates have to be in the first columns of V
		alpha.min<-sum(X$alpha[set])
		if(alpha.min<=alpha) X$E[i]<-1 #eligible points, according to cond(i)
	}
	#only keep eligible
	X1<-X[X$E==1,]
	K1<-dim(X1)[1]
	

	#here is the better place for testing the consonance constraint, because this can be slow,
	#so it is better to have it after pre-processing step 1 (it is much faster this way)
	#Check consonance constraint due to solution of lower level tests (the locally optimal bottom up approach).
	#This is in addition to the argument "allowed".
	if(!is.null(koo.cur) & !is.null(subset)) {
		anz.dim.total<-sum(grepl("X",names(koo.cur)))
		#these are the points in the relevant dimensions that are lower level solutions
		G<-koo.cur[dim.sub-koo.cur$subset.level==1 & koo.cur$solution==1 , (1:anz.dim.total)[subset]]
		#allowed<-rep(FALSE,dim(V)[1])
		#check for each relevant dimension, which points have lower level solutions
		if(dim(G)[1]>0 & K1>0) {  #otherwise there are no lower level solution points or no points in X1
			allowed.X1<-rep(FALSE,K1)
			for(i in 1:K1) {
				for(j in 1:dim(G)[1]) {
					delta<-X1[i,1:anz.dim] - G[j,]
					if( all(delta[!is.na(delta)] )==0 ) allowed.X1[i]<-TRUE
				}
			}
		}
		X1<-X1[allowed.X1,]
		K1<-dim(X1)[1]
		#for(i in 1:dim.sub) allowed<-allowed & (V[,i]%in%G[,i]  )  #all subdimension coordinates must match
	}
	
	max.alpha<-sum(X1$alpha)

	#condition (ii)
	if(K1>0) {
		for(i in 1:K1) {
			set<-rep(TRUE,K1)
			for(j in 1:anz.dim) set<-set&(X1[,j]<=X1[i,j])
			max.level.ohne.i<-max.alpha-sum(X1$alpha[set]) #maximal attainable level without point i ("ohne" = without)
			max.level.mit.i<-max.level.ohne.i + X1$alpha[i] #maximal attainable level with point i ("mit" = with)
			if(max.level.mit.i<=alpha) X1$E[i]<-2 #points which are contained in every optimal rej. region
		}
		X2<-X1[X1$E==1,]
	} else {
		X2<-X1
	}
	alpha.frei<-alpha-sum(X1$alpha[X1$E==2]) #alpha remaining after accounting for the points fixed by cond(ii) ("frei" = free)

	
	#3) To speed up comp.: alpha very close to zero are treated as 0
	#In the null distribution, one can set to zero all probabilities that are very small (close to zero)
	#such that the sum of these probabilities is below some pre-specified threshold, e.g. 10^-4. These points are then removed from the search space and the optimization is done for a nominal level alpha minus the threshold value.
	#In the end all these "almost zero prob." points can be added for free, subject to the "no holes" constraint. 
	
	K2<-dim(X2)[1]
	anz.quasi.0<-0 #this is required later, even if limit==0
	if(limit>0 & alpha.frei>limit & K2>0) {
		rk<-rank(X2$alpha, ties.method = "first")
		cums<-cumsum(sort(X2$alpha))
		anz.quasi.0<-sum(cums<=limit)
		if(anz.quasi.0>0) {
			set.not.0<- rk>anz.quasi.0 #I.e, those with rank larger than the number of quasi 0 elements
			X3<-X2[set.not.0,]
			alpha.new<-alpha.frei-cums[anz.quasi.0]
		} else {
		set.not.0<-NA
		X3<-X2
		alpha.new<-alpha.frei
		}
	} else {
		set.not.0<-NA
		X3<-X2
		alpha.new<-alpha.frei
	}
	
	K3<-dim(X3)[1]

	#4) Make constraint matrix
	#Criterion (i) is alpha
	#criterion (ii) is no holes:

	#C.t<-C.ii.t.fn(ko=X3[,1:anz.dim]) 
	C.mat<-C.ii.fn(ko=X3[,1:anz.dim],keep.empty=keep.empty) 
	
	#save preproc results
	if(report.pp) preproc.results<-list(X=X,X1=X1,X2=X2,X3=X3,K=K,K3=K3,keep=keep,alpha.new=alpha.new,C.mat=C.mat,set.not.0=set.not.0,anz.quasi.0=anz.quasi.0)
    } else {  #brackets closed from first check if X is empty
	if(report.pp) preproc.results<-list(X=X,X1=NA,X2=NA,X3=NA,K=NA,K3=NA,keep=keep,alpha.new=NA,C.mat=NA,set.not.0=NA,anz.quasi.0=NA)
    }
	

   } else { #brackets closed from checking if the preprocessing results were supplied to the function
	#use preproc.results
	X<-preproc.results$X
	X1<-preproc.results$X1
	X2<-preproc.results$X2
	X3<-preproc.results$X3
	K<-preproc.results$K
	K3<-preproc.results$K3
	keep<-preproc.results$keep
	alpha.new<-preproc.results$alpha.new
	C.mat<-preproc.results$C.mat
	set.not.0<-preproc.results$set.not.0
	anz.quasi.0<-preproc.results$anz.quasi.0

   }

    opt.status<- -1 #-1 means the preprocessing did everything (this is possible); 0 means optimization successful, 1 means optimization not successful
	OPT<-NA #in case there is no optimization
    #check, if X is empty second time. If so, skip remaining parts and report V. We need to check this twice, to avoid a conflict with the preproc.results jump
    if(dim(X)[1]>0) {


	#5) Optimization with LP, Rglpk or branch and bound
	if(K3>0) {
		C.all<-rbind(X3$alpha,C.mat)
		#rhs
		rhs<-c(alpha.new,rep(0,dim(C.mat)[1]))
	
		X3$area<-1
		#this optimizes area, and if there is more than one solution picks that with best alpha exhaustion
		X3$area.and.alpha<- 1 + X3$alpha 
		if(objective=="specified" & !is.null(preproc.results)) X3$specified<-obj.spec[X3$index]
		objs<-X3[[objective]]
		if(limit.obj>0) objs[objs<=limit.obj]<-limit
		if(objective!="area" & objective!="area.and.alpha") objs<-objs*rescale.obj
			
	
		#Optimization
		#if(algorithm=="LP") OPT<-lp(direction="max",objective.in=objs,const.mat=C.all, const.dir="<=" ,const.rhs=rhs,transpose.constraints=TRUE,all.bin=TRUE,num.bin.solns=1)
		#if(algorithm=="Rglpk") OPT<-Rglpk_solve_LP(obj=objs, mat=C.all, dir=rep("<=",dim(C.all)[1]), rhs=rhs, types = "B", max = TRUE)
		if(algorithm=="BB") OPT<-bb.fn(X=X3,C.mat=C.mat,obj=objs,alpha.new=alpha.new,presolution=presolution,maxiter=maxiter)
		if(algorithm=="none") OPT<-list(solution=X3$solution,status=NA)
		X3$solution<-OPT$solution
		#opt.status<-OPT$status
		opt.status<-OPT$optimization.finished

	}

	#Write the solution to V and X
	added.by.lp<-X3$index[X3$solution==1]
	added.by.pp<-X1$index[X1$E==2] 
	total<-c(added.by.lp,added.by.pp) #they are all unique
	V$solution[total]<-1 #because the rownumbers "index" apply to the original table V
	X$solution<-V$solution[keep] #keep marks the points with alpha>0 which are "allowed"
	#alternatively we could use X$solution[X$index%in%total]<-1 #I don't know which is faster

	#Fill up quasi alpha 0 points (this has to be done, otherwise there can be holes)
	#I.e. all "larger" points have to be in the solution or they are quasi 0 points themselves
	if(limit>0 & anz.quasi.0>0) {
		V$quasi.0<-0
		V$quasi.0[X2$index[!set.not.0]]<-1
		X$quasi.0<-V$quasi.0[keep] #Work at level of X, because it has less rows than V
		for(i in 1:K) {
			set<-rep(TRUE,K)
			for(j in 1:anz.dim) set<-set&(X[,j]>=X[i,j])
			if(all(X$solution[set]==1 | X$quasi.0[set]==1)) X$solution[i]<-1
		}
		V$solution[X$index[X$solution==1]]<-1
	}

     } #bracket closed from parts that are only executed if X is not empty, second time
	#Output:
	RX<-X$solution==1
	level<-sum(X$alpha[RX])
	if(is.null(true.power)) power<-sum(X$power[RX]) else power<-sum(true.power[keep][RX])
	area<-sum(RX)
	R<-NA
	if(return.array==TRUE) R<-vek2array.fn(V$solution,...)
	if(report.pp==FALSE) preproc.results<-NA

	list(V=V,OPT=OPT,preproc.results=preproc.results,anz.dim=anz.dim,level=level,power=power,area=area,R=R,opt.status=opt.status,algorithm=algorithm)
}
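
#Illustrative sketch (not run), continuing the example below mult.fisher.2J.fn:
#find the level-maximizing rejection region with the branch and bound algorithm
#and compute the conditional p-value for an observed vector of test statistics,
#here (for illustration only) taken to be the mode of the null distribution.
if(FALSE) {
	A<-mult.fisher.2J.fn(tab=matrix(c(1,2,3,4, 4,3,2,1),ncol=2))
	R<-optimal.2J.fn(V=A$vek.out,objective="alpha",alpha=0.025,algorithm="BB")
	T.obs<-as.numeric(R$V[which.max(R$V$alpha),c("T1","T2")])
	p.val.fun(V=R$V,T=T.obs,anz.dim=2,alpha=0.025)$p.value
}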

dec2bin<-function(x,digits=8) {
	b<-rep(0,digits)
	for(k in 1:digits) {
		b[k]<-x%%2	
		x<-(x - b[k])/2
	}
	b
}
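
#Sketch (not run): binary digits, least significant bit first.
if(FALSE) {
	dec2bin(5,digits=4) #returns c(1,0,1,0), since 5 = 1*1 + 0*2 + 1*4
}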

#namen.fn<-function(anz.dim) {
#	#create binary numbers from 0 to 2^anz.dim - 1
#	x<-0:(2^anz.dim - 1)
#	x
#	M<-t(sapply(x,dec2bin,digits=anz.dim))
#	#sort, so that we get 0,0,0 ; 1,0,0 ; 0,1,0 ; ...
#	if(anz.dim>1) out<-M[order(rowSums(M)),] else out<-t(M)  #because in this case the t() in sapply is not required
#	out 
#}

namen.fn<-function(anz.dim) {
	#create binary numbers from 0 to 2^anz.dim - 1
	x<-0:(2^anz.dim - 1)
	M<-t(sapply(x,dec2bin,digits=anz.dim))
	if(anz.dim>1) out<-M else out<-t(M)  #because in this case the t() in sapply is not required, no sorting
	out 
}
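
#Sketch (not run): all failure/success patterns for two endpoints, one pattern
#per row, in the order 00, 10, 01, 11.
if(FALSE) {
	namen.fn(2) #rbind(c(0,0),c(1,0),c(0,1),c(1,1))
}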



subset.matrix.fn<-function(subset,anz.dim) {  #subset is a logical vector
	M<-namen.fn(anz.dim)
	n.sub.dim<-sum(subset)
	Msub<-namen.fn(n.sub.dim)
	nr<-2^n.sub.dim
	nc<-2^anz.dim
	K<-matrix(NA,nrow=nr,ncol=nc)
	for(i in 1:nr) {
		for(j in 1:nc) K[i,j]<-as.numeric(all(Msub[i,]==M[j,subset]))
	}
	K
}
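
#Sketch (not run): the projection matrix that collapses the 2^2 outcome
#combinations of two endpoints onto the margin of the first endpoint;
#subset.matrix.fn(subset,anz.dim) %*% tab then gives the marginal
#failure/success counts.
if(FALSE) {
	subset.matrix.fn(c(TRUE,FALSE),2)
	#rbind(c(1,0,1,0),c(0,1,0,1)): row 1 collects the combinations with
	#failure in endpoint 1, row 2 those with success in endpoint 1.
}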

bb.fn<-function(X,C.mat,obj,alpha.new,presolution=NULL,maxiter=10000) {
	C.1<-(C.mat!=0)

	#X is a data.frame such as the output V of mult.fisher.2J.fn,
	#possibly after pre-processing.
	#C.mat is the constraint matrix for cond(ii).
	#presolution is a feasible (probably non-optimal) solution that may be supplied.
	#It can be found by faster algorithms such as marg.fisher.opt.fn or pval.regions.fn.
	#obj is the vector of contributions to the objective function (obj%*%solution is the value of the objective function)

	#set length of the solution vector
	d.x<-dim(X)[1]
	#check presolution and set current best lower bound using the presolution
	if(!is.null(presolution)) {
		max.low<-sum(obj[presolution==1])
		if(sum(X$alpha[presolution==1])>alpha.new | !all(C.mat%*%presolution <=0)) {
			#all(t(presolution)%*%C.t <=0)
			warning("Presolution not feasible.")
			presolution<-NULL
		}
	}
	if(is.null(presolution)) {
		presolution<-rep(0,d.x)
		max.low<-0
	}
	
	#the function to add a node:
	index<-1:d.x
	add.fn<-function(n,z) {
		x<-n[-c(1:3)]
		#find first not yet defined entry
		i<-min(index[x == -1])
		#set new entry and all others according to cond(ii) and return low, up, branched out, x, if cond(i) is met
		if(z==0) {
			x[C.1[,i]]<-0
			return(c(sum(obj[x==1]),sum(obj[x!=0]),all(x>=0),x))
		}
		if(z==1) {
			x[C.1[i,]]<-1
			#check cond(i), this is only required for z=1.
			if(sum(X$alpha[x==1])>alpha.new) return(NULL) else return(c(sum(obj[x==1]),sum(obj[x!=0]),all(x>=0),x))
		}
	}	

	#prepare matrix of nodes
	#the columns are: lower bound, upper bound, fully branched, and then the solution vector
	#for a branched out solution, low=up.
	#for a solution 1 means included, 0 means not included and -1 means not yet determined

	N<-matrix(
		c(max.low,max.low,1,presolution,
		0,0,0,rep(-1,d.x)),byrow=TRUE,nrow=2)
	
	nc<-dim(N)[2]		
	active<-N[,3]==0
	N.act<-matrix(N[active,],ncol=nc)
	N.finished<-matrix(N[!active,],ncol=nc)

	iter<-0L
	#maxiter<-807
	while(dim(N.act)[1]>0 & iter<maxiter) {
		iter<-iter+1L
		#of all not fully branched nodes, select the one with the largest lower bound.
		#As an alternative we might alternatingly also select the one with the largest upper bound,
		#so that nodes get removed before everything has branched out.
		#In the example, this takes the same number of iterations, but is slower, probably because N can get larger.
		#index 1 or 2:
		#indi<-1+iter%%2
		indi<-1
		j<-which(N.act[,indi]==max(N.act[,indi]))[1]
		n<-N.act[j,]
		#add new nodes and remove the one just branched (j); rbind(NULL,n) gives a matrix, so N always stays a matrix
		N<-rbind(N.finished,N.act[-j,],add.fn(n,0),add.fn(n,1))
		#update current best lower bound
		max.low<-max(N[,1])
		#remove all nodes with up<max.low; N can become a 0 x (3+d.x) matrix, but it will not disappear
		N<-matrix(N[N[,2]>=max.low,],ncol=nc)
		#select active nodes
		active<-N[,3]==0
		N.act<-matrix(N[active,],ncol=nc)
		N.finished<-matrix(N[!active,],ncol=nc)


	}
	optimization.finished<-dim(N.act)[1]==0
	if(optimization.finished) {
		solution<-N[1,-c(1:3)]
	} else {
		#else we still provide a feasible solution, namely that with the largest lower bound
		j<-which(N[,1]==max(N[,1]))[1]
		solution<-N[j,-c(1:3)]
		solution[solution==-1]<-0 #this is always feasible because of the filling up in add.fn
	}
	list(solution=solution,obj.value=sum(obj[solution==1]),nodes=N,n.solutions=dim(N)[1],iterations=iter,maxiter=maxiter,optimization.finished=optimization.finished,status=as.numeric(!optimization.finished))
}
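
#Toy sketch (not run, with made-up probabilities): maximize the number of
#points in a monotone ("no holes") subset of {0,1,2,3} subject to a
#probability budget of 0.25. The only feasible non-empty upper set is {3}
#with probability 0.1, so the solution is c(0,0,0,1).
if(FALSE) {
	X<-data.frame(X1=0:3,alpha=c(0.4,0.3,0.2,0.1))
	C.mat<-C.ii.fn(ko=matrix(0:3,ncol=1),keep.empty=TRUE)
	bb.fn(X=X,C.mat=C.mat,obj=rep(1,4),alpha.new=0.25)$solution
}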

check.cond.ii.fn<-function(ind,VV,anz.dim) {
	#in each direction, the neighboring point needs to be in the solution, or be of prob zero	
	set<-rep(TRUE,dim(VV)[1])
	for(i in 1:anz.dim) set<-set&(VV[,i]>=VV[VV$index==ind,i])
	cond.ii<-sum(VV$solution.temp[set]) == (sum(set)-1)  #-1, because the point itself is in the set, but not yet in the solution
	#cond.i<- (sum(V1.s$alpha[V1.s$solution==1])+V1.s$alpha[V1.s$index==ind])  <= alpha
	#cond.i & cond.ii
	cond.ii
}

check.cond.ii.remove.fn<-function(ind,VV,anz.dim) {
	#in each direction, the neighboring point needs to be in the solution, or be of prob zero	
	set<-rep(TRUE,dim(VV)[1])
	for(i in 1:anz.dim) set<-set&(VV[,i]<=VV[VV$index==ind,i])
	cond.ii<-sum(VV$solution.temp[set]) == 1  #1, because the point itself is in the set
	#cond.i<- (sum(V1.s$alpha[V1.s$solution==1])+V1.s$alpha[V1.s$index==ind])  <= alpha
	#cond.i & cond.ii
	cond.ii
}

p.val.fun<-function(V,T,anz.dim,all.p.values=FALSE,consonant=FALSE,alpha) {
	V$p.val<-NA
	if(consonant==TRUE) {
		V$allowed<-rep(TRUE,dim(V)[1])
		#only works for EP>1, otherwise PVALS will be a vector, not a matrix
		PVALS<-V[,grepl("pval",names(V))]
		for(i in 1:dim(V)[1]) if(all(PVALS[i,]>alpha)) V$allowed[i]<-FALSE
	}
	
	#V should contain the vector solution. If not, the p-value calculation just gives the greedy test.
	#remove points with prob zero
	V1<-V[V$alpha>0,]

	

	indT<-V$index[colSums(t(V[,(anz.dim+1):(2*anz.dim)])==T)==anz.dim]
	
	#start with points that are not in the rejection region
	ZZ0<- sum(1-V1$solution)
	if(V[indT,"solution"]==0 | (all.p.values==TRUE & ZZ0>0) ) {
		#sort 
		if(consonant==FALSE) {
			V.s<-V1[order(V1$alpha),]
		} else {
			V.s<-V1[order(!V1$allowed,V1$alpha),] #sort by allowed first, to account for the consonance constraint
		}
		p.curr<-sum(V.s$alpha[V.s$solution==1])
		V.s$solution.temp<-V.s$solution
		indices<-V.s$index[V.s$solution==0]
		outer.stop<-FALSE
		zz<-0
		while(!outer.stop) {
			zz<-zz+1
			Z<-length(indices)
			stop<-FALSE
			z<-0
			#ind<-min(ind.s[V.s$solution.temp==0]) #sum(V.s$solution)+1
			while(!stop) {
				z<-z+1
				#V.s[ind,]
				if(check.cond.ii.fn(indices[z],V.s,anz.dim)==TRUE) {
					p.curr<-p.curr+V$alpha[indices[z]]
					V$p.val[indices[z]]<-p.curr
					V.s$solution.temp[V.s$index==indices[z]]<-1
					if(indices[z]==indT & all.p.values==FALSE) outer.stop<-TRUE
					indices<-indices[-z]
					stop<-TRUE
				}
				#should not be necessary:
				if(z==Z) stop<-TRUE
			}
		if(zz==ZZ0) outer.stop<-TRUE
		}
	}

	#now regard points that are in the rejection region
	ZZ1<-sum(V1$solution)
	if(V[indT,"solution"]==1 | (all.p.values==TRUE & ZZ1>0)) {
	#if(number.in.solution>0) {
		V.s<-V1[order(V1$alpha,decreasing=TRUE),]
		#V.s$counter<-NA
		p.curr<-sum(V.s$alpha[V.s$solution==1])

		V.s$solution.temp<-V.s$solution

		indices<-V.s$index[V.s$solution==1]
		#for(zz in 1:(sum(V.s$solution))) {
		outer.stop<-FALSE
		zz<-0
		
		while(!outer.stop) {
			zz<-zz+1
			Z<-length(indices)
			stop<-FALSE
			z<-0
			while(!stop) {
				z<-z+1
				#V.s[ind,]
				if(check.cond.ii.remove.fn(indices[z],V.s,anz.dim)==TRUE) {
					V$p.val[indices[z]]<-p.curr
					p.curr<-p.curr-V$alpha[indices[z]]
					V.s$solution.temp[V.s$index==indices[z]]<-0
					#V.s$counter[V.s$index==indices[z]]<-zz
					if(indices[z]==indT & all.p.values==FALSE) outer.stop<-TRUE
					indices<-indices[-z]
					stop<-TRUE
				}
				#should not be necessary:
				if(z==Z) stop<-TRUE
			}
		if(zz==ZZ1) outer.stop<-TRUE
		}
	}
	list(V=V,indT=indT,p.value=V$p.val[indT])
}

dhyper.fn<-function(n1.,n.1,n..) {
	n2.<-n..-n1.
	n.2<-n..-n.1
	start<-max(0,n.1-n2.)
	ende<-min(n1.,n.1)
	prob<-dhyper(x=start:ende,m=n1.,n=n..-n1.,k=n.1)	
	out<-data.frame(t=start:ende,prob=prob)
	out
}
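
#Sketch (not run): the support and probabilities of the hypergeometric
#distribution of the treatment-group successes in a 2x2 table with 8
#successes overall (n1.), 10 subjects in the treatment group (n.1) and
#20 subjects in total (n..).
if(FALSE) {
	dhyper.fn(n1.=8,n.1=10,n..=20)
}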

bonferroni.greedy.fn<-function(x,subset=1:length(x)) {
	anz.dim<-length(subset)
	tabList<-vector(mode="list",length=anz.dim)
	T<-OR<-rep(NA,anz.dim)
	zeile<-rep(NA,anz.dim) #current row index into the support table of each endpoint ("zeile" = row)
	
	for(i in 1:anz.dim) {
		subtab<-x[[subset[i]]][c(2,1),c(2,1)]
		T[i]<-subtab[1,1]
		OR[i]<-subtab[1,1]/subtab[2,1]*subtab[2,2]/subtab[1,2]
		tabList[[i]]<-dhyper.fn(n1.=sum(subtab[1,]),n.1=sum(subtab[,1]),n..=sum(subtab))
		zeile[i]<-dim(tabList[[i]])[1]
	}
	p<-0
	stop<-FALSE
	while(!stop) {
		eligible<-rep(NA,anz.dim)
		for(i in 1:anz.dim) eligible[i]<-tabList[[i]][zeile[i],2]
		j<-which(eligible==min(eligible))[1]
		p<-p+tabList[[j]][zeile[j],2]
		if(tabList[[j]][zeile[j],1]==T[j]) stop<-TRUE else zeile[j]<-zeile[j]-1
	}
	list(stat=T,OR=OR,p=p)
}
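
#Sketch (not run, with made-up counts): Bonferroni greedy test for two
#endpoints. Each marginal 2x2 table has failures in row 1, successes in
#row 2, the reference group in column 1 and the treatment group in column 2.
if(FALSE) {
	tabEP1<-matrix(c(5,5, 2,8),nrow=2) #5 failures/5 successes reference, 2/8 treatment
	tabEP2<-matrix(c(6,4, 3,7),nrow=2)
	bonferroni.greedy.fn(list(tabEP1,tabEP2)) #returns list(stat,OR,p)
}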
