R/ds.table.R

Defines functions ds.table

Documented in ds.table

#' 
#' @title Generates 1-, 2-, and 3-dimensional contingency tables with option
#' of assigning to serverside only and producing chi-squared statistics
#' @description Creates 1-dimensional, 2-dimensional and 3-dimensional
#' tables using the \code{table} function in native R.
#' @details The \code{ds.table} function selects numeric, integer or factor
#' variables on the serverside which define a contingency table with up to
#' three dimensions. The native R \code{table} function basically operates on
#' factors and if variables are specified that are integers or numerics
#' they are first coerced to factors. If the 1-dimensional, 2-dimensional or
#' 3-dimensional table generated from a given study satisfies appropriate
#' disclosure-control criteria it can be returned directly to
#' the clientside where it is presented as a study-specific
#' table and is also included in a combined table across all studies.
#' 
#' The data custodian responsible for data security in a
#' given study can specify the minimum non-zero cell count that
#' determines whether the disclosure-control criterion can be viewed
#' as having been met. If the count in any one cell in a table falls
#' below the specified threshold (and is also non-zero)
#' the whole table is blocked and cannot be returned
#' to the clientside. However, even if a table is potentially
#' disclosive it can still be written to the serverside while
#' an empty representation of the structure of the table is returned
#' to the clientside. The contents of the cells in the serverside table
#' object are reflected in a vector of counts which is one component of
#' that table object.
#' 
#' The true counts in the studyside vector
#' are replaced by a sequential set of cell-IDs running from 1:n
#' (where n is the total number of cells in the table) in the empty
#' representation of the structure of the potentially disclosive table 
#' that is returned to the clientside. These cell-IDs reflect
#' the order of the counts in the true counts vector on the serverside.
#' In consequence, if the number 13 appears in a cell of the empty
#' table returned to the clientside, it means that the true count
#' in that same cell is held as the 13th element of the true count
#' vector saved on the serverside. This means that a data analyst
#' can still make use of the counts from a call to the {ds.table}
#' function to drive their ongoing analysis even when one or
#' more non-zero cell counts fall below the specified threshold
#' for potential disclosure risk. 
#' 
#' Because the table object on the
#' serverside cannot be visualised or transferred to the clientside,
#' DataSHIELD ensures that although it can, in this way, be used
#' to advance analysis, it does not create a direct risk of disclosure.
#' 
#' The <rvar> argument identifies the variable defining the rows
#' in each of the 2-dimensional tables produced in the output.  
#' 
#' The <cvar>
#' argument identifies the variable defining the columns in the 2-dimensional
#' tables produced in the output. 
#' 
#' In creating a 3-dimensional table the
#' <stvar> ('separate tables') argument identifies the variable that
#' indexes the set of two dimensional tables in the output {ds.table}.
#' 
#' As a minor technicality, it should be noted that
#' if a 1-dimensional table is required, one only need specify a value
#' for the <rvar> argument and any one dimensional table in the output
#' is presented as a row vectors and so technically the <rvar> variable
#' defines the columns in that 1 x n vector. However, the ds.table
#' function deals with 1-dimensional tables differently to 2 and 3
#' dimensional tables and key components  of the output
#' for one dimensional tables are actually two dimensional: with
#' rows defined by <rvar> and with one column for each of the studies.
#' 
#' The output list generated by {ds.table} contains tables based on counts
#' named "table.name_counts" and other tables reporting corresponding
#' column proportions ("table.name_col.props") or row proportions
#' ("table.name_row.props"). In one dimensional tables in the output the
#' output tables include _counts and _proportions. The latter are not
#' called _col.props or _row.props because, for the reasons noted
#' above, they are technically column proportions but are based on the
#' distribution of the <rvar> variable.
#' 
#' If the <report.chisq.tests> argument is set to TRUE, chisq tests
#' are applied to every 2-dimensional table in the output and reported
#' as "chisq.test_table.name". The <report.chisq.tests> argument
#' defaults to FALSE. 
#' 
#' If there is at least one expected cell
#' counts < 5 in an output table, the native R <chisq.test> function
#' returns a warning. Because in a DataSHIELD setting this often means that
#' every study and several tables may return the same warning
#' and because it is debatable whether this warning is really
#' statistically important, the <suppress.chisq.warnings> argument
#' can be set to TRUE to block the warnings. However, it is defaulted to FALSE.
#' @param rvar is a character string (in inverted commas) specifying the
#' name of the variable defining the rows in all of the 2 dimensional
#' tables that form the output. Please see 'details' above for more
#' information about one-dimensional tables when a variable name is provided
#' by <rvar> but <cvar> and <stvar> are both NULL
#' @param cvar is a character string specifying the
#' name of the variable defining the columns in all of the 2 dimensional
#' tables that form the output.
#' @param stvar is a character string specifying the
#' name of the variable that indexes the separate two dimensional
#' tables in the output if the call specifies a 3 dimensional table.
#' @param report.chisq.tests if TRUE, chi-squared tests
#' are applied to every 2 dimensional table in the output and reported
#' as "chisq.test_table.name". Default = FALSE.
#' @param exclude this argument is passed through to the {table} function in
#' native R which is called by {tableDS}. The help for {table} in native R
#' indicates that 'exclude' specifies any levels that should be deleted for
#' all factors in rvar, cvar or stvar. If the <exclude> argument
#' does not include NA and if the <useNA> argument is not specified,
#' it implies <useNA> = "always" in DataSHIELD. If you read the help for {table} in native R
#' including the 'details' and the 'examples' (particularly 'd.patho') you
#' will see that the response of {table} to different combinations of the
#' <exclude> and <useNA> arguments can be non-intuitive. This is particularly
#' so if there is more than one type of missing (e.g. missing by observation
#' as well as missing because of an NaN response to a mathematical
#' function - such as log(-3.0)). In DataSHIELD, if you are in one
#' of these complex settings (which should not be very common) and
#' you cannot interpret the output that has been approached
#' you might try: (1) making sure that the variable producing the strange results
#' is of class factor rather than integer or numeric - although integers and
#' numerics are coerced to factors by {ds.table} they can occasionally behave less
#' well when the NA setting is complex; (2) specify both an <exclude> argument
#' e.g. exclude = c("NaN","3") and a <useNA> argument e.g. useNA= "no";
#' (3) if you are excluding multiple levels e.g exclude = c("NA","3")
#' then you can reduce this to one e.g. exclude = c("NA") and then remove
#' the 3s by deleting rows of data, or converting the 3s to a different value.
#' @param useNA this argument is passed through to the {table} function in
#' native R which is called by {tableDS}. In DataSHIELD, this argument can take 
#' two values: "no" or "always" which indicate whether to include NA values in the table.
#' For further information, please see the help for the <exclude> argument (above)
#' and/or the help for the {table} function in native R. Default value is set to "always".
#' @param suppress.chisq.warnings if set to TRUE, the default warnings are
#' suppressed that would otherwise be produced by the {table} function in
#' native R whenever an expected cell count in one or more cells is less than 5.
#' Default is FALSE. Further details can be found under 'details' and the
#' help provided for the <report.chisq.tests> argument (above).
#' @param table.assign is a Boolean argument set by default to FALSE. If it is
#' FALSE the {ds.table} function acts as a standard aggregate function -
#' it returns the table that is specified in its call to the clientside
#' where it can be visualised and worked with by the analyst. But if
#' <table.assign> is TRUE, the same table object is also written to
#' the serverside. As explained under 'details' (above), this may be
#' useful when some elements of a table need to be used to drive forward the
#' overall analysis (e.g. to help select individuals for an analysis
#' sub-sample), but the required table cannot be visualised or returned
#' to the clientside because it fails disclosure rules.  
#' @param newobj this a character string providing a name for the output
#' table object to be written to the serverside if <table.assign> is TRUE.
#' If no explicit name for the table object is specified, but <table.assign>
#' is nevertheless TRUE, the name for the serverside table object defaults
#' to \code{table.newobj}.
#' @param datasources a list of \code{\link{DSConnection-class}} objects obtained after login. If the <datasources>
#' the default set of connections will be used: see \link{datashield.connections_default}.
#' If the <datasources> is to be specified, it should be set without
#' inverted commas: e.g. datasources=connections.em or datasources=default.connections. If you wish to
#' apply the function solely to e.g. the second connection server in a set of three,
#' the argument can be specified as: e.g. datasources=connections.em[2].
#' If you wish to specify the first and third connection servers in a set you specify:
#' e.g. datasources=connections.em[c(1,3)].
#' @param force.nfilter if <force.nfilter> is non-NULL it must be specified as
#' a positive integer represented as a character string: e.g. "173". This
#' the has the effect of the standard value of 'nfilter.tab' (often 1, 3, 5 or 10
#' depending what value the data custodian has selected for this particular
#' data set), to this new value (here, 173). CRUCIALLY, the {ds.table} function
#' only allows the standard value to be INCREASED. So if the standard value has
#' been set as 5 (as one of the R options set in the serverside connection), "6" and
#' "4981" would be allowable values for the <force.nfilter> argument but "4" or
#' "1" would not. The purpose of this argument is for the user or developer
#' to force the table to fail the disclosure control tests so the he/she can
#' see what then happens and check that it is behaving as anticipated/hoped. 
#' @return Having created the requested table based on serverside data
#' it is returned to the clientside for the analyst to visualise (unless
#' it is blocked because it fails the disclosure control criteria or
#' there is an error for some other reason). 
#' 
#' The clientside output from
#' {ds.table} includes error messages that identify when the creation of a
#' table from a particular study has failed and why. If table.assign=TRUE,
#' {ds.table} also writes the requested table as an object named by
#' the <newobj> argument or set to 'newObj' by default. 
#' 
#' Further information
#' about the visible material passed to the clientside, and the optional
#' table object written to the serverside can be seen under 'details' (above).
#' @author Paul Burton and Alex Westerberg for DataSHIELD Development Team, 01/05/2020
#' @export
#' 
ds.table <- function(rvar=NULL, cvar=NULL, stvar=NULL, report.chisq.tests=FALSE,
					          exclude=NULL,	useNA ="always", suppress.chisq.warnings=FALSE,
				          	table.assign=FALSE,	newobj=NULL, datasources=NULL, 
				          	force.nfilter=NULL){
  
  # if no connection login details are provided look for 'connection' objects in the environment
  if(is.null(datasources)){
    datasources <- datashield.connections_find()
  }

  # ensure datasources is a list of DSConnection-class
  if(!(is.list(datasources) && all(unlist(lapply(datasources, function(d) {methods::is(d,"DSConnection")}))))){
    stop("The 'datasources' were expected to be a list of DSConnection-class objects", call.=FALSE)
  }

  # check if a value has been provided for rvar
  if(is.null(rvar)){
    return("Error: rvar must have a value which is a character string naming the row variable for the table")
  }
  
  # check if the input object is defined in all the studies
  isDefined(datasources, rvar)

  if(!is.null(cvar)&&!is.character(cvar)){
     return("Error: if cvar is not null, it must have a value which is a character string naming the column variable for the table")
  }
  
  if(!is.null(cvar)){
    isDefined(datasources, cvar)
  }

  if(!is.null(stvar)&&!is.character(stvar)){
     return("Error: if stvar is not null, it must have a value which is a character string naming the variable coding separate tables for the table")
  }
  
  if(!is.null(stvar)){
    isDefined(datasources, stvar)
  }

  if(useNA!="no" && useNA!="always"){
     stop("useNA must be either 'no' or 'always'.")
	 }

  if(!is.null(force.nfilter)&&!is.character(force.nfilter)){
     return("Error: if force.nfilter is not null, it must have a value which is a character string specifying an integer for the forced value of the nfilter")
  }


 #All arguments should be directly transmittable
 rvar.transmit<-rvar
 
 if(is.null(cvar))
 {
 cvar.transmit<-NULL
 }
 else
 {
 cvar.transmit<-cvar
 }

 if(is.null(stvar))
 {
 stvar.transmit<-NULL
 }
 else
 {
 stvar.transmit<-stvar
 }
 
 
 if(is.null(exclude))
 {
 exclude.transmit<-NULL
 }
 else
 {
  exclude.transmit<-paste0(as.character(exclude),collapse=",")
 }
 
 useNA.transmit<-useNA
 
 if(is.null(force.nfilter))
 {
 force.nfilter.transmit<-NULL
 }
 else
 {
 force.nfilter.transmit<-force.nfilter
 }
 #CALL THE asFactorDS1 SERVER SIDE FUNCTION (AN AGGREGATE FUNCTION)
 # FOR rvar, cvar AND stvar  
 #TO DETERMINE ALL OF THE LEVELS REQUIRED
 
 rvar.asfactor.calltext <- call("asFactorDS1", rvar)
 rvar.all.levels <- DSI::datashield.aggregate(datasources, rvar.asfactor.calltext)
 
 numstudies <- length(datasources)
 
 rvar.all.levels.all.studies <- NULL
 
 for(j in 1:numstudies){
   rvar.all.levels.all.studies <- c(rvar.all.levels.all.studies,rvar.all.levels[[j]])
 }

 if (length(rvar.all.levels.all.studies) == 0) {
     stop(paste0("Unable to obtain factors for rvar: '", rvar , "'"), call. = FALSE)
 }
 
 rvar.all.unique.levels <- as.character(unique(rvar.all.levels.all.studies))
 
 rvar.all.unique.levels.transmit <- paste0(rvar.all.unique.levels, collapse=",")
 
 ########################################################
 
 if(!is.null(cvar)){
   cvar.asfactor.calltext <- call("asFactorDS1", cvar)
   cvar.all.levels <- DSI::datashield.aggregate(datasources, cvar.asfactor.calltext)
   
   numstudies <- length(datasources)
   
   cvar.all.levels.all.studies <- NULL
   
   for(j in 1:numstudies){
     cvar.all.levels.all.studies <- c(cvar.all.levels.all.studies,cvar.all.levels[[j]])
   }
   
   if (length(cvar.all.levels.all.studies) == 0) {
     stop(paste0("Unable to obtain factors for cvar: '", cvar , "'"), call. = FALSE)
   }

   cvar.all.unique.levels <- as.character(unique(cvar.all.levels.all.studies))
   
   cvar.all.unique.levels.transmit <- paste0(cvar.all.unique.levels, collapse=",")
 }else{
   cvar.all.unique.levels.transmit<-NULL
 }
 ########################################################
 
 if(!is.null(stvar)){
   stvar.asfactor.calltext <- call("asFactorDS1", stvar)
   stvar.all.levels <- DSI::datashield.aggregate(datasources, stvar.asfactor.calltext)
   
   numstudies <- length(datasources)
   
   stvar.all.levels.all.studies <- NULL
   
   for(j in 1:numstudies){
     stvar.all.levels.all.studies <- c(stvar.all.levels.all.studies,stvar.all.levels[[j]])
   }
   
   if (length(stvar.all.levels.all.studies) == 0) {
     stop(paste0("Unable to obtain factors for stvar: '", stvar , "'"), call. = FALSE)
   }

   stvar.all.unique.levels <- as.character(unique(stvar.all.levels.all.studies))
   
   stvar.all.unique.levels.transmit <- paste0(stvar.all.unique.levels, collapse=",")
 }else{
   stvar.all.unique.levels.transmit<-NULL
 }

#ASSIGN TABLE TO SERVERSIDE IF REQUIRED
if(table.assign)
	{

	if(is.null(newobj))
		{
		newobj<-"table.newobj"
		}
		

# CALL THE MAIN SERVER SIDE ASSIGN FUNCTION
 
  calltext.assign <- call("tableDS.assign", rvar.transmit=rvar.transmit, cvar.transmit=cvar.transmit,
                    stvar.transmit=stvar.transmit, rvar.all.unique.levels.transmit=rvar.all.unique.levels.transmit,
                    cvar.all.unique.levels.transmit=cvar.all.unique.levels.transmit,
                    stvar.all.unique.levels.transmit=stvar.all.unique.levels.transmit,
                    exclude.transmit=exclude.transmit, useNA.transmit=useNA.transmit
					)
	
	
DSI::datashield.assign(datasources, newobj, calltext.assign)
	}
  
# CALL THE MAIN SERVER SIDE AGGREGATE FUNCTION

  calltext <- call("tableDS", rvar.transmit=rvar.transmit, cvar.transmit=cvar.transmit,
                    stvar.transmit=stvar.transmit,rvar.all.unique.levels.transmit=rvar.all.unique.levels.transmit,
                   cvar.all.unique.levels.transmit=cvar.all.unique.levels.transmit,
                   stvar.all.unique.levels.transmit=stvar.all.unique.levels.transmit,
                    exclude.transmit=exclude.transmit, useNA.transmit=useNA.transmit,
					force.nfilter.transmit=force.nfilter.transmit)

 
 table.out<-DSI::datashield.aggregate(datasources, calltext)


#END OF MAIN FUNCTION LEADING UP TO CALL
##############################################################################
###############################################################################

#Take serverside output and set up arrays with the correct dimensions
#so all values of a given dimension in any study are included in the
#tables produced for each individual study and for the combined values
#across studies 
 
 #How many studies have returned tables
numsources.orig<-length(table.out)
table.out.orig<-table.out

#Check whether return is a table or an error message
valid.output<-rep(1,numsources.orig)
error.messages<-table.out
sum.valid<-0
study.names.valid<-NULL
list.temp<-NULL


for(ns in 1:numsources.orig)
{
    if("character" %in% class(table.out[[ns]]))
	{
	    valid.output[ns]<-0
	    error.messages[[ns]]<-table.out[[ns]]
	}	
	else
	{
	    error.messages[[ns]]<-"No errors reported from this study"
	}
}

num.valid.studies<-sum(valid.output)


if(num.valid.studies==0)
{
	if ((! table.assign) || report.chisq.tests)
	{
	    validity.message<-"All studies failed for reasons identified below"
	    cat("\n",validity.message,"\n\n")
	    for(ns in 1:numsources.orig)
	    {
	        cat("\nStudy",ns,": ",error.messages[[ns]],"\n")
	    }

	    return(list(validity.message=validity.message,error.messages=error.messages))
	}
	else
	{
	    return(NULL)
	}
}


for(ns in 1:numsources.orig)
{
  if(valid.output[ns])
  {
    sum.valid<-sum.valid+1
    list.temp<-paste0(list.temp,"table.out.orig[[",ns,"]]")
    if(sum.valid < num.valid.studies)
    {
        list.temp<-paste0(list.temp,",")
    }
    study.names.valid<-c(study.names.valid,names(datasources)[ns])
  }
}


table.out.valid <- FALSE
list.text<-paste0("table.out.valid<-list(",list.temp,")")


eval(parse(text=list.text))


table.out<-table.out.valid
numsources<-length(table.out)



if(num.valid.studies>0&&num.valid.studies<numsources.orig)
{
validity.message<-"At least one study failed for reasons identified by 'error.messages':"

for(ns in 1:numsources.orig)
	{
	message.add<-paste0("Study",ns,": ",error.messages[[ns]])
	validity.message<-c(validity.message,message.add)
	}

#	cat("\n",validity.message,"\n")
#	for(ns in 1:numsources.orig)
#		{
#		cat("\nStudy",ns,": ",error.messages[[ns]])
#		}
#		cat("\n\n")

#table.out<-table.out.valid
#numsources<-length(table.out)
}

if(num.valid.studies==numsources.orig)
{
    validity.message<-"Data in all studies were valid"
    if (! table.assign)
    {
	    cat("\n",validity.message,"\n")
	    for(ns in 1:numsources.orig)
		{
		    cat("\nStudy",ns,": ",error.messages[[ns]])
		}
		cat("\n\n")
    }
}

#check all tables from all sources have the same number of dimensions

table.dimensions<-rep(0,numsources)
for(ns in 1:numsources)
  {
  table.dimensions[ns]<-length(dim(table.out[[ns]]))
  }

#table.dimensions

all.dims.same<-TRUE
if(numsources>1)
{
  for(ns in 1:(numsources-1))
    {
      if(table.dimensions[ns]!=table.dimensions[ns+1])
      {
      all.dims.same<-FALSE
      return.message<-"Warning: tables in different sources have different numbers of dimensions. Please analyse and combine yourself from study.specific tables above" 
      print(return.message)
      return(return.message)   
      }
    }

  if(all.dims.same) 
  {
    num.table.dims<-table.dimensions[1]
  }
  else
  {
    num.table.dims<-NA
  }
}


if(numsources==1)
{
  num.table.dims<-table.dimensions[1]
  
}

#num.table.dims
 
######################################################################
#Work first with three dimensional tables
if(num.table.dims==3)
{
 
 #identify all possible values of each dimension
  
  rvar.dimnames<-NULL
  cvar.dimnames<-NULL
  stvar.dimnames<-NULL
  
  for(ns in 1:numsources)
      {
      rvar.dimnames<-c(rvar.dimnames,dimnames(table.out[[ns]])$rvar)
      rvar.dimnames[is.na(rvar.dimnames)]<-"NA"
      cvar.dimnames<-c(cvar.dimnames,dimnames(table.out[[ns]])$cvar)
      cvar.dimnames[is.na(cvar.dimnames)]<-"NA"
      stvar.dimnames<-c(stvar.dimnames,dimnames(table.out[[ns]])$stvar)
      stvar.dimnames[is.na(stvar.dimnames)]<-"NA"
      }

rvar.dimnames.unique<-unique(rvar.dimnames)
cvar.dimnames.unique<-unique(cvar.dimnames)
stvar.dimnames.unique<-unique(stvar.dimnames)

#print(rvar.dimnames)

#print(rvar.dimnames.unique)
#print(cvar.dimnames.unique)
#print(stvar.dimnames.unique)

numcells.all.sources<-length(rvar.dimnames.unique)*length(cvar.dimnames.unique)*length(stvar.dimnames.unique)
#numcells.all.sources

empty.table.all.sources.col.1<-rep(rvar.dimnames.unique,times=(length(cvar.dimnames.unique)*length(stvar.dimnames.unique)))
empty.table.all.sources.col.1

empty.table.all.sources.col.2<-rep(rep(cvar.dimnames.unique,each=length(rvar.dimnames.unique)),times=length(stvar.dimnames.unique))
empty.table.all.sources.col.2

empty.table.all.sources.col.3<-rep(rep(stvar.dimnames.unique,each=(length(rvar.dimnames.unique)*length(cvar.dimnames.unique))))
empty.table.all.sources.col.3

empty.table.all.sources.col.4<-rep(0,times=(length(rvar.dimnames.unique)*length(cvar.dimnames.unique)*length(stvar.dimnames.unique)))
empty.table.all.sources.col.4

empty.table.all.sources<-cbind(empty.table.all.sources.col.1,empty.table.all.sources.col.2,
                               empty.table.all.sources.col.3,empty.table.all.sources.col.4)

#dimnames(empty.table.all.sources)<-list(NULL,NULL)


empty.table.all.sources[is.na(empty.table.all.sources)]<-"NA"

#print(empty.table.all.sources) #1 table length 36 empty values

 

dim.vector.all.sources<-c(length(rvar.dimnames.unique),length(cvar.dimnames.unique),
                          length(stvar.dimnames.unique),numsources)

array.all.sources<-array(data=NA,dim=dim.vector.all.sources,
                         dimnames=list(rvar.dimnames.unique,cvar.dimnames.unique,stvar.dimnames.unique,NULL))

names(dimnames(array.all.sources))<-c(rvar,cvar,stvar,"study")
#print(array.all.sources) #length 108 all NAs so the study specific template for dimnames
						 #has been correctly expanded to
                         #include all studies

#KEY LOOP
for(ns in 1:numsources)
{
#start with study 1 then 2 etc etc



numcells<-length(table.out[[ns]])
#print(numcells)
study.specific.dim.vect<-dim(table.out[[ns]])
study.specific.dim.vect

count.in.cell<-rep(NA,numcells)
stvar.mark<-rep("",numcells)
cvar.mark<-rep("",numcells)
rvar.mark<-rep("",numcells)
cells.so.far<-0

  for(ss in 1:study.specific.dim.vect[3])
  {
      for(cc in 1:study.specific.dim.vect[2])
      {
          for(rr in 1:study.specific.dim.vect[1])
          {
           cells.so.far<-cells.so.far+1
           count.in.cell[cells.so.far]<-table.out[[ns]][cells.so.far]
           rvar.mark[cells.so.far]<-rvar.dimnames[rr]
           cvar.mark[cells.so.far]<-cvar.dimnames[cc]
           stvar.mark[cells.so.far]<-stvar.dimnames[ss]
          }
      }
  }


table.current.study<-cbind(rvar.mark,cvar.mark,stvar.mark,count.in.cell)
table.current.study[is.na(table.current.study)]<-"NA"
#cat("current study =",ns)
#print(table.current.study)


array.current.study<-array(data=table.current.study[,4],dim=dim.vector.all.sources[1:3],
                           dimnames=list(rvar.dimnames.unique,cvar.dimnames.unique,stvar.dimnames.unique))
names(dimnames(array.current.study))<-c(rvar,cvar,stvar)
#array.current.study

#IS TABLE FOR CURRENT STUDY IDENTICAL IN STRUCTURE TO EMPTY TABLE OVERALL?

etas<-as.vector(empty.table.all.sources[,1:3])
tss<-as.vector(table.current.study[,1:3])

tables.identical<-FALSE

if((sum(etas==tss))==length(etas))tables.identical<-TRUE

#if the table structure is identical to the structure of the empty table from all sources
#then simply write the counts from the study specific table to the empty table from all sources
#and then that becomes the study.specific.table for the given study

#if structure not identical match rows in study specific table to rows in array.all.sources using
#the dimnames.x.num to index the equivalent rows then the dimnames.x to check

if(tables.identical)
{
#array.all.sources[,,,ns]<-array.current.study
}

#if(!tables.identical)
{
#set up sequential numeric code for each of the sorted unique values in each dimnames
dimnames.1<-dimnames(array.all.sources)[[1]]
dimnames.1.num<-1:length(dimnames.1)
#cbind(dimnames.1,dimnames.1.num)

dimnames.2<-dimnames(array.all.sources)[[2]]
dimnames.2.num<-1:length(dimnames.2)
#cbind(dimnames.2,dimnames.2.num)

dimnames.3<-dimnames(array.all.sources)[[3]]
dimnames.3.num<-1:length(dimnames.3)
#cbind(dimnames.3,dimnames.3.num)

#Map row in table from current study with equivalent row in array containing unique
#values from every study and check that the identified row in the full array contains
#the same dimnames values as the row in the current study table. If it does not,
#then rather than trying to second guess all possible ways this could go wrong
#simply stop processing and ask user to work with study specific tables to
#create table statistics he/she requires

index.current.study<-rep(NA,length(table.current.study[,1]))
index.overall<-rep(NA,(dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]))

for(oo in 1:length(table.current.study[,1]))
{
  d1<-table.current.study[oo,1]
  n1<-dimnames.1.num[dimnames.1==d1]
  
  d2<-table.current.study[oo,2]
  n2<-dimnames.2.num[dimnames.2==d2]

  d3<-table.current.study[oo,3]
  n3<-dimnames.3.num[dimnames.3==d3]
 
  index.current.study<-oo

#index.overall applies to empty.table.all.sources.col.1 etc which are all of length dim1*dim2*dim3 because
#this was created before array.all.sources was replicated to include space for all studies. So calculations
#of index.overall do not need to take account of ns value

 index.overall<-n1+dim.vector.all.sources[1]*(n2-1)+
                 dim.vector.all.sources[1]*dim.vector.all.sources[2]*(n3-1)

  
  count.current.study.current.row<-table.current.study[oo,4]
  
  #check dimnames all match
  d1.a<-empty.table.all.sources.col.1[index.overall]

  d2.a<-empty.table.all.sources.col.2[index.overall]

  d3.a<-empty.table.all.sources.col.3[index.overall]

   
  if(d1.a!=d1||d2.a!=d2||d3.a!=d3)
  {
  return.message=  "Dimensions of tables not behaving sensibly across studies.Please check the data in each study and calculate counts and percentages, yourself, using the counts from the individual studies"
  cat(return.message)
  return(return.message) 
  }
  
  #dimension markers match so allocate count to correct cell

  #index.overall.extended.over.all.studies applies to array.all.sources which is of length dim1*dim2*dim3*numsources
  #must therefore add dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]*(ns-1)
  #to identify correct row in extended array which has the full dimnames structure replicated for each study
  
  index.overall.extended.over.all.studies<-index.overall+
              dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]*(ns-1)
  
  array.all.sources[index.overall.extended.over.all.studies]<-count.current.study.current.row
  }

dimnames(array.all.sources)[4]<-list(as.character(1:numsources))
#print(array.all.sources)





		}#end of tables not identical loop

	}#end of ns loop


#Combine across studies if requested
####################################

combine.array.all.sources<-as.numeric(array.all.sources[,,,1])

if(numsources>1)
{
	for(ns in 2:numsources)
	{
	combine.array.all.sources<-combine.array.all.sources+as.numeric(array.all.sources[,,,ns])
	}
}

combine.array.all.sources<-array(data=combine.array.all.sources,dim=dim(array.all.sources)[1:num.table.dims],
							dimnames=dimnames(array.all.sources)[1:num.table.dims])
#print(combine.array.all.sources)

######################

#return(array.all.sources)

}#end of 3 dims loop


######################################################################

#Next work with two dimensional tables
if(num.table.dims==2)
{
 
 #identify all possible values of each dimension
  
  rvar.dimnames<-NULL
  cvar.dimnames<-NULL
  
  for(ns in 1:numsources)
      {
      rvar.dimnames<-c(rvar.dimnames,dimnames(table.out[[ns]])$rvar)
      rvar.dimnames[is.na(rvar.dimnames)]<-"NA"
      cvar.dimnames<-c(cvar.dimnames,dimnames(table.out[[ns]])$cvar)
      cvar.dimnames[is.na(cvar.dimnames)]<-"NA"
      }

rvar.dimnames.unique<-unique(rvar.dimnames)
cvar.dimnames.unique<-unique(cvar.dimnames)


numcells.all.sources<-length(rvar.dimnames.unique)*length(cvar.dimnames.unique)

empty.table.all.sources.col.1<-rep(rvar.dimnames.unique,times=(length(cvar.dimnames.unique)))
empty.table.all.sources.col.1

empty.table.all.sources.col.2<-rep(cvar.dimnames.unique,each=length(rvar.dimnames.unique))
empty.table.all.sources.col.2

empty.table.all.sources.col.3<-rep(0,times=(length(rvar.dimnames.unique)*length(cvar.dimnames.unique)))
empty.table.all.sources.col.3

empty.table.all.sources<-cbind(empty.table.all.sources.col.1,empty.table.all.sources.col.2,
                               empty.table.all.sources.col.3)


empty.table.all.sources[is.na(empty.table.all.sources)]<-"NA"

#print(empty.table.all.sources) #1 table length 36 empty values

 

dim.vector.all.sources<-c(length(rvar.dimnames.unique),length(cvar.dimnames.unique),numsources)

array.all.sources<-array(data=NA,dim=dim.vector.all.sources,
                         dimnames=list(rvar.dimnames.unique,cvar.dimnames.unique,NULL))

names(dimnames(array.all.sources))<-c(rvar,cvar,"study")
#print(array.all.sources) #length 108 all NAs so the study specific template for dimnames
						 #has been correctly expanded to
                         #include all studies

#KEY LOOP
for(ns in 1:numsources)
{
#start with study 1 then 2 etc etc

numcells<-length(table.out[[ns]])
#print(numcells)
study.specific.dim.vect<-dim(table.out[[ns]])
study.specific.dim.vect

count.in.cell<-rep(NA,numcells)
cvar.mark<-rep("",numcells)
rvar.mark<-rep("",numcells)
cells.so.far<-0

  
      for(cc in 1:study.specific.dim.vect[2])
      {
          for(rr in 1:study.specific.dim.vect[1])
          {
           cells.so.far<-cells.so.far+1
           count.in.cell[cells.so.far]<-table.out[[ns]][cells.so.far]
           rvar.mark[cells.so.far]<-rvar.dimnames[rr]
           cvar.mark[cells.so.far]<-cvar.dimnames[cc]
          }
      }
  


table.current.study<-cbind(rvar.mark,cvar.mark,count.in.cell)
table.current.study[is.na(table.current.study)]<-"NA"
#cat("current study =",ns)
#print(table.current.study)


array.current.study<-array(data=table.current.study[,3],dim=dim.vector.all.sources[1:2],
                           dimnames=list(rvar.dimnames.unique,cvar.dimnames.unique))
names(dimnames(array.current.study))<-c(rvar,cvar)
#array.current.study

#IS TABLE FOR CURRENT STUDY IDENTICAL IN STRUCTURE TO EMPTY TABLE OVERALL?

etas<-as.vector(empty.table.all.sources[,1:2])
tss<-as.vector(table.current.study[,1:2])

tables.identical<-FALSE

if((sum(etas==tss))==length(etas))tables.identical<-TRUE

#if the table structure is identical to the structure of the empty table from all sources
#then simply write the counts from the study specific table to the empty table from all sources
#and then that becomes the study.specific.table for the given study

#if structure not identical match rows in study specific table to rows in array.all.sources using
#the dimnames.x.num to index the equivalent rows then the dimnames.x to check

if(tables.identical)
{
#array.all.sources[,,,ns]<-array.current.study
}

#if(!tables.identical)
{
#set up sequential numeric code for each of the sorted unique values in each dimnames
dimnames.1<-dimnames(array.all.sources)[[1]]
dimnames.1.num<-1:length(dimnames.1)
#cbind(dimnames.1,dimnames.1.num)

dimnames.2<-dimnames(array.all.sources)[[2]]
dimnames.2.num<-1:length(dimnames.2)
#cbind(dimnames.2,dimnames.2.num)


#Map row in table from current study with equivalent row in array containing unique
#values from every study and check that the identified row in the full array contains
#the same dimnames values as the row in the current study table. If it does not,
#then rather than trying to second guess all possible ways this could go wrong
#simply stop processing and ask user to work with study specific tables to
#create table statistics he/she requires

index.current.study<-rep(NA,length(table.current.study[,1]))
index.overall<-rep(NA,(dim.vector.all.sources[1]*dim.vector.all.sources[2]))

for(oo in 1:length(table.current.study[,1]))
{
  d1<-table.current.study[oo,1]
  n1<-dimnames.1.num[dimnames.1==d1]
  
  d2<-table.current.study[oo,2]
  n2<-dimnames.2.num[dimnames.2==d2]
 
  index.current.study<-oo

#index.overall applies to empty.table.all.sources.col.1 etc which are all of length dim1*dim2*dim3 because
#this was created before array.all.sources was replicated to include space for all studies. So calculations
#of index.overall do not need to take account of ns value

 index.overall<-n1+dim.vector.all.sources[1]*(n2-1)
 #                dim.vector.all.sources[1]*dim.vector.all.sources[2]*(n3-1)

  
  count.current.study.current.row<-table.current.study[oo,3]
  
  #check dimnames all match
  d1.a<-empty.table.all.sources.col.1[index.overall]

  d2.a<-empty.table.all.sources.col.2[index.overall]


#test effect of discrepency
#   d1<-"999"
   
  if(d1.a!=d1||d2.a!=d2)
  {
  return.message=  "Dimensions of tables not behaving sensibly across studies.Please check the data in each study and calculate counts and percentages, yourself, using the counts from the individual studies"
  cat(return.message)
  return(return.message) 
  }
  
  #dimension markers match so allocate count to correct cell

  #index.overall.extended.over.all.studies applies to array.all.sources which is of length dim1*dim2*dim3*numsources
  #must therefore add dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]*(ns-1)
  #to identify correct row in extended array which has the full dimnames structure replicated for each study
  
  index.overall.extended.over.all.studies<-index.overall+
              dim.vector.all.sources[1]*dim.vector.all.sources[2]*(ns-1)
  
  array.all.sources[index.overall.extended.over.all.studies]<-count.current.study.current.row
  }

dimnames(array.all.sources)[3]<-list(as.character(1:numsources))
#print(array.all.sources)





		}#end of tables not identical loop

	}#end of ns loop

 
#Combine across studies if requested
####################################

combine.array.all.sources<-as.numeric(array.all.sources[,,1])

if(numsources>1)
{
	for(ns in 2:numsources)
	{
	combine.array.all.sources<-combine.array.all.sources+as.numeric(array.all.sources[,,ns])
	}
}

combine.array.all.sources<-array(data=combine.array.all.sources,dim=dim(array.all.sources)[1:num.table.dims],
							dimnames=dimnames(array.all.sources)[1:num.table.dims])
#print(combine.array.all.sources)

######################

#return(array.all.sources)


}#end of 2 dims loop

######################################################################

#Next work with one dimensional tables
if(num.table.dims==1)
{
 
 #identify all possible values of each dimension
  
  rvar.dimnames<-NULL
  
  
  for(ns in 1:numsources)
      {
      rvar.dimnames<-c(rvar.dimnames,dimnames(table.out[[ns]])$rvar)
      rvar.dimnames[is.na(rvar.dimnames)]<-"NA"
      }

rvar.dimnames.unique<-unique(rvar.dimnames)

numcells.all.sources<-length(rvar.dimnames.unique)

empty.table.all.sources.col.1<-rvar.dimnames.unique
empty.table.all.sources.col.1

empty.table.all.sources.col.2<-rep(0,times=(length(rvar.dimnames.unique)))
empty.table.all.sources.col.2

empty.table.all.sources<-cbind(empty.table.all.sources.col.1,empty.table.all.sources.col.2)


empty.table.all.sources[is.na(empty.table.all.sources)]<-"NA"

#print(empty.table.all.sources) #1 table length 36 empty values

 

dim.vector.all.sources<-c(length(rvar.dimnames.unique),numsources)

array.all.sources<-array(data=NA,dim=dim.vector.all.sources,
                         dimnames=list(rvar.dimnames.unique,NULL))

names(dimnames(array.all.sources))<-c(rvar,"study")

#KEY LOOP
for(ns in 1:numsources)
{
#start with study 1 then 2 etc etc

numcells<-length(table.out[[ns]])
#print(numcells)
study.specific.dim.vect<-dim(table.out[[ns]])
study.specific.dim.vect

count.in.cell<-rep(NA,numcells)
rvar.mark<-rep("",numcells)
cells.so.far<-0

  
      
          for(rr in 1:study.specific.dim.vect[1])
          {
           cells.so.far<-cells.so.far+1
           count.in.cell[cells.so.far]<-table.out[[ns]][cells.so.far]
           rvar.mark[cells.so.far]<-rvar.dimnames[rr]
          }

  


table.current.study<-cbind(rvar.mark,count.in.cell)
table.current.study[is.na(table.current.study)]<-"NA"
#cat("current study =",ns)
#print(table.current.study)


array.current.study<-array(data=table.current.study[,2],dim=dim.vector.all.sources[1],
                           dimnames=list(rvar.dimnames.unique))
names(dimnames(array.current.study))<-c(rvar)

#IS TABLE FOR CURRENT STUDY IDENTICAL IN STRUCTURE TO EMPTY TABLE OVERALL?

etas<-as.vector(empty.table.all.sources[,1])
tss<-as.vector(table.current.study[,1])

tables.identical<-FALSE

if((sum(etas==tss))==length(etas))tables.identical<-TRUE

#if the table structure is identical to the structure of the empty table from all sources
#then simply write the counts from the study specific table to the empty table from all sources
#and then that becomes the study.specific.table for the given study

#if structure not identical match rows in study specific table to rows in array.all.sources using
#the dimnames.x.num to index the equivalent rows then the dimnames.x to check

if(tables.identical)
{
#array.all.sources[,,,ns]<-array.current.study
}

#if(!tables.identical)
{
#set up sequential numeric code for each of the sorted unique values in each dimnames
dimnames.1<-dimnames(array.all.sources[])[[1]]
dimnames.1.num<-1:length(dimnames.1)


#Map row in table from current study with equivalent row in array containing unique
#values from every study and check that the identified row in the full array contains
#the same dimnames values as the row in the current study table. If it does not,
#then rather than trying to second guess all possible ways this could go wrong
#simply stop processing and ask user to work with study specific tables to
#create table statistics he/she requires

index.current.study<-rep(NA,length(table.current.study[,1]))
index.overall<-rep(NA,(dim.vector.all.sources[1]))

for(oo in 1:length(table.current.study[,1]))
{
  d1<-table.current.study[oo,1]
  n1<-dimnames.1.num[dimnames.1==d1]
   
  index.current.study<-oo

#index.overall applies to empty.table.all.sources.col.1 etc which are all of length dim1*dim2*dim3 because
#this was created before array.all.sources was replicated to include space for all studies. So calculations
#of index.overall do not need to take account of ns value

 index.overall<-n1
 #                +dim.vector.all.sources[1]*(n2-1)
 #                dim.vector.all.sources[1]*dim.vector.all.sources[2]*(n3-1)

  
  count.current.study.current.row<-table.current.study[oo,2]
  
  #check dimnames all match
  d1.a<-empty.table.all.sources.col.1[index.overall]


  if(d1.a!=d1)
  {
  return.message=  "Dimensions of tables not behaving sensibly across studies.Please check the data in each study and calculate counts and percentages, yourself, using the counts from the individual studies"
  cat(return.message)
  return(return.message) 
  }
  
  #dimension markers match so allocate count to correct cell

  #index.overall.extended.over.all.studies applies to array.all.sources which is of length dim1*dim2*dim3*numsources
  #must therefore add dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]*(ns-1)
  #to identify correct row in extended array which has the full dimnames structure replicated for each study
  
  index.overall.extended.over.all.studies<-index.overall+
              dim.vector.all.sources[1]*(ns-1)
  
  array.all.sources[index.overall.extended.over.all.studies]<-count.current.study.current.row
  }

dimnames(array.all.sources)[2]<-list(as.character(1:numsources))
#print(array.all.sources)

		}#end of tables not identical loop

	}#end of ns loop



#Combine across studies if requested
####################################

combine.array.all.sources<-as.numeric(array.all.sources[,1])

if(numsources>1)
{
	for(ns in 2:numsources)
	{
	combine.array.all.sources<-combine.array.all.sources+as.numeric(array.all.sources[,ns])
	}
}

combine.array.all.sources<-array(data=combine.array.all.sources,dim=dim(array.all.sources)[1:num.table.dims],
							dimnames=dimnames(array.all.sources)[1:num.table.dims])



}#end of 1 dims loop

######################
#clean and process output tables

array.all.sources.temp<-array.all.sources

array.all.sources<-as.numeric(array.all.sources.temp)

array.all.sources<-array(data=array.all.sources,dim=dim(array.all.sources.temp),
							dimnames=dimnames(array.all.sources.temp))

output.text.temp<-paste0(",TABLES.COMBINED_all.sources_counts=combine.array.all.sources)")

	for(ns in numsources:1)
	{
	name.array.study<-paste0("array.study.",ns)
	commas.vect<-rep(",",num.table.dims)
	commas.vect<-paste(commas.vect,collapse="")
	calltext<-paste0(name.array.study,"<-array.all.sources[",commas.vect,ns,"]")
	eval(parse(text=calltext))

		output.text.temp<-paste0(",TABLE_STUDY.",study.names.valid[ns],"_counts=array.study.",ns,output.text.temp)
	}



##################################################
#NOW MOVE TO CALCULATE ROW AND COLUMN PROPORTIONS#
##################################################

##########################
#TABLES WITH 3 DIMENSIONS#
##########################

if(num.table.dims==3)
{
#start with combined table				  

combine.array.all.sources.row.props<-combine.array.all.sources
combine.array.all.sources.col.props<-combine.array.all.sources

for(st in 1:length(stvar.dimnames.unique))
{
  numrows<-dim(combine.array.all.sources[,,st])[1]
  numcols<-dim(combine.array.all.sources[,,st])[2]

	for(nr in 1:numrows)
	{
		sum.row<-sum(combine.array.all.sources[nr,,st],na.rm=TRUE)
		combine.array.all.sources.row.props[nr,,st]<-signif(combine.array.all.sources[nr,,st]/sum.row,3)
	}

	for(nc in 1:numcols)
	{
		sum.col<-sum(combine.array.all.sources[,nc,st],na.rm=TRUE)
		combine.array.all.sources.col.props[,nc,st]<-signif(combine.array.all.sources[,nc,st]/sum.col,3)
	}


}

	calltext2<-paste0("TABLE.COMBINED_row.props<-combine.array.all.sources.row.props")
	calltext3<-paste0("TABLE.COMBINED_col.props<-combine.array.all.sources.col.props")
        TABLE.COMBINED_row.props <- NULL
        TABLE.COMBINED_col.props <- NULL
	eval(parse(text=calltext2))
	eval(parse(text=calltext3))


#######################	
#study specific tables#
#######################
	for(ns in numsources:1)
	{
#	name.array.study<-paste0("array.study.",ns)
	commas.vect<-rep(",",num.table.dims)
	commas.vect<-paste(commas.vect,collapse="")
	calltext<-paste0("study.specific.table<-array.all.sources[",commas.vect,ns,"]")
        study.specific.table<-NULL
	eval(parse(text=calltext))

study.specific.table.row.props<-study.specific.table
study.specific.table.col.props<-study.specific.table


for(st in 1:length(stvar.dimnames.unique))
{
  numrows<-dim(study.specific.table[,,st])[1]
  numcols<-dim(study.specific.table[,,st])[2]


    for(nr in 1:numrows)
    {
      sum.row<-sum(study.specific.table[nr,,st],na.rm=TRUE)
      study.specific.table.row.props[nr,,st]<-signif(study.specific.table[nr,,st]/sum.row,3)
    }
	for(nc in 1:numcols)
	{
		sum.col<-sum(study.specific.table[,nc,st],na.rm=TRUE)
		study.specific.table.col.props[,nc,st]<-signif(study.specific.table[,nc,st]/sum.col,3)
	}
}

  calltext4<-paste0("TABLE.STUDY_row.props.",ns,"<-study.specific.table.row.props")
  calltext5<-paste0("TABLE.STUDY_col.props.",ns,"<-study.specific.table.col.props")
 
#print(calltext4)
#print(calltext5)
 
  eval(parse(text=calltext4))
  eval(parse(text=calltext5))

}

		output.text.temp<-paste0(",TABLES.COMBINED_all.sources_row.props=TABLE.COMBINED_row.props,
						   TABLES.COMBINED_all.sources_col.props=TABLE.COMBINED_col.props",
								output.text.temp)


  for(ns in numsources:1)
  {
	if(ns>1)
	{	
		output.text.temp<-paste0(",TABLE.STUDY.",study.names.valid[ns],"_row.props=TABLE.STUDY_row.props.",ns,",",
						     "TABLE.STUDY.",study.names.valid[ns],"_col.props=TABLE.STUDY_col.props.",ns,
							 output.text.temp)

	}
	else	
	{
		output.text.props.counts<-
		paste0("output.list<-list(TABLE.STUDY.",study.names.valid[ns],"_row.props=TABLE.STUDY_row.props.",ns,",",
						     "TABLE.STUDY.",study.names.valid[ns],"_col.props=TABLE.STUDY_col.props.",ns,
							 output.text.temp)
		
	}
  }

eval(parse(text=output.text.props.counts))

	return.list.first<-list(output.list=output.list,validity.message=validity.message)

	if(!report.chisq.tests&&!table.assign)
	{
	return(return.list.first)
	}

}#END second dim=3 loop


##########################
#TABLES WITH 2 DIMENSIONS#
##########################
if(num.table.dims==2)
{
#start with combined table				  

combine.array.all.sources.row.props<-combine.array.all.sources
combine.array.all.sources.col.props<-combine.array.all.sources


  numrows<-dim(combine.array.all.sources[,])[1]
  numcols<-dim(combine.array.all.sources[,])[2]

	for(nr in 1:numrows)
	{
		sum.row<-sum(combine.array.all.sources[nr,],na.rm=TRUE)
		combine.array.all.sources.row.props[nr,]<-signif(combine.array.all.sources[nr,]/sum.row,3)
	}

	for(nc in 1:numcols)
	{
		sum.col<-sum(combine.array.all.sources[,nc],na.rm=TRUE)
		combine.array.all.sources.col.props[,nc]<-signif(combine.array.all.sources[,nc]/sum.col,3)
	}




	calltext2<-paste0("TABLE.COMBINED_row.props<-combine.array.all.sources.row.props")
	calltext3<-paste0("TABLE.COMBINED_col.props<-combine.array.all.sources.col.props")

	eval(parse(text=calltext2))
	eval(parse(text=calltext3))


#######################	
#study specific tables#
#######################
	for(ns in numsources:1)
	{
#	name.array.study<-paste0("array.study.",ns)
	commas.vect<-rep(",",num.table.dims)
	commas.vect<-paste(commas.vect,collapse="")
	calltext<-paste0("study.specific.table<-array.all.sources[",commas.vect,ns,"]")
	eval(parse(text=calltext))

study.specific.table.row.props<-study.specific.table
study.specific.table.col.props<-study.specific.table



  numrows<-dim(study.specific.table[,])[1]
  numcols<-dim(study.specific.table[,])[2]


    for(nr in 1:numrows)
    {
      sum.row<-sum(study.specific.table[nr,],na.rm=TRUE)
      study.specific.table.row.props[nr,]<-signif(study.specific.table[nr,]/sum.row,3)
    }
	for(nc in 1:numcols)
	{
		sum.col<-sum(study.specific.table[,nc],na.rm=TRUE)
		study.specific.table.col.props[,nc]<-signif(study.specific.table[,nc]/sum.col,3)
	}


  calltext4<-paste0("TABLE.STUDY_row.props.",ns,"<-study.specific.table.row.props")
  calltext5<-paste0("TABLE.STUDY_col.props.",ns,"<-study.specific.table.col.props")
 
#print(calltext4)
#print(calltext5)
 
  eval(parse(text=calltext4))
  eval(parse(text=calltext5))

}


output.text.temp<-paste0(",TABLES.COMBINED_all.sources_row.props=TABLE.COMBINED_row.props,
						   TABLES.COMBINED_all.sources_col.props=TABLE.COMBINED_col.props",
								output.text.temp)


  for(ns in numsources:1)
  {
	if(ns>1)
	{	
	output.text.temp<-paste0(",TABLE.STUDY.",study.names.valid[ns],"_row.props=TABLE.STUDY_row.props.",ns,",",
						     "TABLE.STUDY.",study.names.valid[ns],"_col.props=TABLE.STUDY_col.props.",ns,
							 output.text.temp)
#print(output.text.temp)

	}
	else	
	{
	output.text.temp<-
	paste0("TABLE.STUDY.",study.names.valid[ns],"_row.props=TABLE.STUDY_row.props.",ns,",",
						     "TABLE.STUDY.",study.names.valid[ns],"_col.props=TABLE.STUDY_col.props.",ns,
							 output.text.temp)
	}
  }

	output.text.props.counts.dim.2<-paste0("output.list=list(",output.text.temp)
	eval(parse(text=output.text.props.counts.dim.2))

	return.list.first<-list(output.list=output.list,validity.message=validity.message)

	if(!report.chisq.tests&&!table.assign)
	{
	return(return.list.first)
	}
	
}#END second dim=2 loop


##########################
#TABLES WITH 1 DIMENSIONS#
##########################

if(num.table.dims==1)
{
#start with combined table				  

	combine.array.all.sources.col.props<-combine.array.all.sources

	numcols<-length(combine.array.all.sources)

	for(nc in 1:numcols)
	{
		sum.col<-sum(combine.array.all.sources,na.rm=TRUE)
		combine.array.all.sources.col.props<-signif(combine.array.all.sources/sum.col,3)
	}




	calltext3<-paste0("TABLE.COMBINED_col.props<-combine.array.all.sources.col.props")

	eval(parse(text=calltext3))


########################	
#rvar by source tables #
########################

	array.all.sources_counts<-array.all.sources
	array.all.sources_row.props<-array.all.sources
	array.all.sources_col.props<-array.all.sources

	numrows<-dim(array.all.sources)[1]
	numcols<-dim(array.all.sources)[2]

	row.sum<-rep(NA,numrows)
	col.sum<-rep(NA,numcols)
	
	for(nr in 1:numrows)
	{
	row.sum[nr]<-sum(array.all.sources[nr,],na.rm=TRUE)
	array.all.sources_row.props[nr,]<-array.all.sources[nr,]/row.sum[nr]
	}


	for(nc in 1:numcols)
	{
	col.sum[nc]<-sum(array.all.sources[,nc],na.rm=TRUE)
	array.all.sources_col.props[,nc]<-array.all.sources[,nc]/col.sum[nc]
	}

	dimnames(array.all.sources_counts)[2]<-list(study.names.valid)

	dimnames(array.all.sources_col.props)[2]<-list(study.names.valid)

	dimnames(array.all.sources_row.props)[2]<-list(study.names.valid)

	output.list<-list(
					  TABLE_rvar.by.study_row.props=array.all.sources_row.props,
					  TABLE_rvar.by.study_col.props=array.all.sources_col.props,
					  TABLE_rvar.by.study_counts=array.all.sources_counts,
					  TABLES.COMBINED_all.sources_proportions=TABLE.COMBINED_col.props,
					  TABLES.COMBINED_all.sources_counts=combine.array.all.sources)

	return.list.first<-list(output.list=output.list,validity.message=validity.message)

	if(!report.chisq.tests&&!table.assign)
	{
	return(return.list.first)
	}


}#END second dim=1 loop


################################
#NOW UNDERTAKE CHISQUARED TESTS#
################################

if(report.chisq.tests)
{
##########################
#TABLES WITH 3 DIMENSIONS#
##########################

#Suppress.chisq.warnings by default
if(suppress.chisq.warnings)
{
options(warn=-1)
}
##################
#Combined studies#
##################


if(num.table.dims==3){

	numtests<-dim(combine.array.all.sources)[num.table.dims]

	chisq.list.temp<-")"
	
	for(nt in numtests:1)
	{
		chisqtext<-paste0("chisq.test_TABLES.COMBINED.",nt,"<-stats::chisq.test(combine.array.all.sources[,,nt])")
		eval(parse(text=chisqtext))
		chisq.list.temp<-paste0(",chisq.test_TABLES.COMBINED_all.sources_counts_table.",nt,"=chisq.test_TABLES.COMBINED.",nt,chisq.list.temp)	
	}


##################
#Separate studies#
##################


for(ns in numsources:1)
 {
    input.calltext<-paste0("input.array.source.specific<-array.study.",ns)
        input.array.source.specific <- NULL
	eval(parse(text=input.calltext))
	
	numtests<-dim(input.array.source.specific)[num.table.dims]

	
	for(nt in numtests:1)
	{
	
		if(nt>1||ns>1)
		{
		chisqtext<-paste0("chisq.test_TABLE.STUDY.",ns,"_counts.",nt,"<-stats::chisq.test(input.array.source.specific[,,nt])")
		eval(parse(text=chisqtext))
		chisq.list.temp<-paste0(",chisq.test_TABLE.STUDY.",study.names.valid[ns],"_counts_table.",nt,"=chisq.test_TABLE.STUDY.",ns,"_counts.",nt,chisq.list.temp)	
		}
		else
		{
		chisqtext<-paste0("chisq.test_TABLE.STUDY.",ns,"_counts.",nt,"<-stats::chisq.test(input.array.source.specific[,,nt])")
		eval(parse(text=chisqtext))
		chisq.list.text<-paste0("chisq.tests<-list(chisq.test_TABLE.STUDY.",study.names.valid[ns],"_counts_table.",nt,"=chisq.test_TABLE.STUDY.",ns,"_counts.",nt,chisq.list.temp)
		}
	}
  }#END ns loop

#	print(chisq.list.text)
	
	eval(parse(text=chisq.list.text))
	

#If warnings suppressed now return to default
if(suppress.chisq.warnings)
{
options(warn=0)
}
	

	return.list.second<-list(output.list=return.list.first,chisq.tests=chisq.tests,validity.message=validity.message)

	if(!table.assign)
	{
	return(return.list.second)
	}





 }#END third dim=3 loop
 
##########################
#TABLES WITH 2 DIMENSIONS#
##########################
#Suppress.chisq.warnings by default
if(suppress.chisq.warnings)
{
options(warn=-1)
}

##################
#Combined studies#
##################
 
 if(num.table.dims==2){

	chisq.list.temp<-")"
	
		chisqtext<-paste0("chisq.test_TABLES.COMBINED<-stats::chisq.test(combine.array.all.sources)")
		eval(parse(text=chisqtext))
		chisq.list.temp<-paste0(",chisq.test_TABLES.COMBINED_all.sources_counts=chisq.test_TABLES.COMBINED",chisq.list.temp)	



##################
#Separate studies#
##################


for(ns in numsources:1)
 {
    input.calltext<-paste0("input.array.source.specific<-array.study.",ns)
	eval(parse(text=input.calltext))
	
	numtests<-dim(input.array.source.specific)[num.table.dims]
	
	
		if(ns>1)
		{
		chisqtext<-paste0("chisq.test_TABLE.STUDY.",ns,"_counts<-stats::chisq.test(input.array.source.specific)")
		eval(parse(text=chisqtext))
		chisq.list.temp<-paste0(",chisq.test_TABLE.STUDY.",study.names.valid[ns],"_counts=chisq.test_TABLE.STUDY.",ns,"_counts",chisq.list.temp)	
		}
		else
		{
		chisqtext<-paste0("chisq.test_TABLE.STUDY.",ns,"_counts<-stats::chisq.test(input.array.source.specific)")
		eval(parse(text=chisqtext))
		chisq.list.text<-paste0("chisq.tests<-list(chisq.test_TABLE.STUDY.",study.names.valid[ns],"_counts=chisq.test_TABLE.STUDY.",ns,"_counts",chisq.list.temp)
		}

  }#END ns loop

#If warnings suppressed now return to default
if(suppress.chisq.warnings)
{
options(warn=0)
}

	
	eval(parse(text=chisq.list.text))
	
	return.list.second<-list(output.list=return.list.first,chisq.tests=chisq.tests,validity.message=validity.message)

	if(!table.assign)
	{
	return(return.list.second)
	}



 }#END third dim=2 loop

#########################
#TABLES WITH 1 DIMENSION#
#########################
#Suppress.chisq.warnings by default
if(suppress.chisq.warnings)
{
options(warn=-1)
}
 
if(num.table.dims==1){

#combined.studies.not.possible.as.one.column.only

##################
#Separate studies#
##################



chisq.1D.temp<-")"



for(ns in numsources:1)
 {

	if(ns>1)
	{
	chisq.1D.temp<-paste0(",array.all.sources[,",ns,"]",chisq.1D.temp)
	}
	else
	{
		chisq.1D.text<-paste0("rvar.by.study_counts<-cbind(array.all.sources[,",ns,"]",chisq.1D.temp)	
	}


 }#END ns loop

#	print(chisq.1D.text)

        rvar.by.study_counts <- NULL
	eval(parse(text=chisq.1D.text))

	chisq.test_rvar.by.study_counts<-stats::chisq.test(rvar.by.study_counts)

	
	output.list<-list(
					  TABLE_rvar.by.study_row.props=array.all.sources_row.props,
					  TABLE_rvar.by.study_col.props=array.all.sources_col.props,
					  TABLE_rvar.by.study_counts=array.all.sources_counts,
					  TABLES.COMBINED_all.sources_proportions=TABLE.COMBINED_col.props,
					  TABLES.COMBINED_all.sources_counts=combine.array.all.sources
					  )

	chisq.tests=list(CHISQ.TEST_rvar.by.study_counts=chisq.test_rvar.by.study_counts)

	return.list.second<-list(output.list=return.list.first,chisq.tests=chisq.tests,validity.message=validity.message)

	if(!table.assign)
	{
	return(return.list.second)
	}

	
#If warnings suppressed now return to default
if(suppress.chisq.warnings)
{
options(warn=0)
}
	

 }#END third dim=1 loop
####was }

if(table.assign)
	{
# CALL THE SECOND MAIN SERVER SIDE AGGREGATE FUNCTION

	calltext <- call("tableDS2", newobj=newobj,rvar.transmit=rvar.transmit, cvar.transmit=cvar.transmit,
                    stvar.transmit=stvar.transmit)
 
	serverside.table.out<-DSI::datashield.aggregate(datasources, calltext)
 
 if(report.chisq.tests)
	{
	return.list.final<-list(serverside.table.structure=serverside.table.out,outlist=return.list.second)
	}
	else
	{
	return.list.final<-list(serverside.table.structure=serverside.table.out,outlist=return.list.first)
	}
	
	
	return(return.list.final)
	}
  }
 }
#ds.table
datashield/dsBaseClient documentation built on May 16, 2023, 10:19 p.m.