#'
#' @title Generates 1-, 2-, and 3-dimensional contingency tables with option
#' of assigning to serverside only and producing chi-squared statistics
#' @description Creates 1-dimensional, 2-dimensional and 3-dimensional
#' tables using the \code{table} function in native R.
#' @details The \code{ds.table} function selects numeric, integer or factor
#' variables on the serverside which define a contingency table with up to
#' three dimensions. The native R \code{table} function basically operates on
#' factors and if variables are specified that are integers or numerics
#' they are first coerced to factors. If the 1-dimensional, 2-dimensional or
#' 3-dimensional table generated from a given study satisfies appropriate
#' disclosure-control criteria it can be returned directly to
#' the clientside where it is presented as a study-specific
#' table and is also included in a combined table across all studies.
#'
#' The data custodian responsible for data security in a
#' given study can specify the minimum non-zero cell count that
#' determines whether the disclosure-control criterion can be viewed
#' as having been met. If the count in any one cell in a table falls
#' below the specified threshold (and is also non-zero)
#' the whole table is blocked and cannot be returned
#' to the clientside. However, even if a table is potentially
#' disclosive it can still be written to the serverside while
#' an empty representation of the structure of the table is returned
#' to the clientside. The contents of the cells in the serverside table
#' object are reflected in a vector of counts which is one component of
#' that table object.
#'
#' The true counts in the studyside vector
#' are replaced by a sequential set of cell-IDs running from 1:n
#' (where n is the total number of cells in the table) in the empty
#' representation of the structure of the potentially disclosive table
#' that is returned to the clientside. These cell-IDs reflect
#' the order of the counts in the true counts vector on the serverside.
#' In consequence, if the number 13 appears in a cell of the empty
#' table returned to the clientside, it means that the true count
#' in that same cell is held as the 13th element of the true count
#' vector saved on the serverside. This means that a data analyst
#' can still make use of the counts from a call to the {ds.table}
#' function to drive their ongoing analysis even when one or
#' more non-zero cell counts fall below the specified threshold
#' for potential disclosure risk.
#'
#' Because the table object on the
#' serverside cannot be visualised or transferred to the clientside,
#' DataSHIELD ensures that although it can, in this way, be used
#' to advance analysis, it does not create a direct risk of disclosure.
#'
#' The <rvar> argument identifies the variable defining the rows
#' in each of the 2-dimensional tables produced in the output.
#'
#' The <cvar>
#' argument identifies the variable defining the columns in the 2-dimensional
#' tables produced in the output.
#'
#' In creating a 3-dimensional table the
#' <stvar> ('separate tables') argument identifies the variable that
#' indexes the set of two dimensional tables in the output {ds.table}.
#'
#' As a minor technicality, it should be noted that
#' if a 1-dimensional table is required, one only need specify a value
#' for the <rvar> argument and any one dimensional table in the output
#' is presented as a row vectors and so technically the <rvar> variable
#' defines the columns in that 1 x n vector. However, the ds.table
#' function deals with 1-dimensional tables differently to 2 and 3
#' dimensional tables and key components of the output
#' for one dimensional tables are actually two dimensional: with
#' rows defined by <rvar> and with one column for each of the studies.
#'
#' The output list generated by {ds.table} contains tables based on counts
#' named "table.name_counts" and other tables reporting corresponding
#' column proportions ("table.name_col.props") or row proportions
#' ("table.name_row.props"). In one dimensional tables in the output the
#' output tables include _counts and _proportions. The latter are not
#' called _col.props or _row.props because, for the reasons noted
#' above, they are technically column proportions but are based on the
#' distribution of the <rvar> variable.
#'
#' If the <report.chisq.tests> argument is set to TRUE, chisq tests
#' are applied to every 2-dimensional table in the output and reported
#' as "chisq.test_table.name". The <report.chisq.tests> argument
#' defaults to FALSE.
#'
#' If there is at least one expected cell
#' counts < 5 in an output table, the native R <chisq.test> function
#' returns a warning. Because in a DataSHIELD setting this often means that
#' every study and several tables may return the same warning
#' and because it is debatable whether this warning is really
#' statistically important, the <suppress.chisq.warnings> argument
#' can be set to TRUE to block the warnings. However, it is defaulted to FALSE.
#' @param rvar is a character string (in inverted commas) specifying the
#' name of the variable defining the rows in all of the 2 dimensional
#' tables that form the output. Please see 'details' above for more
#' information about one-dimensional tables when a variable name is provided
#' by <rvar> but <cvar> and <stvar> are both NULL
#' @param cvar is a character string specifying the
#' name of the variable defining the columns in all of the 2 dimensional
#' tables that form the output.
#' @param stvar is a character string specifying the
#' name of the variable that indexes the separate two dimensional
#' tables in the output if the call specifies a 3 dimensional table.
#' @param report.chisq.tests if TRUE, chi-squared tests
#' are applied to every 2 dimensional table in the output and reported
#' as "chisq.test_table.name". Default = FALSE.
#' @param exclude this argument is passed through to the {table} function in
#' native R which is called by {tableDS}. The help for {table} in native R
#' indicates that 'exclude' specifies any levels that should be deleted for
#' all factors in rvar, cvar or stvar. If the <exclude> argument
#' does not include NA and if the <useNA> argument is not specified,
#' it implies <useNA> = "always" in DataSHIELD. If you read the help for {table} in native R
#' including the 'details' and the 'examples' (particularly 'd.patho') you
#' will see that the response of {table} to different combinations of the
#' <exclude> and <useNA> arguments can be non-intuitive. This is particularly
#' so if there is more than one type of missing (e.g. missing by observation
#' as well as missing because of an NaN response to a mathematical
#' function - such as log(-3.0)). In DataSHIELD, if you are in one
#' of these complex settings (which should not be very common) and
#' you cannot interpret the output that has been approached
#' you might try: (1) making sure that the variable producing the strange results
#' is of class factor rather than integer or numeric - although integers and
#' numerics are coerced to factors by {ds.table} they can occasionally behave less
#' well when the NA setting is complex; (2) specify both an <exclude> argument
#' e.g. exclude = c("NaN","3") and a <useNA> argument e.g. useNA= "no";
#' (3) if you are excluding multiple levels e.g exclude = c("NA","3")
#' then you can reduce this to one e.g. exclude = c("NA") and then remove
#' the 3s by deleting rows of data, or converting the 3s to a different value.
#' @param useNA this argument is passed through to the {table} function in
#' native R which is called by {tableDS}. In DataSHIELD, this argument can take
#' two values: "no" or "always" which indicate whether to include NA values in the table.
#' For further information, please see the help for the <exclude> argument (above)
#' and/or the help for the {table} function in native R. Default value is set to "always".
#' @param suppress.chisq.warnings if set to TRUE, the default warnings are
#' suppressed that would otherwise be produced by the {table} function in
#' native R whenever an expected cell count in one or more cells is less than 5.
#' Default is FALSE. Further details can be found under 'details' and the
#' help provided for the <report.chisq.tests> argument (above).
#' @param table.assign is a Boolean argument set by default to FALSE. If it is
#' FALSE the {ds.table} function acts as a standard aggregate function -
#' it returns the table that is specified in its call to the clientside
#' where it can be visualised and worked with by the analyst. But if
#' <table.assign> is TRUE, the same table object is also written to
#' the serverside. As explained under 'details' (above), this may be
#' useful when some elements of a table need to be used to drive forward the
#' overall analysis (e.g. to help select individuals for an analysis
#' sub-sample), but the required table cannot be visualised or returned
#' to the clientside because it fails disclosure rules.
#' @param newobj this a character string providing a name for the output
#' table object to be written to the serverside if <table.assign> is TRUE.
#' If no explicit name for the table object is specified, but <table.assign>
#' is nevertheless TRUE, the name for the serverside table object defaults
#' to \code{table.newobj}.
#' @param datasources a list of \code{\link{DSConnection-class}} objects obtained after login. If the <datasources>
#' the default set of connections will be used: see \link{datashield.connections_default}.
#' If the <datasources> is to be specified, it should be set without
#' inverted commas: e.g. datasources=connections.em or datasources=default.connections. If you wish to
#' apply the function solely to e.g. the second connection server in a set of three,
#' the argument can be specified as: e.g. datasources=connections.em[2].
#' If you wish to specify the first and third connection servers in a set you specify:
#' e.g. datasources=connections.em[c(1,3)].
#' @param force.nfilter if <force.nfilter> is non-NULL it must be specified as
#' a positive integer represented as a character string: e.g. "173". This
#' the has the effect of the standard value of 'nfilter.tab' (often 1, 3, 5 or 10
#' depending what value the data custodian has selected for this particular
#' data set), to this new value (here, 173). CRUCIALLY, the {ds.table} function
#' only allows the standard value to be INCREASED. So if the standard value has
#' been set as 5 (as one of the R options set in the serverside connection), "6" and
#' "4981" would be allowable values for the <force.nfilter> argument but "4" or
#' "1" would not. The purpose of this argument is for the user or developer
#' to force the table to fail the disclosure control tests so the he/she can
#' see what then happens and check that it is behaving as anticipated/hoped.
#' @return Having created the requested table based on serverside data
#' it is returned to the clientside for the analyst to visualise (unless
#' it is blocked because it fails the disclosure control criteria or
#' there is an error for some other reason).
#'
#' The clientside output from
#' {ds.table} includes error messages that identify when the creation of a
#' table from a particular study has failed and why. If table.assign=TRUE,
#' {ds.table} also writes the requested table as an object named by
#' the <newobj> argument or set to 'newObj' by default.
#'
#' Further information
#' about the visible material passed to the clientside, and the optional
#' table object written to the serverside can be seen under 'details' (above).
#' @author Paul Burton and Alex Westerberg for DataSHIELD Development Team, 01/05/2020
#' @export
#'
ds.table <- function(rvar=NULL, cvar=NULL, stvar=NULL, report.chisq.tests=FALSE,
exclude=NULL, useNA ="always", suppress.chisq.warnings=FALSE,
table.assign=FALSE, newobj=NULL, datasources=NULL,
force.nfilter=NULL){
# if no connection login details are provided look for 'connection' objects in the environment
if(is.null(datasources)){
datasources <- datashield.connections_find()
}
# ensure datasources is a list of DSConnection-class
if(!(is.list(datasources) && all(unlist(lapply(datasources, function(d) {methods::is(d,"DSConnection")}))))){
stop("The 'datasources' were expected to be a list of DSConnection-class objects", call.=FALSE)
}
# check if a value has been provided for rvar
if(is.null(rvar)){
return("Error: rvar must have a value which is a character string naming the row variable for the table")
}
# check if the input object is defined in all the studies
isDefined(datasources, rvar)
if(!is.null(cvar)&&!is.character(cvar)){
return("Error: if cvar is not null, it must have a value which is a character string naming the column variable for the table")
}
if(!is.null(cvar)){
isDefined(datasources, cvar)
}
if(!is.null(stvar)&&!is.character(stvar)){
return("Error: if stvar is not null, it must have a value which is a character string naming the variable coding separate tables for the table")
}
if(!is.null(stvar)){
isDefined(datasources, stvar)
}
if(useNA!="no" && useNA!="always"){
stop("useNA must be either 'no' or 'always'.")
}
if(!is.null(force.nfilter)&&!is.character(force.nfilter)){
return("Error: if force.nfilter is not null, it must have a value which is a character string specifying an integer for the forced value of the nfilter")
}
#All arguments should be directly transmittable
rvar.transmit<-rvar
if(is.null(cvar))
{
cvar.transmit<-NULL
}
else
{
cvar.transmit<-cvar
}
if(is.null(stvar))
{
stvar.transmit<-NULL
}
else
{
stvar.transmit<-stvar
}
if(is.null(exclude))
{
exclude.transmit<-NULL
}
else
{
exclude.transmit<-paste0(as.character(exclude),collapse=",")
}
useNA.transmit<-useNA
if(is.null(force.nfilter))
{
force.nfilter.transmit<-NULL
}
else
{
force.nfilter.transmit<-force.nfilter
}
#CALL THE asFactorDS1 SERVER SIDE FUNCTION (AN AGGREGATE FUNCTION)
# FOR rvar, cvar AND stvar
#TO DETERMINE ALL OF THE LEVELS REQUIRED
rvar.asfactor.calltext <- call("asFactorDS1", rvar)
rvar.all.levels <- DSI::datashield.aggregate(datasources, rvar.asfactor.calltext)
numstudies <- length(datasources)
rvar.all.levels.all.studies <- NULL
for(j in 1:numstudies){
rvar.all.levels.all.studies <- c(rvar.all.levels.all.studies,rvar.all.levels[[j]])
}
if (length(rvar.all.levels.all.studies) == 0) {
stop(paste0("Unable to obtain factors for rvar: '", rvar , "'"), call. = FALSE)
}
rvar.all.unique.levels <- as.character(unique(rvar.all.levels.all.studies))
rvar.all.unique.levels.transmit <- paste0(rvar.all.unique.levels, collapse=",")
########################################################
if(!is.null(cvar)){
cvar.asfactor.calltext <- call("asFactorDS1", cvar)
cvar.all.levels <- DSI::datashield.aggregate(datasources, cvar.asfactor.calltext)
numstudies <- length(datasources)
cvar.all.levels.all.studies <- NULL
for(j in 1:numstudies){
cvar.all.levels.all.studies <- c(cvar.all.levels.all.studies,cvar.all.levels[[j]])
}
if (length(cvar.all.levels.all.studies) == 0) {
stop(paste0("Unable to obtain factors for cvar: '", cvar , "'"), call. = FALSE)
}
cvar.all.unique.levels <- as.character(unique(cvar.all.levels.all.studies))
cvar.all.unique.levels.transmit <- paste0(cvar.all.unique.levels, collapse=",")
}else{
cvar.all.unique.levels.transmit<-NULL
}
########################################################
if(!is.null(stvar)){
stvar.asfactor.calltext <- call("asFactorDS1", stvar)
stvar.all.levels <- DSI::datashield.aggregate(datasources, stvar.asfactor.calltext)
numstudies <- length(datasources)
stvar.all.levels.all.studies <- NULL
for(j in 1:numstudies){
stvar.all.levels.all.studies <- c(stvar.all.levels.all.studies,stvar.all.levels[[j]])
}
if (length(stvar.all.levels.all.studies) == 0) {
stop(paste0("Unable to obtain factors for stvar: '", stvar , "'"), call. = FALSE)
}
stvar.all.unique.levels <- as.character(unique(stvar.all.levels.all.studies))
stvar.all.unique.levels.transmit <- paste0(stvar.all.unique.levels, collapse=",")
}else{
stvar.all.unique.levels.transmit<-NULL
}
#ASSIGN TABLE TO SERVERSIDE IF REQUIRED
if(table.assign)
{
if(is.null(newobj))
{
newobj<-"table.newobj"
}
# CALL THE MAIN SERVER SIDE ASSIGN FUNCTION
calltext.assign <- call("tableDS.assign", rvar.transmit=rvar.transmit, cvar.transmit=cvar.transmit,
stvar.transmit=stvar.transmit, rvar.all.unique.levels.transmit=rvar.all.unique.levels.transmit,
cvar.all.unique.levels.transmit=cvar.all.unique.levels.transmit,
stvar.all.unique.levels.transmit=stvar.all.unique.levels.transmit,
exclude.transmit=exclude.transmit, useNA.transmit=useNA.transmit
)
DSI::datashield.assign(datasources, newobj, calltext.assign)
}
# CALL THE MAIN SERVER SIDE AGGREGATE FUNCTION
calltext <- call("tableDS", rvar.transmit=rvar.transmit, cvar.transmit=cvar.transmit,
stvar.transmit=stvar.transmit,rvar.all.unique.levels.transmit=rvar.all.unique.levels.transmit,
cvar.all.unique.levels.transmit=cvar.all.unique.levels.transmit,
stvar.all.unique.levels.transmit=stvar.all.unique.levels.transmit,
exclude.transmit=exclude.transmit, useNA.transmit=useNA.transmit,
force.nfilter.transmit=force.nfilter.transmit)
table.out<-DSI::datashield.aggregate(datasources, calltext)
#END OF MAIN FUNCTION LEADING UP TO CALL
##############################################################################
###############################################################################
#Take serverside output and set up arrays with the correct dimensions
#so all values of a given dimension in any study are included in the
#tables produced for each individual study and for the combined values
#across studies
#How many studies have returned tables
numsources.orig<-length(table.out)
table.out.orig<-table.out
#Check whether return is a table or an error message
valid.output<-rep(1,numsources.orig)
error.messages<-table.out
sum.valid<-0
study.names.valid<-NULL
list.temp<-NULL
for(ns in 1:numsources.orig)
{
if("character" %in% class(table.out[[ns]]))
{
valid.output[ns]<-0
error.messages[[ns]]<-table.out[[ns]]
}
else
{
error.messages[[ns]]<-"No errors reported from this study"
}
}
num.valid.studies<-sum(valid.output)
if(num.valid.studies==0)
{
if ((! table.assign) || report.chisq.tests)
{
validity.message<-"All studies failed for reasons identified below"
cat("\n",validity.message,"\n\n")
for(ns in 1:numsources.orig)
{
cat("\nStudy",ns,": ",error.messages[[ns]],"\n")
}
return(list(validity.message=validity.message,error.messages=error.messages))
}
else
{
return(NULL)
}
}
for(ns in 1:numsources.orig)
{
if(valid.output[ns])
{
sum.valid<-sum.valid+1
list.temp<-paste0(list.temp,"table.out.orig[[",ns,"]]")
if(sum.valid < num.valid.studies)
{
list.temp<-paste0(list.temp,",")
}
study.names.valid<-c(study.names.valid,names(datasources)[ns])
}
}
table.out.valid <- FALSE
list.text<-paste0("table.out.valid<-list(",list.temp,")")
eval(parse(text=list.text))
table.out<-table.out.valid
numsources<-length(table.out)
if(num.valid.studies>0&&num.valid.studies<numsources.orig)
{
validity.message<-"At least one study failed for reasons identified by 'error.messages':"
for(ns in 1:numsources.orig)
{
message.add<-paste0("Study",ns,": ",error.messages[[ns]])
validity.message<-c(validity.message,message.add)
}
# cat("\n",validity.message,"\n")
# for(ns in 1:numsources.orig)
# {
# cat("\nStudy",ns,": ",error.messages[[ns]])
# }
# cat("\n\n")
#table.out<-table.out.valid
#numsources<-length(table.out)
}
if(num.valid.studies==numsources.orig)
{
validity.message<-"Data in all studies were valid"
if (! table.assign)
{
cat("\n",validity.message,"\n")
for(ns in 1:numsources.orig)
{
cat("\nStudy",ns,": ",error.messages[[ns]])
}
cat("\n\n")
}
}
#check all tables from all sources have the same number of dimensions
table.dimensions<-rep(0,numsources)
for(ns in 1:numsources)
{
table.dimensions[ns]<-length(dim(table.out[[ns]]))
}
#table.dimensions
all.dims.same<-TRUE
if(numsources>1)
{
for(ns in 1:(numsources-1))
{
if(table.dimensions[ns]!=table.dimensions[ns+1])
{
all.dims.same<-FALSE
return.message<-"Warning: tables in different sources have different numbers of dimensions. Please analyse and combine yourself from study.specific tables above"
print(return.message)
return(return.message)
}
}
if(all.dims.same)
{
num.table.dims<-table.dimensions[1]
}
else
{
num.table.dims<-NA
}
}
if(numsources==1)
{
num.table.dims<-table.dimensions[1]
}
#num.table.dims
######################################################################
#Work first with three dimensional tables
if(num.table.dims==3)
{
#identify all possible values of each dimension
rvar.dimnames<-NULL
cvar.dimnames<-NULL
stvar.dimnames<-NULL
for(ns in 1:numsources)
{
rvar.dimnames<-c(rvar.dimnames,dimnames(table.out[[ns]])$rvar)
rvar.dimnames[is.na(rvar.dimnames)]<-"NA"
cvar.dimnames<-c(cvar.dimnames,dimnames(table.out[[ns]])$cvar)
cvar.dimnames[is.na(cvar.dimnames)]<-"NA"
stvar.dimnames<-c(stvar.dimnames,dimnames(table.out[[ns]])$stvar)
stvar.dimnames[is.na(stvar.dimnames)]<-"NA"
}
rvar.dimnames.unique<-unique(rvar.dimnames)
cvar.dimnames.unique<-unique(cvar.dimnames)
stvar.dimnames.unique<-unique(stvar.dimnames)
#print(rvar.dimnames)
#print(rvar.dimnames.unique)
#print(cvar.dimnames.unique)
#print(stvar.dimnames.unique)
numcells.all.sources<-length(rvar.dimnames.unique)*length(cvar.dimnames.unique)*length(stvar.dimnames.unique)
#numcells.all.sources
empty.table.all.sources.col.1<-rep(rvar.dimnames.unique,times=(length(cvar.dimnames.unique)*length(stvar.dimnames.unique)))
empty.table.all.sources.col.1
empty.table.all.sources.col.2<-rep(rep(cvar.dimnames.unique,each=length(rvar.dimnames.unique)),times=length(stvar.dimnames.unique))
empty.table.all.sources.col.2
empty.table.all.sources.col.3<-rep(rep(stvar.dimnames.unique,each=(length(rvar.dimnames.unique)*length(cvar.dimnames.unique))))
empty.table.all.sources.col.3
empty.table.all.sources.col.4<-rep(0,times=(length(rvar.dimnames.unique)*length(cvar.dimnames.unique)*length(stvar.dimnames.unique)))
empty.table.all.sources.col.4
empty.table.all.sources<-cbind(empty.table.all.sources.col.1,empty.table.all.sources.col.2,
empty.table.all.sources.col.3,empty.table.all.sources.col.4)
#dimnames(empty.table.all.sources)<-list(NULL,NULL)
empty.table.all.sources[is.na(empty.table.all.sources)]<-"NA"
#print(empty.table.all.sources) #1 table length 36 empty values
dim.vector.all.sources<-c(length(rvar.dimnames.unique),length(cvar.dimnames.unique),
length(stvar.dimnames.unique),numsources)
array.all.sources<-array(data=NA,dim=dim.vector.all.sources,
dimnames=list(rvar.dimnames.unique,cvar.dimnames.unique,stvar.dimnames.unique,NULL))
names(dimnames(array.all.sources))<-c(rvar,cvar,stvar,"study")
#print(array.all.sources) #length 108 all NAs so the study specific template for dimnames
#has been correctly expanded to
#include all studies
#KEY LOOP
for(ns in 1:numsources)
{
#start with study 1 then 2 etc etc
numcells<-length(table.out[[ns]])
#print(numcells)
study.specific.dim.vect<-dim(table.out[[ns]])
study.specific.dim.vect
count.in.cell<-rep(NA,numcells)
stvar.mark<-rep("",numcells)
cvar.mark<-rep("",numcells)
rvar.mark<-rep("",numcells)
cells.so.far<-0
for(ss in 1:study.specific.dim.vect[3])
{
for(cc in 1:study.specific.dim.vect[2])
{
for(rr in 1:study.specific.dim.vect[1])
{
cells.so.far<-cells.so.far+1
count.in.cell[cells.so.far]<-table.out[[ns]][cells.so.far]
rvar.mark[cells.so.far]<-rvar.dimnames[rr]
cvar.mark[cells.so.far]<-cvar.dimnames[cc]
stvar.mark[cells.so.far]<-stvar.dimnames[ss]
}
}
}
table.current.study<-cbind(rvar.mark,cvar.mark,stvar.mark,count.in.cell)
table.current.study[is.na(table.current.study)]<-"NA"
#cat("current study =",ns)
#print(table.current.study)
array.current.study<-array(data=table.current.study[,4],dim=dim.vector.all.sources[1:3],
dimnames=list(rvar.dimnames.unique,cvar.dimnames.unique,stvar.dimnames.unique))
names(dimnames(array.current.study))<-c(rvar,cvar,stvar)
#array.current.study
#IS TABLE FOR CURRENT STUDY IDENTICAL IN STRUCTURE TO EMPTY TABLE OVERALL?
etas<-as.vector(empty.table.all.sources[,1:3])
tss<-as.vector(table.current.study[,1:3])
tables.identical<-FALSE
if((sum(etas==tss))==length(etas))tables.identical<-TRUE
#if the table structure is identical to the structure of the empty table from all sources
#then simply write the counts from the study specific table to the empty table from all sources
#and then that becomes the study.specific.table for the given study
#if structure not identical match rows in study specific table to rows in array.all.sources using
#the dimnames.x.num to index the equivalent rows then the dimnames.x to check
if(tables.identical)
{
#array.all.sources[,,,ns]<-array.current.study
}
#if(!tables.identical)
{
#set up sequential numeric code for each of the sorted unique values in each dimnames
dimnames.1<-dimnames(array.all.sources)[[1]]
dimnames.1.num<-1:length(dimnames.1)
#cbind(dimnames.1,dimnames.1.num)
dimnames.2<-dimnames(array.all.sources)[[2]]
dimnames.2.num<-1:length(dimnames.2)
#cbind(dimnames.2,dimnames.2.num)
dimnames.3<-dimnames(array.all.sources)[[3]]
dimnames.3.num<-1:length(dimnames.3)
#cbind(dimnames.3,dimnames.3.num)
#Map row in table from current study with equivalent row in array containing unique
#values from every study and check that the identified row in the full array contains
#the same dimnames values as the row in the current study table. If it does not,
#then rather than trying to second guess all possible ways this could go wrong
#simply stop processing and ask user to work with study specific tables to
#create table statistics he/she requires
index.current.study<-rep(NA,length(table.current.study[,1]))
index.overall<-rep(NA,(dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]))
for(oo in 1:length(table.current.study[,1]))
{
d1<-table.current.study[oo,1]
n1<-dimnames.1.num[dimnames.1==d1]
d2<-table.current.study[oo,2]
n2<-dimnames.2.num[dimnames.2==d2]
d3<-table.current.study[oo,3]
n3<-dimnames.3.num[dimnames.3==d3]
index.current.study<-oo
#index.overall applies to empty.table.all.sources.col.1 etc which are all of length dim1*dim2*dim3 because
#this was created before array.all.sources was replicated to include space for all studies. So calculations
#of index.overall do not need to take account of ns value
index.overall<-n1+dim.vector.all.sources[1]*(n2-1)+
dim.vector.all.sources[1]*dim.vector.all.sources[2]*(n3-1)
count.current.study.current.row<-table.current.study[oo,4]
#check dimnames all match
d1.a<-empty.table.all.sources.col.1[index.overall]
d2.a<-empty.table.all.sources.col.2[index.overall]
d3.a<-empty.table.all.sources.col.3[index.overall]
if(d1.a!=d1||d2.a!=d2||d3.a!=d3)
{
return.message= "Dimensions of tables not behaving sensibly across studies.Please check the data in each study and calculate counts and percentages, yourself, using the counts from the individual studies"
cat(return.message)
return(return.message)
}
#dimension markers match so allocate count to correct cell
#index.overall.extended.over.all.studies applies to array.all.sources which is of length dim1*dim2*dim3*numsources
#must therefore add dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]*(ns-1)
#to identify correct row in extended array which has the full dimnames structure replicated for each study
index.overall.extended.over.all.studies<-index.overall+
dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]*(ns-1)
array.all.sources[index.overall.extended.over.all.studies]<-count.current.study.current.row
}
dimnames(array.all.sources)[4]<-list(as.character(1:numsources))
#print(array.all.sources)
}#end of tables not identical loop
}#end of ns loop
#Combine across studies if requested
####################################
combine.array.all.sources<-as.numeric(array.all.sources[,,,1])
if(numsources>1)
{
for(ns in 2:numsources)
{
combine.array.all.sources<-combine.array.all.sources+as.numeric(array.all.sources[,,,ns])
}
}
combine.array.all.sources<-array(data=combine.array.all.sources,dim=dim(array.all.sources)[1:num.table.dims],
dimnames=dimnames(array.all.sources)[1:num.table.dims])
#print(combine.array.all.sources)
######################
#return(array.all.sources)
}#end of 3 dims loop
######################################################################
#Next work with two dimensional tables
if(num.table.dims==2)
{
#identify all possible values of each dimension
rvar.dimnames<-NULL
cvar.dimnames<-NULL
for(ns in 1:numsources)
{
rvar.dimnames<-c(rvar.dimnames,dimnames(table.out[[ns]])$rvar)
rvar.dimnames[is.na(rvar.dimnames)]<-"NA"
cvar.dimnames<-c(cvar.dimnames,dimnames(table.out[[ns]])$cvar)
cvar.dimnames[is.na(cvar.dimnames)]<-"NA"
}
rvar.dimnames.unique<-unique(rvar.dimnames)
cvar.dimnames.unique<-unique(cvar.dimnames)
numcells.all.sources<-length(rvar.dimnames.unique)*length(cvar.dimnames.unique)
empty.table.all.sources.col.1<-rep(rvar.dimnames.unique,times=(length(cvar.dimnames.unique)))
empty.table.all.sources.col.1
empty.table.all.sources.col.2<-rep(cvar.dimnames.unique,each=length(rvar.dimnames.unique))
empty.table.all.sources.col.2
empty.table.all.sources.col.3<-rep(0,times=(length(rvar.dimnames.unique)*length(cvar.dimnames.unique)))
empty.table.all.sources.col.3
empty.table.all.sources<-cbind(empty.table.all.sources.col.1,empty.table.all.sources.col.2,
empty.table.all.sources.col.3)
empty.table.all.sources[is.na(empty.table.all.sources)]<-"NA"
#print(empty.table.all.sources) #1 table length 36 empty values
dim.vector.all.sources<-c(length(rvar.dimnames.unique),length(cvar.dimnames.unique),numsources)
array.all.sources<-array(data=NA,dim=dim.vector.all.sources,
dimnames=list(rvar.dimnames.unique,cvar.dimnames.unique,NULL))
names(dimnames(array.all.sources))<-c(rvar,cvar,"study")
#print(array.all.sources) #length 108 all NAs so the study specific template for dimnames
#has been correctly expanded to
#include all studies
#KEY LOOP
for(ns in 1:numsources)
{
#start with study 1 then 2 etc etc
numcells<-length(table.out[[ns]])
#print(numcells)
study.specific.dim.vect<-dim(table.out[[ns]])
study.specific.dim.vect
count.in.cell<-rep(NA,numcells)
cvar.mark<-rep("",numcells)
rvar.mark<-rep("",numcells)
cells.so.far<-0
for(cc in 1:study.specific.dim.vect[2])
{
for(rr in 1:study.specific.dim.vect[1])
{
cells.so.far<-cells.so.far+1
count.in.cell[cells.so.far]<-table.out[[ns]][cells.so.far]
rvar.mark[cells.so.far]<-rvar.dimnames[rr]
cvar.mark[cells.so.far]<-cvar.dimnames[cc]
}
}
table.current.study<-cbind(rvar.mark,cvar.mark,count.in.cell)
table.current.study[is.na(table.current.study)]<-"NA"
#cat("current study =",ns)
#print(table.current.study)
array.current.study<-array(data=table.current.study[,3],dim=dim.vector.all.sources[1:2],
dimnames=list(rvar.dimnames.unique,cvar.dimnames.unique))
names(dimnames(array.current.study))<-c(rvar,cvar)
#array.current.study
#IS TABLE FOR CURRENT STUDY IDENTICAL IN STRUCTURE TO EMPTY TABLE OVERALL?
etas<-as.vector(empty.table.all.sources[,1:2])
tss<-as.vector(table.current.study[,1:2])
tables.identical<-FALSE
if((sum(etas==tss))==length(etas))tables.identical<-TRUE
#if the table structure is identical to the structure of the empty table from all sources
#then simply write the counts from the study specific table to the empty table from all sources
#and then that becomes the study.specific.table for the given study
#if structure not identical match rows in study specific table to rows in array.all.sources using
#the dimnames.x.num to index the equivalent rows then the dimnames.x to check
if(tables.identical)
{
#array.all.sources[,,,ns]<-array.current.study
}
#if(!tables.identical)
{
#set up sequential numeric code for each of the sorted unique values in each dimnames
dimnames.1<-dimnames(array.all.sources)[[1]]
dimnames.1.num<-1:length(dimnames.1)
#cbind(dimnames.1,dimnames.1.num)
dimnames.2<-dimnames(array.all.sources)[[2]]
dimnames.2.num<-1:length(dimnames.2)
#cbind(dimnames.2,dimnames.2.num)
#Map row in table from current study with equivalent row in array containing unique
#values from every study and check that the identified row in the full array contains
#the same dimnames values as the row in the current study table. If it does not,
#then rather than trying to second guess all possible ways this could go wrong
#simply stop processing and ask user to work with study specific tables to
#create table statistics he/she requires
index.current.study<-rep(NA,length(table.current.study[,1]))
index.overall<-rep(NA,(dim.vector.all.sources[1]*dim.vector.all.sources[2]))
for(oo in 1:length(table.current.study[,1]))
{
d1<-table.current.study[oo,1]
n1<-dimnames.1.num[dimnames.1==d1]
d2<-table.current.study[oo,2]
n2<-dimnames.2.num[dimnames.2==d2]
index.current.study<-oo
#index.overall applies to empty.table.all.sources.col.1 etc which are all of length dim1*dim2*dim3 because
#this was created before array.all.sources was replicated to include space for all studies. So calculations
#of index.overall do not need to take account of ns value
index.overall<-n1+dim.vector.all.sources[1]*(n2-1)
# dim.vector.all.sources[1]*dim.vector.all.sources[2]*(n3-1)
count.current.study.current.row<-table.current.study[oo,3]
#check dimnames all match
d1.a<-empty.table.all.sources.col.1[index.overall]
d2.a<-empty.table.all.sources.col.2[index.overall]
#test effect of discrepency
# d1<-"999"
if(d1.a!=d1||d2.a!=d2)
{
return.message= "Dimensions of tables not behaving sensibly across studies.Please check the data in each study and calculate counts and percentages, yourself, using the counts from the individual studies"
cat(return.message)
return(return.message)
}
#dimension markers match so allocate count to correct cell
#index.overall.extended.over.all.studies applies to array.all.sources which is of length dim1*dim2*dim3*numsources
#must therefore add dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]*(ns-1)
#to identify correct row in extended array which has the full dimnames structure replicated for each study
index.overall.extended.over.all.studies<-index.overall+
dim.vector.all.sources[1]*dim.vector.all.sources[2]*(ns-1)
array.all.sources[index.overall.extended.over.all.studies]<-count.current.study.current.row
}
dimnames(array.all.sources)[3]<-list(as.character(1:numsources))
#print(array.all.sources)
}#end of tables not identical loop
}#end of ns loop
#Combine across studies if requested
####################################
combine.array.all.sources<-as.numeric(array.all.sources[,,1])
if(numsources>1)
{
for(ns in 2:numsources)
{
combine.array.all.sources<-combine.array.all.sources+as.numeric(array.all.sources[,,ns])
}
}
combine.array.all.sources<-array(data=combine.array.all.sources,dim=dim(array.all.sources)[1:num.table.dims],
dimnames=dimnames(array.all.sources)[1:num.table.dims])
#print(combine.array.all.sources)
######################
#return(array.all.sources)
}#end of 2 dims loop
######################################################################
#Next work with one dimensional tables
if(num.table.dims==1)
{
#identify all possible values of each dimension
rvar.dimnames<-NULL
for(ns in 1:numsources)
{
rvar.dimnames<-c(rvar.dimnames,dimnames(table.out[[ns]])$rvar)
rvar.dimnames[is.na(rvar.dimnames)]<-"NA"
}
rvar.dimnames.unique<-unique(rvar.dimnames)
numcells.all.sources<-length(rvar.dimnames.unique)
empty.table.all.sources.col.1<-rvar.dimnames.unique
empty.table.all.sources.col.1
empty.table.all.sources.col.2<-rep(0,times=(length(rvar.dimnames.unique)))
empty.table.all.sources.col.2
empty.table.all.sources<-cbind(empty.table.all.sources.col.1,empty.table.all.sources.col.2)
empty.table.all.sources[is.na(empty.table.all.sources)]<-"NA"
#print(empty.table.all.sources) #1 table length 36 empty values
dim.vector.all.sources<-c(length(rvar.dimnames.unique),numsources)
array.all.sources<-array(data=NA,dim=dim.vector.all.sources,
dimnames=list(rvar.dimnames.unique,NULL))
names(dimnames(array.all.sources))<-c(rvar,"study")
#KEY LOOP
for(ns in 1:numsources)
{
#start with study 1 then 2 etc etc
numcells<-length(table.out[[ns]])
#print(numcells)
study.specific.dim.vect<-dim(table.out[[ns]])
study.specific.dim.vect
count.in.cell<-rep(NA,numcells)
rvar.mark<-rep("",numcells)
cells.so.far<-0
for(rr in 1:study.specific.dim.vect[1])
{
cells.so.far<-cells.so.far+1
count.in.cell[cells.so.far]<-table.out[[ns]][cells.so.far]
rvar.mark[cells.so.far]<-rvar.dimnames[rr]
}
table.current.study<-cbind(rvar.mark,count.in.cell)
table.current.study[is.na(table.current.study)]<-"NA"
#cat("current study =",ns)
#print(table.current.study)
array.current.study<-array(data=table.current.study[,2],dim=dim.vector.all.sources[1],
dimnames=list(rvar.dimnames.unique))
names(dimnames(array.current.study))<-c(rvar)
#IS TABLE FOR CURRENT STUDY IDENTICAL IN STRUCTURE TO EMPTY TABLE OVERALL?
etas<-as.vector(empty.table.all.sources[,1])
tss<-as.vector(table.current.study[,1])
tables.identical<-FALSE
if((sum(etas==tss))==length(etas))tables.identical<-TRUE
#if the table structure is identical to the structure of the empty table from all sources
#then simply write the counts from the study specific table to the empty table from all sources
#and then that becomes the study.specific.table for the given study
#if structure not identical match rows in study specific table to rows in array.all.sources using
#the dimnames.x.num to index the equivalent rows then the dimnames.x to check
if(tables.identical)
{
#array.all.sources[,,,ns]<-array.current.study
}
#if(!tables.identical)
{
#set up sequential numeric code for each of the sorted unique values in each dimnames
dimnames.1<-dimnames(array.all.sources[])[[1]]
dimnames.1.num<-1:length(dimnames.1)
#Map row in table from current study with equivalent row in array containing unique
#values from every study and check that the identified row in the full array contains
#the same dimnames values as the row in the current study table. If it does not,
#then rather than trying to second guess all possible ways this could go wrong
#simply stop processing and ask user to work with study specific tables to
#create table statistics he/she requires
index.current.study<-rep(NA,length(table.current.study[,1]))
index.overall<-rep(NA,(dim.vector.all.sources[1]))
for(oo in 1:length(table.current.study[,1]))
{
d1<-table.current.study[oo,1]
n1<-dimnames.1.num[dimnames.1==d1]
index.current.study<-oo
#index.overall applies to empty.table.all.sources.col.1 etc which are all of length dim1*dim2*dim3 because
#this was created before array.all.sources was replicated to include space for all studies. So calculations
#of index.overall do not need to take account of ns value
index.overall<-n1
# +dim.vector.all.sources[1]*(n2-1)
# dim.vector.all.sources[1]*dim.vector.all.sources[2]*(n3-1)
count.current.study.current.row<-table.current.study[oo,2]
#check dimnames all match
d1.a<-empty.table.all.sources.col.1[index.overall]
if(d1.a!=d1)
{
return.message= "Dimensions of tables not behaving sensibly across studies.Please check the data in each study and calculate counts and percentages, yourself, using the counts from the individual studies"
cat(return.message)
return(return.message)
}
#dimension markers match so allocate count to correct cell
#index.overall.extended.over.all.studies applies to array.all.sources which is of length dim1*dim2*dim3*numsources
#must therefore add dim.vector.all.sources[1]*dim.vector.all.sources[2]*dim.vector.all.sources[3]*(ns-1)
#to identify correct row in extended array which has the full dimnames structure replicated for each study
index.overall.extended.over.all.studies<-index.overall+
dim.vector.all.sources[1]*(ns-1)
array.all.sources[index.overall.extended.over.all.studies]<-count.current.study.current.row
}
dimnames(array.all.sources)[2]<-list(as.character(1:numsources))
#print(array.all.sources)
}#end of tables not identical loop
}#end of ns loop
#Combine across studies if requested
####################################
combine.array.all.sources<-as.numeric(array.all.sources[,1])
if(numsources>1)
{
for(ns in 2:numsources)
{
combine.array.all.sources<-combine.array.all.sources+as.numeric(array.all.sources[,ns])
}
}
combine.array.all.sources<-array(data=combine.array.all.sources,dim=dim(array.all.sources)[1:num.table.dims],
dimnames=dimnames(array.all.sources)[1:num.table.dims])
}#end of 1 dims loop
######################
#clean and process output tables
array.all.sources.temp<-array.all.sources
array.all.sources<-as.numeric(array.all.sources.temp)
array.all.sources<-array(data=array.all.sources,dim=dim(array.all.sources.temp),
dimnames=dimnames(array.all.sources.temp))
output.text.temp<-paste0(",TABLES.COMBINED_all.sources_counts=combine.array.all.sources)")
for(ns in numsources:1)
{
name.array.study<-paste0("array.study.",ns)
commas.vect<-rep(",",num.table.dims)
commas.vect<-paste(commas.vect,collapse="")
calltext<-paste0(name.array.study,"<-array.all.sources[",commas.vect,ns,"]")
eval(parse(text=calltext))
output.text.temp<-paste0(",TABLE_STUDY.",study.names.valid[ns],"_counts=array.study.",ns,output.text.temp)
}
##################################################
#NOW MOVE TO CALCULATE ROW AND COLUMN PROPORTIONS#
##################################################
##########################
#TABLES WITH 3 DIMENSIONS#
##########################
if(num.table.dims==3)
{
#start with combined table
combine.array.all.sources.row.props<-combine.array.all.sources
combine.array.all.sources.col.props<-combine.array.all.sources
for(st in 1:length(stvar.dimnames.unique))
{
numrows<-dim(combine.array.all.sources[,,st])[1]
numcols<-dim(combine.array.all.sources[,,st])[2]
for(nr in 1:numrows)
{
sum.row<-sum(combine.array.all.sources[nr,,st],na.rm=TRUE)
combine.array.all.sources.row.props[nr,,st]<-signif(combine.array.all.sources[nr,,st]/sum.row,3)
}
for(nc in 1:numcols)
{
sum.col<-sum(combine.array.all.sources[,nc,st],na.rm=TRUE)
combine.array.all.sources.col.props[,nc,st]<-signif(combine.array.all.sources[,nc,st]/sum.col,3)
}
}
calltext2<-paste0("TABLE.COMBINED_row.props<-combine.array.all.sources.row.props")
calltext3<-paste0("TABLE.COMBINED_col.props<-combine.array.all.sources.col.props")
TABLE.COMBINED_row.props <- NULL
TABLE.COMBINED_col.props <- NULL
eval(parse(text=calltext2))
eval(parse(text=calltext3))
#######################
#study specific tables#
#######################
for(ns in numsources:1)
{
# name.array.study<-paste0("array.study.",ns)
commas.vect<-rep(",",num.table.dims)
commas.vect<-paste(commas.vect,collapse="")
calltext<-paste0("study.specific.table<-array.all.sources[",commas.vect,ns,"]")
study.specific.table<-NULL
eval(parse(text=calltext))
study.specific.table.row.props<-study.specific.table
study.specific.table.col.props<-study.specific.table
for(st in 1:length(stvar.dimnames.unique))
{
numrows<-dim(study.specific.table[,,st])[1]
numcols<-dim(study.specific.table[,,st])[2]
for(nr in 1:numrows)
{
sum.row<-sum(study.specific.table[nr,,st],na.rm=TRUE)
study.specific.table.row.props[nr,,st]<-signif(study.specific.table[nr,,st]/sum.row,3)
}
for(nc in 1:numcols)
{
sum.col<-sum(study.specific.table[,nc,st],na.rm=TRUE)
study.specific.table.col.props[,nc,st]<-signif(study.specific.table[,nc,st]/sum.col,3)
}
}
calltext4<-paste0("TABLE.STUDY_row.props.",ns,"<-study.specific.table.row.props")
calltext5<-paste0("TABLE.STUDY_col.props.",ns,"<-study.specific.table.col.props")
#print(calltext4)
#print(calltext5)
eval(parse(text=calltext4))
eval(parse(text=calltext5))
}
output.text.temp<-paste0(",TABLES.COMBINED_all.sources_row.props=TABLE.COMBINED_row.props,
TABLES.COMBINED_all.sources_col.props=TABLE.COMBINED_col.props",
output.text.temp)
for(ns in numsources:1)
{
if(ns>1)
{
output.text.temp<-paste0(",TABLE.STUDY.",study.names.valid[ns],"_row.props=TABLE.STUDY_row.props.",ns,",",
"TABLE.STUDY.",study.names.valid[ns],"_col.props=TABLE.STUDY_col.props.",ns,
output.text.temp)
}
else
{
output.text.props.counts<-
paste0("output.list<-list(TABLE.STUDY.",study.names.valid[ns],"_row.props=TABLE.STUDY_row.props.",ns,",",
"TABLE.STUDY.",study.names.valid[ns],"_col.props=TABLE.STUDY_col.props.",ns,
output.text.temp)
}
}
eval(parse(text=output.text.props.counts))
return.list.first<-list(output.list=output.list,validity.message=validity.message)
if(!report.chisq.tests&&!table.assign)
{
return(return.list.first)
}
}#END second dim=3 loop
##########################
#TABLES WITH 2 DIMENSIONS#
##########################
if(num.table.dims==2)
{
#start with combined table
combine.array.all.sources.row.props<-combine.array.all.sources
combine.array.all.sources.col.props<-combine.array.all.sources
numrows<-dim(combine.array.all.sources[,])[1]
numcols<-dim(combine.array.all.sources[,])[2]
for(nr in 1:numrows)
{
sum.row<-sum(combine.array.all.sources[nr,],na.rm=TRUE)
combine.array.all.sources.row.props[nr,]<-signif(combine.array.all.sources[nr,]/sum.row,3)
}
for(nc in 1:numcols)
{
sum.col<-sum(combine.array.all.sources[,nc],na.rm=TRUE)
combine.array.all.sources.col.props[,nc]<-signif(combine.array.all.sources[,nc]/sum.col,3)
}
calltext2<-paste0("TABLE.COMBINED_row.props<-combine.array.all.sources.row.props")
calltext3<-paste0("TABLE.COMBINED_col.props<-combine.array.all.sources.col.props")
eval(parse(text=calltext2))
eval(parse(text=calltext3))
#######################
#study specific tables#
#######################
for(ns in numsources:1)
{
# name.array.study<-paste0("array.study.",ns)
commas.vect<-rep(",",num.table.dims)
commas.vect<-paste(commas.vect,collapse="")
calltext<-paste0("study.specific.table<-array.all.sources[",commas.vect,ns,"]")
eval(parse(text=calltext))
study.specific.table.row.props<-study.specific.table
study.specific.table.col.props<-study.specific.table
numrows<-dim(study.specific.table[,])[1]
numcols<-dim(study.specific.table[,])[2]
for(nr in 1:numrows)
{
sum.row<-sum(study.specific.table[nr,],na.rm=TRUE)
study.specific.table.row.props[nr,]<-signif(study.specific.table[nr,]/sum.row,3)
}
for(nc in 1:numcols)
{
sum.col<-sum(study.specific.table[,nc],na.rm=TRUE)
study.specific.table.col.props[,nc]<-signif(study.specific.table[,nc]/sum.col,3)
}
calltext4<-paste0("TABLE.STUDY_row.props.",ns,"<-study.specific.table.row.props")
calltext5<-paste0("TABLE.STUDY_col.props.",ns,"<-study.specific.table.col.props")
#print(calltext4)
#print(calltext5)
eval(parse(text=calltext4))
eval(parse(text=calltext5))
}
output.text.temp<-paste0(",TABLES.COMBINED_all.sources_row.props=TABLE.COMBINED_row.props,
TABLES.COMBINED_all.sources_col.props=TABLE.COMBINED_col.props",
output.text.temp)
for(ns in numsources:1)
{
if(ns>1)
{
output.text.temp<-paste0(",TABLE.STUDY.",study.names.valid[ns],"_row.props=TABLE.STUDY_row.props.",ns,",",
"TABLE.STUDY.",study.names.valid[ns],"_col.props=TABLE.STUDY_col.props.",ns,
output.text.temp)
#print(output.text.temp)
}
else
{
output.text.temp<-
paste0("TABLE.STUDY.",study.names.valid[ns],"_row.props=TABLE.STUDY_row.props.",ns,",",
"TABLE.STUDY.",study.names.valid[ns],"_col.props=TABLE.STUDY_col.props.",ns,
output.text.temp)
}
}
output.text.props.counts.dim.2<-paste0("output.list=list(",output.text.temp)
eval(parse(text=output.text.props.counts.dim.2))
return.list.first<-list(output.list=output.list,validity.message=validity.message)
if(!report.chisq.tests&&!table.assign)
{
return(return.list.first)
}
}#END second dim=2 loop
##########################
#TABLES WITH 1 DIMENSIONS#
##########################
if(num.table.dims==1)
{
#start with combined table
combine.array.all.sources.col.props<-combine.array.all.sources
numcols<-length(combine.array.all.sources)
for(nc in 1:numcols)
{
sum.col<-sum(combine.array.all.sources,na.rm=TRUE)
combine.array.all.sources.col.props<-signif(combine.array.all.sources/sum.col,3)
}
calltext3<-paste0("TABLE.COMBINED_col.props<-combine.array.all.sources.col.props")
eval(parse(text=calltext3))
########################
#rvar by source tables #
########################
array.all.sources_counts<-array.all.sources
array.all.sources_row.props<-array.all.sources
array.all.sources_col.props<-array.all.sources
numrows<-dim(array.all.sources)[1]
numcols<-dim(array.all.sources)[2]
row.sum<-rep(NA,numrows)
col.sum<-rep(NA,numcols)
for(nr in 1:numrows)
{
row.sum[nr]<-sum(array.all.sources[nr,],na.rm=TRUE)
array.all.sources_row.props[nr,]<-array.all.sources[nr,]/row.sum[nr]
}
for(nc in 1:numcols)
{
col.sum[nc]<-sum(array.all.sources[,nc],na.rm=TRUE)
array.all.sources_col.props[,nc]<-array.all.sources[,nc]/col.sum[nc]
}
dimnames(array.all.sources_counts)[2]<-list(study.names.valid)
dimnames(array.all.sources_col.props)[2]<-list(study.names.valid)
dimnames(array.all.sources_row.props)[2]<-list(study.names.valid)
output.list<-list(
TABLE_rvar.by.study_row.props=array.all.sources_row.props,
TABLE_rvar.by.study_col.props=array.all.sources_col.props,
TABLE_rvar.by.study_counts=array.all.sources_counts,
TABLES.COMBINED_all.sources_proportions=TABLE.COMBINED_col.props,
TABLES.COMBINED_all.sources_counts=combine.array.all.sources)
return.list.first<-list(output.list=output.list,validity.message=validity.message)
if(!report.chisq.tests&&!table.assign)
{
return(return.list.first)
}
}#END second dim=1 loop
################################
#NOW UNDERTAKE CHISQUARED TESTS#
################################
if(report.chisq.tests)
{
##########################
#TABLES WITH 3 DIMENSIONS#
##########################
#Suppress.chisq.warnings by default
if(suppress.chisq.warnings)
{
options(warn=-1)
}
##################
#Combined studies#
##################
if(num.table.dims==3){
numtests<-dim(combine.array.all.sources)[num.table.dims]
chisq.list.temp<-")"
for(nt in numtests:1)
{
chisqtext<-paste0("chisq.test_TABLES.COMBINED.",nt,"<-stats::chisq.test(combine.array.all.sources[,,nt])")
eval(parse(text=chisqtext))
chisq.list.temp<-paste0(",chisq.test_TABLES.COMBINED_all.sources_counts_table.",nt,"=chisq.test_TABLES.COMBINED.",nt,chisq.list.temp)
}
##################
#Separate studies#
##################
for(ns in numsources:1)
{
input.calltext<-paste0("input.array.source.specific<-array.study.",ns)
input.array.source.specific <- NULL
eval(parse(text=input.calltext))
numtests<-dim(input.array.source.specific)[num.table.dims]
for(nt in numtests:1)
{
if(nt>1||ns>1)
{
chisqtext<-paste0("chisq.test_TABLE.STUDY.",ns,"_counts.",nt,"<-stats::chisq.test(input.array.source.specific[,,nt])")
eval(parse(text=chisqtext))
chisq.list.temp<-paste0(",chisq.test_TABLE.STUDY.",study.names.valid[ns],"_counts_table.",nt,"=chisq.test_TABLE.STUDY.",ns,"_counts.",nt,chisq.list.temp)
}
else
{
chisqtext<-paste0("chisq.test_TABLE.STUDY.",ns,"_counts.",nt,"<-stats::chisq.test(input.array.source.specific[,,nt])")
eval(parse(text=chisqtext))
chisq.list.text<-paste0("chisq.tests<-list(chisq.test_TABLE.STUDY.",study.names.valid[ns],"_counts_table.",nt,"=chisq.test_TABLE.STUDY.",ns,"_counts.",nt,chisq.list.temp)
}
}
}#END ns loop
# print(chisq.list.text)
eval(parse(text=chisq.list.text))
#If warnings suppressed now return to default
if(suppress.chisq.warnings)
{
options(warn=0)
}
return.list.second<-list(output.list=return.list.first,chisq.tests=chisq.tests,validity.message=validity.message)
if(!table.assign)
{
return(return.list.second)
}
}#END third dim=3 loop
##########################
#TABLES WITH 2 DIMENSIONS#
##########################
#Suppress.chisq.warnings by default
if(suppress.chisq.warnings)
{
options(warn=-1)
}
##################
#Combined studies#
##################
if(num.table.dims==2){
chisq.list.temp<-")"
chisqtext<-paste0("chisq.test_TABLES.COMBINED<-stats::chisq.test(combine.array.all.sources)")
eval(parse(text=chisqtext))
chisq.list.temp<-paste0(",chisq.test_TABLES.COMBINED_all.sources_counts=chisq.test_TABLES.COMBINED",chisq.list.temp)
##################
#Separate studies#
##################
for(ns in numsources:1)
{
input.calltext<-paste0("input.array.source.specific<-array.study.",ns)
eval(parse(text=input.calltext))
numtests<-dim(input.array.source.specific)[num.table.dims]
if(ns>1)
{
chisqtext<-paste0("chisq.test_TABLE.STUDY.",ns,"_counts<-stats::chisq.test(input.array.source.specific)")
eval(parse(text=chisqtext))
chisq.list.temp<-paste0(",chisq.test_TABLE.STUDY.",study.names.valid[ns],"_counts=chisq.test_TABLE.STUDY.",ns,"_counts",chisq.list.temp)
}
else
{
chisqtext<-paste0("chisq.test_TABLE.STUDY.",ns,"_counts<-stats::chisq.test(input.array.source.specific)")
eval(parse(text=chisqtext))
chisq.list.text<-paste0("chisq.tests<-list(chisq.test_TABLE.STUDY.",study.names.valid[ns],"_counts=chisq.test_TABLE.STUDY.",ns,"_counts",chisq.list.temp)
}
}#END ns loop
#If warnings suppressed now return to default
if(suppress.chisq.warnings)
{
options(warn=0)
}
eval(parse(text=chisq.list.text))
return.list.second<-list(output.list=return.list.first,chisq.tests=chisq.tests,validity.message=validity.message)
if(!table.assign)
{
return(return.list.second)
}
}#END third dim=2 loop
#########################
#TABLES WITH 1 DIMENSION#
#########################
#Suppress.chisq.warnings by default
if(suppress.chisq.warnings)
{
options(warn=-1)
}
if(num.table.dims==1){
#combined.studies.not.possible.as.one.column.only
##################
#Separate studies#
##################
chisq.1D.temp<-")"
for(ns in numsources:1)
{
if(ns>1)
{
chisq.1D.temp<-paste0(",array.all.sources[,",ns,"]",chisq.1D.temp)
}
else
{
chisq.1D.text<-paste0("rvar.by.study_counts<-cbind(array.all.sources[,",ns,"]",chisq.1D.temp)
}
}#END ns loop
# print(chisq.1D.text)
rvar.by.study_counts <- NULL
eval(parse(text=chisq.1D.text))
chisq.test_rvar.by.study_counts<-stats::chisq.test(rvar.by.study_counts)
output.list<-list(
TABLE_rvar.by.study_row.props=array.all.sources_row.props,
TABLE_rvar.by.study_col.props=array.all.sources_col.props,
TABLE_rvar.by.study_counts=array.all.sources_counts,
TABLES.COMBINED_all.sources_proportions=TABLE.COMBINED_col.props,
TABLES.COMBINED_all.sources_counts=combine.array.all.sources
)
chisq.tests=list(CHISQ.TEST_rvar.by.study_counts=chisq.test_rvar.by.study_counts)
return.list.second<-list(output.list=return.list.first,chisq.tests=chisq.tests,validity.message=validity.message)
if(!table.assign)
{
return(return.list.second)
}
#If warnings suppressed now return to default
if(suppress.chisq.warnings)
{
options(warn=0)
}
}#END third dim=1 loop
####was }
if(table.assign)
{
# CALL THE SECOND MAIN SERVER SIDE AGGREGATE FUNCTION
calltext <- call("tableDS2", newobj=newobj,rvar.transmit=rvar.transmit, cvar.transmit=cvar.transmit,
stvar.transmit=stvar.transmit)
serverside.table.out<-DSI::datashield.aggregate(datasources, calltext)
if(report.chisq.tests)
{
return.list.final<-list(serverside.table.structure=serverside.table.out,outlist=return.list.second)
}
else
{
return.list.final<-list(serverside.table.structure=serverside.table.out,outlist=return.list.first)
}
return(return.list.final)
}
}
}
#ds.table
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.