ds.extractQuantiles: Secure ranking of a vector across all sources and use of...

View source: R/ds.extractQuantiles.R

ds.extractQuantiles R Documentation

Secure ranking of a vector across all sources and use of these ranks to estimate global quantiles across all studies

Description

Takes the global ranks and quantiles held in the serverside data frame that is written by ranksSecureDS4 and named as specified by the argument (<output.ranks.df>), and converts these values into a series of quantile values that identify, for example, which value of V2BR across all of the studies corresponds to the median or to the 95th percentile. There is no indication in which study the V2BR corresponding to a particular quantile falls; in fact, the relevant value may fall in more than one study and may appear multiple times in any one study. Finally, the output data frame containing this information is written to the clientside and to the serverside at each study separately.

Usage

ds.extractQuantiles(
  extract.quantiles,
  extract.summary.output.ranks.df,
  extract.ranks.sort.by,
  extract.rm.residual.objects,
  extract.datasources = NULL
)

Arguments

extract.quantiles

one of a restricted set of character strings. The value of this argument is set by the argument <quantiles.for.estimation> in ds.ranksSecure. In summary: to mitigate disclosure risk, only the following set of quantiles can be generated: c(0.025,0.05,0.10,0.20,0.25,0.30,0.3333,0.40,0.50,0.60,0.6667,0.70,0.75,0.80,0.90,0.95,0.975). The allowable formats for the argument are of the general form "0.025-0.975", where the first number is the lowest quantile to be estimated and the second number is the highest. These two quantiles are then estimated along with all allowable quantiles in between. The allowable argument values of this form are "0.025-0.975", "0.05-0.95", "0.10-0.90" and "0.20-0.80". Two alternative values are "quartiles", i.e. c(0.25,0.50,0.75), and "median", i.e. c(0.50). The default value is "0.05-0.95". For more details, see the associated document "secure.global.ranking.docx". Also see the header file for ds.ranksSecure.
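The mapping from an argument string to the quantiles it requests can be sketched in a few lines of R. This is only an illustration of the rule described above, not the code used inside dsBaseClient; the helper name expand_quantile_spec and the vector name allowed_quantiles are hypothetical.

```r
# Full set of quantiles the documentation lists as allowable.
allowed_quantiles <- c(0.025, 0.05, 0.10, 0.20, 0.25, 0.30, 0.3333, 0.40,
                       0.50, 0.60, 0.6667, 0.70, 0.75, 0.80, 0.90, 0.95, 0.975)

# Hypothetical helper (not part of dsBaseClient): expand an argument string
# into the vector of quantiles it requests.
expand_quantile_spec <- function(spec = "0.05-0.95") {
  if (spec == "median") return(0.50)
  if (spec == "quartiles") return(c(0.25, 0.50, 0.75))

  valid_ranges <- c("0.025-0.975", "0.05-0.95", "0.10-0.90", "0.20-0.80")
  if (!spec %in% valid_ranges) stop("unsupported value: ", spec)

  # Split "low-high" into its two numeric bounds.
  bounds <- as.numeric(strsplit(spec, "-", fixed = TRUE)[[1]])

  # Keep the two bounds and every allowable quantile in between;
  # the small tolerance guards against floating-point comparison issues.
  tol <- 1e-9
  allowed_quantiles[allowed_quantiles >= bounds[1] - tol &
                    allowed_quantiles <= bounds[2] + tol]
}

print(expand_quantile_spec("0.20-0.80"))
```

Under this reading, "0.05-0.95" drops only the two most extreme quantiles (0.025 and 0.975) from the full allowable set.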

extract.summary.output.ranks.df

a character string which specifies the optional name for the summary data.frame written to the serverside on each data source. This data frame contains 5 of the key output variables from the ranking procedure pertaining to that particular data source. If no name has been specified by the argument <summary.output.ranks.df> in ds.ranksSecure, the default name "summary.ranks.df" is allocated. The only reason the <extract.summary.output.ranks.df> argument needs specifying in ds.extractQuantiles is that ds.extractQuantiles is the last function called by ds.ranksSecure, and almost the final command of ds.extractQuantiles is to print out the name of the data frame containing the summarised ranking information generated by ds.ranksSecure and the order in which that data frame is laid out. This message therefore appears as the last output produced when ds.ranksSecure is run, which makes it clear that it relates to the main output of ds.ranksSecure, not of ds.extractQuantiles.

extract.ranks.sort.by

a character string taking two possible values. These are "ID.orig" and "vals.orig". This is set via the argument <ranks.sort.by> in ds.ranksSecure. For more details see the associated document entitled "secure.global.ranking.docx". Also see the header file for ds.ranksSecure.

extract.rm.residual.objects

logical value. Default = TRUE: at the beginning and end of each run of ds.ranksSecure, delete all extraneous objects that would otherwise be left behind. These objects are not usually needed, but could be of value if one were investigating a problem with the ranking. FALSE: do not delete the residual objects.

extract.datasources

specifies the particular opal object(s) to use. This is set via the argument <datasources> in ds.ranksSecure. For more details see the associated document entitled "secure.global.ranking.docx". Also see the header file for ds.ranksSecure.

Details

ds.extractQuantiles is a clientside function which should usually be called from within the clientside function ds.ranksSecure. If you try to call ds.extractQuantiles directly (i.e. not by running ds.ranksSecure), you will almost certainly have to set up quite a few vectors and scalars that are normally set by ds.ranksSecure, and this is likely to be difficult. ds.extractQuantiles itself calls two serverside functions, extractQuantilesDS1 and extractQuantilesDS2. For more details about the cluster of functions that collectively enable secure global ranking and estimation of global quantiles, see the associated document entitled "secure.global.ranking.docx". In particular, this explains how ds.extractQuantiles works. Also see the header file for ds.ranksSecure.
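Because the recommended route is through ds.ranksSecure, a typical call might look like the sketch below. It is wrapped in a function so that nothing contacts a server until it is invoked with live DataSHIELD connections. The wrapper name run_secure_ranking, the variable "D$LAB_TSC" and the data frame name "full.ranks.df" are hypothetical, and the argument name input.var.name is an assumption that should be checked against the ds.ranksSecure help page; the remaining argument names are those mentioned on this page.

```r
# Hypothetical wrapper: defining it is harmless; calling it requires
# dsBaseClient and an active set of DataSHIELD connections.
run_secure_ranking <- function(connections, variable = "D$LAB_TSC") {
  dsBaseClient::ds.ranksSecure(
    input.var.name = variable,                     # assumed argument name
    quantiles.for.estimation = "0.05-0.95",        # passed on as extract.quantiles
    output.ranks.df = "full.ranks.df",             # hypothetical object name
    summary.output.ranks.df = "summary.ranks.df",  # documented default name
    ranks.sort.by = "ID.orig",                     # or "vals.orig"
    datasources = connections
  )
}
```

Calling ds.ranksSecure in this way lets it set up the vectors and scalars that ds.extractQuantiles expects, which is the difficulty the paragraph above warns about.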

Value

the final main output of ds.extractQuantiles is a data frame object named "final.quantile.df". This contains two vectors. The first, named "evaluation.quantiles", lists the full set of quantiles you have requested for evaluation, as specified by the argument "quantiles.for.estimation" in ds.ranksSecure and explained in more detail above under the information for the argument "extract.quantiles" in this function. The second, named "final.quantile.vector", gives the values of V2BR that correspond to the evaluation quantiles in the first vector. The information in the data frame "final.quantile.df" is generic: there is no information identifying in which study each value of V2BR falls. This data frame is written to the clientside (as it is non-disclosive) and is also copied to the serverside in every study. This means it is easily accessible from anywhere in the DataSHIELD environment. For more details see the associated document entitled "secure.global.ranking.docx".
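To make the shape of "final.quantile.df" concrete, the sketch below builds a data frame with the same two column names from a small made-up vector. The toy data and the use of base R's quantile() with type = 1 are assumptions for illustration only; the real values come from the secure ranking procedure, whose estimation rule is described in "secure.global.ranking.docx".

```r
# Toy stand-in for the pooled values of V2BR (made-up numbers);
# in DataSHIELD the individual-level data never leave the servers.
v2br_toy <- c(12, 15, 15, 18, 21, 24, 24, 30, 33, 41)

# The quantiles that the value "quartiles" requests.
evaluation.quantiles <- c(0.25, 0.50, 0.75)

# type = 1 returns observed data values (inverse empirical CDF). This is an
# illustrative choice, not necessarily the rule DataSHIELD applies.
final.quantile.vector <- as.numeric(
  quantile(v2br_toy, probs = evaluation.quantiles, type = 1)
)

# Same two-column layout as described for the real output.
final.quantile.df <- data.frame(evaluation.quantiles, final.quantile.vector)
print(final.quantile.df)
```

Note that the data frame carries no study identifier, which is what makes it safe to return to the clientside.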

Author(s)

Paul Burton 11th November, 2021


datashield/dsBaseClient documentation built on Nov. 16, 2024, 2:07 p.m.