View source: R/ds.heatmapPlot.R
ds.heatmapPlot | R Documentation |
Generates a heat map plot of the pooled data or one plot for each dataset.
ds.heatmapPlot(
x = NULL,
y = NULL,
type = "combine",
show = "all",
numints = 20,
method = "smallCellsRule",
k = 3,
noise = 0.25,
datasources = NULL
)
x |
a character string specifying the name of a numerical vector. |
y |
a character string specifying the name of a numerical vector. |
type |
a character string that represents the type of graph to display:
'combine' or 'split'. Default 'combine'. |
show |
a character string that represents where the plot should be focused:
'all' or 'zoomed'. Default 'all'. |
numints |
the number of intervals for a density grid object.
Default 20. |
method |
a character string that defines which heat map will be created:
'smallCellsRule', 'deterministic' or 'probabilistic'. Default 'smallCellsRule'. |
k |
the number of nearest neighbours whose centroid is calculated.
Default 3. |
noise |
the percentage of the initial variance that is used as the variance of the embedded
noise if the argument method is set to 'probabilistic'. Default 0.25. |
datasources |
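a list of connection objects obtained after login (for example, the object returned by DSI::datashield.login). |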
The ds.heatmapPlot function first generates a density grid and uses it to plot the graph.
Cells of the grid density matrix that hold a count of less than the filter set by
DataSHIELD (usually 5) are considered invalid and turned into 0 to avoid potential
disclosure. A message is printed to inform the user about the number of invalid cells.
The ranges returned by each study and used to build the grid density matrix
are not the exact minimum and maximum values but close approximations of them.
This reduces the risk of potential disclosure.
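The small-cell rule described above can be illustrated with a minimal local sketch. It is not the server-side code: the data are simulated, the threshold of 5 is taken from the "usually 5" remark, the exact range is used instead of an approximated one, and treating only non-zero counts below the threshold as invalid is an assumption.

```r
# Illustrative sketch on simulated data; not the DataSHIELD server-side implementation.
set.seed(1)
x <- rnorm(500)
y <- rnorm(500)

numints <- 20  # number of intervals per axis, as in the 'numints' argument
nfilter <- 5   # assumed disclosure threshold ("usually 5")

# Equal-width interval boundaries over the range of each variable
# (the real function works with approximated ranges, not exact ones)
x_breaks <- seq(min(x), max(x), length.out = numints + 1)
y_breaks <- seq(min(y), max(y), length.out = numints + 1)

# Count the observations that fall into each cell of the numints x numints grid
counts <- table(
  cut(x, breaks = x_breaks, include.lowest = TRUE),
  cut(y, breaks = y_breaks, include.lowest = TRUE)
)
grid <- matrix(as.vector(counts), nrow = numints, ncol = numints)

# Assumption: cells with a non-zero count below the threshold are the invalid ones
invalid <- grid > 0 & grid < nfilter
grid[invalid] <- 0
message("Number of invalid cells: ", sum(invalid))
```

The resulting matrix could then be drawn with base R, for example with image(x_breaks, y_breaks, grid).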
The argument type accepts two types of graphics to display:
'combine': a combined heat map plot is displayed
'split': each heat map is plotted separately
The argument show accepts two options:
'all': the ranges of the variables are used as plot limits
'zoomed': the plot is zoomed to the region where the actual data are
The argument method selects one of three heat maps to be created:
'smallCellsRule': the heat map of the actual variables is created, but grid cells
with low counts are replaced with cells with zero counts
'deterministic': the heat map of the scaled centroids of each point's k nearest
neighbours in the original variables is created, where the value of k is set by the user
'probabilistic': the heat map of 'noisy' variables is generated. The added noise
follows a normal distribution with zero mean and variance equal to a percentage of
the initial variance of each input variable. This percentage is specified by the user
in the argument noise
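To make the 'deterministic' option more concrete, here is a rough local sketch of replacing each point by the scaled centroid of its k nearest neighbours. The data are simulated, and the details (standardising before computing distances, counting the point itself among its neighbours, and rescaling to the original mean and standard deviation) are assumptions rather than a description of the server-side code. The helper rescale is hypothetical.

```r
# Illustrative sketch on simulated data; details are assumptions, not the dsBase code.
set.seed(1)
n <- 200
x <- rnorm(n, mean = 5, sd = 2)
y <- 0.5 * x + rnorm(n)
k <- 3
xy <- cbind(x, y)

# Standardise so that both variables contribute equally to the distances
d <- as.matrix(dist(scale(xy)))

# Centroid of the k nearest points of each observation
# (assumption: the observation itself counts as one of the k)
centroids <- t(sapply(seq_len(n), function(i) {
  nearest <- order(d[i, ])[seq_len(k)]
  colMeans(xy[nearest, , drop = FALSE])
}))

# Hypothetical helper: rescale centroids to the mean and spread of the original variable
rescale <- function(centroid, original) {
  mean(original) + (centroid - mean(centroid)) * sd(original) / sd(centroid)
}
x_new <- rescale(centroids[, 1], x)
y_new <- rescale(centroids[, 2], y)
```

A heat map would then be built from x_new and y_new instead of x and y, so that no plotted position corresponds directly to a single original record.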
In the k argument the user can choose any value equal to or greater than the
pre-specified threshold used as a disclosure control for this method and lower than
the number of observations minus the value of this threshold. By default the value
of k is set to 3 (we suggest k to be equal to, or bigger than, 3). Note that the
function fails if the user uses the default value but the study has set a bigger threshold.
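The admissible range for k stated above can be written as a small check. The function is_valid_k is a hypothetical helper for illustration only, and the threshold values passed to it are made up; the real threshold is set by each study.

```r
# Hypothetical helper: TRUE when k lies in the range described in the text,
# i.e. threshold <= k < (number of observations - threshold).
is_valid_k <- function(k, n_obs, threshold = 3) {
  k >= threshold && k < (n_obs - threshold)
}

is_valid_k(3, n_obs = 100)                 # TRUE: default k with a threshold of 3
is_valid_k(3, n_obs = 100, threshold = 5)  # FALSE: the study set a bigger threshold
is_valid_k(98, n_obs = 100)                # FALSE: 98 is not lower than 100 - 3
```

The second call mirrors the failure case mentioned in the text, where the default k is used against a study with a stricter threshold.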
The value of k is used only if the argument method is set to 'deterministic'.
Any value of k is ignored if the argument method is set to 'probabilistic'
or 'smallCellsRule'.
The value of noise is used only if the argument method is set to 'probabilistic'.
Any value of noise is ignored if the argument method is set to 'deterministic'
or 'smallCellsRule'.
The user can choose any value for noise equal to or greater than the pre-specified
threshold 'nfilter.noise'.
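As a minimal local sketch of the noise mechanism described for the 'probabilistic' method, the snippet below adds zero-mean normal noise whose variance is a fraction of the variable's own variance. The data are simulated and the variable names are illustrative; this is not the server-side code.

```r
# Illustrative sketch on simulated data; not the DataSHIELD server-side implementation.
set.seed(1)
x <- rnorm(1000, mean = 10, sd = 3)
noise <- 0.25  # fraction of the initial variance, as in the 'noise' argument

# Normal noise with zero mean and variance equal to noise * var(x)
noise_sd <- sqrt(noise * var(x))
x_noisy <- x + rnorm(length(x), mean = 0, sd = noise_sd)
```

With noise = 0.25 the noisy variable has, in expectation, 1.25 times the variance of the original, which is why larger values of noise blur the heat map more strongly.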
Server function called: heatmapPlotDS
ds.heatmapPlot returns to the client-side a heat map plot and a message specifying
the number of invalid cells in each study.
DataSHIELD Development Team
## Not run:
## Version 6, for version 5 see the Wiki
# Connecting to the Opal servers
require('DSI')
require('DSOpal')
require('dsBaseClient')
builder <- DSI::newDSLoginBuilder()
builder$append(server = "study1",
               url = "http://192.168.56.100:8080/",
               user = "administrator", password = "datashield_test&",
               table = "CNSIM.CNSIM1", driver = "OpalDriver")
builder$append(server = "study2",
               url = "http://192.168.56.100:8080/",
               user = "administrator", password = "datashield_test&",
               table = "CNSIM.CNSIM2", driver = "OpalDriver")
builder$append(server = "study3",
               url = "http://192.168.56.100:8080/",
               user = "administrator", password = "datashield_test&",
               table = "CNSIM.CNSIM3", driver = "OpalDriver")
logindata <- builder$build()
# Log onto the remote Opal training servers
connections <- DSI::datashield.login(logins = logindata, assign = TRUE, symbol = "D")
# Compute the heat map plot
# Example 1: Plot a combined (default) heat map plot of the variables 'LAB_TSC'
# and 'LAB_HDL' using the method 'smallCellsRule' (default)
ds.heatmapPlot(x = 'D$LAB_TSC',
               y = 'D$LAB_HDL',
               datasources = connections) # all servers are used
# Example 2: Plot a split heat map plot of the variables 'LAB_TSC'
# and 'LAB_HDL' using the method 'smallCellsRule' (default)
ds.heatmapPlot(x = 'D$LAB_TSC',
               y = 'D$LAB_HDL',
               method = 'smallCellsRule',
               type = 'split',
               datasources = connections[1]) # only the first server is used (study1)
# Example 3: Plot a split heat map plot using the method 'deterministic', i.e. the
# centroids of each k = 7 nearest neighbours, for numints = 40
ds.heatmapPlot(x = 'D$LAB_TSC',
               y = 'D$LAB_HDL',
               numints = 40,
               method = 'deterministic',
               k = 7,
               type = 'split',
               datasources = connections[2]) # only the second server is used (study2)
# Clear the DataSHIELD R sessions and log out
DSI::datashield.logout(connections)
## End(Not run)