These functions implement the cghRA workflow as a sequence of process sub-function calls. Each of them relies on cghRA.array and cghRA.regions methods, so custom processing can easily be achieved by calling these methods directly if the steps argument is not flexible enough for your purpose.
Custom steps can also be added on the model of the existing ones, by defining a function called process.NAME and adding "NAME" to the steps vector in the call to process. Step functions need to handle at least an input parameter, which receives the value returned by the previous step, thus forming a pipeline.
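To illustrate the naming convention, here is a minimal sketch in base R. The two step functions (`process.double`, `process.clip`) and the toy dispatcher `runSteps` are hypothetical names invented for this example; the dispatcher is not the cghRA implementation, it only mimics the chaining described above, where each step receives the previous step's output through `input`.

```r
# Hypothetical custom steps: each takes 'input' and returns the value
# handed to the next step. '...' absorbs arguments meant for other steps.
process.double <- function(input, ...) {
  input * 2
}

process.clip <- function(input, limit = 5, ...) {
  pmin(input, limit)
}

# Toy dispatcher (an assumption for illustration, NOT cghRA code): looks up
# "process.<step>" by name and feeds each step the previous step's output.
runSteps <- function(input, steps, ...) {
  for (step in steps) {
    stepFun <- get(paste("process", step, sep = "."), mode = "function")
    input <- stepFun(input = input, ...)
  }
  input
}

result <- runSteps(c(1, 2, 3, 4), steps = c("double", "clip"), limit = 5)
print(result)
```

With cghRA itself, the equivalent would be to define the `process.NAME` function in the global environment and include "NAME" at the desired position of the steps vector passed to process.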
The tk.process function is a wrapper for process, built around a Tcl-Tk interface for more user-friendliness.
The process function is a multi-core command line interface that dispatches its arguments to individual process.core calls, and should be the preferred entry point even on single-core computers. process.log is a wrapper for process.core which captures warnings and errors into a log file.
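The general idea of capturing warnings and errors into a log file can be sketched in base R as follows. The helper `withLog` is hypothetical and is not how process.log is actually written; it only shows one common way to redirect conditions to a file, assuming the log is emptied at the start of each run.

```r
# Hypothetical helper, not cghRA code: runs 'fun' and writes any warning
# or error message to 'logFile' instead of letting it propagate.
withLog <- function(fun, ..., logFile) {
  cat("", file = logFile)  # start from an empty log file
  tryCatch(
    withCallingHandlers(
      fun(...),
      warning = function(w) {
        cat("WARNING: ", conditionMessage(w), "\n", sep = "",
            file = logFile, append = TRUE)
        invokeRestart("muffleWarning")  # resume the step after logging
      }
    ),
    error = function(e) {
      cat("ERROR: ", conditionMessage(e), "\n", sep = "",
          file = logFile, append = TRUE)
      invisible(NULL)
    }
  )
}

# Toy step that warns, then fails
faultyStep <- function(x) {
  warning("low signal")
  stop("bad file: ", x)
}

logFile <- tempfile(fileext = ".log")
withLog(faultyStep, "a.txt", logFile = logFile)
logLines <- readLines(logFile)
print(logLines)
```

Warnings are handled with a calling handler so that the step can resume after the message is logged, whereas an error ends the step.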
The process.default function is a common way for process and tk.process to obtain default values for complex arguments like 'segmentArgs' and 'modelizeArgs'. It can be used to obtain in process the profiles proposed by tk.process.
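A minimal usage sketch, assuming the cghRA package is installed and exports process.default as documented here. The exact structure of the returned values is not detailed on this page, so the example only inspects them with `str()`.

```r
# Runs only if cghRA is available; otherwise the block is skipped.
hasCghRA <- requireNamespace("cghRA", quietly = TRUE)

if (hasCghRA) {
  # No argName: list of the profiles available for each handled argument
  profiles <- cghRA::process.default()
  str(profiles)

  # Default value of 'segmentArgs' (profileName missing: default profile)
  segmentArgs <- cghRA::process.default("segmentArgs")
  str(segmentArgs)
}
```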
process(inputs, logFile = "process.log", cluster = NA, ...)
process.log(..., logFile)
process.core(input, inputName, steps = c("parse", "mask", "replicates", "waca",
"export", "spatial", "segment", "fill", "modelize", "export", "fittest", "export",
"applyModel", "export"), ...)
process.parse(input, design, probeParser = Agilent.probes, probeArgs = list(), ...)
process.probes(input, design, ...)
process.regions(input, ...)
process.mask(input, ...)
process.replicates(input, replicateFun = stats::median, ...)
process.waca(input, ...)
process.spatial(input, outDirectory, ...)
process.segment(input, segmentArgs = process.default("segmentArgs"), ...)
process.fill(input, ...)
process.modelize(input, modelizeArgs = process.default("modelizeArgs"), ...)
process.applyModel(input, ...)
process.fittest(input, ...)
process.export(input, outDirectory, ...)
tk.process(globalTopLevel, localTopLevel)
process.default(argName, profileName)
inputs |
List of |
logFile |
Single character value, the path to the log file to produce with messages, warnings and errors. If the file already exists, it will be emptied first. The behavior when |
cluster |
Arguments to be passed to |
... |
Further arguments to be passed to |
input |
A single input to process on one node. The default workflow expects it to be a single character value naming a raw data file to be parsed. |
inputName |
Single character value, the name of the input currently processed (for logging only). |
steps |
Ordered character vector, naming the processing steps to apply. Custom steps can be named as well, as long as a function named "process.[step]" exists in the global environment. Each step will take as input the output of the previous step, the first step taking the value of the |
probeParser |
The function to parse |
probeArgs |
A list of arguments to pass to |
design |
Single character vector, the path and name of the RDT design file, as produced by |
replicateFun |
The function to apply to replicate groups, if the "replicates" step is to be applied. This function must take a vector of numeric values (logRatios) as input, and return a single representative value (typically |
outDirectory |
Single character value, the directory in which to produce the output files. |
segmentArgs |
Character vector, the arguments to be passed to the |
modelizeArgs |
Single character value, the arguments to be passed to the |
argName |
Single character value, 'segmentArgs' or 'modelizeArgs', the argument to get the default value for. If missing, the list of profiles and arguments handled is returned. |
profileName |
Single character value, altering the default values returned. If missing, the default profile is returned. |
globalTopLevel |
This argument should be filled only when embedding this Tcl-Tk interface in another. It is the top level of the embedding interface, generally a call to |
localTopLevel |
This argument should be filled only when embedding this Tcl-Tk interface in another. It is the local top level to use to build this interface, generally a |
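As an example of the replicateFun contract described above (a numeric vector of logRatios in, a single representative value out), here is a hypothetical alternative to the default stats::median. The function name `trimmedMean`, the 20% trimming and the handling of missing values are assumptions made for illustration.

```r
# Hypothetical replicateFun: a trimmed mean that ignores missing logRatios
# and returns NA when the whole replicate group is missing.
trimmedMean <- function(logRatios) {
  logRatios <- logRatios[!is.na(logRatios)]
  if (length(logRatios) == 0L) return(NA_real_)
  mean(logRatios, trim = 0.2)
}

# One replicate group with two outliers and a missing value
groupValue <- trimmedMean(c(-2, 0.1, 0.2, 0.3, 5, NA))
print(groupValue)
```

Such a function could then be supplied through the replicateFun argument in place of the default.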
Only process.default returns a value: if argName is provided, it returns the default value for the queried argument; otherwise it returns a list of the profiles available for each handled argument. When several profiles are handled, the first value in the list is the default one (returned when profileName is missing).
The complete workflow involves the following steps:
Read a raw data file and return a cghRA.array object.
Read a cghRA.probes object stored in an RDT file and return a cghRA.array object.
Read one or several cghRA.regions objects stored in RDT file(s).
Discard flagged probes (saturated, high background ...) in a cghRA.array object. Any TRUE value in a column whose name begins with "flag_" is enough to discard a probe (turn its logRatio into NA). See the cghRA.array$maskByFlag() method for further details.
Replace replicated probe groups (same "name") by a single representative value (all logRatios are turned to NA except for the first one, which will hold the representative value). See the cghRA.array$replicates() method for further details.
Apply the WACA algorithm to the logRatios. See the cghRA.array$WACA() method for further details.
Produce a PNG file to visually check spatial biases. See the cghRA.array$spatial() method for further details.
Compute regions with similar logRatios along the genome, using the CBS algorithm. See the cghRA.array$DNAcopy() method for further details.
Extend segments to the right to join consecutive segments. See the cghRA.regions$fillGaps() method for further details.
Fit a copy number model to segments, in order to convert logRatios to true copy numbers. If segmentArgs contains multiple values, each segmentation profile will lead to distinct "copies" and "regions" files numbered according to its position in segmentArgs. See the cghRA.regions$model.auto() method for further details.
Convert a modelized cghRA.regions object into cghRA.copies.
If multiple segmentation profiles have been used, select the fittest model ("copies" and "regions" files duplicated without number). For further details on the STM score used for fittest model selection, see the model.auto function of the cghRA.copies package.
Erase "copies" and "regions" files of the different segmentation profiles tested, as "fittest" should have saved the best.
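The masking and replicate rules described in the steps above can be sketched on a plain data.frame. This is an illustration in base R, not the code of the cghRA.array methods: the toy table, the use of stats::median with `na.rm = TRUE`, and the treatment of single-probe groups are assumptions.

```r
# Toy probe table (hypothetical data)
probes <- data.frame(
  name = c("A", "A", "B", "C", "C"),
  logRatio = c(0.10, 0.30, -0.50, 1.20, 0.80),
  flag_saturated = c(FALSE, FALSE, TRUE, FALSE, FALSE),
  flag_background = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Mask: any TRUE in a column named "flag_*" turns the logRatio into NA
flagCols <- grep("^flag_", names(probes), value = TRUE)
flagged <- Reduce(`|`, probes[flagCols])
probes$logRatio[flagged] <- NA

# Replicates: within each group of probes sharing a name, the first probe
# holds the representative value and the others are turned to NA
representative <- tapply(probes$logRatio, probes$name,
                         stats::median, na.rm = TRUE)
first <- !duplicated(probes$name)
replicated <- probes$name %in% probes$name[duplicated(probes$name)]
probes$logRatio[replicated & first] <-
  as.numeric(representative[probes$name[replicated & first]])
probes$logRatio[replicated & !first] <- NA

print(probes[, c("name", "logRatio")])
```

Here probe "B" is masked because it is flagged as saturated, group "A" is summarized by the median of its two values, and group "C" keeps only its unflagged value.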
Sylvain Mareschal