This is the main function to run the TDCOR algorithm and reconstruct the gene network topology.
dataset |
Numerical matrix storing the non-log2 transcriptomic data (average of replicates). The rows of this matrix must be named by gene codes (e.g. the AGI gene code for Arabidopsis datasets). The columns must be organized in chronological order from the left to the right. |
l_genes |
A character vector containing the (AGI) gene codes of the genes one wishes to build the network with (gene codes, e.g. "AT5G26930", as opposed to gene names, e.g. "GATA23", which are provided by the optional |
TPI |
A TPI database generated by |
DPI |
A DPI database generated by |
... |
Additional arguments to be passed to the TDCOR function (Some are necessary if
|
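As a rough illustration of the input requirements listed above, the sketch below builds a toy dataset and checks it before a run. Everything in it is made up for the example: the expression values, the time points, the two placeholder gene codes (only AT5G26930 comes from the text), and the helper check_tdcor_inputs, which is hypothetical and not part of TDCor. The checks that every requested gene is present in the matrix and that there is one time point per column are assumptions, not documented requirements.

```r
# Toy stand-in for a real transcriptomic dataset (all values are made up).
toy_times <- c(0, 6, 12, 18)  # hypothetical time points, chronological order
toy_dataset <- matrix(
  c(10, 40, 80, 60,
     5,  6, 20, 45,
    30, 28, 12,  8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("AT5G26930", "AT0G00001", "AT0G00002"), NULL)
)
toy_l_genes <- c("AT5G26930", "AT0G00002")

# Hypothetical helper (not a TDCor function): checks the stated requirements.
check_tdcor_inputs <- function(dataset, l_genes, times) {
  stopifnot(
    is.matrix(dataset), is.numeric(dataset),
    !is.null(rownames(dataset)),          # rows named by gene codes
    is.character(l_genes),
    all(l_genes %in% rownames(dataset)),  # assumption: requested genes are present
    length(times) == ncol(dataset),       # assumption: one time point per column
    !is.unsorted(times)                   # columns ordered chronologically
  )
  invisible(TRUE)
}

check_tdcor_inputs(toy_dataset, toy_l_genes, toy_times)
```

If the data at hand are stored as log2(x), applying 2^x restores the non-log2 scale the function expects; a different transform (for instance one with a pseudocount) would need its own inverse.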
The default values are unlikely to be the best values to work with. The TDCOR parameters have to be optimized by the user based on their own knowledge of the network, the quality of the data, etc. Because TDCOR works by pruning interactions, it is probably easier, as a first pass, to optimize the parameter values following the order of the filters.
Before starting, inactivate all the filters by using the least stringent parameter values possible or, for the MRST filter, by setting search.EP to FALSE. You should also set the bootstrap parameters to relatively low values (e.g. n0=100 and n1=1). The runs will then be quick, and you will be able to rapidly assess whether the changes you made to the parameter values were an improvement.
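The quick-run settings can be kept in one place and merged with whatever parameters are being tried, as in the minimal sketch below. The names quick_args, trial_args and run_args are hypothetical; the values for delayspan and tol are simply borrowed from the example at the end of this page. The final call is left as a comment because it needs the package, a dataset and the TPI/DPI databases.

```r
# Settings suggested above for fast exploratory runs.
quick_args <- list(
  search.EP = FALSE,  # switch the MRST filter off
  n0 = 100,           # low bootstrap values keep the runs short
  n1 = 1
)

# Hypothetical set of parameters currently being tuned.
trial_args <- list(delayspan = 12, tol = 0.13)

# modifyList() merges the two lists; values in the second list win on conflict.
run_args <- modifyList(quick_args, trial_args)
str(run_args)

# With the package loaded and the data prepared, the call could then look like:
# tdcor_out <- do.call(TDCOR, c(list(dataset = LR_dataset, l_genes = l_genes,
#                                    TPI = TPI10, DPI = DPI15), run_args))
```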
Start by optimizing the parameters involved in time shift estimation, that is, essentially delayspan, time_step, tol and delaymax. The latter (together with delaymin) is a biological parameter and its range of possible values is arguably limited, though it ought to be adapted to the organism (e.g. in prokaryotes, the delays are extremely short since polysomes couple transcription and translation). Note that the estimate.delay function can be very helpful for optimizing these parameters thanks to its visual output. Use it with pairs of genes that have been shown to interact directly or indirectly in your system and for which the relationship in the dataset is clearly linear. For network reconstruction with TDCor, good time shift estimation is absolutely crucial.
Once this is done, proceed with optimizing the threshold for correlation thr_cor and the thresholds on the index of directness (thr_ind1, thr_ind2). Then optimize the parameters of the triangle and diamond pruning filters (thrpTPI and thrpDPI). You may have to try a couple of different TPI and DPI databases (i.e. databases built with different input parameters). In particular, increasing the noise level when generating these databases makes it possible to decrease the stringency of the triangle and diamond filters when increasing the thrpTPI and thrpDPI values is not sufficient. Subsequently, fine-tune the parameters of the MRST filter (thr_bool_EP, MinTarNumber, MinProp, MaxEPNumber) if you want it on. Remember to set search.EP back to TRUE first. Next, optimize thr_isr (self-regulation). Finally, restrict the maximum number of regulators if necessary (regmax).
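The tuning order described above can be written down as a small checklist, which may help keep track of which stage has been dealt with. This is only a sketch: the stage labels (time_shift, correlation, and so on) are hypothetical names chosen for the example, while the parameter names are those given in the text.

```r
# Tuning order described above, as a named list (stage label -> parameters).
# The stage labels are made up for this sketch; the parameter names are TDCOR's.
tuning_order <- list(
  time_shift       = c("delayspan", "time_step", "tol", "delaymax", "delaymin"),
  correlation      = "thr_cor",
  directness       = c("thr_ind1", "thr_ind2"),
  triangle_diamond = c("thrpTPI", "thrpDPI"),
  MRST             = c("thr_bool_EP", "MinTarNumber", "MinProp", "MaxEPNumber"),
  self_regulation  = "thr_isr",
  max_regulators   = "regmax"
)

# Print the checklist, one numbered stage per line.
for (i in seq_along(tuning_order)) {
  cat(sprintf("%d. %s: %s\n", i, names(tuning_order)[i],
              paste(tuning_order[[i]], collapse = ", ")))
}
```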
The TDCOR main function returns a list containing 7 elements:
input |
A list containing the input parameters (as a reminder). |
intermediate |
A list containing three intermediate matrices. |
network |
A matrix containing the network. The element [i,j] of this matrix contains the bootstrap value for the edge "gene j to gene i". The sign indicates the sign of the predicted interaction. |
ID |
A matrix containing the computed indices of directness (ID). The element [i,j] contains the ID for the edge "gene j to gene i". |
delay |
A matrix containing the computed time shifts. The element [i,j] of this matrix contains the estimated time shift between the profile of gene j and the profile of gene i. |
EP |
A vector containing the bootstrap values for the MRST predictions. |
predictions |
The edge predictions in the form of a table. The columns are organized in the following order: Regulator name, Type of interaction (+ or -), Target name, Bootstrap, Index of Directness, Estimated time shift between the target and regulator profiles. |
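The [i,j] convention of the network matrix (column j regulates row i, with the sign giving the type of interaction) is easy to get backwards, so here is a minimal sketch that turns such a matrix into an edge list. The matrix is a hand-made toy with hypothetical genes A, B and C, not real TDCOR output, and taking the absolute value as the bootstrap value is an interpretation of the description above.

```r
# Toy network matrix (made up): element [i, j] describes the edge "gene j to gene i".
genes <- c("A", "B", "C")
net <- matrix(0, nrow = 3, ncol = 3, dimnames = list(genes, genes))
net["B", "A"] <- 0.9   # A -> B, positive interaction
net["C", "B"] <- -0.7  # B -> C, negative interaction

# Locate the non-zero entries; each row of idx is a (row, col) pair.
idx <- which(net != 0, arr.ind = TRUE)

edges <- data.frame(
  regulator = colnames(net)[idx[, "col"]],  # column index = regulator (gene j)
  type      = ifelse(net[idx] > 0, "+", "-"),
  target    = rownames(net)[idx[, "row"]],  # row index = target (gene i)
  bootstrap = abs(net[idx]),                # assumption: magnitude is the bootstrap value
  stringsAsFactors = FALSE
)
print(edges)
```

On a real result, the same indexing would be applied to the network element of the returned list.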
The table of predictions (without header) and the input parameters are printed at the end of the run in two separate text files located in the current R working directory (if you are not sure which directory this is, use the command getwd()).
For a parameter to be involved in the bootstrapping process, one must feed the function a vector containing two values as input. These two values are respectively the lower and upper boundaries of the bootstrapping interval. If one chooses not to use a parameter for bootstrapping, one can either feed the function an input vector containing the same value twice, or a single value.
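The three accepted input shapes can be illustrated with a small normalizing helper. This is a sketch of the convention only: as_bootstrap_interval is a hypothetical function written for this example and says nothing about how TDCOR handles the values internally.

```r
# Hypothetical helper (not part of TDCor): expresses a parameter as a
# (lower, upper) interval following the convention described above.
as_bootstrap_interval <- function(x) {
  stopifnot(is.numeric(x), length(x) %in% c(1, 2))
  if (length(x) == 1) x <- rep(x, 2)  # a single value means a fixed parameter
  stopifnot(x[1] <= x[2])             # lower boundary first, then upper
  x
}

as_bootstrap_interval(c(0.65, 0.8))   # bootstrapped between 0.65 and 0.8
as_bootstrap_interval(c(0.65, 0.65))  # fixed: the same value twice
as_bootstrap_interval(0.65)           # fixed: a single value, same meaning
```

In the example at the end of this page, thr_cor=c(0.65,0.8) is therefore bootstrapped, whereas thr_ind1=0.65 stays fixed.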
Julien Lavenus jl.tdcor@gmail.com
Lavenus et al., 2015, The Plant Cell
See also CalculateDPI, CalculateTPI, UpdateDPI, UpdateTPI, TDCor-package.
## Not run:
# Load the lateral root transcriptomic dataset
data(LR_dataset)
# Load the vectors of gene codes, gene names and prior
data(l_genes)
data(l_names)
data(l_prior)
# Load the vector of time points for the LR_dataset
data(times)
# Generate the DPI databases
DPI15=CalculateDPI(dataset=LR_dataset,l_genes=l_genes,l_prior=l_prior,
times=times,time_step=1,N=10000,ks_int=c(0.5,3),kd_int=c(0.5,3),delta_int=c(0.5,3),
noise=0.15,delay=3)
# Generate the TPI databases
TPI10=CalculateTPI(dataset=LR_dataset,l_genes=l_genes,
l_prior=l_prior,times=times,time_step=1,N=10000,ks_int=c(0.5,3),
kd_int=c(0.5,3),delta_int=c(0.5,3),noise=0.1,delay=3)
# Check/update if necessary the databases (Not necessary here though.
# This is just to illustrate how it would work.)
TPI10=UpdateTPI(TPI10,LR_dataset,l_genes,l_prior)
DPI15=UpdateDPI(DPI15,LR_dataset,l_genes,l_prior)
### Choose your TDCOR parameters ###
# Parameters for time shift estimation
# and filter on time shift value
ptime_step=1
ptol=0.13
pdelayspan=12
pdelaymax=c(2.5,3.5)
pdelaymin=0
# Parameter of the correlation filter
pthr_cor=c(0.65,0.8)
# Parameters of the ID filter
pdelay=3
pthr_ind1=0.65
pthr_ind2=3.5
# Parameter of the overlap filter
pthr_overlap=c(0.4,0.6)
# Parameters of the triangle and diamond filters
pthrpTPI=c(0.55,0.8)
pthrpDPI=c(0.65,0.8)
pTPI=TPI10
pDPI=DPI15
# Parameter for identification of self-regulations
pthr_isr=c(4,6)
# Parameters for MRST identification
pMinTarNumber=5
pMinProp=0.6
# Max number of regulators
pregmax=5
# Bootstrap parameters
pn0=1000
pn1=10
# Name of the file to print network in
poutfile_name="TDCor_output.txt"
### Reconstruct the network ###
tdcor_out= TDCOR(dataset=LR_dataset,l_genes=l_genes,l_names=l_names,n0=pn0,n1=pn1,
l_prior=l_prior,thr_ind1=pthr_ind1,thr_ind2=pthr_ind2,regmax=pregmax,thr_cor=pthr_cor,
delayspan=pdelayspan,delaymax=pdelaymax,delaymin=pdelaymin,delay=pdelay,thrpTPI=pthrpTPI,
thrpDPI=pthrpDPI,TPI=pTPI,DPI=pDPI,thr_isr=pthr_isr,time_step=ptime_step,
thr_overlap=pthr_overlap,tol=ptol,MinProp=pMinProp,MinTarNumber=pMinTarNumber,
outfile_name=poutfile_name)
## End(Not run)