
View source: R/CE.Normal.Init.Mean.R

Estimates break-point locations when initial values for them are given. The normal distribution is used to model the observed continuous data, and the standard deviation is assumed to be the same across segments. The function supports simulating break-point locations from either the four parameter beta distribution or the truncated normal distribution. The user can select the modified BIC (mBIC) proposed by Zhang and Siegmund (2007), BIC, or AIC to obtain the optimal number of break-points.

```
CE.Normal.Init.Mean(data, init.locs, eps = 0.01, rho = 0.05, M = 200, h = 5, a = 0.8,
b = 0.8, distyp = 1, penalty = "mBIC", var.init = 1e+05, parallel = FALSE)
```

`data` |
data to be analysed. A single-column array or a data frame. |

`init.locs` |
Initial break-point locations. |

`eps` |
the cut-off value for the stopping criterion in the CE method. Default value is 0.01. |

`rho` |
the fraction which is used to obtain the best performing set of sample solutions (i.e., elite sample). Default value is 0.05. |

`M` |
sample size to be used in simulating the locations of break-points. Default value is 200. |

`h` |
minimum aberration width. Default is 5. |

`a` |
a smoothing parameter value. It is used in the four parameter beta distribution to smooth both shape parameters. When simulating from the truncated normal distribution, this value is used to smooth the estimates of the mean values. Default is 0.8. |

`b` |
a smoothing parameter value. It is used in the truncated normal distribution to smooth the estimates of the standard deviation. Default is 0.8. |

`distyp` |
distribution to simulate break-point locations. Options: 1 = four parameter beta distribution, 2 = truncated normal distribution. Default is 1. |

`penalty` |
User can select from mBIC, BIC or AIC to obtain the optimal number of break-points. Options: "mBIC", "BIC" and "AIC". Default is "mBIC". |

`var.init` |
Initial variance value to facilitate the search process. Default is 100000. |

`parallel` |
A logical argument specifying whether parallel computation should be carried out (TRUE) or not (FALSE). Default is FALSE. On Windows, "snow" functionalities are used, whereas on Unix/Linux/Mac OS X, "multicore" functionalities are used to carry out parallel computations with the maximum number of cores available. |
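To make the roles of `M`, `rho`, `a` and `b` more concrete, the following is a minimal sketch of one generic cross-entropy update step for a single break-point location. It is not the package's internal code: the toy score, the use of `ceiling()` for the elite size, and the exact form of the smoothing update are assumptions chosen for illustration.

```r
# Illustrative sketch only; not the breakpoint package's internal code.
M   <- 200   # number of candidate solutions simulated per iteration
rho <- 0.05  # fraction of candidates kept as the elite sample
a   <- 0.8   # smoothing weight applied to the mean estimate
b   <- 0.8   # smoothing weight applied to the standard deviation estimate

set.seed(1)
# Hypothetical candidates for one break-point location, with a made-up score
# that is highest when the location is close to 200.
old_mean <- 150
old_sd   <- 20
locs     <- round(rnorm(M, mean = old_mean, sd = old_sd))
scores   <- -abs(locs - 200)

# Keep the best-scoring rho * M candidates (assumed rounding: ceiling).
n_elite <- ceiling(rho * M)
elite   <- locs[order(scores, decreasing = TRUE)[seq_len(n_elite)]]

# Smoothed update: blend the elite-sample estimates with the previous values.
new_mean <- a * mean(elite) + (1 - a) * old_mean
new_sd   <- b * sd(elite)   + (1 - b) * old_sd
```

With the defaults `M = 200` and `rho = 0.05`, this keeps 10 candidates per iteration. Smoothing weights closer to 1 let the sampling distribution follow the elite sample more quickly.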

The normal distribution is used to model the continuous data. Candidate solutions are simulated around the user-provided initial locations from the chosen statistical distribution (four parameter beta distribution or truncated normal distribution), and a performance function score (mBIC/BIC/AIC) is calculated for each candidate. The solution that maximizes the selection criterion with respect to the number of break-points is reported as the optimal solution. Finally, a list containing a vector of break-point locations, the number of break-points, the mBIC/BIC/AIC value and the log-likelihood value is returned.
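As a rough illustration of how a candidate set of break-points can be scored under a normal model with a common standard deviation, the sketch below computes the maximized log-likelihood and a BIC-style penalized score written in "larger is better" form. The helper names `loglik_common_sd` and `bic_score`, the parameter count, and the convention that a break-point at location k starts a new segment at index k are assumptions made here; the package's own mBIC/BIC/AIC formulas may differ.

```r
# Illustrative sketch only; not the breakpoint package's internal code.
set.seed(123)
x <- c(rnorm(200, 100, 5), rnorm(100, 300, 5), rnorm(300, 150, 5))

# Maximized normal log-likelihood with segment-specific means and one
# common variance. Assumption: a break-point at k starts a segment at index k.
loglik_common_sd <- function(x, bps) {
  n      <- length(x)
  starts <- c(1, bps)
  ends   <- c(bps - 1, n)
  seg    <- rep(seq_along(starts), times = ends - starts + 1)
  resid  <- x - ave(x, seg)        # subtract each segment's own mean
  sigma2 <- sum(resid^2) / n       # pooled (common) variance estimate
  -n / 2 * (log(2 * pi * sigma2) + 1)
}

# BIC-style score to be maximized. Assumed parameter count:
# (N + 1) segment means + 1 variance + N break-point locations.
bic_score <- function(x, bps) {
  n_par <- 2 * length(bps) + 2
  loglik_common_sd(x, bps) - 0.5 * n_par * log(length(x))
}

score_none <- bic_score(x, integer(0))    # no break-points
score_init <- bic_score(x, c(150, 380))   # rough initial locations
score_true <- bic_score(x, c(201, 301))   # locations used to simulate x
```

On this simulated series the score at the simulating locations exceeds both the score at the rough initial guesses and the score with no break-points, which is the ordering the search relies on.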

A list is returned with the following items:

`No.BPs` |
The number of break-points |

`BP.Loc` |
A vector of break-point locations |

`mBIC/BIC/AIC` |
mBIC/BIC/AIC value |

`ll` |
Loglikelihood of the optimal solution |
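The returned break-point locations can be used to split the series into segments. The sketch below uses a hand-built list with the documented element names `No.BPs` and `BP.Loc` in place of a real result; the values are made up, and it again assumes that a break-point at location k is the first index of a new segment.

```r
# Illustrative sketch only; 'res' is a hypothetical stand-in for a fitted result.
set.seed(42)
x <- c(rnorm(200, 100, 5), rnorm(100, 300, 5), rnorm(300, 150, 5))

# Made-up values; only the element names follow the documentation above.
res <- list(No.BPs = 2, BP.Loc = c(201, 301))

# Assign each observation to a segment (assumed convention: a break-point
# location is the first index of the next segment).
seg_id <- findInterval(seq_along(x), res$BP.Loc) + 1

seg_means <- tapply(x, seg_id, mean)  # one mean per segment
seg_sizes <- table(seg_id)            # number of observations per segment
```

A result with `No.BPs` break-points yields `No.BPs + 1` segments, so `seg_means` has one more entry than there are break-points.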

Priyadarshana, W.J.R.M. <mjayawardana@swin.edu.au>

Priyadarshana, W. J. R. M. and Sofronov, G. (2015) Multiple Break-Points Detection in Array CGH Data via the Cross-Entropy Method, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 12 (2), pp. 487-498.

Priyadarshana, W. J. R. M. and Sofronov, G. (2012) A Modified Cross-Entropy Method for Detecting Multiple Change-Points in DNA Count Data, In Proc. of the IEEE Conference on Evolutionary Computation (CEC), 1020-1027, DOI: 10.1109/CEC.2012.6256470.

Rubinstein, R., and Kroese, D. (2004) The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning. Springer-Verlag, New York.

Zhang, N.R., and Siegmund, D.O. (2007) A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data. Biometrics, 63, 22-32.

`CE.Normal.Mean` for CE with normal,

`CE.Normal.MeanVar` for CE with normal to detect break-points in both mean and variance,

`CE.Normal.Init.MeanVar` for CE with normal to detect break-points in both mean and variance with initial locations,

`profilePlot` to obtain mean profile plot.

```
## Not run:
simdata <- as.data.frame(c(rnorm(200, 100, 5), rnorm(100, 300, 5), rnorm(300, 150, 5)))
## CE with four parameter beta distribution with mBIC as the selection criterion ##
obj1 <- CE.Normal.Init.Mean(simdata, init.locs = c(150, 380), distyp = 1, parallel = TRUE)
profilePlot(obj1, simdata)
## CE with truncated normal distribution with mBIC as the selection criterion ##
obj2 <- CE.Normal.Init.Mean(simdata, init.locs = c(150, 380), distyp = 2, parallel = TRUE)
profilePlot(obj2, simdata)
## End(Not run)
```

breakpoint documentation built on May 19, 2017, 8:10 a.m.
