fsmmmdrs: Performs random start monitoring of minimum Mahalanobis...
In fsdaR: Robust Data Analysis Through Monitoring and Dynamic Visualization

fsmmmdrs

R Documentation

Performs random start monitoring of minimum Mahalanobis distance

Description

The trajectories originate from many different random initial subsets and provide information on the presence of groups in the data. Groups are investigated by monitoring the minimum Mahalanobis distance outside the forward search subset.
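The monitored quantity can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's implementation: the helper name `min_md_outside` is hypothetical, and the subset is assumed to be given as a vector of row indices. It estimates the mean and covariance from the units inside the subset and returns the smallest Mahalanobis distance among the units outside it.

```r
# Hypothetical helper (not part of fsdaR): minimum Mahalanobis distance
# among the units that are NOT in the current subset.
min_md_outside <- function(x, subset_idx) {
  x <- as.matrix(x)
  inside <- x[subset_idx, , drop = FALSE]
  center <- colMeans(inside)          # location estimated from the subset
  covmat <- cov(inside)               # scatter estimated from the subset
  outside <- setdiff(seq_len(nrow(x)), subset_idx)
  # stats::mahalanobis() returns squared distances
  d2 <- mahalanobis(x[outside, , drop = FALSE], center, covmat)
  sqrt(min(d2))
}

set.seed(1)
x <- matrix(rnorm(60), ncol = 3)      # 20 observations, 3 variables
m0 <- sample(nrow(x), 8)              # one random initial subset of 8 units
res <- min_md_outside(x, m0)
res
```

In a random start analysis this computation would be repeated along the search for many different initial subsets, giving one trajectory per start.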

Usage

fsmmmdrs(
  x,
  plot = FALSE,
  init,
  bsbsteps,
  nsimul = 200,
  nocheck = FALSE,
  numpool,
  cleanpool = FALSE,
  msg = FALSE,
  trace = FALSE,
  ...
)

Arguments

`x`	An n x p data matrix (n observations and p variables). Rows of x represent observations, and columns represent variables. Missing values (NA's) and infinite values (Inf's) are allowed, since observations (rows) with missing or infinite values will automatically be excluded from the computations.
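The row exclusion described for `x` can be reproduced by hand, for instance to see in advance which observations would be dropped. The following base R sketch only illustrates the rule stated above; it is an assumption that this mirrors what the function does internally, and the small matrix is made up for the illustration.

```r
# Illustrative data: rows 2 and 3 contain a missing and an infinite value.
x <- rbind(c(1, 2, 3),
           c(NA, 5, 6),
           c(7, Inf, 9),
           c(10, 11, 12))

# A row is kept only if all of its entries are finite
# (is.finite() is FALSE for NA, NaN, Inf and -Inf).
keep <- apply(x, 1, function(r) all(is.finite(r)))
x_clean <- x[keep, , drop = FALSE]
x_clean
```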
`plot`	Plots the random starts minimum Mahalanobis distance with 1%, 50% and 99% confidence bands. If `plot=FALSE` (default) or `plot=0`, no plot is produced. The scale (ylim) for the y axis is defined as follows: ylim[2] is the maximum between the values of `mmd` in steps `[n*0.2, n]` and the final value of the 99 per cent envelope multiplied by 1.1; ylim[1] is the minimum between the values of `mmd` in steps `[n*0.2, n]` and the 1 per cent envelope multiplied by 0.9. Remark: the plot which is produced is very simple. To control the options of this plot (including the y scale) and to connect it dynamically to the other forward plots, use function `mmdrsplot`.
`init`	Point where to start monitoring required diagnostics. If `init` is not specified it will be set equal to `(p+1)`.
`bsbsteps`	A vector which specifies the steps of the forward search at which the units forming the subset are saved for each random start. If `bsbsteps = 0`, the units forming the subset are stored in all steps for each random start. By default, the units forming the subset are stored in all steps if `n <= 500`; otherwise they are stored at step `init` and at steps which are multiples of 100. For example, if `n = 753` and `init = 6`, units forming the subset are stored for `m = init, 100, 200, 300, 400, 500 and 600`. REMARK: The vector `bsbsteps` must contain numbers from `init` to `n`. If `min(bsbsteps) < init`, a warning message is issued.
`nsimul`	Number of random starts. Default value is `nsimul=200`.
`nocheck`	Controls whether to perform checks on the data matrix `x`. If `nocheck=TRUE`, no check is performed.
`numpool`	If `numpool > 1`, the routine automatically checks whether the Parallel Computing Toolbox is installed and distributes the random starts over `numpool` parallel processes. If `numpool <= 1`, the random starts are run sequentially. By default, `numpool` is set equal to the number of physical cores available in the CPU (this choice may be inconvenient if other applications are running concurrently). The same happens if the `numpool` value chosen by the user exceeds the available number of cores.
REMARK 1: Up to MATLAB R2013b, there was a limitation on the maximum number of cores that could be addressed by the Parallel Computing Toolbox (8 and, more recently, 12). From R2014a, it is possible to run a local cluster of more than 12 workers.
REMARK 2: Unless you adjust the cluster profile, the default maximum number of workers is the same as the number of computational (physical) cores on the machine.
REMARK 3: In modern computers the number of logical cores is larger than the number of physical cores. By default, MATLAB does not use all logical cores because, normally, hyper-threading is enabled and some cores are reserved for this feature.
REMARK 4: It is because of Remark 3 that the default value chosen for `numpool` is the number of physical cores rather than the number of logical ones. The user can increase the number of parallel pool workers allocated to the multiple start monitoring by: (1) setting the NumWorkers option in the local cluster profile settings to the number of logical cores (Remark 2); to do so, go to the menu Home | Parallel | Manage Cluster Profile and set the desired "Number of workers to start on your local machine"; (2) setting `numpool` to the desired number of workers. Therefore, if a parallel pool is not already open, the user option `numpool` (if set) overrides the number of workers set in the local/current profile. Similarly, the number of workers in the local/current profile overrides the default value of `numpool` obtained as `feature('numCores')` (i.e. the number of physical cores).
`cleanpool`	Set `cleanpool=TRUE` if the parallel pool has to be cleaned after the execution of the random starts. Otherwise (default) `cleanpool=FALSE`. This option has an effect only if the previous option `numpool > 1`.
`msg`	Level of output to display. It controls whether or not to display messages about random start progress. More precisely, if the previous option `numpool > 1`, a progress bar is displayed; otherwise a message is displayed on the screen when 10%, 25%, 50%, 75% and 90% of the random starts have been completed. REMARK: in order to create the progress bar when `numpool > 1`, the program writes to a temporary .txt file in the folder where the user is working. Therefore it is necessary to work in a folder where the user has write permission. If this is not the case and the user (say) is working without write permission in folder C:/Program Files/MATLAB, the following message will appear on the screen: "Error using ProgressBar (line 57) Do you have write permissions for C:/Program Files/MATLAB?"
`trace`	Whether to print intermediate results. Default is `trace=FALSE`.
`...`	potential further arguments passed to lower level functions.

Value

Returns an object of class fsmmmdrs.object.

Author(s)

FSDA team, valentin.todorov@chello.at

References

Atkinson, A.C., Riani, M., and Cerioli, A. (2006), Random Start Forward Searches with Envelopes for Detecting Clusters in Multivariate Data, in: Zani S., Cerioli A., Riani M., Vichi M., Eds., Data Analysis, Classification and the Forward Search, pp. 163-172, Springer Verlag.

Atkinson, A.C. and Riani, M. (2007), Exploratory Tools for Clustering Multivariate Data, Computational Statistics and Data Analysis, Vol. 52, pp. 272-285, doi:10.1016/j.csda.2006.12.034

Riani, M., Cerioli, A., Atkinson, A.C., Perrotta, D. and Torti, F. (2008), Fitting Mixtures of Regression Lines with the Forward Search, in: Mining Massive Data Sets for Security, F. Fogelman-Soulie et al. Eds., pp. 271-286, IOS Press.

Examples

 ## Not run: 
 data(hbk, package="robustbase")
 out <- fsmmmdrs(hbk[,1:3])
 class(out)
 summary(out)
 
## End(Not run)

fsdaR documentation built on May 29, 2024, 5:35 a.m.
