parKml: ~ Function: parKml ~

parKml R Documentation

Description

parKml and parALGO are constructors for the object ParKml.

Usage

parKml(saveFreq,maxIt,imputationMethod,distanceName,power,distance,
   centerMethod,startingCond,nbCriterion,scale)

parALGO(saveFreq=100,maxIt=200,imputationMethod="copyMean",
   distanceName="euclidean",power=2,distance=function(){},
   centerMethod=meanNA,startingCond="nearlyAll",nbCriterion=1000,scale=TRUE)

Arguments

saveFreq

[numeric]: Long computations can take several days, so the ClusterLongData object on which kml works can be saved periodically. saveFreq defines how often: the ClusterLongData is saved every saveFreq clustering calculations, in the file objectName.Rdata in the current folder. If saveFreq is set to Inf, the object is never saved.

maxIt

[numeric]: sets a limit on the number of iterations if convergence is not reached.

imputationMethod

[character]: the quality criteria cannot be calculated if some values are missing. imputationMethod defines the method used to impute the missing values. See imputation for details.

distanceName

[character]: name of the distance used by k-means. If distanceName is one of "manhattan", "euclidean", "minkowski", "maximum", "canberra" or "binary", a compiled, optimized version designed specifically for trajectories is used. Otherwise, the function defined in the slot distance is used.

power

[numeric]: if distanceName="minkowski", this defines the power that will be used.
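
To illustrate what power controls, the Minkowski distance between two complete trajectories can be sketched in base R as below. The helper name minkowskiTraj is hypothetical and not part of kml; the compiled version used by kml may treat missing values differently.

```r
# Hypothetical helper (not part of kml): Minkowski distance of order `power`
# between two trajectories of equal length, assuming no missing values.
minkowskiTraj <- function(trajA, trajB, power = 2) {
  sum(abs(trajA - trajB)^power)^(1 / power)
}

trajA <- c(1, 2, 3, 4)
trajB <- c(2, 4, 3, 1)

# power = 1 gives the Manhattan distance, power = 2 the Euclidean distance
dManhattan <- minkowskiTraj(trajA, trajB, power = 1)
dEuclidean <- minkowskiTraj(trajA, trajB, power = 2)

# Cross-check against stats::dist, where the power is the argument `p`
dRef <- as.numeric(dist(rbind(trajA, trajB), method = "minkowski", p = 3))
dOwn <- minkowskiTraj(trajA, trajB, power = 3)
```

Larger values of power give more weight to the time points where the two trajectories differ most.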

distance

[numeric <- function(trajA,trajB)]: function that computes the distance between two trajectories. If no function is specified, the Euclidean distance with Gower adjustment (to deal with missing values) is used.
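
A user-defined distance only needs to accept two trajectories and return a single number. The sketch below is one possible example; the name meanAbsDist is hypothetical, and its way of skipping missing values is an illustrative choice, not the Gower adjustment used by kml.

```r
# Hypothetical custom distance (not part of kml): mean absolute difference
# over the time points that are observed in both trajectories.
meanAbsDist <- function(trajA, trajB) {
  ok <- !is.na(trajA) & !is.na(trajB)
  if (!any(ok)) return(NA_real_)   # no common time point: distance undefined
  mean(abs(trajA[ok] - trajB[ok]))
}

# Only the first and last time points are observed in both trajectories
dExample <- meanAbsDist(c(1, NA, 3, 4), c(2, 5, NA, 1))
```

Following the description of distanceName, such a function would presumably be selected by giving distanceName a value outside the built-in list, for instance parALGO(distanceName="custom", distance=meanAbsDist), where "custom" is an arbitrary placeholder label.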

centerMethod

[numeric <- function(vector(numeric))]: the k-means algorithm computes the center of each cluster. The definition of "center" can be customized by supplying a function as centerMethod. This function should take a numeric vector as argument and return a single numeric value, the center of the vector.
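
As a sketch of such a function, a trimmed mean that ignores missing values could be written as follows. The name trimmedMeanNA is hypothetical and not part of kml; the 10% trimming level is an arbitrary choice for illustration.

```r
# Hypothetical center function (not part of kml): 10% trimmed mean,
# ignoring missing values. Takes a numeric vector, returns one number.
trimmedMeanNA <- function(x) {
  mean(x, trim = 0.1, na.rm = TRUE)
}

values <- c(1:9, 1000, NA)

centerTrimmed <- trimmedMeanNA(values)      # drops the NA, then 1 and 1000
centerPlain   <- mean(values, na.rm = TRUE) # pulled up by the outlier 1000
```

A center of this kind is less sensitive to outlying individuals than the plain mean; it would be supplied as parALGO(centerMethod=trimmedMeanNA).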

startingCond

[character]: specifies the starting condition. Should be one of "randomAll", "randomK", "maxDist", "kmeans++", "kmeans+", "kmeans-" or "kmeans--" (see initializePartition for details). It can also take two special values: "all" stands for c("maxDist","kmeans-") followed by an alternation of "kmeans--" and "randomK", while "nearlyAll" stands for "kmeans-" followed by an alternation of "kmeans--" and "randomK".

nbCriterion

[numeric]: sets the maximum number of quality criteria displayed on the graph (displaying a large number of criteria can slow down the overall process). The default value in parALGO is 1000.

scale

[logical]: if TRUE, the data are automatically scaled (using the function scale with default values) before k-means is run on joint trajectories. The data are then restored (using the function restoreRealData) just before the end of the function kml3d. This option has no effect on kml.

Details

parKml is the constructor of the object ParKml.

Value

An object of class ParKml.

Examples


### Load the package
library(kml)

### Move to tempdir
wd <- getwd()
setwd(tempdir()); getwd()

### Generation of some data
cld1 <- generateArtificialLongData()

### Setting two different sets of options:
(option1 <- parALGO())
(option2 <- parALGO(distanceName="maximum",centerMethod=function(x)median(x,na.rm=TRUE)))

### Running kml: we suspect 3, 4 or 5 clusters, and we want 3 redrawings.
kml(cld1,3:5,3,toPlot="both",parAlgo=option1)
kml(cld1,3:5,3,toPlot="both",parAlgo=option2)

### Go back to the original working directory
setwd(wd)


kml documentation built on Feb. 16, 2023, 8:35 p.m.