trajcluster: Clustering mouse tracking trajectories


View source: R/trajcluster.R

Description

Clusters mouse trajectories using hierarchical cluster analysis or the distance to given prototypical trajectories.

Usage

trajcluster(data, i.xyt, i.id, type=c("hierarchical", "prototypes"), nclust, nResc = 10, 
      prototypes = NA, subsampN = NA)

Arguments

data

A data frame with x, y values of time-normalized trajectories (all trajectories have the same length) and id variable(s).

i.xyt

Vector containing the column names of the x, y, t variables in that order.

i.id

Vector containing the column names of the indicator variables that uniquely identify single trajectories (e.g. c('experiment1', 'trial')).

type

type='hierarchical' performs hierarchical cluster analysis on the trajectories. The distance measure is the sum of the Euclidean distances between points 1,2,...,n of the two compared trajectories. The number of extracted clusters is specified with the parameter nclust. type='prototypes' calculates the distance of each trajectory to a number of prototypical trajectories that are provided via prototypes. Using these distances, each trajectory is assigned to the closest prototype trajectory. type=c("hierarchical", "prototypes") performs both analyses.
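The distance measure and the cluster extraction described for type='hierarchical' can be sketched in base R as follows. This is a minimal illustration on made-up toy trajectories, not the package's implementation: the helper names make_traj and traj_dist are hypothetical, base hclust() stands in for the fastcluster version, and the default linkage method is an assumption, since the text above does not state which linkage is used.

```r
# Toy data (hypothetical): 6 trajectories with 5 points each, stored as n x 2 matrices.
n_points <- 5
make_traj <- function(bend) {
  s <- seq(0, 1, length.out = n_points)
  cbind(x = bend * sin(pi * s), y = s)
}
trajs <- lapply(c(0, 0.05, 0.1, 1, 1.05, 1.1), make_traj)

# Sum of the Euclidean distances between points 1, 2, ..., n of two trajectories.
traj_dist <- function(a, b) sum(sqrt(rowSums((a - b)^2)))

# Pairwise distance matrix over all trajectories.
k <- length(trajs)
distmat <- matrix(0, k, k)
for (i in seq_len(k)) {
  for (j in seq_len(k)) {
    distmat[i, j] <- traj_dist(trajs[[i]], trajs[[j]])
  }
}

# Hierarchical clustering on the distance matrix (linkage method is an assumption),
# followed by extraction of a fixed number of clusters (the role of nclust).
clust_obj <- hclust(as.dist(distmat))
cluster <- cutree(clust_obj, k = 2)
cluster
```

In this toy example the three nearly straight trajectories end up in one cluster and the three strongly curved ones in the other.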

nclust

The number of clusters to extract in the hierarchical clustering method.

nResc

Before calculating the distance matrix, all trajectories are spatially normalized, i.e. nResc equally (spatially) spaced points are distributed along each trajectory. This improves clustering performance, as most points in movement trajectories lie at the start point and endpoint of the trajectory, which are relatively uninformative with regard to the shape of the trajectory.
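One way to picture the spatial normalization is linear interpolation along the cumulative path length of a trajectory. The sketch below is an assumption about how such a resampling could be done, not the package's code; the function name resample_spatial and its argument n_resc are hypothetical.

```r
# Place n_resc points at equal path-length intervals along a trajectory (x, y).
resample_spatial <- function(x, y, n_resc = 10) {
  # Length of each segment and cumulative path length, starting at 0.
  seg_len <- sqrt(diff(x)^2 + diff(y)^2)
  arc <- c(0, cumsum(seg_len))
  # Drop repeated positions (no movement) so approx() receives unique path lengths.
  keep <- c(TRUE, seg_len > 0)
  # Target path lengths, equally spaced from start to end.
  target <- seq(0, arc[length(arc)], length.out = n_resc)
  cbind(
    x = approx(arc[keep], x[keep], xout = target)$y,
    y = approx(arc[keep], y[keep], xout = target)$y
  )
}

# A trajectory that dwells at the start and at the end (toy values).
x <- rep(0, 8)
y <- c(0, 0, 0, 0.5, 1, 1, 1, 1)
res <- resample_spatial(x, y, n_resc = 5)
res
```

Although six of the eight original points sit at the start or the end, the resampled points are spread evenly over the path, so the distance measure is driven by the shape of the movement rather than by dwell time.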

prototypes

A list containing prototypical trajectories, to which the method type="prototypes" assigns all trajectories. Each prototype has to be specified as an n x 2 matrix, with x values in the first and y values in the second column.
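As an illustration of the expected format, the following sketch builds a list of two made-up prototypes as n x 2 matrices and assigns a single trajectory to the closer one, using the sum of point-wise Euclidean distances described under type. The prototype shapes, their names, and the example trajectory are assumptions chosen for the example; they are not the prototypes shipped with the package.

```r
# Hypothetical prototypes from a start at (0, 0) to a target at (-1, 1.5),
# each an n x 2 matrix with x in the first and y in the second column.
n <- 10
s <- seq(0, 1, length.out = n)
prototypes <- list(
  straight = cbind(-s, 1.5 * s),
  curved   = cbind(-s^2, 1.5 * s)
)

# A made-up trajectory with the same number of points as the prototypes.
traj <- cbind(-s^1.8, 1.5 * s)

# Distance of the trajectory to each prototype.
dists <- sapply(prototypes, function(p) sum(sqrt(rowSums((traj - p)^2))))

# The trajectory is assigned to the prototype with the smallest distance.
closest <- names(which.min(dists))
closest
```

The trajectory and the prototypes must have the same number of points for the point-wise difference to be defined, which is why the function spatially normalizes the prototypes as well.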

subsampN

Takes a random subsample from the original data and performs the clustering analysis on this subsample. This is useful for datasets with a large number of trajectories, which could render hierarchical clustering computationally infeasible.
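A rough sketch of subsampling whole trajectories before clustering is shown below. It assumes that the subsample is drawn at the level of trajectories (unique combinations of the id variables) and that subsampN gives the number of trajectories to keep; neither detail is confirmed by the description above, and the toy data frame is made up.

```r
set.seed(42)

# Toy data (hypothetical): 4 participants x 2 trials, 3 points per trajectory.
data <- data.frame(
  id.ptp   = rep(1:4, each = 6),
  id.trial = rep(rep(1:2, each = 3), times = 4),
  x = runif(24),
  y = runif(24)
)

# One key per trajectory, built from the id variables.
key <- paste(data$id.ptp, data$id.trial, sep = "_")

# Draw a random subsample of trajectories (here 3 of 8) and keep all of their rows.
subsamp_n <- 3
keep_keys <- sample(unique(key), subsamp_n)
sub <- data[key %in% keep_keys, ]
nrow(sub)
```

Because a pairwise distance matrix grows quadratically with the number of trajectories, reducing that number is what keeps hierarchical clustering tractable on large datasets.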

Value

Returns a list containing:

call

The function input except the data.

data_res

A data frame containing the spatially normalized data.

hierarchical

The results of the hierarchical cluster analysis. distmat contains the distance matrix used for clustering. clust_obj contains the hclust object from the fastcluster package. Using this object, you can compute clustering solutions with a different number of clusters, without rerunning the whole function. cluster is a vector containing the classification of each trajectory.
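Recomputing a solution with a different number of clusters only requires cutree() from base R on the stored hclust object. In the sketch below, a small hclust object built from toy one-dimensional points stands in for the clust_obj element of the function output, so the values are purely illustrative.

```r
# Stand-in for the clust_obj element of the hierarchical results (toy data).
pts <- matrix(c(0, 0.1, 0.2, 5, 5.1, 10), ncol = 1)
clust_obj <- hclust(dist(pts))

# Cut the same tree at different numbers of clusters without re-clustering.
two_clusters   <- cutree(clust_obj, k = 2)
three_clusters <- cutree(clust_obj, k = 3)
two_clusters
three_clusters
```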

prototypes

Contains the results of the prototype classification. prototypes_resc contains the spatially normalized prototypes (the empirical trajectories and the prototypes must have the same number of data points to compute a difference). protoclust contains a data frame that specifies the distance of each trajectory to each prototypical trajectory. In addition, every trajectory is classified to the closest prototype.

Author(s)

Jonas Haslbeck <jonashaslbeck@gmail.com>

References

Spivey, M. J., Grosjean, M., & Knoblich, G. (2005). Continuous attraction toward phonological competitors. Proceedings of the National Academy of Sciences of the United States of America, 102(29), 10393-10398.

Examples

## Not run: 

# THIS EXAMPLE DOES NOT RUN ANYMORE

# we use a part of the following example dataset

head(data_sp2015)

# we use prepr() to time-normalize all trajectories to 101 time steps and stretch them to a norm display

layout_stretch <- list("start"=c(0,0), "left"=c(-1,1.5), "right"=c(1,1.5))

output <- prepr(data = data_sp2015[1:1000,], 
                i.xyt = c('x', 'y', 't'), 
                i.id =  c('id.ptp', 'id.trial'), 
                type = "time", 
                steps = 101, 
                start2zero = TRUE, 
                stretch = layout_stretch)

head(output$data)

# clustering; prototypes from example datasets "prototypes"

out_clust <- trajcluster(output$data, 
                        i.xyt =  c('x', 'y', 't'), 
                        i.id = c('id.ptp', 'id.trial'),
                        type=c("hierarchical", "prototypes"),
                        nclust = 4, 
                        nResc = 10, 
                        prototypes = prototypes, 
                        subsampN = NA)


## End(Not run)

jmbh/mta documentation built on May 19, 2019, 1:51 p.m.