msnbc323: Dataset: msnbc323

Description

A portion of the msnbc dataset containing 323 clickstream sequences. This version of the original dataset (David Heckerman) was used in Melnykov (2014).
The sequences use 17 states, coded by number, representing the following page categories:
1: frontpage
2: news
3: tech
4: local
5: opinion
6: on-air
7: misc
8: weather
9: msn-news
10: health
11: living
12: business
13: msn-sports
14: sports
15: summary
16: bbs
17: travel
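
As a minimal sketch of how the numeric codes above can be turned back into category names, the snippet below stores the 17 labels in code order and indexes into them. The names `state_labels` and `toy_seq` are hypothetical, and the toy sequence is made up for illustration; it is not taken from the dataset.

```r
# State labels in the order given by the category list (codes 1 to 17).
state_labels <- c("frontpage", "news", "tech", "local", "opinion", "on-air",
                  "misc", "weather", "msn-news", "health", "living",
                  "business", "msn-sports", "sports", "summary", "bbs",
                  "travel")

# Hypothetical sequence of state codes, invented for illustration only;
# it is not one of the 323 sequences in msnbc323.
toy_seq <- c(1, 2, 2, 14, 8)

# Indexing the label vector by code decodes the sequence into names.
decoded <- state_labels[toy_seq]
print(decoded)
```

The same indexing should work on any single sequence from the dataset, since each element is documented as a state code between 1 and 17.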

Usage

data(msnbc323)

Format

List of 323 numeric vectors representing categorical sequences.
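
To illustrate working with this format, the sketch below builds a small stand-in list with the same shape (a list of numeric vectors of state codes) and computes sequence lengths and a 17 x 17 matrix of transition counts. The object `toy_data` and its values are hypothetical and made up for illustration; this is not the preprocessing performed by the package.

```r
# Hypothetical stand-in with the documented shape: a list of numeric
# vectors of state codes. The values are invented, not real data.
toy_data <- list(c(1, 2, 2), c(14, 14, 8, 1), c(3))

n_states <- 17

# Length of each sequence.
seq_lengths <- sapply(toy_data, length)

# Count transitions from state i (row) to state j (column) over all
# sequences; sequences of length 1 contribute no transitions.
trans <- matrix(0, nrow = n_states, ncol = n_states)
for (s in toy_data) {
  if (length(s) > 1) {
    for (k in seq_len(length(s) - 1)) {
      trans[s[k], s[k + 1]] <- trans[s[k], s[k + 1]] + 1
    }
  }
}

print(seq_lengths)
print(sum(trans))
```

Assuming the ClickClust package is installed and the dataset is loaded, substituting the loaded object for `toy_data` should yield the corresponding summaries for the 323 sequences.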

Source

Melnykov, V. (2014)

References

Cadez, I., Heckerman, D., Meek, C., Smyth, P., White, S. (2003) Model-based clustering and visualization of navigation patterns on a web site, Data Mining and Knowledge Discovery, 7, 399-424.

Melnykov, V. (2016) Model-Based Biclustering of Clickstream Data, Computational Statistics and Data Analysis, 93, 31-45.

Melnykov, V. (2016) ClickClust: An R Package for Model-Based Clustering of Categorical Sequences, Journal of Statistical Software, 74, 1-34.

See Also

synth


ClickClust documentation built on May 1, 2019, 8:23 p.m.