seq2feature_mds extracts K features from response processes by multidimensional scaling.
seqs2feature_usage <- NULL

seqs 
a 
K 
the number of features to be extracted. 
method 
a character string specifying the algorithm used to perform MDS. See 'Details'. 
dist_type 
a character string specifying the dissimilarity measure for two response processes. See 'Details'. 
pca 
logical. If 
subset_size, n_cand 
two parameters used in the large data algorithm. See 'Details'
and 
subset_method 
a character string specifying the method for choosing the subset
in the large data algorithm. See 'Details' and 
return_dist 
logical. If 
L_set 
lengths of n-grams considered. 
Since classical MDS has a computational complexity of order n^3, where n is the number of response processes, it is computationally expensive to perform when a large number of response processes is considered. In addition, storing an n × n dissimilarity matrix requires a large amount of memory when n is large. In seq2feature_mds, the algorithm proposed in Paradis (2018) is implemented to obtain MDS for large datasets. method specifies the algorithm to be used for obtaining MDS features. If method = "small", classical MDS is used by calling cmdscale. If method = "large", the algorithm for large datasets is used. If method = "auto" (default), seq2feature_mds selects the algorithm automatically based on the sample size.
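The classical MDS step and the memory concern described above can be illustrated with base R alone. This is a minimal sketch, not the package's internal code: the Euclidean distance on random points is a stand-in for the oss dissimilarity, and `x` and `mem_gb` are hypothetical names introduced only for this illustration.

```r
# Toy illustration of classical MDS with stats::cmdscale.
# Assumption: Euclidean distance on random points stands in for the oss measure.
set.seed(1)
n <- 30                               # number of "response processes"
K <- 2                                # number of features to extract
x <- matrix(rnorm(n * 5), nrow = n)   # hypothetical raw data, one row per process
dist_mat <- as.matrix(dist(x))        # dense n x n dissimilarity matrix
theta <- cmdscale(dist_mat, k = K)    # n x K matrix of MDS coordinates
dim(theta)

# Approximate memory (in GiB) for a dense n x n matrix of doubles (8 bytes each).
mem_gb <- function(n) n^2 * 8 / 1024^3
mem_gb(1e5)                           # about 74.5 GiB for 100,000 processes
```

The quadratic growth of the dissimilarity matrix is the reason a separate algorithm is offered for large n.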
dist_type specifies the dissimilarity used to measure the discrepancy between two response processes. If dist_type = "oss_action", the order-based sequence similarity (oss) proposed in Gomez-Alonso and Valls (2008) is used for action sequences. If dist_type = "oss_both", both action sequences and timestamp sequences are used to compute a time-weighted oss.
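To see which dissimilarities a given dist_type produces, the matrix can be requested alongside the features. The sketch below assumes the ProcData package is installed and that seq_gen generates simulated response processes as in the Examples; it uses only arguments listed in the usage above and is skipped when the package is unavailable.

```r
# Sketch only: runs when ProcData is installed, otherwise does nothing.
if (requireNamespace("ProcData", quietly = TRUE)) {
  library(ProcData)
  set.seed(12345)
  seqs <- seq_gen(50)                       # 50 simulated response processes
  res <- seq2feature_mds(seqs, K = 2,
                         dist_type = "oss_action",
                         return_dist = TRUE)
  dim(res$theta)                            # extracted features, one row per process
  dim(res$dist_mat)                         # pairwise dissimilarities between processes
}
```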
The number of features to be extracted, K, can be selected by cross-validation using chooseK_mds.
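A possible workflow for selecting K before extracting features is sketched here. It rests on assumptions not stated on this page: that chooseK_mds takes a vector of candidate values named K_cand, that its result contains elements K and dist_mat, and that seq2feature_mds accepts a precomputed dissimilarity matrix in place of seqs. Check the chooseK_mds help page before relying on these details.

```r
# Sketch only: runs when ProcData is installed, otherwise does nothing.
# Assumptions: K_cand argument, and K / dist_mat elements in the result.
if (requireNamespace("ProcData", quietly = TRUE)) {
  library(ProcData)
  set.seed(12345)
  seqs <- seq_gen(50)
  # Cross-validate over a small set of candidate dimensions.
  K_res <- chooseK_mds(seqs, K_cand = 2:4, return_dist = TRUE)
  # Reuse the dissimilarity matrix so it is not computed twice.
  theta <- seq2feature_mds(K_res$dist_mat, K = K_res$K)$theta
}
```

Reusing the returned dissimilarity matrix avoids recomputing pairwise dissimilarities, which is the costly part when the number of response processes is large.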
seq2feature_mds returns a list containing
theta 
a numeric matrix giving the 
dist_mat 
the dissimilarity matrix. This element exists only if

Gomez-Alonso, C. and Valls, A. (2008). A similarity measure for sequences of categorical data based on the ordering of common elements. In V. Torra & Y. Narukawa (Eds.), Modeling Decisions for Artificial Intelligence (pp. 134-145). Springer Berlin Heidelberg.
Paradis, E. (2018). Multidimensional scaling with very large datasets. Journal of Computational and Graphical Statistics, 27(4), 935-939.
Tang, X., Wang, Z., He, Q., Liu, J., and Ying, Z. (2020). Latent feature extraction for process data via multidimensional scaling. Psychometrika, 85, 378-397.
chooseK_mds for choosing K.
Other feature extraction methods: aseq2feature_seq2seq, atseq2feature_seq2seq, seq2feature_mds_large, seq2feature_ngram, seq2feature_seq2seq, tseq2feature_seq2seq
n <- 50
set.seed(12345)
seqs <- seq_gen(n)
theta <- seq2feature_mds(seqs, 5)$theta

