seq_ord_model: The sequential logistic regression model for...


View source: R/seq_ord_model.R

Description

seq_ord_model selects subjects sequentially using a logistic regression model for the ordinal case.

Usage

seq_ord_model(labeled_ids, unlabeled_ids, splitted, newY, train, data,
  d = 0.8, adaptive = "random")

Arguments

labeled_ids

A numeric vector of unique identifiers for the labeled samples

unlabeled_ids

A numeric vector of unique identifiers for the unlabeled samples

splitted

A list containing the datasets used in the ordinal case. Note that each element of the list holds the samples from classes k-1 and k

newY

Numeric label values ranging from 0 to K, where K is the number of categories

train

A matrix of the labeled samples. Note that the indices of the samples in train are the same as labeled_ids

data

A matrix containing all the data, both the labeled and the unlabeled samples. Note that the first column is the response variable (the labels) and the remaining columns are the explanatory variables.

d

A numeric number specifying the length of the fixed size confidence set for our model. The default value is 0.8.

adaptive

A character string that determines the sample selection criterion to be used, matching one of 'random' or 'A_optimal'. The default value is 'random'.
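The adaptive argument accepts one of a fixed set of strings. As a minimal sketch of how such matching behaves in base R, the hypothetical helper below (check_adaptive is an illustrative name, not part of seqest) uses match.arg. Whether seq_ord_model validates its argument this way internally is an assumption.

```r
# Hypothetical helper, not part of seqest: shows how a string argument
# restricted to 'random' or 'A_optimal' can be validated with match.arg().
check_adaptive <- function(adaptive = c("random", "A_optimal")) {
  # Returns the first choice when the argument is missing and
  # raises an error for any value outside the allowed set.
  match.arg(adaptive)
}

default_choice <- check_adaptive()            # falls back to "random"
chosen <- check_adaptive("A_optimal")         # explicit selection
invalid <- try(check_adaptive("D_optimal"), silent = TRUE)  # not allowed
```

In this sketch a misspelled or unsupported criterion fails early instead of silently falling back to the default.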

Details

The seq_ord_model function is very similar to the seq_cat_model function. seq_ord_model also fits a multinomial logistic regression model, but for the ordinal case: it estimates the coefficients and determines the samples given the fixed size confidence set. seq_ord_model selects samples in the same way as seq_cat_model, and both offer two selection methods. For details about the selection methods in seq_ord_model, please refer to the seq_cat_model function.
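To give a feel for the difference between the two selection criteria, the following sketch contrasts random selection with a generic A-optimal rule for a plain binary logistic model in base R. It is an illustration under stated assumptions, not the seqest implementation: the toy data, the helper fisher_info, and the use of glm are all hypothetical choices made for this example.

```r
# Illustrative sketch only: generic A-optimal selection for a binary
# logistic model. This is NOT the code used inside seq_ord_model.
set.seed(1)

# Hypothetical toy data: an intercept column plus two covariates.
n <- 300
X <- cbind(1, matrix(rnorm(n * 2), ncol = 2))
beta_true <- c(0.5, 1, -1)
y <- rbinom(n, 1, as.vector(plogis(X %*% beta_true)))

labeled <- 1:100
unlabeled <- 101:n

# Fit the model on the labeled samples only.
fit <- glm(y[labeled] ~ X[labeled, -1], family = binomial())
beta_hat <- coef(fit)

# Fisher information of a logistic model: sum over rows of p(1 - p) x x^T.
fisher_info <- function(X, beta) {
  p <- as.vector(plogis(X %*% beta))
  crossprod(X * sqrt(p * (1 - p)))
}

info_labeled <- fisher_info(X[labeled, , drop = FALSE], beta_hat)
trace_before <- sum(diag(solve(info_labeled)))

# A-optimal criterion: trace of the inverse information matrix after
# adding each candidate; a smaller trace means lower total variance.
a_criterion <- sapply(unlabeled, function(i) {
  info_new <- info_labeled + fisher_info(X[i, , drop = FALSE], beta_hat)
  sum(diag(solve(info_new)))
})

next_a_optimal <- unlabeled[which.min(a_criterion)]  # most informative candidate
next_random <- sample(unlabeled, 1)                  # random selection baseline
```

Random selection ignores the covariates, whereas the A-optimal rule in this sketch prefers the candidate that most reduces the summed variance of the coefficient estimates.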

Value

a list containing the following components

d

the length of the fixed size confidence set that we specify

n

the current sample size when the stopping criterion is satisfied

is_stopped

a logical flag indicating whether the sequential iterations have stopped. A value of TRUE means the iteration has stopped

beta_est

the estimated coefficients when the stopping criterion is satisfied

cov

the covariance matrix of the estimated parameters

adaptive

the sample selection criterion we used
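As a rough sketch of how the returned components might be used together, the snippet below builds a hypothetical stand-in list with the documented component names and derives standard errors from cov. The numeric values are placeholders rather than output from seq_ord_model, and treating beta_est as a plain numeric vector is an assumption.

```r
# Hypothetical stand-in for the returned list. All values are placeholders
# chosen for illustration, not results produced by seq_ord_model.
result <- list(
  d = 0.5,
  n = 420,
  is_stopped = TRUE,
  beta_est = c(1.1, 1.9, 0.9),       # assumed to be a numeric vector
  cov = diag(c(0.04, 0.09, 0.01)),
  adaptive = "A_optimal"
)

# Only summarize the estimates once the stopping criterion has been met.
if (result$is_stopped) {
  # Standard errors are the square roots of the covariance diagonal.
  se <- sqrt(diag(result$cov))
  summary_table <- data.frame(estimate = result$beta_est, std_error = se)
  print(summary_table)
}
```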

References

Li, J., Chen, Z., Wang, Z., & Chang, Y. I. (2020). Active learning in multiple-class classification problems via individualized binary models. Computational Statistics & Data Analysis, 145, 106911. doi:10.1016/j.csda.2020.106911

See Also

seq_cat_model for categorical case

seq_bin_model for binary classification case

seq_GEE_model for generalized estimating equations case.

Examples

# generate the toy example
beta <- matrix(c(1,2,1,-1,1,2), ncol=2)
res <-  gen_multi_data(beta, N = 10000, type = 'ord', test_ratio = 0.3)
train_id <- res$train_id
train <- res$train
test <- res$test
res <- init_multi_data(train_id, train, init_N = 300, type = 'ord')
splitted <- res$splitted
train <- res$train
newY <- res$newY
labeled_ids <- res$labeled_ids
unlabeled_ids <- res$unlabeled_ids
data <- res$data

# Apply seq_ord_model to a multi-class classification problem in the ordinal case.
# Remove the leading '#' to run the commands.
# start_time <- Sys.time()
# logitA_ord <- seq_ord_model(labeled_ids, unlabeled_ids, splitted, newY,
#                             train, data, d = 0.5, adaptive = "A_optimal")
# logitA_ord$time <- as.numeric(Sys.time() - start_time, units = "mins")
# print(logitA_ord)

seqest documentation built on July 2, 2020, 2:28 a.m.
