HP: HP


View source: R/HP.R

Description

Perform heterogeneity pursuit on the given dataset

Usage

HP(X, y, tau = NULL, method, Xj = NULL, K = 3,
  no.cluster.search = FALSE, IC = "bic", max.no.cluster = NULL)

Arguments

X

the design matrix; the intercept column, if any, should be included.

y

the response vector of length n.

tau

the tuning parameter of the penalized regression; if not provided, tau = 0.1 * sqrt(n / log(n)).

method

the method to use; one of "latent", "mst", or "knn".

Xj

the threshold variable; must be provided if method = "mst" or "knn".

K

the number of neighbors in a KNN graph; it may be specified if method = "knn" or "latent"; default value is 3.

no.cluster.search

TRUE or FALSE; if TRUE, the optimal number of clusters is chosen using the information criterion given by IC.

IC

the information criterion used to determine the number of clusters; one of "aic", "bic", or "mdl"; default is IC = "bic".

max.no.cluster

if no.cluster.search = TRUE, the maximum number of clusters into which the data may be partitioned; if no.cluster.search = FALSE, the user-specified number of clusters. The default value is max(n^(1/3), 5).
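
The two default formulas quoted above (for tau and max.no.cluster) can be evaluated directly for a given sample size n. The sketch below is only an illustration: the helper names default_tau and default_max_no_cluster are hypothetical and not part of the package, and whether HP rounds max(n^(1/3), 5) internally is not stated on this page.

```r
# Hypothetical helpers (not part of the package) that evaluate the
# documented default formulas for a sample size n.
default_tau <- function(n) 0.1 * sqrt(n / log(n))
default_max_no_cluster <- function(n) max(n^(1/3), 5)

n <- 50
default_tau(n)             # default tuning parameter for n = 50
default_max_no_cluster(n)  # n^(1/3) is below 5 here, so the floor of 5 applies

# The cube-root term only takes over once n exceeds 5^3 = 125.
default_max_no_cluster(1000)
```

For small samples the default cluster number is therefore driven by the constant 5 rather than by n.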

Value

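a list of length (K + 1), where K here denotes the number of subgroups (not the K argument above). The first element of the list is a vector of membership indicators, and the remaining elements are the R lm objects fitted for each subgroup. In the example below, these elements are accessed as membership, lm1 and lm2.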
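
To make the shape of the return value concrete, the sketch below builds a stand-in list by hand and unpacks it. It does not call HP; the toy data and the object res are assumptions for illustration, and the element names membership, lm1 and lm2 are taken from the example on this page.

```r
# Stand-in with the documented shape, built by hand (NOT produced by HP()).
set.seed(1)
d <- data.frame(x = rnorm(20))
d$g <- rep(1:2, each = 10)                                  # toy subgroup labels
d$y <- ifelse(d$g == 1, 3, -3) * d$x + rnorm(20, sd = 0.5)  # opposite slopes

res <- list(
  membership = d$g,
  lm1 = lm(y ~ x, data = d[d$g == 1, ]),
  lm2 = lm(y ~ x, data = d[d$g == 2, ])
)

n.groups <- length(res) - 1   # number of subgroups
fits <- res[-1]               # the per-subgroup lm objects
coefs <- sapply(fits, coef)   # one column of coefficients per subgroup
table(res$membership)         # subgroup sizes
```

Dropping the first element leaves only the lm objects, so functions such as coef or summary can be applied to each subgroup fit in turn.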

Examples

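n <- 50   # sample size
p <- 3    # number of covariates

X <- matrix(rnorm(n * p), nrow = n)
Xj <- X[, 1] # the threshold variable
beta1 <- rep(3, p)   # coefficients for subgroup 1
beta2 <- rep(-3, p)  # coefficients for subgroup 2

# true subgroups are split at Xj = 0
index.g1 <- which(Xj <= 0)
index.g2 <- which(Xj > 0)

# drop = FALSE keeps a matrix even if a subgroup has a single row
y.g1 <- X[index.g1, , drop = FALSE] %*% beta1
y.g2 <- X[index.g2, , drop = FALSE] %*% beta2

y <- rep(0, n)
y[index.g1] <- y.g1
y[index.g2] <- y.g2

y <- y + rnorm(n = n, sd = 0.5)  # add Gaussian noise

# with no.cluster.search = FALSE (the default), max.no.cluster is the number of clusters
res.mst <- HP(X, y, method = "mst", Xj = Xj, max.no.cluster = 2)
m.mst <- res.mst$membership  # membership indicators
lm1 <- res.mst$lm1           # lm fit for subgroup 1
lm2 <- res.mst$lm2           # lm fit for subgroup 2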

wenda-1121/RHP documentation built on Feb. 18, 2020, 9:36 p.m.