gplsim: Function to fit generalized partially linear single-index...

View source: R/gplsim.r

Description

This function uses penalized splines (P-splines) to estimate generalized partially linear single-index models, which extend generalized linear models to allow a nonlinear effect for some predictors.
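For orientation, a model of this kind is usually written as below. This is a sketch in conventional notation rather than a formula taken from this page: the link function g, the unknown smooth function \eta and the linear coefficients \beta are assumed symbols, while Y, X, Z and \theta correspond to the arguments and return values documented here.

```latex
g\{\mathrm{E}(Y \mid X, Z)\} = \eta(X^{\top}\theta) + Z^{\top}\beta
```

Here X^{\top}\theta is the single index, \eta is the univariate function estimated with splines, and Z^{\top}\beta is the partially linear part. A unit-norm constraint on \theta is a common identifiability convention (the examples use true.theta = c(1, 1, 1)/sqrt(3)), but this page does not state which constraint the function enforces.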

Usage

gplsim(
  Y = Y,
  X = X,
  Z = Z,
  family = gaussian(),
  penalty = TRUE,
  penalty_type = "L2",
  scale = -1,
  smooth_selection = "GCV.Cp",
  profile = TRUE,
  bs = "ps",
  user.init = NULL,
  k = 13
)

Arguments

Y

Response variable; should be a vector.

X

Single index covariates.

Z

Partially linear covariates.

family

A family object: a list of functions and expressions for defining link and variance functions. Supported families are binomial and gaussian; the default is gaussian.

penalty

Whether to use penalized or unpenalized splines to fit the model. The default is TRUE.

penalty_type

A character string specifying the type of penalty used in penalized spline estimation. The default is the L_2 penalty ("L2"); an L_1 penalty is also supported.

scale

A numeric indicator, with default -1. Any negative value indicates that the scale of the response distribution is unknown and must be estimated. A value of 0 signals a scale of 1 for Poisson and binomial distributions and an unknown scale for others. Any positive value is taken as the known scale parameter.

smooth_selection

A character string specifying the criterion used to select the smoothing parameter. Supported criteria are "GCV.Cp", "GACV.Cp", "ML", "P-ML", "P-REML" and "REML"; the default is "GCV.Cp".

profile

A logical value indicating whether to use the profile-likelihood algorithm (TRUE) or the algorithm based on the NLS procedure (FALSE). The default is TRUE.

bs

A character string specifying the spline basis used to estimate the unknown univariate function of the single index. The default is "ps" (P-splines).

user.init

A numeric vector whose length equals the number of single-index predictors. Use it to supply initial single-index coefficients based on prior information or domain knowledge. The default is NULL.

k

The dimension of the basis used to represent the smooth term. The default is 13.
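The values accepted by bs, k, smooth_selection and scale coincide with those used by s() and gam() in the mgcv package. Assuming gplsim forwards them to mgcv (this page only says "See GAM object", so treat that as an assumption), the following sketch shows what those settings mean for a plain univariate smooth. The simulated variable u is a hypothetical stand-in for the single index and is not part of gplsim.

```r
library(mgcv)  # recommended package, shipped with standard R distributions

set.seed(1)
n <- 200
u <- runif(n)                              # hypothetical stand-in for the single index
y <- sin(2 * pi * u) + rnorm(n, sd = 0.3)  # simulated Gaussian response

# P-spline basis of dimension 13, GCV smoothing-parameter selection, unknown scale:
# the analogue of bs = "ps", k = 13, smooth_selection = "GCV.Cp", scale = -1.
fit_ps <- gam(y ~ s(u, bs = "ps", k = 13), family = gaussian(),
              method = "GCV.Cp", scale = -1)

# Thin-plate basis of dimension 10 with REML selection:
# the analogue of bs = "tp", k = 10, smooth_selection = "REML".
fit_tp <- gam(y ~ s(u, bs = "tp", k = 10), family = gaussian(),
              method = "REML", scale = -1)

# Each fit has an intercept plus k - 1 spline coefficients, because mgcv
# applies a sum-to-zero identifiability constraint to the smooth.
length(coef(fit_ps))
length(coef(fit_tp))
```

A larger k allows a more flexible curve, with the penalty guarding against overfitting; when penalty = FALSE, k alone controls the flexibility.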

Value

theta

Estimate of theta, the single-index coefficients.

coefficients

The coefficients of the fitted model. Parametric coefficients come first, followed by the coefficients for each spline term in turn.

...

See GAM object.

Examples

# parameter settings
n=1000
true.theta = c(1, 1, 1)/sqrt(3)
# Generating data (binary case)
# generate_data() generates a sine-bump model as in Yu et al. (2007).
# You may use your data instead.
data <- generate_data(n,true.theta=true.theta,family="binomial")
y=data$Y       # binary response
X=data$X       # single index term ;
Z=data$Z       # partially linear term ;

# Fit the generalized partially linear single-index models
result <- gplsim(y,X,Z,family = binomial)

# Estimation of Theta
result$theta

# The coefficients of the fitted model. 
# Parametric coefficients are first, followed by coefficients for each spline term in turn.
result$coefficients

# summary of the fitted model
summary(result)

#plot the estimated single index function curve
plot.si(result)
#par(new=T)
#plot.si(result,index=Z,xaxt="n", yaxt="n",col="red")


# Gaussian case
# generate_data() generates a plain sine-bump model with a Gaussian response.
data <- generate_data(n,true.theta=true.theta,family="gaussian")
y=data$Y       # continuous response
X=data$X       # single index term ;
Z=data$Z       # partially linear term ;

result <- gplsim(y,X,Z,family = gaussian)
result$theta
result$coefficients
summary(result)


#plot the estimated single index function curve
plot.si(result)
#par(new=T)
#plot.si(result,index=Z,xaxt="n", yaxt="n",col="red")

# A real data example
data(air)
y=air$ozone               # response
X=as.matrix(air[,3:4])    # single index term ;
Z=air[,2]                 # partially linear term ;

result <- gplsim(y,X,Z=Z,family = gaussian,k=10)
result$theta
result$coefficients
summary(result)

# Or you can try different spline basis
result <- gplsim(y,X,Z=Z,family = gaussian,bs="tp",k=10)
result$theta
result$coefficients
summary(result)

# to know more about air data
?air
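
The user.init argument is not exercised in the examples. A minimal sketch of how a starting vector might be prepared is given here. Rescaling to unit length is an assumption that mirrors true.theta = c(1, 1, 1)/sqrt(3) in the simulated examples; this page does not say whether gplsim normalizes the vector itself. The raw guess init_raw is hypothetical.

```r
# Hypothetical prior guess for the direction of three single-index covariates.
init_raw <- c(2, 1, 1)

# Rescale to unit Euclidean length (an assumed convention, see the note above).
user_init <- init_raw / sqrt(sum(init_raw^2))
user_init

# The vector must have one entry per column of X. With the simulated data from
# the first examples (three single-index covariates) the call would look like:
# result <- gplsim(y, X, Z, family = gaussian, user.init = user_init)
```

For the air data, X has two columns, so the starting vector would need length two.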

zzz1990771/gplsim documentation built on Nov. 25, 2021, 9:21 a.m.