sema_fit_one: Fit multilevel models in a data stream I


Description

Fit a multilevel model online, row by row, without storing data points.

Usage

sema_fit_one(data_fixed, data_random, data_y, id, theta_j, theta,
  print = FALSE, start_resid_var = 1, start_random_var = 1,
  start_fixed_coef = NULL, prior_n = 0, prior_j = 0)

Arguments

data_fixed

A vector with the data of the fixed effects covariates.

data_random

A vector with the data of the random effects covariates.

data_y

A scalar with the response of this unit.

id

A scalar which identifies the unit of this data point.

theta_j

A list with this unit's parameters and contributions to the sufficient statistics.

theta

A list with model parameters and sufficient statistics.

print

A logical. If TRUE, the function prints a summary of the model. The default is FALSE.

start_resid_var

A scalar, optional. Start value for the residual variance; the default start value is 1.

start_random_var

A vector, optional. Start values for the variances of the random effects; the default start value is 1. NOTE: if start values are provided, the length of the vector must match the number of random effects.

start_fixed_coef

A vector, optional. Start values for the fixed effects. The default is NULL, in which case sema_fit_one creates a vector of start values matching the number of fixed effects. NOTE: if start values are provided, the length of the vector must match the number of fixed effects.

prior_n

A scalar. If starting values are provided, prior_n determines the weight of the starting value of the residual variance. The default is 0.

prior_j

A scalar. If starting values are provided, prior_j determines the weight of the starting values of the variance of the random effects and of the fixed effects. The default is 0.
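The length requirements on the start values can be checked before the first call. The following is a minimal sketch in base R: `check_start_values` is a hypothetical helper that is not part of the package, and it assumes that the number of fixed and random effects equals the lengths of `data_fixed` and `data_random`.

```r
# Hypothetical helper (not part of SEMA): validate user-supplied start values
# against the covariate vectors that will be passed to sema_fit_one.
# NULL means "no start values supplied", so nothing is checked for that argument.
check_start_values <- function(data_fixed, data_random,
                               start_random_var = NULL,
                               start_fixed_coef = NULL) {
  if (!is.null(start_random_var) &&
      length(start_random_var) != length(data_random)) {
    stop("start_random_var must have one value per random effect")
  }
  if (!is.null(start_fixed_coef) &&
      length(start_fixed_coef) != length(data_fixed)) {
    stop("start_fixed_coef must have one value per fixed effect")
  }
  invisible(TRUE)
}

# Five fixed effects and one random effect: these start values pass the check.
check_start_values(data_fixed       = c(1, 0.3, -1.2, 0.8, 2.1),
                   data_random      = 1,
                   start_random_var = 2,
                   start_fixed_coef = rep(0, 5))
```

Running such a check once, before the stream starts, reports a length mismatch with a clear message instead of leaving it to surface inside the model update.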

Details

This is the main function to fit multilevel models in a data stream. The function takes in one observation, which consists of the id number of the unit, the data of the fixed effects covariates, the values of the random effects covariates, and the response or outcome, together with the current state of the model parameters. Currently the algorithm can fit models including fixed effects at levels 1 and 2 and random intercepts and slopes for continuous outcomes. The user manages storage and retrieval of each unit's parameters. This function is also used in sema_fit_set and sema_fit_df.
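Because the user manages storage and retrieval of each unit's parameters, some keyed store is needed around the call. The sketch below shows one way to do this in base R with an environment. The names `unit_store`, `get_unit`, and `put_unit` are illustrative, and `update_unit` is a hypothetical stand-in that only tracks a running mean per unit; it is not the SEMA update.

```r
# Keyed store for unit-level state; an environment gives lookup by name.
unit_store <- new.env()

get_unit <- function(id) {
  key <- as.character(id)
  if (exists(key, envir = unit_store, inherits = FALSE)) {
    get(key, envir = unit_store)
  } else {
    NULL  # unseen unit: NULL signals that state must be initialised
  }
}

put_unit <- function(id, state) {
  assign(as.character(id), state, envir = unit_store)
}

# Hypothetical stand-in for the model update (NOT the SEMA algorithm):
# keeps a running mean of y per unit without storing the observations.
update_unit <- function(state, y) {
  if (is.null(state)) state <- list(n = 0, mean = 0)
  state$n    <- state$n + 1
  state$mean <- state$mean + (y - state$mean) / state$n
  state
}

# Mimic a small stream of (id, y) pairs arriving one at a time.
stream <- data.frame(id = c(1, 2, 1, 2, 1), y = c(2, 10, 4, 20, 6))
for (i in seq_len(nrow(stream))) {
  state <- get_unit(stream$id[i])
  put_unit(stream$id[i], update_unit(state, stream$y[i]))
}
```

In real use, the call to `update_unit` would be replaced by a call to sema_fit_one, with the retrieved state passed as theta_j and the returned unit-level list written back to the store.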

Value

A list with two components: a list with the updated unit-level parameters for one unit, and a list with the updated global parameters. In the Examples these are accessed as unit and model, respectively.

See Also

sema_fit_set, sema_fit_df, summary_sema, ranef, store_resid_var, store_random_var, store_fixed_coef

Examples

## First we create a dataset, consisting of 1500 observations from 200
## units. The fixed effects have the coefficients 1, 2, 3, 4, and 5. The
## variances of the random effects equal 1, 4, and 9. Lastly, the
## residual variance equals 4:

test_data <- build_dataset(n = 1500, 
                           j = 200, 
                           fixed_coef = 1:5, 
                           random_coef_sd = 1:3, 
                           resid_sd = 2)
                           
## to simplify the indexing, we generate 2 vectors, one that indicates which
## columns are fixed effects variables and the other to indicate in which
## columns the random effects variables are
 
data_fixed_var <- c(3:7)
data_random_var <- c(3)

## a list where the unit output of sema_fit_one is stored
id_records <- list(NA)

## a vector which contains all observed units
id_vector <- c()

## an object in which the sema_fit_one output is stored; this should be NULL
## because that tells sema_fit_one to create the model statistics lists

m1 <- NULL

## the user can decide when output is printed to the console 
print <- FALSE

## mimic a data stream: 
for (i in seq_len(nrow(test_data))) {
  id <- test_data$id[i]
  if (!is.element(id, id_vector)) {
    ## first observation of this unit: register it, no unit statistics yet
    id_vector    <- c(id_vector, id)
    temp_id      <- which(id_vector == id)
    id_suff_stat <- NULL
  } else {
    ## known unit: retrieve its stored statistics
    temp_id      <- which(id_vector == id)
    id_suff_stat <- id_records[[temp_id]]
  }
  m1 <- sema_fit_one(data_fixed  = as.numeric(test_data[i, data_fixed_var]),
                     data_random = as.numeric(test_data[i, data_random_var]),
                     data_y      = test_data$y[i],
                     theta       = m1$model,
                     theta_j     = id_suff_stat,
                     id          = test_data$id[i],
                     print       = print)

  ## store the updated unit-level parameters for this unit's next observation
  id_records[[temp_id]] <- m1$unit
}

L-Ippel/SEMA documentation built on May 30, 2019, 8:23 a.m.