sample_data: Sample Data

View source: R/sample_data.R

sample_data    R Documentation

Sample Data

Description

Subsets data for use in a distributed hierarchical Bayesian algorithm for scalable target marketing.

Usage

sample_data(Data, Rate = 1)

Arguments

Data

(list) - A list of lists where each sublist contains either 'regdata' or 'lgtdata'.
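
As a rough illustration of the shape described above, the following sketch builds a small Data-like object by hand. The helper make_unit and the object toy_data are hypothetical names introduced here, and the fields inside each unit (y, X) simply mirror the Examples section; this is an assumption about a typical input, not a statement of everything the function accepts.

```r
# A minimal sketch of a 'Data'-shaped object: a list of sublists,
# each holding a 'regdata' list of regression units.
set.seed(1)

# Hypothetical helper: simulate one regression unit with an intercept column.
make_unit <- function(nobs = 5, nvar = 3) {
  X <- cbind(1, matrix(runif(nobs * (nvar - 1)), ncol = nvar - 1))
  beta <- rnorm(nvar)
  y <- X %*% beta + rnorm(nobs)
  list(y = y, X = X)
}

# Two sublists, holding 4 and 6 units respectively.
toy_data <- list(
  list(regdata = lapply(1:4, function(i) make_unit())),
  list(regdata = lapply(1:6, function(i) make_unit()))
)

# Inspect the top two levels of nesting.
str(toy_data, max.level = 2)
```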

Rate

(numeric) - Proportion of the data to be sampled. Defaults to 1.

Value

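Returns a list of the same structure as Data, but with length scaled by Rate (the example below compares length(Data1[[1]]$regdata) with the corresponding length after sampling).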
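
To make "scaled by Rate" concrete, here is a base-R sketch of proportional subsampling of one sublist. It is not the package's implementation: subsample_sublist is a hypothetical helper, and the rounding rule, sampling without replacement, and subsetting the rows of Z alongside regdata are all assumptions made for illustration.

```r
# Hypothetical helper: keep a random fraction 'rate' of the units in one sublist.
subsample_sublist <- function(sublist, rate = 1) {
  stopifnot(is.list(sublist$regdata), rate > 0, rate <= 1)
  n <- length(sublist$regdata)
  n_keep <- round(rate * n)              # rounding rule is an assumption
  keep <- sort(sample.int(n, n_keep))    # sample unit indices without replacement
  out <- list(regdata = sublist$regdata[keep])
  # Assumption: unit-level covariates, if present, are subset with the same indices.
  if (!is.null(sublist$Z)) {
    out$Z <- sublist$Z[keep, , drop = FALSE]
  }
  out
}

set.seed(42)
toy <- list(
  regdata = lapply(1:1000, function(i) list(y = rnorm(5))),
  Z = matrix(runif(1000 * 2), ncol = 2)
)
sampled <- subsample_sublist(toy, rate = 0.1)
length(sampled$regdata)  # 100 units kept out of 1000
nrow(sampled$Z)          # 100 matching rows of Z
```

Keeping the indices sorted preserves the original ordering of units, which makes it easier to line the sampled units up with their rows of Z.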

Author(s)

Federico Bumbaca, federico.bumbaca@colorado.edu

Examples


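# Generate hierarchical linear data
R=1000      # number of MCMC draws
nreg=10000  # number of regression units
nobs=5      # number of observations per unit
nvar=3      # number of columns of X, including the intercept
nz=2        # number of columns of Z

Z=matrix(runif(nreg*nz),ncol=nz)          # unit-level covariates
Z=t(t(Z)-apply(Z,2,mean))                 # demean each column of Z
Delta=matrix(c(1,-1,2,0,1,0), ncol = nz)  # nvar x nz coefficients mapping Z to betas
tau0=.1                                   # scale of the error variance
iota=c(rep(1,nobs))                       # intercept column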

## create arguments for rmixture
tcomps=NULL
a = diag(1, nrow=3)
tcomps[[1]] = list(mu=c(-5,0,0),rooti=a) 
tcomps[[2]] = list(mu=c(5, -5, 2),rooti=a)
tcomps[[3]] = list(mu=c(5,5,-2),rooti=a)
tpvec = c(.33,.33,.34)
ncomp=length(tcomps)
regdata=NULL
betas=matrix(double(nreg*nvar),ncol=nvar) 
tind=double(nreg) 
for (reg in 1:nreg) {
  # draw one value from the normal mixture for this unit
  tempout=bayesm::rmixture(1,tpvec,tcomps)
  if (is.null(Z)) {
    betas[reg,]=as.vector(tempout$x)
  } else {
    betas[reg,]=Delta%*%Z[reg,]+as.vector(tempout$x)
  }
  tind[reg]=tempout$z  # mixture component indicator
  X=cbind(iota,matrix(runif(nobs*(nvar-1)),ncol=(nvar-1)))
  tau=tau0*runif(1,min=0.5,max=1)  # unit-specific error variance
  y=X%*%betas[reg,]+sqrt(tau)*rnorm(nobs)
  regdata[[reg]]=list(y=y,X=X,beta=betas[reg,],tau=tau)
}

Prior1=list(ncomp=ncomp) 
keep=1
Mcmc1=list(R=R,keep=keep)
Data1=list(list(regdata=regdata,Z=Z))

# number of units before sampling
length(Data1[[1]]$regdata)

# sample 10% of the data and check the number of units again
data_s = sample_data(Data = Data1, Rate = 0.1)
length(data_s[[1]]$regdata)


scalablebayesm documentation built on April 3, 2025, 7:55 p.m.