EMmvpolySelfDesign: Two stage model with self design second stage matrix
In andrewhaoyu/TOP: Two-stage Polytomous Logistic regression

EMmvpolySelfDesign

R Documentation

Two stage model with self design second stage matrix

Description

Fits the two-stage model using a self-designed (user-specified) second-stage matrix.

Usage

EMmvpolySelfDesign(
  y,
  x.self.design,
  z.design,
  baselineonly = NULL,
  additive = NULL,
  pairwise.interaction = NULL,
  saturated = NULL,
  missingTumorIndicator = 888,
  z.all = NULL,
  delta0 = NULL,
  cutoff = 10
)

Arguments

`y`	The phenotype file. The first column is the case-control disease status; the remaining columns are the tumor characteristics.
`x.self.design`	The covariates to be modeled with the self-designed second-stage matrix.
`z.design`	The self-designed second-stage matrix.
`baselineonly`	The covariates to be adjusted for using the baseline-effect-only model. This model assumes that the odds ratio of each covariate is the same for all subtypes.
`additive`	The covariates to be adjusted for using the additive two-stage model.
`pairwise.interaction`	The covariates to be adjusted for using the pairwise-interaction two-stage model.
`saturated`	The covariates to be adjusted for using the saturated two-stage model. This model assumes that every subtype has its own odds ratio; it is equivalent to the polytomous model.
`missingTumorIndicator`	The value that marks a tumor characteristic as missing. In the example, missing tumor characteristics are coded as 888. Controls have no tumor characteristics, so theirs are coded as NA rather than 888 to distinguish them from cases with missing tumor characteristics.
`z.all`	If you want a different self-designed second-stage matrix for different covariates, you can directly construct the second-stage matrix for all of the covariates and supply it here.
`delta0`	The starting values for the second-stage parameters. By default, the empirical distribution of the subtypes is used.
`cutoff`	By default, the model removes subtypes with fewer than 10 cases; a different threshold can be set by changing `cutoff`. Setting the cutoff too low is not recommended, since the asymptotic convergence requires a sufficiently large sample size.
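
The coding conventions described for `y`, `missingTumorIndicator`, and `cutoff` can be illustrated with a small sketch in base R. The table `y.toy` and all its values are made up for illustration and are not the package data. The subtype count is done here on cases with fully observed characteristics only; how the package itself counts cases against `cutoff` is not stated in this page, so treat that part as an assumption.

```r
# Toy phenotype table (hypothetical values, not the package data).
# Column 1: case-control status; remaining columns: tumor characteristics.
y.toy <- data.frame(
  status = c(1, 1, 1, 1, 0, 0),
  ER     = c(1, 0, 888, 1, NA, NA),   # 888 marks a missing value in a case
  PR     = c(1, 0, 1, 1, NA, NA),     # NA marks a control
  HER2   = c(0, 1, 0, 0, NA, NA),
  grade  = c(2, 3, 1, 2, NA, NA)
)

is.case <- y.toy$status == 1
tumor <- y.toy[, -1]

# Controls should carry NA for every tumor characteristic.
controls.ok <- all(is.na(tumor[!is.case, ]))

# Cases with at least one characteristic coded as missing (888).
case.missing <- is.case & rowSums(tumor == 888, na.rm = TRUE) > 0

# Count cases per fully observed subtype and compare with a threshold.
observed <- is.case & !case.missing
subtype <- do.call(paste, c(tumor[observed, ], sep = "_"))
subtype.counts <- table(subtype)
cutoff.toy <- 2  # toy threshold; the function default is 10
kept <- names(subtype.counts)[subtype.counts >= cutoff.toy]
```

A check like `controls.ok` before fitting can catch controls that were accidentally coded with 888 instead of NA.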

Value

The result is a list containing 9 elements:
1. The second-stage parameters.
2. The covariance matrix of the second-stage parameters.
3. The second-stage parameters organized for the self-designed covariate.
4. The odds ratios of the self-designed subtypes.
5. The global association test and global heterogeneity test results (Wald-test based).
6. The first-stage parameters organized for the self-designed covariates.
7. The first-stage odds test results for all the subtypes.
8. The likelihood.
9. The AIC.
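
As a rough sketch of how the first two elements might be used together, the snippet below takes a mock list (made-up numbers standing in for a fitted model) and computes standard errors and per-parameter Wald statistics. Elements are accessed by position, following the order listed above; whether the returned list also has names is not stated here. This is a standard textbook computation, not necessarily how the package produces its own test results.

```r
# Mock result with hypothetical numbers, standing in for a fitted model.
fit.mock <- list(
  c(0.10, -0.25),                          # 1. second-stage parameters
  matrix(c(0.0025, 0, 0, 0.0100), 2, 2)    # 2. their covariance matrix
)

delta <- fit.mock[[1]]
covar <- fit.mock[[2]]

# Standard errors are the square roots of the covariance diagonal.
se <- sqrt(diag(covar))

# Per-parameter Wald z statistics and two-sided p-values.
z <- delta / se
p <- 2 * pnorm(-abs(z))
```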

Examples

#load the breast cancer example data
data(data, package="TOP")
#this is a simulated breast cancer data set
#there are around 5000 breast cancer cases and 5000 controls, i.e. people without the disease
data[1:5,]

#four tumor characteristics are included:
#ER (positive vs negative), PR (positive vs negative),
#HER2 (positive vs negative), and grade (ordinal 1, 2, 3)
#the phenotype file
y <- data[,1:5]
#generate the combinations of all the subtypes
#by default, all subtypes with fewer than 10 cases are removed
z.standard <- GenerateZstandard(y)
M <- nrow(z.standard) #M is the total number of first stage subtypes
#initialize a z.design matrix with M rows and 5 columns
#each row represents a first stage subtype
#each column represents an aggregated subtype
z.design <- matrix(0,M,5)
#define names for the five intrinsic subtypes
colnames(z.design) <- c("HR+_HER2-_lowgrade",
                       "HR+_HER2+",
                       "HR+_HER2-_highgrade",
                       "HR-_HER2+", 
                       "HR-_HER2-")
#to construct a self-designed second stage matrix,
#we need to find the corresponding first stage subtypes
#that belong to each aggregated subtype
#(HR+ means ER positive or PR positive; HR- means both negative)
#for the first subtype, HR+_HER2-_lowgrade
idx.1 <- which((z.standard[,1]==1|z.standard[,2]==1)
              &z.standard[,3]==0
              &(z.standard[,4]==1|z.standard[,4]==2))
z.design[idx.1,1] <- 1
#for second subtype HR+_HER2+
idx.2 <- which((z.standard[,1]==1|z.standard[,2]==1)
              &z.standard[,3]==1)
z.design[idx.2,2] <- 1
#for third subtype HR+_HER2-_highgrade
idx.3 <- which((z.standard[,1]==1|z.standard[,2]==1)
              &z.standard[,3]==0
              &z.standard[,4]==3)
z.design[idx.3,3] <- 1
#for fourth subtype HR-_HER2+
idx.4 <- which(z.standard[,1]==0&z.standard[,2]==0
              &z.standard[,3]==1)
z.design[idx.4,4] <- 1
#for fifth subtype HR-_HER2-
idx.5 <- which(z.standard[,1]==0&z.standard[,2]==0
              &z.standard[,3]==0)
z.design[idx.5,5] <- 1
#one SNP and one principal component (PC1) are the covariates
#drop=FALSE keeps each covariate as a one-column data frame
SNP <- data[,6,drop=FALSE]
PC1 <- data[,7,drop=FALSE]
model.3 <- EMmvpolySelfDesign(y,
                             x.self.design = SNP,
                             z.design = z.design,
                             additive=PC1,
                             missingTumorIndicator = 888)
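
The five aggregated subtypes defined above are mutually exclusive and together cover every ER/PR/HER2/grade combination, so each row of the design matrix should contain exactly one 1. The sketch below checks this in base R without the package. `z.grid` is a hypothetical stand-in for `z.standard` that enumerates all 24 combinations; the matrix returned by `GenerateZstandard(y)` may have fewer rows, since rare subtypes are removed, and its column layout is assumed here to be ER, PR, HER2, grade as in the example.

```r
# Hypothetical stand-in for z.standard: every combination of
# ER, PR, HER2 (0/1) and grade (1, 2, 3).
z.grid <- as.matrix(expand.grid(ER = 0:1, PR = 0:1, HER2 = 0:1, grade = 1:3))

hr.pos     <- z.grid[, "ER"] == 1 | z.grid[, "PR"] == 1  # HR+ = ER+ or PR+
her2.pos   <- z.grid[, "HER2"] == 1
high.grade <- z.grid[, "grade"] == 3

# Same five aggregated subtypes as in the example, built in one step.
# Multiplying by 1 turns the logical matrix into a 0/1 numeric matrix.
z.check <- cbind(
  "HR+_HER2-_lowgrade"  = hr.pos & !her2.pos & !high.grade,
  "HR+_HER2+"           = hr.pos & her2.pos,
  "HR+_HER2-_highgrade" = hr.pos & !her2.pos & high.grade,
  "HR-_HER2+"           = !hr.pos & her2.pos,
  "HR-_HER2-"           = !hr.pos & !her2.pos
) * 1

# Every first stage subtype should map to exactly one aggregated subtype.
stopifnot(all(rowSums(z.check) == 1))

# Number of first stage subtypes in each aggregated subtype.
colSums(z.check)
```

The same `rowSums(z.design) == 1` check can be applied to the `z.design` matrix built in the example before passing it to `EMmvpolySelfDesign`.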

andrewhaoyu/TOP documentation built on Aug. 29, 2022, 2:49 a.m.