ssCTPR: ssCTPR
In yingxi-kaylee/ssCTPR: Cross-Trait Penalized Regression using Summary Statistics

Obtains beta estimates for an elastic net regression problem, given summary statistics from one or more traits and a reference panel.

ssCTPR(
  cor,
  adj,
  bfile,
  lambda = exp(seq(log(0.001), log(0.1), length.out = 20)),
  shrink = 0.9,
  lambda_ct = c(0, 0.06109, 0.1392, 0.24257),
  thr = 1e-04,
  init = NULL,
  trace = 0,
  maxiter = 3000,
  blocks = NULL,
  keep = NULL,
  remove = NULL,
  extract = NULL,
  exclude = NULL,
  chr = NULL,
  mem.limit = 4 * 10^9,
  chunks = NULL,
  cluster = NULL
)
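
The default `lambda` in the usage above is a log-spaced grid. The following base-R sketch only inspects that grid and does not call `ssCTPR`; the name `ratio` is introduced here for illustration.

```r
# Default lambda grid copied from the usage: 20 values from 0.001 to 0.1
lambda <- exp(seq(log(0.001), log(0.1), length.out = 20))

length(lambda)   # number of tuning values tried
range(lambda)    # smallest and largest lambda

# On a log-spaced grid, consecutive values differ by a constant ratio
ratio <- lambda[-1] / lambda[-length(lambda)]
all.equal(ratio, rep(100^(1 / 19), 19))
```

A coarser or wider grid can be built the same way by changing the endpoints or `length.out`.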

`cor`	A matrix of SNP-wise correlations with the primary trait, derived from summary statistics, plus the betas of secondary traits, if any
`adj`	Adjacency coefficients
`bfile`	PLINK bfile (as character, without the .bed extension)
`lambda`	A vector of λs (the tuning parameter)
`shrink`	The shrinkage parameter s for the correlation matrix R
`lambda_ct`	A vector of λ_{ct}s (the cross-trait tuning parameter)
`thr`	Convergence threshold for β
`init`	Initial values for β as a vector of the same length as `cor`
`trace`	An integer controlling the amount of output generated.
`maxiter`	Maximum number of iterations
`blocks`	A vector to split the genome by blocks (coded as c(1,1,..., 2, 2, ..., etc.))
`keep`	Samples to keep
`remove`	Samples to remove
`extract`	SNPs to extract
`exclude`	SNPs to exclude
`chr`	A vector of chromosomes
`mem.limit`	Memory limit for the loaded genotype matrix. Other overheads are not included.
`chunks`	Splitting the genome into chunks for computation. Either an integer indicating the number of chunks or a vector (length equal to `cor`) giving the exact split.
`cluster`	A `cluster` object from the `parallel` package for parallel computing
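
As an illustration of the vector coding described for `blocks` and `chunks`, here is a small base-R sketch. The SNP count and block sizes are made up for the example, and nothing here calls `ssCTPR`.

```r
# Hypothetical setup: 12 SNPs split into three contiguous blocks of sizes 4, 3 and 5
block_sizes <- c(4, 3, 5)
blocks <- rep(seq_along(block_sizes), times = block_sizes)
blocks
# 1 1 1 1 2 2 2 3 3 3 3 3

# A per-SNP chunk assignment uses the same coding; here two chunks of six SNPs each
chunks <- rep(1:2, each = 6)

# Both vectors carry one entry per SNP
stopifnot(length(blocks) == sum(block_sizes), length(chunks) == sum(block_sizes))
```

Alternatively, `chunks` can be a single integer giving the number of chunks, as noted in the argument description.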

A function to find the β that minimizes

f(β)=β'Rβ - 2β'r + 2λ||β||_1 + λ_{ct}||β-s_{t}||^{2}

where

R=(1-s)X'X/n + sI

is a shrunken correlation matrix, with X being the standardized reference panel. s should take values in (0,1]. r is a vector of correlations. s_{t} is a vector of summary statistics from secondary traits, if any.

`keep` and `remove` can each take one of three formats: (1) a logical vector indicating which individuals to keep/remove, (2) a data.frame with two columns giving the FID and IID of the individuals to keep/remove (matching those in the .fam file), or (3) a character scalar giving the name of a text file containing the FID/IID. Likewise, `extract` and `exclude` can take one of the three formats, except that the FID/IID data.frame is replaced with a character vector of SNP ids (matching those in the .bim file).
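
To make the objective concrete, the sketch below evaluates f(β) on simulated toy data in base R. It is a direct transcription of the formula above, not the package's solver. The helper names `standardize` and `objective`, the toy data, and the choice of the population standard deviation for "standardized" (so that X'X/n has a unit diagonal) are all assumptions made for this example.

```r
set.seed(1)
n <- 50; p <- 5

# Assumed meaning of "standardized": centered columns with X'X/n having unit diagonal
standardize <- function(M) {
  M <- sweep(M, 2, colMeans(M))
  sweep(M, 2, sqrt(colMeans(M^2)), "/")
}

X <- standardize(matrix(rnorm(n * p), n, p))       # toy reference panel
y <- standardize(matrix(0.5 * X[, 1] + rnorm(n)))  # toy phenotype
r <- drop(crossprod(X, y)) / n                     # toy SNP-wise correlations
s_t <- c(0.1, 0, 0, -0.05, 0)                      # toy secondary-trait statistics

s <- 0.9
R <- (1 - s) * crossprod(X) / n + s * diag(p)      # shrunken correlation matrix

# f(beta) = beta'R beta - 2 beta'r + 2 lambda ||beta||_1 + lambda_ct ||beta - s_t||^2
objective <- function(beta, R, r, lambda, lambda_ct, s_t) {
  quad <- drop(t(beta) %*% R %*% beta)
  quad - 2 * sum(beta * r) + 2 * lambda * sum(abs(beta)) +
    lambda_ct * sum((beta - s_t)^2)
}

objective(rep(0, p), R, r, lambda = 0.01, lambda_ct = 0.06109, s_t = s_t)
objective(r,         R, r, lambda = 0.01, lambda_ct = 0.06109, s_t = s_t)
```

With s = 1 the matrix R reduces to the identity, so the reference panel no longer contributes to the quadratic term.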

A list with the following elements:

`lambda`	same as the lambda input
`beta`	A matrix of estimated coefficients
`conv`	A vector of convergence indicators: 1 means converged, 0 means not converged.
`pred`	=√(1-s)Xβ
`loss`	=(1-s)β'X'Xβ/n - 2β'r
`fbeta`	=β'Rβ - 2β'r + 2λ||β||_1
`sd`	The standard deviation of the reference panel SNPs
`shrink`	same as input
`lambda_ct`	same as input
`nparams`	Number of non-zero coefficients
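
The documented formulas for `pred`, `loss` and `fbeta` are linked: since R = (1-s)X'X/n + sI, it follows that `loss` equals ||pred||²/n - 2β'r and `fbeta` equals `loss` + sβ'β + 2λ||β||_1. The base-R sketch below checks this numerically on toy data for a single λ. It recomputes the quantities from the formulas (reading √(1-s)Xβ as √(1-s) times Xβ) and is not output from `ssCTPR`; all values are made up for illustration.

```r
set.seed(2)
n <- 40; p <- 4

# Toy standardized panel: centered columns with X'X/n having unit diagonal
X <- matrix(rnorm(n * p), n, p)
X <- sweep(X, 2, colMeans(X))
X <- sweep(X, 2, sqrt(colMeans(X^2)), "/")

s <- 0.9; lambda <- 0.01
beta <- c(0.2, 0, -0.1, 0)      # toy coefficient vector
r <- c(0.05, 0.01, -0.03, 0)    # toy correlations

R <- (1 - s) * crossprod(X) / n + s * diag(p)

pred  <- sqrt(1 - s) * drop(X %*% beta)
loss  <- sum(pred^2) / n - 2 * sum(beta * r)
fbeta <- drop(t(beta) %*% R %*% beta) - 2 * sum(beta * r) +
  2 * lambda * sum(abs(beta))
nparams <- sum(beta != 0)       # number of non-zero coefficients

# fbeta differs from loss by the ridge-like term and the L1 penalty
all.equal(fbeta, loss + s * sum(beta^2) + 2 * lambda * sum(abs(beta)))
```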