lassosum: Function to obtain LASSO estimates of a regression problem...


View source: R/lassosum.R

Description

Function to obtain LASSO estimates of a regression problem given summary statistics and a reference panel

lassosum: A package for carrying out LASSO regression using GWAS summary statistics

Usage

lassosum(cor, bfile,
  lambda = exp(seq(log(0.001), log(0.1), length.out = 20)),
  shrink = 0.9, thr = 1e-04, init = NULL, trace = 0,
  maxiter = 10000, blocks = NULL, keep = NULL, remove = NULL,
  extract = NULL, exclude = NULL, chr = NULL, mem.limit = 4 * 10^9,
  chunks = NULL, cluster = NULL)
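The default lambda in the usage above is a log-spaced grid. The following base R sketch only unpacks that expression to show its endpoints and spacing; the commented call at the end uses a placeholder bfile path and a placeholder vector r, neither of which comes from this page.

```r
# Default tuning grid from the usage: 20 values, log-spaced from 0.001 to 0.1
lambda <- exp(seq(log(0.001), log(0.1), length.out = 20))

# Endpoints of the grid
range(lambda)

# Neighbouring values differ by a constant multiplicative factor
ratio <- lambda[-1] / lambda[-length(lambda)]
ratio

# Hypothetical call (placeholder inputs, not run here):
# fit <- lassosum(cor = r, bfile = "path/to/reference", lambda = lambda)
```

A log-spaced grid puts more candidate values near the small end of the range, which is a common choice for LASSO tuning parameters.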

Arguments

cor

A vector of correlations (r)

bfile

PLINK bfile (as character, without the .bed extension)

lambda

A vector of λs (the tuning parameter)

shrink

The shrinkage parameter s for the correlation matrix R

thr

Convergence threshold for β

init

Initial values for β as a vector of the same length as cor

trace

An integer controlling the amount of output generated.

maxiter

Maximum number of iterations

blocks

A vector that splits the genome into blocks (coded as c(1,1,..., 2, 2, ..., etc.))

keep

Samples to keep (see Details for accepted formats)

remove

Samples to remove (see Details for accepted formats)

extract

SNPs to extract (see Details for accepted formats)

exclude

SNPs to exclude (see Details for accepted formats)

chr

A vector of chromosomes

mem.limit

Memory limit for the loaded genotype matrix. Other overheads are not counted towards this limit.

chunks

How to split the genome into chunks for computation: either an integer giving the number of chunks, or a vector (of the same length as cor) giving the exact split.

cluster

A cluster object from the parallel package for parallel computing
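As an illustration of the vector-valued arguments above, the sketch below builds toy values for blocks, chunks and keep in base R. The sizes and IDs are made up, and the lengths are assumptions: blocks is assumed to carry one label per SNP, and the logical form of keep (described under Details) is assumed to carry one entry per individual in the .fam file.

```r
# Hypothetical sizes, for illustration only
n_snps    <- 10
n_samples <- 6

# blocks: integer labels, with consecutive SNPs sharing a label
blocks <- rep(1:3, times = c(4, 3, 3))   # c(1,1,1,1, 2,2,2, 3,3,3)

# chunks given as an exact split: one entry per element of cor
chunks <- rep(1:2, each = n_snps / 2)

# keep as a logical vector: TRUE marks an individual to retain
keep_lgl <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)

# keep as a two-column data.frame of FID and IID (made-up IDs)
keep_df <- data.frame(FID = c("F1", "F2"), IID = c("I1", "I2"),
                      stringsAsFactors = FALSE)
```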

Details

A function to find the β that minimizes

f(β)=β'Rβ - 2β'r + 2λ||β||_1

where

R=(1-s)X'X/n + sI

is a shrunken correlation matrix, with X being the standardized reference panel genotype matrix and n the number of samples in X. s should take values in (0,1]. r is a vector of correlations.

keep and remove can each take one of three formats: (1) a logical vector indicating which individuals to keep/remove, (2) a data.frame with two columns giving the FID and IID of the individuals to keep/remove (matching those in the .fam file), or (3) a character scalar giving the name of a text file containing the FID/IID. Likewise, extract and exclude can take one of three formats, with the FID/IID data.frame replaced by a character vector of SNP ids (matching those in the .bim file).
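To make the objective concrete, here is a minimal base R sketch that builds R from a simulated X and minimizes f(β) by cyclic coordinate descent with soft-thresholding. This is not the package's implementation: the data are simulated, the names soft and f are introduced here for illustration, and the stopping rule (largest absolute change in β below thr) is an assumption about how a threshold like thr might be used.

```r
set.seed(1)

# Simulated stand-in for a reference panel (hypothetical sizes)
n <- 200
p <- 8
X <- scale(matrix(rnorm(n * p), n, p)) * sqrt(n / (n - 1))  # colSums(X^2) / n = 1

s      <- 0.9                    # shrinkage parameter
lambda <- 0.05                   # a single tuning parameter value
r      <- runif(p, -0.3, 0.3)    # made-up correlations

# Shrunken correlation matrix R = (1 - s) X'X / n + s I
R <- (1 - s) * crossprod(X) / n + s * diag(p)

# Objective f(beta) = beta' R beta - 2 beta' r + 2 lambda ||beta||_1
f <- function(beta) {
  drop(t(beta) %*% R %*% beta) - 2 * sum(beta * r) + 2 * lambda * sum(abs(beta))
}

# Soft-thresholding operator
soft <- function(z, l) sign(z) * pmax(abs(z) - l, 0)

beta <- rep(0, p)
thr  <- 1e-4
for (iter in 1:10000) {
  beta_old <- beta
  for (j in seq_len(p)) {
    # Minimize f over beta[j] with the other coordinates held fixed
    partial <- r[j] - sum(R[j, -j] * beta[-j])
    beta[j] <- soft(partial, lambda) / R[j, j]
  }
  if (max(abs(beta - beta_old)) < thr) break
}

f(beta)   # objective at the estimate
```

Because every eigenvalue of R is at least s, the quadratic part is strictly convex for s > 0, so the minimizer is unique even when the reference panel has fewer individuals than SNPs.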

Value

A list with the following elements:

lambda

same as the lambda input

beta

A matrix of estimated coefficients

conv

A vector of convergence indicators: 1 means converged, 0 means not converged.

pred

=√(1-s)Xβ

loss

=(1-s)β'X'Xβ/n - 2β'r

fbeta

=β'Rβ - 2β'r + 2λ||β||_1

sd

The standard deviation of the reference panel SNPs

shrink

same as input

nparams

Number of non-zero coefficients
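The formulas for pred, loss and fbeta are related: fbeta equals loss plus the ridge-like term sβ'β and the penalty 2λ||β||_1. The base R sketch below checks this on made-up inputs for a single λ. It reads pred as √(1-s) multiplied by Xβ, which is an interpretation chosen because it makes loss equal to pred'pred/n - 2β'r; the values of X, r and beta are simulated and are not output of lassosum.

```r
set.seed(2)

n <- 100
p <- 5
X <- scale(matrix(rnorm(n * p), n, p))   # stand-in for a standardized reference panel
s      <- 0.9
lambda <- 0.01
r    <- runif(p, -0.2, 0.2)              # made-up correlations
beta <- c(0.1, 0, -0.05, 0, 0.02)        # made-up coefficient vector

R <- (1 - s) * crossprod(X) / n + s * diag(p)

pred    <- sqrt(1 - s) * drop(X %*% beta)
loss    <- sum(pred^2) / n - 2 * sum(beta * r)
fbeta   <- drop(t(beta) %*% R %*% beta) - 2 * sum(beta * r) +
           2 * lambda * sum(abs(beta))
nparams <- sum(beta != 0)

# fbeta = loss + s * ||beta||_2^2 + 2 * lambda * ||beta||_1
all.equal(fbeta, loss + s * sum(beta^2) + 2 * lambda * sum(abs(beta)))
```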

Note

Missing genotypes are interpreted as having the homozygous A2 alleles in the PLINK files (same as the --fill-missing-a2 option in PLINK).
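The effect of this rule can be sketched on a toy matrix. The coding here is an assumption made for illustration: entries count copies of the A1 allele (0, 1 or 2) and NA marks a missing call, so a homozygous A2 genotype corresponds to 0. The package reads genotypes from the PLINK files directly, so this is not how it handles them internally.

```r
# Toy genotype matrix: rows are individuals, columns are SNPs.
# Entries are assumed to count copies of A1; NA marks a missing call.
G <- matrix(c(0, 1, NA,
              2, NA, 1,
              1, 0, 0), nrow = 3, byrow = TRUE)

# Treat missing calls as homozygous A2, i.e. zero copies of A1 under this coding
G_filled <- G
G_filled[is.na(G_filled)] <- 0

G_filled
```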


tshmak/lassosum documentation built on Sept. 24, 2020, 9:41 a.m.