View source: R/fineboost_normal.R
fineboost_normal | R Documentation |
This function uses a kernel-based FS-boost framework to find causal fine-mapped SNP sets from GWAS or QTL effect size data. It performs a regression Y = Xb + e where b is a sparse vector of coefficients with local signal clusters. Unlike other fine-mapping methods, our algorithm is model-free on the coefficients b.
fineboost_normal(X, Y, M = 1000, Lmax = 5, LD = NULL, step = 0.1, kern_tau = 0.01, method = "LS", kernel = "L2", stop_thresh = 1e-04, na.rm = FALSE, intercept = TRUE, standardize = TRUE, coverage = 0.95, clus_thresh = 0.1, min_within_LD = 0.5, min_between_LD = 0.25, min_clus_centrality = 0.5, nmf_try = 5, verbose = TRUE)
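As a minimal sketch of inputs that match the model Y = Xb + e described above, the following simulates correlated SNP-like columns and a sparse coefficient vector. The AR(1)-style correlation, the sample sizes and the positions of the nonzero effects are assumptions made purely for illustration. The call to fineboost_normal is left commented out because it requires the package to be installed; it uses only arguments shown in the usage line.

```r
set.seed(1)
N <- 200   # samples (illustrative)
P <- 50    # ordered features / SNPs (illustrative)

# Correlated neighbouring columns via an AR(1)-style recursion
# (an assumption standing in for real LD structure).
rho <- 0.9
Z <- matrix(rnorm(N * P), N, P)
X <- Z
for (j in 2:P) {
  X[, j] <- rho * X[, j - 1] + sqrt(1 - rho^2) * Z[, j]
}

# Sparse effect vector b with two nonzero entries (hypothetical positions).
b <- numeric(P)
b[c(10, 35)] <- c(1, -1)

# Response generated from Y = Xb + e.
Y <- as.vector(X %*% b + rnorm(N))

# With the package loaded, a fit could then be requested as:
# fit <- fineboost_normal(X, Y, M = 1000, Lmax = 5, step = 0.1)
```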
X |
The design matrix X (N times P) with samples/individuals along the rows and putatively correlated ordered features (SNPs) along the columns. |
Y |
The response vector of length N |
M |
The maximum number of boosting iterations to run. Default is 1000. |
Lmax |
The maximum number of local signal clusters to fit. Default is 5. |
LD |
The external LD matrix for the P features of interest. Defaults to NULL, in which case, in-sample LD is used. |
step |
The step size used in boosting iterations. Default is 0.1. |
kern_tau |
The smoothing intensity of the kernel averaging at each boosting iteration. Default set to 0.01. |
method |
The boosting update method: either 'LS' or 'FS', indicating the LS-Boost and FS-epsilon methods respectively. Default is 'LS'. |
kernel |
The kernel used for smoothing. One of 'L1', 'L2', 'epanechnikov' or 'prune'. 'L1' uses an L1-norm-based kernel, 'L2' uses an L2-norm-based kernel, 'epanechnikov' uses an Epanechnikov kernel, and 'prune' uses a uniform kernel on all SNPs in high LD with the optimal SNP at each boosting iteration. Default is 'L2'. |
stop_thresh |
The stopping threshold (a small number) for the objective function; once it is attained, the boosting iterations stop automatically. Default is 1e-04. |
na.rm |
Drop samples with missing values in Y from both Y and X. Default is FALSE. |
intercept |
Boolean; whether to fit an intercept in the model. Defaults to TRUE. |
standardize |
Boolean; whether to standardize the columns of X. Defaults to TRUE. |
coverage |
A number between 0 and 1 (close to 1) specifying the coverage of the estimated signal clusters. Default set to 0.95. |
min_within_LD |
The minimum value of LD permitted for SNPs within a local signal cluster. Default is 0.5. |
min_between_LD |
The maximum value of LD permitted between SNPs in two different local signal clusters. Default is 0.25. |
nmf_try |
The number of NMF initializations used to fix the confidence sets. Default is 5. |
verbose |
If |
min_clus_centrality |
The minimum value of cluster centrality required for a SNP to be included in a local signal cluster. Default is 0.5. |
min_abs_corr |
Minimum of absolute value of correlation allowed in a credible set. The default, 0.5, corresponds to squared correlation of 0.25, which is a commonly used threshold for genotype data in genetics studies. |
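To make the `method` and `step` arguments concrete, here is a sketch of a single LS-Boost or FS-epsilon update in their standard textbook form. It deliberately omits the kernel smoothing controlled by `kernel` and `kern_tau`, so it is not the package's implementation; `boost_step` is a hypothetical helper, and the columns of X are assumed to be standardized.

```r
# One plain boosting update (no kernel smoothing; illustrative only).
# X: standardized design matrix, r: current residual, b: current coefficients.
boost_step <- function(X, r, b, step = 0.1, method = c("LS", "FS")) {
  method <- match.arg(method)
  xr <- as.vector(crossprod(X, r))  # X_j' r for every column j
  xx <- colSums(X^2)                # squared column norms
  u <- xr / xx                      # univariate least-squares coefficients
  j <- which.max(xr^2 / xx)         # column giving the largest drop in RSS
  # LS-Boost shrinks the least-squares coefficient; FS-epsilon moves a fixed amount.
  delta <- if (method == "LS") step * u[j] else step * sign(xr[j])
  b[j] <- b[j] + delta
  list(b = b, r = r - X[, j] * delta, j = j)
}

# Toy data (all values are assumptions for illustration).
set.seed(2)
X <- scale(matrix(rnorm(100 * 10), 100, 10))
b_true <- c(2, rep(0, 9))
Y <- as.vector(X %*% b_true + rnorm(100))
Y <- Y - mean(Y)

state <- list(b = numeric(10), r = Y)
rss <- numeric(50)
for (m in 1:50) {
  state <- boost_step(X, state$r, state$b, step = 0.1, method = "LS")
  rss[m] <- sum(state$r^2)  # residual sum of squares after iteration m
}
```

With the LS update and a step size between 0 and 2, the residual sum of squares cannot increase from one iteration to the next, which is why a threshold such as `stop_thresh` on the objective is a natural stopping rule.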
A "fineboost" object with the following elements:
N |
|
P |
|
Lmax |
|
beta |
Y = Xb + e.
beta_path |
|
weights_path |
|
profile_loglik |
|
obj_path |
|
csets |
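The `coverage` and `min_within_LD` arguments can be read with the following sketch in mind. How the package actually constructs `csets` is not described on this page, so this shows only one common approach under stated assumptions: each cluster has non-negative SNP weights summing to 1, and LD is an absolute-correlation matrix. The helpers `coverage_set` and `min_abs_ld`, the weights and the toy LD matrix are all hypothetical.

```r
# Smallest set of SNPs whose weights sum to at least `coverage`.
coverage_set <- function(w, coverage = 0.95) {
  ord <- order(w, decreasing = TRUE)
  k <- which(cumsum(w[ord]) >= coverage)[1]
  sort(ord[seq_len(k)])
}

# Minimum absolute LD among all pairs of SNPs in a set.
min_abs_ld <- function(LD, idx) {
  if (length(idx) < 2) return(1)
  sub <- abs(LD[idx, idx])
  min(sub[upper.tri(sub)])
}

# Hypothetical cluster weights over 6 SNPs and a toy LD matrix
# that decays with distance between SNP positions.
w <- c(0.01, 0.02, 0.55, 0.38, 0.03, 0.01)
LD <- 0.8^abs(outer(1:6, 1:6, "-"))

cs <- coverage_set(w, coverage = 0.95)
purity <- min_abs_ld(LD, cs)

# The set would pass a within-cluster threshold of 0.5 if purity >= 0.5.
keep <- purity >= 0.5
```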