load_database: Load annotations.
In GMELab/GraBLD: Gradient Boosted and LD adjusted


The function loads the annotation predictor variables into the workspace.

load_database(annotation_file, pos = 2)

`annotation_file`	the path to a `data.frame` of annotation variables used to update the `beta` values through gradient boosted regression tree models. The first column must contain the SNP IDs; the remaining columns can hold additional annotation information. The SNP IDs must be in the same order as those in `beta`.
`pos`	an integer indicating which column of the data matrix `annotations` holds the corresponding consortium value, and additionally which columns should also be included.

The annotation matrix provides the predictor variables used to update the weights of the polygenic gene score via gradient boosted regression trees. The data.frame should have at least two columns: the first column is SNP_ID, and the remaining columns are the adjusted consortium regression coefficients or summary statistics. It is recommended to adjust the consortium regression coefficients by the minor allele frequency of the SNP:


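   # standard deviation of each SNP, computed from its minor allele frequency (column 5 of MAF)
   SNP_SD = sqrt(2 * as.numeric(MAF[,5]) * (1 - as.numeric(MAF[,5])))
   # scale the consortium regression coefficients by the SNP standard deviation
   beta_adj = as.numeric(beta) * SNP_SD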
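
Written out as a formula, the adjustment above reads as follows. The symbols are introduced here for illustration: p is the minor allele frequency of a SNP and G its genotype coded as 0, 1 or 2 copies of the minor allele. The square-root term is the genotype standard deviation under the assumption of Hardy-Weinberg equilibrium.

```latex
\mathrm{SD}(G) = \sqrt{2\,p\,(1 - p)}, \qquad
\beta_{\mathrm{adj}} = \beta \cdot \mathrm{SD}(G)
```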

For any one trait, at least one column containing the corresponding adjusted beta from the consortium is required. For instance, when working on BMI, at least the adjusted regression coefficient for association with BMI in a consortium study should be provided. Additional annotations, such as regression coefficients for related traits or SNP functional annotations, can also be included.
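
The layout described above can be sketched with toy data. This is only an illustration under stated assumptions: the SNP IDs, the coefficient values and every column name in `MAF` other than its fifth column position are made up, and the resulting `annotations` object shows the minimal two-column form (SNP ID plus one adjusted consortium coefficient), not output produced by the package.

```r
# Toy inputs; all names and values are hypothetical.
# Only the position of the frequency column (5th) follows the text above.
MAF <- data.frame(
  CHR = c(1, 1, 2),
  SNP = c("rs1", "rs2", "rs3"),
  A1  = c("A", "C", "G"),
  A2  = c("G", "T", "A"),
  MAF = c(0.10, 0.25, 0.50),
  stringsAsFactors = FALSE
)

# Consortium regression coefficients, in the same SNP order as MAF
beta <- c(0.020, -0.015, 0.008)

# Adjust the coefficients by the SNP standard deviation
SNP_SD   <- sqrt(2 * as.numeric(MAF[, 5]) * (1 - as.numeric(MAF[, 5])))
beta_adj <- as.numeric(beta) * SNP_SD

# First column: SNP IDs; second column: adjusted consortium coefficient.
# Further annotation columns could be appended after these two.
annotations <- data.frame(
  SNP_ID   = MAF$SNP,
  beta_adj = beta_adj,
  stringsAsFactors = FALSE
)
```

Keeping the rows of `annotations` in the same order as `beta` matters here, because the SNP IDs are expected to line up with the coefficients being updated.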

A data frame of predictor variables that can be used to update SNP weights.

GMELab/GraBLD documentation built on May 4, 2019, 3:20 p.m.