get_data_num: Normalize the regression coefficients and annotation data
In GMELab/GraBLD: Gradient Boosted and LD adjusted


The function processes the univariate regression coefficients (betas) and the annotation data matrix, and combines them into the normalized data used for estimating the optimal boosted regression trees model.

get_data_num(betas, annotations, pos = 2, pos_sign = 3, abs_effect = 2:5, normalize = FALSE)

`betas`	a matrix of regression coefficients from the association analysis in the target population. The first column is the chromosome of each SNP, and the column holding the regression coefficient is specified by `pos` (default 2). SNP IDs or other information may be present as additional columns. The univariate association beta file must be prepared without a header row. The betas are generated from the model `coef(summary(lm(pheno_data ~ geno[,j])))[2,1]`. Both the genotype data and the phenotype data must be standardized across individuals to have mean 0 and variance 1.
`annotations`	a matrix of annotation variables used to update the `betas` values through gradient boosted regression tree models. Usually, these can be taken from the summary-level test statistics of matching traits from genome-wide consortia available online. The first column of the matrix must contain the SNP IDs, and the remaining columns may hold additional annotation information. The SNP IDs must be in the same order as those in `betas`.
`pos`	an integer indicating which column of the data matrix `annotations` holds the corresponding consortium value, along with any additional columns that should also be included.
`pos_sign`	an integer indicating which column of the data matrix `annotations` should be used to update the sign of the univariate regression coefficient. Usually, it is set to be the consortium univariate regression coefficient of the same trait.
`abs_effect`	a vector of integers indicating which columns of the data matrix `annotations` should be used as absolute effects, by taking their absolute values. This is useful, for example, when only the strength of the effect rather than its direction is informative for improving the polygenic score weights.
`normalize`	a logical value indicating whether the univariate regression coefficients in `betas` should be normalized with respect to the consortium values in `annotations`.
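
The `betas` input described above could be prepared along the following lines. This is only a minimal sketch on simulated data: the sample size `n`, the SNP count `m`, the simulated genotypes and phenotype, and the column names `chr` and `beta_hat` are all assumptions made for illustration and are not part of the package. It standardizes genotypes and phenotype, fits the documented univariate model for each SNP, and places the chromosome in column 1 and the coefficient in column 2.

```r
set.seed(1)
n <- 200  # number of individuals (hypothetical)
m <- 5    # number of SNPs (hypothetical)

# Simulated genotypes coded 0/1/2 and a simulated phenotype
geno <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)
pheno_data <- 0.2 * geno[, 1] + rnorm(n)

# Standardize genotypes (column-wise) and phenotype to mean 0, variance 1
geno <- scale(geno)
pheno_data <- as.numeric(scale(pheno_data))

# Univariate slope for each SNP, using the model from the `betas` description
beta_hat <- sapply(seq_len(m), function(j) {
  coef(summary(lm(pheno_data ~ geno[, j])))[2, 1]
})

# Column 1: chromosome (all set to 1 here); column 2: regression coefficient
chr <- rep(1, m)
betas <- cbind(chr, beta_hat)
```

Because both variables are standardized, each slope equals the sample correlation between the SNP and the phenotype, which gives a quick sanity check on the prepared coefficients.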