BLUE_estimates_BT: BLUE_estimates_BT function

View source: R/BLUE_estimates_BT.R

BLUE_estimates_BT R Documentation

BLUE_estimates_BT function

Description

Estimates individual-level polygenic risk scores (PRS) for binary traits, together with their uncertainty, using a frequentist approach. The implementation fits Firth's bias-reduced logistic regression to the discovery sample, extracts the covariance matrix of the estimated coefficients, and uses the delta method to derive the variance and confidence interval of each PRS.

Usage

BLUE_estimates_BT(
  discovery_pheno,
  discovery_geno_mat,
  target_pheno,
  target_geno_mat,
  significance_level = 0.05,
  max_iterations = 100
)

Arguments

discovery_pheno

Character. Path to the phenotype file for the discovery dataset. Assumes no header and that the binary trait is in the third column.

discovery_geno_mat

Character. Path to the genotype matrix file for the discovery dataset. Assumes no header.

target_pheno

Character. Path to the phenotype file for the target dataset. Assumes no header and individual IDs in the second column.

target_geno_mat

Character. Path to the genotype matrix file for the target dataset. Assumes no header.

significance_level

Numeric. Significance level (alpha) for the confidence intervals; the intervals have nominal coverage 1 - alpha (e.g., 0.05 gives 95% CIs). Default is 0.05.

max_iterations

Integer. Maximum number of iterations allowed in Firth logistic regression. Default is 100.
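
The file layouts described in the arguments above can be illustrated with a small sketch. It assumes whitespace-delimited text files that read.table() can parse; the delimiter, the meaning of the first two phenotype columns, the one-row-per-individual genotype layout, and all toy values are assumptions made for illustration and are not taken from the package.

```r
# Toy phenotype file: no header, binary trait in column 3.
# Columns 1-2 are assumed here to hold family and individual IDs.
pheno_path <- tempfile(fileext = ".txt")
writeLines(c("F1 I1 0", "F2 I2 1", "F3 I3 1"), pheno_path)

# Toy genotype matrix file: no header, one row per individual,
# one column per SNP (allele counts 0/1/2). This layout is an assumption.
geno_path <- tempfile(fileext = ".txt")
writeLines(c("0 1 2", "1 1 0", "2 0 1"), geno_path)

pheno <- read.table(pheno_path, header = FALSE)
geno  <- as.matrix(read.table(geno_path, header = FALSE))

y   <- pheno[[3]]                # binary trait (third column)
ids <- as.character(pheno[[2]])  # individual IDs (second column)
```

Checking dim(geno) against length(y) before calling the function is a cheap way to catch a header line that was left in one of the files.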

Details

The function fits a Firth logistic regression model to the discovery set using the logistf package, which reduces small-sample bias in the coefficient estimates. It extracts the SNP effect estimates and their covariance matrix, and propagates this uncertainty to the individual-level PRS in the target dataset via the delta method. Confidence intervals are derived assuming the PRS estimate is approximately normally distributed.
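
The discovery step might look roughly like the sketch below. It is not the package's code: the simulated data, the variable names, and the choice to drop the intercept are assumptions, and the mapping of max_iterations onto logistf.control(maxit = ...) is a guess at how the argument could be used.

```r
# Simulated discovery data (hypothetical; not the package's example files).
set.seed(1)
n <- 200
m <- 3
G <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n)  # allele counts
colnames(G) <- paste0("snp", seq_len(m))
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.4 * G[, 1]))
dat <- data.frame(y = y, G)

# The fit only runs if logistf is installed.
if (requireNamespace("logistf", quietly = TRUE)) {
  fit <- logistf::logistf(
    y ~ snp1 + snp2 + snp3,
    data    = dat,
    control = logistf::logistf.control(maxit = 100)  # cf. max_iterations
  )
  beta_hat  <- coef(fit)[-1]      # SNP effects; dropping the intercept is an assumption
  Sigma_hat <- fit$var[-1, -1]    # covariance matrix of the SNP effects
}
```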

Missing or non-estimable coefficients and variances are set to zero.
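
To make the variance propagation concrete, here is a minimal sketch assuming the PRS is the linear score PRS_i = sum_j g_ij * beta_j. Under that assumption the gradient with respect to beta is the genotype vector g_i, so the delta method gives Var(PRS_i) = g_i' Sigma g_i. All values and names (beta_hat, Sigma_hat, G_target) are made up for illustration; the package's internal computation may differ.

```r
# Hypothetical SNP effects and covariance matrix (made-up values).
beta_hat  <- c(0.20, NA, -0.10)          # second effect is not estimable
Sigma_hat <- matrix(c(0.04, NA, 0.01,
                      NA,   NA, NA,
                      0.01, NA, 0.09), nrow = 3, byrow = TRUE)

# Missing or non-estimable coefficients and (co)variances are set to zero.
beta_hat[is.na(beta_hat)]   <- 0
Sigma_hat[is.na(Sigma_hat)] <- 0

# Target genotypes: rows are individuals, columns are SNPs.
G_target <- matrix(c(0, 1, 2,
                     1, 2, 0), nrow = 2, byrow = TRUE)

prs <- drop(G_target %*% beta_hat)

# Row-wise quadratic form g_i' Sigma g_i, without building an n x n matrix.
prs_var <- rowSums((G_target %*% Sigma_hat) * G_target)

# Normal-theory confidence limits at significance level alpha.
alpha <- 0.05
z     <- qnorm(1 - alpha / 2)
out <- data.frame(
  PRS         = prs,
  Variance    = prs_var,
  Lower_Limit = prs - z * sqrt(prs_var),
  Upper_Limit = prs + z * sqrt(prs_var)
)
out
```

Setting a non-estimable SNP's row and column of the covariance matrix to zero means that SNP contributes nothing to either the score or its variance, so the resulting interval can be narrower than one that accounted for the missing effect.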

Value

A data frame with the following columns:

IID

Individual identifier (from the target phenotype file).

PRS

Estimated polygenic risk score for each individual.

Variance

Estimated variance of the PRS.

Lower_Limit

Lower bound of the (1 - significance_level) confidence interval for the PRS.

Upper_Limit

Upper bound of the (1 - significance_level) confidence interval for the PRS.
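
As a rough illustration of working with a result in this shape, the sketch below builds a mock data frame with the documented columns (all values are made up, not output from the package) and ranks individuals by how precisely their PRS is estimated. The SE and Width columns are hypothetical additions, not part of the returned value.

```r
# Mock result with the documented columns (values are made up).
results <- data.frame(
  IID         = c("I1", "I2"),
  PRS         = c(-0.20, 0.20),
  Variance    = c(0.36, 0.04),
  Lower_Limit = c(-1.376, -0.192),
  Upper_Limit = c(0.976, 0.592),
  stringsAsFactors = FALSE
)

# Standard error and interval width per individual.
results$SE    <- sqrt(results$Variance)
results$Width <- results$Upper_Limit - results$Lower_Limit

# Individuals ordered from the narrowest to the widest interval.
ranked <- results[order(results$Width), c("IID", "PRS", "SE", "Width")]
ranked
```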

Examples

  bpd <- system.file("Bpd_0_1.txt", package = "iPRSue", mustWork = TRUE)
  bpt <- system.file("Bpt.txt", package = "iPRSue", mustWork = TRUE)
  gd  <- system.file("Gd.txt",  package = "iPRSue", mustWork = TRUE)
  gt  <- system.file("Gt.txt",  package = "iPRSue", mustWork = TRUE)

  results <- BLUE_estimates_BT(
    discovery_pheno    = bpd,
    discovery_geno_mat = gd,
    target_pheno       = bpt,
    target_geno_mat    = gt,
    significance_level = 0.05,
    max_iterations     = 100
  )
  head(results)


iPRSue documentation built on Sept. 10, 2025, 10:39 a.m.