get_matching_snps: Match association results across datasets by variant key

View source: R/foresttopr.R

get_matching_snps    R Documentation

Match association results across datasets by variant key

Description

get_matching_snps() aligns rows from multiple variant-level association result tables using a genomic variant key constructed from chromosome and position (e.g. "CHR:POS"). Effect estimates and confidence intervals are standardized across datasets, while row labels are taken from the reference dataset (the first element of dfs).

The function supports allele-aware matching and automatically corrects effect directions when reference and alternate alleles are flipped between datasets.

This function is typically used internally by foresttopr when variant-level columns are detected, but may also be called directly to prepare matched variant-level effect tables.

Usage

get_matching_snps(
  dfs,
  labels,
  gene_col = "ID",
  label_col = NULL,
  effect_type = c("OR", "beta")
)

Arguments

dfs

A list of data frames containing variant-level association results. Each data frame must contain chromosome and position columns (CHROM, POS) and effect size information.

labels

A character vector of dataset labels of the same length as dfs. These labels identify the source of each matched effect estimate.

gene_col

Character scalar specifying a column name in the reference dataset used for labeling rows (e.g. gene or variant identifier). Defaults to "ID". This column is not used for matching.

label_col

Optional character scalar specifying an alternative column name in the reference dataset to use as a human-readable row label. If NULL, gene_col is used for labeling.

effect_type

Character scalar specifying the effect scale to use: either "OR" (odds ratio) or "beta" (regression coefficient). The value is matched case-insensitively. Effect estimates are converted between scales as needed.
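The relationship between the two scales can be sketched in a few lines of base R. This is only an illustration of the standard conversion (OR = exp(beta), beta = log(OR)); convert_effect() is a hypothetical helper, not a topr function, and the package's internal conversion may differ.

```r
# Hypothetical helper (not part of topr): convert an effect estimate
# between the odds-ratio scale and the beta (log-odds) scale.
convert_effect <- function(x, from = c("OR", "beta"), to = c("OR", "beta")) {
  # Upper-case the first value so "or", "Or" and "BETA" are all accepted.
  from <- match.arg(toupper(from[1]), c("OR", "BETA"))
  to   <- match.arg(toupper(to[1]),   c("OR", "BETA"))
  if (from == to) return(x)
  # beta -> OR is exponentiation; OR -> beta is the natural log.
  if (from == "BETA") exp(x) else log(x)
}

convert_effect(0, from = "beta", to = "OR")  # a beta of 0 is an odds ratio of 1
convert_effect(2, from = "or", to = "beta")  # log(2)
```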

Details

Variants are matched across datasets using a key constructed from chromosome and position. Reference and alternate alleles are compared between datasets, and effect estimates are automatically flipped when allele orientation differs.
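The matching and flipping logic described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the column names REF, ALT and BETA and all values are assumptions made for the example, and only CHROM and POS are documented as required.

```r
# Toy reference dataset (all values are made up for illustration).
ref_df <- data.frame(
  CHROM = c("1", "2"), POS = c(1000L, 2000L),
  REF = c("A", "C"), ALT = c("G", "T"),
  BETA = c(0.10, -0.20),
  stringsAsFactors = FALSE
)
# Second dataset: variant 2:2000 has REF/ALT swapped, 3:3000 has no match.
other_df <- data.frame(
  CHROM = c("1", "2", "3"), POS = c(1000L, 2000L, 3000L),
  REF = c("A", "T", "G"), ALT = c("G", "C", "A"),
  BETA = c(0.12, 0.25, 0.05),
  stringsAsFactors = FALSE
)

# Build a "CHROM:POS" key in each table.
make_key <- function(df) paste(df$CHROM, df$POS, sep = ":")
ref_df$key   <- make_key(ref_df)
other_df$key <- make_key(other_df)

# Inner join on the key; shared column names get .ref / .other suffixes.
m <- merge(ref_df, other_df, by = "key", suffixes = c(".ref", ".other"))

# Compare allele orientation against the reference dataset.
same    <- m$REF.ref == m$REF.other & m$ALT.ref == m$ALT.other
flipped <- m$REF.ref == m$ALT.other & m$ALT.ref == m$REF.other

# Keep the effect as-is when alleles agree, negate it when they are swapped,
# and set it to NA when the alleles are incompatible.
m$BETA.aligned <- ifelse(same, m$BETA.other,
                         ifelse(flipped, -m$BETA.other, NA_real_))
m[, c("key", "BETA.ref", "BETA.other", "BETA.aligned")]
```

On the odds-ratio scale the equivalent flip is 1 / OR rather than a sign change.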

Confidence intervals are taken from explicit bounds when these are available; otherwise they are derived from standard errors or, failing that, from p-values.
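One common way to obtain such intervals when no explicit bounds are given is a Wald (normal approximation) interval on the beta scale. The sketch below assumes that approach; derive_ci() is a hypothetical helper, and the formulas actually used by the package are not shown in this page.

```r
# Hypothetical helper (not part of topr): Wald confidence interval for a
# single beta, using the standard error if given, else a two-sided p-value.
derive_ci <- function(beta, se = NA_real_, p = NA_real_, level = 0.95) {
  z_crit <- qnorm(1 - (1 - level) / 2)   # about 1.96 for a 95% interval
  if (is.na(se) && !is.na(p)) {
    # Back out the standard error from the p-value: |beta| / se = z,
    # where z is the normal quantile corresponding to p.
    se <- abs(beta) / qnorm(1 - p / 2)
  }
  # Returns NA bounds when neither se nor p is available.
  c(lcl = beta - z_crit * se, ucl = beta + z_crit * se)
}

derive_ci(0.2, se = 0.05)             # interval from a standard error
derive_ci(0.2, p = 2 * pnorm(-4))     # same interval recovered from a p-value
exp(derive_ci(0.2, se = 0.05))        # bounds on the odds-ratio scale
```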

The returned table contains one row per matched variant per dataset.

Value

A data frame containing matched variant-level effect estimates with the following columns:

key

Variant key constructed from chromosome and position.

label

Row label used for display purposes.

set

Dataset identifier corresponding to labels.

or

Effect estimate on the requested scale.

p

P-value associated with the effect estimate.

lcl

Lower confidence interval bound.

ucl

Upper confidence interval bound.
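Because the result is in long format (one row per variant per dataset), it can be reshaped for side-by-side comparison. The sketch below builds a small data frame by hand with the documented columns; the values and dataset labels are invented for illustration and are not output from get_matching_snps().

```r
# Hand-built stand-in for a returned table (made-up values).
res <- data.frame(
  key   = c("1:1000", "1:1000", "2:2000", "2:2000"),
  label = c("GENE1", "GENE1", "GENE2", "GENE2"),
  set   = c("StudyA", "StudyB", "StudyA", "StudyB"),
  or    = c(1.10, 1.13, 0.82, 0.78),
  p     = c(0.010, 0.004, 0.020, 0.001),
  lcl   = c(1.02, 1.04, 0.69, 0.67),
  ucl   = c(1.19, 1.23, 0.97, 0.91),
  stringsAsFactors = FALSE
)

# Sanity check: each estimate should lie inside its confidence interval.
all(res$lcl <= res$or & res$or <= res$ucl)

# One row per variant, one effect column per dataset.
wide <- reshape(res[, c("key", "set", "or")],
                idvar = "key", timevar = "set", direction = "wide")
wide
```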

See Also

foresttopr, get_matching_genes


topr documentation built on April 13, 2026, 5:07 p.m.
