get_matching_genes: Match association results across datasets by a key column

View source: R/foresttopr.R

get_matching_genes    R Documentation

Match association results across datasets by a key column

Description

get_matching_genes() aligns rows from multiple association result tables using a shared key column (e.g. gene or feature identifier). Effect estimates and confidence intervals are standardized across datasets, while row labels are taken exclusively from the reference dataset (the first element of dfs).

This function is typically used internally by foresttopr, but may be useful on its own when preparing matched effect tables for visualization or downstream analysis.

Usage

get_matching_genes(
  dfs,
  labels,
  gene_col = NULL,
  label_col = NULL,
  effect_type = c("OR", "beta")
)

Arguments

dfs

A list of data frames containing association results. Each data frame must contain a key column and effect size information.

labels

A character vector of dataset labels of the same length as dfs. These labels identify the source of each matched effect estimate.

gene_col

Character scalar specifying the column name used to match rows across datasets (e.g. gene identifier). If NULL, a suitable column is inferred from common gene identifier names.

label_col

Optional character scalar specifying the column name in the reference dataset (the first element of dfs) to use as a human-readable row label. If NULL, the matching key (gene_col) is used for labeling.

effect_type

Character scalar specifying the effect scale to use: either "OR" (odds ratio) or "beta" (regression coefficient). The value is matched case-insensitively. Effect estimates are converted between scales as needed.
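The scale handling described for effect_type can be illustrated with a minimal sketch. It assumes the usual relationship between the two scales (OR = exp(beta), beta = log(OR)). The helpers normalize_effect_type() and convert_effect() are hypothetical names introduced here for illustration; they are not part of topr and may differ from what the package does internally.

```r
# Hypothetical helper (not part of topr): resolve the requested scale,
# ignoring case, to one of the two documented values.
normalize_effect_type <- function(effect_type = c("OR", "beta")) {
  choices <- c("OR", "beta")
  effect_type <- effect_type[1]
  idx <- match(tolower(effect_type), tolower(choices))
  if (is.na(idx)) stop("effect_type must be \"OR\" or \"beta\"")
  choices[idx]
}

# Hypothetical helper (not part of topr): convert estimates between scales,
# assuming OR = exp(beta) and beta = log(OR).
convert_effect <- function(x, from, to) {
  from <- normalize_effect_type(from)
  to <- normalize_effect_type(to)
  if (from == to) return(x)
  if (from == "beta") exp(x) else log(x)
}

# A beta of log(2) corresponds to an odds ratio of 2 (up to rounding).
convert_effect(log(2), from = "beta", to = "or")

# Odds ratios converted to log odds ratios; an OR of 1 maps to 0.
convert_effect(c(0.5, 1, 2), from = "OR", to = "BETA")
```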

Details

Rows are matched across datasets using the key column specified by gene_col. The keys present in the reference dataset define the set of rows retained; keys that appear only in other datasets are dropped. For each dataset, confidence intervals are taken from explicit bounds when available, and are otherwise derived from standard errors or, failing that, from p-values.
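One way such a fallback could work is sketched below for a single estimate on the beta (log) scale. This is an illustration under stated assumptions, not the package's implementation: it assumes a normal approximation, a 95% level by default, a two-sided p-value, and a fallback order of bounds, then standard error, then p-value. The function derive_ci() and its argument names are hypothetical.

```r
# Hypothetical helper (not part of topr): confidence interval for one
# estimate on the beta (log) scale.
# Assumed fallback order: explicit bounds, then standard error, then p-value.
derive_ci <- function(beta, lcl = NA_real_, ucl = NA_real_,
                      se = NA_real_, p = NA_real_, level = 0.95) {
  z_crit <- qnorm(1 - (1 - level) / 2)

  # 1. Explicit bounds win when both are present.
  if (!is.na(lcl) && !is.na(ucl)) {
    return(c(lcl = lcl, ucl = ucl))
  }

  # 2. No SE: back it out of a two-sided p-value, using z = |beta| / se.
  if (is.na(se) && !is.na(p) && p > 0 && p < 1) {
    se <- abs(beta) / qnorm(p / 2, lower.tail = FALSE)
  }

  # 3. Nothing usable: return missing bounds.
  if (is.na(se)) {
    return(c(lcl = NA_real_, ucl = NA_real_))
  }

  c(lcl = beta - z_crit * se, ucl = beta + z_crit * se)
}

# Interval from a standard error, then moved to the odds-ratio scale.
ci_beta <- derive_ci(beta = 0.5, se = 0.1)
ci_or <- exp(ci_beta)
```

Because the interval is built on the log scale and exponentiated afterwards, the resulting odds-ratio interval is asymmetric around the point estimate, which is the usual behaviour for ratio measures.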

The returned table contains one row per matched key per dataset.
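The matching rule described above (reference keys define the retained rows, labels come only from the reference dataset, one row per key per dataset) can be sketched in base R as follows. The helper match_to_reference(), the toy tables, and their column names (gene, name, or) are made up for illustration; the real function also handles scale conversion, p-values and confidence bounds, which are omitted here.

```r
# Toy inputs; column names and values are illustrative only.
study1 <- data.frame(gene = c("A", "B", "C"),
                     name = c("Gene A", "Gene B", "Gene C"),
                     or = c(1.2, 0.8, 1.5),
                     stringsAsFactors = FALSE)
study2 <- data.frame(gene = c("B", "C", "D"),
                     or = c(0.9, 1.4, 2.0),
                     stringsAsFactors = FALSE)

# Hypothetical helper (not part of topr): keep only keys found in the
# reference table (dfs[[1]]) and take row labels from that table alone.
match_to_reference <- function(dfs, labels, key_col, label_col = NULL) {
  ref <- dfs[[1]]
  if (is.null(label_col)) label_col <- key_col  # fall back to the key
  pieces <- lapply(seq_along(dfs), function(i) {
    df <- dfs[[i]]
    df <- df[df[[key_col]] %in% ref[[key_col]], , drop = FALSE]
    idx <- match(df[[key_col]], ref[[key_col]])
    data.frame(key = df[[key_col]],
               label = ref[[label_col]][idx],
               set = rep(labels[i], nrow(df)),
               or = df$or,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

matched <- match_to_reference(list(study1, study2),
                              labels = c("study1", "study2"),
                              key_col = "gene", label_col = "name")
```

In this sketch, key "D" is dropped because it is absent from the reference table, and key "A" contributes a single row because only the reference table contains it.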

Value

A data frame containing matched effect estimates with the following columns:

key

Matching key used to align rows across datasets.

label

Row label used for display purposes.

set

Dataset identifier corresponding to labels.

or

Effect estimate on the requested scale.

p

P-value associated with the effect estimate.

lcl

Lower confidence interval bound.

ucl

Upper confidence interval bound.

See Also

foresttopr


topr documentation built on April 13, 2026, 5:07 p.m.