format_data: Format input data

format_data    R Documentation

Format input data

Description

Reads in and formats input data. Checks and organises columns for Slope-Hunter analyses. Infers p-values from beta and se when possible.

Usage

format_data(
  dat,
  type = "incidence",
  snps = NULL,
  snp_col = "SNP",
  beta_col = "BETA",
  se_col = "SE",
  pval_col = "PVAL",
  eaf_col = "EAF",
  effect_allele_col = "EA",
  other_allele_col = "OA",
  gene_col = "GENE",
  chr_col = "CHR",
  pos_col = "POS",
  min_pval = 1e-200,
  log_pval = FALSE
)

Arguments

dat

Data frame. Must have a header with at least the SNP, beta, se and effect allele (EA) columns present.
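As a minimal sketch of what such an input could look like, the block below builds a toy data frame that uses the default column names from the Usage section and checks that the required columns are present. The SNP IDs and numbers are made up for illustration, and the `required` and `missing_cols` names are hypothetical helpers, not part of the package. The call to `format_data()` runs only if SlopeHunter is installed.

```r
# Toy summary statistics using the default column names from the Usage section.
# All values are invented for illustration only.
dat <- data.frame(
  SNP  = c("rs1", "rs2", "rs3"),
  BETA = c(0.12, -0.05, 0.30),
  SE   = c(0.03, 0.02, 0.10),
  EA   = c("A", "C", "T"),
  OA   = c("G", "T", "C"),
  stringsAsFactors = FALSE
)

# Check that the columns the documentation lists as required are present.
required <- c("SNP", "BETA", "SE", "EA")
missing_cols <- setdiff(required, names(dat))
stopifnot(length(missing_cols) == 0)

# Run the formatter only if the package is available.
if (requireNamespace("SlopeHunter", quietly = TRUE)) {
  formatted <- SlopeHunter::format_data(dat, type = "incidence")
  str(formatted)
}
```

If the input uses different column names, they can be passed through the `*_col` arguments instead of renaming the data frame.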

type

Whether the data being read in are the incidence or the prognosis data. The default is "incidence".

snps

SNPs to extract. If NULL, all SNPs are kept. The default is NULL.

snp_col

Required. Name of column with SNP rs IDs. The default is "SNP".

beta_col

Required. Name of column with effect sizes. The default is "BETA".

se_col

Required. Name of column with standard errors. The default is "SE".

pval_col

Name of column with p-values (optional). The default is "PVAL". If the column is missing, p-values are inferred from beta and se when possible.
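The documentation does not state how p-values are inferred. A common approach, assumed here, is a two-sided Wald test with a standard normal reference distribution. The sketch below illustrates that calculation in base R; `infer_pval` is a hypothetical helper and may differ from what the package does internally.

```r
# Hypothetical helper, not part of SlopeHunter.
# Assumption: two-sided Wald test, z = beta / se, standard normal reference.
infer_pval <- function(beta, se) {
  z <- beta / se
  2 * stats::pnorm(abs(z), lower.tail = FALSE)
}

# Made-up effect sizes and standard errors.
beta <- c(0.12, -0.05, 0)
se   <- c(0.03, 0.02, 0.10)

infer_pval(beta, se)
```

Using `lower.tail = FALSE` rather than `1 - pnorm(...)` keeps precision for very small p-values.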

eaf_col

Name of column with effect allele frequency (optional). The default is "EAF".

effect_allele_col

Required for harmonisation. Name of column with effect allele. Values must be "A", "C", "T" or "G". The default is "EA".

other_allele_col

Required for harmonisation. Name of column with non-effect allele. Values must be "A", "C", "T" or "G". The default is "OA".

gene_col

Optional column for gene name. The default is "GENE".

chr_col

Optional column for chromosome number. The default is "CHR".

pos_col

Optional column for SNP position. The default is "POS".

min_pval

Minimum allowed p-value. The default is 1e-200.

log_pval

Whether the p-value column holds -log10(P) values rather than raw p-values. The default is FALSE.
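To illustrate how `log_pval` and `min_pval` might interact, the sketch below converts -log10(P) values back to p-values and then applies a lower bound. Both `to_pval` and the choice to raise values below `min_pval` up to `min_pval` are assumptions made for illustration. The package's own handling is not described here and may differ.

```r
# Hypothetical helper, not part of SlopeHunter.
to_pval <- function(x, log_pval = FALSE, min_pval = 1e-200) {
  # If the input is on the -log10 scale, convert it back to a p-value.
  p <- if (log_pval) 10^(-x) else x
  # Assumed behaviour: p-values below min_pval are set to min_pval.
  pmax(p, min_pval)
}

# Made-up -log10(P) values; the last one underflows to 0 before flooring.
to_pval(c(2, 8, 350), log_pval = TRUE)
```

A floor of this kind keeps extremely small p-values from underflowing to zero in later calculations.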

Value

A data frame.


Osmahmoud/SlopeHunter documentation built on Oct. 7, 2022, 4:38 p.m.