harmonise_effects: Harmonise and format data for Slope-Hunter

View source: R/harmonise_effects.R

harmonise_effects    R Documentation

Description

Harmonise the alleles and effects between the incidence and prognosis datasets (inspired by https://github.com/MRCIEU/TwoSampleMR/blob/master/R/harmonise.R).

Usage

harmonise_effects(
  incidence_dat,
  prognosis_dat,
  incidence_formatted = TRUE,
  prognosis_formatted = TRUE,
  by.pos = FALSE,
  pos_cols = c("POS.incidence", "POS.prognosis"),
  snp_cols = c("SNP", "SNP"),
  beta_cols = c("BETA.incidence", "BETA.prognosis"),
  se_cols = c("SE.incidence", "SE.prognosis"),
  EA_cols = c("EA.incidence", "EA.prognosis"),
  OA_cols = c("OA.incidence", "OA.prognosis"),
  chr_cols = c("CHR.incidence", "CHR.prognosis"),
  gene_col = c("GENE.incidence", "GENE.prognosis")
)

Arguments

incidence_dat

data.table for incidence data. It should preferably be the output of read_incidence. If it is not, the function tries to format it before harmonisation.

prognosis_dat

data.table for prognosis data. It should preferably be the output of read_prognosis. If it is not, the function tries to format it before harmonisation.

incidence_formatted

Logical indicating whether incidence_dat has been formatted using read_incidence.

prognosis_formatted

Logical indicating whether prognosis_dat has been formatted using read_prognosis.

by.pos

Logical. If TRUE, the harmonisation is performed by matching the exact SNP positions between the incidence and prognosis datasets.

pos_cols

A vector of length 2 specifying the name of the genetic position columns in the incidence and prognosis datasets respectively.

snp_cols

A vector of length 2 specifying the name of the SNP columns in the incidence and prognosis datasets respectively. This is the column on which the data will be merged if by.pos is FALSE.

beta_cols

A vector of length 2 specifying the name of the beta columns in the incidence and prognosis datasets respectively.

se_cols

A vector of length 2 specifying the name of the standard error (SE) columns in the incidence and prognosis datasets respectively.

EA_cols

A vector of length 2 specifying the name of the effect allele columns in the incidence and prognosis datasets respectively.

OA_cols

A vector of length 2 specifying the name of the non-effect allele columns in the incidence and prognosis datasets respectively.

chr_cols

A vector of length 2 specifying the name of the chromosome columns in the incidence and prognosis datasets respectively.

gene_col

A vector of length 2 specifying the name of the gene columns in the incidence and prognosis datasets respectively.
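
The difference between matching on snp_cols and matching on position can be illustrated with base R's merge. This is only a sketch on made-up data, not the package's code. It assumes that matching by position uses the chromosome together with the position, which the description of by.pos does not state explicitly.

```r
# Toy data: column names follow the defaults shown in Usage, values are made up.
incidence <- data.frame(
  SNP            = c("rs1", "rs2", "rs3"),
  CHR.incidence  = c(1, 1, 2),
  POS.incidence  = c(100, 200, 300),
  BETA.incidence = c(0.10, -0.20, 0.05),
  stringsAsFactors = FALSE
)
prognosis <- data.frame(
  SNP            = c("rs1", "rs2_alt", "rs4"),  # second SNP carries a different ID
  CHR.prognosis  = c(1, 1, 2),
  POS.prognosis  = c(100, 200, 999),
  BETA.prognosis = c(0.30, 0.10, -0.40),
  stringsAsFactors = FALSE
)

# Analogue of by.pos = FALSE: match rows on the SNP identifier column.
merged_by_snp <- merge(incidence, prognosis, by = "SNP")

# Analogue of by.pos = TRUE (assumption: chromosome + position form the key).
merged_by_pos <- merge(
  incidence, prognosis,
  by.x = c("CHR.incidence", "POS.incidence"),
  by.y = c("CHR.prognosis", "POS.prognosis"),
  suffixes = c(".incidence", ".prognosis")
)

nrow(merged_by_snp)  # 1: only rs1 shares an identifier
nrow(merged_by_pos)  # 2: the SNP at 1:200 is matched despite its different ID
```

Matching by position can recover SNPs that are labelled differently in the two datasets, provided both use the same genome build.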

Details

To perform a Slope-Hunter analysis, the effects of a SNP on the incidence and prognosis traits must be harmonised so that both are expressed relative to the same allele.

This function tries to harmonise the incidence and prognosis data sets on the specified columns. Where necessary, it corrects the strand for non-palindromic SNPs (i.e. flips the sign of effects so that the effect allele is the same in both datasets) and drops all palindromic SNPs (i.e. those with alleles A/T or G/C) from the analysis. SNPs whose alleles do not match between the data sets (e.g. T/C in one data set and A/C in the other) are also dropped.
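
The rules described above can be sketched in base R as follows. This is an illustrative approximation on made-up data, not the implementation in R/harmonise_effects.R. The helper names complement and harmonise_sketch are hypothetical, and the handling of alleles reported on the opposite strand is an assumption about what "correct strand" covers.

```r
# Hypothetical helper: complement of each allele (A<->T, C<->G).
complement <- function(a) chartr("ACGT", "TGCA", a)

# Hypothetical sketch: takes a merged data.frame with the default column names.
harmonise_sketch <- function(dat) {
  ea_i <- toupper(dat$EA.incidence); oa_i <- toupper(dat$OA.incidence)
  ea_p <- toupper(dat$EA.prognosis); oa_p <- toupper(dat$OA.prognosis)

  # Palindromic SNPs: the two alleles are complements of each other (A/T, G/C).
  palindromic <- ea_i == complement(oa_i)

  # Alleles agree directly, or agree with effect and other allele swapped.
  same    <- ea_i == ea_p & oa_i == oa_p
  swapped <- ea_i == oa_p & oa_i == ea_p

  # The same two checks after moving the prognosis alleles to the other strand.
  same_cs    <- ea_i == complement(ea_p) & oa_i == complement(oa_p)
  swapped_cs <- ea_i == complement(oa_p) & oa_i == complement(ea_p)

  flip <- swapped | swapped_cs
  keep <- !palindromic & (same | swapped | same_cs | swapped_cs)

  # Flip the prognosis effect where the effect allele differs, then align labels.
  dat$BETA.prognosis <- ifelse(flip, -dat$BETA.prognosis, dat$BETA.prognosis)
  dat$EA.prognosis <- ea_i
  dat$OA.prognosis <- oa_i
  dat[keep, , drop = FALSE]
}

# Made-up example covering each case.
toy <- data.frame(
  SNP            = c("rs1", "rs2", "rs3", "rs4", "rs5"),
  EA.incidence   = c("A", "A", "A", "T", "A"),
  OA.incidence   = c("G", "G", "T", "C", "G"),
  EA.prognosis   = c("A", "G", "A", "A", "T"),
  OA.prognosis   = c("G", "A", "T", "C", "C"),
  BETA.prognosis = c(0.3, 0.1, 0.5, 0.4, 0.2),
  stringsAsFactors = FALSE
)
out <- harmonise_sketch(toy)
out$SNP             # "rs1" "rs2" "rs5": rs3 is palindromic, rs4 is mismatched
out$BETA.prognosis  # 0.3 -0.1 0.2: only the swapped SNP (rs2) changes sign
```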

Value

A data.frame with harmonised effects and alleles.


Osmahmoud/SlopeHunter documentation built on Oct. 7, 2022, 4:38 p.m.