phasing: Haplotype pseudo alignment


View source: R/ASEP functions.R

Description

Obtains the pseudo haplotype phase of the RNA-seq data for a given gene and aligns the major alleles across individuals.

Usage

phasing(dat, phased = FALSE, n_condition = "one")

Arguments

dat:

bulk RNA-seq dataset of a given gene. Must contain variables:

  • One condition analysis:
    - 'id': character, individual identifier;
    - 'ref': numeric, the SNP-level read counts for the reference allele if the haplotype phase of the data is unknown, or for the allele aligned on the paternal/maternal haplotype if the haplotype phase is known;
    - 'total': numeric, snp-level total read counts for both alleles;

  • Two conditions analysis:
    - 'id': character, individual identifier;
    - 'snp': character, the name/chromosome location of the heterozygous genetic variants;
    - 'ref': numeric, the SNP-level read counts for the reference allele if the haplotype phase of the data is unknown, or for the allele aligned on the same paternal/maternal haplotype in both conditions if the haplotype phase is known;
    - 'total': numeric, snp-level total read counts for both alleles;
    - 'group': character, the condition each RNA-seq sample is obtained from (e.g., pre- vs post-treatment);
    - 'ref_condition': character, the condition used as the reference for pseudo haplotype phasing;
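
The column requirements above can be sketched as toy input tables. All values below are made up for illustration, and `missing_columns()` is a hypothetical helper, not part of ASEP; it only compares column names against the lists given here.

```r
# Toy one-condition data (values are invented for illustration).
one_cond <- data.frame(
  id    = c("ind1", "ind1", "ind2", "ind2"),
  ref   = c(12, 30, 8, 25),
  total = c(20, 50, 20, 40),
  stringsAsFactors = FALSE
)

# Toy two-condition data: two individuals, two SNPs, two conditions.
two_cond <- data.frame(
  id    = rep(c("ind1", "ind2"), each = 4),
  snp   = rep(c("chr1:1000", "chr1:2000"), times = 4),
  ref   = c(12, 30, 9, 22, 8, 25, 11, 19),
  total = c(20, 50, 20, 45, 20, 40, 20, 40),
  group = rep(rep(c("pre", "post"), each = 2), times = 2),
  ref_condition = "pre",
  stringsAsFactors = FALSE
)

# Hypothetical helper (not an ASEP function): list the required
# columns, as documented above, that are absent from `dat`.
missing_columns <- function(dat, n_condition = c("one", "two")) {
  n_condition <- match.arg(n_condition)
  required <- switch(
    n_condition,
    one = c("id", "ref", "total"),
    two = c("id", "snp", "ref", "total", "group", "ref_condition")
  )
  setdiff(required, names(dat))
}

missing_columns(one_cond, "one")
missing_columns(two_cond, "two")
```

Checking the column names before calling `phasing()` is a cheap way to catch a mislabelled input early; the check says nothing about whether the counts themselves are valid.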

phased:

a logical value indicating whether the haplotype phase of the data is known. Default is FALSE.

n_condition:

a character string indicating whether the RNA-seq data come from one condition or two conditions (e.g., normal vs diseased). Possible values are "one" or "two". Default is "one".

Value

The pseudo-phased RNA-seq data, with one additional column, "major", giving the read counts for the major alleles aligned across individuals.
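
A minimal call might look like the sketch below. The toy data are invented, and the 'snp' column is included as an assumption: it is listed above only for the two-condition case. The call uses the signature shown under Usage and is skipped when ASEP is not installed.

```r
# Toy one-condition data (values are invented for illustration).
dat <- data.frame(
  id    = c("ind1", "ind1", "ind2", "ind2"),
  snp   = c("chr1:1000", "chr1:2000", "chr1:1000", "chr1:2000"),  # assumed column
  ref   = c(12, 30, 8, 25),
  total = c(20, 50, 20, 40),
  stringsAsFactors = FALSE
)

# Run only when the ASEP package is available.
if (requireNamespace("ASEP", quietly = TRUE)) {
  phased_dat <- ASEP::phasing(dat, phased = FALSE, n_condition = "one")
  # Per the Value section, the result should carry an extra "major" column.
  head(phased_dat)
}
```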


Jiaxin-Fan/ASEP documentation built on Aug. 9, 2021, 6:39 a.m.