logXYratio: Calculate ratio of total counts of genes mapping to the X and...

logXYratio    R Documentation

Calculate ratio of total counts of genes mapping to the X and Y chromosomes

Description

This function calculates the number of reads that map to the X and Y chromosomes, and returns the (natural) log of the ratio of total X reads to total Y reads. It can be used to verify or infer sex using RNAseq data. Large values are likely to be from female samples; small values from male samples.
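The calculation described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the counts are made up, and the `chrom` lookup vector is a hypothetical stand-in for the gene-to-chromosome annotation that the function obtains from annotables or biomaRt. How the function itself treats samples with zero Y counts is not covered here.

```r
# Toy counts: genes in rows, samples in columns (values are invented).
counts <- matrix(
  c(500, 300,   1,   1, 1000,   # sampleA
      2, 298, 150, 150, 1000),  # sampleB
  ncol = 2,
  dimnames = list(
    c("XIST", "KDM5C", "KDM5D", "DDX3Y", "ACTB"),
    c("sampleA", "sampleB")
  )
)

# Hypothetical annotation: chromosome of each gene, keyed by gene symbol.
chrom <- c(XIST = "X", KDM5C = "X", KDM5D = "Y", DDX3Y = "Y", ACTB = "7")
gene_chrom <- chrom[rownames(counts)]

# Total counts per sample for X-linked and Y-linked genes.
x_total <- colSums(counts[gene_chrom == "X", , drop = FALSE])
y_total <- colSums(counts[gene_chrom == "Y", , drop = FALSE])

# Natural log of the X:Y ratio, one value per sample.
log_xy <- log(x_total / y_total)
log_xy
```

In this toy data, sampleA has almost no Y-linked counts and gets a large value, while sampleB has equal X and Y totals and gets a value of zero.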

Usage

logXYratio(
  counts,
  lib_cols = 1:ncol(counts),
  gene_ID = "symbol",
  species = "human",
  use_annotables = TRUE
)

Arguments

counts

a matrix or data frame containing the gene expression counts, with samples in columns and genes in rows. Row names must contain gene identifiers of the type given in gene_ID. NOTE: The calculated ratios are returned in the column order of the counts object (by default), or in the order specified by lib_cols.

lib_cols

a numeric vector, the indices of columns containing count data. Defaults to all columns in counts; can be adjusted if non-count columns are included in the counts object. Can also be used to specify an ordering of the libraries other than the order of the counts object.
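As an illustration of how lib_cols selects and orders libraries, the base R sketch below builds a small data frame with one non-count column and picks two libraries in a custom order. The data frame, its column names, and its values are invented for the example; the final commented line shows the corresponding call, which is not run here because it needs the RNAseQC package and an annotation source.

```r
# Hypothetical counts object with a non-count column (gene_length).
counts_df <- data.frame(
  gene_length = c(2000, 1500, 3000),
  lib1 = c(10, 0, 25),
  lib2 = c(12, 40, 30),
  lib3 = c(8, 35, 20),
  row.names = c("XIST", "KDM5D", "ACTB")
)

# Skip the gene_length column, and ask for lib3 before lib1.
lib_cols <- c(4, 2)

# The columns these indices refer to, in the requested order.
selected <- counts_df[, lib_cols, drop = FALSE]
colnames(selected)

# Corresponding call (not run):
# logXYratio(counts_df, lib_cols = c(4, 2))
```

With these indices, the returned ratios would correspond to lib3 first and lib1 second.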

gene_ID

the type of gene identifier used in the row names of counts. Must match a corresponding variable in annotables or biomaRt.

species

character, the species that the data derive from. Can be "human" or "mouse", or any species abbreviation used in the BioMart ensembl datasets (as long as the species has X and Y chromosomes). For a full list of possible species, use biomaRt::listDatasets(biomaRt::useMart("ensembl"))$dataset. For backward compatibility, defaults to "human".

use_annotables

logical, whether to use the annotables package for gene annotation. If annotables is not installed, the function falls back to biomaRt.

Details

counts can be raw or normalized, but must not be log-transformed. The function assumes that each column in the counts object corresponds to a library. If the counts object contains additional columns, the columns containing libraries must be indicated in lib_cols.
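One way to see why log-transformed input is a problem: the ratio is built from sums of counts, and a sum of logged values is not the log of a sum. The sketch below uses invented numbers and assumes, purely for illustration, that the data were transformed as log2(count + 1); if the transform used is known, it can be inverted before the ratio is calculated.

```r
# Invented counts for two X-linked and two Y-linked genes in one sample.
x_counts <- c(100, 300)
y_counts <- c(10, 10)

# Ratio on the count scale: log(400 / 20) = log(20).
correct <- log(sum(x_counts) / sum(y_counts))

# Assumed transform for this example: log2(count + 1).
log2_x <- log2(x_counts + 1)
log2_y <- log2(y_counts + 1)

# Summing the logged values compresses the difference between X and Y.
distorted <- log(sum(log2_x) / sum(log2_y))

# Inverting the transform first recovers the count-scale result.
recovered <- log(sum(2^log2_x - 1) / sum(2^log2_y - 1))

c(correct = correct, distorted = distorted, recovered = recovered)
```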

Value

a numeric vector of log ratios, with one element for each sample.


BenaroyaResearch/RNAseQC documentation built on April 19, 2024, 7:38 p.m.