aggregateSegmentExpression: Aggregating genes across copy number segments.

View source: R/aggregateSegmentExpression.R

aggregateSegmentExpression    R Documentation

Description

Calculates, for each cell, the average expression of genes that share the same copy number segment.

Usage

aggregateSegmentExpression(epg, segments, mingps = 20, GRCh = 37)

Arguments

epg

Gene-by-cell matrix of expression. It is recommended to cap extreme UMI counts (e.g. at the 99% quantile) and to include only cells that express at least 1,000 genes.

segments

Matrix in which each row corresponds to a copy number segment, as calculated by a circular binary segmentation algorithm. Must contain at least the following columns:
chr - chromosome;
startpos - the first genomic position of a copy number segment;
endpos - the last genomic position of a copy number segment;
CN_Estimate - the copy number estimated for each segment.

mingps

Minimum number of expressed genes a segment must contain to be included in the output.

GRCh

Human reference genome version (e.g. 37 or 38) used to annotate gene coordinates.
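
The recommendations for epg and the required layout of segments can be sketched on toy data as follows. This is a minimal illustration, not part of the package: prepare_epg and its parameters cap_quantile and min_genes are hypothetical names, the quantile is assumed to be taken over all entries of the filtered matrix, and encoding chr as a number is an assumption.

```r
# Hypothetical helper (not part of liayson): filter cells, then cap counts.
prepare_epg <- function(counts, cap_quantile = 0.99, min_genes = 1000) {
  # Keep only cells (columns) that express at least `min_genes` genes
  genes_per_cell <- colSums(counts > 0)
  counts <- counts[, genes_per_cell >= min_genes, drop = FALSE]
  # Assumption: the cap is the quantile over all remaining matrix entries
  cap <- as.numeric(quantile(counts, probs = cap_quantile))
  counts[counts > cap] <- cap
  counts
}

# Toy gene-by-cell matrix: 200 genes, 6 cells (illustrative values only)
toy <- matrix(rep(c(0, 1, 2, 3), length.out = 200 * 6), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("cell", 1:6)))
toy[, 6] <- 0      # an empty cell that the filter should drop
toy[1, 1] <- 500   # an extreme count that the cap should reduce

# The threshold is lowered to 50 genes because the toy matrix is small
epg_toy <- prepare_epg(toy, cap_quantile = 0.99, min_genes = 50)

# Toy segments matrix with the four required column names
segments_toy <- cbind(chr = c(1, 1, 2),
                      startpos = c(1, 50000001, 1),
                      endpos = c(50000000, 240000000, 240000000),
                      CN_Estimate = c(2, 3, 1))
```

On real data the documented threshold of 1,000 expressed genes per cell would replace the toy value of 50.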

Details

Let S := { S_1, S_2, ..., S_n } be the set of n genomic segments obtained from DNA-sequencing a given sample (e.g. from bulk exome-sequencing, scDNA-sequencing, etc.). Genes are mapped to their genomic coordinates using the biomaRt package, and each gene is assigned to a segment based on those coordinates. Genes are then grouped by segment membership to obtain, for each segment S_j and each cell i, the average number of UMIs and the number of expressed genes.
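
The grouping step described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: aggregate_by_segment and the gene_coords table are hypothetical (the table stands in for the biomaRt annotation), a gene is assumed to belong to a segment only if it lies entirely within it, the average is assumed to be taken over all genes mapped to the segment, and mingps is assumed to be compared against the number of genes detected in at least one cell.

```r
# Hypothetical sketch (not the liayson implementation).
aggregate_by_segment <- function(epg, gene_coords, segments, mingps = 20) {
  # Align the coordinate table with the rows of the expression matrix
  coords <- gene_coords[match(rownames(epg), gene_coords$gene), ]
  # Assign each gene to the segment (row index) that fully contains it
  seg_of_gene <- rep(NA_integer_, nrow(epg))
  for (j in seq_len(nrow(segments))) {
    inside <- which(coords$chr == segments[j, "chr"] &
                    coords$startpos >= segments[j, "startpos"] &
                    coords$endpos <= segments[j, "endpos"])
    seg_of_gene[inside] <- j
  }
  keep <- !is.na(seg_of_gene)            # drop genes outside every segment
  grp <- seg_of_gene[keep]
  mapped <- epg[keep, , drop = FALSE]
  # Mean UMI count per segment per cell (sum divided by genes in the segment)
  n_genes <- rowsum(rep(1, length(grp)), grp)[, 1]
  eps <- rowsum(mapped, grp) / n_genes
  # Number of expressed genes per segment per cell
  gps <- rowsum((mapped > 0) * 1, grp)
  # Assumption: keep segments with >= mingps genes detected in any cell
  n_expressed <- rowsum(as.numeric(rowSums(mapped) > 0), grp)[, 1]
  ok <- n_expressed >= mingps
  list(eps = eps[ok, , drop = FALSE], gps = gps[ok, , drop = FALSE])
}

# Toy inputs: 6 genes, 3 cells, 2 segments (illustrative values only)
epg_toy <- matrix(c(2, 0, 4,
                    4, 2, 0,
                    0, 1, 2,
                    1, 0, 0,
                    3, 0, 2,
                    9, 9, 9), nrow = 6, byrow = TRUE,
                  dimnames = list(paste0("g", 1:6), paste0("cell", 1:3)))
gene_coords <- data.frame(gene = paste0("g", 1:6),
                          chr = c(1, 1, 1, 2, 2, 3),
                          startpos = c(100, 500, 900, 100, 400, 100),
                          endpos = c(200, 600, 1000, 200, 500, 200))
segments_toy <- cbind(chr = c(1, 2), startpos = c(1, 1),
                      endpos = c(2000, 2000), CN_Estimate = c(3, 2))

res <- aggregate_by_segment(epg_toy, gene_coords, segments_toy, mingps = 2)
res$eps   # segment-by-cell mean expression
res$gps   # segment-by-cell number of expressed genes
```

In this sketch the rows of eps and gps are labelled by the row index of the segment in segments_toy; gene g6 lies on a chromosome without a segment and is therefore ignored.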

Value

List with fields:

eps

Segment-by-cell matrix of expression values.

gps

Segment-by-cell matrix of the number of expressed genes.

Author(s)

Noemi Andor

Examples

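library(liayson)

# Load the example expression matrix and copy number segments
data(epg)
data(segments)

# Aggregate expression per segment, using GRCh38 gene coordinates
X <- aggregateSegmentExpression(epg, segments, mingps = 20, GRCh = 38)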


noemiandor/liayson documentation built on March 31, 2022, 7:39 a.m.