dmSQTLdata: Create dmSQTLdata object

View source: R/class_dmSQTLdata.R

Description

Constructor function for a dmSQTLdata object. dmSQTLdata assigns to each gene all the SNPs that are located within a given surrounding (window) of that gene.

Usage

dmSQTLdata(counts, gene_ranges, genotypes, snp_ranges, samples, window = 5000,
  BPPARAM = BiocParallel::SerialParam())

Arguments

counts

Data frame with counts. Rows correspond to features, for example, transcripts or exons. This data frame has to contain a gene_id column with gene IDs, feature_id column with feature IDs and columns with counts for each sample. Column names corresponding to sample IDs must be the same as in the sample data frame.

gene_ranges

GRanges object with gene locations. Calling names() on it must return the gene names.

genotypes

Data frame with genotypes. Rows correspond to SNPs. This data frame has to contain a snp_id column with SNP IDs and columns with genotypes for each sample. Column names corresponding to sample IDs must be the same as in the sample data frame. The genotype of each sample is coded in the following way: 0 for ref/ref, 1 for ref/not ref, 2 for not ref/not ref, -1 or NA for missing value.

snp_ranges

GRanges object with SNP locations. Calling names() on it must return the SNP names.

samples

Data frame with a sample_id column containing unique sample IDs.

window

Size of the upstream and downstream window that defines the surrounding of a gene. Only SNPs that are located within a gene or its surrounding are considered in the sQTL analysis.

BPPARAM

Parallelization method used by bplapply.
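
To make the window argument concrete, the following base R sketch assigns toy SNPs to toy genes by position. It only illustrates the rule described above and is not how DRIMSeq implements it internally. The data frames genes and snps, their column names, the coordinates and the helper snps_in_window are all hypothetical, and strand is ignored.

```r
# Toy coordinates (hypothetical values, single chromosome)
genes <- data.frame(gene_id = c("g1", "g2"),
  chr = c("chr1", "chr1"),
  start = c(10000, 50000), end = c(12000, 53000),
  stringsAsFactors = FALSE)
snps <- data.frame(snp_id = c("s1", "s2", "s3", "s4"),
  chr = "chr1",
  pos = c(6000, 11000, 17500, 57000),
  stringsAsFactors = FALSE)

# Hypothetical helper: IDs of SNPs lying within the gene body or within
# `window` bases on either side of it
snps_in_window <- function(gene, snps, window = 5000) {
  keep <- snps$chr == gene$chr &
    snps$pos >= gene$start - window &
    snps$pos <= gene$end + window
  snps$snp_id[keep]
}

# One character vector of SNP IDs per gene
assigned <- lapply(seq_len(nrow(genes)), function(i) {
  snps_in_window(genes[i, ], snps, window = 5000)
})
names(assigned) <- genes$gene_id
assigned
```

With window = 5000, s3 lies 5500 bases past the end of g1 and is therefore not assigned to any gene in this toy data, so it would not enter the analysis.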

Details

It is quite common that several SNPs define an identical grouping of samples. Compare dim(genotypes) and dim(unique(genotypes)). In our QTL analysis, we do not repeat tests for SNPs that define the same grouping of samples; each grouping is tested only once. SNPs that define the same grouping are aggregated into a block. P-values and adjusted p-values are estimated at the block level, but the returned results are extended to the SNP level by repeating the block statistics for each SNP that belongs to a given block.
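
The block idea can be sketched in base R on toy data. This is an illustration under stated assumptions, not the code DRIMSeq runs: the genotype values, the key-based grouping and the placeholder p-values are all hypothetical.

```r
# Toy genotypes: rows are SNPs, columns are samples (hypothetical values)
genotypes <- data.frame(snp_id = c("s1", "s2", "s3", "s4"),
  sampleA = c(0, 0, 1, 2),
  sampleB = c(1, 1, 1, 0),
  sampleC = c(2, 2, 0, NA),
  stringsAsFactors = FALSE)

geno <- as.matrix(genotypes[, -1])
rownames(geno) <- genotypes$snp_id

# The comparison suggested above: all SNPs versus distinct groupings
dim(geno)
dim(unique(geno))

# Build one key per SNP from its genotype pattern; identical keys mean
# identical sample groupings, so those SNPs share a block
key <- apply(geno, 1, paste, collapse = "_")
block_id <- match(key, unique(key))
blocks <- split(genotypes$snp_id, block_id)

# Placeholder block-level p-values (not real test output), repeated for
# every SNP in the block to give SNP-level results
block_pvalue <- c(0.01, 0.20, 0.50)
snp_results <- data.frame(snp_id = genotypes$snp_id, block_id = block_id,
  pvalue = block_pvalue[block_id])
snp_results
```

Here s1 and s2 carry the same genotypes across all samples, so they fall into one block and receive the same placeholder p-value.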

Value

Returns a dmSQTLdata object.

Author(s)

Malgorzata Nowicka

See Also

plotData

Examples

# --------------------------------------------------------------------------
# Create dmSQTLdata object
# --------------------------------------------------------------------------
# Use subsets of data defined in the GeuvadisTranscriptExpr package

library(DRIMSeq)
library(GeuvadisTranscriptExpr)

geuv_counts <- GeuvadisTranscriptExpr::counts
geuv_genotypes <- GeuvadisTranscriptExpr::genotypes
geuv_gene_ranges <- GeuvadisTranscriptExpr::gene_ranges
geuv_snp_ranges <- GeuvadisTranscriptExpr::snp_ranges

colnames(geuv_counts)[c(1,2)] <- c("feature_id", "gene_id")
colnames(geuv_genotypes)[4] <- "snp_id"
geuv_samples <- data.frame(sample_id = colnames(geuv_counts)[-c(1,2)])

d <- dmSQTLdata(counts = geuv_counts, gene_ranges = geuv_gene_ranges,  
  genotypes = geuv_genotypes, snp_ranges = geuv_snp_ranges, 
  samples = geuv_samples, window = 5e3)

DRIMSeq documentation built on Nov. 8, 2020, 8:25 p.m.