filter_regions: Filters regions from prediction set to be used to build the...


View source: R/filter_regions.R

Description

This function takes the phenotype of interest (sex, tissue type, etc.) supplied by the user and fits a linear model (accounting for covariates, if provided) to select the expressed regions that best predict that phenotype. Filtering in this way is necessary when expression data are provided in chunks or broken down by chromosome. The filtered regions can then be merged with merge_regions() and used downstream for prediction. By default, the top 100 expressed regions are selected from the input data.
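The idea of ranking regions with a linear model can be illustrated with a minimal base R sketch. This is not the package's implementation: the simulated data, the helper `region_pvalue()`, and the choice of an F-test p-value as the ranking statistic are all assumptions made for illustration.

```r
## Sketch only: rank regions by association with a phenotype using
## ordinary per-region linear models. All names here are hypothetical.
set.seed(1)
n_regions <- 6
n_samples <- 20

## Simulated expression matrix: regions in rows, samples in columns
expr <- matrix(rnorm(n_regions * n_samples), nrow = n_regions,
               dimnames = list(paste0("region", 1:n_regions), NULL))
sex <- factor(rep(c("male", "female"), each = n_samples / 2))

## Make region3 strongly associated with sex
expr[3, sex == "male"] <- expr[3, sex == "male"] + 5

## Hypothetical helper: F-test p-value for the phenotype term of one region
region_pvalue <- function(y, phenotype) {
  fit <- lm(y ~ phenotype)
  anova(fit)[["Pr(>F)"]][1]
}

## One p-value per region, then keep the two best-ranked regions
pvals <- apply(expr, 1, region_pvalue, phenotype = sex)
top_regions <- names(sort(pvals))[1:2]
```

Here `top_regions` holds the names of the two regions with the smallest p-values, which plays the role that numRegions plays in filter_regions().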

Usage

filter_regions(expression = NULL, regiondata = NULL, phenodata = NULL,
  phenotype = NULL, covariates = NULL, type = c("factor", "numeric"),
  numRegions = 100)

Arguments

expression

Expression data where regions are in rows and samples are in columns

regiondata

A GenomicRanges object describing the regions

phenodata

Phenotype data with samples in rows and corresponding phenotype information in columns

phenotype

Phenotype of interest

covariates

Which covariates to include in the model

type

The class of the phenotype of interest ("factor" or "numeric")

numRegions

The number of regions per class of the variable of interest to pull out from each chromosome (default: 100)
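To make the role of covariates concrete, the sketch below shows how a covariate might enter a linear model alongside a factor phenotype, using base R only. It is an illustration under assumed, simulated data rather than the function's internal code; the response `y` and the nested-model comparison are hypothetical choices.

```r
## Sketch only: a factor phenotype (sex) with a numeric covariate (age)
set.seed(2)
pheno <- data.frame(
  sex = factor(rep(c("male", "female"), each = 15)),
  age = sample(1:100, 30)
)

## Design matrix: intercept, an indicator for sex, and the covariate
design <- model.matrix(~ sex + age, data = pheno)

## Simulated expression for one region, higher in males
y <- rnorm(30) + 3 * (pheno$sex == "male")

## Test the phenotype after adjusting for the covariate
## by comparing nested models with and without sex
reduced <- lm(y ~ age, data = pheno)
full <- lm(y ~ age + sex, data = pheno)
p_sex <- anova(reduced, full)[["Pr(>F)"]][2]
```

A numeric phenotype would enter the model as a single continuous column instead of indicator columns, which is why the class of the phenotype has to be declared through the type argument.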

Value

The selected regions, the coverage matrix, and the region info to be used for prediction

Examples

library('GenomicRanges')
library('dplyr')
library('phenopredict')  # provides filter_regions()

## Make up some region data: 9 regions on chr2
regions <- GRanges(seqnames = 'chr2', IRanges(
    start = c(28971710:28971712, 29555081:29555083, 29754982:29754984),
    end = c(29462417:29462419, 29923338:29923340, 29917714:29917716)))

## Make up some expression data for 9 regions (rows) and 30 people (columns)
data(sysdata, package = 'phenopredict')
## includes R object 'cm'
exp <- cm[1:length(regions), 1:30]

## Generate some phenotype information for the 30 samples
set.seed(1)  # arbitrary seed so the sampled ages are reproducible
sex <- as.data.frame(rep(c("male", "female"), each = 15))
age <- as.data.frame(sample(1:100, 30))
pheno <- dplyr::bind_cols(sex, age)
colnames(pheno) <- c("sex", "age")

## Filter regions to be used to build the predictor
inputdata <- filter_regions(expression = exp, regiondata = regions,
    phenodata = pheno, phenotype = "sex", covariates = NULL, type = "factor",
    numRegions = 2)

leekgroup/phenopredict documentation built on May 14, 2019, 11:27 a.m.