classifySeg: A method for defining a genome segment map by an empirical...


View source: R/classifySeg.R

Description

This function acquires empirical distributions of sequence tag density from an already existing (or heuristically defined) segment map. It uses these to classify potential segments as either segments or nulls in order to define a new (and improved) segment map.

Usage

classifySeg(sD, cD, aD, lociCutoff = 0.9, nullCutoff = 0.9, subRegion =
NULL, getLikes = TRUE, lR = FALSE, samplesize = 1e5, largeness = 1e8,
tempDir = NULL, recoverFromTemp = FALSE, cl) 

Arguments

sD

A segData object derived from the ‘aD’ object.

cD

A lociData object containing an already existing segmentation map, or NULL.

aD

An alignmentData object.

lociCutoff

The minimum posterior likelihood of being a locus for a region to be treated as a locus.

nullCutoff

The minimum posterior likelihood of being a null for a region to be treated as a null.

subRegion

A data.frame object defining the subregions of the genome to be segmented. If NULL (default), the whole genome is segmented.

getLikes

Should posterior likelihoods for the new segmented genome (loci and nulls) be assessed?

lR

If TRUE, locus and null calls are made on the basis of likelihood ratios rather than posterior likelihoods. Not recommended.

samplesize

The sample size to be used when estimating the prior distribution of the data with the getPriors.NB function.

largeness

The maximum size for a split analysis.

tempDir

A directory for storing temporary files produced during the segmentation.

recoverFromTemp

If TRUE, will attempt to recover the position saved in 'tempDir'. Defaults to FALSE. See Details.

cl

A SNOW cluster object, or NULL. See Details.
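As an illustration of the subRegion argument, the following sketch builds a data.frame that restricts segmentation to two windows. The column names chr, start and end follow the call shown in the Examples section; the coordinates are invented for illustration, and the chromosome names are assumed to match those used in the alignmentData object.

```r
# Hypothetical regions; chromosome names must match those used in 'aD'.
subRegion <- data.frame(
  chr   = c(">Chr1", ">Chr1"),
  start = c(1, 500001),
  end   = c(1e5, 6e5),
  stringsAsFactors = FALSE
)

# Basic sanity check before passing the object to classifySeg().
stopifnot(all(subRegion$start <= subRegion$end))
subRegion
```

The resulting object could then be supplied as subRegion = subRegion in a call to classifySeg.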

Details

This function acquires empirical distributions of sequence tag density from the segmentation map defined by the ‘cD’ argument (if ‘cD’ is NULL or missing, then the heuristicSeg function is used to define a segmentation map). It uses these empirical distributions to acquire posterior likelihoods on each potential segment being either a true segment or a null region. These posterior likelihoods are then used to define the segment map.
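The role of the lociCutoff and nullCutoff arguments can be sketched in base R. This is not the package's implementation: callRegions is a hypothetical helper, the posterior values are invented, and the sketch assumes that posteriors are on the probability scale and that the null posterior is simply one minus the locus posterior. A candidate region is labelled a locus or a null only when the relevant posterior reaches its cutoff.

```r
# Hypothetical helper, not part of segmentSeq.
callRegions <- function(postLocus, postNull,
                        lociCutoff = 0.9, nullCutoff = 0.9) {
  # Regions that reach neither cutoff are left uncalled.
  calls <- rep("undetermined", length(postLocus))
  calls[postLocus >= lociCutoff] <- "locus"
  calls[postNull >= nullCutoff] <- "null"
  calls
}

# Invented posterior likelihoods for three candidate regions.
postLocus <- c(0.97, 0.05, 0.60)
postNull <- 1 - postLocus  # simplifying assumption for this sketch
calls <- callRegions(postLocus, postNull)
calls
```

Raising either cutoff makes the corresponding call more conservative, leaving more candidate regions uncalled.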

If recoverFromTemp = TRUE, the function will attempt to recover a crashed position from the temporary files in tempDir. At present, the function assumes you know what you are doing, and will perform no checking that these files are suitable for the specified recovery. Use with caution.
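Because the function performs no checking of the temporary files, a caller may want a simple guard of their own before enabling recovery. The sketch below only tests whether the directory exists and contains any files; it says nothing about whether those files are suitable for recovery. The helper name hasTempFiles and the directory name are hypothetical.

```r
# Hypothetical helper, not part of segmentSeq: TRUE only if 'tempDir'
# exists and contains at least one file.
hasTempFiles <- function(tempDir) {
  !is.null(tempDir) && dir.exists(tempDir) &&
    length(list.files(tempDir)) > 0
}

# Hypothetical location for the temporary files.
tempDir <- file.path(tempdir(), "classifySeg_tmp")
dir.create(tempDir, showWarnings = FALSE)

# Only request recovery when there is something to recover from.
recover <- hasTempFiles(tempDir)
recover
```

The value of recover could then be passed as recoverFromTemp, together with tempDir = tempDir, in the call to classifySeg.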

Value

A lociData object, containing the segmentation map discovered.

Author(s)

Thomas J. Hardcastle

References

Hardcastle T.J., Kelly K.A. and Baulcombe D.C. (2011). Identifying small RNA loci from high-throughput sequencing data. In press.

See Also

heuristicSeg, a fast heuristic alternative to this function.

plotGenome, a function for plotting the alignment of tags to the genome (together with the segments defined by this function).

baySeq, a package for discovering differential expression in lociData objects.

Examples

# Define the files containing sample information.

datadir <- system.file("extdata", package = "segmentSeq")
libfiles <- c("SL9.txt", "SL10.txt", "SL26.txt", "SL32.txt")

# Establish the library names and replicate structure.

libnames <- c("SL9", "SL10", "SL26", "SL32")
replicates <- c(1,1,2,2)

# Process the files to produce an `alignmentData' object.

alignData <- readGeneric(file = libfiles, dir = datadir,
                         replicates = replicates, libnames = libnames,
                         gap = 100)

# Process the alignmentData object to produce a `segData' object.

sD <- processAD(alignData, gap = 100, cl = NULL)

# Use the classifySeg function on the segData object to produce a lociData object.

pS <- classifySeg(aD = alignData, sD = sD,
                  subRegion = data.frame(chr = ">Chr1", start = 1, end = 1e5),
                  getLikes = TRUE, cl = NULL)

segmentSeq documentation built on Nov. 8, 2020, 5:18 p.m.