importFromAlignedReads: Import aligned reads to database

View source: R/importAndManage.R

Description

This function takes a named list of AlignedRead objects (from the ShortRead package) and creates an ExpData object from them, with one column for each list element. Column names are taken from list names, which must be unique.

Usage

importFromAlignedReads(x, chrMap, dbFilename,
  tablename, overwrite = TRUE, deleteIntermediates = TRUE,
  readPosition = c("5prime", "left", "center"),
  verbose = getOption("verbose"), ...)

Arguments

x

This argument can be one of two things: either a named list of objects of class AlignedRead or a named character vector of filenames. In both cases, the names of the object are used as column names in the resulting database (note that it is not easy to change those names afterwards). The names of x therefore need to be present, non-empty, and valid as column names in SQLite. If x is a list of AlignedRead objects, the column names need to be unique. If x is a character vector of filenames, the names do not have to be unique; files that share a (column) name are collapsed into the same column.
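The naming requirements above can be checked before starting an import. The following is a minimal sketch, not part of Genominator: checkColumnNames is a hypothetical helper, the filenames are made up, and the pattern used for "valid column name" is a deliberately conservative assumption (SQLite accepts more than this).

```r
# Hypothetical helper (not part of Genominator): check that the names of 'x'
# are usable as column names before calling importFromAlignedReads().
checkColumnNames <- function(x, requireUnique = TRUE) {
  nms <- names(x)
  if (is.null(nms) || any(is.na(nms)) || any(nms == "")) {
    stop("all elements of 'x' must have non-empty names")
  }
  # Assumption: a conservative pattern (a letter, then letters, digits or
  # underscores). SQLite itself is more permissive.
  bad <- !grepl("^[A-Za-z][A-Za-z0-9_]*$", nms)
  if (any(bad)) {
    stop("names not safe as column names: ", paste(nms[bad], collapse = ", "))
  }
  if (requireUnique && anyDuplicated(nms) > 0) {
    stop("names of 'x' must be unique")
  }
  invisible(TRUE)
}

# Made-up filenames: two files share the name "mut_1" and would be collapsed
# into one column, so uniqueness is not required here.
files <- c(mut_1 = "lane1.txt", mut_1 = "lane2.txt", wt_1 = "lane3.txt")
checkColumnNames(files, requireUnique = FALSE)
```

For a list of AlignedRead objects the default requireUnique = TRUE applies, matching the stricter rule for that input type.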

chrMap

A vector of chromosome names from the aligned output. On importation to the database, chromosome names will be converted to integers corresponding to position within the chrMap vector.
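The conversion described above can be illustrated with base R. This is only a sketch of the idea: the chromosome names are made up, and whether the package uses match() internally is an assumption.

```r
# Chromosome names as they might appear in the aligned output (made-up values).
chr <- c("chr2", "chr1", "chrM", "chr1")
chrMap <- c("chr1", "chr2", "chrM")

# Each name is replaced by its position within chrMap.
chrInt <- match(chr, chrMap)
chrInt
# [1] 2 1 3 1

# A name missing from chrMap would map to NA, so it is worth checking.
stopifnot(!anyNA(chrInt))
```

Because the integer codes depend on the order of chrMap, the same chrMap should be kept alongside the database if the codes are to be translated back to names later.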

dbFilename

The filename of the database to which the data will be imported.

tablename

Name of database table to write output data to.

overwrite

Logical indicating whether the database table named by the tablename argument should be overwritten if it already exists.

deleteIntermediates

Logical indicating whether intermediate database tables constructed in the process should be removed.

readPosition

How each read is assigned a unique genomic location. The default, "5prime", uses the position of the 5' end of the read. "left" uses the leftmost position of the read (the 5' end for reads mapping to the forward strand, the 3' end for reads mapping to the reverse strand). "center" uses the position of the center of the read.
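The three options can be sketched with a small function. This is an illustration only: readLocation is a hypothetical helper, it assumes position is the leftmost base of the read and strand is coded as "+" or "-", and the rounding used for "center" on even-width reads is an assumption rather than the package's documented behaviour.

```r
# Hypothetical helper: reduce a read (leftmost position, width, strand)
# to a single genomic location.
readLocation <- function(position, width, strand,
                         readPosition = c("5prime", "left", "center")) {
  readPosition <- match.arg(readPosition)
  right <- position + width - 1L            # rightmost base of the read
  switch(readPosition,
    "5prime" = ifelse(strand == "-", right, position),
    "left"   = position,
    "center" = position + (width - 1L) %/% 2L)  # assumption: round down
}

# Two 36 bp reads with the same leftmost position, on opposite strands.
pos <- c(100L, 100L)
wid <- c(36L, 36L)
str <- c("+", "-")

readLocation(pos, wid, str, "5prime")  # 100 135
readLocation(pos, wid, str, "left")    # 100 100
readLocation(pos, wid, str, "center")  # 117 117
```

Only "5prime" depends on the strand, which is why the two reads end up 35 bases apart under the default but coincide under the other two options.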

verbose

Logical indicating whether details should be printed.

...

Additional arguments to be passed to readAligned from ShortRead.

Details

The reads are aggregated and joined to form a database where each file/list element is a column. By default, positions are stored as the position of the 5' end of the reads; note that this differs from the convention for the AlignedRead class from ShortRead. This can be changed with the readPosition argument.

If the x argument is a character vector of filenames, the function requires enough memory to parse each input file in turn. If there are duplicates in the names of x, the function requires enough memory to parse all files with the same column name at the same time.

If the AlignedRead class object has a weights column in its alignData slot, this weights column is used as the data to aggregate over.
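The aggregation step can be pictured with a toy data frame. This is a sketch in base R rather than the package's database code: the column names, the 1 / -1 strand coding and the values are all made up for illustration.

```r
# Toy reads: one row per read, chromosome already mapped to an integer.
reads <- data.frame(
  chr      = c(1L, 1L, 1L, 2L),
  location = c(100L, 100L, 250L, 40L),
  strand   = c(1L, 1L, -1L, 1L),
  weight   = c(1, 0.5, 1, 1)
)
keys <- reads[c("chr", "location", "strand")]

# Without a weights column: count the reads at each (chr, location, strand).
counts <- aggregate(data.frame(count = rep(1L, nrow(reads))),
                    by = keys, FUN = sum)

# With a weights column: sum the weights instead of counting reads.
weighted <- aggregate(reads["weight"], by = keys, FUN = sum)
```

In this toy data the two reads at chromosome 1, location 100 collapse into a single row, with a count of 2 in counts and a summed weight of 1.5 in weighted.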

Value

Outputs an object of class ExpData with a column for each element of the x argument.

Author(s)

James Bullard bullard@berkeley.edu, Kasper Daniel Hansen khansen@jhsph.edu

See Also

See Genominator vignette for more information. See also ExpData-class, AlignedRead-class and readAligned.

Examples

## Not run: 
require(ShortRead)
require(yeastRNASeq)
data("yeastAligned")
chrMap <- levels(chromosome(yeastAligned[[1]]))
eData <- importFromAlignedReads(yeastAligned, chrMap = chrMap,
               dbFilename = tempfile(), tablename = "raw",
               overwrite = TRUE)

## End(Not run)

Genominator documentation built on Oct. 31, 2019, 8:56 a.m.