gtrack.import_mappedseq: Creates a track from a file of mapped sequences

gtrack.import_mappedseq    R Documentation

Creates a track from a file of mapped sequences

Description

Creates a track from a file of mapped sequences.

Usage

gtrack.import_mappedseq(
  track = NULL,
  description = NULL,
  file = NULL,
  pileup = 0,
  binsize = -1,
  cols.order = c(9, 11, 13, 14),
  remove.dups = TRUE
)

Arguments

track

track name

description

a character string description

file

name of mapped sequences file

pileup

length of the interval that each read is expanded to cover; '0' means no expansion

binsize

bin size of the dense track (relevant when 'pileup' is greater than zero)

cols.order

1-based positions of the sequence, chromosome, coordinate and strand columns in the mapped sequences file, or 'NULL' if a SAM file is used

remove.dups

if 'TRUE', duplicated coordinates are counted only once.

Details

This function creates a track from a file of mapped sequences. The file can be in SAM format or in a general tab-delimited text format in which each line describes a single read.

For a SAM file 'cols.order' must be set to 'NULL'.
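
As a minimal sketch of the SAM case, the wrapper below passes 'cols.order = NULL' explicitly. The wrapper name 'import_sam_reads' and its arguments are hypothetical; it assumes the misha package is installed and that a genome database has already been initialised in the session.

```r
# Hypothetical convenience wrapper around gtrack.import_mappedseq for SAM input.
# Assumes misha is installed and a genome database is already initialised.
import_sam_reads <- function(sam_file, track, description) {
  stats <- misha::gtrack.import_mappedseq(
    track = track,
    description = description,
    file = sam_file,
    cols.order = NULL  # must be NULL when the input is a SAM file
  )
  # The call returns a list of conversion statistics; pass it on unchanged
  stats
}
```

Defining the wrapper does not touch the database; the import only runs when the wrapper is called with a real file path and track name.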

For a general tab-delimited text format, the following columns must be present in the file: sequence, chromosome, coordinate and strand. The positions of these columns are specified in the 'cols.order' argument. The default value of 'cols.order' is 'c(9, 11, 13, 14)', meaning that the sequence is expected in column 9, the chromosome in column 11, the coordinate in column 13 and the strand in column 14. Column indices are 1-based, i.e. the first column is referenced by 1. Chromosome names need the prefix 'chr', e.g. 'chr1'. Valid strand values are '+' or 'F' for the forward strand and '-' or 'R' for the reverse strand.
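
To make the column layout concrete, the following base R sketch builds one hypothetical 14-column line and picks out the four fields by their 1-based positions. It only illustrates how 'cols.order' is read; it is not misha's parser, and the helper 'strand_sign' and the sample values are made up for the example.

```r
# Illustrative only: how 'cols.order' maps onto a tab-delimited line.
cols_order <- c(9, 11, 13, 14)  # sequence, chromosome, coordinate, strand

# Build one hypothetical 14-column line; unused columns hold "."
fields <- rep(".", 14)
fields[cols_order] <- c("ACGTACGT", "chr1", "10500", "F")
line <- paste(fields, collapse = "\t")

# Split on tabs and select the four columns by their 1-based positions
parts <- strsplit(line, "\t", fixed = TRUE)[[1]]
read <- setNames(as.list(parts[cols_order]),
                 c("sequence", "chrom", "coord", "strand"))
read$coord <- as.integer(read$coord)

# Map the documented strand codes onto +1 (forward) / -1 (reverse)
strand_sign <- function(s) {
  if (s %in% c("+", "F")) {
    1L
  } else if (s %in% c("-", "R")) {
    -1L
  } else {
    stop("invalid strand value: ", s)
  }
}

# Chromosome names are expected to carry the 'chr' prefix
stopifnot(startsWith(read$chrom, "chr"))
print(read)
print(strand_sign(read$strand))
```

A file whose four fields sit in the first four columns would instead use 'cols.order = c(1, 2, 3, 4)'.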

Each read at a given coordinate can be "expanded" to cover an interval rather than a single point. The length of the interval is controlled by the 'pileup' argument, and the direction of expansion depends on the strand value. If 'pileup' is '0', no expansion is performed, the read is converted to a single point and the track is created in sparse format. If 'pileup' is greater than zero, the output track is in dense format and 'binsize' controls its bin size.
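
The effect of 'pileup' can be pictured with a small base R model. This is a sketch under stated assumptions, not misha's implementation: it assumes that forward reads extend towards higher coordinates, that reverse reads extend towards lower coordinates, and that intervals are half-open. The text above only states that the direction depends on the strand. The helper names 'expand_read' and 'track_format' are hypothetical.

```r
# Illustrative model only.
# ASSUMPTIONS: forward reads grow towards higher coordinates, reverse reads
# towards lower ones, and intervals are half-open [start, end).
expand_read <- function(coord, strand, pileup) {
  if (pileup == 0) {
    # No expansion: the read stays a single point
    return(c(start = coord, end = coord + 1))
  }
  if (strand %in% c("+", "F")) {
    c(start = coord, end = coord + pileup)
  } else {
    c(start = coord - pileup + 1, end = coord + 1)
  }
}

# Track format as described in the text: sparse for pileup 0, dense otherwise
track_format <- function(pileup) {
  if (pileup > 0) "dense" else "sparse"
}

print(expand_read(1000, "+", 200))
print(expand_read(1000, "-", 200))
print(track_format(0))
print(track_format(200))
```

Under these assumptions both expanded intervals span 'pileup' positions and include the read's own coordinate.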

If 'remove.dups' is 'TRUE', duplicated coordinates are counted only once.
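
A rough base R illustration of duplicate removal follows. It assumes that two reads are duplicates when chromosome, coordinate and strand all match; the text above does not spell out the exact criterion, so treat that as an assumption. The data and the helper 'count_reads' are hypothetical.

```r
# Illustrative only.
# ASSUMPTION: reads are duplicates when chromosome, coordinate and strand match.
reads <- data.frame(
  chrom  = c("chr1", "chr1", "chr1", "chr2"),
  coord  = c(100L, 100L, 100L, 100L),
  strand = c("+", "+", "-", "+"),
  stringsAsFactors = FALSE
)

count_reads <- function(reads, remove_dups = TRUE) {
  if (remove_dups) {
    # Keep the first read seen at each (chrom, coord, strand) combination
    keep <- !duplicated(reads[, c("chrom", "coord", "strand")])
    reads <- reads[keep, ]
  }
  nrow(reads)
}

print(count_reads(reads, remove_dups = TRUE))
print(count_reads(reads, remove_dups = FALSE))
```

With duplicates removed, the two identical forward reads on chr1 contribute a single count.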

'description' is added as a track attribute.

'gtrack.import_mappedseq' returns the statistics of the conversion process.

Value

A list of conversion process statistics.

See Also

gtrack.rm, gtrack.info, gdir.create


misha documentation built on Sept. 14, 2023, 5:08 p.m.