hildaReadMPFile: Read raw mutation data in Mutation Position Format.

View source: R/read_file.R

Description

The Mutation Position Format is a tab-delimited text file in which the first five columns give, in order, the sample name, the chromosome name, the coordinate, the reference base (A, C, G, or T) and the alternate base (A, C, G, or T). An example is as follows:

sample1 chr1 100 A C

sample1 chr1 200 A T

sample1 chr2 100 G T

sample2 chr1 300 T C

sample3 chr3 400 T C

This function can usually also read compressed files (e.g., compressed with gzip or bzip2) when run with a recent version of R.
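
As a quick way to see what such a file looks like on disk, the sketch below writes the five example rows above to a gzip-compressed, tab-delimited file and reads it back with base R only. It does not call HiLDA; the column names (sample, chr, pos, ref, alt) and the sanity checks are assumptions made for illustration, since the format itself has no header line.

```r
# The five example rows, held in a data frame (column names are illustrative only).
mp <- data.frame(
  sample = c("sample1", "sample1", "sample1", "sample2", "sample3"),
  chr    = c("chr1", "chr1", "chr2", "chr1", "chr3"),
  pos    = c(100L, 200L, 100L, 300L, 400L),
  ref    = c("A", "A", "G", "T", "T"),
  alt    = c("C", "T", "T", "C", "C"),
  stringsAsFactors = FALSE
)

# Write a gzip-compressed, tab-delimited file without header, quotes or row names.
mp_path <- tempfile(fileext = ".mp.txt.gz")
con <- gzfile(mp_path, open = "wt")
write.table(mp, con, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
close(con)

# Read it back; base R detects the gzip compression when opening the file.
mp_check <- read.table(
  mp_path, sep = "\t", header = FALSE, stringsAsFactors = FALSE,
  colClasses = c("character", "character", "integer", "character", "character"),
  col.names  = c("sample", "chr", "pos", "ref", "alt")
)

# Basic sanity checks on the layout described above.
bases <- c("A", "C", "G", "T")
stopifnot(
  ncol(mp_check) == 5,
  all(mp_check$ref %in% bases),
  all(mp_check$alt %in% bases),
  all(mp_check$ref != mp_check$alt)
)
```

A file written this way has the layout described above, so mp_path could then be passed as infile.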

Usage

hildaReadMPFile(
  infile,
  numBases = 3,
  trDir = FALSE,
  bs_genome = NULL,
  txdb_transcript = NULL
)

Arguments

infile

the path to the input file containing the mutation data in Mutation Position Format.

numBases

the number of upstream and downstream flanking bases (including the mutated base) to take into account.

trDir

a flag indicating whether transcription direction is considered. The gene annotation information is given by UCSC knownGene (the TxDb.Hsapiens.UCSC.hg19.knownGene object). When trDir is TRUE, mutations located in intergenic regions are excluded from the analysis.

bs_genome

this argument specifies the reference genome (e.g., BSgenome.Mmusculus.UCSC.mm10 can be used for the mouse genome). See https://bioconductor.org/packages/release/bioc/html/BSgenome.html for the list of available genomes.

txdb_transcript

this argument specifies the transcript database (e.g., TxDb.Mmusculus.UCSC.mm10.knownGene can be used for the mouse genome). See https://bioconductor.org/packages/release/bioc/html/AnnotationDbi.html for details.
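
To make the numBases argument more concrete, here is a small base R sketch of what a window of numBases bases around a mutated position looks like. It is an illustration only, not the package's implementation: the helper context_window and the toy sequence are hypothetical, and the sketch assumes that numBases is odd and that the window is symmetric around the mutated base, which the argument description suggests but does not state outright.

```r
# Hypothetical helper: return the numBases-wide window centred on position pos.
# Assumes numBases is odd, so there are (numBases - 1) / 2 flanking bases per side.
context_window <- function(sequence, pos, numBases = 3) {
  stopifnot(numBases %% 2 == 1)
  flank <- (numBases - 1) / 2
  start <- pos - flank
  end   <- pos + flank
  # Refuse windows that would run off either end of the sequence.
  stopifnot(start >= 1, end <= nchar(sequence))
  substr(sequence, start, end)
}

toy <- "ACGTACGTAC"                           # toy sequence, not real genome data
w3 <- context_window(toy, pos = 5, numBases = 3)  # one flanking base on each side
w5 <- context_window(toy, pos = 5, numBases = 5)  # two flanking bases on each side
print(c(w3, w5))
```

Under these assumptions, numBases = 3 covers the mutated base plus one neighbour on each side, and numBases = 5 covers two neighbours on each side.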

Value

The output is an instance of the MutationFeatureData S4 class (which stores summarized information on the mutation data). It is typically used as the initial values for the global test and the local test.

Examples

inputFile <- system.file("extdata/esophageal.mp.txt.gz", package="HiLDA")
G <- hildaReadMPFile(inputFile, numBases=5, trDir=TRUE)
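
The bs_genome and txdb_transcript arguments can be used to read data from a genome other than the default. The sketch below shows one way this might look for mouse (mm10) data, using the two packages named in the Arguments section. The file path is a hypothetical placeholder, and the call is guarded so that it only runs when the packages are installed and the file exists.

```r
# Hypothetical path to a mouse (mm10) mutation position file; replace with your own.
mouse_file <- "path/to/mouse_mutations.mp.txt.gz"

# Packages named in the Arguments section; none of them is assumed to be installed.
needed <- c("HiLDA",
            "BSgenome.Mmusculus.UCSC.mm10",
            "TxDb.Mmusculus.UCSC.mm10.knownGene")
have_all <- all(vapply(needed, requireNamespace, logical(1), quietly = TRUE))

if (have_all && file.exists(mouse_file)) {
  # Each annotation package exports an object with the same name as the package.
  G_mouse <- HiLDA::hildaReadMPFile(
    mouse_file,
    numBases = 5,
    trDir = TRUE,
    bs_genome = BSgenome.Mmusculus.UCSC.mm10::BSgenome.Mmusculus.UCSC.mm10,
    txdb_transcript =
      TxDb.Mmusculus.UCSC.mm10.knownGene::TxDb.Mmusculus.UCSC.mm10.knownGene
  )
} else {
  message("Skipping: required packages or the input file are not available.")
}
```

The coordinates in the input file should refer to the same genome build as the objects passed in bs_genome and txdb_transcript.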

USCbiostats/HiLDA documentation built on Oct. 15, 2021, 10:22 a.m.