baysic.data: Organizes data for BaySIC functions

In BaySIC: Bayesian Analysis of Significantly Mutated Genes in Cancer


Creates a list object from mutation and reference data for use with BaySIC fitting and testing functions

baysic.data(dat, ref.dat, plot = FALSE, N = NULL, silent = TRUE)

`dat`	matrix; mutation input data. BaySIC requires a specific format similar to the MUT file format: an M × 7 matrix with column headings "chr", "start", "end", "id", "type", "gene", "context", where each row describes an individual mutation.
`ref.dat`	a data frame or `list` of data frames; `ref.dat` is a representation of the sequence content of each gene of interest for 32 unique trinucleotide sequence contexts, yielding a G × 34 matrix, where G is the total number of genes. If a single reference object is supplied, it is assumed that all subjects share the same reference data. Reference data may vary from subject to subject because of different platforms or coverages. In this case, `ref.dat` can also be a list of `N` reference data matrices, where `N` is the number of subjects. The names of the list elements should correspond to the ids used in `dat`.
`plot`	logical; if `TRUE`, a plot summarizing the mutation data at an overall and per subject basis is generated. Defaults to `FALSE`.
`N`	an integer (optional); the number of subjects represented in `dat`. If `N=NULL` and `is.list(ref.dat)==FALSE`, `N` is assumed to be the number of unique subject ids in `dat`. If `is.list(ref.dat)==TRUE`, then `N=length(ref.dat)`.
`silent`	logical; if `FALSE`, mutations defined as 'Synonymous' or 'Silent' will be removed from the dataset and subsequent analyses. Defaults to `TRUE`.
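
The rule for `N` described above can be sketched as a small helper. This is a minimal illustration, not the package's own code: `resolve_N` is a hypothetical name, and the extra `is.data.frame` check is an assumption added because a data frame is itself a list in R, so a bare `is.list` test would not distinguish a single data frame from a list of them.

```r
# Hypothetical helper mirroring the documented rule for N.
resolve_N <- function(dat, ref.dat, N = NULL) {
  # A list of per-subject reference tables fixes N at its length.
  # Assumption: a single data frame is not treated as such a list.
  if (is.list(ref.dat) && !is.data.frame(ref.dat)) {
    return(length(ref.dat))
  }
  # Otherwise fall back to the number of unique subject ids in dat.
  if (is.null(N)) {
    return(length(unique(dat[, "id"])))
  }
  N
}

# Toy input: only the "id" column matters for this sketch.
toy.dat <- cbind(id = c("S1", "S2", "S1"))
n.shared <- resolve_N(toy.dat, data.frame())
n.listed <- resolve_N(toy.dat, list(S1 = data.frame(), S2 = data.frame(), S3 = data.frame()))
```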

The mutation data `dat` is a 7-column matrix similar in style to other popular mutation file formats. The first three columns ("chr", "start", "end") give the position of the somatic mutation. The "id" column holds the subject id for each documented mutation. The "type" column gives the type of mutation for each entry. This is relatively flexible for point mutations: it only requires some form of "silent" or "synonymous" for such mutations if `silent=FALSE`, but insertion/deletion events should be designated as "INDEL". The "gene" column gives the name of the gene the mutation falls in, and must match the gene names used in `ref.dat`. The "context" entries give the trinucleotide sequence context of each point mutation (`NA` for INDELs).
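
As an illustration of the layout just described, a toy `dat` matrix might be assembled as follows. All values are invented for the example, and storing every column as character is an assumption; the help page does not state the expected storage mode.

```r
# Hypothetical toy mutation records, one row per mutation.
dat <- cbind(
  chr     = c("1", "1", "17"),
  start   = c("1000", "2500", "7000"),
  end     = c("1000", "2502", "7000"),
  id      = c("S1", "S2", "S1"),          # subject ids
  type    = c("Missense", "INDEL", "Silent"),
  gene    = c("GENE_A", "GENE_A", "GENE_B"),
  context = c("ACA", NA, "TCG")           # NA for the INDEL row
)

# Basic shape checks against the documented format.
expected.cols <- c("chr", "start", "end", "id", "type", "gene", "context")
has.expected.cols <- identical(colnames(dat), expected.cols)
indel.context.missing <- all(is.na(dat[dat[, "type"] == "INDEL", "context"]))
```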

The first two columns of the data matrix (or matrices) in ref.dat should correspond to the gene name and corresponding chromosome, and the column names of the remaining 32 columns should correspond to the trinucleotide motif (e.g. "ACA"). The sequence content entries should be integer values giving the number of nucleotides in the coding content of a given gene that satisfy the trinucleotide motif (central base with flanking 5' and 3' bases). Each base should be uniquely represented, such that the sum of all 32 counts equals the base-pair length of the total coding sequence for a given gene.
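
A toy reference table with this shape can be sketched as below. The column names `gene` and `chr` and all counts are assumptions made for illustration; only the 32 motif columns with a central "C" or "T" follow directly from the description.

```r
# Enumerate the 32 motifs: any 5' base, central C or T, any 3' base.
bases  <- c("A", "C", "G", "T")
grid   <- expand.grid(five = bases, center = c("C", "T"), three = bases,
                      stringsAsFactors = FALSE)
motifs <- paste0(grid$five, grid$center, grid$three)

# Invented counts for two hypothetical genes (rows) across the motifs.
counts <- matrix(seq_len(2 * length(motifs)), nrow = 2,
                 dimnames = list(NULL, motifs))

ref.dat <- data.frame(gene = c("GENE_A", "GENE_B"),
                      chr  = c("1", "17"),
                      counts,
                      stringsAsFactors = FALSE, check.names = FALSE)

# Under the stated convention, each row sum is the gene's coding length.
coding.length <- rowSums(ref.dat[, motifs])
```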

The baysic.data function has its own trinucleotide naming convention, in that all motifs are in all caps and have either "T" or "C" as the central base. Column names of ref.dat and "context" entries in dat will be adjusted to accommodate this convention if they deviate from it.
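
One common way to reach such a convention is to upper-case each motif and replace any motif whose central base is "A" or "G" with its reverse complement. The sketch below assumes that approach; the help page does not show how `baysic.data` performs the adjustment, and `normalize_motif` is a hypothetical name.

```r
# Hypothetical normalizer: upper-case, then reverse-complement motifs
# whose central base is a purine so the center becomes C or T.
normalize_motif <- function(motif) {
  out <- rep(NA_character_, length(motif))
  ok  <- !is.na(motif)                     # leave NA (e.g. INDEL rows) alone
  m   <- toupper(motif[ok])
  revcomp <- vapply(strsplit(m, ""), function(b) {
    paste(rev(chartr("ACGT", "TGCA", b)), collapse = "")
  }, character(1))
  flip <- substr(m, 2, 2) %in% c("A", "G")
  out[ok] <- ifelse(flip, revcomp, m)
  out
}

normalized <- normalize_motif(c("aga", "TGA", "ACA", NA))
```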

Returns a list data structure with the following components:

`all.dat`	Original mutation data object `dat`
`ref.dat`	Original reference data object `ref.dat`
`N`	Number of subjects with observed data
`genes`	Vector of length G of gene names included in analysis, where G is the total number of genes. Derived from `ref.dat`
`snv.dat`	A G × 32 matrix of the total number of SNV mutations per sequence context and gene
`indel.dat`	Vector of length G of total number of indel mutations per gene
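
To make the shapes of `snv.dat` and `indel.dat` concrete, the sketch below tallies a toy mutation table by gene and context. It only illustrates what the two components hold; it is not the package's implementation, the data are invented, and the toy `dat` is trimmed to the columns the tally needs.

```r
# The 32 motifs with a central C or T, and two hypothetical genes.
bases  <- c("A", "C", "G", "T")
grid   <- expand.grid(five = bases, center = c("C", "T"), three = bases,
                      stringsAsFactors = FALSE)
motifs <- paste0(grid$five, grid$center, grid$three)
genes  <- c("GENE_A", "GENE_B")

# Invented mutation records.
dat <- cbind(
  id      = c("S1", "S2", "S1", "S2"),
  type    = c("Missense", "INDEL", "Missense", "Nonsense"),
  gene    = c("GENE_A", "GENE_A", "GENE_B", "GENE_A"),
  context = c("ACA", NA, "TCG", "ACA")
)
is.indel <- dat[, "type"] == "INDEL"

# G x 32 matrix of SNV counts per gene (rows) and context (columns).
snv.tab <- table(factor(dat[!is.indel, "gene"], levels = genes),
                 factor(dat[!is.indel, "context"], levels = motifs))
snv.dat <- matrix(snv.tab, nrow = length(genes),
                  dimnames = list(genes, motifs))

# Length-G vector of indel counts per gene.
indel.dat <- as.vector(table(factor(dat[is.indel, "gene"], levels = genes)))
names(indel.dat) <- genes
```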

Nicholas B. Larson

baysic.fit, baysic.test

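## Not run: 
# Load the example mutation data and the reference data
data(example.dat)
data(ccds.19)
# Organize both into the list structure used by baysic.fit and baysic.test
baysic.dat.ex <- baysic.data(example.dat, ccds.19)

## End(Not run)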

BaySIC documentation built on May 2, 2019, 10:29 a.m.
