R/data.R

#' An example dataset of 5 peaks
#'
#' An example dataset of 5 genomic coordinates representing peaks in a
#' hypothetical genome. This dataset is intended to be mapped to the features
#' in the testGenes object. This object was imported from the BED file at
#' inst/extdata/testPeaks.bed using importBED.
#'
#' @format A dataframe with 5 rows and 6 columns, representing the first 6
#'   fields of a standard BED file. Note that unlike in BED format, where
#'   intervals are right-open, the base pair given in "End" is actually included
#'   in the feature. The 6 columns are:
#' \describe{
#'   \item{Chr}{The chromosome the peak is on.}
#'   \item{Start}{The left endpoint of the peak.}
#'   \item{End}{The right endpoint of the peak. Unlike standard BED format, this
#'       position is actually included in the peak.}
#'   \item{Name}{The name given to the peak.}
#'   \item{Score}{An arbitrary numeric value that can be given to different
#'       peaks in BED format (e.g. quality scores).}
#'   \item{Strand}{The strand of the peak: + for forward strand, - for reverse
#'       strand, and . for no strand information, in which case PeakMapper
#'       treats this peak as being on the forward strand.}
#' }
#' @source Daniel Fusca, University of Toronto
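#' @examples
#' # A minimal sketch, not taken from the package itself. It assumes PeakMapper
#' # is attached and testPeaks is available (lazy-loaded, or after calling
#' # data(testPeaks)).
#' str(testPeaks)
#' # End is inclusive here, so assuming Start is also an included position,
#' # each peak spans End - Start + 1 base pairs.
#' testPeaks$End - testPeaks$Start + 1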
"testPeaks"


#' An example dataset of 5 genes
#'
#' An example dataset of 5 genomic coordinates representing genes in a
#' hypothetical genome. This dataset is intended for annotating the peaks in the
#' testPeaks object. This object was imported from the BED file at
#' inst/extdata/testGenes.bed using importBED.
#'
#' @format A dataframe with 5 rows and 6 columns, representing the first 6
#'   fields of a standard BED file. Note that unlike in BED format, where
#'   intervals are right-open, the base pair given in "End" is actually included
#'   in the feature. The 6 columns are:
#' \describe{
#'   \item{Chr}{The chromosome the gene is on.}
#'   \item{Start}{The left endpoint of the gene.}
#'   \item{End}{The right endpoint of the gene. Unlike standard BED format, this
#'       position is actually included in the gene.}
#'   \item{Name}{The name given to the gene.}
#'   \item{Score}{An arbitrary numeric value that can be given to different
#'       genes in BED format (e.g. quality scores).}
#'   \item{Strand}{The strand of the gene: + for forward strand, - for reverse
#'       strand, and . for no strand information, in which case PeakMapper
#'       treats this gene as being on the forward strand.}
#' }
#' @source Daniel Fusca, University of Toronto
"testGenes"


#' Dataset of H3K27me3 peaks from C. elegans
#'
#' A set of 2111 genomic coordinates representing ChIP-seq peaks of the histone
#' modification H3K27 trimethylation in the C. elegans genome. ChIP-seq
#' experiments were performed by Arneet Saltzman, and peaks were called
#' by Daniel Fusca using MACS2 (Yong Zhang et al. Model-based Analysis of
#' ChIP-Seq (MACS). Genome Biol (2008) vol. 9 (9) pp. R137). This dataset is
#' intended to be mapped to the features in the WS263Genes object. Note that
#' mapping these peaks to WS263Genes may take several minutes, so the user may
#' use the smaller H3K27me3PeaksSmall dataset as a faster example. This object
#' was imported from the BED file at inst/extdata/H3K27me3Peaks.bed using
#' importBED.
#'
#' @format A dataframe with 2111 rows and 6 columns, representing the first 6
#'   fields of a standard BED file. Note that unlike in BED format, where
#'   intervals are right-open, the base pair given in "End" is actually included
#'   in the peak. The 6 columns are:
#' \describe{
#'   \item{Chr}{The chromosome the peak is on.}
#'   \item{Start}{The left endpoint of the peak.}
#'   \item{End}{The right endpoint of the peak. Unlike standard BED format, this
#'       position is actually included in the peak.}
#'   \item{Name}{The name given to the peak.}
#'   \item{Score}{An arbitrary numeric value that can be given to different
#'       peaks in BED format (e.g. quality scores).}
#'   \item{Strand}{The strand of the peak: + for forward strand, - for reverse
#'       strand, and . for no strand information, in which case PeakMapper
#'       treats this peak as being on the forward strand.}
#' }
#' @source Daniel Fusca and Arneet Saltzman, University of Toronto (unpublished
#'     data)
"H3K27me3Peaks"


#' Small sample of H3K27me3 peaks from C. elegans
#'
#' A set of 100 genomic coordinates representing ChIP-seq peaks of the histone
#' modification H3K27 trimethylation in the C. elegans genome. These are the first
#' 100 peaks given in the H3K27me3Peaks object. This smaller dataset can be
#' used to test PeakMapper more quickly than the full dataset. ChIP-seq
#' experiments were performed by Arneet Saltzman, and peaks were called
#' by Daniel Fusca using MACS2 (Yong Zhang et al. Model-based Analysis of
#' ChIP-Seq (MACS). Genome Biol (2008) vol. 9 (9) pp. R137). This dataset is
#' intended to be mapped to the features in the WS263Genes object. This object
#' was imported from the BED file at inst/extdata/H3K27me3PeaksSmall using
#' importBED.
#'
#' @format A dataframe with 100 rows and 6 columns, representing the first 6
#'   fields of a standard BED file. Note that unlike in BED format, where
#'   intervals are right-open, the base pair given in "End" is actually included
#'   in the peak. The 6 columns are:
#' \describe{
#'   \item{Chr}{The chromosome the peak is on.}
#'   \item{Start}{The left endpoint of the peak.}
#'   \item{End}{The right endpoint of the peak. Unlike standard BED format, this
#'       position is actually included in the peak.}
#'   \item{Name}{The name given to the peak.}
#'   \item{Score}{An arbitrary numeric value that can be given to different
#'       peaks in BED format (e.g. quality scores).}
#'   \item{Strand}{The strand of the peak: + for forward strand, - for reverse
#'       strand, and . for no strand information, in which case PeakMapper
#'       treats this peak as being on the forward strand.}
#' }
#' @source Daniel Fusca and Arneet Saltzman, University of Toronto (unpublished
#'     data)
"H3K27me3PeaksSmall"


#' Dataset of protein-coding genes in C. elegans
#'
#' A dataset of 20,094 protein-coding genes from C. elegans, taken from release
#' WS263 of the WormBase database (https://www.wormbase.org). Protein-coding genes
#' were extracted from the full set of C. elegans genes obtained from WormBase
#' by Daniel Fusca. This dataset is intended for annotating the peaks in the
#' H3K27me3Peaks and H3K27me3PeaksSmall datasets. This object was imported from
#' the BED file at inst/extdata/WS263Genes.bed using importBED.
#'
#' @format A dataframe with 20,094 rows and 6 columns, representing the first 6
#'   fields of a standard BED file. Note that unlike in BED format, where
#'   intervals are right-open, the base pair given in "End" is actually included
#'   in the feature. The 6 columns are:
#' \describe{
#'   \item{Chr}{The chromosome the gene is on.}
#'   \item{Start}{The left endpoint of the gene.}
#'   \item{End}{The right endpoint of the gene. Unlike standard BED format, this
#'       position is actually included in the gene.}
#'   \item{Name}{The name given to the gene.}
#'   \item{Score}{An arbitrary numeric value that can be given to different
#'       genes in BED format (e.g. quality scores).}
#'   \item{Strand}{The strand of the gene: + for forward strand, - for reverse
#'       strand, and . for no strand information, in which case PeakMapper
#'       treats this gene as being on the forward strand.}
#' }
#' @source WormBase Release WS263. Accessed 24 September 2019.
#'     https://wormbase.org/
"WS263Genes"
fuscada2/PeakMapper documentation built on Dec. 8, 2019, 12:35 p.m.