Imports CAGE data from different sources into a CAGEset object. After the CAGEset object has been created, the data can be further manipulated and visualized using other functions available in the CAGEr package and integrated with other analyses in R. Available resources include:
- FANTOM5 datasets (Forrest et al. Nature 2014) for numerous human and mouse samples (primary cells, cell lines and tissues), which are fetched directly from the FANTOM5 online resource.
- FANTOM3 and FANTOM4 datasets (Carninci et al. Science 2005, Faulkner et al. Nature Genetics 2009, Suzuki et al. Nature Genetics 2009) from the FANTOM3and4CAGE data package, which is available from Bioconductor.
- ENCODE datasets (Djebali et al. Nature 2012) for numerous human cell lines from the ENCODEprojectCAGE data package, which is available for download from http://promshift.genereg.net/CAGEr/.
- Zebrafish developmental timecourse datasets (Nepal et al. Genome Research 2013) from the ZebrafishDevelopmentalCAGE data package, which is available for download from http://promshift.genereg.net/CAGEr/.
importPublicData(source, dataset, group, sample)
source
Character vector specifying one of the available resources for CAGE data. Can be one of the following: "FANTOM5", "FANTOM3and4", "ENCODE", "ZebrafishDevelopment".
dataset
Character vector specifying one or more of the datasets available in the selected resource. The accepted values depend on the resource, as described for each resource in the paragraphs that follow.
group
Character vector specifying one or more groups within the specified dataset(s), from which the samples should be selected.
sample
Character vector specifying one or more CAGE samples. Check the corresponding data package for available samples within each group and their labels. For the FANTOM5 resource, the list of all human (~1000) and mouse (~) samples can be obtained in CAGEr by loading data(FANTOM5humanSamples) and data(FANTOM5mouseSamples).
CAGE data from different sources is available for importing directly into a CAGEset object for further manipulation with CAGEr.
The FANTOM consortium provides single base-pair resolution TSS data for numerous human and mouse primary cells, cell lines and tissues produced within the FANTOM5 project (Forrest et al. Nature 2014). These are fetched directly from the online resource at http://fantom.gsc.riken.jp/5/data and imported into a CAGEset object. To use this resource, specify source="FANTOM5". The dataset argument can be either "human" or "mouse", but not both at the same time. The list of all human and mouse samples can be obtained by loading data(FANTOM5humanSamples) and data(FANTOM5mouseSamples). The sample column gives the names of individual samples that should be provided as the sample argument. See example below.
TSS data from the previous FANTOM3 and FANTOM4 projects (Carninci et al., Faulkner et al., Suzuki et al.) are also available through the FANTOM3and4CAGE data package, which can be installed directly from Bioconductor. To use this resource, install and load the FANTOM3and4CAGE package and specify source="FANTOM3and4". The dataset argument can be the name of any dataset available in this package. Load data(FANTOMhumanSamples) or data(FANTOMmouseSamples) for the list of available datasets with group and sample labels for specific human or mouse samples. These have to be provided as the dataset, group and sample arguments to import the selected samples. If all samples belong to the same group, that group needs to be provided only once; otherwise a corresponding group has to be specified for each sample, i.e. the number of elements in group must match the number of elements in sample.
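The length rule for group and sample can be checked before calling importPublicData. The following is a minimal base R sketch, not part of CAGEr: the helper name matchGroupsToSamples and the sample labels "s1", "s2", "s3" are made up for illustration.

```r
# Hypothetical helper (not part of CAGEr): expand or validate `group`
# so that it has exactly one element per requested sample.
matchGroupsToSamples <- function(group, sample) {
    if (length(group) == 1L) {
        # all samples belong to the same group: repeat it for each sample
        return(rep(group, length(sample)))
    }
    if (length(group) != length(sample)) {
        stop("'group' must have length 1 or the same length as 'sample' (",
             length(group), " vs ", length(sample), ")")
    }
    # already one group per sample: return unchanged
    group
}

# One shared group is expanded to match three placeholder samples.
sharedGroup <- matchGroupsToSamples("brain", c("s1", "s2", "s3"))

# One group per sample is returned unchanged.
perSampleGroup <- matchGroupsToSamples(c("brain", "liver"), c("s1", "s2"))
```

Any other combination of lengths stops with an error, which mirrors the requirement that the number of elements in group matches the number of elements in sample.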
The ENCODE consortium produced CAGE data for numerous human cell lines (Djebali et al. Nature 2012). We have used these data to derive single base-pair resolution TSSs and collected them into an R data package, ENCODEprojectCAGE. This data package is available for download from http://promshift.genereg.net/CAGEr/. To use this resource, install and load the ENCODEprojectCAGE data package and specify source="ENCODE". The dataset argument can be the name of any dataset available in this package. Load data(ENCODEhumanCellLinesSamples) for the list of available datasets with group and sample labels for specific samples. These have to be provided as the dataset, group and sample arguments to import the selected samples. Multiple datasets can be combined by specifying them together in the dataset argument. If all samples belong to the same dataset and the same group, that dataset and group need to be specified only once; otherwise a corresponding dataset and group have to be specified for each sample, i.e. the number of elements in dataset and group must match the number of elements in sample.
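One way to keep dataset, group and sample aligned is to look the samples up in the sample table and pass the resulting parallel vectors to importPublicData. The sketch below is an illustration only: the helper selectSampleArgs is hypothetical, the toy table uses made-up labels, and it assumes the sample table has columns named dataset, group and sample with unique sample labels.

```r
# Hypothetical helper (not part of CAGEr): look up the requested samples in a
# sample table and return parallel dataset/group/sample vectors.
# Assumes columns named "dataset", "group" and "sample", and unique sample labels.
selectSampleArgs <- function(sampleTable, sampleNames) {
    idx <- match(sampleNames, sampleTable$sample)
    if (anyNA(idx)) {
        stop("unknown sample(s): ",
             paste(sampleNames[is.na(idx)], collapse = ", "))
    }
    list(dataset = as.character(sampleTable$dataset[idx]),
         group   = as.character(sampleTable$group[idx]),
         sample  = as.character(sampleTable$sample[idx]))
}

# Toy table with made-up labels, standing in for ENCODEhumanCellLinesSamples.
toySamples <- data.frame(dataset = c("dsA", "dsA", "dsB"),
                         group   = c("g1", "g2", "g1"),
                         sample  = c("dsA_g1_rep1", "dsA_g2_rep1", "dsB_g1_rep1"),
                         stringsAsFactors = FALSE)
importArgs <- selectSampleArgs(toySamples, c("dsB_g1_rep1", "dsA_g1_rep1"))
str(importArgs)

# With CAGEr and ENCODEprojectCAGE installed, the same pattern would apply to
# the real table (mySampleNames is a placeholder for labels taken from it):
# data(ENCODEhumanCellLinesSamples)
# importArgs <- selectSampleArgs(ENCODEhumanCellLinesSamples, mySampleNames)
# encodeSet <- do.call(importPublicData, c(list(source = "ENCODE"), importArgs))
```

Because every vector is derived from the same row indices, the number of elements in dataset and group always equals the number of elements in sample.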
Precise TSSs are also available for zebrafish (Danio rerio) from CAGE data published by Nepal et al. for 12 developmental stages. These have been collected into a data package, ZebrafishDevelopmentalCAGE, which is available for download from http://promshift.genereg.net/CAGEr/. To use this resource, install and load the ZebrafishDevelopmentalCAGE data package and specify source="ZebrafishDevelopment". Load data(ZebrafishSamples) for the list of available datasets and the group and sample labels, which have to be specified to import these data.
A CAGEset object is returned. The slots librarySizes, CTSScoordinates and tagCountMatrix are populated with the single base-pair resolution TSS data imported from the selected resource.
Vanja Haberle
Carninci et al. (2005) The Transcriptional Landscape of the Mammalian Genome, Science 309(5740):1559-1563.
Djebali et al. (2012) Landscape of transcription in human cells, Nature 488(7414):101-108.
Faulkner et al. (2009) The regulated retrotransposon transcriptome of mammalian cells, Nature Genetics 41:563-571.
Forrest et al. (2014) A promoter-level mammalian expression atlas, Nature 507(7493):462-470.
Nepal et al. (2013) Dynamic regulation of the transcription initiation landscape at single nucleotide resolution during vertebrate embryogenesis, Genome Research 23(11):1938-1950.
Suzuki et al. (2009) The transcriptional network that controls growth arrest and differentiation in a human myeloid leukemia cell line, Nature Genetics 41:553-562.
library(CAGEr)

### importing FANTOM5 data
# list of FANTOM5 human tissue samples
data(FANTOM5humanSamples)
head(subset(FANTOM5humanSamples, type == "tissue"))
# import selected samples
exampleCAGEset <- importPublicData(source = "FANTOM5", dataset = "human",
    sample = c("adipose_tissue__adult__pool1", "adrenal_gland__adult__pool1",
               "aorta__adult__pool1"))
exampleCAGEset
### importing FANTOM3/4 data from a data package
library(FANTOM3and4CAGE)
# list of mouse datasets available in this package
data(FANTOMmouseSamples)
unique(FANTOMmouseSamples$dataset)
head(subset(FANTOMmouseSamples, dataset == "FANTOMtissueCAGEmouse"))
head(subset(FANTOMmouseSamples, dataset == "FANTOMtimecourseCAGEmouse"))
# import selected samples from two different mouse datasets
exampleCAGEset <- importPublicData(source = "FANTOM3and4",
    dataset = c("FANTOMtissueCAGEmouse", "FANTOMtimecourseCAGEmouse"),
    group = c("brain", "adipogenic_induction"),
    sample = c("CCL-131_Neuro-2a_treatment_for_6hr_with_MPP+",
               "DFAT-D1_preadipocytes_2days"))
exampleCAGEset