inst/script/data.R

#' Example file of annotated integration matrix.
#'
#' @description This file is a very simple and compressed example to showcase
#' some of the functionality of the ISAnalytics package. The general structure of
#' the matrix is obtained as a product of the Vispa2 pipeline + create_matrix +
#' annotate_matrix functions.
#' For more information regarding Vispa2 please read this article:
#'
#' \href{https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5702242/}{VISPA2:
#' A Scalable Pipeline for High-Throughput Identification
#' and Annotation of Vector Integration Sites}.\cr
#'
#' The headers of the dataset are standard except for experimental data:
#' * chr : the chromosome number
#' * integration_locus: the site of integration
#' * strand: the DNA strand on which the integration took place
#' * GeneName : name of the gene
#' * GeneStrand : strand of the gene
#' * exp_... : names of the experimental data (not standard, can be anything)
#'
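#' A minimal sketch of a data frame with the layout described above. All
#' values, the gene names and the `exp_sample1` column are invented for
#' illustration and are not taken from the real dataset:
#'
#' ```r
#' # Hypothetical values, for illustration only (not the real dataset)
#' example_matrix <- data.frame(
#'     chr = c("1", "2"),
#'     integration_locus = c(14536L, 73762L),
#'     strand = c("+", "-"),
#'     GeneName = c("GENE_A", "GENE_B"),
#'     GeneStrand = c("+", "-"),
#'     exp_sample1 = c(12, 0),
#'     stringsAsFactors = FALSE
#' )
#' # Every column that is not a standard header holds experimental data
#' standard_cols <- c(
#'     "chr", "integration_locus", "strand", "GeneName", "GeneStrand"
#' )
#' experimental_cols <- setdiff(colnames(example_matrix), standard_cols)
#' ```
#'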
"ex_annotated_ISMatrix"

#' Example file of old style, not annotated integration matrix.
#'
#' @description This file is a very simple and compressed example to showcase
#' some of the functionality of the ISAnalytics package. The general structure of
#' the matrix is obtained as a product of the Vispa2 pipeline + create_matrix +
#' annotate_matrix functions.
#' For more information regarding Vispa2 please read this article:
#'
#' \href{https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5702242/}{VISPA2:
#' A Scalable Pipeline for High-Throughput Identification
#' and Annotation of Vector Integration Sites}.\cr
#'
#' The difference with standard Vispa2 annotated matrices lies in the fact that
#' the genomic coordinates are stored in a single column called IS_genomicID.
#'
#' Please note that this example and the associated functionality exist for
#' compatibility reasons only: if you're using the current version of Vispa2
#' you should always obtain separated columns for genomic coordinates and
#' the corresponding annotation columns.
#'
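#' A rough sketch of how a single `IS_genomicID` column could be split into
#' separate coordinate columns with base R. The id format used here
#' (`<chr>_<locus>_<strand>`) is an assumption made for illustration, as are
#' the values and the `exp_sample1` column; check your own files before
#' relying on it:
#'
#' ```r
#' # ASSUMPTION: ids look like "<chr>_<locus>_<strand>"; real files may differ
#' old_style <- data.frame(
#'     IS_genomicID = c("1_14536_+", "2_73762_-"),
#'     exp_sample1 = c(12, 0),
#'     stringsAsFactors = FALSE
#' )
#' # Split each id on the underscore and pick the three components
#' parts <- strsplit(old_style$IS_genomicID, "_", fixed = TRUE)
#' new_style <- data.frame(
#'     chr = vapply(parts, function(p) p[1], character(1)),
#'     integration_locus = as.integer(
#'         vapply(parts, function(p) p[2], character(1))
#'     ),
#'     strand = vapply(parts, function(p) p[3], character(1)),
#'     # Keep every other column (the experimental data) unchanged
#'     old_style[setdiff(colnames(old_style), "IS_genomicID")],
#'     stringsAsFactors = FALSE
#' )
#' ```
#'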
"ex_old_style_ISMatrix"

#' Example file of not annotated integration matrix.
#'
#' @description This file is a very simple and compressed example to showcase
#' some of the functionality of the ISAnalytics package. The general structure of
#' the matrix is obtained as a product of the Vispa2 pipeline + create_matrix +
#' annotate_matrix functions.
#' For more information regarding Vispa2 please read this article:
#'
#' \href{https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5702242/}{VISPA2:
#' A Scalable Pipeline for High-Throughput Identification
#' and Annotation of Vector Integration Sites}.\cr
#'
#' The difference with standard Vispa2 annotated matrices lies in the fact that
#' there are no annotation columns, namely "GeneName" and "GeneStrand".
#'
#' Please note that this example and the associated functionality exist for
#' compatibility reasons only: if you're using the current version of Vispa2
#' you should always obtain separated columns for genomic coordinates and
#' the corresponding annotation columns.
#'
"ex_notann_ISMatrix"

#' Example file of malformed integration matrix.
#'
#' @description This file is a very simple and compressed example to showcase
#' some of the functionality of the ISAnalytics package. The general structure of
#' the matrix is obtained as a product of the Vispa2 pipeline + create_matrix +
#' annotate_matrix functions.
#' For more information regarding Vispa2 please read this article:
#'
#' \href{https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5702242/}{VISPA2:
#' A Scalable Pipeline for High-Throughput Identification
#' and Annotation of Vector Integration Sites}.\cr
#'
#' This matrix is malformed on purpose, missing one of the fundamental columns.
#' Trying to import this matrix or convert it to an ISADataFrame object should
#' always result in some kind of error.
#'
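#' A minimal sketch of the kind of check that would flag such a matrix. It
#' assumes that the fundamental columns are the three genomic coordinates;
#' `find_missing_columns` is a hypothetical helper and not part of
#' ISAnalytics, and the example values are invented:
#'
#' ```r
#' # ASSUMPTION: the fundamental columns are the genomic coordinates
#' mandatory_cols <- c("chr", "integration_locus", "strand")
#'
#' # Hypothetical helper: names of the required columns absent from x
#' find_missing_columns <- function(x, required = mandatory_cols) {
#'     setdiff(required, colnames(x))
#' }
#'
#' # Invented malformed matrix: the integration_locus column is missing
#' malformed <- data.frame(
#'     chr = "1", strand = "+", exp_sample1 = 12,
#'     stringsAsFactors = FALSE
#' )
#' missing_cols <- find_missing_columns(malformed)
#' if (length(missing_cols) > 0) {
#'     message("Malformed matrix, missing: ",
#'             paste(missing_cols, collapse = ", "))
#' }
#' ```
#'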
"ex_malformed_ISMatrix"

#' Example of association file.
#'
#' @description This file is a simple example of an association file. Use it as
#' a reference to properly fill out yours.
#' To generate an empty association file to fill in, see the
#' `generate_blank_association_file` function.
#' @seealso \code{\link{generate_blank_association_file}}
#'
"ex_association_file"

#' Example of file system generated by Vispa2 output
#'
#' @description A compressed example of the file system structure produced by
#' the Vispa2 pipeline. For more info on this:
#' \code{vignette("How to use import functions", package = "ISAnalytics")}
#'
"fs"

#' @describeIn "fs" Example of file system with issues
"fserr"

#' Gene annotation file for human hg19 genome.
#'
#' @description
#' This file was obtained by following these steps:
#'
#' 1. Download from \url{http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/}
#' the refGene.sql, knownGene.sql, knownToRefSeq.sql, kgXref.sql tables
#' 2. Import everything into MySQL
#' 3. Generate views for annotation:
#'
#' ```
#' SELECT kg.`chrom`, min(kg.cdsStart) as CDS_minStart,
#' max(kg.`cdsEnd`) as CDS_maxEnd, k2a.geneSymbol,
#' kg.`strand` as GeneStrand, min(kg.txStart) as TSS_minStart,
#' max(kg.txEnd) as TSS_maxStart,
#' kg.proteinID as ProteinID, k2a.protAcc as ProteinAcc, k2a.spDisplayID
#' FROM `knownGene` AS kg JOIN kgXref AS k2a
#' ON BINARY kg.name = k2a.kgID COLLATE latin1_bin
#' -- latin1_swedish_ci
#' -- WHERE k2a.spDisplayID IS NOT NULL and (k2a.`geneSymbol` LIKE 'Tcra%' or
#' -- k2a.`geneSymbol` LIKE 'TCRA%')
#' WHERE (k2a.spDisplayID IS NOT NULL or k2a.spDisplayID NOT LIKE '')
#' and k2a.`geneSymbol` LIKE 'Tcra%'
#' group by kg.`chrom`, k2a.geneSymbol
#' ORDER BY kg.chrom ASC , kg.txStart ASC
#' ```
"hg19.refGene.oracle"

## keywords `proto-oncogenes` `tumor suppressor`

#' Data frames for proto-oncogenes (human and mouse)
#' and tumor-suppressor genes from UniProt.
#'
#' @description
#' The file is simply the result of a search with the keywords
#' "proto-oncogenes" and "tumor suppressor" for the target genomes
#' on the UniProt database.
"201806_uniprot-Proto-oncogene"
"201806_uniprot-Tumor-suppressor"


ISAnalytics documentation built on April 9, 2021, 6:01 p.m.