This function takes a CancerPanel object and returns the regions ready to be submitted for sequencing
panelDesigner(object
    , alterationType = c("copynumber", "expression", "mutations", "fusions")
    , padding_length = 100
    , merge_window = 50
    , utr = FALSE
    , canonicalTranscript = TRUE
    , BPPARAM = bpparam("SerialParam")
    , myhost = "www.ensembl.org")
object |
a CancerPanel Object |
alterationType |
by default, the design of the panel is created by mixing all the different types of alterations. With this parameter you can separate the design by alteration type. |
padding_length |
number of bases added on both sides of a region when a single genomic position is requested |
merge_window |
the minimum distance between two neighbouring ranges for them to be kept separate; ranges closer than this are merged |
utr |
if TRUE, the gene ranges in the panel design are taken as CDS plus UTR. The default is to take just the coding sequence |
canonicalTranscript |
if FALSE, every exon of every transcript of the gene is taken into account when calculating gene length. The default, TRUE, selects only the canonical transcript |
BPPARAM |
an object of class BiocParallelParam used to distribute REST API queries to Ensembl and HGNC. Serial (non-parallel) execution is the default. |
myhost |
If BioMart is unavailable, choose a host other than the default www.ensembl.org. Check which BioMart mirrors are available. |
In most cases, copy number and mutation data are obtained with different technologies, so the designs should be kept separate. Use the 'alterationType' parameter to create multiple libraries. For fusions, the design takes into account all the genes that form the fusion. Fusion genes are often detected from RNA rather than DNA, in which case it is better not to use this function. The same reasoning applies to expression data.

The 'merge_window' parameter is generally chosen by the sequencing company, so set it to 0 if you do not want to decide it upfront. The ability of the machine to capture a region, and the cost of changing this parameter, depend on the technology. It can be very difficult to find the right trade-off between library size and number of ranges. The larger the intronic region you accept, the larger the library, because many off-target bases are included. On the other hand, the more regions in your library, the more amplicons are needed.
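The trade-off between library size and number of ranges can be illustrated with a small base R sketch. This is not the code used by panelDesigner: the function `pad_and_merge`, the column names `chr`, `start`, `end` and the toy coordinates are all hypothetical, and it assumes that two ranges on the same chromosome are merged when the gap between them is shorter than `merge_window`.

```r
# Illustrative sketch only, NOT the implementation inside panelDesigner.
# Function name, column names and coordinates are hypothetical.
pad_and_merge <- function(ranges, padding_length = 0, merge_window = 0) {
  # Pad each interval on both sides, never going below position 1
  ranges$start <- pmax(1, ranges$start - padding_length)
  ranges$end   <- ranges$end + padding_length
  # Sort by chromosome and start so that neighbouring ranges are adjacent
  ranges <- ranges[order(ranges$chr, ranges$start), ]
  merged <- ranges[0, ]
  for (i in seq_len(nrow(ranges))) {
    cur <- ranges[i, ]
    n <- nrow(merged)
    same_chr <- n > 0 && merged$chr[n] == cur$chr
    # Gap = number of bases between the previous range and the current one
    # (negative when the two ranges overlap)
    if (same_chr && (cur$start - merged$end[n] - 1) < merge_window) {
      merged$end[n] <- max(merged$end[n], cur$end)
    } else {
      merged <- rbind(merged, cur)
    }
  }
  rownames(merged) <- NULL
  merged
}

# Total number of bases covered by a set of ranges (inclusive coordinates)
library_size <- function(ranges) sum(ranges$end - ranges$start + 1)

toy <- data.frame(chr   = c("1", "1", "1", "2"),
                  start = c(100, 180, 1000, 100),
                  end   = c(150, 220, 1100, 150),
                  stringsAsFactors = FALSE)

separate <- pad_and_merge(toy, merge_window = 0)
merged   <- pad_and_merge(toy, merge_window = 50)

# Fewer ranges after merging, but more bases to sequence
c(ranges = nrow(separate), size = library_size(separate))
c(ranges = nrow(merged),   size = library_size(merged))
```

Under these assumptions, a larger window reduces the number of ranges at the price of sequencing the bases that lie between the merged ranges.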
A list of 4 elements:
GeneIntervals |
a data.frame containing all gene-wise intervals on CDS and CDS plus UTR |
TargetIntervals |
if the panel contains specific regions, it is a data.frame of non-full gene sequences as requested in the panel |
FullGenes |
a character vector of genes that will be sequenced for their entire length |
BedStylePanel |
a BED-style data.frame with chromosome, start and end, obtained by collapsing GeneIntervals and TargetIntervals |
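As a minimal sketch of how a BED-style data.frame could be saved for submission, the example below writes a tab-separated file with base R. The data.frame `bed` is a hypothetical stand-in for the BedStylePanel element: its column names and coordinates are assumptions, not the actual output of panelDesigner.

```r
# Hypothetical stand-in for mydesign[["BedStylePanel"]];
# the real column names and coordinates may differ.
# Integer columns avoid scientific notation (e.g. 1e+05) in the written file.
bed <- data.frame(chr   = c("1", "2"),
                  start = c(100L, 500L),
                  end   = c(220L, 650L),
                  stringsAsFactors = FALSE)

# Write a tab-separated file without header, quotes or row names
bed_file <- tempfile(fileext = ".bed")
write.table(bed, file = bed_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Inspect the written lines
readLines(bed_file)
```

Check the coordinate convention expected by the sequencing provider before submitting the file.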
Giorgio Melloni, Alessandro Guida
BED file format according to Ensembl
canonical transcript definition according to Ensembl
# Load a CancerPanel object
data(cpObj2)
# Design your panel for mutation data only
# On non-Windows systems, parallelize part of the code
# using a BiocParallel multicore backend
if(tolower(Sys.info()["sysname"])=="windows"){
    # Windows: use the default serial backend
    mydesign <- panelDesigner(cpObj2
        , alterationType="mutations")
} else {
    mydesign <- panelDesigner(cpObj2
        , alterationType="mutations"
        , BPPARAM=BiocParallel::MulticoreParam(workers=2)
        )
}
# Retrieve BED-style sequences
head( mydesign[['BedStylePanel']] )