annotateGRfromGR | R Documentation |
Annotate GRanges using another GRanges object
annotateGRfromGR(
GR1,
GR2,
grOL = NULL,
numAsStrings = FALSE,
stringShrinkFunc = function(...) {
jamba::cPasteUnique(..., doSort = TRUE)
},
numShrinkFunc = sum,
addStringCols = NULL,
type = c("any", "start", "end", "within", "equal"),
ignore.strand = FALSE,
select = "all",
sep = ",",
verbose = FALSE,
DEBUG = FALSE,
...
)
GR1 |
GRanges object, the reference object to which annotations are added. |
GR2 |
GRanges object used for annotations |
grOL |
overlaps object, optionally used when the
|
numAsStrings |
logical indicating whether numerical values should
be treated as strings for the purpose of added annotations. When
|
stringShrinkFunc |
function used to shrink string annotations
by overlapping feature. This function should take a list as input,
and |
numShrinkFunc |
function used to shrink numerical values
by overlapping feature. For numeric values, the |
addStringCols |
character vector, optional, of numeric colnames that
should be treated as string values. This option is a subset of the
|
type , ignore.strand , select |
arguments sent to
|
sep |
character value indicating the delimiter to use between string values. |
verbose |
logical indicating whether to print verbose output. |
DEBUG |
logical indicating whether to print debugging output. |
... |
additional arguments are ignored. |
This function adds annotations to features in the given GRanges
object, from overlapping features in a second GRanges object.
It is especially useful after performing a manipulation that
results in a GRanges object without any values
annotation,
for example GenomicRanges::reduce()
or GenomicRanges::intersect()
.
In theory this function is relatively simple, it applies annotations
to overlapping entries. In practice, it gets complicated when multiple
annotations overlap the same GRange entry. In that case, numerical
values by default return the base::mean()
while string values call
jamba::cPasteUnique()
by default, and return the unique, sorted,
comma-delimited set of string values. For example, overlapping several
exons from the same gene, the resulting annotation might include
just one gene symbol.
The numeric values can be aggregated using another function, controlled
by numShrinkFunction
named using the colname. For example:
numShrinkFunction="min"
would return the base::min()
value for all
numeric columns. But numShrinkFunction=list(default=mean, score=max)
would return the base::mean()
for all numeric columns except the
"score"
column which would report the base::max()
.
Numeric values currently use the data.table
package workflow, which
provides extremely efficient calculations for numeric values. However,
vectorized functions have the potential to be notably faster, as is
true with jamba::cPasteUnique()
especially when applying
jamba::mixedSort()
to sort alphanumeric values. In future, the
implementation may change to accomodate vectorized numeric functions,
instead of using data.table
.
TODO: This function is a specific case where another function
shrinkDataFrame()
may be a more general purpose use. That function
is yet to be ported to the Jam package suite.
GRanges object with colnames added to values
, with length
and order equal to the input GR1
GRanges object.
Other jam GRanges functions:
addGRLgaps()
,
addGRgaps()
,
annotateGRLfromGRL()
,
assignGRLexonNames()
,
closestExonToJunctions()
,
combineGRcoverage()
,
exoncov2polygon()
,
findOverlapsGRL()
,
flattenExonsBy()
,
getFirstStrandedFromGRL()
,
getGRLgaps()
,
getGRcoverageFromBw()
,
getGRgaps()
,
grl2df()
,
jam_isDisjoint()
,
make_ref2compressed()
,
sortGRL()
,
spliceGR2junctionDF()
,
stackJunctions()
gr12 <- GenomicRanges::GRanges(
seqnames=rep(c("chr1", "chr2", "chr1"), c(3,3,3)),
ranges=IRanges::IRanges(
start=c(100, 200, 400, 500, 300, 100, 200, 400, 600),
width=c(100,150,50, 50,50,100, 50,200,50)
),
strand=rep(c("+", "-", "+"), c(3,3,3)),
gene_name=rep(c("GeneA", "GeneB", "GeneC"), each=3)
)
gr1 <- gr12[,0];
# Say for example you have a GRanges object with no annotation
gr1;
# And had another GRanges object with annotations
gr12;
# To add annotations
annotateGRfromGR(gr1, gr12);
# Notice certain features overlap multiple annotations,
# which may be acceptable.
# If you want to keep annotations distinct,
# use annotateGRLfromGRL()
grl1 <- GenomicRanges::split(gr12[,0],
GenomicRanges::values(gr12)$gene_name);
grl2 <- GenomicRanges::split(gr12,
GenomicRanges::values(gr12)$gene_name);
# The first object is a GRangesList with no annotations
grl1;
# The second object is a GRangesList with annotation,
# assumed to be in the same order
grl2;
annotateGRLfromGRL(grl1, grl2);
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.