makeGRangesListFromDataFrame: Make a GRangesList object from a data.frame or DataFrame

View source: R/makeGRangesListFromDataFrame.R

Description

makeGRangesListFromDataFrame extends the makeGRangesFromDataFrame functionality from GenomicRanges. It takes a data-frame-like object as input, tries to find the columns that describe the genomic ranges automatically, and returns a GRangesList object. Unlike makeGRangesFromDataFrame, it accepts a split.field that names the column of df used to group the ranges. The values in that column act like the "f" argument of the split function: there is one value per row of df, and the column may be a factor or a character vector.
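The analogy with split can be sketched as follows. This only illustrates the behavior described above and is not necessarily how the function is implemented; the data frame and its column names are made up for the example.

```r
library(GenomicRanges)

# Hypothetical input: three ranges that belong to two groups
df <- data.frame(chr = "chr1",
                 start = c(1L, 5L, 9L), end = c(3L, 7L, 11L),
                 specimen = c("a", "a", "b"))

# One call that groups the ranges by the "specimen" column
grl <- makeGRangesListFromDataFrame(df, split.field = "specimen")

# The same grouping done by hand: build a flat GRanges, then split it
gr <- makeGRangesFromDataFrame(df)
grl_manual <- split(gr, df$specimen)

names(grl)
names(grl_manual)
```

Both objects should carry one list element per distinct value in the specimen column.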

Usage

makeGRangesListFromDataFrame(df,
                             split.field = NULL,
                             names.field = NULL,
                             ...)

Arguments

df

A DataFrame or data.frame class object

split.field

A single character string naming a column in df that contains the grouping. This column defines how the rows of df are split and is typically a factor or character vector. When split.field is not provided, df is split by row, so each row becomes its own list element.

names.field

An optional single character string indicating the name of the column in df that designates the names for the ranges in the elements of the GRangesList.

...

Additional arguments passed on to makeGRangesFromDataFrame
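As a brief illustration of the pass-through, the sketch below forwards ignore.strand, an argument of makeGRangesFromDataFrame, through "...". The data frame is hypothetical.

```r
library(GenomicRanges)

# Hypothetical input with mixed strands
df <- data.frame(chr = "chr1", start = 11:13, end = 12:14,
                 strand = c("+", "-", "+"),
                 specimen = c("a", "a", "b"))

# ignore.strand is not an argument of makeGRangesListFromDataFrame itself;
# it reaches makeGRangesFromDataFrame through "..."
grl <- makeGRangesListFromDataFrame(df, split.field = "specimen",
                                    ignore.strand = TRUE)

# Collect the strand values across all list elements
strands <- unique(as.character(strand(unlist(grl))))
strands
```

With ignore.strand = TRUE the strand column is disregarded, so every range should come back unstranded.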

Value

A GRangesList of the same length as the number of levels or unique character strings in the df column indicated by split.field. When split.field is not provided the df is split by row and the resulting GRangesList has the same length as nrow(df).

Names on the individual ranges are taken from the df column indicated by names.field. Names on the outer list elements of the GRangesList are taken from the values in the split.field column.
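A minimal sketch of how the return value could be inspected, assuming a made-up data frame with a specimen grouping column and a gene_symbols naming column:

```r
library(GenomicRanges)

# Hypothetical input: five ranges, three specimens
df <- data.frame(chr = "chr1", start = 11:15, end = 12:16,
                 specimen = c("a", "a", "b", "b", "c"),
                 gene_symbols = paste0("GENE", letters[1:5]))

grl <- makeGRangesListFromDataFrame(df, split.field = "specimen",
                                    names.field = "gene_symbols")

length(grl)        # one list element per distinct specimen
names(grl)         # outer names come from the specimen column
names(grl[["a"]])  # range names come from the gene_symbols column

# Without split.field, each row becomes its own list element
by_row <- makeGRangesListFromDataFrame(df)
length(by_row) == nrow(df)
```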

Author(s)

M. Ramos

See Also

Examples

## ---------------------------------------------------------------------
## BASIC EXAMPLES
## ---------------------------------------------------------------------

library(GenomicRanges)

df <- data.frame(chr="chr1", start=11:15, end=12:16,
                 strand=c("+","-","+","*","."), score=1:5,
                 specimen = c("a", "a", "b", "b", "c"),
                 gene_symbols = paste0("GENE", letters[1:5]))
df

grl <- makeGRangesListFromDataFrame(df, split.field = "specimen",
                                    names.field = "gene_symbols")
grl
names(grl)

## Keep metadata columns
makeGRangesListFromDataFrame(df, split.field = "specimen",
                             keep.extra.columns = TRUE)

Example output

   chr start end strand score specimen gene_symbols
1 chr1    11  12      +     1        a        GENEa
2 chr1    12  13      -     2        a        GENEb
3 chr1    13  14      +     3        b        GENEc
4 chr1    14  15      *     4        b        GENEd
5 chr1    15  16      .     5        c        GENEe
GRangesList object of length 3:
$a 
GRanges object with 2 ranges and 0 metadata columns:
        seqnames    ranges strand
           <Rle> <IRanges>  <Rle>
  GENEa     chr1  [11, 12]      +
  GENEb     chr1  [12, 13]      -

$b 
GRanges object with 2 ranges and 0 metadata columns:
        seqnames   ranges strand
  GENEc     chr1 [13, 14]      +
  GENEd     chr1 [14, 15]      *

$c 
GRanges object with 1 range and 0 metadata columns:
        seqnames   ranges strand
  GENEe     chr1 [15, 16]      *

-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
[1] "a" "b" "c"
GRangesList object of length 3:
$a 
GRanges object with 2 ranges and 2 metadata columns:
      seqnames    ranges strand |     score gene_symbols
         <Rle> <IRanges>  <Rle> | <integer>     <factor>
  [1]     chr1  [11, 12]      + |         1        GENEa
  [2]     chr1  [12, 13]      - |         2        GENEb

$b 
GRanges object with 2 ranges and 2 metadata columns:
      seqnames   ranges strand | score gene_symbols
  [1]     chr1 [13, 14]      + |     3        GENEc
  [2]     chr1 [14, 15]      * |     4        GENEd

$c 
GRanges object with 1 range and 2 metadata columns:
      seqnames   ranges strand | score gene_symbols
  [1]     chr1 [15, 16]      * |     5        GENEe

-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths

GenomicRanges documentation built on Nov. 8, 2020, 5:46 p.m.