DataFrame-class: DataFrame objects

Description

The DataFrame class extends the RectangularData virtual class and supports the storage of any type of object (with length and [ methods) as columns.

Details

On the whole, the DataFrame behaves very similarly to data.frame, in terms of construction, subsetting, splitting, combining, etc. The most notable exception is that the row names are optional. This means calling rownames(x) will return NULL if there are no row names. Of course, it could return seq_len(nrow(x)), but returning NULL informs, for example, combination functions that no row names are desired (they are often a luxury when dealing with large data).
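
The optional row names can be seen in a minimal sketch such as the following. The column names id and label are made up for illustration.

```r
library(S4Vectors)

# A DataFrame built from plain vectors carries no row names.
df <- DataFrame(id = 1:3, label = c("a", "b", "c"))
rownames(df)   # NULL, because nothing was supplied

# Row names exist only once they are set explicitly.
rownames(df) <- c("r1", "r2", "r3")
rownames(df)
```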

As DataFrame derives from Vector, it is possible to set an annotation string. Also, another DataFrame can hold metadata on the columns.
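
As a rough illustration, the sketch below attaches both kinds of annotation. It assumes the metadata() and mcols() accessors that S4Vectors provides for Vector derivatives; the note and unit values are invented for the example.

```r
library(S4Vectors)

df <- DataFrame(score = c(1L, 3L, NA), counts = c(10L, 2L, NA))

# Object-level annotation is kept in the metadata list.
metadata(df)$note <- "toy example"

# Column-level metadata is itself a DataFrame with one row per column of df.
mcols(df) <- DataFrame(unit = c("points", "reads"))

metadata(df)$note
mcols(df)$unit
```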

For a class to be supported as a column, it must have length and [ methods, where [ supports subsetting only by i and respects drop=FALSE. Optionally, a method may be defined for the showAsCell generic, which should return a vector of the same length as the subset of the column passed to it. This vector is then placed into a data.frame and converted to text with format. Thus, each element of the vector should be some simple, usually character, representation of the corresponding element in the column.

Constructor

DataFrame(..., row.names = NULL, check.names = TRUE, stringsAsFactors)

Constructs a DataFrame in similar fashion to data.frame. Each argument in ... is coerced to a DataFrame and combined column-wise. No special effort is expended to automatically determine the row names from the arguments. The row names should be given in row.names; otherwise, there are no row names. This is by design, as row names are normally undesirable when data is large. If check.names is TRUE, the column names will be checked for syntactic validity and made unique, if necessary.

To store an object of a class that does not support coercion to DataFrame, wrap it in I(). The class must still have methods for length and [.

The stringsAsFactors argument is ignored. The coercion of column arguments to DataFrame determines whether strings become factors.
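
The following sketch illustrates check.names and I() as described above. The column names and the list contents are hypothetical.

```r
library(S4Vectors)

# check.names = TRUE (the default) repairs syntactically invalid names.
d1 <- DataFrame(`a b` = 1:2)
colnames(d1)

# check.names = FALSE keeps the names exactly as given.
d2 <- DataFrame(`a b` = 1:2, check.names = FALSE)
colnames(d2)

# I() stores the list as one column rather than coercing it to a DataFrame.
d3 <- DataFrame(id = 1:3, items = I(list(1:2, "x", 3.5)))
dim(d3)
```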

make_zero_col_DFrame(nrow)

Constructs a zero-column DFrame object with nrow rows. Intended for developers to use in other packages and typically not needed by the end user.
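
A short sketch of how a developer might use it: start from an empty frame of known height and add columns afterwards. The column name x is arbitrary.

```r
library(S4Vectors)

# A frame with three rows and no columns.
z <- make_zero_col_DFrame(3L)
dim(z)

# Columns of matching length can be added later.
z$x <- 1:3
dim(z)
```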

Accessors

In the following code snippets, x is a DataFrame.

dim(x): Get the integer vector of length two whose first and second elements are the number of rows and the number of columns, respectively.

dimnames(x), dimnames(x) <- value: Get or set the two-element list containing the row names (a character vector of length nrow(x), or NULL) and the column names (a character vector of length ncol(x)).
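
These accessors might be used as in the sketch below, with a small made-up DataFrame.

```r
library(S4Vectors)

df <- DataFrame(a = 1:3, b = c("x", "y", "z"))

dim(df)        # number of rows, then number of columns
dimnames(df)   # the row-names element is NULL here

# Set row and column names in a single call.
dimnames(df) <- list(c("r1", "r2", "r3"), c("A", "B"))
rownames(df)
colnames(df)
```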

Coercion

as(from, "DataFrame"): By default, constructs a new DataFrame with from as its only column. If from is a matrix or data.frame, all of its columns become columns in the new DataFrame. If from is a list, each element becomes a column, recycling as necessary. Note that for the DataFrame to behave correctly, each column object must support element-wise subsetting via the [ method and return the number of elements with length. It is recommended to use the DataFrame constructor, rather than this interface.

as.list(x): Coerces x, a DataFrame, to a list.

as.data.frame(x, row.names=NULL, optional=FALSE): Coerces x, a DataFrame, to a data.frame. Each column is coerced to a data.frame and the results are then bound together column-wise. If row.names is NULL, the row names are retrieved from x, if it has any. Otherwise, they are inferred by the data.frame constructor.

NOTE: conversion of x to a data.frame is not supported if x contains any list, SimpleList, or CompressedList columns.

as(from, "data.frame"): Coerces a DataFrame to a data.frame by calling as.data.frame(from).

as.matrix(x): Coerces the DataFrame to a matrix, if possible.

as.env(x, enclos = parent.frame()): Creates an environment from x with a symbol for each colnames(x). The values are not actually copied into the environment. Rather, they are dynamically bound using makeActiveBinding. This prevents unnecessary copying of the data from the external vectors into R vectors. The values are cached, so that the data is not copied every time the symbol is accessed.
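
The coercions above can be sketched as follows. The input data.frame is invented, and the as.env() call assumes the signature shown in this section.

```r
library(S4Vectors)

# From a data.frame: every column becomes a DataFrame column.
df <- as(data.frame(a = 1:3, b = c(0.5, 1.5, 2.5)), "DataFrame")

lst <- as.list(df)            # named list, one element per column
base_df <- as.data.frame(df)  # ordinary data.frame
m <- as.matrix(df)            # numeric matrix, since both columns are numeric

# Environment with one symbol per column.
env <- as.env(df)
get("a", envir = env)
```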

Subsetting

In the following code snippets, x is a DataFrame.

x[i,j,drop]: Behaves very similarly to the [.data.frame method, except that i can be a logical Rle object and subsetting by matrix indices is not supported. Indices containing NAs are also not supported.

x[i,j] <- value: Behaves very similarly to the [<-.data.frame method.

x[[i]]: Behaves very similarly to the [[.data.frame method, except arguments j and exact are not supported. Column name matching is always exact. Subsetting by matrices is not supported.

x[[i]] <- value: Behaves very similarly to the [[<-.data.frame method, except argument j is not supported.
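
As a small illustration of these operations, including row selection with a logical Rle, consider the sketch below. The data and the new column name c are made up.

```r
library(S4Vectors)

df <- DataFrame(a = 1:4, b = c("w", "x", "y", "z"))

# A logical Rle as row index: three TRUEs followed by one FALSE.
keep <- Rle(c(TRUE, FALSE), c(3L, 1L))
sub <- df[keep, ]
nrow(sub)

# Column extraction and replacement by exact name.
df[["c"]] <- df[["a"]] * 2L
df[["c"]]

# Replacement of a single cell.
df[2, "b"] <- "q"
df[["b"]]
```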

Combining

In the following code snippets, x is a DataFrame.

rbind(...): Creates a new DataFrame by combining the rows of the DataFrame objects in .... Very similar to rbind.data.frame, except in the handling of row names. If all elements have row names, they are concatenated and made unique. Otherwise, the result does not have row names. The return value inherits its metadata from the first argument.

cbind(...): Creates a new DataFrame by combining the columns of the DataFrame objects in .... Very similar to cbind.data.frame. The return value inherits its metadata from the first argument.
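
A minimal sketch of both operations on toy inputs that have no row names:

```r
library(S4Vectors)

top <- DataFrame(id = 1:2, grp = c("a", "b"))
bottom <- DataFrame(id = 3:4, grp = c("c", "d"))

# Stack rows; neither input has row names, so neither does the result.
stacked <- rbind(top, bottom)
dim(stacked)
rownames(stacked)

# Bind an extra column alongside the existing ones.
wide <- cbind(top, DataFrame(score = c(0.1, 0.2)))
colnames(wide)
```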

Author(s)

Michael Lawrence

See Also

Examples

score <- c(1L, 3L, NA)
counts <- c(10L, 2L, NA)
row.names <- c("one", "two", "three")
  
df <- DataFrame(score) # single column
df[["score"]]
df <- DataFrame(score, row.names = row.names) # with row names
rownames(df)
  
df <- DataFrame(vals = score) # explicit naming
df[["vals"]]

# arrays
ary <- array(1:4, c(2,1,2))
sw <- DataFrame(I(ary))  
  
# a data.frame
sw <- DataFrame(swiss)
as.data.frame(sw) # swiss, without row names
# now with row names
sw <- DataFrame(swiss, row.names = rownames(swiss))
as.data.frame(sw) # swiss

# subsetting
    
sw[] # identity subset
sw[,] # same

sw[NULL] # no columns
sw[,NULL] # no columns
sw[NULL,] # no rows

## select columns
sw[1:3]
sw[,1:3] # same as above
sw[,"Fertility"]
sw[,c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)]

## select rows and columns
sw[4:5, 1:3]
  
sw[1] # one-column DataFrame
## the same
sw[, 1, drop = FALSE]
sw[, 1] # a (unnamed) vector
sw[[1]] # the same
sw[["Fertility"]]

sw[["Fert"]] # NULL: column names are matched exactly, never partially
 
sw[1,] # a one-row DataFrame
sw[1,, drop=TRUE] # a list

## duplicate row, unique row names are created
sw[c(1, 1:2),]

## indexing by row names  
sw["Courtelary",]
subsw <- sw[1:5,1:4]
subsw["C",] # partially matches

## row and column names
cn <- paste("X", seq_len(ncol(swiss)), sep = ".")
colnames(sw) <- cn
colnames(sw)
rn <- seq(nrow(sw))
rownames(sw) <- rn
rownames(sw)

## column replacement

df[["counts"]] <- counts
df[["counts"]]
df[[3]] <- score
df[["X"]]
df[[3]] <- NULL # deletion

Example output

[1]  1  3 NA
[1] "one"   "two"   "three"
[1]  1  3 NA
             Fertility Agriculture Examination Education Catholic
Courtelary        80.2        17.0          15        12     9.96
Delemont          83.1        45.1           6         9    84.84
Franches-Mnt      92.5        39.7           5         5    93.40
Moutier           85.8        36.5          12         7    33.77
Neuveville        76.9        43.5          17        15     5.16
Porrentruy        76.1        35.3           9         7    90.57
Broye             83.8        70.2          16         7    92.85
Glane             92.4        67.8          14         8    97.16
Gruyere           82.4        53.3          12         7    97.67
Sarine            82.9        45.2          16        13    91.38
Veveyse           87.1        64.5          14         6    98.61
Aigle             64.1        62.0          21        12     8.52
Aubonne           66.9        67.5          14         7     2.27
Avenches          68.9        60.7          19        12     4.43
Cossonay          61.7        69.3          22         5     2.82
Echallens         68.3        72.6          18         2    24.20
Grandson          71.7        34.0          17         8     3.30
Lausanne          55.7        19.4          26        28    12.11
La Vallee         54.3        15.2          31        20     2.15
Lavaux            65.1        73.0          19         9     2.84
Morges            65.5        59.8          22        10     5.23
Moudon            65.0        55.1          14         3     4.52
Nyone             56.6        50.9          22        12    15.14
Orbe              57.4        54.1          20         6     4.20
Oron              72.5        71.2          12         1     2.40
Payerne           74.2        58.1          14         8     5.23
Paysd'enhaut      72.0        63.5           6         3     2.56
Rolle             60.5        60.8          16        10     7.72
Vevey             58.3        26.8          25        19    18.46
Yverdon           65.4        49.5          15         8     6.10
Conthey           75.5        85.9           3         2    99.71
Entremont         69.3        84.9           7         6    99.68
Herens            77.3        89.7           5         2   100.00
Martigwy          70.5        78.2          12         6    98.96
Monthey           79.4        64.9           7         3    98.22
St Maurice        65.0        75.9           9         9    99.06
Sierre            92.2        84.6           3         3    99.46
Sion              79.3        63.1          13        13    96.83
Boudry            70.4        38.4          26        12     5.62
La Chauxdfnd      65.7         7.7          29        11    13.79
Le Locle          72.7        16.7          22        13    11.22
Neuchatel         64.4        17.6          35        32    16.92
Val de Ruz        77.6        37.6          15         7     4.97
ValdeTravers      67.6        18.7          25         7     8.65
V. De Geneve      35.0         1.2          37        53    42.34
Rive Droite       44.7        46.6          16        29    50.43
Rive Gauche       42.8        27.7          22        29    58.33
             Infant.Mortality
Courtelary               22.2
Delemont                 22.2
Franches-Mnt             20.2
Moutier                  20.3
Neuveville               20.6
Porrentruy               26.6
Broye                    23.6
Glane                    24.9
Gruyere                  21.0
Sarine                   24.4
Veveyse                  24.5
Aigle                    16.5
Aubonne                  19.1
Avenches                 22.7
Cossonay                 18.7
Echallens                21.2
Grandson                 20.0
Lausanne                 20.2
La Vallee                10.8
Lavaux                   20.0
Morges                   18.0
Moudon                   22.4
Nyone                    16.7
Orbe                     15.3
Oron                     21.0
Payerne                  23.8
Paysd'enhaut             18.0
Rolle                    16.3
Vevey                    20.9
Yverdon                  22.5
Conthey                  15.1
Entremont                19.8
Herens                   18.3
Martigwy                 19.4
Monthey                  20.2
St Maurice               17.8
Sierre                   16.3
Sion                     18.1
Boudry                   20.3
La Chauxdfnd             20.5
Le Locle                 18.9
Neuchatel                23.0
Val de Ruz               20.0
ValdeTravers             19.5
V. De Geneve             18.0
Rive Droite              18.2
Rive Gauche              19.3
             Fertility Agriculture Examination Education Catholic
Courtelary        80.2        17.0          15        12     9.96
Delemont          83.1        45.1           6         9    84.84
Franches-Mnt      92.5        39.7           5         5    93.40
Moutier           85.8        36.5          12         7    33.77
Neuveville        76.9        43.5          17        15     5.16
Porrentruy        76.1        35.3           9         7    90.57
Broye             83.8        70.2          16         7    92.85
Glane             92.4        67.8          14         8    97.16
Gruyere           82.4        53.3          12         7    97.67
Sarine            82.9        45.2          16        13    91.38
Veveyse           87.1        64.5          14         6    98.61
Aigle             64.1        62.0          21        12     8.52
Aubonne           66.9        67.5          14         7     2.27
Avenches          68.9        60.7          19        12     4.43
Cossonay          61.7        69.3          22         5     2.82
Echallens         68.3        72.6          18         2    24.20
Grandson          71.7        34.0          17         8     3.30
Lausanne          55.7        19.4          26        28    12.11
La Vallee         54.3        15.2          31        20     2.15
Lavaux            65.1        73.0          19         9     2.84
Morges            65.5        59.8          22        10     5.23
Moudon            65.0        55.1          14         3     4.52
Nyone             56.6        50.9          22        12    15.14
Orbe              57.4        54.1          20         6     4.20
Oron              72.5        71.2          12         1     2.40
Payerne           74.2        58.1          14         8     5.23
Paysd'enhaut      72.0        63.5           6         3     2.56
Rolle             60.5        60.8          16        10     7.72
Vevey             58.3        26.8          25        19    18.46
Yverdon           65.4        49.5          15         8     6.10
Conthey           75.5        85.9           3         2    99.71
Entremont         69.3        84.9           7         6    99.68
Herens            77.3        89.7           5         2   100.00
Martigwy          70.5        78.2          12         6    98.96
Monthey           79.4        64.9           7         3    98.22
St Maurice        65.0        75.9           9         9    99.06
Sierre            92.2        84.6           3         3    99.46
Sion              79.3        63.1          13        13    96.83
Boudry            70.4        38.4          26        12     5.62
La Chauxdfnd      65.7         7.7          29        11    13.79
Le Locle          72.7        16.7          22        13    11.22
Neuchatel         64.4        17.6          35        32    16.92
Val de Ruz        77.6        37.6          15         7     4.97
ValdeTravers      67.6        18.7          25         7     8.65
V. De Geneve      35.0         1.2          37        53    42.34
Rive Droite       44.7        46.6          16        29    50.43
Rive Gauche       42.8        27.7          22        29    58.33
             Infant.Mortality
Courtelary               22.2
Delemont                 22.2
Franches-Mnt             20.2
Moutier                  20.3
Neuveville               20.6
Porrentruy               26.6
Broye                    23.6
Glane                    24.9
Gruyere                  21.0
Sarine                   24.4
Veveyse                  24.5
Aigle                    16.5
Aubonne                  19.1
Avenches                 22.7
Cossonay                 18.7
Echallens                21.2
Grandson                 20.0
Lausanne                 20.2
La Vallee                10.8
Lavaux                   20.0
Morges                   18.0
Moudon                   22.4
Nyone                    16.7
Orbe                     15.3
Oron                     21.0
Payerne                  23.8
Paysd'enhaut             18.0
Rolle                    16.3
Vevey                    20.9
Yverdon                  22.5
Conthey                  15.1
Entremont                19.8
Herens                   18.3
Martigwy                 19.4
Monthey                  20.2
St Maurice               17.8
Sierre                   16.3
Sion                     18.1
Boudry                   20.3
La Chauxdfnd             20.5
Le Locle                 18.9
Neuchatel                23.0
Val de Ruz               20.0
ValdeTravers             19.5
V. De Geneve             18.0
Rive Droite              18.2
Rive Gauche              19.3
DataFrame with 47 rows and 6 columns
             Fertility Agriculture Examination Education  Catholic
             <numeric>   <numeric>   <integer> <integer> <numeric>
Courtelary        80.2          17          15        12      9.96
Delemont          83.1        45.1           6         9     84.84
Franches-Mnt      92.5        39.7           5         5      93.4
Moutier           85.8        36.5          12         7     33.77
Neuveville        76.9        43.5          17        15      5.16
...                ...         ...         ...       ...       ...
Val de Ruz        77.6        37.6          15         7      4.97
ValdeTravers      67.6        18.7          25         7      8.65
V. De Geneve        35         1.2          37        53     42.34
Rive Droite       44.7        46.6          16        29     50.43
Rive Gauche       42.8        27.7          22        29     58.33
             Infant.Mortality
                    <numeric>
Courtelary               22.2
Delemont                 22.2
Franches-Mnt             20.2
Moutier                  20.3
Neuveville               20.6
...                       ...
Val de Ruz                 20
ValdeTravers             19.5
V. De Geneve               18
Rive Droite              18.2
Rive Gauche              19.3
DataFrame with 47 rows and 6 columns
             Fertility Agriculture Examination Education  Catholic
             <numeric>   <numeric>   <integer> <integer> <numeric>
Courtelary        80.2          17          15        12      9.96
Delemont          83.1        45.1           6         9     84.84
Franches-Mnt      92.5        39.7           5         5      93.4
Moutier           85.8        36.5          12         7     33.77
Neuveville        76.9        43.5          17        15      5.16
...                ...         ...         ...       ...       ...
Val de Ruz        77.6        37.6          15         7      4.97
ValdeTravers      67.6        18.7          25         7      8.65
V. De Geneve        35         1.2          37        53     42.34
Rive Droite       44.7        46.6          16        29     50.43
Rive Gauche       42.8        27.7          22        29     58.33
             Infant.Mortality
                    <numeric>
Courtelary               22.2
Delemont                 22.2
Franches-Mnt             20.2
Moutier                  20.3
Neuveville               20.6
...                       ...
Val de Ruz                 20
ValdeTravers             19.5
V. De Geneve               18
Rive Droite              18.2
Rive Gauche              19.3
DataFrame with 47 rows and 0 columns
DataFrame with 47 rows and 0 columns
DataFrame with 0 rows and 6 columns
DataFrame with 47 rows and 3 columns
             Fertility Agriculture Examination
             <numeric>   <numeric>   <integer>
Courtelary        80.2          17          15
Delemont          83.1        45.1           6
Franches-Mnt      92.5        39.7           5
Moutier           85.8        36.5          12
Neuveville        76.9        43.5          17
...                ...         ...         ...
Val de Ruz        77.6        37.6          15
ValdeTravers      67.6        18.7          25
V. De Geneve        35         1.2          37
Rive Droite       44.7        46.6          16
Rive Gauche       42.8        27.7          22
DataFrame with 47 rows and 3 columns
             Fertility Agriculture Examination
             <numeric>   <numeric>   <integer>
Courtelary        80.2          17          15
Delemont          83.1        45.1           6
Franches-Mnt      92.5        39.7           5
Moutier           85.8        36.5          12
Neuveville        76.9        43.5          17
...                ...         ...         ...
Val de Ruz        77.6        37.6          15
ValdeTravers      67.6        18.7          25
V. De Geneve        35         1.2          37
Rive Droite       44.7        46.6          16
Rive Gauche       42.8        27.7          22
 [1] 80.2 83.1 92.5 85.8 76.9 76.1 83.8 92.4 82.4 82.9 87.1 64.1 66.9 68.9 61.7
[16] 68.3 71.7 55.7 54.3 65.1 65.5 65.0 56.6 57.4 72.5 74.2 72.0 60.5 58.3 65.4
[31] 75.5 69.3 77.3 70.5 79.4 65.0 92.2 79.3 70.4 65.7 72.7 64.4 77.6 67.6 35.0
[46] 44.7 42.8
 [1] 80.2 83.1 92.5 85.8 76.9 76.1 83.8 92.4 82.4 82.9 87.1 64.1 66.9 68.9 61.7
[16] 68.3 71.7 55.7 54.3 65.1 65.5 65.0 56.6 57.4 72.5 74.2 72.0 60.5 58.3 65.4
[31] 75.5 69.3 77.3 70.5 79.4 65.0 92.2 79.3 70.4 65.7 72.7 64.4 77.6 67.6 35.0
[46] 44.7 42.8
DataFrame with 2 rows and 3 columns
           Fertility Agriculture Examination
           <numeric>   <numeric>   <integer>
Moutier         85.8        36.5          12
Neuveville      76.9        43.5          17
DataFrame with 47 rows and 1 column
             Fertility
             <numeric>
Courtelary        80.2
Delemont          83.1
Franches-Mnt      92.5
Moutier           85.8
Neuveville        76.9
...                ...
Val de Ruz        77.6
ValdeTravers      67.6
V. De Geneve        35
Rive Droite       44.7
Rive Gauche       42.8
DataFrame with 47 rows and 1 column
             Fertility
             <numeric>
Courtelary        80.2
Delemont          83.1
Franches-Mnt      92.5
Moutier           85.8
Neuveville        76.9
...                ...
Val de Ruz        77.6
ValdeTravers      67.6
V. De Geneve        35
Rive Droite       44.7
Rive Gauche       42.8
 [1] 80.2 83.1 92.5 85.8 76.9 76.1 83.8 92.4 82.4 82.9 87.1 64.1 66.9 68.9 61.7
[16] 68.3 71.7 55.7 54.3 65.1 65.5 65.0 56.6 57.4 72.5 74.2 72.0 60.5 58.3 65.4
[31] 75.5 69.3 77.3 70.5 79.4 65.0 92.2 79.3 70.4 65.7 72.7 64.4 77.6 67.6 35.0
[46] 44.7 42.8
 [1] 80.2 83.1 92.5 85.8 76.9 76.1 83.8 92.4 82.4 82.9 87.1 64.1 66.9 68.9 61.7
[16] 68.3 71.7 55.7 54.3 65.1 65.5 65.0 56.6 57.4 72.5 74.2 72.0 60.5 58.3 65.4
[31] 75.5 69.3 77.3 70.5 79.4 65.0 92.2 79.3 70.4 65.7 72.7 64.4 77.6 67.6 35.0
[46] 44.7 42.8
 [1] 80.2 83.1 92.5 85.8 76.9 76.1 83.8 92.4 82.4 82.9 87.1 64.1 66.9 68.9 61.7
[16] 68.3 71.7 55.7 54.3 65.1 65.5 65.0 56.6 57.4 72.5 74.2 72.0 60.5 58.3 65.4
[31] 75.5 69.3 77.3 70.5 79.4 65.0 92.2 79.3 70.4 65.7 72.7 64.4 77.6 67.6 35.0
[46] 44.7 42.8
NULL
DataFrame with 1 row and 6 columns
           Fertility Agriculture Examination Education  Catholic
           <numeric>   <numeric>   <integer> <integer> <numeric>
Courtelary      80.2          17          15        12      9.96
           Infant.Mortality
                  <numeric>
Courtelary             22.2
$Fertility
[1] 80.2

$Agriculture
[1] 17

$Examination
[1] 15

$Education
[1] 12

$Catholic
[1] 9.96

$Infant.Mortality
[1] 22.2

DataFrame with 3 rows and 6 columns
           Fertility Agriculture Examination Education  Catholic
           <numeric>   <numeric>   <integer> <integer> <numeric>
Courtelary      80.2          17          15        12      9.96
Courtelary      80.2          17          15        12      9.96
Delemont        83.1        45.1           6         9     84.84
           Infant.Mortality
                  <numeric>
Courtelary             22.2
Courtelary             22.2
Delemont               22.2
DataFrame with 1 row and 6 columns
           Fertility Agriculture Examination Education  Catholic
           <numeric>   <numeric>   <integer> <integer> <numeric>
Courtelary      80.2          17          15        12      9.96
           Infant.Mortality
                  <numeric>
Courtelary             22.2
DataFrame with 1 row and 4 columns
           Fertility Agriculture Examination Education
           <numeric>   <numeric>   <integer> <integer>
Courtelary      80.2          17          15        12
[1] "X.1" "X.2" "X.3" "X.4" "X.5" "X.6"
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
[16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"
[31] "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45"
[46] "46" "47"
[1] 10  2 NA
NULL

S4Vectors documentation built on Dec. 11, 2020, 2:02 a.m.