SQLDataFrame-methods: SQLDataFrame methods

Description Usage Arguments Value Examples

Description

head, tail: Retrieve the first / last n rows of the SQLDataFrame object. See ?S4Vectors::head for more details.

dim, dimnames, length, names: Retrieve the dimension, dimension names, number of columns and colnames of SQLDataFrame object.

[i, j] supports subsetting by i (for row) and j (for column) and respects ‘drop=FALSE’.

Use select() function to select certain columns.

Use filter() to choose rows/cases where conditions are true.

mutate() adds new columns and preserves existing ones; It also preserves the number of rows of the input. New variables overwrite existing variables of the same name.

connSQLDataFrame returns the connection of a SQLDataFrame object.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
## S4 method for signature 'SQLDataFrame'
head(x, n = 6L)

## S4 method for signature 'SQLDataFrame'
tail(x, n = 6L)

## S4 method for signature 'SQLDataFrame'
dim(x)

## S4 method for signature 'SQLDataFrame'
dimnames(x)

## S4 method for signature 'SQLDataFrame'
length(x)

## S4 method for signature 'SQLDataFrame'
names(x)

## S4 method for signature 'SQLDataFrame,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'SQLDataFrame,SQLDataFrame'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'SQLDataFrame,list'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'SQLDataFrame'
x[[i, j, ...]]

## S4 method for signature 'SQLDataFrame'
x$name

## S3 method for class 'SQLDataFrame'
select(.data, ...)

## S3 method for class 'SQLDataFrame'
filter(.data, ...)

## S3 method for class 'SQLDataFrame'
mutate(.data, ...)

connSQLDataFrame(x)

Arguments

x

An SQLDataFrame object.

n

Number of rows.

i

Row subscript. Could be numeric / character / logical values, a named list of key values, and SQLDataFrame, data.frame, tibble objects.

j

Column subscript.

...

additional arguments to be passed.

  • select(): One or more unquoted expressions separated by commas. You can treat variable names like they are positions, so you can use expressions like ‘x:y’ to select ranges of variables. Positive values select variables; negative values drop variables. See ?dplyr::select for more details.

  • filter(): Logical predicates defined in terms of the variables in ‘.data’. Multiple conditions are combined with ‘&’. Only rows where the condition evaluates to ‘TRUE’ are kept. See ?dplyr::filter for more details.

  • mutate(): Name-value pairs of expressions, each with length 1 or the same length as the number of rows in the group (if using ‘group_by()’) or in the entire input (if not using groups). The name of each argument will be the name of a new variable, and the value will be its corresponding value. New variables overwrite existing variables of the same name. NOTE that the new value could only be of length 1 or the operation of existing columns. If a new vector of values are given, error will return. This is due to the internal method of 'mutate.tbl_lazy' not being able to take new arbitrary values.

drop

Whether to drop with reduced dimension. Default is TRUE.

name

column name to be extracted by $.

.data

A SQLDataFrame object.

Value

head, tail: An SQLDataFrame object with certain rows.

dim: interger vector

dimnames: A list of character vectors.

length: An integer

names: A character vector

[i, j]: A SQLDataFrame object or vector with realized column values (with single column subsetting and default drop=TRUE. )

select: always returns a SQLDataFrame object no matter how may columns are selected. If only key column(s) is(are) selected, it will return a SQLDataFrame object with 0 col (only key columns are shown).

filter: A SQLDataFrame object with subset rows of the input SQLDataFrame object matching conditions.

mutate: A SQLDataFrame object.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
##################
## basic methods
##################

test.db <- system.file("extdata/test.db", package = "SQLDataFrame")
conn <- DBI::dbConnect(DBI::dbDriver("SQLite"), dbname = test.db)
obj <- SQLDataFrame(conn = conn, dbtable = "state", dbkey = "state")
dim(obj)
dimnames(obj)
length(obj)
names(obj)

obj1 <- SQLDataFrame(conn = conn, dbtable = "state",
                     dbkey = c("region", "population"))
###############
## subsetting
###############

obj[1]
obj["region"]
obj$region
obj[]
obj[,]
obj[NULL, ]
obj[, NULL]

## by numeric / logical / character vectors
obj[1:5, 2:3]
obj[c(TRUE, FALSE), c(TRUE, FALSE)]
obj[c("Alabama", "South Dakota"), ]
obj1[c("South:3615.0", "West:3559.0"), ]
### Remeber to add `.0` trailing for numeric values. If not sure,
### check `ROWNAMES()`.

## by SQLDataFrame
obj_sub <- obj[sample(10), ]
obj[obj_sub, ]

## by a named list of key column values (or equivalently data.frame /
## tibble)
obj[data.frame(state = c("Colorado", "Arizona")), ]
obj[tibble::tibble(state = c("Colorado", "Arizona")), ]
obj[list(state = c("Colorado", "Arizona")), ]
obj1[list(region = c("South", "West"),
          population = c("3615.0", "365.0")), ]
### remember to add the '.0' trailing for numeric values. If not sure,
### check `ROWNAMES()`.

## Subsetting with key columns

obj["state"] ## list style subsetting, return a SQLDataFrame object with col = 0.
obj[c("state", "division")]  ## list style subsetting, return a SQLDataFrame object with col = 1.
obj[, "state"] ## realize specific key column value.
obj[, c("state", "division")] ## col = 1, but do not realize.


###################
## select, filter, mutate
###################
library(dplyr)
obj %>% select(division)  ## equivalent to obj["division"], or obj[, "division", drop = FALSE]
obj %>% select(region:size)

obj %>% filter(region == "West" & size == "medium")
obj1 %>% filter(region == "West" & population > 10000)

obj %>% mutate(p1 = population / 10)
obj %>% mutate(s1 = size)

obj %>% select(region, size, population) %>% 
    filter(population > 10000) %>% 
    mutate(pK = population/1000)
obj1 %>% select(region, size, population) %>% 
    filter(population > 10000) %>% 
    mutate(pK = population/1000)  

###################
## connection info
###################

connSQLDataFrame(obj)

SQLDataFrame documentation built on Nov. 29, 2020, 2:01 a.m.