column_misc_functions: Miscellaneous functions for Column operations

Description

Miscellaneous functions defined for Column.

Usage

crc32(x)

hash(x, ...)

md5(x)

sha1(x)

sha2(y, x)

## S4 method for signature 'Column'
crc32(x)

## S4 method for signature 'Column'
hash(x, ...)

## S4 method for signature 'Column'
md5(x)

## S4 method for signature 'Column'
sha1(x)

## S4 method for signature 'Column,numeric'
sha2(y, x)

Arguments

x

Column to compute on. In sha2, x is instead the number of bits: one of 224, 256, 384, or 512.

...

additional Columns.

y

Column to compute on (sha2 only).

Details

crc32: Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.

hash: Calculates the hash code of the given columns and returns the result as an int column.

md5: Calculates the MD5 digest of a binary column and returns the value as a 32 character hex string.

sha1: Calculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.

sha2: Calculates a SHA-2 family digest (SHA-224, SHA-256, SHA-384, or SHA-512) of a binary column and returns the value as a hex string. The second argument x specifies the number of bits and must be one of 224, 256, 384, or 512.
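
The string lengths quoted above follow from each hex character encoding 4 bits. The base R sketch below only illustrates that arithmetic; it does not call Spark or tidyspark, and the helper name hex_length is hypothetical. It also cross-checks the MD5 length locally with tools::md5sum(), which hashes a file rather than a Column.

```r
# Hypothetical helper, not part of tidyspark: number of hex characters
# needed to print a digest of the given bit length (4 bits per hex digit).
hex_length <- function(bits) bits / 4

# Digest sizes in bits for the functions described above.
digest_bits <- c(md5 = 128, sha1 = 160,
                 sha224 = 224, sha256 = 256, sha384 = 384, sha512 = 512)
hex_chars <- hex_length(digest_bits)
print(hex_chars)

# Local cross-check with base R's tools::md5sum(), which hashes a file.
f <- tempfile()
writeBin(charToRaw("abc"), f)
local_md5 <- unname(tools::md5sum(f))
unlink(f)
nchar(local_md5)  # 32, matching the md5 description above
```

By the same arithmetic, sha2(y, 256) should yield a 64 character hex string and sha2(y, 512) a 128 character one.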

Note

crc32 since 1.5.0

hash since 2.0.0

md5 since 1.5.0

sha1 since 1.5.0

sha2 since 1.5.0

Examples

## Not run: 
# DataFrame used throughout this doc
df <- createDataFrame(cbind(model = rownames(mtcars), mtcars)[, 1:2])
tmp <- mutate(df, v1 = crc32(df$model), v2 = hash(df$model),
                  v3 = hash(df$model, df$mpg), v4 = md5(df$model),
                  v5 = sha1(df$model), v6 = sha2(df$model, 256))
head(tmp)
## End(Not run)

danzafar/tidyspark documentation built on Sept. 30, 2020, 12:19 p.m.