Description Usage Arguments Details Note See Also Examples
Date time functions defined for Column
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 | current_date(x = "missing")
current_timestamp(x = "missing")
date_trunc(format, x)
dayofmonth(x)
dayofweek(x)
dayofyear(x)
from_unixtime(x, ...)
last_day(x)
to_date(x, format)
to_timestamp(x, format)
unix_timestamp(x, format)
weekofyear(x)
## S4 method for signature 'Column'
dayofmonth(x)
## S4 method for signature 'Column'
dayofweek(x)
## S4 method for signature 'Column'
dayofyear(x)
## S4 method for signature 'Column'
last_day(x)
## S4 method for signature 'Column,missing'
to_date(x, format)
## S4 method for signature 'Column,character'
to_date(x, format)
## S4 method for signature 'Column,missing'
to_timestamp(x, format)
## S4 method for signature 'Column,character'
to_timestamp(x, format)
## S4 method for signature 'Column'
weekofyear(x)
## S4 method for signature 'Column'
from_unixtime(x, format = "yyyy-MM-dd HH:mm:ss")
## S3 method for class 'Column'
window(x, windowDuration, slideDuration = NULL, startTime = NULL, ...)
## S4 method for signature 'missing,missing'
unix_timestamp(x, format)
## S4 method for signature 'Column,missing'
unix_timestamp(x, format)
## S4 method for signature 'Column,character'
unix_timestamp(x, format = "yyyy-MM-dd HH:mm:ss")
## S4 method for signature 'Column'
trunc(x, format)
## S4 method for signature 'character,Column'
date_trunc(format, x)
## S4 method for signature 'missing'
current_date()
## S4 method for signature 'missing'
current_timestamp()
|
x |
Column to compute on. In |
format |
The format for the given dates or timestamps in Column
|
... |
additional argument(s). |
windowDuration |
a string specifying the width of the window, e.g. '1 second', '1 day 12 hours', '2 minutes'. Valid interval strings are 'week', 'day', 'hour', 'minute', 'second', 'millisecond', 'microsecond'. Note that the duration is a fixed length of time, and does not vary over time according to a calendar. For example, '1 day' always means 86,400,000 milliseconds, not a calendar day. |
slideDuration |
a string specifying the sliding interval of the window.
Same format as |
startTime |
the offset with respect to 1970-01-01 00:00:00 UTC with
which to start window intervals. For example, in order to
have hourly tumbling windows that start 15 minutes past the
hour, e.g. 12:15-13:15, 13:15-14:15... provide
|
dayofmonth
: Extracts the day of the month as an integer from a
given date/timestamp/string.
dayofweek
: Extracts the day of the week as an integer from a
given date/timestamp/string.
dayofyear
: Extracts the day of the year as an integer from a
given date/timestamp/string.
last_day
: Given a date column, returns the last day of the month which the
given date belongs to. For example, input "2015-07-27" returns "2015-07-31" since
July 31 is the last day of the month in July 2015.
to_date
: Converts the column into a DateType. You may optionally
specify a format according to the
http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html.
If the string cannot be parsed according to the specified format
(or default), the value of the column will be null.
By default, it follows casting rules to a DateType if the format is omitted
(equivalent to cast(df$x, "date")
).
to_timestamp
: Converts the column into a TimestampType. You may
optionally specify a format according to the rules in:
http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html.
If the string cannot be parsed according to the specified format (or default),
the value of the column will be null.
By default, it follows casting rules to a TimestampType if the format is omitted
(equivalent to cast(df$x, "timestamp")
).
weekofyear
: Extracts the week number as an integer from a given
date/timestamp/string.
from_unixtime
: Converts the number of seconds from unix epoch
(1970-01-01 00:00:00 UTC) to a string representing the timestamp of that
moment in the current system time zone in the JVM in the given format.
See
Customizing Formats for available options.
window
: Bucketizes rows into one or more time windows given a
timestamp specifying column. Window starts are inclusive but the window ends
are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in
[12:00,12:05). Windows can support microsecond precision. Windows in the
order of months are not supported. It returns an output column of struct
called 'window' by default with the nested columns 'start' and 'end'
unix_timestamp
: Gets current Unix timestamp in seconds.
trunc
: Returns date truncated to the unit specified by the format.
date_trunc
: Returns timestamp truncated to the unit specified by the format.
current_date
: Returns the current date as a date column.
current_timestamp
: Returns the current timestamp as a timestamp column.
dayofmonth since 1.5.0
dayofweek since 2.3.0
dayofyear since 1.5.0
last_day since 1.5.0
to_date(Column) since 1.5.0
to_date(Column, character) since 2.2.0
to_timestamp(Column) since 2.2.0
to_timestamp(Column, character) since 2.2.0
weekofyear since 1.5.0
from_unixtime since 1.5.0
window since 2.0.0
unix_timestamp since 1.5.0
unix_timestamp(Column) since 1.5.0
unix_timestamp(Column, character) since 1.5.0
trunc since 2.3.0
date_trunc since 2.3.0
current_date since 2.3.0
current_timestamp since 2.3.0
Other data time functions:
add_months()
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 | ## Not run:
dts <- c("2005-01-02 18:47:22",
"2005-12-24 16:30:58",
"2005-10-28 07:30:05",
"2005-12-28 07:01:05",
"2006-01-24 00:01:10")
y <- c(2.0, 2.2, 3.4, 2.5, 1.8)
df <- createDataFrame(data.frame(time = as.POSIXct(dts), y = y))
## End(Not run)
## Not run:
head(select(df, df$time, year(df$time), quarter(df$time), month(df$time),
dayofmonth(df$time), dayofweek(df$time), dayofyear(df$time),
weekofyear(df$time)))
head(agg(groupBy(df, year(df$time)), count(df$y), avg(df$y)))
head(agg(groupBy(df, month(df$time)), avg(df$y)))
## End(Not run)
## Not run:
head(select(df, df$time, last_day(df$time), month(df$time)))
## End(Not run)
## Not run:
tmp <- createDataFrame(data.frame(time_string = dts))
tmp2 <- mutate(tmp, date1 = to_date(tmp$time_string),
date2 = to_date(tmp$time_string, "yyyy-MM-dd"),
date3 = date_format(tmp$time_string, "MM/dd/yyy"),
time1 = to_timestamp(tmp$time_string),
time2 = to_timestamp(tmp$time_string, "yyyy-MM-dd"))
head(tmp2)
## End(Not run)
## Not run:
tmp <- mutate(df, to_unix = unix_timestamp(df$time),
to_unix2 = unix_timestamp(df$time, 'yyyy-MM-dd HH'),
from_unix = from_unixtime(unix_timestamp(df$time)),
from_unix2 = from_unixtime(unix_timestamp(df$time),
'yyyy-MM-dd HH:mm'))
head(tmp)
## End(Not run)
## Not run:
# One minute windows every 15 seconds 10 seconds after the minute, e.g.
# 09:00:10-09:01:10, 09:00:25-09:01:25, 09:00:40-09:01:40, ...
window(df$time, "1 minute", "15 seconds", "10 seconds")
# One minute tumbling windows 15 seconds after the minute, e.g.
# 09:00:15-09:01:15, 09:01:15-09:02:15...
window(df$time, "1 minute", startTime = "15 seconds")
# Thirty-second windows every 10 seconds, e.g. 09:00:00-09:00:30,
# 09:00:10-09:00:40, ...
window(df$time, "30 seconds", "10 seconds")
## End(Not run)
## Not run:
head(select(df, df$time, trunc(df$time, "year"), trunc(df$time, "yy"),
trunc(df$time, "month"), trunc(df$time, "mon")))
## End(Not run)
## Not run:
head(select(df, df$time,
date_trunc("hour", df$time),
date_trunc("minute", df$time),
date_trunc("week", df$time),
date_trunc("quarter", df$time)))
## End(Not run)
## Not run:
head(select(df, current_date(), current_timestamp()))
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.