Select distinct/unique rows
.data | A tbl for an Xdf data source; or a raw Xdf data source.
... | Variables to use for determining uniqueness. If left blank, all variables in .data are used.
.keep_all | Whether to keep all the variables in the dataset, or only those used in determining uniqueness.
.outFile | Output format for the returned data. If not supplied, create an xdf tbl; if NULL, return a data frame; if a filename, save an Xdf file to that location.
.rxArgs | A list of RevoScaleR arguments. See …
This verb calls dplyr::distinct on each chunk in an Xdf file. The individual data frames are combined with rbind, and dplyr::distinct is called on the overall result. This may be slow if there are many chunks in the file, and the operation will be limited by memory if the number of distinct rows is large.
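The chunk-wise approach described above can be sketched with plain dplyr on an in-memory data frame. This is only an illustration, not the package's actual implementation: the helper name chunked_distinct and its chunk_size argument are hypothetical, and an ordinary data frame split into row blocks stands in for the chunks of an Xdf file.

```r
library(dplyr)

# Hypothetical helper (not part of dplyrXdf): emulate chunked processing
# by splitting an ordinary data frame into consecutive blocks of rows.
chunked_distinct <- function(df, ..., chunk_size = 3, .keep_all = FALSE) {
  n <- nrow(df)
  if (n == 0) {
    return(distinct(df, ..., .keep_all = .keep_all))
  }

  starts <- seq(1, n, by = chunk_size)
  per_chunk <- vector("list", length(starts))

  # Step 1: remove duplicates within each chunk
  for (k in seq_along(starts)) {
    rows <- starts[k]:min(starts[k] + chunk_size - 1, n)
    per_chunk[[k]] <- distinct(df[rows, , drop = FALSE], ..., .keep_all = .keep_all)
  }

  # Step 2: rbind the per-chunk results into one data frame
  combined <- do.call(rbind, per_chunk)

  # Step 3: remove duplicates that span chunk boundaries
  distinct(combined, ..., .keep_all = .keep_all)
}

df <- data.frame(
  g = c("a", "b", "a", "c", "b", "a", "c"),
  x = c(1, 2, 1, 3, 2, 9, 3),
  stringsAsFactors = FALSE
)

res <- chunked_distinct(df, g, x)
print(res)
```

The combined data frame in step 2 holds every row that is distinct within its chunk, which is why memory use grows with the number of distinct rows, and why many small chunks mean many separate calls.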
This verb can be used on HDFS data in the local compute context (on the edge node), but not in the Hadoop or Spark compute contexts.
An object representing the unique rows. This depends on the .outFile argument: if missing, it will be an xdf tbl object; if NULL, a data frame; and if a filename, an Xdf data source referencing a file saved to that location.
distinct in package dplyr