hdfs_filetransfer: Transfer files and directories to and from HDFS

Description

Transfer files and directories to and from HDFS

Usage

hdfs_upload(src, dest, overwrite = FALSE, nativeTarget = "/tmp",
  host = hdfs_host(), ...)

hdfs_download(src, dest, overwrite = FALSE, nativeTarget = "/tmp",
  host = hdfs_host(), ...)

Arguments

src, dest

Character strings giving the source and destination paths.

overwrite

Whether to overwrite existing files at the destination.

nativeTarget

Used only when transferring to or from a remote client: the directory on the edge node in which to stage files.

host

The HDFS host to transfer to or from. Defaults to the value returned by hdfs_host().

...

Other arguments to the Hadoop copyFromLocal/copyToLocal command.

Details

These functions transfer files and directories between the native filesystem and HDFS: hdfs_upload copies from the native filesystem into HDFS, and hdfs_download does the reverse. They can be used from the edge node of a Hadoop/Spark cluster or from a remote client.

From a remote client, the transfer is a two-stage process. When downloading, the files are first copied from HDFS to the directory on the edge node given by nativeTarget, and then copied from there to the client. When uploading, the same two steps run in the opposite direction.

Note that renaming directories as part of the transfer is supported for downloading from HDFS, but not for uploading.
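
The staged transfer described above can be pictured with a small local simulation. This is only a sketch in base R and not how dplyrXdf implements the transfer: three temporary directories stand in for HDFS, the edge-node staging directory (the role played by nativeTarget) and the client, and staged_copy is a hypothetical helper introduced purely for illustration. Removing the staged copy afterwards is a choice made in this sketch, not documented behaviour of the package.

```r
# Local stand-ins (assumptions for illustration only): three temporary
# directories play the roles of HDFS, the edge-node staging area and the client.
root       <- tempfile("transfer_demo")
hdfs_dir   <- file.path(root, "hdfs")
stage_dir  <- file.path(root, "edge_tmp")   # plays the role of nativeTarget
client_dir <- file.path(root, "client")
for (d in c(hdfs_dir, stage_dir, client_dir)) dir.create(d, recursive = TRUE)

# A file that "lives in HDFS" in this simulation
write.csv(mtcars, file.path(hdfs_dir, "mtcars.csv"), row.names = FALSE)

# Hypothetical helper: copy src into stage_dir, then from there to dest.
# Returns TRUE only if both stages succeed.
staged_copy <- function(src, stage_dir, dest, overwrite = FALSE) {
  staged <- file.path(stage_dir, basename(src))
  ok1 <- file.copy(src, staged, overwrite = TRUE)                 # stage 1
  ok2 <- ok1 && file.copy(staged, dest, overwrite = overwrite)    # stage 2
  unlink(staged)                                                  # tidy the staging area
  ok2
}

# "Download": simulated HDFS -> staging directory -> client
ok <- staged_copy(file.path(hdfs_dir, "mtcars.csv"), stage_dir,
                  file.path(client_dir, "mtcars.csv"))
```

In a real session from a remote client, the staging directory would instead be chosen through the nativeTarget argument of hdfs_upload or hdfs_download.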

Value

A logical value indicating whether the file transfer succeeded.
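
Because the result is a logical value rather than an error, a script may want to check it explicitly. The following is a minimal sketch, assuming failure is reported as a value other than TRUE; check_transfer is a hypothetical helper and not part of dplyrXdf. It takes the transfer function as an argument, so it can wrap either hdfs_upload or hdfs_download.

```r
# Hypothetical helper (not part of dplyrXdf): call a transfer function and
# stop with an error if its logical return value is not TRUE.
check_transfer <- function(transfer, src, dest, ...) {
  ok <- transfer(src, dest, ...)
  if (!isTRUE(ok)) {
    stop("transfer of '", src, "' to '", dest, "' did not succeed")
  }
  invisible(TRUE)
}

# Intended use on a cluster or remote client (left commented out here,
# since it needs a working HDFS connection):
# check_transfer(hdfs_upload, "mtcars.csv", "mtcars_uploaded.csv", overwrite = TRUE)
```

Passing the transfer function in, rather than hard-coding it, also makes the helper easy to exercise with a stand-in function when no cluster is available.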

See Also

download.file, rxHadoopCopyFromLocal, rxHadoopCopyFromClient

Examples

## Not run: 
as_xdf(mtcars, "mtcars.xdf", overwrite=TRUE)
hdfs_upload("mtcars.xdf", ".")

write.csv(mtcars, "mtcars.csv", row.names=FALSE)
hdfs_upload("mtcars.csv", "mtcars_uploaded.csv")

file.remove("mtcars.csv")
hdfs_download("mtcars_uploaded.csv", "mtcars.csv")
read.csv("mtcars.csv")

# hdfs_upload() and hdfs_download() can transfer any file, not just datasets
desc <- system.file("DESCRIPTION", package="dplyrXdf")
hdfs_upload(desc, "dplyrXdf_description")

# uploading to attached ADLS storage
hdfs_upload("mtcars.xdf", ".", host="adls.host.name")

## End(Not run)

RevolutionAnalytics/dplyrXdf documentation built on June 3, 2019, 9:08 p.m.