The input raw text data file is downloaded from NCDC and is also available in the drsstl package under ./inst/extdata. It is read in, divided by month, and the by-month division is saved on HDFS.
readIn(input, output, info, cluster_control = mapreduce.control(),
  model_control = spacetime.control(), cshift = 1)
input
The path of the input file on HDFS. It should be a raw text file.
output
The path of the output file on HDFS. The output is divided by time.
info
The RData file on HDFS that contains all station metadata. Make sure to first copy station_info.RData, which is also available in the drsstl package, to HDFS using rhput.
cluster_control
All parameters that are needed for the MapReduce job.
model_control
Should be a list object generated from
cshift
The number of columns to be shifted when reading the raw text file.
Xiaosu Tong
## Not run:
# Copy the station metadata to HDFS first
rhput("./station_info.RData", "/tmp/station_info.RData")
# HDFS paths of the raw text input and the by-month output
FileInput <- "/tmp/tmax.txt"
FileOutput <- "/tmp/bymth"
# Parameters for the MapReduce job
ccontrol <- mapreduce.control(
  libLoc = NULL, reduceTask = 5, io_sort = 100, slow_starts = 0.5,
  reduce_input_buffer_percent = 0.9, reduce_parallelcopies = 5,
  spill_percent = 0.9, reduce_shuffle_input_buffer_percent = 0.9,
  reduce_shuffle_merge_percent = 0.5
)
readIn(
  FileInput, FileOutput, info = "/tmp/station_info.RData",
  cluster_control = ccontrol
)
## End(Not run)
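For intuition about what a by-month division looks like, here is a minimal base R sketch on a toy data frame. It is not how readIn is implemented (readIn runs as a MapReduce job on HDFS). The column names, the values, and the choice of one subset per year and month are all assumptions made for illustration only.

```r
# Toy stand-in for parsed records; the columns and values are hypothetical
# and do not reflect the actual layout of the NCDC raw text file.
records <- data.frame(
  station = c("A", "A", "B", "B", "A"),
  year    = c(1950L, 1950L, 1950L, 1950L, 1951L),
  month   = c(1L, 2L, 1L, 2L, 1L),
  tmax    = c(30.1, 35.4, 28.7, 33.0, 31.2)
)

# Assumption: one subset per (year, month) pair, holding all stations.
key <- sprintf("%04d-%02d", records$year, records$month)
by_month <- split(records, key)

# Each list element holds every station's record for that month.
names(by_month)
```

Each element of by_month plays the role of one key-value pair in the output: the key identifies the month, and the value holds the observations of all stations for that month.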