The input raw text data file is downloaded from NCDC and is also available in the drsstl package under ./inst/extdata. It is read in, divided by month, and saved on HDFS.
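To make the by-month division concrete, here is a minimal local sketch in base R. It does not use HDFS or any drsstl internals. The data frame `obs`, its column names, and its values are hypothetical stand-ins rather than the layout of the NCDC file, and keying on (year, month) pairs is only one possible reading of "by month".

```r
# Hypothetical toy observations; columns and values are illustrative only,
# not the layout of the NCDC raw text file.
obs <- data.frame(
  station = c("A", "A", "B", "B", "A", "B"),
  year    = c(1950, 1950, 1950, 1950, 1951, 1951),
  month   = c("Jan", "Feb", "Jan", "Feb", "Jan", "Jan"),
  tmax    = c(3.1, 5.4, -1.2, 0.8, 2.7, -0.5),
  stringsAsFactors = FALSE
)

# Build one key per (year, month) pair and split the rows on it.
key <- paste(obs$year, obs$month, sep = "_")
by_month <- split(obs, key)

# Each element holds the rows of all stations observed in that month.
names(by_month)
nrow(by_month[["1950_Jan"]])
```

Each element of `by_month` plays the role of one time division; on a cluster, each such subset would be stored as a separate key-value pair rather than as a list element.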
readIn(input, output, info, cluster_control = mapreduce.control(),
  model_control = spacetime.control(), cshift = 1)
input
 The path of the input file on HDFS. It should be a raw text file.
output
 The path of the output file on HDFS. The output is divided by time.
info
 The RData file on HDFS that contains all station metadata. Make sure to copy station_info.RData, which is also available in the drsstl package, to HDFS first using rhput.
cluster_control
 All parameters needed for the MapReduce job.
model_control
 Should be a list object. The default in the usage is spacetime.control().
cshift
 Number of columns to be shifted when reading the raw text file.
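Before uploading an RData file with rhput, it can help to confirm locally which objects it holds. The sketch below uses only base R. The object `station_meta`, its columns, and the file name `station_info_demo.RData` are hypothetical; the actual contents of station_info.RData are not described on this page.

```r
# Hypothetical stand-in for station metadata; the real object names and
# columns inside station_info.RData are assumptions, not documented facts.
station_meta <- data.frame(
  station = c("A", "B"),
  lon     = c(-86.9, -87.6),
  lat     = c(40.4, 41.9)
)

# Write the object to a temporary RData file.
rdata_path <- file.path(tempdir(), "station_info_demo.RData")
save(station_meta, file = rdata_path)

# Load into a fresh environment so nothing in the workspace is overwritten;
# load() returns the names of the objects it restored.
e <- new.env()
loaded <- load(rdata_path, envir = e)
print(loaded)
str(e$station_meta)
```

Loading into a separate environment keeps the check free of side effects, which is useful when the file's object names are not known in advance.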
Author(s): Xiaosu Tong
## Not run: 
    # Copy the station metadata from the local disk to HDFS first
    rhput("./station_info.RData", "/tmp/station_info.RData")
    # HDFS paths of the raw text input and the by-month output
    FileInput <- "/tmp/tmax.txt"
    FileOutput <- "/tmp/bymth"
    # Parameters for the MapReduce job
    ccontrol <- mapreduce.control(
      libLoc = NULL, reduceTask = 5, io_sort = 100, slow_starts = 0.5,
      reduce_input_buffer_percent = 0.9, reduce_parallelcopies = 5,
      spill_percent = 0.9, reduce_shuffle_input_buffer_percent = 0.9,
      reduce_shuffle_merge_percent = 0.5
    )
    readIn(
      FileInput, FileOutput, info = "/tmp/station_info.RData",
      cluster_control = ccontrol
    )
## End(Not run)