View source: R/hsCmdLineArgs.R
Offers several command line arguments useful for Hadoop streaming. Allows specifying input and output files, column separators, and much more. Optionally opens the I/O connections.
hsCmdLineArgs(spec=c(), openConnections=TRUE, args=commandArgs(TRUE))
spec
A vector specifying the command line args to support.
openConnections
A boolean specifying whether to open the I/O connections.
args
Character vector of arguments. Defaults to command line args.
The spec vector has length 6*n, where n is the number of command line arguments specified. The spec has the same format as the spec parameter in the getopt function of the getopt package, though we have one additional entry specifying a default value. The six entries per argument are the following:
long flag name (a multi-character string)
short flag name (a single character)
Argument specification: 0=no arg, 1=required arg, 2=optional arg
Data type ('logical', 'integer', 'double', 'complex', or 'character')
A string describing the option
The default value to be assigned to this parameter
See getopt in the getopt package for details.
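To make the six-entries-per-argument layout concrete, here is a minimal sketch in base R that reshapes a spec vector into one row per argument. The column names, the `spec_table` variable, and the second argument (`verbose`) are illustrative assumptions and not part of the package.

```r
# A spec describing two arguments; each argument contributes six entries.
# The 'verbose' argument is hypothetical and only used for illustration.
spec <- c(
  "myChunkSize", "C", 1, "numeric", "Number of lines to read at once.", -1,
  "verbose",     "v", 0, "logical", "Print progress messages.",         FALSE
)

# c() coerces mixed entries to character, so every field is stored as a string.
stopifnot(length(spec) %% 6 == 0)

# One row per argument, one column per spec entry (column names are assumed).
spec_table <- matrix(
  spec, ncol = 6, byrow = TRUE,
  dimnames = list(NULL, c("long", "short", "argspec", "type", "description", "default"))
)

n <- nrow(spec_table)  # number of arguments described by the spec
print(spec_table[, c("long", "short", "argspec", "type", "default")])
```

Viewing the spec this way can make it easier to spot a missing entry, since a vector whose length is not a multiple of six fails the check before it is reshaped.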
The following vector defines the default command line args. The vector is appended to the user-supplied spec vector in the call to getopt.
basespec = c(
  'mapper',     'm', 0, "logical",   "Runs the mapper.", FALSE,
  'reducer',    'r', 0, "logical",   "Runs the reducer, unless already running mapper.", FALSE,
  'mapcols',    'a', 0, "logical",   "Prints column headers for mapper output.", FALSE,
  'reducecols', 'b', 0, "logical",   "Prints column headers for reducer output.", FALSE,
  'infile',     'i', 1, "character", "Specifies an input file, otherwise use stdin.", NA,
  'outfile',    'o', 1, "character", "Specifies an output file, otherwise use stdout.", NA,
  'skip',       's', 1, "numeric",   "Number of lines of input to skip at the beginning.", 0,
  'chunksize',  'C', 1, "numeric",   "Number of lines to read at once, a la scan.", -1,
  'numlines',   'n', 1, "numeric",   "Max num lines to read per mapper or reducer job.", 0,
  'sepr',       'e', 1, "character", "Separator character, as used by scan.", '\t',
  'insep',      'f', 1, "character", "Separator character for input, defaults to sepr.", NA,
  'outsep',     'g', 1, "character", "Separator character output, defaults to sepr.", NA,
  'help',       'h', 0, "logical",   "Get a help message.", FALSE
)
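As a rough illustration of how the defaults are appended to a user-supplied spec, the sketch below combines a hypothetical user argument (`topk`) with an abbreviated copy of two default entries, then checks that no flag names collide. The variable names are assumptions, and the real function may perform the combination differently.

```r
# User-supplied spec: one hypothetical argument.
user_spec <- c("topk", "k", 1, "integer", "Number of top keys to report.", 10)

# Abbreviated copy of two of the default entries documented above.
base_subset <- c(
  "mapper", "m", 0, "logical", "Runs the mapper.",    FALSE,
  "help",   "h", 0, "logical", "Get a help message.", FALSE
)

# The defaults are appended after the user-supplied entries.
full_spec <- c(user_spec, base_subset)

# Long names sit at positions 1, 7, 13, ...; short names at 2, 8, 14, ...
long_names  <- full_spec[seq(1, length(full_spec), by = 6)]
short_names <- full_spec[seq(2, length(full_spec), by = 6)]

# A user flag that reuses a default long or short name would be ambiguous.
stopifnot(!anyDuplicated(long_names), !anyDuplicated(short_names))
```

Choosing short flags that do not already appear in the default list (m, r, a, b, i, o, s, C, n, e, f, g, h) avoids this kind of collision.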
Returns a list. The names of the entries in the list are the long flag names. Their values are either those specified on the command line, or the default values.
If openConnections=TRUE, then the returned list has two additional entries: incon and outcon. incon is a readable connection to the input source specified, and outcon is a writable connection to the appropriate output destination.
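The following sketch shows one way a script might consume a readable connection and write to a writable one in chunks. It uses base R `textConnection` objects as stand-ins for `incon` and `outcon`, so it runs without Hadoop or the package; the input data and the chunk size of 2 are assumptions made for illustration.

```r
# Stand-ins for the connections that would normally come from the returned list.
incon  <- textConnection(c("a\t1", "b\t2", "a\t3"))    # readable
outcon <- textConnection("captured_output", open = "w") # writable

chunk_size <- 2  # assumed chunk size for this sketch

repeat {
  # Read up to chunk_size lines; an empty result means the input is exhausted.
  lines <- readLines(incon, n = chunk_size)
  if (length(lines) == 0) break

  # Split each line on the tab separator and keep the first field as the key.
  fields <- strsplit(lines, "\t", fixed = TRUE)
  keys   <- vapply(fields, function(f) f[1], character(1))

  # Emit one "key<TAB>1" record per input line, mapper style.
  writeLines(paste(keys, 1, sep = "\t"), outcon)
}

# Collect what was written before closing the connections.
result <- textConnectionValue(outcon)
close(incon)
close(outcon)
print(result)
```

Reading in chunks rather than all at once keeps memory use bounded when the input is large.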
An additional entry in the returned list is named 'set'. When this list entry is FALSE, none of the options were set (generally because -h or --help was requested). The calling procedure should probably stop execution when 'set' is returned as FALSE.
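A calling script might guard on the 'set' entry along these lines. This is only a sketch: the `opts` list is a hand-built stand-in for the returned list, and `should_run` and `action` are hypothetical names introduced here.

```r
# Hand-built stand-in for the returned list after -h was requested.
opts <- list(mapper = FALSE, reducer = FALSE, set = FALSE)

# TRUE only when the 'set' entry is present and TRUE.
should_run <- function(opts) isTRUE(opts$set)

# Decide what the script should do next.
action <- if (!should_run(opts)) {
  "stop"      # no options were set, e.g. help was requested
} else if (opts$mapper) {
  "map"
} else if (opts$reducer) {
  "reduce"
} else {
  "none"
}

if (action == "stop") {
  message("No options were set; skipping the job.")
  # In a standalone Rscript, quit(status = 0) could be called here instead.
}
```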
David S. Rosenberg drosen@sensenetworks.com
This package relies heavily on package getopt
spec = c('myChunkSize','C',1,"numeric","Number of lines to read at once, a la scan.",-1)
## Displays the help string
hsCmdLineArgs(spec, args=c('-h'))
## Call with the mapper flag, and request that connections be opened
opts = hsCmdLineArgs(spec, openConnections=TRUE, args=c('-m'))
opts        # a list of argument values
opts$incon  # an input connection
opts$outcon # an output connection