View source: R/usgs_time_slice.R
Make timeslices from USGS discharge data files gathered using dataRetrieval.
Usage
MkUsgsTimeSlice(realTimeFiles, outPath, nearestMin = 5,
  oldestTime = NULL, qcFunction, varianceFunction)
Arguments
realTimeFiles
Character vector of active RData format files to be processed.
outPath
Character, the directory path where NetCDF files are to be written.
nearestMin
Numeric, the time resolution to which the observation times are rounded and the NetCDF timeslice files are to be written. Must evenly divide 60.
oldestTime
POSIXct, the date BEFORE which data will be ignored.
qcFunction
Function, used to apply quality control procedures to the timeslice in metric units. Note that this QC can only use information from the current time. QC procedures involving the temporal domain will be applied elsewhere.
varianceFunction
Function, used to derive the observation variance from a dataframe with the following columns:
Value
A dataframe with two columns, POSIXct and filename, which give the time of each timeslice and the corresponding file name with full path.
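To illustrate the nearestMin argument, the rounding of observation times to a fixed minute resolution can be sketched in base R. This is only an illustration under stated assumptions: RoundToNearestMin is a hypothetical helper written for this sketch, not a function of the package, and the package's actual rounding may differ in detail.

```r
# Hypothetical helper (not part of the package): round POSIXct times
# to the nearest multiple of nearestMin minutes.
RoundToNearestMin <- function(times, nearestMin = 5) {
  # Mirror the documented constraint that nearestMin must evenly divide 60.
  stopifnot(60 %% nearestMin == 0)
  secs <- nearestMin * 60
  as.POSIXct(round(as.numeric(times) / secs) * secs,
             origin = '1970-01-01', tz = 'UTC')
}

obs <- as.POSIXct(c('2015-04-15 00:02:29',
                    '2015-04-15 00:02:31',
                    '2015-04-15 00:07:45'), tz = 'UTC')
rounded <- RoundToNearestMin(obs, nearestMin = 5)
format(rounded, '%H:%M:%S')
```

Observations that round to the same time would land in the same timeslice under this scheme.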
Examples
## Not run:
realTimeFiles <- list.files(pattern='huc.*.RData',
path='~/usgsStreamData/realTimeData',
full.names=TRUE)
outPath = '~/usgsStreamData/timeSliceData/'
library(doMC)
registerDoMC(4)
## A first test
ret1 <- MkUsgsTimeSlice( realTimeFiles[1:21], outPath=outPath,
                         oldestTime=as.POSIXct('2015-04-15 00:00:00', tz='UTC') )
nrow(ret1)
## delete the files and see how many more are created without the oldestTime set
unlink(ret1$file)
ret1 <- MkUsgsTimeSlice( realTimeFiles[1:21], outPath=outPath )
nrow(ret1) ## quite a few more files.
ncdump(ret1$file[230]) ## 27 stations
ret2 <- MkUsgsTimeSlice( realTimeFiles[22:42], outPath=outPath )
ncdump(ret1$file[230]) ## 58 stations
## new experiment
unlink(unique(c(ret1$file, ret2$file)))
ret1 <- MkUsgsTimeSlice( realTimeFiles, outPath=outPath, nearestMin=60,
                         oldestTime=as.POSIXct('2015-04-15 00:00:00', tz='UTC'))
nStn <-
  plyr::ldply(NamedList(ret1$file),
              function(ff) {
                nc <- ncdump(ff, quiet=TRUE)
                data.frame(nStn=nc$dim$stationId$len,
                           time=as.POSIXct('1970-01-01 00:00:00',tz='UTC') +
                             nc$dim$time$vals,
                           nUniqueStn=length(unique(nc$dim$stationId$vals)))
              },
              .parallel=TRUE)
library(ggplot2)
ggplot(nStn, aes(x=time,y=nStn)) + geom_point(color='red')
###############################
## process on hydro-c1
realTimeFiles <- list.files(pattern='huc.*.RData',
path='~/usgsStreamData/realTimeData',
full.names=TRUE)
outPath = '~/usgsStreamData/timeSliceData/'
library(doMC)
registerDoMC(12)
## I'm worried about using too much memory, when I run this on all
## previously collected data, so break up the problem
chunkSize <- 1000
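## (n - 1) %/% chunkSize avoids an empty trailing chunk when the number
## of files is an exact multiple of chunkSize
chunkDf <- data.frame( ind = 0:((length(realTimeFiles) - 1) %/% chunkSize) )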
chunkDf <- within(chunkDf, { start = (ind)*chunkSize+1
end = pmin( (ind+1)*chunkSize, length(realTimeFiles)) } )
for (ii in 1:nrow(chunkDf) ) {
  ret1 <- MkUsgsTimeSlice( realTimeFiles[chunkDf$start[ii]:chunkDf$end[ii]],
                           outPath=outPath, nearestMin=60,
                           oldestTime=as.POSIXct('2015-04-15 00:00:00', tz='UTC')
                         )
}
## end dontrun
## End(Not run)
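The chunking in the last example can also be expressed with split(), which avoids computing start and end indices by hand. The sketch below uses a stand-in vector of made-up file names in place of the real list.files() result; the call to MkUsgsTimeSlice is left as a comment because it needs the package and real data.

```r
# Stand-in for the real file vector (hypothetical file names).
realTimeFiles <- sprintf('huc%04d.RData', 1:2000)
chunkSize <- 1000

# Assign each file to a chunk: files 1..1000 -> chunk 1, 1001..2000 -> chunk 2, ...
chunkId <- ceiling(seq_along(realTimeFiles) / chunkSize)
fileChunks <- split(realTimeFiles, chunkId)

length(fileChunks)   # number of chunks
lengths(fileChunks)  # number of files in each chunk

# Each chunk would then be processed in turn, for example:
# for (chunk in fileChunks) {
#   ret <- MkUsgsTimeSlice(chunk, outPath=outPath, nearestMin=60,
#                          oldestTime=as.POSIXct('2015-04-15 00:00:00', tz='UTC'))
# }
```

Because every file receives exactly one chunk id, no chunk is empty and no file is processed twice, including when the number of files is an exact multiple of chunkSize.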