GetMultiNcdf: Get WRF Hydro output/restart (scalar) timeseries spread over...


View source: R/ncdf_get_multi.R

Description

GetMultiNcdf is designed to get *all* your output/restart data that are spread over multiple files. Three collated lists specify 1) file groups, 2) variables for each file group, and 3) indices or statistics for each variable in each file group. The names of the lists must match. See examples for details. While the routine can read and summarize raster data at each time via specified statistics, it only returns scalar timeseries. (It may be possible to extend it to return both scalar and raster data if there's demand.)
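
The collation of the three lists can be sketched with toy data. This is a minimal illustration in base R, not code from the package: the file names are made up and never opened, and GroupNamesMatch is a hypothetical helper that only checks that the outer (file group) names agree across the lists.

```r
# Toy stand-ins; the file names are hypothetical and are never opened here.
filesList    <- list(lsm   = c('lsm_day1.nc', 'lsm_day2.nc'),
                     hydro = c('hydro_day1.nc', 'hydro_day2.nc'))
variableList <- list(lsm   = list(SWE = 'SNEQV'),
                     hydro = list(streamflow = 'qlink1'))
indexList    <- list(lsm   = list(SNEQV = list(start = c(1, 1, 1),
                                               end   = c(21, 7, 1),
                                               stat  = 'max')),
                     hydro = list(qlink1 = 1))

# Hypothetical helper (not part of rwrfhydro): TRUE when the three lists
# carry the same set of file-group names.
GroupNamesMatch <- function(files, vars, inds) {
  setequal(names(files), names(vars)) && setequal(names(files), names(inds))
}

namesOk <- GroupNamesMatch(filesList, variableList, indexList)
```

Running a check like this before calling GetMultiNcdf may catch a misspelled group name early. Whether the function itself also requires the groups to appear in the same order is not stated here, so keeping the order identical is the safer choice.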

Usage

GetMultiNcdf(filesList, variableList, indexList, env = parent.frame(),
  parallel = FALSE)

Arguments

filesList

The list of file groups. Names must match those in the other lists.

variableList

The list of variables for each file group. Names must match filesList.

indexList

The list of indices or statistics to be applied to each variable.

env

The environment where the stat function lives

parallel

Logical; this is the .parallel argument of plyr functions. Parallelization is at the file level (not the file group level). Typically we achieve parallelization using the doMC package. See examples.
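
The env argument matters because statistics are given as character strings (for example stat='basAvg') and have to be resolved to functions somewhere. The sketch below shows one way such a lookup can work in base R. It is an assumption about the general mechanism, not the package's actual implementation, and ApplyStat and rangeWidth are hypothetical names.

```r
# Hypothetical helper: look up a statistic by name in `env` and apply it.
ApplyStat <- function(x, stat, env = parent.frame()) {
  statFun <- get(stat, envir = env, mode = 'function')
  statFun(x)
}

# A statistic that lives in its own environment rather than the caller's.
statEnv <- new.env()
assign('rangeWidth', function(v) max(v) - min(v), envir = statEnv)

width <- ApplyStat(c(2, 5, 11), 'rangeWidth', env = statEnv)
```

With the default env = parent.frame(), a statistic defined in the calling scope is found without passing env explicitly, which matches how the example defines basAvg and basMax at the top level.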

Value

A data frame in long format. The plotting example uses its POSIXct, value, fileGroup, and variableGroup columns.
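
To make the long layout concrete, here is a toy data frame with the four columns that the plotting example relies on. Only the column names come from that example. The values are invented for illustration, the real output may carry additional columns, and whether variableGroup holds the user-chosen name or the name in the file is an assumption here.

```r
# Toy values in a long layout; the numbers are made up for illustration.
toyData <- data.frame(
  POSIXct       = as.POSIXct(c('2012-05-01', '2012-05-02',
                               '2012-05-01', '2012-05-02'), tz = 'UTC'),
  value         = c(0.8, 1.1, 270.2, 271.5),
  fileGroup     = c('hydro', 'hydro', 'lsm', 'lsm'),
  variableGroup = c('streamflow', 'streamflow', 'TRAD', 'TRAD'),
  stringsAsFactors = FALSE
)

# One scalar timeseries is a simple row subset.
streamflow <- subset(toyData, variableGroup == 'streamflow')
```

A layout like this can be passed straight to ggplot2 with color and facet aesthetics, without reshaping.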

Examples

# This example only shows data for 3 dates because of limitations of the package data.
# Find the package data directory on your machine
## Not run: 
tcPath <- '~/wrfHydroTestCases/'
fcPath <- paste0(tcPath,'Fourmile_Creek/')
dataPath <- paste0(fcPath,'RUN.RTTESTS/OUTPUT_CHRT_DAILY/')
# fileList - These are the groups of files.
lsmFiles <- list.files(path=dataPath, pattern='LDASOUT_DOMAIN', full.names=TRUE)
hydroFiles <- list.files(path=dataPath, pattern='HYDRO_RST', full.names=TRUE)
fileList <- list( lsm=lsmFiles, hydro=hydroFiles)

# varList - Define which variables are desired for each file group.
lsmVars   <- list(TRAD='TRAD', SWE='SNEQV')
## smc1-4 will correspond to the vertical layers.
hydroVars <- list(streamflow='qlink1', smc1='sh2ox', smc2='sh2ox', 
                  smc3='sh2ox', smc4='sh2ox')
# Note that the outer names collate with fileList.
variableList <- list(lsm=lsmVars, hydro=hydroVars)

# indexList - Define what indices/stats are desired for each variable.
# Note that only scalars can be returned for each entry. Spatial fields can 
# be summarized via statistics. 
# The following shows how to define your own statistics.
# For basin average and max we need the basin mask (this is a non-standard
# field in the fine grid file).
basinMask <- ncdump(paste0(fcPath,'DOMAIN/hydro_OrodellBasin_100m.nc'), 
                    'basn_msk_geogrid')
basAvg <- function(var) sum(basinMask*var)/sum(basinMask)
basMax <- function(var) max(ceiling(basinMask)*var)
basinKm2 <- sum(basinMask)  ## just asking about the total area of the basin.

# Note that the list names at this level collate with the variable names
# in variableList. You are responsible for entering the correct indices. Note
# that these are in reverse order from what is shown in "ncdump -h".
lsmInds   <- list(TRAD=list(start=c(1,1,1), end=c(21,7,1), stat='basAvg'),
                  SNEQV=list(start=c(1,1,1), end=c(21,7,1), stat='basMax'))
hydroInds <- list(qlink1=1,
                  smc1=list(start=c(1,1,1), end=c(21,7,1), stat='basAvg'),
                  smc2=list(start=c(1,1,2), end=c(21,7,2), stat='basAvg'),
                  smc3=list(start=c(1,1,3), end=c(21,7,3), stat='basAvg'),
                  smc4=list(start=c(1,1,4), end=c(21,7,4), stat='basAvg') )
indexList <- list( lsm=lsmInds, hydro=hydroInds)

library(doMC)   ## Showing parallelization, which is at the file level within
registerDoMC(3) ## each file group; pointless to exceed your timeseries length.
fileData <- GetMultiNcdf(filesList=fileList, variableList=variableList,
                         indexList=indexList, parallel=TRUE)

# plot
# the lsm and hydro output for this spinup were at different times... 
library(ggplot2)
library(scales)
ggplot( fileData, aes(x=POSIXct, y=value, color=fileGroup)) +
  geom_line() + geom_point() +
  facet_wrap(~variableGroup, scales='free_y', ncol=1) +
  scale_x_datetime(breaks = date_breaks("5 days")) + theme_bw()

## End(Not run)
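
The index and statistic entries in the example can be tried on a toy array without any model output. The sketch below assumes that start and end are inclusive corner indices of the block to read, as the example suggests. SliceStartEnd is a hypothetical helper, and the mask values are invented.

```r
# Toy 3 x 2 x 2 field standing in for a variable read from file.
field <- array(1:12, dim = c(3, 2, 2))

# Hypothetical helper: take the block between `start` and `end` (inclusive).
SliceStartEnd <- function(x, start, end) {
  idx <- Map(seq, start, end)
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

# Second layer only, reduced to a 3 x 2 matrix.
layer2 <- SliceStartEnd(field, start = c(1, 1, 2), end = c(3, 2, 2))[, , 1]

# Fractional mask: 1 = fully inside the basin, 0 = outside.
toyMask <- matrix(c(1, 0.5, 0, 1, 0, 0), nrow = 3)

# Same arithmetic as basAvg and basMax in the example above.
toyAvg <- sum(toyMask * layer2) / sum(toyMask)   # 21 / 2.5 = 8.4
toyMax <- max(ceiling(toyMask) * layer2)         # 10
```

The ceiling() call in the maximum treats partially covered cells as fully inside the basin, while the average weights each cell by its fractional coverage.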

mccreigh/rwrfhydro documentation built on Feb. 28, 2021, 1:53 p.m.