dahstat: Statistical summaries of the homogenized data


View source: R/depurdat.R

Description

Lists means, standard deviations, quantiles or trends, for a specified period, from series homogenized by homogen.

Usage

dahstat(varcli, anyi, anyf, anyip=anyi, anyfp=anyf, stat="me", ndc=NA, vala=2,
cod=NULL, mnpd=0, mxsh=0, prob=.5, last=FALSE, long=FALSE, lsnh=FALSE,
lerr=FALSE, relref=FALSE, mh=FALSE, pernys=100, estcol=c(1,2,4), sep=',',
dec='.', eol='\n', nei=NA, x=NA)

Arguments

varcli

Acronym of the name of the studied climatic variable, as in the data file name.

anyi

Initial year of the homogenized period.

anyf

Final year of the homogenized period.

anyip

First year of the period to analyze. (Defaults to anyi).

anyfp

Last year of the period to analyze. (Defaults to anyf).

stat

Statistical parameter to compute for the selected period:

"me"

Means (default),

"mdn"

Medians,

"max"

Maxima,

"min"

Minima,

"std"

Standard deviations,

"q"

Quantiles (see the prob parameter),

"tnd"

Trends,

"series"

Do not compute any statistics; only output all homogenized series in individual *.csv files.

ndc

Number of decimal places to be saved in the output file (1 by default).

vala

Annual values to compute from the sub-annual data:

0:

None,

1:

Sums,

2:

Means (default),

3:

Maxima,

4:

Minima.

cod

Optional vector of codes of the stations to be processed.

mnpd

Minimum percentage of original data. (0 = no limit).

mxsh

Maximum SNHT. (0 = no limit).

prob

Probability for the computation of quantiles (0.5 by default, i.e., medians). You can set probabilities with more than 2 decimals, but the name of the output file will be identified with the rounded percentile.

last

Logical value to compute statistics only for stations working at the end of the period of study. (FALSE by default).

long

Logical value to compute statistics only for series built from the longest homogeneous sub-period. (FALSE by default).

lsnh

Logical value to compute statistics only for series built from the homogeneous sub-period with lowest SNHT. (FALSE by default).

lerr

Logical value to compute statistics only for series built from the homogeneous sub-period with lowest RMSE. (FALSE by default).

relref

If TRUE, statistics from reliable reference series will also be listed. (FALSE by default).

mh

If TRUE, read monthly data computed from daily adjusted series. (FALSE by default).

pernys

Number of years on which to compute trends. (Defaults to 100).

estcol

Columns of the homogenized stations file to be included in the output file. (Defaults to c(1,2,4), the columns of station coordinates and codes).

sep

String to use for separating the output data. (Defaults to ',').

dec

Character to use as decimal point in the output data. (Defaults to '.').

eol

Line termination style. (Defaults to '\n').

nei

Number of stations in the input files. (To be read from the *.rda file.)

x

Vector of dates. (To be read from the *.rda file.)
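As an illustration of the vala argument, the sketch below aggregates sub-annual values into annual ones in base R. It is not the code used by dahstat: the matrix monthly, the helper annual_value and the mapping of codes to functions are assumptions that simply follow the argument description above.

```r
# Hypothetical monthly series: 12 months x 2 years (made-up values)
monthly <- matrix(c(1:12, 13:24), nrow = 12, ncol = 2,
                  dimnames = list(month.abb, c("1991", "1992")))

# Hypothetical helper: map each vala code to an aggregating function
# (assumed correspondence: 1 sums, 2 means, 3 maxima, 4 minima)
annual_value <- function(m, vala = 2) {
  if (vala == 0) return(NULL)            # 0: no annual values
  f <- switch(as.character(vala),
              "1" = sum, "2" = mean, "3" = max, "4" = min,
              stop("vala must be an integer from 0 to 4"))
  apply(m, 2, f)                         # one value per year (column)
}

annual_sum  <- annual_value(monthly, vala = 1)   # suits e.g. precipitation
annual_mean <- annual_value(monthly, vala = 2)   # suits e.g. temperature
```

Sums are the natural choice for accumulated variables such as precipitation, while means suit variables such as temperature.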

Details

Homogenized data are read from the file ‘VAR_ANYI-ANYF.rda’ saved by homogen, and the data computed for the specified period are saved in ‘VAR_ANYIP-ANYFP.STAT’, where STAT is replaced by the statistic requested through stat. The exception is stat="q", for which the extension of the output file is qPP, where PP is the probability given in prob, expressed in percent. The output period ANYIP-ANYFP must lie within the period of the input data, ANYI-ANYF.

Parameters mnpd and mxsh act as filters, producing results only for series that meet the minimum percentage of original data and do not exceed the maximum SNHT value. Alternatively, long, last, lsnh and lerr select the series reconstructed from the preferred homogeneous sub-period, depending on which parameter is set to TRUE. Note, however, that shorter sub-periods often have lower SNHT and RMSE values, so lsnh and lerr should be used with caution. The most advisable parameters for selecting the most suitable reconstructions are long for computing normal values and last for climate monitoring of new incoming data.

No selection is performed by default, so the desired statistic is listed for all the reconstructed series (from every homogeneous sub-period).
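The effect of the mnpd and mxsh filters can be pictured as a row selection on a per-series summary table. The following is only a sketch under stated assumptions: the data frame series, its column names (code, pod, snht) and the helper filter_series are hypothetical and are not taken from the climatol source.

```r
# Hypothetical summary table, one row per reconstructed series
series <- data.frame(code = c("A", "A-2", "B", "C"),
                     pod  = c(80, 20, 95, 40),       # % of original data
                     snht = c(5.1, 3.0, 12.4, 30.2), # SNHT of the series
                     stringsAsFactors = FALSE)

# Hypothetical helper mimicking the documented filter semantics
filter_series <- function(s, mnpd = 0, mxsh = 0) {
  keep <- rep(TRUE, nrow(s))
  if (mnpd > 0) keep <- keep & s$pod >= mnpd    # 0 means no limit
  if (mxsh > 0) keep <- keep & s$snht <= mxsh   # 0 means no limit
  s[keep, , drop = FALSE]
}

kept <- filter_series(series, mnpd = 50, mxsh = 10)
```

With both arguments left at 0 the table is returned unchanged, which mirrors the default behaviour of listing every reconstructed series.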

stat='tnd' computes trends by OLS linear regression on time, listing them in a CSV file ‘*_tnd.csv’ and their p-values in ‘*_pval.csv’.
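A minimal sketch of an OLS trend on time for a single series, using base R's lm. The data are made up, and scaling the yearly slope by pernys is an assumption drawn from the description of that argument, not a copy of the dahstat code.

```r
# Hypothetical annual series: linear increase plus a small alternating term
years  <- 1991:2010
values <- 10 + 0.02 * (years - 1991) + rep(c(0.1, -0.1), 10)

fit    <- lm(values ~ years)                  # OLS regression on time
slope  <- unname(coef(fit)["years"])          # change per year
pval   <- summary(fit)$coefficients["years", "Pr(>|t|)"]

pernys <- 100                                 # assumed scaling period
trend  <- slope * pernys                      # change per pernys years
```

Expressing the slope per pernys years (per century by default) keeps small yearly changes readable in the output table.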

If stat='series' is chosen, two text files in CSV format will be produced for every station, one with the data and another with their flags: 0 for original, 1 for infilled and 2 for corrected data. (Not useful for daily series.)
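Once a flags file has been read into R, the three flag codes can be summarized as in the sketch below. The flags vector is made up for illustration; reading the actual CSV file is left out because its exact layout is not described here.

```r
# Hypothetical flag values for one series
# (0 = original, 1 = infilled, 2 = corrected)
flags  <- c(0, 0, 1, 2, 2, 0, 1, 0)
labels <- c("original", "infilled", "corrected")

# Count each kind of datum, keeping all three levels even if one is absent
counts <- table(factor(flags, levels = 0:2, labels = labels))

# Percentage of original data in this series
pct_original <- 100 * counts[["original"]] / length(flags)
```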

Value

This function does not return any value, since outputs are saved to files.

See Also

homogen, dahgrid.

Examples

#Set a temporary working directory and write input files:
wd <- tempdir()
wd0 <- setwd(wd)
data(Ptest)
dim(dat) <- c(720,20)
dat[601:720,5] <- dat[601:720,5]*1.8
write(dat[481:720,1:5],'pcp_1991-2010.dat')
write.table(est.c[1:5,1:5],'pcp_1991-2010.est',row.names=FALSE,col.names=FALSE)
homogen('pcp',1991,2010,std=2)
#Now run the examples:
dahstat('pcp',1991,2010)
dahstat('pcp',1991,2010,stat='tnd')
#Return to user's working directory:
setwd(wd0)
#Input and output files can be found in directory:
print(wd)

climatol documentation built on May 4, 2018, 9:04 a.m.
