read.STdata: Imports a text file in R


read.STdata    R Documentation

Imports a text file in R

Description

A function for importing a text file containing spatio-temporal data. In particular, it (a) generates the spatial and temporal IDs; (b) converts the time series of each spatial point (which may lack values for some dates) into a regularly spaced object within the observed time period, filling the missing dates with ‘NA’; and (c) converts the data into an STFDF object, according to the standard of the spacetime package, or into a data frame.

Usage

read.STdata(
  file,
  header = FALSE,
  dec = ".",
  sep = "",
  iclx,
  icly,
  iclt,
  icldate = c(icl.date = 0, iclty = 0, icltm = 0, icltd = 0),
  icltime = c(icl.time = 0, icltH = 0, icltM = 0, icltS = 0),
  iclvr,
  iclsp = 0,
  missing.v = NA,
  save.as = "data.frame",
  date.format = c("code", format = NA),
  bytime = NA,
  tlag,
  time.zone = ""
)

Arguments

file

the name of the data file, including its extension. The file is searched for in the current working directory; otherwise, the absolute path must be included in the file name. Note that the data are given by row, with one row per spatial point and temporal point; each row of the file contains at least the x and y coordinates of a spatial point, the temporal code (or date) and the measurement of the variable of interest.
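
As an illustration of the row layout described above, the following sketch reads a small hypothetical file with base R's read.table. The column names, station labels and values are invented for the example and are not taken from the files shipped with the package.

```r
# Hypothetical file content: one row per (spatial point, time point).
# Columns (illustrative names): station ID, x, y, date, value.
txt <- "id x y date value
s1 10.5 45.1 01-01-2020 3.2
s1 10.5 45.1 01-03-2020 -99999
s2 11.0 44.8 01-01-2020 2.7
s2 11.0 44.8 01-02-2020 2.9"

# sep = "" splits the columns on white space, as in read.table
raw <- read.table(text = txt, header = TRUE, dec = ".", sep = "",
                  stringsAsFactors = FALSE)
str(raw)
```

For a file laid out this way, the column arguments would presumably be iclsp = 1, iclx = 2, icly = 3, icl.date = 4 and iclvr = 5, which matches the pattern used in example 1 of the Examples section.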

header

logical value indicating whether the first line of the file contains the names of the variables. If this argument is missing, header is set to FALSE (default choice)

dec

character, used to indicate decimal points

sep

field separator character. If sep = "" (default choice) columns of the file are separated by white space or tabs (see read.table for more details)

iclx

numeric, the column in which the x-coordinates of the spatial points are stored

icly

numeric, the column in which the y-coordinates of the spatial points are stored

iclt

numeric, the column in which numeric temporal codes are stored. This argument is used only if dates are not supplied through icldate: iclt and icldate are mutually exclusive. Set this argument to 0 if temporal codes are not available

icldate

numeric vector to set the columns in which the dates are stored. The user has to set icl.date if the date is stored in a single column; otherwise, the user has to specify the columns in which the years (iclty), the months (icltm) or the days (icltd) are stored separately. Each element is set equal to 0 (default choice) if not available

icltime

numeric vector to set the columns in which the time component (hour, minute, second) of a date (if available) is stored. The user has to set icl.time if the time is stored in a single column, otherwise the user has to specify the column in which the hours (icltH), the minutes (icltM) or the seconds (icltS) are stored separately. This argument is set equal to 0 (default choice) if not available

iclvr

numeric, the column in which the values of the variable are stored

iclsp

numeric, the column in which the identification codes (IDs) for the spatial locations are stored. This argument is set equal to 0 (default choice) if IDs for the spatial locations are not available

missing.v

code used to indicate the presence of missing values in the imported data. By default this argument is set equal to NA
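
A minimal sketch of what such a code means in practice, using base R only: every value equal to the code is treated as missing. The vector and the code -99999 are illustrative, and this is not necessarily how the function handles the recoding internally.

```r
# Hypothetical measurements, with -99999 used as the missing-value code
values <- c(3.2, -99999, 2.7, 2.9)
missing.v <- -99999

# Recode the sentinel to NA (nothing to do if the code is already NA)
if (!is.na(missing.v)) {
  values[values == missing.v] <- NA
}
values
```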

save.as

character, indicating the class of the object to be returned. Two options are available: "STFDF" or "data.frame" (default choice)

date.format

vector, whose first element date.format[1] denotes the class of the temporal component to be imported and whose second element date.format[2] gives the corresponding format. The supported classes of dates are "yearmon", "yearqtr", "Date" and "POSIX" (see the base, lubridate and zoo packages); the additional options "year" and "code" are also admissible and are used if the temporal coordinate is given by year or as a numerical code, respectively. By default, the argument date.format is set equal to c("code", format = NA). If the temporal component, provided for example as year and month, is given in separate columns in the text file, the required format in date.format[2] is of the type "%Y %m"; in general, the format requires white spaces between two consecutive time units
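
To illustrate the white-space convention for dates given in separate columns, the sketch below pastes hypothetical year, month, day and time columns together and parses them with base R. It only shows how such format strings behave; it is not the parsing code used by the function.

```r
# Hypothetical date and time parts, as if read from separate columns
year <- c(2020, 2020); month <- c(1, 2); day <- c(15, 16)
hour <- c(0, 3); minute <- c(0, 30); second <- c(0, 0)

# Units joined by single white spaces, matching a "%Y %m %d" format
date_str <- paste(year, month, day)
d <- as.Date(date_str, format = "%Y %m %d")

# Same idea with a time component (compare "%Y %m %d %H %M %S")
dt_str <- paste(year, month, day, hour, minute, second)
dt <- as.POSIXct(dt_str, format = "%Y %m %d %H %M %S", tz = "UTC")

d
dt
```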

bytime

character, which denotes the time disaggregation of interest. It is set to NA (default choice) for a numeric temporal code; otherwise it is "%Y" or "%y" if values are taken by year, "%m" by month, "%d" by day, "%q" by quarter, "%H" by hour, "%M" by minute and "%S" by second

tlag

numeric, time increment/lag between two temporal observations
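
As a rough illustration of how a time unit and an increment together define a regularly spaced sequence, the base R sketch below builds a monthly grid with increment one and an hourly grid with increment three (compare bytime = "%m", tlag = 1 and bytime = "%H", tlag = 3). The start and end dates are hypothetical, and the function may construct its time grid differently.

```r
# Monthly grid with increment 1 over a hypothetical observed period
monthly <- seq(as.Date("2020-01-01"), as.Date("2020-06-01"), by = "1 month")

# Hourly grid with increment 3, five time points from a hypothetical start
start <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
hourly3 <- seq(start, by = "3 hours", length.out = 5)

monthly
hourly3
```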

time.zone

character, time zone for dates with time component

Details

  • Incomplete time series, for each spatial point, are filled with NA

  • Some checks on the admissibility of the supported classes of dates are implemented

  • Time indexes for temporal points are coded for data.frame output by using consecutive numbers starting from 1 (column 'timeIndex')

  • The spatial points are coded by using the string 'id' and the consecutive numbers starting from 1 (column 'spatialIndex')
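
The points above can be sketched in base R as follows: incomplete series are padded with NA on a regular time grid, and spatial and temporal indexes are attached. The data are invented, the label pattern "id1", "id2", ... is an assumption based on the description, and this is a simplified stand-in rather than the function's actual implementation.

```r
# Hypothetical observations: two spatial points with gaps in a monthly series
obs <- data.frame(
  x = c(10.5, 10.5, 11.0, 11.0),
  y = c(45.1, 45.1, 44.8, 44.8),
  date = as.Date(c("2020-01-01", "2020-03-01", "2020-01-01", "2020-02-01")),
  value = c(3.2, 4.1, 2.7, 2.9)
)

# Regular monthly grid spanning the observed period
grid_dates <- seq(min(obs$date), max(obs$date), by = "1 month")

# Unique spatial points, labelled "id1", "id2", ... (label pattern assumed)
pts <- unique(obs[, c("x", "y")])
pts$spatialIndex <- paste0("id", seq_len(nrow(pts)))

# Cartesian product: every spatial point crossed with every grid date
full <- merge(pts, data.frame(date = grid_dates), by = NULL)

# Left join: grid cells without an observation receive NA
full <- merge(full, obs, by = c("x", "y", "date"), all.x = TRUE)

# Consecutive time indexes starting from 1
full$timeIndex <- match(full$date, grid_dates)
full <- full[order(full$spatialIndex, full$timeIndex), ]
full
```

Here the first point has no February value and the second has no March value, so the padded table holds six rows, two of them with NA.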

Value

object of the STFDF-class or data.frame, which contains coordinates of the spatial points, the spatial IDs, the temporal IDs, the dates (if available in the input file) and the observed values of the variable of interest

References

Bivand, R. S., Pebesma, E., Gomez-Rubio, V., 2013, Applied spatial data analysis with R, Second edition. New York: Springer. https://asdar-book.org/

Grolemund, G., Wickham, H., 2011, Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1–25.

Pebesma, E. J., 2012, spacetime: Spatio-Temporal Data in R. Journal of Statistical Software, 51(7), 1–30.

Zeileis, A., Grothendieck, G., 2005, zoo: S3 Infrastructure for Regular and Irregular Time Series. Journal of Statistical Software, 14(6), 1–27.

See Also

STFDF-class

read.table

yearmon

yearqtr

Dates for dates without times

DateTimeClasses

timezones for OlsonNames

Examples

#example 1: import a text file, with dates stored in a single column (the 4th)
# and fill missing time points in monthly time series, with time lag equal to one


## Not run
## To run example 1 copy and paste the following lines (without the symbol '#')
## in the console:
#file_date <- system.file("extdata", "file_date.txt", package = "covatest")
#db.date <- read.STdata(file = file_date, header = TRUE, iclx = 2, icly = 3, iclt = 0,
#icldate = c(icl.date = 4, iclty = 0, icltm = 0, icltd = 0),
#icltime = c(icl.time = 0, icltH = 0, icltM = 0, icltS = 0),
#iclvr = 5, iclsp = 1, missing.v = -99999, save.as = "data.frame",
#date.format = c("Date", "%d-%m-%Y"), bytime = "%m", tlag = 1)


#example 2: import a text file, with dates and times stored in different columns
# (from the 4th to the 9th) and fill missing time points in hourly time series,
# with time lag equal to three

## Not run
## To run example 2 copy and paste the following lines (without the symbol '#')
## in the console:
#file_datetime <- system.file("extdata", "file_datetime.txt", package = "covatest")
#db.datetime <- read.STdata(file = file_datetime, header = TRUE, iclx = 2, icly = 3, iclt = 0,
#icldate = c(icl.date = 0, iclty = 6, icltm = 5, icltd = 4),
#icltime = c(icl.time = 0, icltH = 7, icltM = 8, icltS = 9),
#iclvr = 10, iclsp = 1, missing.v = -99999, save.as = "data.frame",
#date.format = c("POSIX", "%Y %m %d %H %M %S"), bytime = "%H", tlag = 3)


#example 3: import a text file, with year and quarter stored in a single column
# (the 4th) and fill missing time points in quarterly time series,
# with time lag equal to one

## Not run
## To run example 3 copy and paste the following lines (without the symbol '#')
## in the console:
#file_yq <- system.file("extdata", "file_yq.txt", package = "covatest")
#db.yq <- read.STdata(file = file_yq, header = TRUE, iclx = 2, icly = 3, iclt = 0,
#icldate = c(icl.date = 4, iclty = 0, icltm = 0, icltd = 0),
#icltime = c(icl.time = 0, icltH = 0, icltM = 0, icltS = 0),
#iclvr = 5, iclsp = 1, missing.v = -99999, save.as = "data.frame",
#date.format = c("yearqtr", "%Y-Q%q"), bytime = "%q", tlag = 1)



covatest documentation built on July 9, 2023, 5:29 p.m.