View source: R/HICFunctionsForCleaningContinuousBioParameters.R
Batch-process function for importing continuous water quality data from the Hydrological Information Center (HIC) database. This function takes the csv files exported from the HIC database and converts them into a format that can be used easily in R. The header of each HIC csv file holds the metadata, one "name: value" entry per line. These metadata are taken from the header and put into columns in the dataset. The dates and times are converted into an R-friendly datetime format and into UNIX numeric format for easier handling in R.
HIC.Continuous.Data.Import.Format(
InputDirectory,
OutputDirectory,
Data.sep = "\t",
Meta.sep = ":",
Data.header.line = NULL,
Dec = ".",
DateFormat = "%d/%m/%Y",
TimeZone = "Etc/GMT-1",
OneYearDataSet = F,
ValueColumnNum = 3,
ParamNameColmn = "Parameter.Name",
StationNoColmn = "Station.Number",
DateColmn = "Date",
TimeColmn = "Time"
)
Argument | Description |
InputDirectory | Path, in quotes, to the directory containing all the csv tables that you wish to format. Do not use single backslashes (\); use forward slashes (/) or double backslashes (\\). |
OutputDirectory | Path, in quotes, to the directory where you wish to save the formatted data. Do not use single backslashes (\); use forward slashes (/) or double backslashes (\\). |
Data.sep | The field separator character for the value data. The columns of the data table are separated by this character. The default is tab separated, "\t". |
Meta.sep | The field separator character for the metadata. Each metadata name is separated from its value by this character. The default is colon separated, ":". |
Data.header.line | By default (NULL), the data header is assumed to be the first line with the most separations. If that is not the case, specify the line number of the data header here. |
Dec | The decimal character. By default ".". |
DateFormat | Character string giving the date format. By default "%d/%m/%Y". See the strptime() help file for additional help. |
TimeZone | The time zone, by default UTC+1, "Etc/GMT-1". Use OlsonNames() for a list of all time zone names. |
OneYearDataSet | If the dataset lies within a single calendar year, you can set this to TRUE and the year will be added to the ID and the file name. If the dataset spans more than one calendar year, a warning message will appear and the year will not be added to the ID or file name. |
ValueColumnNum | The column number of the data values. This column has inconsistent naming and thus must be referred to by column number. |
ParamNameColmn | The name of the parameter name column in the metadata. If you enter a new name, replace all spaces and special characters with ".". |
StationNoColmn | The name of the station number column in the metadata. If you enter a new name, replace all spaces and special characters with ".". |
DateColmn | Name of the date column, in quotes. |
TimeColmn | Name of the time column, in quotes. |
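The default rule for Data.header.line ("the first line with the most separations") can be sketched as follows. This is only an illustration of the idea, not the package's internal code; the `lines` vector is a hypothetical stand-in for the contents of one export, and tab is assumed as Data.sep.

```r
# Hypothetical contents of one HIC export (metadata on top, data table below).
lines <- c(
  "Station.Number: RTZ25",
  "Parameter.Name: temp",
  "Parameter.Unit: C",
  "Date\tValue\tState.of.Value",
  "25/01/2018\t5.6\t110",
  "25/01/2018\t7.8\t110"
)

# Count the fields that the data separator produces on each line.
n_fields <- lengths(strsplit(lines, "\t", fixed = TRUE))

# which.max() returns the first index that attains the maximum,
# i.e. the first line with the most separations.
header_line <- which.max(n_fields)

# Read the data table, skipping the metadata lines above the header.
data <- read.table(text = lines, sep = "\t", dec = ".", header = TRUE,
                   skip = header_line - 1, stringsAsFactors = FALSE)
```

If a file has a metadata line that happens to contain more separators than the data header, this heuristic would pick the wrong line, which is the situation where setting Data.header.line by hand helps.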
Place all HIC csv files into one directory. Pass this directory as InputDirectory, in quotes. Pass the directory where the formatted data should be saved as OutputDirectory, also in quotes. Both paths must use forward slashes (/) or double backslashes (\\), not single backslashes (\). A directory path copied from Windows contains single backslashes (\), which need to be changed to forward slashes (/) or double backslashes (\\). If OutputDirectory is not a full path, the directory is created in your working directory.
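A path copied from Windows can be converted with base R before it is passed to the function. The snippet below is a small sketch; the variable names are only for illustration.

```r
# A Windows path typed into R source: each backslash has to be doubled,
# so the string itself contains single backslashes at run time.
win_path <- "C:\\Rdata\\originaldata\\HICdata"

# Replace every backslash with a forward slash.
fwd_path <- gsub("\\", "/", win_path, fixed = TRUE)
fwd_path

# A relative output directory is interpreted relative to the working directory.
out_dir <- "FormattedHICdata"
full_out_dir <- file.path(getwd(), out_dir)
```

Either `win_path` (with doubled backslashes in the source) or `fwd_path` is an acceptable way to write the same location.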
The input table is assumed to consist of a vertical list of the metadata on top of the horizontally oriented data table.
Example of the assumed data table structure of the input data:
Station.Number: RTZ25
Parameter.Name: temp
Parameter.Unit: C
Date | Value | State.of.Value |
25/01/2018 | 5.6 | 110 |
25/01/2018 | 7.8 | 110 |
25/01/2018 | 4.2 | 110 |
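To illustrate what "putting the metadata into columns" means for a table like the one above, here is a minimal base R sketch. It is not the package's implementation; the objects `meta_lines`, `data` and `formatted` are hypothetical, and ":" is assumed as Meta.sep.

```r
# Metadata lines as they appear above the data table.
meta_lines <- c("Station.Number: RTZ25", "Parameter.Name: temp", "Parameter.Unit: C")

# Split each line at the first ":" into a name and a value.
keys <- trimws(sub(":.*$", "", meta_lines))
vals <- trimws(sub("^[^:]*:", "", meta_lines))

# make.names() turns spaces and special characters into ".", which is why
# column names such as "Parameter Name" are given as "Parameter.Name".
keys <- make.names(keys)

# The data table from the example above.
data <- data.frame(
  Date = c("25/01/2018", "25/01/2018", "25/01/2018"),
  Value = c(5.6, 7.8, 4.2),
  State.of.Value = c(110, 110, 110),
  stringsAsFactors = FALSE
)

# Append each metadata entry as a constant column to the right of the data.
formatted <- data
for (i in seq_along(keys)) {
  formatted[[keys[i]]] <- vals[i]
}
formatted
```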
This function can't deal with extremely inconsistent column names between files. It searches for keywords to find the columns in the metadata, but if the files share no common words, it can't find them. All output is saved under automatically generated names of the form Station.ParameterName.Year.SystemTimeInSecondsFileNumber.csv. There is therefore a risk of overwriting older output if you run this batch process in a loop and multiple files with the same station, parameter and file number within their folder are processed less than a second apart. This is unlikely to occur but is possible in theory.
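The naming pattern and the overwrite risk can be made concrete with a short sketch. How the package actually assembles the name is not shown in this help page, so everything below (variable names, the use of Sys.time(), the file.exists() guard) is an assumption made for illustration.

```r
# Hypothetical metadata and dates for one input file.
station <- "RTZ25"
param <- "temp"
dates <- as.Date(c("25/01/2018", "26/01/2018"), format = "%d/%m/%Y")
file_number <- 1  # assumed: position of the file within the input folder

# The year is only part of the name when all dates share one calendar year.
years <- unique(format(dates, "%Y"))
one_year <- length(years) == 1

# System time in whole seconds; two runs within the same second get the same stamp.
stamp <- as.integer(Sys.time())

# c() drops the NULL produced when one_year is FALSE.
parts <- c(station, param, if (one_year) years, paste0(stamp, file_number))
file_name <- paste0(paste(parts, collapse = "."), ".csv")

# Checking before writing avoids silently replacing an earlier result.
out_path <- file.path(tempdir(), file_name)
if (file.exists(out_path)) {
  warning("Output file already exists and would be overwritten: ", out_path)
}
```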
This function writes each csv file in the input directory to a separate comma-separated csv file in the output directory, with all the metadata placed into columns to the right of the data, the date and time merged into one datetime column ("DateTime"), a numeric datetime column ("DateTimeUnix") in UNIX seconds, and all parameter values in the column "Value".
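The relationship between the "DateTime" and "DateTimeUnix" columns can be reproduced in base R. The sketch below uses the default DateFormat and TimeZone; the time format "%H:%M:%S" and the example values are assumptions, since the help page does not state how times are written in the HIC export.

```r
# Hypothetical date and time columns from one file.
date <- c("25/01/2018", "25/01/2018")
time <- c("00:00:00", "12:30:00")

# Merge date and time and parse them in the default time zone.
# "Etc/GMT-1" follows the POSIX sign convention and is one hour ahead of UTC.
DateTime <- as.POSIXct(paste(date, time),
                       format = "%d/%m/%Y %H:%M:%S",
                       tz = "Etc/GMT-1")

# Seconds since 1970-01-01 00:00:00 UTC.
DateTimeUnix <- as.numeric(DateTime)

data.frame(DateTime = DateTime, DateTimeUnix = DateTimeUnix)
```

The numeric column is independent of the time zone used for display, which makes it convenient for sorting, differencing and joining.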
HIC.Continuous.Data.Import.Format(
InputDirectory = "C:/Rdata/originaldata/HICdata",
OutputDirectory = "FormattedHICdata")
# The folder "FormattedHICdata" will be created in your working directory since it is not a full path.