Description Usage Arguments Details Value Examples
View source: R/read_metadata.R
Reads meta-data from Excel files to the format accepted by NPI data managers
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | read_metadata(
meta_file,
sheet = 1,
additional = NULL,
add_time = 0,
date_origin = "1899-12-30",
expedition = NULL,
station = NULL,
type = NULL,
sample_name = NULL,
longitude = NULL,
latitude = NULL,
date = NULL,
bottom_depth = NULL,
gear = NULL,
from = NULL,
to = NULL,
responsible = NULL,
comment = NULL,
guess_colnames = TRUE
)
|
meta_file |
File name of the meta-data table or a data frame containing meta-data. If file name, the file has to have |
sheet |
sheet number or name where meta-data are located. See |
additional |
Additional columns to be included in meta-data. Must be specified as a named character vector, which lists the exact column names to be included together with exported column names as names of the vector. See Examples. |
add_time |
Hours to be added to the ISO 8601 |
date_origin |
The origin for recorded dates in the Excel sheet in "YYYY-MM-DD" format. See Details. |
expedition |
Column name specifying the name of the expedition. See Details. |
station |
Column name specifying the station name. See Details. |
type |
Column name specifying the sample type column. See Details. |
sample_name |
Column name specifying the sample name column. See Details. |
longitude |
Column name specifying the longitude column. Coordinates should be given in decimal degrees. See Details. |
latitude |
Column name specifying the latitude column. Coordinates should be given in decimal degrees. See Details. |
date |
Column name specifying the sampling time. The information will be transformed to ISO 8601 standard format. See note about date conversion in Details. See Details. |
bottom_depth |
Column name specifying the bottom depth during sampling. See Details. |
gear |
Name of the column specifying the sampling gear. See Details. |
from |
Name of the column specifying the depth from which sampling was started. See Details. |
to |
Name of the column specifying the depth from which sampling was ended. See Details. |
responsible |
Name of the column specifying the responsible persons for the sampling. See Details. |
comment |
Name of the column specifying comments. Any non-numeric values in |
guess_colnames |
Logical indicating whether |
The guess_colname
function is used to search for column names in meta_file
by default. This procedure saves the user from specifying all the required column names (expedition
, station
, type
, sample_name
, longitude
, latitude
, date
, bottom_depth
, gear
, from
, to
, responsible
and comment
). The function works well with tested Excel sheets, but might cause an error if column names are far from tested names. Modify coln_search_words
in such case (guessing is done using regular expressions).
The easiest solution is to change the column names close to column names listed in Usage (or close to those listed in coln_search_words
). Alternatively you can list all column names manually. Note that any white space in column names must be denoted by period (.
). All column names should be specified as character strings of length 1. The column names refer to the meta-data table (dt
).
Date conversion between Excel and R is fairly unstable business as your current locale (i.e. to what time zone your computer is set to) might affect the outcome. The easiest way to convert dates is to enter meta_file
as a character string specifying the Excel file name (and location). This allows the usage of openxlsx package's convertToDateTime
function, which often reads date-times correctly, given that the dates are entered in a consistent format (this is not always the case in sample logs).
Another option is to input a data frame already opened from Excel. This option might be necessary when doing changes to the data frame prior passing it through the read_metadata
function. In this case, you have to make sure that date_origin
is correct for your operating system (see https://www.r-bloggers.com/date-formats-in-r/).
The add_time
argument can be used to add or subtract hours from the output, if the times do not match with those in the Excel sheet. This can be helpful in either cases, if your locale causes an offset between recorded dates.
Returns a list of class MetaData
that contains modified meta-data records ($meta
), removed meta-data records ($deleted
), file name to be used when saved for database import (file_id
) and a data frame containing row numbers of duplicates in the original Excel sheet ($duplicates
).
Dates ($meta$date
) are returned as UTC date-time in ISO 8601 format.
1 2 3 4 5 6 7 | ## Read meta-data and let the function to find errors in it:
x <- read_metadata("Samplelog MOSJ 2015.xlsx")
## Meta-data reading follows openxlsx syntax.
## You can add columns using the additional argument:
x <- read_metadata("GlacierFront_2017_Samplelog_20171024.xlsx",
sheet = "SAMPLELOG", additional = c(Conveyance = "Sampled.from"))
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.