View source: R/prep_importData.r
getFullData | R Documentation |
If everything is left at the defaults, the function first tries to load an R-object containing previously imported spectral data. If no such object is found, it imports spectral data from a file in the rawdata-folder, fuses these data (if slType is not NULL) with the class-header provided in the sampleLists/sl_in folder, and saves the resulting dataset. It is also possible to use a user-defined custom function to import data from a file in any format that contains the NIR-spectra as well as all the class- and numerical variables. In that case it is still possible to fuse additional variables provided in a file in sampleLists/sl_in with the imported data.
getFullData(
md = getmd(),
filetype = "def",
slType = "def",
trhLog = "def",
multiplyRows = "def",
ttl = TRUE,
stf = TRUE,
naString = "NA",
dol = "def",
sh = NULL,
remDC = getstn()$imp_remDoubleCols,
rawOnlyNIR = FALSE
)
gfd(
md = getmd(),
filetype = "def",
slType = "def",
trhLog = "def",
multiplyRows = "def",
ttl = TRUE,
stf = TRUE,
naString = "NA",
dol = "def",
sh = NULL,
remDC = getstn()$imp_remDoubleCols,
rawOnlyNIR = FALSE
)
md |
List. The object with the metadata of the experiment.
The default is to get the metadata file via |
filetype |
Character. The type of the spectral raw data file. If a value other than "def" is provided, it overrides the value of "filetype" in the metadata file. Possible values are:
|
slType |
Character. The type of sample-list file in the sampleLists/sl_in folder. Possible values are:
|
trhLog |
Whether data from a temperature and rel. humidity logger should be imported and aligned to a timestamp in the dataset. Possible values are:
|
multiplyRows |
Character or Logical. If the rows in the sample list
should be multiplied by the number of consecutive scans as specified
in the variable
Please also refer to |
ttl |
Logical, 'try to load'. Whether a possibly existing r-data file should be loaded. The experiment name is extracted from the provided metadata (argument 'md'), and if a file having the same name as the experiment name is found in folder 'R-data', it is loaded. If there is no such file, the spectra and class variables are imported from raw-data, and the whole dataset is saved if argument 'stf' is TRUE. In other words, providing 'FALSE' to argument 'ttl' always imports the spectra from the raw-data. |
stf |
Logical, 'save to file'. If the final dataset should be saved to the 'R-data' folder after import from the raw-data file. Defaults to 'TRUE'. |
naString |
Character. What to use as 'NA'. Applies only when 'filetype'
is |
dol |
Detect outliers. If outliers should be detected using the flags
provided by |
sh |
Character length one. Manual path to settings home. Can and should
be left at the default |
remDC |
Logical. Takes its factory-fresh default value |
rawOnlyNIR |
Logical. If class- and numerical variables that got possibly
imported from within a raw data file should be discarded. Defaults to
|
From the metadata provided in the first argument, the experiment name is
extracted. If 'ttl' is TRUE, a dataset-file having this name is first looked for
in the 'R-data' folder and, if present, loaded. If the file cannot be found (or
if 'ttl' is FALSE), the spectral file having the same name as the experiment
name (plus its specific ending) is imported from the rawdata-folder. The sample
list (which is used to create the header) must be in the sampleLists/sl_in
folder and must be named with the experiment name, followed by "-in" and then
the file extension. To be recognized as such, the standard columns have to be
named with the standard column names as defined in the settings.r file (see
printStdColnames).
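The naming rules above can be illustrated with a small sketch. The helpers `expectedFiles` and `importSource` are hypothetical and not part of aquap2; the file extensions `.dat` and `.xlsx` are assumptions that in practice depend on 'filetype' and 'slType'.

```r
# Hypothetical helper: lists the file paths the lookup described above expects.
# 'rawExt' and 'slExt' are assumed extensions, chosen only for illustration.
expectedFiles <- function(expName, rawExt = ".dat", slExt = ".xlsx") {
  list(
    rdata      = file.path("R-data", expName),
    raw        = file.path("rawdata", paste0(expName, rawExt)),
    sampleList = file.path("sampleLists", "sl_in", paste0(expName, "-in", slExt))
  )
}

# Hypothetical helper: decides whether the stored dataset would be loaded
# ('ttl' is TRUE and the file exists) or the raw data would be imported.
importSource <- function(expName, ttl = TRUE, root = ".") {
  f <- expectedFiles(expName)
  if (ttl && file.exists(file.path(root, f$rdata))) "R-data" else "rawdata"
}

ef <- expectedFiles("bar")
ef$sampleList                      # "sampleLists/sl_in/bar-in.xlsx"
importSource("bar", ttl = FALSE)   # "rawdata"
```

With 'ttl' set to FALSE the sketch always points to the raw data, mirroring the behaviour described for getFullData.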
If you use a custom function and provide all the class- and numerical variables
together with the spectral data, set argument 'slType' to NULL.
If you import from a .pir file and have all the class- and numerical variables
inside the .pir file, set argument 'slType' to NULL.
If the dataset is the result of the fusion of other datasets (see
mergeDatasets), the slot 'mergeInfo' will contain further information.
An object of class 'aquap_data' containing a data frame and six slots:
dataframe Consists of 'header', 'colRep' and 'NIR'.
metadata A list with the metadata of the experiment
anproc Possibly a list with an analysis procedure
mergeInfo Possibly an object of class 'aquap_mergeLabels' (generateMergeLabels), if the dataset is the result of merging other datasets.
calcVarInfo Possibly a list containing information on calculated variables.
ncpwl Numeric length one, the number of characters before the wavelength in the column names of the NIR spectra.
version A length one character noting the version of the dataset.
The strict regime with the filenames (see Details) may at first seem a bit complicated, but it has proved to be good practice for ensuring a strict and conscious handling of the files.
For the raw spectra to be imported from an xlsx file, a few prerequisites have
to be fulfilled. It is recommended to look at the file structure of
xlsx files generated via export_ap2_ToXlsx and use that as a template.
At least two worksheets are required to be in the xlsx file: One contains the data, the other some metadata describing the data.
Data Worksheet: The worksheet containing the data can either contain
only NIR spectra, or class and numerical variables (what is called the
'header') **and** NIR spectra. (Compare export_ap2_ToXlsx).
The data worksheet's name should either end in _data, or it has to be
the first worksheet.
_meta Worksheet: The name of the worksheet containing the metadata must end
in _meta. There can only be one worksheet ending in _meta in the
file. In this worksheet, there has to be one row with three columns. The names
of the columns have to be ncol_header, rownamesAsFirstColumn and ncpwl.
First column in _meta: Provide an integer denoting the number of columns
in the header. Provide 0 (zero) if the data only contain NIR spectra.
Second column in _meta: Logical, denotes whether there are rownames in
the data. Set to TRUE or FALSE.
Third column in _meta: Provide an integer denoting the number of
characters in front of the wavelength-number. Set to 0 (zero) if there
are no characters in front of the wavelengths.
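As a rough sketch of the structure described above, the two worksheets can be mocked up as data frames. All values and the header column names (`C_Group`, `Y_Temp`) are made up for illustration; only the three `_meta` column names are taken from the text. Writing the list to disk with the openxlsx package is shown as a comment only and assumes that package is installed.

```r
# Toy data worksheet: 2 header columns followed by 3 NIR columns (values made up).
dataSheet <- data.frame(
  C_Group = c("A", "B"),
  Y_Temp  = c(20.5, 21.0),
  X1300   = c(0.11, 0.12),
  X1302   = c(0.13, 0.14),
  X1304   = c(0.15, 0.16)
)

# _meta worksheet: exactly one row with the three required columns.
metaSheet <- data.frame(
  ncol_header           = 2L,     # number of header columns in the data sheet
  rownamesAsFirstColumn = FALSE,  # no rownames column in the data sheet
  ncpwl                 = 1L      # one character ("X") precedes each wavelength
)

# Worksheet names: data sheet ends in "_data", metadata sheet ends in "_meta".
sheets <- list(bar_data = dataSheet, bar_meta = metaSheet)

# Consistency check: recover the wavelengths from the NIR column names.
nirNames    <- names(dataSheet)[-seq_len(metaSheet$ncol_header)]
wavelengths <- as.numeric(substring(nirNames, metaSheet$ncpwl + 1))

# Assumption: with openxlsx installed, a named list of data frames can be
# written as one worksheet per element:
# openxlsx::write.xlsx(sheets, "bar.xlsx")
```

If `wavelengths` comes out as plain numbers, the values of ncol_header and ncpwl agree with the column layout of the data sheet.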
Timestamps: Should there be timestamps in the xlsx file, their column
name has to be Timestamp, and the format has to be POSIXct in
order to be recognized correctly. If these requirements cannot be met, it is
advised to write a custom import function to import from xlsx files. Please
see custom_import for further information.
If there are class and numerical variables present in the xlsx file **and**
variables from a sample list are imported as well (so slType is **not**
NULL), the sample list must contain a column denoting the
sample number. In this case, the sample number and the number of consecutive
scans get imported from the sample list file, and it will result in an error
to have those variables in the xlsx file as well.
Generally, it is not possible to have two variables with the same name.
Please look at the files generated via export_ap2_ToXlsx as a reference.
It is possible to have all or some of the class- and numeric variables in the
.dat file. Whatever is present will be read out, and if an additional
sample list is requested for import (parameter slType != NULL) it
will be combined. The tab-separated .dat file by the Yunosato Aquaphotomics
lab is styled as follows:
The first row starts with #D and contains the dimension in columns
x rows (e.g. 25x30).
The second row starts with #C and contains the column names, with
a w preceding the wavelengths, a * preceding the class variables,
and a $ preceding the numeric variables. Please consider the standard
column names, see printStdColnames.
The following rows all start with #S and contain the data, and in
the first column there is a string. This string is structured via _,
and in its last element there is a timestamp in the format
"YYYYMMDDHHMMSS", and in its second last element there are the
consecutive scans. All previous elements stay as they are and are used as
base for rownames and provided as an extra class variable.
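The structure of the `#D` row and of the first-column string can be sketched as follows. The helper `parseDatRowName`, the example strings and the time zone are assumptions for illustration; this is not the parser used inside aquap2.

```r
# Hypothetical helper: splits the first-column string of a '#S' row into its
# base name, the consecutive scan number and the timestamp.
parseDatRowName <- function(x) {
  parts <- strsplit(x, "_", fixed = TRUE)[[1]]
  n <- length(parts)
  stopifnot(n >= 3)  # at least one base element, the scan number, the timestamp
  list(
    base      = paste(parts[seq_len(n - 2)], collapse = "_"),
    conSNr    = as.integer(parts[n - 1]),
    timestamp = as.POSIXct(parts[n], format = "%Y%m%d%H%M%S", tz = "UTC")
  )
}

# Made-up example string: base "exp1_sampleA", scan 3, timestamp 2016-02-15 14:30:00.
res <- parseDatRowName("exp1_sampleA_3_20160215143000")

# The '#D' row holds the dimension as columns x rows, e.g. "25x30".
dims <- as.integer(strsplit(sub("^#D\\s*", "", "#D\t25x30"), "x", fixed = TRUE)[[1]])
names(dims) <- c("columns", "rows")
```

Here `res$base` keeps all leading elements joined by `_`, which corresponds to the part used as base for the rownames.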
Designed to read the .csv file as produced by the MicroNIR software from VIAVI.
The number of the consecutive scan is taken from the name assigned by the MicroNIR
software (someSample-1.sam).
Choose one of the following options when providing user input at the sample-ID
input in the MicroNIR's GUI:
Only Numbers: These numbers have to be unique, and they will be used as the sample number. No sampleID will be produced. In case of a mistake, i.e. a repeated number at a later measurement, just enter a character at some subsequent measurement so that all sampleIDs are forced to be treated as character. The sample numbers will then be auto-generated.
Character: Provide any character as sampleID. It should be unique for
each sample. In case of a mistake, i.e. a second instance of a sampleID,
all instances of this sampleID will be renamed by appending #n with
n being the number of the instance, starting with 1 for the first.
The consecutive scans will be renumbered to always range from 1 to n for
each sample instance.
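The renaming rule for repeated character sampleIDs can be sketched in base R. The function `renameRepeatedIds` is hypothetical and assumes one ID per sample instance; it only illustrates the rule described above and is not the code used by aquap2.

```r
# Hypothetical helper: appends "#n" to every instance of a sampleID that
# occurs more than once, with n counting the instances starting at 1.
renameRepeatedIds <- function(ids) {
  # running instance number within each group of identical IDs
  instance <- ave(seq_along(ids), ids, FUN = seq_along)
  # TRUE for every ID that occurs more than once (including its first instance)
  repeated <- ids %in% ids[duplicated(ids)]
  ids[repeated] <- paste0(ids[repeated], "#", instance[repeated])
  ids
}

newIds <- renameRepeatedIds(c("milk", "water", "milk", "juice"))
newIds  # "milk#1" "water" "milk#2" "juice"
```

IDs that occur only once are left untouched, so only the conflicting samples change their name.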
The device temperature will be imported, as well as the notes and the time as
given in the MicroNIR file. The DateTime format on your computer determines
the format of the timestamp in the MicroNIR file. The input format aquap2 uses
to read this time can be changed via the global settings file (parameter
imp_timeFormat_microNir).
The instrument's serial number will be stored in the slot instrument in
the resulting R-object.
The HOBOware logger file is structured as follows:
The first row contains a title, the second row the column names.
The first column contains the row number.
The second column contains the timestamp in the format
day/month/Year Hour:Minutes:Seconds (24h format, day and
month in 2 digits, year in 4 digits). Please note that the time format
for importing from HOBOware data loggers can be specified in the
global settings file at the key imp_timeFormat_HOBOware.
The third column contains temperature data.
The fourth column contains relative humidity data.
By creating or formatting your own temperature and rel. humidity data export
like this, it is possible to use the built-in HOBO import function for importing
temperature and rel. humidity data. Please also note the possibility to create
a custom import function for the temp. and rel.hum. data, see the input option
custom@yourFile.R at the parameter trhLog and custom_TRH.
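A logger export with the layout described above could be read roughly as follows. The sketch assumes a comma-separated file, made-up values, made-up column titles and the UTC time zone; it is not the built-in HOBO import function.

```r
# Made-up logger export: title row, column-name row, then the data rows.
hoboText <- c(
  '"Plot Title: logger1"',
  '"#","Date Time","Temp","RH"',
  '1,15/02/2016 14:30:00,21.5,45.2',
  '2,15/02/2016 14:31:00,21.6,45.0'
)

# Skip the title row; the second row supplies the column names.
trh <- read.csv(text = hoboText, skip = 1, header = TRUE,
                check.names = FALSE, stringsAsFactors = FALSE)
names(trh) <- c("row", "Time", "Temp", "RelHum")

# Timestamp format: day/month/Year Hour:Minutes:Seconds (24h).
trh$Time <- as.POSIXct(trh$Time, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
```

The format string `"%d/%m/%Y %H:%M:%S"` is the strptime notation for the timestamp layout given above; a file using a different layout would need a different string.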
readSpectra, readHeader, aquap_data-methods
Other Core functions: exportSampleList(), gdmm(), plot,aquap_cube,missing-method, plot,aquap_data,missing-method
## Not run:
md <- getmd()
fd <- getFullData(md)
fd <- getFullData() # the same as above
fd <- gfd(getmd(expName="OtherName")) # to override the experiment name specified in
# the metadata.r file and load the dataset called 'OtherName' instead. (see ?getmd)
fd <- gfd(md=getmd("foo.r")) # loads metadata from file 'foo.r'
fd <- getFullData(filetype="custom@myFunc.r", slType="xls")
# This would use a custom function to read in the raw spectra, and read in
# the class- and numerical variables from an Excel file.
##
md <- getmd()
md$meta$expName <- "bar"
fd <- getFullData(md) # load a rawdata-file called "bar"
## End(Not run)