mergeDatasets: Merge Datasets

mergeDatasets,aquap_data,aquap_data,missing-method    R Documentation

Merge Datasets

Description

Merge together two or more datasets, and possibly add class- or numerical variables to each dataset via the 'mergeLabels' object.

Usage

## S4 method for signature 'aquap_data,aquap_data,missing'
mergeDatasets(
  ds1,
  ds2,
  mergeLabels = NULL,
  noMatchH = getstn()$gen_merge_noMatchH,
  noMatchW = getstn()$gen_merge_noMatchW,
  resaTo = "best",
  resaMethod = getstn()$gen_resample_method,
  dol = getstn()$gen_merge_detectOutliers
)

## S4 method for signature 'aquap_data,aquap_data,aquap_mergeLabels'
mergeDatasets(
  ds1,
  ds2,
  mergeLabels,
  noMatchH = getstn()$gen_merge_noMatchH,
  noMatchW = getstn()$gen_merge_noMatchW,
  resaTo = "best",
  resaMethod = getstn()$gen_resample_method,
  dol = getstn()$gen_merge_detectOutliers
)

## S4 method for signature 'list,missing,missing'
mergeDatasets(
  ds1,
  ds2 = NULL,
  mergeLabels = NULL,
  noMatchH = getstn()$gen_merge_noMatchH,
  noMatchW = getstn()$gen_merge_noMatchW,
  resaTo = "best",
  resaMethod = getstn()$gen_resample_method,
  dol = getstn()$gen_merge_detectOutliers
)

## S4 method for signature 'list,missing,aquap_mergeLabels'
mergeDatasets(
  ds1,
  ds2 = NULL,
  mergeLabels,
  noMatchH = getstn()$gen_merge_noMatchH,
  noMatchW = getstn()$gen_merge_noMatchW,
  resaTo = "best",
  resaMethod = getstn()$gen_resample_method,
  dol = getstn()$gen_merge_detectOutliers
)

## S4 method for signature 'list,aquap_mergeLabels,missing'
mergeDatasets(
  ds1,
  ds2,
  mergeLabels = NULL,
  noMatchH = getstn()$gen_merge_noMatchH,
  noMatchW = getstn()$gen_merge_noMatchW,
  resaTo = "best",
  resaMethod = getstn()$gen_resample_method,
  dol = getstn()$gen_merge_detectOutliers
)

Arguments

ds1

An object of class 'aquap_data' or a list containing any number of objects of class 'aquap_data'

ds2

An object of class 'aquap_data', can be missing.

mergeLabels

An object of class 'aquap_mergeLabels' as generated by generateMergeLabels, can be missing.

noMatchH

Character length one. Defines what should happen in the case of non-matching header structures, i.e. when the column names of the headers of the datasets to be merged do not all overlap. The default value is defined in the settings.r file (gen_merge_noMatchH). Possible values are:

ask

The non-matching header columns in each dataset are displayed, and the user is asked interactively what to do, choosing from the three options below.

delete

Non-matching header columns are automatically deleted.

fill

Each column name not existing in all of the datasets to be merged is added to those datasets where it does not exist. The data is filled in with 'NAs'.

stop

In case of non-overlapping header structures, the merging process is stopped, possibly with a message being displayed.
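
The difference between 'delete' and 'fill' can be illustrated with plain data frames. This is only a base R sketch of the behaviour described above, not the package's implementation; the headers h1 and h2, their column names and the helper fillHeader are made up for illustration.

```r
# Two hypothetical headers with partly different columns (made-up names/values)
h1 <- data.frame(C_Group = c("A", "B"), Y_Temp = c(21.5, 22.0),
                 stringsAsFactors = FALSE)
h2 <- data.frame(C_Group = c("A", "C"), C_Batch = c("b1", "b2"),
                 stringsAsFactors = FALSE)

common  <- intersect(names(h1), names(h2))  # columns present in both headers
allCols <- union(names(h1), names(h2))      # columns present in at least one

# 'delete': keep only the shared columns, then stack the rows
merged_delete <- rbind(h1[, common, drop = FALSE],
                       h2[, common, drop = FALSE])

# 'fill': add every missing column as NA, align the column order, then stack
# (fillHeader is a hypothetical helper, not part of the package)
fillHeader <- function(h, cols) {
  for (cn in setdiff(cols, names(h))) h[[cn]] <- NA
  h[, cols, drop = FALSE]
}
merged_fill <- rbind(fillHeader(h1, allCols), fillHeader(h2, allCols))
```

With 'delete' only the column shared by both headers survives; with 'fill' all three columns are kept and the gaps are NA.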

noMatchW

Character length one. Defines what should happen in the case of non-matching wavelengths, i.e. the wavelengths in the datasets to be merged are not identical. The default value is defined in the settings.r file (gen_merge_noMatchW). Possible values are:

ask

The non-matching wavelengths in each dataset are displayed, and the user is asked interactively what to do, choosing from the five options below.

cut

All wavelengths outside a range common to all datasets will be deleted. In other words, for some datasets the 'outsiders', i.e. the wavelengths outside of that common range, will be deleted.

fill

Missing wavelengths will be filled in with 'NAs'. In other words, the wavelengths of all datasets will be expanded to encompass the overall minimum and the overall maximum of the wavelengths of the datasets.

resacut

Same as 'cut', but the datasets are resampled so that they all have the same delta wavelength.

resafill

Same as 'fill', but the datasets are resampled so that they all have the same delta wavelength.

stop

In case of non-matching wavelengths, the merging process is stopped, possibly with a message being displayed.
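
What 'cut' and 'fill' mean for the wavelength axis can be sketched in base R. The wavelength vectors and the one-row spectra matrix below are invented for illustration, and the code only mirrors the description above; it is not how the package handles its own data structures.

```r
# Hypothetical wavelength vectors (nm) of two datasets, both with a 2 nm step
wl1 <- seq(1300, 1600, by = 2)
wl2 <- seq(1350, 1650, by = 2)

# 'cut': keep only the range that is common to both datasets
lo <- max(min(wl1), min(wl2))
hi <- min(max(wl1), max(wl2))
wl_cut <- wl1[wl1 >= lo & wl1 <= hi]

# 'fill': expand to the overall minimum and maximum of all datasets
wl_fill <- seq(min(wl1, wl2), max(wl1, wl2), by = 2)

# A one-row spectra matrix for dataset 1 (made-up constant values) ...
nir1 <- matrix(0.5, nrow = 1, ncol = length(wl1),
               dimnames = list(NULL, as.character(wl1)))
# ... placed into the expanded axis; wavelengths it does not cover stay NA
nir1_fill <- matrix(NA_real_, nrow = 1, ncol = length(wl_fill),
                    dimnames = list(NULL, as.character(wl_fill)))
nir1_fill[, as.character(wl1)] <- nir1
```

Here the common range is 1350 to 1600 nm, while the filled axis runs from 1300 to 1650 nm, so dataset 1 receives NA for every wavelength above 1600 nm.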

resaTo

Target wavelength for a (possible) resampling process (which uses the function do_resampleNIR). Can be one of the following:

"best"

If left at the default 'best', the best target wavelength will be determined automatically. The best target wavelength is the solution in which as few datasets as possible get resampled.

Character length one

The name of the dataset (if a named list is provided) containing the target wavelength.

Integer length one

The number of the dataset (e.g. in the provided list) containing the target wavelength.

Numeric Vector

Provide a numeric vector as target wavelengths to which all datasets will be resampled. The vector will be checked for plausibility, i.e. if it is in range of the provided datasets etc. For 'filling in' (option 'fill' or 'resafill' in argument 'noMatchW') only numeric vectors x with length(unique(diff(x))) == 1 are accepted.
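
The equidistance condition quoted above can be tried out directly before passing a vector to 'resaTo'. The following is a small base R sketch; isEquidistant, the example vectors and commonRange are hypothetical, and the range test is only an assumption about what "in range of the provided datasets" could mean. The plausibility checks the package actually performs may differ.

```r
# TRUE if all steps between neighbouring wavelengths are identical,
# i.e. the condition length(unique(diff(x))) == 1 from the text above
isEquidistant <- function(x) length(unique(diff(x))) == 1

okTarget  <- seq(1300, 1600, by = 0.5)        # constant step of 0.5 nm
badTarget <- c(1300, 1302, 1304, 1307, 1310)  # step changes from 2 to 3 nm

okEqui  <- isEquidistant(okTarget)
badEqui <- isEquidistant(badTarget)

# Assumed range check against a hypothetical common range of the datasets
commonRange <- c(1300, 1600)
inRange <- min(okTarget) >= commonRange[1] && max(okTarget) <= commonRange[2]
```

Because the condition compares the differences exactly, a vector built with a step that is not exactly representable in floating point (0.1, for example) may fail it even though it looks equidistant.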

resaMethod

Character length one. Which of the resampling methods should be used. Factory-fresh defaults to 'cubic'; the default can be changed in the settings.r file parameter gen_resample_method. See do_resampleNIR and interp1. 'linear' is much faster than e.g. 'spline' or 'cubic', but the quality of the resampling is not as good.
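
The quality trade-off between linear and smoother interpolation can be seen on a toy curve. This sketch deliberately uses base R's approx() and spline() as stand-ins rather than do_resampleNIR or interp1, so it illustrates the general effect only; the sine curve and both grids are made up.

```r
# A smooth toy "spectrum" sampled on a coarse grid (11 points)
x <- seq(0, pi, length.out = 11)
y <- sin(x)

# Finer target grid to resample to (101 points)
target <- seq(0, pi, length.out = 101)

# Linear interpolation (fast, piecewise straight lines)
yLinear <- approx(x, y, xout = target, method = "linear")$y

# Cubic spline interpolation (smoother; base R stand-in for 'spline'/'cubic')
ySpline <- spline(x, y, xout = target, method = "fmm")$y

# Maximum absolute deviation from the true curve on the target grid
errLinear <- max(abs(yLinear - sin(target)))
errSpline <- max(abs(ySpline - sin(target)))
```

On this example the spline error is clearly smaller than the linear error, which matches the statement that 'linear' is faster but of lower quality.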

dol

Logical length one. Whether outliers should be detected based on the scope of the new, merged dataset. The default value is defined in the settings file at gen_merge_detectOutliers.

Details

The resulting dataset is void of metadata (object@metadata) and analysis procedures. The order of column names in each header in a dataset is irrelevant, e.g. a header with the column names 'AA, BB, CC' does overlap with a header with the column names 'AA, CC, BB'.
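
That the order of the column names does not matter amounts to comparing the names as sets. A minimal base R sketch, using the column names from the example above plus one invented non-matching header (colsC), and not taken from the package's code:

```r
colsA <- c("AA", "BB", "CC")
colsB <- c("AA", "CC", "BB")  # same names as colsA, different order
colsC <- c("AA", "BB", "DD")  # hypothetical header with a differing column

sameAB <- setequal(colsA, colsB)  # order is ignored by setequal()
sameAC <- setequal(colsA, colsC)

# Columns that would count as non-matching between colsA and colsC
onlyInA <- setdiff(colsA, colsC)
onlyInC <- setdiff(colsC, colsA)
```

The first pair overlaps completely, whereas the second pair differs in 'CC' and 'DD', which is the situation governed by 'noMatchH'.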

Value

An object of class 'aquap_data', with all the single datasets merged together.

See Also

generateMergeLabels

Other dataset modification functions: [,aquap_data-method, calculateVariable(), combineVariable(), generateMergeLabels,aquap_data,aquap_data,character,character-method


bpollner/aquap2 documentation built on March 29, 2024, 7:33 a.m.