sits_reduce_imbalance: Reduce imbalance in a set of samples

View source: R/sits_sample_functions.R

sits_reduce_imbalance    R Documentation

Reduce imbalance in a set of samples

Description

Takes a sits tibble of samples with different labels and returns a new tibble with reduced class imbalance. Classes with too few samples are oversampled using the synthetic minority oversampling technique (SMOTE). Classes with too many samples are undersampled using the SOM methods available in the sits package.
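To make the oversampling idea concrete, here is a minimal base R sketch of SMOTE-style interpolation. It is an illustration only and not the code used inside sits: the function name smote_sketch and its arguments x, n_new and k are hypothetical, and the input is assumed to be a plain numeric matrix holding the samples of one minority class. Each synthetic point is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbours.

```r
# Minimal SMOTE-style sketch in base R (illustrative; not the sits internals).
# `smote_sketch`, `x`, `n_new` and `k` are hypothetical names.
smote_sketch <- function(x, n_new, k = 5) {
    x <- as.matrix(x)
    n <- nrow(x)
    stopifnot(n >= 2, n_new >= 0)
    # cannot use more neighbours than there are other samples
    k <- min(k, n - 1)
    # pairwise Euclidean distances between minority samples
    d <- as.matrix(dist(x))
    # exclude each sample from its own neighbour list
    diag(d) <- Inf
    synthetic <- matrix(NA_real_, nrow = n_new, ncol = ncol(x))
    for (i in seq_len(n_new)) {
        # pick a random minority sample
        base <- sample.int(n, 1)
        # pick one of its k nearest neighbours at random
        nn <- order(d[base, ])[seq_len(k)]
        neighbour <- nn[sample.int(k, 1)]
        # interpolate at a random position between the two samples
        gap <- runif(1)
        synthetic[i, ] <- x[base, ] + gap * (x[neighbour, ] - x[base, ])
    }
    synthetic
}

# Made-up minority class: 20 samples with 3 features each
set.seed(42)
minority <- matrix(rnorm(20 * 3), ncol = 3)
new_points <- smote_sketch(minority, n_new = 10)
dim(new_points)
```

Because every synthetic point is a convex combination of two existing samples, the new points stay within the per-feature range of the original minority class.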

Usage

sits_reduce_imbalance(
  samples,
  n_samples_over = 200,
  n_samples_under = 400,
  multicores = 2
)

Arguments

samples

Sample set to rebalance (a sits tibble).

n_samples_over

Number of samples to oversample for classes that have fewer samples than this number (default 200).

n_samples_under

Number of samples to undersample for classes that have more samples than this number (default 400).

multicores

Number of cores used to process the data (default 2).
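As a rough illustration of how the two thresholds partition the classes, the base R sketch below applies the default values to a set of made-up label counts. The counts and class names are hypothetical, and the selection logic is only a reading of the argument descriptions above, not the sits implementation.

```r
# Hypothetical label counts; not taken from any sits data set.
label_counts <- c(Forest = 120, Pasture = 350, Soy = 600, Water = 40)

# Default thresholds from the Usage section
n_samples_over <- 200
n_samples_under <- 400

# Classes below the oversampling threshold
to_oversample <- names(label_counts)[label_counts < n_samples_over]
# Classes above the undersampling threshold
to_undersample <- names(label_counts)[label_counts > n_samples_under]
# Classes between the two thresholds are assumed to be left as they are
unchanged <- setdiff(names(label_counts), c(to_oversample, to_undersample))

print(to_oversample)
print(to_undersample)
print(unchanged)
```

Under this reading, setting both thresholds to the same value (as in the Examples section) leaves no class in the unchanged band unless its count equals the threshold exactly.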

Value

A sits tibble with reduced sample imbalance.

Author(s)

Gilberto Camara, gilberto.camara@inpe.br

References

The reference paper on SMOTE is N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique”, Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, 2002.

Undersampling uses the self-organizing map (SOM) method developed by Lorena Santos and co-workers and used in the sits_som_map() function. The SOM technique is described in the paper: Lorena Santos, Karine Ferreira, Gilberto Camara, Michelle Picoli, Rolf Simoes, “Quality control and class noise reduction of satellite image time series”, ISPRS Journal of Photogrammetry and Remote Sensing, vol. 177, pp. 75-88, 2021. https://doi.org/10.1016/j.isprsjprs.2021.04.014.

Examples

if (sits_run_examples()) {
    # print the labels summary for a sample set
    summary(samples_modis_ndvi)
    # reduce the sample imbalance
    new_samples <- sits_reduce_imbalance(samples_modis_ndvi,
        n_samples_over = 200,
        n_samples_under = 200,
        multicores = 1
    )
    # print the labels summary for the rebalanced set
    summary(new_samples)
}

sits documentation built on Nov. 2, 2023, 5:59 p.m.