In pat-s/2019-feature-selection: Research Compendium package for the publication "Monitoring forest health using hyperspectral imagery: Does feature selection improve the performance of machine-learning techniques?"

Modeling defoliation as a proxy for tree health: Comparison of feature-selection methods across multiple feature sets derived from hyperspectral data

Authors

Patrick Schratz (patrick.schratz@gmail.com)

Jannes Muenchow

Eugenia Iturritxa

José Cortés

Bernd Bischl

Alexander Brenning

Paper

This repository contains the research compendium of our work on comparing algorithms across multiple feature sets and filtering methods (including ensemble filter methods).

keywords
hyperspectral imagery
forest health monitoring
machine learning
feature selection
feature effects
model comparison
filter
imaging spectroscopy

Use machine-learning algorithms to model defoliation of Pinus radiata trees.
Compare filtering methods (including ensemble filter methods) across various algorithms and datasets.
Predict defoliation for all available plots (24) and for the whole Basque Country (at 200 m resolution).

The following scripts and directories belong to this project:

code/01-download.R
code/02-hyperspectral-processing.R
code/04-data-processing.R
code/05-modeling/
code/06-benchmark-matrix/
code/07-reports/

How to use

Reading the code, accessing the data

See the code directory on GitHub for the source code that generated the figures and statistical results contained in the manuscript. See the data directory for instructions on how to access the raw data used in the manuscript.

Installing the R package

This repository is organized as an R package, providing functions and raw data to reproduce and extend the analysis reported in the publication. Note that this package has been written explicitly for this project and is not suited for more general use.

This project is set up with a drake workflow, ensuring reproducibility. Intermediate targets/objects will be stored in a hidden .drake directory.

The R library of this project is managed by renv. This makes sure that the exact same package versions are used when recreating the project. When calling renv::restore(), all required packages will be installed with their specific version.

Please note that this project was built with R version 4.0.4 on a CentOS 7 operating system. Some packages from this project are not compatible with R versions prior to version 3.6.0.
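Before restoring the package library, it can help to confirm that the running R version meets the stated minimum. The following is a minimal sketch using base R only; the helper name is_supported_r is hypothetical and not part of this package.

```r
# Hypothetical helper (not part of this package): checks an R version
# against the minimum stated in this README (3.6.0).
is_supported_r <- function(version = getRversion(), minimum = "3.6.0") {
  as.numeric_version(version) >= as.numeric_version(minimum)
}

# Warn early instead of failing halfway through the package installation.
if (!is_supported_r()) {
  warning(
    "R ", as.character(getRversion()), " is older than 3.6.0; ",
    "some packages in this project may not install."
  )
}
```

The function also accepts a version string, for example is_supported_r("4.0.4"), which is convenient when checking a remote machine's R version by hand.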

To clone the project, a working installation of git is required. Open a terminal in the directory of your choice and execute:

git clone https://github.com/pat-s/2019-feature-selection.git

Then start R in the cloned directory and run:

renv::restore()
r_make()

Creating targets with {drake}

Calling r_make() builds the targets listed in drake_config(targets = <target>) in _drake.R, using the additional drake settings specified there.
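For orientation, a minimal _drake.R could look like the sketch below. The toy plan and its target names (toy_data, toy_model) are hypothetical stand-ins chosen for illustration; the project's actual plan is much larger and is defined in the repository.

```r
# _drake.R (illustrative sketch only, not the project's actual file)
library(drake)

# Hypothetical two-target plan standing in for the project's 400+ targets.
plan <- drake_plan(
  toy_data = mtcars,
  toy_model = lm(mpg ~ wt, data = toy_data)
)

# The file must end with a call to drake_config().
# `targets` restricts the build to the named targets and their dependencies.
drake_config(plan, targets = "toy_model")
```

With a file like this in the working directory, r_make() runs it in a fresh R session and builds toy_model together with its dependency toy_data.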

Out of the 400+ targets in this project, the following targets are important:

bm_aggregated: Aggregated benchmark results of all models using a 1 meter buffer for hyperspectral data extraction.
eda_wfr: Creates the report which shows Exploratory Data Analysis (EDA) plots and tables.
eval_performance_wfr: Creates the report which evaluates the model performances.
spectral_signatures_wfr: Creates the report which inspects the spectral signatures of the hyperspectral data.
feature_importance_wfr: Creates the report which inspects the feature importance of variables.
filter_correlations_wfr: Creates the report which inspects correlations among filter methods.
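Once a target has been built, it can be read back from the .drake cache. The sketch below assumes it is run from the project root after r_make() has finished; the wrapper name read_target is hypothetical, while cached() and readd() are drake functions.

```r
# Hypothetical wrapper: returns a built target from the drake cache,
# or NULL (with a message) if the cache or the target is not available.
read_target <- function(name) {
  if (!requireNamespace("drake", quietly = TRUE) || !dir.exists(".drake")) {
    message("No drake cache found here; run r_make() in the project root first.")
    return(NULL)
  }
  if (!name %in% drake::cached()) {
    message("Target '", name, "' has not been built yet.")
    return(NULL)
  }
  # character_only = TRUE lets the target name be passed as a string.
  drake::readd(name, character_only = TRUE)
}

bm_aggregated <- read_target("bm_aggregated")
```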

Note that most reports require some or all fitted models. Creating these (e.g. target benchmark_no_models) is a costly process that takes several days on an HPC system and considerably longer on a single machine.
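Given the cost of a full build, it may be worth checking which targets drake considers outdated before calling r_make(). This is a hedged sketch: the wrapper name outdated_targets is hypothetical, and it assumes the working directory is the project root containing _drake.R.

```r
# Hypothetical wrapper: lists the targets that r_make() would (re)build,
# or returns NULL when not run from a drake project root.
outdated_targets <- function() {
  if (!requireNamespace("drake", quietly = TRUE) || !file.exists("_drake.R")) {
    return(NULL)
  }
  # r_outdated() evaluates _drake.R in a separate R session.
  drake::r_outdated()
}

stale <- outdated_targets()
if (!is.null(stale)) {
  message(length(stale), " target(s) would be built by r_make().")
}
```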

Notes and resources

The organisation of this compendium was inspired by the works of Carl Boettiger and Ben Marwick.

pat-s/2019-feature-selection documentation built on Dec. 24, 2021, 8:40 a.m.
