R/breast_cancer.R
In ODRF: Oblique Decision Random Forest for Classification and Regression

#' Breast Cancer Dataset
#'
#' Breast cancer is the most common cancer amongst women in the world. It accounts for \eqn{25\%} of all cancer cases, and affected over 2.1 Million people in 2015 alone.
#' It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen via X-ray or felt as lumps in the breast area.
#' The key challenges against it's detection is how to classify tumors into malignant (cancerous) or benign(non cancerous).
#'
#' The actual linear program used to obtain the separating plane in the 3-dimensional space is that described in:
#' \itemize{
#' \item ID number
#' \item Diagnosis (M = malignant, B = benign)
#' \item Ten real-valued features are computed for each cell nucleus:
#' \itemize{
#' \item radius (mean of distances from center to points on the perimeter)
#' \item texture (standard deviation of gray-scale values)
#' \item perimeter
#' \item area
#' \item smoothness (local variation in radius lengths)
#' \item compactness (perimeter^2 / area - 1.0)
#' \item concavity (severity of concave portions of the contour)
#' \item concave points (number of concave portions of the contour)
#' \item symmetry
#' \item fractal dimension ("coastline approximation" - 1)
#' }}
#'
#' @docType data
#' @keywords datasets internal
#' @format A data frame with 569 rows and 30 covariate variables and 1 response variable
#' @source \url{https://www.kaggle.com/datasets/yasserh/breast-cancer-dataset?select=breast-cancer.csv} and \url{https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(diagnostic)}
#' @references Wolberg WH, Street WN, Mangasarian OL. Machine learning techniques to diagnose breast cancer from image-processed nuclear features of fine needle aspirates. Cancer Lett. 1994 Mar 15;77(2-3):163-71.
#' @name breast_cancer
#'
#' @seealso  \code{\link{body_fat}} \code{\link{seeds}}
#' @examples
#' data(breast_cancer)
#' set.seed(221212)
#' train <- sample(1:569, 80)
#' train_data <- data.frame(breast_cancer[train, -1])
#' test_data <- data.frame(breast_cancer[-train, -1])
#'
#' forest <- ODRF(diagnosis ~ ., train_data, split = "gini", parallel = FALSE, ntrees = 50)
#' pred <- predict(forest, test_data[, -1])
#' # classification error
#' (mean(pred != test_data[, 1]))
#'
#' tree <- ODT(diagnosis ~ ., train_data, split = "gini")
#' pred <- predict(tree, test_data[, -1])
#' # classification error
#' (mean(pred != test_data[, 1]))
NULL

Any scripts or data that you put into this service are public.

ODRF documentation built on May 31, 2023, 8:22 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

ODRF
Oblique Decision Random Forest for Classification and Regression

R/breast_cancer.R
In ODRF: Oblique Decision Random Forest for Classification and Regression

Try the ODRF package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

ODRF Oblique Decision Random Forest for Classification and Regression

R/breast_cancer.R In ODRF: Oblique Decision Random Forest for Classification and Regression

Try the ODRF package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

ODRF
Oblique Decision Random Forest for Classification and Regression

R/breast_cancer.R
In ODRF: Oblique Decision Random Forest for Classification and Regression