drml: Dimension Reduction in Image Classification


Description

Functions for the "dimension reduction + machine learning" approach to image classification.

Usage

   drmlTDAsweep(data, yName, qeFtnName, opts = NULL, RGB = FALSE, pixAug = 0, 
       tdasAug = 0, holdout = floor(min(1000, 0.1 * nrow(imgs))), 
       nr = 0, nc = 0, thresh = c(50, 100, 150), intervalWidth = 2) 
   drmlPCA(data, yName, qeFtnName, opts = NULL, dataAug = NULL, 
       holdout = floor(min(1000, 0.1 * nrow(data))), pcaProp) 
   drmlUMAP(data, yName, qeFtnName, opts = NULL, dataAug = NULL, 
       holdout = floor(min(1000, 0.1 * nrow(data))), nComps = 25) 
   drmlDCT(data, yName, qeFtnName, opts = NULL, dataAug = NULL, 
       holdout = floor(min(1000, 0.1 * nrow(data))), nFreqs) 
   drmlRLRN(data, yName, qeFtnName, opts = NULL, RGB = FALSE, pixAug = 0, 
       holdout = floor(min(1000, 0.1 * nrow(imgs))), nr = 0, nc = 0, 
       thresh = c(50, 100, 150)) 

Arguments

data

Data frame, one image per row, with the pixels of each image stored in row-major order. For color images, there are 3 sets of columns, one for each of the 3 primary colors.

yName

Name of the column within data that stores the image labels, an R factor.

qeFtnName

Name of the function from the qeML package to be used in the "ML" portion of "DR+ML."

opts

Options for qeFtnName.

RGB

TRUE for color, FALSE for grayscale.

pixAug

Number of images to add via data augmentation, between the DR and ML stages.

holdout

Size of holdout set.

nr

Number of pixel rows within an image.

nc

Number of pixel columns within an image.

thresh

Vector specifying the threshold values. If this is a negative scalar -m, then m threshold values will be generated, partitioning [0,255] into m+1 equal parts.
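
The negative-scalar form of thresh can be illustrated with a minimal sketch. The helper makeThresh below is hypothetical and not part of the package; it assumes the generated thresholds are the interior cut points of an equal partition of [0,255], with no rounding. The package itself may round or place the values differently.

```r
# Hypothetical helper (not from dimRedImage): expand a 'thresh' argument.
# A negative scalar -m yields m interior cut points that split [0, 255]
# into m + 1 equal parts; any other value is returned unchanged.
makeThresh <- function(thresh) {
  if (length(thresh) == 1 && thresh < 0) {
    m <- -thresh
    # m + 2 equally spaced points on [0, 255], then drop the two endpoints
    cuts <- seq(0, 255, length.out = m + 2)
    return(cuts[-c(1, m + 2)])
  }
  thresh
}

th7 <- makeThresh(-7)               # 7 cut points, spaced 255 / 8 = 31.875 apart
th3 <- makeThresh(c(50, 100, 150))  # explicit vector is passed through as is
print(th7)
```

Under these assumptions, thresh = -7 gives seven thresholds from 31.875 to 223.125, so each of the eight intervals has width 31.875.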

Details

Dimension reduction is done on the pixel data, after which the ML method is applied. If data augmentation is requested, this is performed on the dimension-reduced data, before applying ML. This should yield a speedup over doing data augmentation before dimension reduction. Half the augmented images are horizontal flips, half vertical.
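
To make the row-major layout and the two flip types concrete, here is a small sketch on raw pixels. It is only an illustration: the functions documented here apply augmentation to the dimension-reduced data, not to the pixels, and the helper names rowMajorToMatrix, flipH and flipV are assumptions introduced for this example.

```r
# Hypothetical helpers (not from dimRedImage) illustrating row-major storage
# and horizontal/vertical flips of a single grayscale image.

# Reshape a row-major pixel vector into an nr x nc matrix.
rowMajorToMatrix <- function(pix, nr, nc) {
  matrix(pix, nrow = nr, ncol = nc, byrow = TRUE)
}

# Horizontal flip: reverse the column order (mirror left-right).
flipH <- function(img) img[, ncol(img):1, drop = FALSE]

# Vertical flip: reverse the row order (mirror top-bottom).
flipV <- function(img) img[nrow(img):1, , drop = FALSE]

# Convert a matrix back to a row-major vector.
matrixToRowMajor <- function(img) as.vector(t(img))

pix <- 1:6                          # a toy 2 x 3 "image": rows (1,2,3), (4,5,6)
img <- rowMajorToMatrix(pix, nr = 2, nc = 3)
hPix <- matrixToRowMajor(flipH(img))  # 3 2 1 6 5 4
vPix <- matrixToRowMajor(flipV(img))  # 4 5 6 1 2 3
```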

Value

If holdout is nonzero, the data are first randomly partitioned into training and validation sets, and the overall misclassification rate on the validation set is reported in the testAcc component of the return value.
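
The holdout logic can be sketched in base R as follows. This is not the package's internal code: the toy data, the majority-class stand-in for the qeML function, and the variable names are all assumptions made for illustration. Only the default holdout size formula is taken from the Usage section.

```r
# Toy data (hypothetical): 50 "images" of 4 pixels each, plus a factor label.
set.seed(1)
toy <- data.frame(matrix(sample(0:255, 200, replace = TRUE), nrow = 50))
toy$label <- factor(sample(c("a", "b"), 50, replace = TRUE))

# Default holdout size, as in the Usage section: at most 1000 rows,
# otherwise 10% of the data.
holdout <- floor(min(1000, 0.1 * nrow(toy)))

# Random partition into training and holdout (validation) rows.
idxs <- sample(nrow(toy), holdout)
trn <- toy[-idxs, ]
tst <- toy[idxs, ]

# Stand-in "classifier": always predict the most frequent training label.
majority <- names(which.max(table(trn$label)))
preds <- rep(majority, nrow(tst))

# Overall misclassification rate on the holdout set, i.e. the proportion
# of holdout cases whose predicted label differs from the true label.
testAcc <- mean(preds != as.character(tst$label))
```

Note that, despite its name, testAcc as described above holds a misclassification rate, so smaller values are better.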

New cases can be classified with the generic predict function (only for TDAsweep as of now).

Author(s)

Norm Matloff, Yu-Shih Chen, Melissa Goh

Examples

## Not run: 

data(hm)  # histology MNIST, built-in dataset
tdasOut <- drmlTDAsweep(hm, 'label', 'qeRF', nr = 28, nc = 28, thresh = -7)
tdasOut$testAcc
# 0.216, 22


## End(Not run)

matloff/dimRedImage documentation built on Dec. 21, 2021, 2:53 p.m.