pmml.neighbr: Generate PMML for a neighbr object from the *neighbr*...

Description Usage Arguments Details Value See Also Examples

View source: R/pmml.neighbr.R

Description

Generate PMML for a neighbr object from the neighbr package.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
## S3 method for class 'neighbr'
pmml(
  model,
  model_name = "kNN_model",
  app_name = "SoftwareAG PMML Generator",
  description = "K Nearest Neighbors Model",
  copyright = NULL,
  transforms = NULL,
  missing_value_replacement = NULL,
  ...
)

Arguments

model

A neighbr object.

model_name

A name to be given to the PMML model.

app_name

The name of the application that generated the PMML.

description

A descriptive text for the Header element of the PMML.

copyright

The copyright notice for the model.

transforms

Data transformations.

missing_value_replacement

Value to be used as the 'missingValueReplacement' attribute for all MiningFields.

...

Further arguments passed to or from other methods.

Details

The model is represented in the PMML NearestNeighborModel format.

The current version of this converter does not support transformations (transforms must be left as NULL), sets categoricalScoringMethod to "majorityVote", sets continuousScoringMethod to "average", and isTransoformed to "false".

Value

PMML representation of the neighbr object.

See Also

pmml, PMML KNN specification

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
## Not run: 

# Continuous features with continuous target, categorical target,
# and neighbor ranking:

library(neighbr)
data(iris)

# Add an ID column to the data for neighbor ranking:
iris$ID <- c(1:150)

# Train set contains all predicted variables, features, and ID column:
train_set <- iris[1:140, ]

# Omit predicted variables or ID column from test set:
test_set <- iris[141:150, -c(4, 5, 6)]

fit <- knn(
  train_set = train_set, test_set = test_set,
  k = 3,
  categorical_target = "Species",
  continuous_target = "Petal.Width",
  comparison_measure = "squared_euclidean",
  return_ranked_neighbors = 3,
  id = "ID"
)

fit_pmml <- pmml(fit)


# Logical features with categorical target and neighbor ranking:

library(neighbr)
data("houseVotes84")

# Remove any rows with N/A elements:
dat <- houseVotes84[complete.cases(houseVotes84), ]

# Change all {yes,no} factors to {0,1}:
feature_names <- names(dat)[!names(dat) %in% c("Class", "ID")]
for (n in feature_names) {
  levels(dat[, n])[levels(dat[, n]) == "n"] <- 0
  levels(dat[, n])[levels(dat[, n]) == "y"] <- 1
}

# Change factors to numeric:
for (n in feature_names) {
  dat[, n] <- as.numeric(levels(dat[, n]))[dat[, n]]
}

# Add an ID column for neighbor ranking:
dat$ID <- c(1:nrow(dat))

# Train set contains features, predicted variable, and ID:
train_set <- dat[1:225, ]

# Test set contains features only:
test_set <- dat[226:232, !names(dat) %in% c("Class", "ID")]

fit <- knn(
  train_set = train_set, test_set = test_set,
  k = 5,
  categorical_target = "Class",
  comparison_measure = "jaccard",
  return_ranked_neighbors = 3,
  id = "ID"
)

fit_pmml <- pmml(fit)

## End(Not run)

pmml documentation built on Jan. 16, 2021, 5:30 p.m.