Use knn() to fit the 1NN classifier to the last column of the sonar training data.
setwd("./data")
## download.file("nice redirects google","sonar_train.csv", method="curl")
## download.file("https://sites.google.com/site/stats202/data/sonar_train.csv",
##               "sonar_test.csv",method="curl")
train<-read.csv("sonar_train.csv",header=FALSE)
test<-read.csv("sonar_test.csv",header=FALSE)
setwd("../")
Load libraries.
library(class)
Using default values.
## Set the y var as a factor for class prediction.
y<-as.factor(train[,61])
## Use the remaining columns as the predictors.
x<-train[,1:60]
## Fit the model to predict y using all of x.
fit<-knn(train=x,test=x,cl=y,k=1)
## Note: both the train and test parms use the same data set, in this case the
## training data. This gives the training misclass error.
Assess fit on training data. (Really silly b/c you're using the training data to predict itself. Further, w/ k=1 each point is its own nearest neighbor, so it's not doing any real fitting.)
fit
## fit gives the predicted class labels of the training data.
1-sum(y==fit)/length(y)
## sum(y==fit) gives the number of times your prediction (fit) matched your actual
## class label (y). Dividing by length(y) gives the overall accuracy, and 1 minus
## that is the overall error rate. Zero in this case.
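The error-rate calculation above can be wrapped in a small helper. This is only a sketch: `misclass_error` is a hypothetical name, and the data below is a synthetic stand-in for the sonar file (an assumption, so the snippet runs without any CSV).

```r
library(class)

## Hypothetical helper: fraction of predictions that differ from the true labels.
misclass_error <- function(actual, predicted) {
  mean(actual != predicted)
}

## Synthetic stand-in for the sonar data (assumption: 50 rows, 4 numeric columns).
set.seed(1)
x_demo <- as.data.frame(matrix(rnorm(50 * 4), ncol = 4))
y_demo <- factor(rep(c("M", "R"), each = 25))

## Predict the training data from itself with k=1.
fit_demo <- knn(train = x_demo, test = x_demo, cl = y_demo, k = 1)
train_err <- misclass_error(y_demo, fit_demo)
train_err
```

With k=1 and no duplicated rows, each point's nearest neighbor is itself, so the training error comes out as zero here too.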
Compute the misclass error on the training and test data.
y_test<-as.factor(test[,61])
x_test<-test[,1:60]
fit_test<-knn(x,x_test,y,k=1)
1-sum(y_test==fit_test)/length(y_test)
## How often do my predicted test labels (fit_test) differ from my true test
## labels (y_test)?
## 0.21
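A single error rate hides which class is being misclassified. One way to see that, sketched here on synthetic data (an assumption, not the sonar files), is to cross-tabulate true and predicted labels with table(). The variable names are made up for the illustration.

```r
library(class)

## Synthetic two-class data (assumption): class "M" is shifted in column 1.
set.seed(2)
n <- 60
x_all <- as.data.frame(matrix(rnorm(n * 4), ncol = 4))
y_all <- factor(rep(c("M", "R"), times = n / 2))
x_all[y_all == "M", 1] <- x_all[y_all == "M", 1] + 2

## Hold out the last 20 rows as a stand-in test set.
train_idx <- 1:40
pred <- knn(x_all[train_idx, ], x_all[-train_idx, ], y_all[train_idx], k = 1)
truth <- y_all[-train_idx]

## Rows are true labels, columns are predictions.
conf_mat <- table(actual = truth, predicted = pred)
conf_mat

## Off-diagonal counts are the errors, so this matches 1 - accuracy.
test_err <- 1 - sum(diag(conf_mat)) / sum(conf_mat)
test_err
```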
So, when k=1, train error = 0, test error = 0.21... slightly better than decision trees. Not bad w/ 60-dimensional data.
So, let's try using k=5.
fit_test<-knn(x,x_test,y,k=5)
1-sum(y_test==fit_test)/length(y_test)
## 0.23
Could be we're generalizing away from the signal. Try k=3.
fit_test<-knn(x,x_test,y,k=3)
1-sum(y_test==fit_test)/length(y_test)
## 0.18
k=3 looks like a balance between over-fitting (k=1) and over-generalizing (k=5). Now would be a good time to bring in another test data set, since k was picked by looking at the test error and 0.18 is likely optimistic.
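If no extra test set is at hand, one alternative (not what was done above) is to pick k by leave-one-out cross-validation on the training data only, using knn.cv() from the class package. A minimal sketch on synthetic data (an assumption, not the sonar files), with made-up variable names:

```r
library(class)

## Synthetic two-class training data (assumption): class "M" is shifted in column 1.
set.seed(3)
n <- 80
x_cv <- as.data.frame(matrix(rnorm(n * 4), ncol = 4))
y_cv <- factor(rep(c("M", "R"), times = n / 2))
x_cv[y_cv == "M", 1] <- x_cv[y_cv == "M", 1] + 2

## Leave-one-out CV error for each candidate k; the test set is never touched.
ks <- c(1, 3, 5, 7)
cv_err <- sapply(ks, function(k) {
  pred <- knn.cv(train = x_cv, cl = y_cv, k = k)
  mean(pred != y_cv)
})
names(cv_err) <- paste0("k=", ks)
cv_err

## Candidate with the lowest CV error (first one wins on ties).
best_k <- ks[which.min(cv_err)]
best_k
```

The test data can then be used once, with the chosen k, to report the error.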