xvalReco: Cross Validation

View source: R/xvalReco.R

xvalMM, xvalMLE, xvalReco, xvalCos, xvalMultiplic, getTestSet, getTrainSet, plot.xvalb    R Documentation

Cross Validation

Description

Cross-validation for the various methods in this package, with parallel computation capability in some cases.

Usage

        xvalMM(ratingsIn, trainprop = 0.5, minN = 0)
        xvalMLE(ratingsIn, trainprop = 0.5, cls = NULL)
        xvalReco(ratingsIn, trainprop = 0.5, cls = NULL, rnk = 10)
        xvalCos(ratingsIn, k, usrCovs = NULL, itmCats = NULL, wtcovs = NULL,
           wtcats = NULL, trainprop = 0.5)
        xvalMultiplic(ratingsIn)
        getTrainSet(ratingsIn, trainprop)
        getTestSet(ratingsIn, trainSet)
        plot.xvalb(xvalObj, whichIdxs = NULL)

Arguments

ratingsIn

Input data frame. Each row has the form UserID, ItemID, rating.

minN

Applies to situations in which covariates are present. When predicting for user i, either Yi. or the regression-based prediction is used, depending on whether N_i >= minN.

trainprop

The fraction of ratingsIn to use for the training set.

cls

R parallel cluster.

rnk

Desired rank for recosystem analysis.

xvalObj

An object of class 'xvalb'.

whichIdxs

A vector of indices of rows of the test set to be used in plotting.

Details

These functions perform cross-validation using the various methods in this package. Several measures of prediction accuracy are output (see Value). These include the accuracy obtained by simply predicting a constant, which lets one ask: are we predicting better with our model than by chance?

The functions getTrainSet and getTestSet are helper functions to generate the training and test sets.
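As a rough illustration of what such helpers do, the following base-R sketch draws a random training set and takes the remaining rows as the test set. The names sketchTrainSet and sketchTestSet and the toy data are hypothetical, and the package's own getTrainSet and getTestSet may be implemented differently.

```r
# Hypothetical helpers, NOT the rectools source: a base-R sketch of a
# random train/test split on a (UserID, ItemID, rating) data frame.
sketchTrainSet <- function(ratingsIn, trainprop = 0.5) {
  nTrain <- floor(trainprop * nrow(ratingsIn))
  trainIdxs <- sample(seq_len(nrow(ratingsIn)), nTrain)
  ratingsIn[trainIdxs, , drop = FALSE]
}

sketchTestSet <- function(ratingsIn, trainSet) {
  # Rows of ratingsIn that were not drawn into trainSet, matched by row name
  inTrain <- rownames(ratingsIn) %in% rownames(trainSet)
  ratingsIn[!inTrain, , drop = FALSE]
}

# Toy data (made up for illustration): 5 users rating 4 items each
set.seed(1)
toy <- data.frame(UserID = rep(1:5, each = 4),
                  ItemID = rep(1:4, times = 5),
                  rating = sample(1:5, 20, replace = TRUE))
trn <- sketchTrainSet(toy, trainprop = 0.5)
tst <- sketchTestSet(toy, trn)
```

Matching on row names assumes the input data frame has unique row names, which is the default for a data frame built as above.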

The function plot.xvalb is a method for the generic function plot. It plots the estimated density of the predicted ratings, and a smoothed scatter plot of the predicted ratings against the actual ones. If whichIdxs is specified, the user can choose to plot only a subset of the data, say rows corresponding to large values of a covariate.
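The two displays described above can be approximated in base R as follows. This is only a sketch: sketchPlot is a hypothetical name, the simulated ratings are made up, and the use of lowess for the smoothed scatter plot is an assumption, since the smoother used by plot.xvalb is not stated here.

```r
# Hypothetical sketch, NOT the plot.xvalb source: density of predicted
# ratings plus a scatter plot of predicted vs. actual with a lowess curve.
sketchPlot <- function(actuals, preds, whichIdxs = NULL) {
  if (!is.null(whichIdxs)) {
    # Restrict both vectors to the requested test-set rows
    actuals <- actuals[whichIdxs]
    preds <- preds[whichIdxs]
  }
  ok <- !is.na(preds)  # drop rows for which no prediction was possible
  dens <- density(preds[ok])
  oldpar <- par(mfrow = c(1, 2))
  on.exit(par(oldpar))
  plot(dens, main = "Density of predicted ratings")
  plot(actuals[ok], preds[ok], xlab = "actual", ylab = "predicted")
  lines(lowess(actuals[ok], preds[ok]), col = "red")
  invisible(dens)
}

# Simulated data (made up for illustration)
set.seed(2)
actuals <- sample(1:5, 200, replace = TRUE)
preds <- pmin(pmax(actuals + rnorm(200, sd = 0.8), 1), 5)

pdf(NULL)  # null device, so the sketch also runs without a display
dens <- sketchPlot(actuals, preds, whichIdxs = which(actuals >= 3))
invisible(dev.off())
```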

Value

The xval functions return an object of class 'xvalb', with the following components:

  • ndata: Number of rows in the original input data

  • trainprop: As above.

  • numpredna: Number of rows in the test set for which prediction was not possible.

  • acc: Accuracy measures; see below.

  • idxs: Indices in the original input data selected for the test set.

  • actuals: The actual ratings in the test set.

  • preds: The predicted ratings in the test set.

The acc component is an R list with these elements:

  • exact: Proportion of ratings predicted exactly correctly.

  • mad: Mean absolute prediction error.

  • rms: L2 ("root mean squared") prediction error.

  • overallexact: Proportion of ratings predicted exactly correctly by simply taking our guess to be the (rounded) overall mean item rating.

  • overallmad: Mean absolute prediction error resulting from simply taking our guess to be the overall mean item rating.

  • overallrms: L2 prediction error resulting from simply taking our guess to be the overall mean item rating.
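The measures in this list can be written out in a few lines of base R. The sketch below is illustrative only: sketchAcc is a hypothetical name, and several choices are assumptions rather than facts about the package, namely that predictions are rounded before the exact-match comparison, that rows with NA predictions are dropped from every measure, and that the constant guess is the mean of the training ratings.

```r
# Hypothetical sketch, NOT the rectools source: accuracy measures for a
# model's predictions and for a constant (overall mean) guess.
sketchAcc <- function(actuals, preds, trainRatings) {
  ok <- !is.na(preds)              # skip rows with no prediction
  a <- actuals[ok]
  p <- preds[ok]
  overallMean <- mean(trainRatings)  # assumed source of the constant guess
  list(
    exact = mean(round(p) == a),
    mad = mean(abs(p - a)),
    rms = sqrt(mean((p - a)^2)),
    overallexact = mean(round(overallMean) == a),
    overallmad = mean(abs(overallMean - a)),
    overallrms = sqrt(mean((overallMean - a)^2))
  )
}

# Toy values (made up for illustration)
actuals <- c(4, 2, 5, 3)
preds <- c(3.6, 2.0, NA, 4.0)
acc <- sketchAcc(actuals, preds, trainRatings = c(3, 4, 2, 3))
```

Comparing acc$mad with acc$overallmad, or acc$rms with acc$overallrms, shows whether the model beats the constant guess on this toy data.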

Author(s)

Pooja Rajkumar and Norm Matloff

Examples

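       library(rectools)
       # InstEval ships with the lme4 package
       data(InstEval, package = "lme4")
       ivl <- InstEval
       # Convert the s and d factors to numeric IDs
       ivl$s <- as.numeric(ivl$s)
       ivl$d <- as.numeric(ivl$d)
       # Keep columns 1, 2 and 7 as UserID, ItemID, rating
       ivlsdy <- ivl[, c(1, 2, 7)]
       # Cross-validate with xvalReco
       res <- xvalReco(ivlsdy)
       # Accuracy measures from the xvalReco run
       res$acc
       # Cross-validate with xvalMM
       xvoutmm <- xvalMM(ivlsdy)
       xvoutmm$acc
       # plot(xvoutmm)
       # Cross-validate with xvalCos, k = 5; takes time
       xvoutcos5 <- xvalCos(ivlsdy, 5)
       xvoutcos5$acc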

matloff/rectools documentation built on March 31, 2022, 12:09 p.m.