HPC Job Scheduling Data
These data consist of information on 4331 jobs in a high performance computing environment. Seven attributes were recorded for each job along with a discrete class describing the execution time.
The predictors are:
Protocol (the type of computation),
Compounds (the number of data points for each job),
InputFields (the number of characteristics being estimated),
Iterations (maximum number of iterations for the computations),
NumPending (the number of other jobs pending at the time of launch),
Hour (decimal hour of day for launch time) and
Day (of launch time).
The classes are:
VF (very fast),
F (fast),
M (moderate) and
L (long).
a data frame with 4331 rows and 8 columns
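A quick way to check the stated dimensions and see how the jobs are spread over the classes is sketched below. It assumes the data set is loaded from the AppliedPredictiveModeling package, which this page does not state explicitly; adjust the `library()` call if your copy lives elsewhere.

```r
# Assumption: schedulingData is provided by the AppliedPredictiveModeling package
library(AppliedPredictiveModeling)
data(schedulingData)

dim(schedulingData)           # number of rows and columns
str(schedulingData)           # type of each predictor and of Class
table(schedulingData$Class)   # number of jobs in each execution-time class
```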
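# schedulingData ships with the AppliedPredictiveModeling package
library(AppliedPredictiveModeling)
library(caret)

data(schedulingData)

# Stratified 80/20 split on the class label
set.seed(1104)
inTrain <- createDataPartition(schedulingData$Class, p = .8, list = FALSE)

# Shift NumPending by one so that log10(NumPending) in the formula
# stays finite if a job was launched with no other jobs pending
schedulingData$NumPending <- schedulingData$NumPending + 1

trainData <- schedulingData[ inTrain, ]
testData  <- schedulingData[-inTrain, ]

modForm <- as.formula(Class ~ Protocol + log10(Compounds) +
                        log10(InputFields) + log10(Iterations) +
                        log10(NumPending) + Hour + Day)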
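The example above adds one to NumPending before the formula applies log10() to it. The page does not give a reason, but a plausible one is that a job launched with an empty queue has a count of zero, and log10(0) is -Inf. The small sketch below illustrates the effect with made-up counts; `num_pending` is a hypothetical vector, not a column from the data.

```r
# Hypothetical pending-job counts, not taken from schedulingData
num_pending <- c(0, 1, 9, 99)

raw_log     <- log10(num_pending)      # zero count maps to -Inf
shifted_log <- log10(num_pending + 1)  # every value is finite

is.finite(raw_log)
is.finite(shifted_log)
```

With the shift, a count of zero maps to zero on the log scale, so every job keeps a usable value for that predictor.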