dataM: Example dataset
In nmm: Nonlinear Multivariate Models


The "MathPlacement" data set, taken from the Stat2Data package and modified as shown in the examples.

data(dataM)

A data frame containing:

Student: Identification number for each student
Gender: 0=Female, 1=Male
PSATM: PSAT score in Math
SATM: SAT score in Math
ACTM: ACT score in Math
Rank: Adjusted rank in HS class
Size: Number of students in HS class
GPAadj: Adjusted GPA
PlcmtScore: Score on math placement exam
Recommends: Recommended course: R0 R01 R1 R12 R2 R3 R4 R6 R8
Course: Actual course taken
Grade: Course grade
RecTaken: 1=recommended course, 0=otherwise
TooHigh: 1=took course above recommended, 0=otherwise
TooLow: 1=took course below recommended, 0=otherwise
CourseSuccess: 1=B or better grade, 0=grade below B
DR_Course: Level of the course taken relative to the recommendation: alow = lower than recommended, bnormal = recommended, chigh = higher than recommended
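
The relationship between the three indicator columns and DR_Course can be illustrated with a small sketch. The toy data frame below is made up for illustration and is not taken from dataM. The sketch also shows one plausible reason for the a/b/c prefixes, which is an assumption and not stated in the source: factor() sorts levels alphabetically by default, so the prefixes keep the levels in the order lower, recommended, higher.

```r
# Hypothetical toy rows; the values are invented, not taken from dataM.
toy <- data.frame(
  RecTaken = c(1, 0, 0, 1),
  TooHigh  = c(0, 1, 0, 0),
  TooLow   = c(0, 0, 1, 0)
)

# Derive the label from the indicators: TooHigh is checked first,
# then TooLow, then RecTaken; anything else is labelled "dother".
toy$DR_Course <- with(toy,
  ifelse(TooHigh == 1, "chigh",
  ifelse(TooLow == 1, "alow",
  ifelse(RecTaken == 1, "bnormal", "dother"))))

# factor() orders levels alphabetically by default, so the a/b/c
# prefixes give the order lower < recommended < higher.
toy$DR_Course <- factor(toy$DR_Course)
levels(toy$DR_Course)
table(toy$DR_Course)
```

With real data the same two calls, levels() and table(), give a quick check that only the three expected categories are present.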

The code used to derive dataM from the original MathPlacement data can be found in the Examples section.

data(dataM)
library(magrittr)
library(dplyr)
if (requireNamespace("recipes", quietly = TRUE) && requireNamespace("Stat2Data", quietly = TRUE)) {
data("MathPlacement", package="Stat2Data")
head(MathPlacement)
library(recipes)
# As some of the data is missing, k-nearest neighbors (knn) imputation is 
# used to fill the gaps. This is done with the recipes package and the 
# function step_impute_knn() (named step_knnimpute() in older recipes releases).
dataM <- recipe(~ ., data = MathPlacement) %>%
  step_impute_knn(everything()) %>% prep() %>% juice()
# Afterwards we create a categorical variable that will show whether a 
# student took a course which was too high, too low, the recommended one or
# something else happened:
dataM %<>% mutate(Student = 1:n(), DR_Course = case_when(
  TooHigh == 1 ~ "chigh",
  TooLow == 1 ~ "alow",
  RecTaken == 1 ~ "bnormal",
  TRUE ~ "dother"
))
# We remove observations with ambiguous course status:
dataM %<>% filter(DR_Course!="dother")
dataM %>% select(DR_Course) %>% table %>% t 
}
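
To make the idea behind the imputation step more concrete, here is a minimal base R sketch of k-nearest neighbors imputation on a made-up toy data frame. It is only an illustration of the general technique under simplifying assumptions (one numeric predictor, absolute distance, k = 2, mean of the neighbors); it is not how the recipes package implements the step, which uses its own distance measure and defaults.

```r
# Hypothetical toy data with one missing value in column y.
toy <- data.frame(
  x = c(1, 2, 3, 10, 11),
  y = c(10, 20, NA, 100, 110)
)

k    <- 2                         # number of neighbors (assumed for this sketch)
miss <- which(is.na(toy$y))       # rows that need imputation
obs  <- which(!is.na(toy$y))      # rows that can serve as donors

for (i in miss) {
  d  <- abs(toy$x[obs] - toy$x[i])     # distance on the fully observed column
  nn <- obs[order(d)[seq_len(k)]]      # row indices of the k closest donors
  toy$y[i] <- mean(toy$y[nn])          # fill the gap with the donors' mean
}
toy
```

In this toy case the row with x = 3 is closest to the rows with x = 2 and x = 1, so its missing y is filled with the mean of their y values.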