data_titanic | R Documentation
This dataset contains information on 1309 passengers of the RMS Titanic. The goal is to predict survival based on 11 characteristics such as the travel class, age and sex of the passengers.
The original data source is https://www.kaggle.com/c/titanic/data
The data are split into a training set of 891 observations and a test set of 418 observations. The response in the test set was obtained by combining information from other data files, and has been verified by submitting it as a ‘prediction’ to Kaggle and obtaining a perfect score.
data("data_titanic")
A data frame with 1309 observations on the following variables.
a unique identifier for each passenger.
travel class of the passenger.
name of the passenger.
sex of the passenger.
age of the passenger.
number of siblings and spouses traveling with the passenger.
number of parents and children traveling with the passenger.
ticket number of the passenger.
fare paid for the ticket.
cabin number of the passenger.
Port of embarkation. Takes the values C (Cherbourg), Q (Queenstown) and S (Southampton).
y: factor indicating casualty or survivor.
dataType: vector taking the values “train” or “test” indicating whether the observation belongs to the training or the test data.
https://www.kaggle.com/c/titanic/data
data("data_titanic")
# Split into training and test sets; column 13 (dataType) is dropped.
traindata <- data_titanic[which(data_titanic$dataType == "train"), -13]
testdata <- data_titanic[which(data_titanic$dataType == "test"), -13]
str(traindata)
table(traindata$y)
# The data are used in:
## Not run:
vignette("Rpart_examples")
## End(Not run)
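The split on dataType can also be written without a hard-coded column index. The following is a minimal sketch, not part of the package: split_by_type is a hypothetical helper, and toy is a made-up stand-in data frame that only mimics the y and dataType columns so that the code runs without the dataset loaded.

```r
# Hypothetical helper (not provided by the package): split a data frame on
# its indicator column and drop that column from each part.
split_by_type <- function(df, indicator = "dataType") {
  stopifnot(indicator %in% names(df))
  keep <- setdiff(names(df), indicator)
  split(df[, keep, drop = FALSE], df[[indicator]])
}

# Toy stand-in with made-up values, for illustration only.
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = factor(c("a", "b", "a", "a", "b")),
  dataType = c("train", "train", "test", "train", "test")
)

parts <- split_by_type(toy)
sapply(parts, nrow)  # number of rows in each part
```

Applied to the real data as split_by_type(data_titanic), the train and test parts should, according to the description above, contain 891 and 418 rows respectively.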