R/data_titanic.R

#' Passengers and Crew on the RMS Titanic Data
#'
#' The \code{titanic} data is a complete list of passengers and crew members on the RMS Titanic.
#' It includes a variable indicating whether a person survived the sinking of the RMS
#' Titanic on April 15, 1912.
#'
#' This dataset was copied from the \code{stablelearner} package and underwent a few variable
#' transformations: levels in \code{embarked} were replaced with full names; \code{sibsp}, \code{parch} and \code{fare}
#' were converted to numerical variables; and values for crew were replaced with 0.
#' If you use this dataset, please cite the original package.
#'
#' From \code{stablelearner}: The website \url{https://www.encyclopedia-titanica.org} offers detailed information about passengers and crew
#' members on the RMS Titanic. According to the website, 1317 passengers and 890 crew members were aboard.
#' Eight musicians and nine employees of the shipyard company are listed as passengers, but travelled with a
#' free ticket, which is why they have \code{NA} values in \code{fare}. In addition to that, \code{fare}
#' is truly missing for a few regular passengers.
#'
#' \itemize{
#' \item gender a factor with levels \code{male} and \code{female}.
#' \item age a numeric value with the person's age on the day of the sinking.
#' \item class a factor specifying the class for passengers or the type of service aboard for crew members.
#' \item embarked a factor with the person's place of embarkment (Belfast/Cherbourg/Queenstown/Southampton).
#' \item country a factor with the person's home country.
#' \item fare a numeric value with the ticket price (\code{0} for crew members, musicians and employees of the shipyard company).
#' \item sibsp a numeric value specifying the number of siblings/spouses aboard; adopted from the Vanderbilt data set (see below).
#' \item parch a numeric value specifying the number of parents/children aboard; adopted from the Vanderbilt data set (see below).
#' \item survived a factor with two levels (\code{no} and \code{yes}) specifying whether the person has survived the sinking.
#' }
#'
#' NOTE: The \code{titanic_imputed} dataset uses the following imputation rules.
#' \itemize{
#' \item Missing \code{age} is replaced with the mean of the observed values, i.e., 30.
#' \item For \code{sibsp} and \code{parch}, missing values are replaced with the most frequently observed value, i.e., 0.
#' \item For \code{fare}, the mean fare for a given class is used, i.e., 0 pounds for crew, 89 pounds for the 1st, 22 pounds for the 2nd, and 13 pounds for the 3rd class.
#' }
#'
#' @docType data
#' @keywords titanic
#' @name titanic
#' @aliases titanic_imputed
#' @references \url{https://www.encyclopedia-titanica.org} and \url{https://CRAN.R-project.org/package=stablelearner}
#' @source This dataset was copied from the \code{stablelearner} package and underwent a few variable
#' transformations. The complete list of persons on the RMS Titanic was downloaded from
#' \url{https://www.encyclopedia-titanica.org} on April 5, 2016. The information given
#' in \code{sibsp} and \code{parch} was adopted from a data set obtained from \url{https://biostat.app.vumc.org/wiki/Main/DataSets}.
#' @usage
#' data(titanic)
#' data(titanic_imputed)
#' @format a data frame with 2207 rows and 9 columns
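#' @examples
#' # A minimal sketch of the imputation rules listed in the NOTE above. This is
#' # NOT the code that produced titanic_imputed: the helper name
#' # impute_sketch() and the toy data frame are hypothetical, and the results
#' # are left unrounded, unlike the rounded figures quoted in the NOTE.
#' impute_sketch <- function(df) {
#'   # age: replace missing values with the mean of the observed ages
#'   df$age[is.na(df$age)] <- mean(df$age, na.rm = TRUE)
#'   # sibsp, parch: replace missing values with the most frequent observed value
#'   for (v in c("sibsp", "parch")) {
#'     counts <- table(df[[v]])  # table() ignores NA by default
#'     most_frequent <- as.numeric(names(counts)[which.max(counts)])
#'     df[[v]][is.na(df[[v]])] <- most_frequent
#'   }
#'   # fare: replace missing values with the mean observed fare of the same class
#'   class_mean <- ave(df$fare, df$class, FUN = function(x) mean(x, na.rm = TRUE))
#'   missing_fare <- is.na(df$fare)
#'   df$fare[missing_fare] <- class_mean[missing_fare]
#'   df
#' }
#'
#' # Toy input (assumed values, for illustration only)
#' toy <- data.frame(
#'   age   = c(20, NA, 40),
#'   sibsp = c(0, 0, NA),
#'   parch = c(NA, 1, 1),
#'   fare  = c(10, NA, 30),
#'   class = factor(c("3rd", "3rd", "1st"))
#' )
#' impute_sketch(toy)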
NULL


DALEX documentation built on Jan. 16, 2023, 1:06 a.m.