heartdisease | R Documentation |
The dataset is to be used with a supervised classification ML model to classify heart disease.
heartdisease
A data frame with 918 rows and 10 variables:
age of the patient presenting with heart disease
gender of the patient
blood pressure for resting heart beat
Cholesterol reading
blood sample of glucose after a patient fasts https://www.diabetes.co.uk/diabetes_care/fasting-blood-sugar-levels.html
Resting echocardiography is an indicator of previous myocardial infarction e.g. heart attack
Maximum heart rate
chest pain caused by decreased flood flow https://www.nhs.uk/conditions/angina/
reading at the peak of the heart rate
the classification label of whether patient has heart disease or not
Collected by Gary Hutson hutsons-hacks@outlook.com, Dec-2021
library(dplyr) library(ConfusionTableR) data(heartdisease) # Convert diabetes data to factor' hd <- heartdisease %>% glimpse() %>% mutate(HeartDisease = as.factor(HeartDisease)) # Check that the label is now a factor is.factor(hd$HeartDisease) # Dummy encoding # Get categorical columns hd_cat <- hd %>% dplyr::select_if(is.character) # Dummy encode the categorical variables # Specify the columns to encode cols <- c("RestingECG", "Angina", "Sex") # Dummy encode using dummy_encoder in ConfusionTableR package coded <- ConfusionTableR::dummy_encoder(hd_cat, cols, remove_original = TRUE) coded <- coded %>% select(RestingECG_ST, RestingECG_LVH, Angina=Angina_Y, Sex=Sex_F) # Remove column names we have encoded from original data frame hd_one <- hd[,!names(hd) %in% cols] # Bind the numerical data on to the categorical data hd_final <- bind_cols(coded, hd_one) # Output the final encoded data frame for the ML task glimpse(hd_final)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.