nhanes: National Health and Nutrition Examination Survey


Description

The National Health and Nutrition Examination Survey (NHANES) is a research survey of adults and children across the United States that examines health and nutritional status nationwide. It is unique in that it combines interviews with physical examinations to produce its data. The survey is run by the National Center for Health Statistics (NCHS), a part of the Centers for Disease Control and Prevention.

Usage

nhanes

Format

A tibble with 10,000 observations and 15 variables:

gender

character variable with values "Male" and "Female"

survey

integer variable for the year in which the survey was conducted. Most tests were conducted across two years, so the earlier year is used.

age

numeric variable for age in years

race

character variable with values "White", "Black", "Hispanic", "Mexican", and "Other"

education

ordered factor variable with levels "Middle school" < "High school" < "Some college" < "College"

hh_income

ordered factor variable for household income groups with levels "0-4999" < ... < "over 99999"

weight

double variable for weight in kilograms

height

double variable for height in centimeters

bmi

double variable for body mass index

pulse

integer variable for pulse in beats per minute (bpm)

diabetes

integer variable with values 0/1 indicating whether the patient suffers from diabetes

general_health

integer variable for general health ranked from 1-5. 1 maps to "poor", 2 is "fair", 3 is "good", 4 is "very good", and 5 is "excellent".

depressed

ordered factor variable recording how often the patient feels depressed. Includes values "None", "Several", and "Most"

pregnancies

integer variable for number of pregnancies. NA if the individual, regardless of gender, hasn't experienced a pregnancy

sleep_night_hrs

integer variable for average hours of sleep per night
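As an illustration of how the coded columns might be used, the sketch below recodes general_health into its labels and recomputes body mass index from weight and height. The vectors are made-up values rather than rows from nhanes, and the BMI formula (kilograms divided by metres squared) is the standard definition, assumed here to match how bmi was derived.

```r
# Made-up values standing in for nhanes$general_health,
# nhanes$weight and nhanes$height (not taken from the data).
general_health <- c(1L, 3L, 5L, NA, 4L)
weight_kg <- c(70, 56.1, 88.9)
height_cm <- c(175, 156.8, 174.5)

# Map the codes 1-5 to the labels listed above; NA stays NA.
health_labels <- c("poor", "fair", "good", "very good", "excellent")
general_health_f <- factor(general_health, levels = 1:5,
                           labels = health_labels, ordered = TRUE)

# Standard BMI definition (an assumption about how bmi was derived):
# weight in kilograms divided by squared height in metres.
bmi_check <- weight_kg / (height_cm / 100)^2

print(general_health_f)
print(round(bmi_check, 2))
```

Keeping the factor ordered means comparisons such as general_health_f >= "good" work as expected.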

Details

To obtain a demographically diverse range of responses, minority groups were surveyed more heavily than their share of the population. As a result, the raw survey results have an inaccurate demographic makeup. To address this, some observations of more common groups are resampled in the data.
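The effect of this kind of resampling can be sketched with a toy example. The group names, population shares, and sample sizes below are hypothetical, and the code illustrates weighted resampling in general rather than the exact procedure used to build nhanes.

```r
set.seed(1)

# Hypothetical survey: group "B" is oversampled so that both groups have
# 500 respondents, although B is assumed to be 20% of the population.
survey <- data.frame(group = rep(c("A", "B"), each = 500),
                     stringsAsFactors = FALSE)

# Sampling weight = population share / sample share for each group.
pop_share    <- c(A = 0.8, B = 0.2)
sample_share <- table(survey$group) / nrow(survey)
survey$wt    <- unname(pop_share[survey$group]) /
  as.numeric(sample_share[survey$group])

# Resample rows with probability proportional to the weight, so rows from
# the more common group are drawn (and repeated) more often.
idx       <- sample(nrow(survey), size = 10000, replace = TRUE,
                    prob = survey$wt)
resampled <- survey[idx, , drop = FALSE]

print(round(prop.table(table(resampled$group)), 2))
```

After resampling, the group proportions should sit close to the assumed population shares instead of the 50/50 split in the raw sample.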

 

Table: Data summary

Name nhanes
Number of rows 10000
Number of columns 15
_______________________
Column type frequency:
character 2
factor 3
numeric 10
________________________
Group variables None

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
gender                 0              1    4    6      0         2           0
race                   0              1    5    8      0         5           0

Variable type: factor

skim_variable  n_missing  complete_rate  ordered  n_unique  top_counts
education           4877           0.51  TRUE            3  Som: 2267, Hig: 1517, Mid: 1339
hh_income            811           0.92  TRUE           12  ove: 2220, 750: 1084, 250: 958, 350: 863
depressed           3327           0.67  TRUE            3  Non: 5246, Sev: 1009, Mos: 418

Variable type: numeric

skim_variable   n_missing  complete_rate     mean     sd       p0      p25      p50      p75     p100  hist
survey                  0           1.00  2010.00   1.00  2009.00  2009.00  2010.00  2011.00  2011.00  ▇▁▁▁▇
age                     0           1.00    36.74  22.40     0.00    17.00    36.00    54.00    80.00  ▇▇▇▆▅
weight                 78           0.99    70.98  29.13     2.80    56.10    72.70    88.90   230.70  ▂▇▂▁▁
height                353           0.96   161.88  20.19    83.60   156.80   166.00   174.50   200.40  ▁▁▁▇▂
bmi                   366           0.96    26.66   7.38    12.88    21.58    25.98    30.89    81.25  ▇▆▁▁▁
pulse                1437           0.86    73.56  12.16    40.00    64.00    72.00    82.00   136.00  ▂▇▃▁▁
diabetes              142           0.99     0.08   0.27     0.00     0.00     0.00     0.00     1.00  ▇▁▁▁▁
general_health       2461           0.75     3.38   0.94     1.00     3.00     3.00     4.00     5.00  ▁▃▇▇▂
pregnancies          7396           0.26     3.03   1.80     1.00     2.00     3.00     4.00    32.00  ▇▁▁▁▁
sleep                2245           0.78     6.93   1.35     2.00     6.00     7.00     8.00    12.00  ▁▅▇▁▁
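The complete_rate column follows directly from n_missing and the number of rows. A minimal sketch of that arithmetic, using three n_missing values copied from the tables above (the vector names are illustrative only):

```r
# Row count and missing-value counts copied from the summary tables.
n_rows    <- 10000
n_missing <- c(education = 4877, pulse = 1437, pregnancies = 7396)

# complete_rate is the share of non-missing observations.
complete_rate <- round(1 - n_missing / n_rows, 2)

print(complete_rate)
```

The results agree with the complete_rate values reported for education (0.51), pulse (0.86), and pregnancies (0.26).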

Author(s)

David Kane

Source

https://cran.r-project.org/web/packages/NHANES/NHANES.pdf


davidkane9/PPBDS.data documentation built on Nov. 18, 2020, 1:17 p.m.