NHANES III data
17030 observations (rows) and 16 variables (columns).
A subset of data from the Third National Health and Nutrition
Examination Survey (NHANES III). Subjects aged >= 20 are included.
A sample of 39,695 subjects was selected, representing more than 250 million people living in the USA. Data were collected from 1988 to 1994.
49 pseudo-strata were created, with 2 pseudo-PSUs (primary sampling units) in each stratum.
This is a subset of the original dataset.
Respondent sequence number.
Pseudo-PSU (primary sampling unit).
Statistical weight. Range 225.93 to 139744.9.
Body weight (lbs).
Standing height (inches).
Average Systolic BP.
Average Diastolic BP.
Has respondent smoked >100 cigarettes in life?
Does respondent smoke cigarettes now?
never (HAR1 = 2)
>100 cigs (HAR1 = 1 & HAR3 = 2)
current (HAR1 = 1 & HAR3 = 1)
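The three smoking categories above can be derived from the two raw codes. The following is a minimal sketch in Python, assuming HAR1 and HAR3 are stored as the integers 1 and 2; the function name `smoking_status` is hypothetical and not part of the dataset.

```python
def smoking_status(har1, har3):
    """Map the HAR1/HAR3 codes to the three categories listed above."""
    if har1 == 2:
        return "never"
    if har1 == 1 and har3 == 2:
        return ">100 cigs"
    if har1 == 1 and har3 == 1:
        return "current"
    # Any other combination (for example a missing code) is left unclassified.
    return None


print(smoking_status(1, 2))
```

Combinations not covered by the three rules return `None` rather than being forced into a category.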
Serum cholesterol (mg/100ml).
High blood pressure?
yes (PEPMNK1R > 140)
no (PEPMNK1R <= 140)
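The high blood pressure indicator follows directly from the threshold rule above. This is a small Python sketch of that rule; the function name `high_blood_pressure` and the handling of missing readings are assumptions made for illustration.

```python
def high_blood_pressure(pepmnk1r):
    """Apply the rule above: 'yes' if PEPMNK1R > 140, otherwise 'no'."""
    if pepmnk1r is None:
        # Assumption: a missing reading stays missing instead of counting as 'no'.
        return None
    return "yes" if pepmnk1r > 140 else "no"


print(high_blood_pressure(152))
```

Note that a reading of exactly 140 is classified as "no", because the rule uses a strict inequality.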
ANALYTIC AND REPORTING GUIDELINES: The Third National Health and Nutrition Examination Survey, NHANES III (1988-94).
In the NHANES III, 89 survey locations were randomly divided into 2 sets or phases, the first consisting of 44 and the other, 45 locations. One set of primary sampling units (PSUs) was allocated to the first 3-year survey period (1988-91) and the other set to the second 3-year period (1991-94).
Therefore, unbiased national estimates of health and nutrition characteristics can be independently produced for each phase as well as for both phases combined. Computation of national estimates from both phases combined (i.e. total NHANES III) is the preferred option; individual phase estimates may be highly variable. In addition, individual phase estimates are not statistically independent.
It is also difficult to evaluate whether differences in individual phase estimates are real or due to methodological differences. That is, differences may be due to changes in sampling methods or data collection methodology over time. At this time, there is no valid statistical test for examining differences between phase 1 and phase 2.
NHANES III is based on a complex multistage probability sample design. Several aspects of the NHANES design must be taken into account in data analysis, including the sampling weights and the complex survey design. Appropriate sampling weights are needed to estimate prevalence, means, medians, and other statistics. Sampling weights are used to produce correct population estimates because each sample person does not have an equal probability of selection. The sampling weights incorporate the differential probabilities of selection and include adjustments for noncoverage and nonresponse.
With the large oversampling of young children, older persons, black persons, and Mexican Americans in NHANES III, it is essential that the sampling weights be used in all analyses. Otherwise, misinterpretation of results is highly likely.
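To illustrate why the weights matter, the sketch below compares an unweighted mean with a weighted mean in Python. The three values and weights are invented toy numbers, not NHANES III records, and `weighted_mean` is a hypothetical helper.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Toy data (made up): the first subject represents three times as many
# people as each of the other two.
values = [120.0, 140.0, 160.0]
weights = [3000.0, 1000.0, 1000.0]

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
print(unweighted, weighted)  # 140.0 132.0
```

The unweighted mean treats every subject equally, while the weighted mean moves toward the subject who represents the largest share of the population.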
Other aspects of the design that must be taken into account in data analyses are the strata and PSU pairings from the sample design. These pairings should be used to estimate variances and test for statistical significance.
For weighted analyses, analysts can use special computer software packages that use an appropriate method for estimating variances for complex samples such as SUDAAN (Shah 1995) and WesVarPC (Westat 1996).
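As a rough illustration of how the stratum and PSU pairings enter a variance estimate, the Python sketch below computes a weighted mean and a standard error using Taylor linearization, treating PSUs as sampled with replacement within strata. This is one common approach and is assumed here for illustration; it is not presented as the exact method of any particular package. The function name and the toy rows are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def design_based_mean_se(rows):
    """rows: iterable of (stratum, psu, weight, value) tuples.

    Returns the weighted mean and a linearization-based standard error.
    """
    rows = list(rows)
    total_w = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / total_w

    # Sum the linearized contributions w * (y - mean) / total_w within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    # Between-PSU variability within each stratum, summed over strata.
    variance = 0.0
    for psus in psu_totals.values():
        n = len(psus)
        if n < 2:
            raise ValueError("each stratum needs at least 2 PSUs")
        centre = sum(psus.values()) / n
        variance += n / (n - 1) * sum((z - centre) ** 2 for z in psus.values())
    return mean, sqrt(variance)


# Toy data (made up): 2 strata with 2 PSUs each, one subject per PSU.
toy = [(1, 1, 1.0, 10.0), (1, 2, 1.0, 14.0), (2, 1, 1.0, 12.0), (2, 2, 1.0, 12.0)]
mean, se = design_based_mean_se(toy)
print(mean, se)
```

With exactly two PSUs per stratum, as in this dataset, each stratum's contribution reduces to the squared difference between its two PSU totals.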
Although initial exploratory analyses may be performed on unweighted data with standard statistical packages assuming simple random sampling, final analyses should be done on weighted data using appropriate sampling weights.
H&L 2nd ed. Page 215. Table 6.3.
National Center for Health Statistics (US) and others 1996. NHANES III reference manuals and reports. National Center for Health Statistics. CDC (free)