NELS: National Education Longitudinal Study Data
In copulaData: Data Sets for Copula Modeling


A random sample of size 1000 from the US National Education Longitudinal Study (NELS) data, containing the mathematics, science and reading scores of 8th graders in 1988, together with covariates.

data("NELS88")

A data.frame with 1000 observations of the following 11 variables:

ID: the identification number of the school to which the student belongs.
Math: the standardized score of the student on a mathematics achievement test, rescaled by an Item Response Theory (IRT) method; a higher score indicates greater proficiency in mathematics.
Science: the standardized score of the student on a science achievement test.
Reading: the standardized score of the student on a reading achievement test.
Minority: a factor indicating whether the student is a member of an ethnic minority group.
SES: a numeric measure of the socio-economic status of the student and family.
Female: a factor indicating whether the student is female.
Public: a factor indicating whether the school is publicly funded.
Size: the size of the student's school.
Urban: a factor indicating whether the school is located in an urban environment.
Rural: a factor indicating whether the school is located in a rural environment.

Edward W. Frees, ‘Student Achievement Data’ in https://sites.google.com/a/wisc.edu/jed-frees/multivariate-regression-using-copulas.

The data originally come from the National Center for Education Statistics; see https://nces.ed.gov/surveys/nels88/

data("NELS88")
str(NELS88)
ftable(xtabs(~ Urban + Rural + Public, NELS88))
## Add a more sensible variable: an ordered factor, rural < agglo < urban
NELS88. <- within(NELS88, {
    ## Urban:Rural has only 3 observed combinations; factor() drops the unused one
    UR <- factor(Urban:Rural, labels = c("agglo", "rural", "urban"))
    Urbanity <- ordered(UR, levels = c("rural", "agglo", "urban"))
    rm(UR)
})
unique(NELS88.[, c("Urban","Rural", "Urbanity")]) # indeed, just 3 combination cases

xtabs(~ Minority+Urbanity, NELS88.) # (_not_ independent)
ftable(xtabs(~ Public+Urbanity+Female+Minority, NELS88.) -> tab.)
summary(tab.) # very very clearly not independent

'data.frame':	1000 obs. of  11 variables:
 $ ID      : Factor w/ 501 levels "1249","1806",..: 173 495 14 65 102 404 479 206 417 435 ...
 $ Math    : num  18.7 43 19.5 36.5 45.1 ...
 $ Science : num  12.5 22.4 13.6 17.1 27 ...
 $ Reading : num  13.9 34 22.8 34.8 27.4 ...
 $ Minority: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 2 1 2 1 ...
 $ SES     : num  -0.404 -0.073 -0.636 0.079 -0.448 ...
 $ Female  : Factor w/ 2 levels "0","1": 2 2 2 2 1 2 1 1 2 2 ...
 $ Public  : Factor w/ 2 levels "0","1": 2 2 1 2 2 2 2 2 1 1 ...
 $ Size    : int  900 500 100 300 900 900 900 900 300 300 ...
 $ Urban   : Factor w/ 2 levels "0","1": 2 1 1 1 2 1 2 1 2 1 ...
 $ Rural   : Factor w/ 2 levels "0","1": 1 1 1 2 1 1 1 2 1 1 ...
            Public   0   1
Urban Rural               
0     0            155 292
      1             17 245
1     0            115 176
      1              0   0
  Urban Rural Urbanity
1     1     0    urban
2     0     0    agglo
4     0     1    rural
        Urbanity
Minority rural agglo urban
       0   217   332   157
       1    45   115   134
                       Minority   0   1
Public Urbanity Female                 
0      rural    0                 8   0
                1                 9   0
       agglo    0                54  17
                1                67  17
       urban    0                42   9
                1                56   8
1      rural    0                92  23
                1               108  22
       agglo    0               110  38
                1               101  43
       urban    0                27  53
                1                32  64
Call: xtabs(formula = ~Public + Urbanity + Female + Minority, data = NELS88.)
Number of cases in table: 1000 
Number of factors: 4 
Test for independence of all factors:
	Chisq = 233.27, df = 18, p-value = 2.02e-39
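
The construction of Urbanity in the examples depends on the level order of the interaction Urban:Rural and on factor() dropping the combination that never occurs. The following is a minimal sketch of that mechanism on a small hand-made data frame; `toy` and its four rows are assumptions made up for illustration (they only mimic the three Urban/Rural combinations seen in the output above) and are not part of NELS88.

```r
## Toy data (made up for illustration, not taken from NELS88) with the
## three Urban/Rural combinations that occur: (1,0), (0,0) and (0,1).
toy <- data.frame(Urban = factor(c(1, 0, 0, 1), levels = 0:1),
                  Rural = factor(c(0, 0, 1, 0), levels = 0:1))

## Urban:Rural is the interaction factor; it carries all four levels,
## in the order "0:0", "0:1", "1:0", "1:1".
inter <- with(toy, Urban:Rural)
levels(inter)

## factor() drops the unused level "1:1" and keeps the order of the rest,
## so the labels map as 0:0 -> agglo, 0:1 -> rural, 1:0 -> urban.
UR <- factor(inter, labels = c("agglo", "rural", "urban"))

## Re-order the levels so that rural < agglo < urban.
toy$Urbanity <- ordered(UR, levels = c("rural", "agglo", "urban"))
toy
```

If the combination Urban = 1, Rural = 1 did occur in the data, the interaction would keep four levels and the call with three labels would fail, so the approach is specific to data with exactly these three cases.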