communities.and.crime | R Documentation |
Combined socio-economic data from the 1990 Census, law enforcement data from the 1990 LEMAS survey, and crime data from the 1995 FBI UCR for various communities in the United States.
data(communities.and.crime)
The data contains 1969 observations and 104 variables. See the UCI Machine Learning Repository for details.
The data set has been pre-processed as in Komiyama et al. (2018), with the following exceptions:
the variable community has been dropped, as it is non-predictive and contains a sizeable number of missing values;
the variables LemasSwornFT, LemasSwFTPerPop, LemasSwFTFieldOps, LemasSwFTFieldPerPop, LemasTotalReq, LemasTotReqPerPop, PolicReqPerOffic, PolicPerPop, RacialMatchCommPol, PctPolicWhite, PctPolicBlack, PctPolicHisp, PctPolicAsian, PctPolicMinor, OfficAssgnDrugUnits, NumKindsDrugsSeiz, PolicAveOTWorked, PolicCars, PolicOperBudg, LemasPctPolicOnPatr, LemasGangUnitDeploy and PolicBudgPerPop have been dropped because they have more than 80% missing values.
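The second exception can be illustrated with a minimal sketch of dropping columns by their share of missing values. This is not necessarily the code used to prepare the data set: the data frame raw, its values, and the names missing.share and reduced are hypothetical and serve only to show the idea on a toy example.

```r
# Toy data frame standing in for the raw data (hypothetical values).
raw = data.frame(
  population = c(0.19, 0.00, 0.04, 0.01, 0.02, 0.05),
  OtherPerCap = c(0.36, NA, 0.28, 0.36, 0.51, 0.40),
  PolicCars = c(NA, NA, NA, NA, NA, 0.12),
  PctPolicWhite = c(NA, NA, NA, NA, NA, NA),
  ViolentCrimesPerPop = c(0.20, 0.67, 0.43, 0.12, 0.03, 0.14)
)

# Proportion of missing values in each column.
missing.share = colMeans(is.na(raw))

# Keep only the columns with at most 80% missing values.
reduced = raw[, missing.share <= 0.8, drop = FALSE]
names(reduced)
```

Columns with a few missing values, such as OtherPerCap in this toy example, are kept; the observations that contain them can be removed later with complete.cases().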
In that paper, ViolentCrimesPerPop is the response variable, racepctblack and PctForeignBorn are the sensitive attributes and the remaining variables are used as predictors.
The data contain too many variables to list here; we refer the reader to the documentation on the UCI Machine Learning Repository.
UCI Machine Learning Repository:
http://archive.ics.uci.edu/ml/datasets/communities+and+crime
library(fairml)
data(communities.and.crime)
# short-hand variable names; keep only the complete observations.
cc = communities.and.crime[complete.cases(communities.and.crime), ]
r = cc[, "ViolentCrimesPerPop"]
s = cc[, c("racepctblack", "PctForeignBorn")]
p = cc[, setdiff(names(cc), c("ViolentCrimesPerPop", names(s)))]
# fit both models with the same unfairness setting.
m = nclm(response = r, sensitive = s, predictors = p, unfairness = 0.05)
summary(m)
m = frrm(response = r, sensitive = s, predictors = p, unfairness = 0.05)
summary(m)