sbs: Artificial Structural Business Statistics Data

sbs R Documentation

Artificial Structural Business Statistics Data

Description

The sbs data frame stores artificial sbs-like sampling data, while sbs.frame is the artificial sampling frame from which the sbs units have been drawn. Together they make it possible to run the R code in the ‘Examples’ sections of the ReGenesees package help pages.

Usage

data(sbs)

Format

The sbs data frame mimics data observed in a Structural Business Statistics survey, under a one-stage stratified unit sampling design. The sample is made up of 6909 units, for which the following 22 variables were observed:

id

Identifier of the sampling units (enterprises), numeric

public

Does the enterprise belong to the Public Sector? factor with levels 0 (No) and 1 (Yes)

emp.num

Number of employees, numeric

emp.cl

Number of employees classified into 5 categories, factor with levels [6,9] (9,19] (19,49] (49,99] (99,Inf] (note that small enterprises with fewer than 6 employees fell outside the scope of the survey)

nace5

Economic Activity code with 5 digits, factor with 596 levels

nace2

Economic Activity code with 2 digits, factor with 57 levels

area

Territorial Division, factor with 24 levels

cens

Flag identifying statistical units to be censused (hence defining take-all strata), factor with levels 0 (No) and 1 (Yes)

region

Macroregion, factor with levels North Center South

va.cl

Class of Value Added, factor with 27 levels

va

Value Added, numeric (contains NAs)

dom1

A planned estimation domain, factor with 261 levels (dom1 crosses nace2 and emp.cl)

nace.macro

Economic Activity Macrosector, factor with levels Agriculture Industry Commerce Services

dom2

A planned estimation domain, factor with 12 levels (dom2 crosses nace.macro and region)

strata

Stratification Variable, a factor with 664 levels (obtained by crossing variables region, nace2, emp.cl and cens)

va.imp1

Value Added Imputed1, numeric (NAs were replaced with average values computed inside imputation strata obtained by crossing region, nace.macro, emp.cl)

va.imp2

Value Added Imputed2, numeric (NAs were replaced with median values computed inside imputation strata obtained by crossing region, nace.macro, emp.cl)

y

A numeric variable correlated with va

weight

Direct weights, numeric

fpc

Finite Population Corrections (given as sampling fractions inside strata), numeric

ent

Convenience numeric variable identically equal to 1 (sometimes useful, e.g. to estimate the total number of enterprises)

dom3

An unplanned estimation domain, factor with 4 levels

The sbs.frame sampling frame (from which sbs units have been drawn) contains 17318 units.
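
Several of the variables above are derived from others: size classes, crossed factors and imputed values. The sketch below shows how variables of this kind could be built in base R. It is not the code used to produce sbs: the toy data frame and its values are invented, the helper fill_na is hypothetical, and imputation cells are formed by region only (rather than by region, nace.macro and emp.cl) to keep the example small.

```r
# Toy data: all values are invented for illustration, not taken from sbs
toy <- data.frame(
  emp.num = c(6, 9, 10, 19, 20, 49, 50, 99, 100, 250),
  region  = factor(c("North", "North", "Center", "Center", "South",
                     "South", "North", "Center", "South", "North")),
  va      = c(10, NA, 30, 50, NA, 70, 90, 110, NA, 200)
)

# Five size classes with the same break points as emp.cl
toy$emp.cl <- cut(toy$emp.num, breaks = c(6, 9, 19, 49, 99, Inf),
                  include.lowest = TRUE)

# A crossed factor, analogous to how strata, dom1 and dom2 cross
# other variables; drop = TRUE discards empty combinations
toy$cross <- interaction(toy$region, toy$emp.cl, drop = TRUE)

# Hypothetical helper: replace NAs in x with a summary of the observed values
fill_na <- function(x, stat) {
  x[is.na(x)] <- stat(x, na.rm = TRUE)
  x
}

# Mean and median imputation inside cells (here: region only)
toy$va.imp1 <- ave(toy$va, toy$region, FUN = function(x) fill_na(x, mean))
toy$va.imp2 <- ave(toy$va, toy$region, FUN = function(x) fill_na(x, median))
```

In the toy data, the two imputed columns differ only where the mean and the median of the observed values in a cell differ.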

Examples

data(sbs)
head(sbs)
str(sbs)
str(sbs.frame)
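
The roles of fpc, weight and ent can be illustrated on a toy frame and sample. This is a minimal sketch under stated assumptions, not a description of how the sbs columns were computed: the data are invented, and the weights are assumed to be the inverse sampling fractions of a stratified simple random sample.

```r
# Toy frame of 10 units in two strata (values invented for illustration)
frame <- data.frame(id = 1:10,
                    strata = factor(rep(c("A", "B"), times = c(6, 4))))

# Toy sample: 2 units from stratum A and 2 from stratum B
smp <- frame[c(1, 4, 7, 9), ]

N_h <- table(frame$strata)  # stratum sizes in the frame
n_h <- table(smp$strata)    # stratum sizes in the sample
h   <- as.character(smp$strata)

# Sampling fraction of each unit's stratum (the form in which fpc is given)
smp$fpc <- as.numeric(n_h[h]) / as.numeric(N_h[h])

# Assumed direct weight: inverse of the sampling fraction
smp$weight <- as.numeric(N_h[h]) / as.numeric(n_h[h])

# Convenience variable identically equal to 1
smp$ent <- 1

# The weighted total of ent estimates the number of units in the frame
est_total <- sum(smp$weight * smp$ent)
```

Under these assumptions est_total reproduces the frame size exactly, which is why a constant variable such as ent is handy for estimating the total number of enterprises.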

DiegoZardetto/ReGenesees documentation built on Dec. 16, 2024, 2:03 p.m.