record_format: Define custom fields for NAACCR records

Description

Create a record_format object, which is used to read NAACCR records.

Usage

record_format(name, item, start_col, end_col, type, alignment = "left",
  padding = " ", name_literal = NULL)

as.record_format(x, ...)

Arguments

name

Item name appropriate for a data.frame column name.

item

NAACCR item number.

start_col

First column of the field in a fixed-width record.

end_col

Last column of the field in a fixed-width record.

type

Name of the column class.

alignment

Alignment of the field in fixed-width files. Either "left" (default) or "right".

padding

Single-character strings to use for padding in fixed-width files.

name_literal

(Optional) Item name in plain language.

x

Object to be coerced to a record_format, usually a data.frame or list.

...

Other arguments passed to record_format.
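
As a rough illustration of how start_col, end_col, alignment and padding describe a field, the sketch below pulls one field out of a fixed-width record in base R. The helper extract_field is hypothetical and is not part of the naaccr package. It assumes that left-aligned values are padded on the right and right-aligned values on the left, which is the usual fixed-width convention but is not confirmed by this page.

```r
# Hypothetical helper (not part of naaccr): extract one field from a
# fixed-width record and strip its padding.
extract_field <- function(record, start_col, end_col,
                          alignment = "left", padding = " ") {
  # Columns are 1-based and inclusive, like start_col and end_col.
  raw   <- substr(record, start_col, end_col)
  chars <- strsplit(raw, "")[[1]]
  keep  <- chars != padding
  if (!any(keep)) {
    return("")  # the field holds nothing but padding
  }
  if (alignment == "left") {
    # Assumption: left-aligned values are padded on the right.
    paste(chars[seq_len(max(which(keep)))], collapse = "")
  } else {
    # Assumption: right-aligned values are padded on the left.
    paste(chars[min(which(keep)):length(chars)], collapse = "")
  }
}

# Made-up 13-character record, not real NAACCR data.
record <- "ABC  00042XYZ"
left_value  <- extract_field(record, 1, 5)
right_value <- extract_field(record, 6, 10, alignment = "right", padding = "0")
```

Here left_value comes from columns 1 to 5 with trailing spaces removed, and right_value comes from columns 6 to 10 with leading zeros removed.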

Details

To define registry-specific fields in addition to the standard fields, create a record_format object for the registry-specific fields and combine it with one of the formats provided with the package using rbind.

Value

An object of class "record_format" which has the following columns:

name

(character) XML field name.

item

(integer) Field item number.

start_col

(integer) First column of the field in a fixed-width text file.

end_col

(integer) Last column of the field in a fixed-width text file.

type

(factor) R class for the column vector.

alignment

(factor) Alignment of the field's values in a fixed-width text file.

padding

(character) String used for padding field values in a fixed-width text file.

name_literal

(character) Field name in plain language.

Format Types

The levels that type can take, along with the functions used to process each of them when reading a file:

address

(clean_address_number_and_street) Street number and street name parts of an address.

age

(clean_age) Age in years.

boolean01

(naaccr_boolean, with false_value = "0") True/false, where "0" means false and "1" means true.

boolean12

(naaccr_boolean, with false_value = "1") True/false, where "1" means false and "2" means true.

census_block

(clean_census_block) Census Block ID number.

census_tract

(clean_census_tract) Census Tract ID number.

character

(clean_text) Miscellaneous text.

city

(clean_address_city) City name.

count

(clean_count) Integer count.

county

(clean_county_fips) County FIPS code.

Date

(as.Date, with format = "%Y%m%d") NAACCR-formatted date (YYYYMMDD).

datetime

(as.POSIXct, with format = "%Y%m%d%H%M%S") NAACCR-formatted datetime (YYYYMMDDHHMMSS).

facility

(clean_facility_id) Facility ID number.

icd_9

(clean_icd_9_cm) ICD-9-CM code.

icd_code

(clean_icd_code) ICD-9 or ICD-10 code.

integer

(as.integer) Miscellaneous whole number.

numeric

(as.numeric) Miscellaneous decimal number.

override

(naaccr_override) Field describing why another field's value was overridden.

physician

(clean_physician_id) Physician ID number.

postal

(clean_postal) Postal code for an address (a.k.a. ZIP code in the United States).

ssn

(clean_ssn) Social Security Number.

telephone

(clean_telephone) 10-digit telephone number.
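
The types handled by base R functions can be tried directly, without the package. The sketch below applies the conversions listed above for Date, datetime, integer and numeric to made-up field values. The tz = "UTC" argument is an assumption added here so the result does not depend on the local time zone; this page does not say which time zone the package uses.

```r
# Made-up field values, as they might appear after padding is removed.
raw_date     <- "20200111"
raw_datetime <- "20200111091700"

# Date type: as.Date with the NAACCR date format.
parsed_date <- as.Date(raw_date, format = "%Y%m%d")

# datetime type: as.POSIXct with the NAACCR datetime format.
# tz = "UTC" is an assumption made for reproducibility.
parsed_datetime <- as.POSIXct(raw_datetime, format = "%Y%m%d%H%M%S", tz = "UTC")

# integer and numeric types: plain coercion from text.
parsed_integer <- as.integer("0042")
parsed_numeric <- as.numeric("3.50")

# A value that does not match the format becomes NA instead of raising an error.
bad_date <- as.Date("2020ABCD", format = "%Y%m%d")
```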

Examples

  my_fields <- record_format(
    name      = c("foo", "bar"),
    item      = c(2163, 1180),
    start_col = c(975, 1381),
    end_col   = c(975, 1435),
    type      = c("numeric", "facility")
  )
  my_format <- rbind(naaccr_format_16, my_fields)

naaccr documentation built on Jan. 11, 2020, 9:17 a.m.