View source: R/record_format.R
record_format | R Documentation |
Create a record_format object, which is used to read NAACCR records.
record_format(
name,
item,
start_col = NA_integer_,
end_col = NA_integer_,
type = "character",
alignment = "left",
padding = " ",
parent = "Tumor",
cleaner = list(NULL),
unknown_finder = list(NULL),
name_literal = NA_character_,
width = NA_integer_
)
as.record_format(x, ...)
name |
Item name appropriate for a |
item |
NAACCR item number. |
start_col |
First column of the field in a fixed-width record. |
end_col |
*Deprecated: Use the |
type |
Name of the column class. |
alignment |
Alignment of the field in fixed-width files. Either
|
padding |
Single-character strings to use for padding in fixed-width files. |
parent |
Name of the parent node to include this field under when
writing to an XML file.
Values can be |
cleaner |
(Optional) List of functions to handle special cases of
cleaning field data (e.g., convert all values to uppercase).
Values of |
unknown_finder |
(Optional) List of functions to detect when codes mean
the actual values are unknown or not applicable.
Values of |
name_literal |
(Optional) Item name in plain language. |
width |
(Optional) Item width in characters. |
x |
Object to be coerced to a |
... |
Other arguments passed to |
To define registry-specific fields in addition to the standard fields, create
a record_format object for the registry-specific fields and combine it
with one of the formats provided with the package using rbind.
An object of class "record_format", which has the following columns:
name (character) XML field name.
item (integer) Field item number.
start_col (integer) First column of the field in a fixed-width text file.
If NA, the field will not be read from or written to fixed-width
files. It will still be included in XML files.
end_col (integer) (*Deprecated: Use width instead.*)
Last column of the field in a fixed-width text file.
If NA, the field will not be read from or written to fixed-width
files. This is the norm for fields only found in XML formats.
type (factor) R class for the column vector.
alignment (factor) Alignment of the field's values in a fixed-width
text file.
padding (character) String used for padding field values in a
fixed-width text file.
parent (factor) Parent XML node for the field. One of
"NaaccrData", "Patient", or "Tumor".
cleaner (list of function objects) Function to prepare the
field's values for analysis.
Values of NULL will use the standard cleaner functions for the
type (see below).
unknown_finder (list of function objects) Function to detect codes
meaning the actual values are missing or unknown for the field.
name_literal (character) Field name in plain language.
width (integer) Character width of the field values.
Mostly meant for reading and writing flat files.
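As a rough illustration of how start_col and width locate a field in a fixed-width record, here is a minimal base R sketch. It does not use the package's own reader. The record string, the field layout, and the extract_field helper are all hypothetical and do not correspond to real NAACCR column positions.

```r
# Hypothetical fixed-width record and field layout (not real NAACCR positions).
record <- "AB  0421X"
fields <- data.frame(
  name      = c("code", "count", "flag"),
  start_col = c(1L, 5L, 9L),
  width     = c(4L, 4L, 1L),
  stringsAsFactors = FALSE
)

# The last column of a field follows from its first column and its width.
fields$end_col <- fields$start_col + fields$width - 1L

# Hypothetical helper: pull one field's raw text out by column position.
extract_field <- function(record, start_col, end_col) {
  substr(record, start_col, end_col)
}
raw_values <- mapply(
  extract_field,
  start_col = fields$start_col,
  end_col   = fields$end_col,
  MoreArgs  = list(record = record)
)
names(raw_values) <- fields$name

# Strip the space padding from the extracted values.
values <- trimws(raw_values)
```

Deriving end_col from start_col and width, rather than storing both, is one way to see why a single width column can stand in for the deprecated end_col.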
The levels type can take, along with the functions used to process
them when reading a file:
address (clean_address_number_and_street)
Street number and street name parts of an address.
age (clean_age)
Age in years.
boolean01 (naaccr_boolean, with false_value = "0")
True/false, where "0" means false and "1" means true.
boolean12 (naaccr_boolean, with false_value = "1")
True/false, where "1" means false and "2" means true.
census_block (clean_census_block)
Census Block ID number.
census_tract (clean_census_tract)
Census Tract ID number.
character (clean_text)
Miscellaneous text.
city (clean_address_city)
City name.
count (clean_count)
Integer count.
county (clean_county_fips)
County FIPS code.
Date (as.Date, with format = "%Y%m%d")
NAACCR-formatted date (YYYYMMDD).
datetime (as.POSIXct, with format = "%Y%m%d%H%M%S")
NAACCR-formatted datetime (YYYYMMDDHHMMSS).
facility (clean_facility_id)
Facility ID number.
icd_9 (clean_icd_9_cm)
ICD-9-CM code.
icd_code (clean_icd_code)
ICD-9 or ICD-10 code.
integer (as.integer)
Miscellaneous whole number.
numeric (as.numeric)
Miscellaneous decimal number.
override (naaccr_override)
Field describing why another field's value was overridden.
physician (clean_physician_id)
Physician ID number.
postal (clean_postal)
Postal code for an address (a.k.a. ZIP code in the United States).
ssn (clean_ssn)
Social Security Number.
telephone (clean_telephone)
10-digit telephone number.
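To make a few of the conversions listed above concrete, here is a small base R sketch. The as.Date and as.POSIXct calls use the format strings given in the list. The boolean12_sketch function is a hypothetical stand-in written for illustration only and is not the package's naaccr_boolean; treating other codes as NA is an assumption.

```r
# NAACCR-formatted date (YYYYMMDD), as described for the "Date" type.
dx_date <- as.Date("20160229", format = "%Y%m%d")

# NAACCR-formatted datetime (YYYYMMDDHHMMSS), as for the "datetime" type.
# The time zone is set explicitly so the result does not depend on the locale.
dx_time <- as.POSIXct("20160229134500", format = "%Y%m%d%H%M%S", tz = "UTC")

# Hypothetical stand-in for the boolean12 convention:
# "1" means false, "2" means true, anything else is treated as missing.
boolean12_sketch <- function(x) {
  out <- rep(NA, length(x))
  out[x %in% "1"] <- FALSE
  out[x %in% "2"] <- TRUE
  out
}
flags <- boolean12_sketch(c("1", "2", "9"))
```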
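# Define three registry-specific fields.
my_fields <- record_format(
  name      = c("foo", "bar", "baz"),
  item      = c(2163, 1180, 1181),
  start_col = c(975, 1381, NA),  # NA: "baz" has no fixed-width position
  width     = c(1, 55, 4),
  type      = c("numeric", "facility", "character"),
  parent    = c("Patient", "Tumor", "Tumor"),
  cleaner   = list(NULL, NULL, trimws)  # NULL: standard cleaner for the type
)
# Combine them with one of the formats provided with the package.
my_format <- rbind(naaccr_format_16, my_fields)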
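The cleaner and unknown_finder columns each hold one function per field. The sketch below shows what such functions might look like in base R. Both function names are hypothetical, and the calling convention (a character vector in; a same-length vector out for a cleaner, a logical vector out for a finder) is an assumption rather than something this page states.

```r
# Hypothetical cleaner: trim whitespace and convert values to uppercase.
clean_upper <- function(x) toupper(trimws(x))

# Hypothetical unknown finder: flag values made up entirely of 9s.
# Assumption: a finder returns TRUE where the code means "unknown".
find_all_nines <- function(x) grepl("^9+$", x)

# Try both functions directly on some raw field values.
raw <- c(" abc ", "9999", "x9")
cleaned <- clean_upper(raw)
is_unknown <- find_all_nines(cleaned)

# Under the assumption above, they could then be supplied per field, e.g.
# cleaner = list(clean_upper) and unknown_finder = list(find_all_nines).
```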