Plots a histogram of the proportion of missing values (NA) across all features in the SEER dataframe.
plotHistNA(dataframe, summary = FALSE, additional_na, binwidth)
dataframe
The dataframe with SEER data.
summary
If set to TRUE, the function prints a summary of the NA values in addition to the histogram.
additional_na
A vector of additional symbol(s) that should also be treated as NA. This matters for some datasets exported from the SEER*Stat software, which contain NA values as well as the string 'Blank(s)' to represent missing values.
binwidth
By default, ggplot2 chooses this value automatically. Set it to a specific number to override that choice.
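To make the role of additional_na concrete, here is a minimal base R sketch of how a per-feature proportion of missing values could be computed when extra symbols count as missing. The helper na_proportion and the toy dataframe are hypothetical and exist only for illustration; this is not the package's implementation of plotHistNA.

```r
# Hypothetical helper (not part of the package): proportion of missing
# values in each column, treating extra symbols as missing too.
na_proportion <- function(dataframe, additional_na = character(0)) {
  vapply(dataframe, function(column) {
    # A value is missing if it is NA or matches one of the extra symbols
    is_missing <- is.na(column) | as.character(column) %in% additional_na
    mean(is_missing)
  }, numeric(1))
}

# Toy data standing in for a SEER export (assumed, not real SEER data)
toy <- data.frame(
  age   = c(54, NA, 61, 47),
  grade = c("II", "Blank(s)", NA, "III"),
  stringsAsFactors = FALSE
)

props <- na_proportion(toy, additional_na = "Blank(s)")
print(props)
```

In this toy case, age has one missing value out of four (0.25), and grade has two (0.5) once 'Blank(s)' is counted alongside NA. Without additional_na, grade would be reported as 0.25.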
# First build parsing instructions
## Not run:
instr <- buildSEERParser(file_path = 'read.seer.research.nov17.sas',
                         file_source = 'download')
# Now you can read it
paths <- c('/home/yourusername/SEER/yr1973_2015.seer9/BREAST.TXT',
           '/home/yourusername/SEER/yr2000_2015.ca_ky_lo_nj_ga/BREAST.TXT')
# I'm interested here in patients with breast cancer diagnosed between 2012
# and 2015
seer_data <- readSEER(path = paths,
                      instructions = instr,
                      year_dx = c(2012:2015),
                      primary_site = 'Breast')
# Plot the histogram and print the NA summary
plotHistNA(seer_data, summary = TRUE)
## End(Not run)