knitr::opts_chunk$set( collapse = TRUE # , comment = "#>" )
Suppose you want to test the equality of proportions of a specific variable across all levels of another variable. For example, suppose you are interested in visits by patients aged 65-74 years (AGER
value of "65-74 years"
). You'd like to test the proportion of these visits for different values of physician specialty (SPECCAT
).
To perform this test, use the tab_subset()
function, with argument test
set to the level of interest, in this case, "65-74 years"
. Thus, in this case, the analysis is as follows:
library(surveytable) set_survey(namcs2019sv) tab_subset("AGER", "SPECCAT", test = "65-74 years")
If you use surveytable
in your research, please include the following statement in the Methods section of your article or report:
version = packageVersion("surveytable")
Data analyses were performed using the R package "surveytable" (version
r version
).
Please cite this package as follows:
Strashny A (2023). surveytable: Streamlining Complex Survey Estimation and Reliability Assessment in R. doi:10.32614/CRAN.package.surveytable, R package version
r version
, https://cdcgov.github.io/surveytable/.
Depending on your coding style, you might be changing options in multiple locations in your code. In these situations, set_opts(reset = TRUE)
might be useful. This command resets all surveytable
options to their default values.
Some analysts might wish to compare the output from surveytable
to the output from other statistical software, such as SAS / SUDAAN. In this situation, set_opts(output = "raw")
might be useful. This command tells surveytable
to print unformatted and unrounded tables. In addition, when performing hypothesis testing, this option prints the test statistic and the degrees of freedom, not just the p-value.
library(surveytable) set_survey(namcs2019sv)
set_opts(output = "raw") tab("SPECCAT", test = TRUE) set_opts(reset = TRUE)
set_opts(output = "raw") print( tab("SPECCAT", test = TRUE), destination = "") set_opts(reset = TRUE)
Consider this example, in which we estimate the number of medications by age group:
library(surveytable) set_survey(namcs2019sv)
tab_subset("NUMMED", "AGER")
What if we'd like to estimate the same thing, but only for the visits for which NUMMED > 0
?
One way to do this is to create another survey object for which NUMMED > 0
, and then analyze this new survey object.
newsurvey = survey_subset(namcs2019sv, NUMMED > 0 , label = "NAMCS 2019 PUF: NUMMED 1+") set_survey(newsurvey)
Note that we called set_survey()
, to let R know that we now want to analyze the new object newsurvey
, not namcs2019sv
.
Now, let's create the table:
tab_subset("NUMMED", "AGER")
Be sure to check the table title to verify that you are tabulating the new survey object.
First, let's review what I call "advanced variable editing".
surveytable
provides a number of functions to create or modify survey variables. var_collapse()
and var_cut()
.Keep in mind that every survey object has an element called variables
. This is a data frame where the survey's variables are located.
variables
data frame (which is part of the survey object).set_survey()
again. Any time you modify the variables
data frame, call set_survey()
.For an example of this, see vignette("Example-Residential-Care-Community-Services-User-NSLTCP-RCC-SU-report")
.
The above explanation raises the question of why set_survey()
must be called again, after variables
is modified. Here is an explanation:
The survey that you're analyzing actually exists in three separate places:
mysurvey.rds
.mysurvey
.surveytable
. This is what surveytable
analyzes. Why is there (3) that's different from (2), you might ask. That's due to an arcane issue with how R packages work -- both (2) and (3) are necessary.
Normally, information only flows forwards, from (1) to (2) and from (2) to (3).
Forwards flow:
readRDS()
.set_survey()
.Backwards flow:
surveytable:::.load_survey()
.saveRDS()
. Normally, you probably don't want to do this. Normally, the survey file (mysurvey.rds
) should probably not be changed. The functions for modifying or creating variables that are part of the surveytable
package (like var_cut()
or var_collapse()
) modify (3). Since (3) is what surveytable
works with and tabulates, you can use var_collapse()
, and then immediately use tab()
. You don't need to do anything extra in between.
If you are modifying the variables
data frame directly, you are modifying (2). After you modify (2), you need to copy it over to (3), so that surveytable
can use it. You do that by calling set_survey()
.
Thus, any time you modify variables
yourself, call set_survey()
. You modify (2), then copy (2) -> (3) by calling set_survey()
.
On the flip side, the changes that you make in (3) (using surveytable
functions like var_cut()
or var_collapse()
) are not reflected in (2). If you make changes in (3), then call set_survey()
, those changes are lost, because set_survey()
copies (2) -> (3). If those changes were important, you can just rerun the code that created them. If you really need to go from (3) to (2), use mysurvey = surveytable:::.load_survey()
.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.