Example: National Ambulatory Medical Care Survey (NAMCS) tables"

knitr::opts_chunk$set(
  collapse = TRUE
  # , comment = "#>"
)

This example uses the National Ambulatory Medical Care Survey (NAMCS) 2019 Public Use File (PUF) to replicate certain tables from the National Ambulatory Medical Care Survey: 2019 National Summary Tables. NAMCS is "an annual nationally representative sample survey of visits to non-federal office-based patient care physicians, excluding anesthesiologists, radiologists, and pathologists." Note that the unit of observation is visits, not patients -- this distinction is important since a single patient can make multiple visits.

Selected variables from NAMCS 2019 come with the surveytable package, for use in examples, in an object called namcs2019sv.

Begin

Begin by loading the surveytable package.

library(surveytable)

Now, specify the survey that you'd like to analyze.

set_survey(namcs2019sv)

Check the survey name, survey design variables, and the number of observations to verify that it all looks correct.

For this example, we do want to turn on certain NCHS-specific options, such as identifying low-precision estimates. If you do not care about identifying low-precision estimates, you can skip this command. To turn on the NCHS-specific options:

set_opts(mode = "NCHS")

Table 1

Counts and percentages

This table shows the overall estimated count as well as the counts and percentages by type of doctor, physician specialty, and metropolitan statistical area.

The variables that are necessary for creating this table are already in the survey, making the commands very straightforward.

total()
tab("MDDO", "SPECCAT", "MSA")

Rates

The published table also shows several rates. To calculate rates, in addition to the survey, we need a source of information with population estimates. You would typically use a function such as read.csv() to load the population estimates and get them into the correct format. The surveytable package comes with an object called uspop2019 that contains several population estimates for use in these examples.

class(uspop2019)
names(uspop2019)

Here is the overall population estimate:

uspop2019$total

Once we have the overall population estimate, the overall rate is:

total_rate(uspop2019$total)

To calculate the rates for a particular variable, we need to provide a data frame with a variable called Level that matches the levels of the variable in the survey, and a variable called Population that gives the population size (which is assumed to be a constant rather than a random variable).

For MSA, we can see the levels of the variables just by using the tab() command, just as we did above. Thus, to calculate rates, we need a data frame as follows:

uspop2019$MSA

Now that we have the appropriate population estimates, the rate is:

tab_rate("MSA", uspop2019$MSA)

We can also calculate rates of a specific variable based on the entire population:

tab_rate("MDDO", uspop2019$total)
tab_rate("SPECCAT", uspop2019$total)

Table 3

Counts and percentages

This table presents estimates for each age group, as well as for each age group by sex.

var_list("age")

The survey has a couple of relevant age-related variables. AGE is the patient age in years. AGER is a categorical variable based on AGE. However, for this table, in addition to AGER, we need another age group variable, with different age categories. We create it using the var_cut function.

var_cut("Age group", "AGE"
        , c(-Inf, 0, 4, 14, 64, Inf)
        , c("Under 1", "1-4", "5-14", "15-64", "65 and over") )

Now that we've created the Age group variable, we can create the tables:

tab("AGER", "Age group", "SEX")
tab_cross("AGER", "SEX")

Rates

tab_rate("AGER", uspop2019$AGER)
tab_rate("Age group", uspop2019$`Age group`)
tab_rate("SEX", uspop2019$SEX)

To calculate the rates for one variable (AGER) by another variable (SEX), we need population estimates in the following format:

uspop2019$`AGER x SEX`

Once we have these population estimates, the rates are:

tab_subset_rate("AGER", "SEX", uspop2019$`AGER x SEX`)

Table 5

This table gives the expected sources of payment. We use the PAY* variables to create several new variables that are required by the table. Note that the PAY* variables are logical (TRUE or FALSE), which simplifies the workflow. (The survey was imported into R using the importsurvey package, which automatically detects binary variables and imports them as logical variables.)

#
var_all("Medicare and Medicaid", c("PAYMCARE", "PAYMCAID"))

#
var_any("Payment used", c("PAYPRIV", "PAYMCARE", "PAYMCAID"
  , "PAYWKCMP", "PAYOTH", "PAYDK"))
var_not("No other payment used", "Payment used")

var_all("Self-pay", c("PAYSELF", "No other payment used"))
var_all("No charge", c("PAYNOCHG", "No other payment used"))
var_any("No insurance", c("Self-pay", "No charge"))

#
var_case("No pay", "NOPAY", "No categories marked")
var_any("Unknown or blank", c("PAYDK", "No pay"))

##
tab("PAYPRIV", "PAYMCARE", "PAYMCAID", "Medicare and Medicaid"
  , "No insurance", "Self-pay", "No charge"
  , "PAYWKCMP", "PAYOTH", "Unknown or blank")

Check the presentation standards flags! Under NCHS presentation standards rules, some of these estimates should not be shown.

Table 6

This table shows the primary care provider and referral status, by prior-visit status.

In the table, the "Unknown" and "Blank" values are collapsed into a single value. We can collapse two or more levels of a factor into a single level using the var_collapse function.

var_collapse("PRIMCARE", "Unknown if PCP", c("Unknown", "Blank"))
var_collapse("REFER", "Unknown if referred", c("Unknown", "Blank"))

Now, for the table:

tab("PRIMCARE", "REFER", "SENBEFOR")

The percentages within each subset that is defined by SENBEFOR add up to 100% -- for this reason, we want to use tab_subset(), not tab_cross().

tab_subset("PRIMCARE", "SENBEFOR")
tab_subset("REFER", "SENBEFOR")

Table 11

This table shows the same information as Table 3, but only for preventive care visits. That is, estimates for each age group, as well as for each age group by sex, but only for preventive care visits.

Let's create Age group from AGE and cross AGER and SEX to create a variable called Age x Sex:

var_cut("Age group", "AGE"
        , c(-Inf, 0, 4, 14, 64, Inf)
        , c("Under 1", "1-4", "5-14", "15-64", "65 and over") )
var_cross("Age x Sex", "AGER", "SEX")

To see the possible values of MAJOR (Major reason for this visit), and to estimate the total count for preventive care visits:

tab("MAJOR")

To create the tables of age, sex, and their interaction, and limit them to only the preventive care visits:

tab_subset("AGER", "MAJOR", "Preventive care")
tab_subset("Age group", "MAJOR", "Preventive care")
tab_subset("SEX", "MAJOR", "Preventive care")
tab_subset("Age x Sex", "MAJOR", "Preventive care")

As each of the above commands is similar, and differs only in the first variable that is passed to the tab_subset() function, this code can be streamlined with a for loop:

for (vr in c("AGER", "Age group", "SEX", "Age x Sex")) {
    print( tab_subset(vr, "MAJOR", "Preventive care") )
}

Note that when called from inside a for loop, the print() function needs to be called explicitly.

More advanced coding

In addition, for each age-sex category, the published table shows the percentage of preventive care visits made to primary care physicians.

To calculate these percentages, a slightly more involved for loop is needed. Below is the code, followed by an explanation:

set_opts(output = "csv", file = "my_output.csv")
set_opts(output = "csv", file = "my_output.csv", .file_temp = TRUE)
for (vr in c("AGER", "Age group", "SEX", "Age x Sex")) {
    var_cross("tmp", "MAJOR", vr)
    for (lvl in levels(surveytable:::env$survey$variables[,vr])) {
        print( tab_subset("SPECCAT", "tmp", paste0("Preventive care: ", lvl)) )
    }
}
set_opts(output = "auto")

If you run this code, all of the tables should be stored in the CSV file. To give you an idea of what the tables should look like, here is just one of the tables:

vr = "AGER"
var_cross("tmp", "MAJOR", vr)
lvl = levels(surveytable:::env$survey$variables[,vr])[1]
tab_subset("SPECCAT", "tmp", paste0("Preventive care: ", lvl))

To match the percentage in the published table, see the "Primary care specialty" row. Be sure to check the presentation standards flags.



Try the surveytable package in your browser

Any scripts or data that you put into this service are public.

surveytable documentation built on Aug. 26, 2025, 1:07 a.m.