asir: Calculate age-standardized incidence rates

View source: R/asir.R

asir    R Documentation

Calculate age-standardized incidence rates

Description

Calculate age-standardized incidence rates

Usage

asir(
  df,
  dattype = NULL,
  std_pop = "ESP2013",
  truncate_std_pop = FALSE,
  futime_src = "refpop",
  summarize_groups = "none",
  count_var,
  stdpop_df = standard_population,
  refpop_df = population,
  region_var = NULL,
  age_var = NULL,
  sex_var = NULL,
  year_var = NULL,
  site_var = NULL,
  futime_var = NULL,
  pyar_var = NULL,
  alpha = 0.05
)

Arguments

df

dataframe in wide format

dattype

can be "zfkd" or "seer" or NULL. Will set default variable names if dattype is "seer" or "zfkd". Default is NULL.

std_pop

can be one of "ESP2013", "ESP1976", "WHO1960" or "WHO2000". Default is "ESP2013".

truncate_std_pop

if TRUE, the standard population will be truncated to drop all age-groups that do not occur in df. Default is FALSE.
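To make the effect of truncation concrete, here is a minimal base-R sketch of direct age standardization on toy data. It is not the package's implementation: the age-group labels, the numbers and the object names (cases, std, asir_trunc, asir_full) are all made up for illustration, and the assumption that truncation simply drops unmatched age groups from the weight total is taken from the argument description above.

```r
# Toy inputs: every value below is invented for illustration only.
cases <- data.frame(
  age  = c("40 - 44", "45 - 49", "50 - 54"),
  n    = c(10, 20, 30),            # observed cases per age group
  pyar = c(10000, 10000, 10000)    # person-years at risk per age group
)
std <- data.frame(
  age          = c("40 - 44", "45 - 49", "50 - 54", "55 - 59"),
  population_n = c(7000, 7000, 7000, 6500)  # standard population sizes
)

# Join age-specific counts to the standard population weights.
m <- merge(cases, std, by = "age")
age_rates <- m$n / m$pyar          # age-specific incidence rates
weighted_sum <- sum(age_rates * m$population_n)

# Truncated: only age groups that occur in the data contribute weight.
std_trunc <- std[std$age %in% cases$age, ]
asir_trunc <- weighted_sum / sum(std_trunc$population_n) * 1e5

# Not truncated: the full standard population stays in the denominator,
# so age groups without data pull the standardized rate down.
asir_full <- weighted_sum / sum(std$population_n) * 1e5

asir_trunc  # about 200 per 100,000 person-years
asir_full   # lower, because the "55 - 59" weight is kept
```

In this sketch the choice only changes the denominator of the weighted mean, which is why truncated and untruncated rates are not directly comparable.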

futime_src

can be either "refpop" or "cohort". Default is "refpop".

summarize_groups

option to summarize stratified groups. Default is "none". If you want to define variables that should be summarized into one group, you can choose from region_var, sex_var and year_var. Define multiple summarize variables with summarize_groups = c("region", "sex", "year").

count_var

variable to be counted as observed case. Should be 1 for case to be counted.

stdpop_df

df where the standard population is defined. It is assumed that stdpop_df has the columns "sex" for biological sex, "age" for age-groups, "standard_pop" for the name of the standard population (e.g. "European Standard Population 2013") and "population_n" for the size of the standard population age-group. stdpop_df must use the same category coding of age and sex as age_var and sex_var.
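As a rough sketch of the expected layout, a custom standard population could be built like this. Only the four column names come from the description above; the object name my_stdpop, the sex and age-group codings and the population sizes are placeholders and would need to match the coding actually used in df.

```r
# Hypothetical custom standard population; values are placeholders.
my_stdpop <- data.frame(
  sex          = c("Male", "Male", "Female", "Female"),   # assumed coding
  age          = c("00 - 04", "05 - 09", "00 - 04", "05 - 09"),  # assumed labels
  standard_pop = "My Standard Population",  # name of the standard population
  population_n = c(5000, 5500, 5000, 5500), # size of each age group
  stringsAsFactors = FALSE
)

# Check that all columns asir() is documented to expect are present.
required_cols <- c("sex", "age", "standard_pop", "population_n")
has_all_cols <- all(required_cols %in% names(my_stdpop))
has_all_cols
```

Such an object would then be passed as stdpop_df = my_stdpop; how std_pop is matched against the standard_pop column is not described here, so check that before relying on a custom name.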

refpop_df

df where reference population data is defined. Only required if option futime_src = "refpop" is chosen. It is assumed that refpop_df has the columns "region" for region, "sex" for biological sex, "age" for age-groups (can be single ages or 5-year brackets), "year" for time period (can be single year or 5-year brackets) and "population_pyar" for person-years at risk in the respective age/sex/year cohort. refpop_df must use the same category coding of age, sex, region, year and site as age_var, sex_var, region_var, year_var and site_var.

region_var

variable in df that contains information on region where case was incident. Default is set if dattype is given.

age_var

variable in df that contains information on age-group. Default is set if dattype is given.

sex_var

variable in df that contains information on biological sex. Default is set if dattype is given.

year_var

variable in df that contains information on year or year-period when case was incident. Default is set if dattype is given.

site_var

variable in df that contains information on ICD code of case diagnosis. Default is set if dattype is given.

futime_var

variable in df that contains follow-up time per person (in years) in cohort (can only be used with futime_src = "cohort"). Default is set if dattype is given.

pyar_var

variable in refpop_df that contains person-years-at-risk in reference population (can only be used with futime_src = "refpop"). Default is set if dattype is given.

alpha

significance level for confidence interval calculations. Default is alpha = 0.05 which will give 95 percent confidence intervals.
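The relation between alpha and the confidence level can be sketched in base R as follows. The normal quantile is shown only to illustrate how alpha is split across two tails; which interval method asir() actually uses is not stated here, so treat that part as an assumption.

```r
alpha <- 0.05

# Confidence level implied by the significance level.
conf_level <- 1 - alpha

# Two-sided critical value of the standard normal distribution
# (illustrative only; the package may use a different interval method).
z <- qnorm(1 - alpha / 2)

conf_level
z
```

A smaller alpha, for example alpha = 0.01, gives wider 99 percent intervals.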

Value

df

Examples

#load packages (dplyr provides the %>% pipe used below)
library(msSPChelpR)
library(dplyr)

#load sample data
data("us_second_cancer")
data("standard_population")
data("population_us")

#make wide data as this is the required format
usdata_wide <- us_second_cancer %>%
                    #only use sample
                    dplyr::filter(as.numeric(fake_id) < 200000) %>%
                    msSPChelpR::reshape_wide_tidyr(case_id_var = "fake_id", 
                    time_id_var = "SEQ_NUM", timevar_max = 2)
                    
#create count variable: 1 if a second primary cancer is recorded, otherwise 0
usdata_wide <- usdata_wide %>%
                    dplyr::mutate(count_spc = dplyr::case_when(!is.na(t_site_icd.2) ~ 1,
                    TRUE ~ 0))
 
#remove cases for which no reference population exists
usdata_wide <- usdata_wide %>%
                    dplyr::filter(t_yeardiag.2 %in% c("1990 - 1994", "1995 - 1999", "2000 - 2004",
                                                       "2005 - 2009", "2010 - 2014"))
                    

#now we can run the function
msSPChelpR::asir(usdata_wide,
      dattype = "seer",
      std_pop = "ESP2013",
      truncate_std_pop = FALSE,
      futime_src = "refpop",
      summarize_groups = "none",
      count_var = "count_spc",
      refpop_df = population_us,
      region_var = "registry.1", 
      age_var = "fc_agegroup.1",
      sex_var = "sex.1",
      year_var = "t_yeardiag.2", 
      site_var = "t_site_icd.2",
      pyar_var = "population_pyar")



marianschmidt/msSPChelpR documentation built on Feb. 1, 2024, 6:45 a.m.