csu_group_cases: csu_group_cases

View source: R/csu_group_cases.R

Description

csu_group_cases groups individual records into 5-year age groups, broken down by other user-defined variables (sex, registry, etc.).
Optionally, it can also group cancers according to a standard ICD-10 coding and extract the year from a custom date format.

Usage

csu_group_cases(df_data,
	var_age,
	group_by = NULL,
	var_cases = NULL,
	df_ICD = NULL,
	var_ICD = NULL,
	var_year = NULL,
	all_cancer = FALSE)

Arguments

df_data

Individual data (must be an R data.frame; see the examples for importing a csv file).

var_age

Age variable (numeric). Values > 150 are treated as missing age.

group_by

(Optional) A vector of variables defining the different populations (sex, country, etc.).

var_cases

(Optional) Cases variable: use it if the data already contain a variable holding the number of cases.

df_ICD

(Optional) ICD file with the ICD grouping information. Must have 2 fields: "ICD" and "LABEL". Two formats are possible:

One row per ICD code, each with its group label:

ICD LABEL
C82 NHL
C83 NHL
C84 NHL
C85 NHL
C96 NHL

ICD codes already grouped:

ICD_group LABEL
C82-85,C96 NHL

Two ICD codes separated by "-" include all the ICD codes between them.
Two ICD codes separated by "," include only those 2 ICD codes.
For instance, C82-85, C96 (or C82-C85, C96) includes:
C82, C83, C84, C85 and C96.
Must be provided if the var_ICD argument is defined.

Example: ICD_group_GLOBOCAN
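
To make the "-" and "," rules concrete, here is a minimal sketch that expands a grouped ICD string into single codes. The helper expand_icd_group is hypothetical and not part of Rcan, and it assumes 3-character codes with no decimal subcategories; the package's own parsing may differ.

```r
# Hypothetical helper (not an Rcan function): expand "C82-85,C96" into single codes.
# Assumption: 3-character codes (one letter + two digits), no decimal subcategories.
expand_icd_group <- function(icd_group) {
  # Split on commas and drop surrounding whitespace
  parts <- trimws(strsplit(icd_group, ",", fixed = TRUE)[[1]])
  codes <- lapply(parts, function(part) {
    # A "-" marks a range; a part without "-" is a single code
    bounds <- strsplit(part, "-", fixed = TRUE)[[1]]
    # Reuse the letter of the first code, so both "C82-85" and "C82-C85" work
    letter <- substr(bounds[1], 1, 1)
    nums <- as.integer(gsub("[^0-9]", "", bounds))
    if (length(nums) == 1) {
      sprintf("%s%02d", letter, nums)
    } else {
      sprintf("%s%02d", letter, seq(nums[1], nums[2]))
    }
  })
  unlist(codes)
}

expand_icd_group("C82-85, C96")
expand_icd_group("C82-C85, C96")
```

Both calls should yield the five codes listed above (C82, C83, C84, C85 and C96).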

var_ICD

(Optional) ICD variable: name of the ICD variable in the individual data.
Must be provided if the df_ICD argument is defined.

var_year

(Optional) Year variable: extracts the year from a custom date format, as long as the year is expressed with 4 digits (e.g. "yyyymmdd", "ddmmyyyy", "yyyy/mm", "dd-mm-yyyy", etc.), and groups the data by year.
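
As an illustration of what extracting a 4-digit year from mixed formats can look like, the sketch below picks the first run of four digits starting with 19 or 20. The helper extract_year is hypothetical and this heuristic is an assumption, not the method Rcan uses; undelimited strings such as "20031999" are ambiguous and could be misread by it.

```r
# Hypothetical helper (not an Rcan function): pull a 4-digit year out of a date string.
# Assumption: the year starts with 19 or 20 and is the first such 4-digit run.
extract_year <- function(x) {
  x <- as.character(x)
  pos <- regexpr("(19|20)[0-9]{2}", x)
  year <- rep(NA_integer_, length(x))
  # Only fill entries where a match was found (NA input stays NA)
  hit <- !is.na(pos) & pos > 0
  year[hit] <- as.integer(substr(x[hit], pos[hit], pos[hit] + 3))
  year
}

extract_year(c("20100315", "15/03/2010", "2010/03", "15-03-2010", "unknown"))
```

The first four inputs should all give 2010, and the last one NA.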

all_cancer

(Optional) If TRUE, also calculates the number of cases for all cancers (C00-97) and for all cancers except non-melanoma skin cancer (C00-97 excluding C44).
Requires the var_ICD and df_ICD arguments to be defined.
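
The two totals described here can be pictured as two filters on the ICD code. The sketch below is only an illustration on made-up codes; the variable names are hypothetical and it does not reproduce how the package builds these groups.

```r
# Illustrative only: flag codes that fall in C00-97, with and without C44.
# The vector `site` is made-up example data, not taken from the package.
site <- c("C18", "C44", "C50", "D05", "C97", "C44.1")

# Keep the 3-character category, then read its numeric part
code3 <- substr(site, 1, 3)
num <- suppressWarnings(as.integer(substr(code3, 2, 3)))

is_all_cancer <- substr(code3, 1, 1) == "C" & !is.na(num) & num <= 97
is_all_but_c44 <- is_all_cancer & code3 != "C44"

sum(is_all_cancer)   # cases counted under C00-97
sum(is_all_but_c44)  # cases counted under C00-97 excluding C44
```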

Details

For most analyses, a database of individual cases needs to be grouped by category.
This function groups the data by 5-year age group and other user-defined variables.
The next step is to add 5-year population data (see csu_merge_cases_pop).
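
For readers who want to see the idea behind the grouping, here is a rough base R sketch on made-up data. It is not the package's implementation: the last open group at 85+, the labels, and dropping records with missing age are all assumptions made for the example.

```r
# Illustrative sketch of 5-year age grouping in base R (not Rcan's implementation).
# `cases` is made-up data: one row per individual case.
cases <- data.frame(
  sex = c(1, 1, 2, 2, 2, 1),
  age = c(3, 4, 52, 54, 999, 87)
)

# Ages above 150 are treated as missing, as described for var_age
cases$age[cases$age > 150] <- NA

# Assumption: groups 0-4, 5-9, ..., 80-84 and an open-ended 85+ group
breaks <- c(seq(0, 85, by = 5), Inf)
labels <- c(paste(seq(0, 80, by = 5), seq(4, 84, by = 5), sep = "-"), "85+")
cases$age_group <- cut(cases$age, breaks = breaks, labels = labels, right = FALSE)

# Count cases per sex and age group; rows with missing age are dropped here
cases$n <- 1
grouped <- aggregate(n ~ sex + age_group, data = cases, FUN = sum)
grouped
```

In this sketch the record with age 999 disappears from the result; csu_group_cases may handle missing ages differently.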

Value

Returns a data frame.

Author(s)

Mathieu Laversanne

See Also

csu_merge_cases_pop csu_asr csu_cumrisk csu_eapc csu_ageSpecific csu_ageSpecific_top csu_bar_top csu_time_trend csu_trendCohortPeriod

Examples

# You can import your data from a csv file using read.csv:
# mydata <- read.csv("mydata.csv", sep=",")

data(ICD_group_GLOBOCAN)
data(data_individual_file)

# Group individual data by
# 5-year age group
df_data_age <- csu_group_cases(data_individual_file,
  var_age="age",
  group_by=c("sex", "regcode", "reglabel", "site"))

# Group individual data by
# 5-year age group
# ICD grouping from data frame ICD_group_GLOBOCAN
df_data_icd <- csu_group_cases(data_individual_file,
  var_age="age",
  group_by=c("sex", "regcode", "reglabel"),
  df_ICD=ICD_group_GLOBOCAN,
  var_ICD="site")

# Group individual data by
# 5-year age group
# ICD grouping from data frame ICD_group_GLOBOCAN
# year (extracted from the date of incidence)
df_data_year <- csu_group_cases(data_individual_file,
  var_age="age",
  group_by=c("sex", "regcode", "reglabel"),
  df_ICD=ICD_group_GLOBOCAN,
  var_ICD="site",
  var_year="doi")

# You can export your result as a csv file using write.csv:
# write.csv(result, file="result.csv")

Rcan documentation built on July 1, 2020, 10:20 p.m.