View source: R/DataManagement.R
checkBase | R Documentation |
Performs various checks to ensure that the database is correctly formatted, and adjusts overlapping patient records.
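To make "overlapping patient records" concrete, the sketch below flags a stay whose admission falls before the same subject's previous discharge. It is only an illustration in base R on made-up data: the `stays` table and the `overlaps_previous` column are hypothetical, and this is not necessarily how checkBase detects or adjusts overlaps internally.

```r
# Toy data (hypothetical): subject s002 has a second stay that starts
# before the first one ends.
stays <- data.frame(
  sID   = c("s001", "s002", "s002", "s003"),
  fID   = c("f01", "f02", "f03", "f01"),
  Adate = as.POSIXct(c("2019-01-26", "2019-01-18", "2019-01-20", "2019-02-02"), tz = "UTC"),
  Ddate = as.POSIXct(c("2019-02-01", "2019-01-21", "2019-01-25", "2019-02-04"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Order by subject and admission date so that consecutive rows are
# consecutive stays of the same subject.
stays <- stays[order(stays$sID, stays$Adate), ]
n <- nrow(stays)

# A row overlaps its predecessor when both rows belong to the same subject
# and the admission happens before the previous discharge.
same_subject <- c(FALSE, stays$sID[-1] == stays$sID[-n])
starts_early <- c(FALSE, stays$Adate[-1] < stays$Ddate[-n])
stays$overlaps_previous <- same_subject & starts_early

print(stays)
```

Only the second stay of s002 is flagged here, because it starts on 2019-01-20 while the previous stay ends on 2019-01-21.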
checkBase(
  base,
  convertDates = FALSE,
  dateFormat = NULL,
  deleteMissing = NULL,
  deleteErrors = NULL,
  subjectID = "sID",
  facilityID = "fID",
  disDate = "Ddate",
  admDate = "Adate",
  maxIteration = 25,
  retainAuxData = TRUE,
  verbose = TRUE,
  ...
)
base |
(data.table) A patient discharge database, in the form of a data.table. The data.table should have at least the following columns: sID, the patient ID (character); fID, the facility ID (character); Adate, the admission date (POSIXct; a character column can be converted to POSIXct); Ddate, the discharge date (POSIXct; a character column can be converted to POSIXct). |
convertDates |
(boolean) whether to convert the dates to POSIXct when they are not already in that format. Default is FALSE. |
dateFormat |
(character) the input format of the date character string (e.g. "ymd" for dates like "2019-10-30"). See parse_date_time. |
deleteMissing |
(character) How to handle records that contain a missing value in at least one of the four mandatory variables. NULL (default): do not delete; the function stops with an error message. "record": delete just the incorrect record. "patient": delete all records of each patient with one or more incorrect records. |
deleteErrors |
(character) How incorrect records should be deleted. "record": delete just the incorrect record. "patient": delete all records of each patient with one or more incorrect records. |
subjectID |
(character) the name of the column containing the subject ID. Default is "sID". |
facilityID |
(character) the name of the column containing the facility ID. Default is "fID". |
disDate |
(character) the name of the column containing the discharge date. Default is "Ddate". |
admDate |
(character) the name of the column containing the admission date. Default is "Adate". |
maxIteration |
(integer) the maximum number of times the function will try to remove overlapping admissions. Default is 25. |
retainAuxData |
(boolean) whether to retain additional data provided in the database. Default is TRUE. |
verbose |
(boolean) whether to print diagnostic messages. Default is TRUE. |
... |
other parameters passed on to internal functions |
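The difference between the "record" and "patient" options of deleteMissing and deleteErrors can be sketched in base R as follows. This is a minimal illustration of the two deletion scopes on made-up data, not the package's own code; the names `db`, `bad_record`, `by_record` and `by_patient` are hypothetical.

```r
# Toy data (hypothetical) with one missing discharge date for subject s002.
db <- data.frame(
  sID   = c("s001", "s002", "s002", "s003"),
  fID   = c("f01", "f02", "f03", "f01"),
  Adate = c("2019-01-26", "2019-01-18", "2019-02-25", "2019-02-02"),
  Ddate = c("2019-02-01", NA, "2019-03-05", "2019-02-04"),
  stringsAsFactors = FALSE
)

# Flag records with a missing value in any of the four mandatory columns.
mandatory <- c("sID", "fID", "Adate", "Ddate")
bad_record <- !complete.cases(db[, mandatory])

# "record"-style deletion: drop only the flagged rows.
by_record <- db[!bad_record, ]

# "patient"-style deletion: drop every row of any subject that has
# at least one flagged row.
bad_subjects <- unique(db$sID[bad_record])
by_patient <- db[!(db$sID %in% bad_subjects), ]

nrow(by_record)   # 3 rows remain (s002 keeps its complete record)
nrow(by_patient)  # 2 rows remain (s002 is removed entirely)
```

The "patient" scope is the more conservative choice: it discards more data, but no subject is left with a partial history.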
The adjusted database as a data.table with a new class attribute "hospinet.base" and an attribute "report" containing information related to the quality of the database.
parse_date_time
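As a quick illustration of format strings such as "ymd", the sketch below calls lubridate's parse_date_time directly. That checkBase passes dateFormat on to the `orders` argument of this function is an assumption based on the cross-reference above, not something this page states; the variable names are hypothetical.

```r
library(lubridate)

# "ymd" means year, then month, then day; the separator is detected
# automatically.
iso_dates <- c("2019-10-30", "2019-01-26")
parsed_ymd <- parse_date_time(iso_dates, orders = "ymd")

# The same calendar day written day-first needs the "dmy" order instead.
parsed_dmy <- parse_date_time("30/10/2019", orders = "dmy")

class(parsed_ymd)            # "POSIXct" "POSIXt"
parsed_ymd[1] == parsed_dmy  # TRUE: both are 2019-10-30 in UTC
```

parse_date_time returns POSIXct values in UTC by default, which matches the POSIXct type expected for the date columns.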
## create a "fake and custom" data base
mydb = create_fake_subjectDB(n_subjects = 100, n_facilities = 100)
setnames(mydb, 1:4, c("myPatientId", "myHealthCareCenterID", "DateOfAdmission", "DateOfDischarge"))
mydb[, DateOfAdmission := as.character(DateOfAdmission)]
mydb[, DateOfDischarge := as.character(DateOfDischarge)]

head(mydb)
#   myPatientId myHealthCareCenterID DateOfAdmission DateOfDischarge
#1:        s001                 f078      2019-01-26      2019-02-01
#2:        s002                 f053      2019-01-18      2019-01-21
#3:        s002                 f049      2019-02-25      2019-03-05
#4:        s002                 f033      2019-04-17      2019-04-21
#5:        s003                 f045      2019-02-02      2019-02-04
#6:        s003                 f087      2019-03-12      2019-03-19

str(mydb)
#Classes ‘data.table’ and 'data.frame': 262 obs. of 4 variables:
# $ myPatientId         : chr "s001" "s002" "s002" "s002" ...
# $ myHealthCareCenterID: chr "f078" "f053" "f049" "f033" ...
# $ DateOfAdmission     : chr "2019-01-26" "2019-01-18" "2019-02-25" "2019-04-17" ...
# $ DateOfDischarge     : chr "2019-02-01" "2019-01-21" "2019-03-05" "2019-04-21" ...
#- attr(*, ".internal.selfref")=<externalptr>

my_checked_db = checkBase(mydb,
                          subjectID = "myPatientId",
                          facilityID = "myHealthCareCenterID",
                          disDate = "DateOfDischarge",
                          admDate = "DateOfAdmission",
                          convertDates = TRUE,
                          dateFormat = "ymd")
#Converting Adate, Ddate to Date format
#Checking for missing values...
#Checking for duplicated records...
#Removed 0 duplicates
#Done.

head(my_checked_db)
#    sID  fID      Adate      Ddate
#1: s001 f078 2019-01-26 2019-02-01
#2: s002 f053 2019-01-18 2019-01-21
#3: s002 f049 2019-02-25 2019-03-05
#4: s002 f033 2019-04-17 2019-04-21
#5: s003 f045 2019-02-02 2019-02-04
#6: s003 f087 2019-03-12 2019-03-19

str(my_checked_db)
#Classes ‘hospinet.base’, ‘data.table’ and 'data.frame': 262 obs. of 4 variables:
#$ sID  : chr "s001" "s002" "s002" "s002" ...
#$ fID  : chr "f078" "f053" "f049" "f033" ...
#$ Adate: POSIXct, format: "2019-01-26" "2019-01-18" "2019-02-25" "2019-04-17" ...
#$ Ddate: POSIXct, format: "2019-02-01" "2019-01-21" "2019-03-05" "2019-04-21" ...
# ...

## Show the quality report
attr(my_checked_db, "report")