The goal of the ox package is to read OpenClinica ODM 1.3 XML export files and create dataframes for data and metadata (or an ox_all object, a list containing both) from the parsed XML. Tidy dataframes can also be created for any ItemGroup, optionally defining factors from codelists.
ox is currently under development.
You can install the development version of ox from GitHub with:
# install.packages("devtools")  # run once if devtools is not installed
devtools::install_github("acobos/ox")
This basic example shows how to solve a common problem: creating a tidy dataframe for a group of related items (an ItemGroup in OpenClinica terminology), e.g., for demographic data.
First, load the XML and ox packages. Then parse the XML file with xmlParse() from the XML package, and use ox_all() to create an ox_all object. This step reads both data and metadata and is slow; it may take a long time with a real study.
Use ox_info() to see some basic information about this object.
# load packages
library(XML)
library(ox)
# the xml file address/name
xml_file <- system.file("extdata",
"odm1.3_full_example.xml",
package = "ox",
mustWork = TRUE)
# parse the xml file
doc <- xmlParse(xml_file)
# create an (ox_all) object
d <- ox_all(doc)
> Getting ItemData nodes...
> Extracting data from ItemData nodes...
> Done
> Extracting metadata...
> Done
# get basic info
ox_info(d)
> $numbers
> datapoints subjects sites events forms groups
> 240 2 2 6 12 15
> items
> 106
>
> $events
> [1] "SE_BASELINE" "SE_RW1" "SE_ENDOFRADIOTHERAPY"
> [4] "SE_ACUTETOXICITY" "SE_FM1" "SE_FM3"
>
> $forms
> [1] "F_INFORMEDCONS_1" "F_IECRITERIA_11" "F_COMORBIDITIE_11"
> [4] "F_DEMO_1" "F_CANCERHISTOL_11" "F_PREVMEDANTIN_11"
> [7] "F_SURGERY_1" "F_RANDOM_4" "F_PHYSICALEXAM_11"
> [10] "F_RADIOTHERAPY_2" "F_ACUTETOXICIT_3" "F_SURVIVALANDR_21"
>
> $groups
> [1] "IG_INFOR_INFORMEDCONSENT"
> [2] "IG_IECRI_IECRITERIA"
> [3] "IG_COMOR_UNGROUPED"
> [4] "IG_COMOR_COMORBIDITIES"
> [5] "IG_DEMO_DEMOGRAPHICDATA"
> [6] "IG_CANCE_CANCERHISTOLOGYANDRECEPTOR"
> [7] "IG_PREVM_UNGROUPED"
> [8] "IG_PREVM_PREVIOUSMEDICATIONANTINEOP"
> [9] "IG_SURGE_TYPEOFSURGERY"
> [10] "IG_RANDO_RANDOMIZATION_4899"
> [11] "IG_PHYSI_PHYSICALEXAMINATION"
> [12] "IG_RADIO_RADIOTHERAPYINTERVENTION"
> [13] "IG_ACUTE_UNGROUPED"
> [14] "IG_ACUTE_ACUTETOXICITY"
> [15] "IG_SURVI_SURVIVALANDDISEASERECURREN"
In the previous output, $numbers documents that this example data includes 240 datapoints on two subjects, from two sites, etc.
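Because creating the ox_all object is slow, one option is to cache it on disk and reload it in later sessions. The following is a minimal sketch using base R's saveRDS() and readRDS(). It assumes the object holds only ordinary R data (a list of dataframes, as described above); cache_rds() is a hypothetical helper, not part of ox, and the file name in the usage comment is made up.

```r
# Hypothetical helper (not part of ox): return the cached object if the
# .rds file already exists; otherwise build it, save it, and return it.
cache_rds <- function(path, build) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  obj <- build()        # the slow step, run only when no cache exists
  saveRDS(obj, path)    # store the result for later sessions
  obj
}

# Possible usage with the objects from the example above
# (commented out because it needs the XML and ox packages):
# d <- cache_rds("ox_all_cache.rds", function() ox_all(xmlParse(xml_file)))
```

Delete the cache file whenever a new export is available, since this sketch does not detect changes in the source XML file.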
One of the ItemGroups in this study ($groups in the output above) is IG_DEMO_DEMOGRAPHICDATA. Let's get a tidy dataframe for this group, and see the result.
# get tidy dataframe for demographic data
demo <- ox_xtract_group(d, group = "IG_DEMO_DEMOGRAPHICDATA",
use_item_names = TRUE,
define_factors = TRUE)
> Joining, by = "codelist_oid"
demo
> study_oid subject_key subject_id event_oid event_repeat_key form_oid
> 1 S_CHU_SANT SS_189 189 SE_BASELINE NA F_DEMO_1
> 2 S_PARCSALU SS_100 100 SE_BASELINE NA F_DEMO_1
> group_oid group_repeat_key demo_age demo_menstrual
> 1 IG_DEMO_DEMOGRAPHICDATA 1 55 Postmenopause
> 2 IG_DEMO_DEMOGRAPHICDATA 1 72 Postmenopause
In the resulting dataframe, columns study_oid to group_repeat_key are keys for study site, subject, event, etc.; demo_age and demo_menstrual are the relevant variables, i.e., the two items included in the IG_DEMO_DEMOGRAPHICDATA ItemGroup.
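To illustrate what defining factors from a codelist means, here is a small base R sketch. It is not ox's implementation: the codelist dataframe, the codes "1" and "2", and the label Premenopause are made up for illustration; only the label Postmenopause comes from the output above.

```r
# Made-up codelist: maps stored codes to human-readable labels
codelist <- data.frame(
  code  = c("1", "2"),
  label = c("Premenopause", "Postmenopause"),
  stringsAsFactors = FALSE
)

# Made-up raw values, as they might be stored in an export
raw_values <- c("2", "2")

# Convert codes to a factor whose levels follow the codelist order
demo_menstrual <- factor(raw_values,
                         levels = codelist$code,
                         labels = codelist$label)

print(demo_menstrual)
```

The resulting factor keeps every codelist label as a level, even those that do not occur in the data, which is useful for tables and plots.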
See the vignette for more details.