parse_pdata | R Documentation |
Many GSEs now store key-value annotation pairs in "characteristics_ch*"
columns. When that is the case, this function cleans those columns up,
turning the keys into column names and the values into column values.
parse_pdata(data, columns = NULL, sep = ":", split = ";")
data |
A data.frame like object, tibble and data.table are also okay. |
columns |
A character vector of column names; each name should end with a match for the regex "(ch\d*)(\.\d*)?$".
these columns in |
sep |
A string separating the key from the value in each pair. Default is ":". |
split |
Passed to the strsplit function. Default is ";". |
A characteristics annotation column usually contains multiple key-value
items, so each such column is first split by split, and the key-value pairs
are then extracted from the resulting items. For each key, a new column is
added. Its name is the first group matched by the regex "(ch\d*)(\.\d*)?$"
in the original column name, joined to the key by "_", and its values are
the character vector of values paired with that key.
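The steps described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's actual implementation: the helper name parse_one_column and the toy data frame are hypothetical, and the splitting is done with fixed (non-regex) matching for simplicity.

```r
# Hypothetical helper illustrating the described steps; NOT rgeo's own code.
parse_one_column <- function(x, prefix, sep = ":", split = ";") {
  # Step 1: split every cell into its key-value items.
  items <- strsplit(as.character(x), split, fixed = TRUE)
  # Step 2: cut each item at the first `sep` into a trimmed key and value.
  pairs <- lapply(items, function(item) {
    pos <- regexpr(sep, item, fixed = TRUE)
    keep <- !is.na(pos) & pos > 0
    item <- item[keep]
    pos <- pos[keep]
    keys <- trimws(substr(item, 1, pos - 1))
    values <- trimws(substr(item, pos + nchar(sep), nchar(item)))
    stats::setNames(values, keys)
  })
  # Step 3: build one character column per key, NA where a key is absent.
  all_keys <- unique(unlist(lapply(pairs, names)))
  out <- lapply(all_keys, function(k) {
    vapply(
      pairs,
      function(p) if (k %in% names(p)) p[[k]] else NA_character_,
      character(1)
    )
  })
  names(out) <- paste(prefix, all_keys, sep = "_")
  data.frame(out, stringsAsFactors = FALSE, check.names = FALSE)
}

# Toy input (made up for illustration).
df <- data.frame(
  characteristics_ch1 = c("gender: M; age: 40", "gender: F; age: 35"),
  stringsAsFactors = FALSE
)
col <- "characteristics_ch1"
# The first regex group of the column name becomes the new-column prefix.
prefix <- regmatches(col, regexec("(ch\\d*)(\\.\\d*)?$", col))[[1]][2]
parsed <- parse_one_column(df[[col]], prefix)
result <- cbind(df, parsed)
```

Here result keeps the original column and gains one column per key, named by joining the prefix and the key with "_".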
A modified data.frame.
# Fetch the series matrix for GSE53987 into a temporary directory.
gse53987 <- rgeo::get_geo(
  "gse53987", tempdir(),
  gse_matrix = TRUE, add_gpl = FALSE,
  pdata_from_soft = FALSE
)
gse53987_smp_info <- Biobase::pData(gse53987)
# Insert "; " before each known key so the items can be split on ";".
gse53987_smp_info$characteristics_ch1 <- stringr::str_replace_all(
  gse53987_smp_info$characteristics_ch1,
  "gender|race|pmi|ph|rin|tissue|disease state",
  function(x) paste0("; ", x)
)
gse53987_smp_info <- rgeo::parse_pdata(gse53987_smp_info)
# Show the original column next to the newly parsed "ch1_*" columns.
gse53987_smp_info[grepl(
  "^ch1_|characteristics_ch1", names(gse53987_smp_info)
)]