library(ggplot2) library(tidyr) library(dplyr) library(forcats) library(rsr) knitr::opts_chunk$set(collapse = TRUE,comment = "#>",dev="svg",fig.ext = "svg")
Sysrev users assign millions of labels to documents. The rsr package let's you access and analyze that data. This demo shows how to access data from a review of prostate cancer biomarkers sysrev.com/p/81395.
get_answers_list(81395)
organizes data from the review (p/81395) into a list of tables.
tbls <- get_answers(81395) |> list_answers() #> list(basic=<tbl>,biomarker=<tbl>,...)
The basic table links articles (eg. aid
1781738) to basic label data (logical, string, category). In this project, a basic label describes article trial identifiers (eg NCT01682772):
tbls$basic |> mutate(nct=unlist(`NCT Trial ID`)) |> select(aid,nct) #> aid nct #> 1 11781750 NCT01682772 #> 2 11781738 NCT02854436
The other named values in get_answer_list are group labels, which are themselves tables. Here, the biomarker table describes which trials evaluated which genes.
tbls$biomarker |> mutate(biomarker.name=unlist(biomarker.name)) |> select(aid, biomarker.name) #> aid biomarker.name #> 1 11781738 BRCA1 #> 2 11781738 BRCA2
Extracted tables can be joined by article aid
. Here, joined basic/biomarker tables link trials and biomarkers. ggplot2::geom_tile
can then quickly visualize which trials use which biomarkers.
join.tb = tbls$basic |> inner_join(tbls$biomarker,by="aid") # ggplot(join.tb, # aes(x=bmkr, y=study, fill=elig)) + # geom_tile() + …
joint = tbls$basic |> inner_join(tbls$biomarker,by="aid") |> select(aid, study = short_name, bmkr = biomarker.name, eli = eligibility) |> purrr::modify(~ unlist(.,recursive = T)) top10 = joint |> group_by(bmkr) |> summarize(s = n_distinct(study)) |> slice_max(n=13,order_by=s,with_ties = F) ptb = joint |> inner_join(top10,by="bmkr") |> mutate(bmkr = fct_rev(fct_infreq(bmkr))) |> mutate(study = fct_infreq(study)) |> complete(bmkr, study, fill=list(eli="none")) |> mutate(eli = ifelse(eli=="sufficient","measured",eli)) |> mutate(eli = factor(eli,levels=c("none","required_negative","measured"))) |> mutate(eli = fct_recode(eli, exclude ="required_negative", `measure/include`="measured")) ggplot(ptb,aes(x=study, y=bmkr, fill=eli)) + geom_tile(col="white",size=0.5) + scale_fill_manual(values=c("#161616","#4C9605","#CC2C11")) + theme(text = element_text(size=12), panel.background = element_blank(), axis.text.x = element_text(angle=90,hjust=1), legend.title = element_blank(), legend.position = "top") + ylab("") + xlab("")
Now you know how to get open access review data on sysrev. Check out the rsr reference for more documentation.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.