knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  dpi = 450
)
set.seed(42)
In general, the workflow from start to finish is structured in three steps.
knitr::include_graphics("https://github.com/SchmidtPaul/CitaviR/blob/master/man/figures/WorkflowSQL.png?raw=true")
The following screenshot shows the Citavi project that is available in CitaviR as 3dupsin5refs.ctv6.
knitr::include_graphics("https://github.com/SchmidtPaul/CitaviR/blob/master/vignettes/Citavi_Project.PNG?raw=true")
Here, we can read in the information of interest via read_Citavi_ctv6().
library(tidyverse)
library(CitaviR)

example_path <- example_file("3dupsin5refs/3dupsin5refs.ctv6") # in real life: replace with your path

CitDat <- read_Citavi_ctv6(path = example_path, CitDBTableName = "Reference")

CitDat %>%
  select(Title, Year, Abstract, DOI)
If, for whatever reason, you prefer to import from Citavi via Excel files (exported from Citavi) rather than via SQL, CitaviR offers an alternative approach via read_Citavi_xlsx() and write_Citavi_xlsx(), described elsewhere in the package documentation.
# quietly store the Citavi project for later
OriginalCitDat <- CitDat
At this point there are many things one may wish to do with the data. In this example we will use the CitaviR functions to identify and handle obvious duplicates. (Check out the article on obvious and potential duplicates for more.)
CitDat <- CitDat %>% find_obvious_dups()
One way of identifying obvious duplicates is via CitaviR::find_obvious_dups(). In short, it first creates a clean_title by combining each reference's Title and Year into a simplified string. This simplification is based on janitor::make_clean_names(); among other things, it converts everything to lowercase and removes special characters and unnecessary spaces. If two references have the same clean_title, they are identified as obvious duplicates. In this example, two references were indeed identified as obvious duplicates:
CitDat %>% select(Title, clean_title:obv_dup_id)
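To make the idea behind clean_title more concrete, here is a minimal base-R sketch of the same principle. It is not the code used inside find_obvious_dups(): the helper simplify_title() is hypothetical, it only approximates what janitor::make_clean_names() does, and the titles are made-up toy data.

```r
# Hypothetical helper (not part of CitaviR): a rough base-R approximation
# of simplifying Title + Year into a comparable string.
simplify_title <- function(title, year) {
  x <- tolower(paste(title, year))   # combine and convert to lowercase
  x <- gsub("[^a-z0-9]+", "_", x)    # collapse special characters and spaces
  gsub("^_+|_+$", "", x)             # trim leading/trailing underscores
}

# Toy data (made up for illustration only)
refs <- data.frame(
  Title = c("Heritability in Plant Breeding",
            "heritability in plant-breeding!",
            "Hritability in Plant Breeding"),
  Year = c("2019", "2019", "2019"),
  stringsAsFactors = FALSE
)

refs$clean_title <- simplify_title(refs$Title, refs$Year)

# Every repeated clean_title after its first occurrence is an obvious duplicate
refs$is_obvious_dup <- duplicated(refs$clean_title)
refs
```

The first two toy titles collapse to the same string and are therefore flagged, while the third differs by one letter and is not.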
Note how a single typo ("Hritability") prevents ct_03 from being detected as an obvious duplicate of ct_02. For cases like this, one may use CitaviR::find_potential_dups(), which is explained in the article on obvious and potential duplicates but not used here.
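As a rough illustration of why near-identical titles can still be caught, the sketch below compares two toy strings by edit distance using utils::adist() from base R. Whether find_potential_dups() works this way internally is an assumption not confirmed here; the strings and the 0.9 threshold are hypothetical.

```r
# Toy clean titles (made up); the second one contains a one-letter typo.
clean_titles <- c(
  ct_02 = "heritability_in_plant_breeding_2019",
  ct_03 = "hritability_in_plant_breeding_2019"
)

# utils::adist() returns a matrix of pairwise Levenshtein distances.
edit_dist <- utils::adist(clean_titles)

# Normalise by the longer string to get a similarity between 0 and 1.
max_len <- max(nchar(clean_titles))
similarity <- 1 - edit_dist[1, 2] / max_len

# Hypothetical threshold for flagging a pair as potential duplicates
is_potential_dup <- similarity > 0.9
is_potential_dup
```

Exact matching on clean_title misses this pair, whereas a similarity threshold flags it for manual review.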
At this point we have already gained information and could continue with the remaining steps of the workflow. However, duplicates sometimes hold different information, as is the case here for ct_02 and the columns PubMedID and DOI:
CitDat %>% filter(clean_title_id == "ct_02") %>% select(clean_title_id, obv_dup_id, DOI, PubMedID)
In such a scenario it would be best to gather all information into the one non-duplicate (= dup_01) that will be kept and of interest later on. Here, CitaviR::handle_obvious_dups() comes in handy:
CitDat <- CitDat %>% handle_obvious_dups(fieldsToHandle = c("DOI", "PubMedID"))
As the output below shows, the columns listed in fieldsToHandle are filled upward (i.e. via tidyr::fill(all_of(fieldsToHandle), .direction = "up")).
CitDat %>% filter(clean_title_id == "ct_02") %>% select(clean_title_id, obv_dup_id, DOI, PubMedID)
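The upward fill can be tried out in isolation on a toy table. This is a sketch only: the values are placeholders, and grouping by clean_title_id is an assumption about how the fill is restricted to one set of duplicates, not a confirmed detail of handle_obvious_dups().

```r
library(dplyr)
library(tidyr)

# Toy data with placeholder values (not taken from the example project)
toy <- tibble(
  clean_title_id = c("ct_01", "ct_02", "ct_02"),
  obv_dup_id     = c("dup_01", "dup_01", "dup_02"),
  DOI            = c("doi_a", NA, "doi_b"),
  PubMedID       = c(NA, "111", NA)
)

fieldsToHandle <- c("DOI", "PubMedID")

filled <- toy %>%
  group_by(clean_title_id) %>%                        # assumption: fill within duplicate sets only
  fill(all_of(fieldsToHandle), .direction = "up") %>% # copy values upward into missing cells
  ungroup()

filled
```

In this toy table, the dup_01 row of ct_02 picks up the DOI stored only in its duplicate, while the missing PubMedID of ct_01 stays missing because nothing is carried across groups.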
Therefore, we could now get rid of all obvious duplicates (obv_dup_id != "dup_01") without losing any information.
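Dropping the duplicates is not done in this walkthrough, but it could look like the following sketch, which keeps only the rows labelled dup_01. The toy table is made up for illustration; on real data the same filter would be applied to CitDat.

```r
library(dplyr)

# Toy data (made up): one unique reference and one pair of obvious duplicates
toy <- tibble(
  clean_title_id = c("ct_01", "ct_02", "ct_02"),
  obv_dup_id     = c("dup_01", "dup_01", "dup_02")
)

# Keep only the first reference of each set of obvious duplicates
deduplicated <- toy %>%
  filter(obv_dup_id == "dup_01")

deduplicated
```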
Finally, we want to write the gained information back into the Citavi project. To do so, we can make use of update_Citavi_ctv6(). Say we would like to overwrite the old DOI and PubMed information and additionally store the clean_title_id and obv_dup_id in Custom field 1 and Custom field 2, respectively. You should probably close your Citavi project before running this:
CitDat %>%
  update_Citavi_ctv6(
    path = example_path,
    CitDBTableName = "Reference",
    CitDatVarToCitDBTableVar = c(
      "DOI" = "DOI",
      "PubMedID" = "PubMedID",
      "clean_title_id" = "CustomField1",
      "obv_dup_id" = "CustomField2"
    ),
    quiet = FALSE
  )
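Since a call like the one above overwrites fields in the project file, it may be worth copying the file before running it on a real project. The sketch below uses only base R; backup_file() is a hypothetical helper that is not part of CitaviR, and the temporary file merely stands in for a real .ctv6 path.

```r
# Hypothetical helper (not part of CitaviR): copy a file next to the original
backup_file <- function(path) {
  backup_path <- paste0(path, ".bak")
  ok <- file.copy(path, backup_path, overwrite = TRUE)
  if (!ok) stop("Could not create backup of: ", path)
  invisible(backup_path)
}

# Stand-in for a real project file, used here only to demonstrate the helper
tmp_project <- tempfile(fileext = ".ctv6")
writeLines("placeholder content", tmp_project)

backup_path <- backup_file(tmp_project)
file.exists(backup_path)
```

In real use, the helper would be called with the path to your own project file before the update.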
If you now open the Citavi project, it should have changed as expected.
knitr::include_graphics("https://github.com/SchmidtPaul/CitaviR/blob/master/man/figures/Citavi_BeforeAfter.png?raw=true")
# quietly restore the Citavi project to original state
OriginalCitDat %>%
  mutate(clean_title_id = NA_character_, obv_dup_id = NA_character_) %>%
  update_Citavi_ctv6(
    path = example_path,
    CitDBTableName = "Reference",
    CitDatVarToCitDBTableVar = c(
      "DOI" = "DOI",
      "PubMedID" = "PubMedID",
      "clean_title_id" = "CustomField1",
      "obv_dup_id" = "CustomField2"
    ),
    quiet = TRUE
  )