View source: R/rebuild_data_source_cocit.R
This function builds a co-citation table with DOI information from a paper table. The papers are normally exported from the Web of Knowledge in CSV format. Make sure to add a column of paper numbers (Papern.No), as these are required to run the function.
rebuild_data_source_cocit(paperTable, ignoreCRs)
paperTable | A CSV table imported from the Web of Knowledge, with a Papern.No value for each row. The file can be read like this: read.csv("file", sep = ";", header = TRUE, skip = 1). Make sure to add a column named Papern.No; it is used to assign the citations to a paper while the table is generated.
ignoreCRs | Old argument; it defaults to FALSE and is not used in the function body.
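A minimal sketch of preparing the input might look like the following. The temporary file, its banner line, and the AU and TI columns are hypothetical stand-ins for a real export; only the read.csv call and the Papern.No and CR column names come from this page. The CR field is quoted here because it contains the same ";" character that separates the columns.

```r
# Hypothetical miniature export written to a temporary file.
# The first line stands in for whatever skip = 1 is meant to drop.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "hypothetical export banner line",
  "AU;TI;CR",
  "Smith J;First paper;\"DOE J, 2001, J TEST, V1, P10, DOI 10.1000/abc; ROE R, 1999, J OTHER\"",
  "Lee K;Second paper;\"POE E, 2010, J MORE, V5, P77\""
), csv_path)

# Read the file as described in the argument documentation.
paperTable <- read.csv(csv_path, sep = ";", header = TRUE, skip = 1,
                       stringsAsFactors = FALSE)

# Add the required paper number column, one number per row.
paperTable$Papern.No <- seq_len(nrow(paperTable))
```

The resulting paperTable can then be passed to rebuild_data_source_cocit(paperTable).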
Attention: this function can run for more than an hour, depending on the number of papers given. Around 9000 individual papers take about 1 hour.
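One possible contributor to the long run time, though this page does not confirm it, is that the result grows by one rbind call per citation. The sketch below shows the general alternative of collecting rows in a list and binding them once at the end. The rows are hypothetical placeholders, and this is not the package's implementation.

```r
# Hypothetical rows standing in for parsed citations.
rows <- vector("list", 3)
for (i in seq_along(rows)) {
  rows[[i]] <- c(PNo = as.character(i),
                 autor = paste("AUTHOR", i),
                 jahr = "2000")
}

# Bind all rows in a single step instead of once per citation.
result <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
```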
A data frame with the following columns: PNo, autor, jahr, journal, version, seite, CR, DOI.
PNo: paper number
autor: author name
jahr: year of publication
journal: journal name
version: version name of the journal
seite: page on which the citation was published
CR: citation number, initialized with 0
DOI: Digital Object Identifier
If a value is not found, it is replaced with an empty string, except for the year, which is replaced with the string "None".
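To illustrate how one cited reference maps onto these columns, here is a simplified sketch. The helper name parse_citation and the sample strings are hypothetical; the field positions, the "V" and "P" prefix checks, and the fallback values follow the rules described above. Unlike the package function, this sketch trims whitespace before extracting the DOI.

```r
# Hypothetical helper: split one cited reference into the documented fields.
parse_citation <- function(citation) {
  parts <- trimws(strsplit(citation, ",")[[1]])

  # Return the i-th part, or "" if it is missing or lacks the expected prefix.
  field <- function(i, prefix = NULL) {
    if (length(parts) < i) return("")
    value <- parts[i]
    if (!is.null(prefix) && !startsWith(value, prefix)) return("")
    value
  }

  # A year that is not numeric becomes the string "None".
  year <- field(2)
  if (is.na(suppressWarnings(as.numeric(year)))) year <- "None"

  # Keep the first part that carries a DOI, without its "DOI " label.
  doi <- grep("DOI ", parts, value = TRUE, fixed = TRUE)
  doi <- if (length(doi) == 0) "" else sub("DOI ", "", doi[1], fixed = TRUE)

  c(autor = field(1), jahr = year, journal = field(3),
    version = field(4, "V"), seite = field(5, "P"), CR = "0", DOI = doi)
}

full <- parse_citation("DOE J, 2001, J TEST, V1, P10, DOI 10.1000/abc")
sparse <- parse_citation("ROE R, in press, J OTHER")
```

Here full carries every field, while sparse falls back to "None" for the year and to empty strings for version, seite, and DOI.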
MFinst
##---- Should be DIRECTLY executable !! ----
##-- ==> Define data, use random,
##-- or do help(data=index) for the standard data sets.
## The function is currently defined as
function (paperTable, ignoreCRs = FALSE)
{
    start_time = Sys.time()
    PNos = paperTable$Papern.No
    CRs = paperTable$CR
    papersLength = length(PNos)
    CRPlaceholder = "0"
    allCRs = as.data.frame(x = c(), autor = c(), jahr = c(),
        journal = c(), version = c(), seite = c(), CR = c(),
        DOI = c(), stringsAsFactors = FALSE)
    allCRs = rbind(allCRs, c("x", "autor", "jahr", "journal",
        "version", "seite", "CR", "DOI"))
    allCRs = type.convert(x = allCRs, as.is = TRUE)
    allIgnoredCits = vector(mode = "character", length = 0)
    names(allCRs)[1] = "x"
    names(allCRs)[2] = "autor"
    names(allCRs)[3] = "jahr"
    names(allCRs)[4] = "journal"
    names(allCRs)[5] = "version"
    names(allCRs)[6] = "seite"
    names(allCRs)[7] = "CR"
    names(allCRs)[8] = "DOI"
    # seq_len() avoids iterating over c(1, 0) when the table is empty
    for (i in seq_len(papersLength)) {
        print(paste(i, "of", papersLength))
        # Cited references of one paper are separated by ";"
        cocits = strsplit(x = as.character(CRs[i]), split = ";")
        rows = vector(mode = "character", length = 0)
        if (length(cocits[[1]]) > 0) {
            for (y in 1:length(cocits[[1]])) {
                # Fields of one cited reference are separated by ","
                citationInfoList = strsplit(cocits[[1]][y], ",")
                DOI = grep(x = citationInfoList[[1]], pattern = "DOI ",
                    value = TRUE)
                if (length(DOI) == 0) {
                    DOI = ""
                }
                else {
                    if (length(DOI) > 1) {
                        DOI = DOI[1]
                        DOI = gsub("\\[", "", x = DOI)
                    }
                    DOI = gsub("DOI ", "", x = DOI)
                }
                author = trimws(citationInfoList[[1]][1])
                year = trimws(citationInfoList[[1]][2])
                # A non-numeric year becomes "None"; the coercion warning is expected
                if (is.na(suppressWarnings(as.numeric(year)))) {
                    year = "None"
                }
                if (length(citationInfoList[[1]]) >= 3) {
                    journal = trimws(citationInfoList[[1]][3])
                }
                else {
                    journal = ""
                }
                if (length(citationInfoList[[1]]) >= 4) {
                    version = trimws(citationInfoList[[1]][4])
                    if (!startsWith(version, "V")) {
                        version = ""
                    }
                }
                else {
                    version = ""
                }
                if (length(citationInfoList[[1]]) >= 5) {
                    seite = trimws(citationInfoList[[1]][5])
                    if (!startsWith(seite, "P")) {
                        seite = ""
                    }
                }
                else {
                    seite = ""
                }
                # Append one row per cited reference
                allCRs = rbind(allCRs, c(as.character(PNos[i]),
                    as.character(author), as.character(year), as.character(journal),
                    as.character(version), as.character(seite),
                    as.character(CRPlaceholder), as.character(DOI)))
            }
        }
    }
    end_time = Sys.time()
    final_time = end_time - start_time
    print(final_time)
    return(allCRs)
}