scholar: Analyse citation data from Google Scholar

knitr::opts_chunk$set(tidy = FALSE,
           message = FALSE)
library("scholar")
library("ggplot2")
theme_set(theme_minimal())

Retrieving basic information

## Define the id for Richard Feynman
id <- 'B7vSqZsAAAAJ'

## Get his profile
l <- get_profile(id)

## Print his name and affliation
l$name
l$affiliation

## Print his citation index
l$h_index
l$i10_index

Retrieving publications

get_publications() return a data.frame of publication records. It contains information of the publications, including title, author list, page number, citation number, publication year, etc..

The pubid is the article ID used by Google Scholar and the identifier that is used to retrieve the citation history of a selected publication.

## Get his publications (a large data frame)
p <- get_publications(id)
head(p, 3)

Retrieving citation data

## Get his citation history, i.e. citations to his work in a given year
ct <- get_citation_history(id)

## Plot citation trend
library(ggplot2)
ggplot(ct, aes(year, cites)) + geom_line() + geom_point()

Users can retrieve the citation history of a particular publication with get_article_cite_history().

## The following publication will be used to demonstrate article citation history
as.character(p$title[1])

## Get article citation history
ach <- get_article_cite_history(id, p$pubid[1])

## Plot citation trend
ggplot(ach, aes(year, cites)) +
    geom_segment(aes(xend = year, yend = 0), size=1, color='darkgrey') +
    geom_point(size=3, color='firebrick')

Comparing scholars

You can compare the citation history of scholars by fetching data with compare_scholars.

# Compare Feynman and Stephen Hawking
ids <- c('B7vSqZsAAAAJ', 'qj74uXkAAAAJ')

# Get a data frame comparing the number of citations to their work in
# a given year
cs <- compare_scholars(ids)

## remove some 'bad' records without sufficient information
cs <- subset(cs, !is.na(year) & year > 1900)
ggplot(cs, aes(year, cites, group=name, color=name)) + geom_line() + theme(legend.position="bottom")

## Compare their career trajectories, based on year of first citation
csc <- compare_scholar_careers(ids)
ggplot(csc, aes(career_year, cites, group=name, color=name)) + geom_line() + geom_point() +
    theme(legend.position=c(.2, .8))

Visualizing and comparing network of coauthors

# Be careful with specifying too many coauthors as the visualization of the
# network can get very messy.
coauthor_network <- get_coauthors('amYIKXQAAAAJ&hl', n_coauthors = 7)

coauthor_network

And then we have a built-in function to plot this visualization.

plot_coauthors(coauthor_network)

Note however, that these are the coauthors listed in Google Scholar profile and not coauthors from all publications.



Try the scholar package in your browser

Any scripts or data that you put into this service are public.

scholar documentation built on Aug. 9, 2022, 9:06 a.m.