Lattes is an unique and largest platform for academic curriculumns. There you can find information about the academic work of all Brazilian scholars. It includes institution of PhD, current employer, field of work, all publications metadata and more. It is an unique and reliable source of information for bibliometric studies.
I've been working with Lattes data for some time. Here I present a short list of papers that have used this data.
Is predatory publishing a real threat? Evidence from a large database study. Scientometrics
The Brazilian scientific output published in journals: A study based on a large CV database. Journal of Informetrics
The researchers, the publications and the journals of Finance in Brazil: An analysis based on resumes from the Lattes platform. Brazilian Review of Finance
Análise do Perfil dos Acadêmicos e de suas Publicações Científicas em Administração (in Portuguese. RAC
GetLattesData is a wrap up of functions I've been using for accessing the data. In the past, one could download the data directly, without any manual work. Currently,
r Sys.Date(), a manual captcha break is necessary. Therefore, using this package requires the manual download of the zip files with the xml data.
Let's consider a simple example of accessing information about my academic CV and a coleague. Both zip files are available locally within the package as an example. If you want to run this example for other scholars, you will have to download their xml zip files from Lattes. After opening the Lattes website (see an example here), click in the XML buttom in the top righ corner. Once the captcha is once again solved, you will download a zip file with the xml content.
Since I work in the business department of UFRGS, the impact of my publications is localy set by the Qualis ranking of Management, Accounting and Tourism (
'ADMINISTRAÇÃO PÚBLICA E DE EMPRESAS, CIÊNCIAS CONTÁBEIS E TURISMO'). Qualis is the local journal ranking in Brazil. You can read more about Qualis in Wikipedia and here.
Now, based on the zip file locations and field of Qualis, we use
GetLattesData to access information available in Lattes:
library(GetLattesData) # get files from pkg (you can download from other researchers in lattes website) f.in <- c(system.file('extdata/3262699324398819.zip', package = 'GetLattesData'), system.file('extdata/8373564643000623.zip', package = 'GetLattesData')) # set qualis field.qualis = 'ADMINISTRAÇÃO PÚBLICA E DE EMPRESAS, CIÊNCIAS CONTÁBEIS E TURISMO' # get data l.out <- gld_get_lattes_data_from_zip(f.in, field.qualis = field.qualis )
my.l is a list with the following dataframes:
The first is a dataframe with information about researchers:
tpesq <- l.out$tpesq str(tpesq)
The second dataframe contains information about all published publications, including Qualis and SJR:
Other dataframes in
l.out included information about accepted papers, supervisions, books and conferences.
GetLattesData makes it easy to create academic reports for a large number of researchers. See next, where we plot the number of publications for each researcher, conditioning on Qualis ranking.
tpublic.published <- l.out$tpublic.published library(ggplot2) p <- ggplot(tpublic.published, aes(x = qualis)) + geom_bar(position = 'identity') + facet_wrap(~name) + labs(x = paste0('Qualis: ', field.qualis)) print(p)
We can also use
dplyr to do some simple assessment of academic productivity:
library(dplyr) my.tab <- tpublic.published %>% group_by(name) %>% summarise(n.papers = n(), max.SJR = max(SJR, na.rm = T), mean.SJR = mean(SJR, na.rm = T), n.A1.qualis = sum(qualis == 'A1', na.rm = T), n.A2.qualis = sum(qualis == 'A2', na.rm = T), median.authorship = median(as.numeric(order.aut), na.rm = T )) knitr::kable(my.tab)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.