Metadata such as titles, authors, journal, and publication IDs for each
paper in the CORD-19 dataset. This comes from the
all_sources_metadata_DATE.csv file in the decompressed dataset.
Note that the papers have been deduplicated based on paper_id, doi, or
title, and papers without a paper_id or title have been removed.
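The deduplication described above could be sketched roughly as follows. This is a minimal illustration on made-up data, not the package's actual cleaning code: the raw_metadata tibble and its values are hypothetical, and it assumes that a row is dropped when either paper_id or title is missing and that rows with a missing doi are kept.

```r
library(dplyr)

# Hypothetical stand-in for the raw metadata; all values are invented.
raw_metadata <- tibble(
  paper_id = c("a1", "a1", "b2", NA, "c3", "d4"),
  doi      = c("10.1/x", "10.1/x", "10.1/y", "10.1/z", "10.1/y", NA),
  title    = c("Paper A", "Paper A", "Paper B", "Paper C", "Paper B2", NA)
)

deduplicated <- raw_metadata %>%
  # Drop papers missing a paper_id or a title (assumed interpretation)
  filter(!is.na(paper_id), !is.na(title)) %>%
  # Keep the first row for each paper_id
  distinct(paper_id, .keep_all = TRUE) %>%
  # Drop repeats of a known doi, but keep rows whose doi is missing
  filter(is.na(doi) | !duplicated(doi)) %>%
  # Keep the first row for each title
  distinct(title, .keep_all = TRUE)

deduplicated
```

On this toy input, only the rows for paper_id "a1" and "b2" remain. The order of the three deduplication steps is an arbitrary choice here and can change which duplicate survives.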
cord19_papers
A tibble with one observation for each paper, and the following columns:
Unique identifier that can link to full text and citations. SHA of the paper PDF.
Source (e.g. pubmed, CZI...)
Title
Digital Object Identifier
PubMed Central ID (pmcid)
PubMed ID
License
Abstract
Publication year
Authors
Journal
Microsoft Academic Paper ID
WHO Covidence number
Whether the paper has full text
https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge,
specifically the all_sources_metadata_DATE.csv file.
library(dplyr)

# What are the most common journals?
cord19_papers %>%
  count(journal, sort = TRUE)

# What are the most common words in titles (or abstracts)?
library(tidytext)

cord19_papers %>%
  unnest_tokens(word, title) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words, by = "word")