ipynbcount: Count text elements in Jupyter Notebook files
In rmdwc: Count Words and Characters in R Markdown and Jupyter Notebooks

View source: R/ipynbcount.R

Description

This function extracts the text of selected cell types (markdown by default) from one or more .ipynb files and counts its characters, words, and lines. Text matching the exclude pattern (by default, fenced chunks that open with ```{) is removed before counting. The counting itself is delegated to rmdcount(), which is run on the extracted text.

Usage

ipynbcount(
  files,
  celltype = c("markdown"),
  space = "[[:space:]]",
  word = "[[:space:]]+",
  line = "\n",
  exclude = "```\\{.*?```"
)

Arguments

`files`	character: vector of paths to `.ipynb` (Jupyter Notebook) files.
`celltype`	character: vector indicating which cell types to include (default is `'markdown'`). Valid values include `'markdown'` and `'code'`.
`space`	character: pattern to split a text at spaces (default: `'[[:space:]]'`).
`word`	character: pattern to split a text at word boundaries (default: `'[[:space:]]+'`).
`line`	character: pattern to split lines (default: `'\n'`).
`exclude`	character: pattern to exclude text parts, e.g. code chunks (default: '```\\{.*?```').
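
To make the role of the four patterns concrete, the following base R sketch shows one way they could be applied to a piece of text. It is an illustration only: `count_sketch()` is a hypothetical helper, not part of rmdwc, and the real rmdcount() may count differently (for example in how it treats empty lines or whitespace).

```r
# Hypothetical helper, for illustration only (not the rmdwc implementation).
count_sketch <- function(txt,
                         space   = "[[:space:]]",
                         word    = "[[:space:]]+",
                         line    = "\n",
                         exclude = "```\\{.*?```") {
  # Remove everything that matches the exclude pattern (e.g. code chunks).
  clean <- gsub(exclude, "", txt)
  # Characters: count what is left after deleting whitespace.
  chars <- nchar(gsub(space, "", clean))
  # Words: split at runs of whitespace and drop empty pieces.
  words <- strsplit(clean, word)[[1]]
  words <- words[nzchar(words)]
  # Lines: split at the line separator.
  lines <- strsplit(clean, line)[[1]]
  data.frame(chars = chars, words = length(words), lines = length(lines))
}

txt <- "Some intro text.\n```{r}\nx <- 1\n```\nTwo words"
res <- count_sketch(txt)
print(res)
```

Because the default exclude pattern requires an opening brace after the fence, it targets R Markdown style chunks such as ```{r}; a plain ``` fence without braces would not be matched by it.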

Details

This function assumes that the notebook files are valid JSON and contain a list of cells under the cells field. It temporarily writes the extracted content to a file to reuse the rmdcount() logic.
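
The steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the package's source: it assumes the jsonlite package is installed for JSON parsing, `extract_cells()` is a hypothetical helper name, and the small notebook is written by hand for the demonstration.

```r
library(jsonlite)  # assumption: jsonlite is available for reading JSON

# Hypothetical helper: return the source text of the selected cell types.
extract_cells <- function(path, celltype = "markdown") {
  nb <- jsonlite::fromJSON(path, simplifyVector = FALSE)
  keep <- Filter(function(cell) cell$cell_type %in% celltype, nb$cells)
  # A cell's source is a list of strings; join them into one string per cell.
  vapply(keep,
         function(cell) paste(unlist(cell$source), collapse = ""),
         character(1))
}

# A tiny notebook with one markdown cell and one code cell.
nb_json <- '{
  "cells": [
    {"cell_type": "markdown", "metadata": {},
     "source": ["# Title\\n", "Some text."]},
    {"cell_type": "code", "metadata": {}, "source": ["x <- 1"],
     "outputs": [], "execution_count": null}
  ],
  "metadata": {}, "nbformat": 4, "nbformat_minor": 5
}'
nb_path <- tempfile(fileext = ".ipynb")
writeLines(nb_json, nb_path)

md   <- extract_cells(nb_path)                        # markdown cells only
both <- extract_cells(nb_path, c("markdown", "code")) # markdown and code

# Write the extracted text to a temporary file so that a file-based
# counter can be reused on it.
tmp <- tempfile(fileext = ".Rmd")
writeLines(md, tmp)
# rmdwc::rmdcount(tmp)  # would then count the extracted text
```

Writing to a temporary file keeps the counting logic in one place: any change to how rmdcount() counts then applies to notebooks as well.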

Value

A data frame with counts of characters, words, and lines for each file. Additional columns include file (base name) and path (directory).

Examples

file <- system.file('ipynb/example_data_analysis.ipynb', package="rmdwc")
ipynbcount(file)                                   # without code
ipynbcount(file, celltype=c("markdown", "code"))   # with code

rmdwc documentation built on June 8, 2025, 10:44 a.m.