knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

To keep track of the URLs being processed, castarter makes it easy to store them, along with some basic metadata about them, in an orderly fashion.

Database location and database file naming conventions

By default, databases are stored in the same folder as website data, e.g. under base_folder/project/website/.

The location of the database can be retrieved with:

cas_get_db_folder()
cas_get_db_file()

The filename of the SQLite database includes a reference to both the project and the website name. This makes it possible to store the database files of different projects in a single folder, as the file naming convention should prevent overlaps.
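To illustrate why including both names in the filename avoids overlaps, here is a minimal sketch in base R. The helper `sketch_db_file_name()` and the `cas_..._db.sqlite` pattern are hypothetical and are not taken from castarter; the actual naming pattern used by the package may differ.

```r
# Hypothetical helper, NOT part of castarter: the naming pattern is an assumption.
sketch_db_file_name <- function(project, website) {
  # Replace any run of characters other than letters, digits, or underscores
  clean <- function(x) gsub("[^A-Za-z0-9_]+", "_", x)
  paste0("cas_", clean(project), "_", clean(website), "_db.sqlite")
}

# Default layout: the database sits next to the website data
file.path(
  "base_folder", "project_a", "website_x",
  sketch_db_file_name("project_a", "website_x")
)

# Shared folder: same website name, different projects, distinct file names
db_files <- c(
  sketch_db_file_name("project_a", "website_x"),
  sketch_db_file_name("project_b", "website_x")
)
anyDuplicated(db_files) == 0
```

Note that replacing characters in this way can, in rare cases, map two different names to the same file name, which is one reason to treat the convention as something that should, rather than must, prevent overlaps.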

What if details about multiple projects and websites are to be stored in a single database, e.g. because it relies on a MySQL database hosted on a server rather than on local SQLite databases? In that case, the database will need an additional table listing each project and website pair together with a unique id. That id is then used in the table names for each project and website. This approach avoids potential issues with project or website names that include characters not allowed in database table names. (#TODO: not yet implemented).
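Since this is not yet implemented, the following is only a sketch of how such a lookup table could work, using a plain data frame in place of a database table. The functions `sketch_register_website()` and `sketch_table_name()`, the `cas_<id>_<table>` pattern, and the `urls` table name are all hypothetical.

```r
# Hypothetical registry: one row per project/website pair (not castarter code)
registry <- data.frame(
  id = integer(),
  project = character(),
  website = character(),
  stringsAsFactors = FALSE
)

# Return the id of a project/website pair, adding a new row if it is unknown
sketch_register_website <- function(registry, project, website) {
  found <- registry$project == project & registry$website == website
  if (any(found)) {
    return(list(registry = registry, id = registry$id[found][1]))
  }
  new_id <- nrow(registry) + 1L
  new_row <- data.frame(
    id = new_id,
    project = project,
    website = website,
    stringsAsFactors = FALSE
  )
  list(registry = rbind(registry, new_row), id = new_id)
}

# Table names rely on the numeric id only, never on the raw names
sketch_table_name <- function(id, table) {
  sprintf("cas_%d_%s", id, table)
}

step <- sketch_register_website(registry, "my project!", "example.com/news")
registry <- step$registry
sketch_table_name(step$id, "urls")
```

Because only the numeric id ends up in the table name, characters such as spaces, dots, or slashes in project and website names never reach the database as identifiers.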

Main database tables and column names

These are the key tables to be found in a castarter database:

Additional tables

Additional tables can of course be kept in the same database to store additional data, metadata, or information about relevant contents.

For example, castarter facilitates checking and storing information about the availability of pages on the Internet Archive's Wayback Machine, relying on its API through the cas_check_ai() function. cas_save_ai() asks the Wayback Machine to store a copy of the given page.
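For readers unfamiliar with the Wayback Machine, the sketch below builds the request URLs for its public availability and save endpoints in base R, without sending any request. Whether castarter relies on exactly these endpoints internally is an assumption, and the helper names `sketch_wayback_available_url()` and `sketch_wayback_save_url()` are hypothetical.

```r
# Hypothetical helpers, NOT castarter functions. They only build URLs
# and do not contact the Internet Archive.

# Availability endpoint: answers with JSON describing the closest snapshot
sketch_wayback_available_url <- function(url) {
  paste0(
    "https://archive.org/wayback/available?url=",
    utils::URLencode(url, reserved = TRUE)
  )
}

# Save endpoint: requesting this address asks for a new snapshot of the page
sketch_wayback_save_url <- function(url) {
  paste0("https://web.archive.org/save/", url)
}

sketch_wayback_available_url("https://example.com/news?id=1")
sketch_wayback_save_url("https://example.com/news?id=1")
```

Storing the response to such requests in an additional table, keyed by URL, is one way to keep track of which pages already have an archived copy.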



giocomai/castarter documentation built on May 4, 2024, 1:14 a.m.