Man pages for giocomai/castarter
Content Analysis Starter Toolkit

cas_archive: Archive originals of downloaded files in compressed folders
cas_backup_gd: Backup files to Google Drive
cas_browse: Open in a browser a URL stored in the local database
cas_build_urls: URL builder
cas_check_corpus: Checks if given corpus exists and, optionally, updates it
cas_check_db_folder: Checks if database folder exists, if not returns an...
cas_check_read_db_contents_data: Returns a corpus from the 'contents_data' table in the...
cas_check_use_db: Check caching status in the current session, and override it...
cas_check_website_folder: Checks if current website folder exists
cas_connect_to_db: Return a connection to be used for caching
cas_convert_db_type: Convert database type, e.g. from DuckDB to SQLite
cas_count: Count strings in a corpus
cas_count_relative: Count strings in a corpus relative to the number of words
cas_count_total_words: Count total words in a dataset
cas_create_db_folder: Creates the base folder where 'castarter' stores the project...
casdb_empty_index_id: Empty data frame with the same format as data stored in the...
cas_delete_corpus: Delete previously stored corpora written with...
cas_delete_from_db: Delete rows from selected database table
cas_disable_db: Disable caching for the current session
cas_disconnect_from_db: Ensure that connection to database is disconnected...
cas_download: Downloads files systematically, and stores details about the...
cas_download_chromote: Downloads one file at a time with chromote
cas_download_httr: Downloads one file at a time with httr
cas_download_index: Downloads index files systematically, and stores details...
cas_download_internal: Downloads one file at a time with readLines
cas_download_legacy: Downloads html pages based on a vector of links
cas_enable_db: Enable caching for the current session
cas_explorer: Run the Shiny Application
cas_explorer_legacy: Run the Shiny Application
cas_export_tables: Export database tables to another format such as csv
cas_extract: Extract fields and contents from downloaded files
cas_extract_html: Facilitates extraction of contents from an html file
cas_extract_links: Extract direct links to individual content pages from index...
cas_extract_script: Extracts scripts from an html page
cas_find_extractor: Facilitate finding extractors, typically to be used with...
cas_generate_metadata: Generate basic metadata about the corpus, including start and...
cas_get_base_folder: Get base folder under which files will be stored.
cas_get_base_path: Build full path to base working folder
cas_get_corpus_path: Get path to folder where the corpus is stored.
cas_get_db: Get connection to database with details about current website
cas_get_db_file: Gets location of database file
cas_get_db_settings: Get database connection settings from the environment
cas_get_files_to_download: Create a data frame with not yet downloaded files
cas_get_options: Get key project parameters that determine the folder used for...
cas_get_path_to_files: Get path to locally downloaded files
cas_get_urls_df: Checks that a given input corresponds to the format expected...
cas_get_website_folder: Get folder where files and data related to the current website...
cas_ia_check: Gets an Archive.org Wayback Machine URL
cas_ia_save: Save a URL to the Internet Archive's Wayback Machine
cas_kwic: Adds a column with n words before and after the selected...
cas_kwic_single_pattern: Adds a column with n words before and after the selected...
cas_read_corpus: Read datasets created with 'cas_write_dataset'
cas_read_db_contents_data: Read contents data from local database
cas_read_db_contents_id: Read contents from local database
cas_read_db_download: Read index from local database
cas_read_db_ia: Read status on the Internet Archive of given URLs
cas_read_db_ignore_id: Read identifiers to be ignored from the local database
cas_read_db_index: Read index from local database
cas_read_db_urls: Read urls stored in the local database
cas_read_from_db: Reads data from local database
cas_reset_db: Delete a specific table from database
cas_reset_db_contents_data: Removes from the local database the folder where extracted...
cas_reset_db_contents_id: Removes from the local database the folder where links to...
cas_reset_db_ignore_id: Removes from the local database all identifiers included in...
cas_reset_db_index_id: Removes from the local database the table where links to...
cas_reset_download_contents: Delete all files and database records for the contents pages...
cas_reset_download_index: Delete all files and database records for the index pages of...
cas_restore: Restore files from compressed files
cass_build_urls: Helps you define the parameters you need for building index...
cass_combine_into_pattern: Combines a vector of words into a string to be used for regex...
cass_download_csv_app: A minimal shiny app that demonstrates the functioning of...
cas_set_db: Set database connection settings for the session
cas_set_db_folder: Set folder for storing the database
cas_set_options: Set key project parameters that determine the folder used for...
cass_highlight: Takes a character vector and returns it with matches of...
cas_show_barchart_ggiraph: Creates interactive barchart with ggiraph
cas_show_barchart_ggplot2: Creates barchart with ggplot2
cas_show_gg_base: Creates base ggplot2 object to be used by ggplot or ggiraph
cas_show_ts_dygraph: Create dygraphs based on a data frame typically generated...
cass_show_ts_dygraph_app: A minimal shiny app that demonstrates the functioning of...
cass_split_string: Split string into multiple inputs
cas_summarise: Summarise for a given time period word counts, typically...
cas_update: Update corpus
cas_write_corpus: Export the textual dataset for the current website
cas_write_db_contents_data: Write extracted contents to local database
cas_write_db_contents_id: Write contents URLs to local database
cas_write_db_ignore_id: Ignore a set of ids from the download or processing step
cas_write_db_index: Write index URLs to local database
cas_write_db_urls: Write index or contents urls directly to the local database
cas_write_to_db: Generic function for writing to database
pipe: Pipe operator
tidyeval: Tidy eval helpers
giocomai/castarter documentation built on April 18, 2024, 6:48 p.m.