R/tika_xml.R

#' Get a Structured XHTML Rendition
#' 
#' If \code{output_dir} is specified, the output files are written there with the \code{.xml} file extension.
#'
#' @param input Character vector of paths and/or URLs to the input documents.
#' @param ... Other parameters passed on to \code{tika()}.
#' @return A character vector in the same order and with the same length as \code{input}, of unparsed \code{XHTML}. Unprocessed files are \code{as.character(NA)}.
#' @examples
#' \donttest{
#' batch <- c(
#'  system.file("extdata", "jsonlite.pdf", package = "rtika"),
#'  system.file("extdata", "curl.pdf", package = "rtika"),
#'  system.file("extdata", "table.docx", package = "rtika"),
#'  system.file("extdata", "xml2.pdf", package = "rtika"),
#'  system.file("extdata", "R-FAQ.html", package = "rtika"),
#'  system.file("extdata", "calculator.jpg", package = "rtika"),
#'  system.file("extdata", "tika.apache.org.zip", package = "rtika")
#' )
#' xml <- tika_xml(batch)
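#' # A minimal sketch, not part of the original example: parse the first
#' # rendition with xml2::read_xml(). Assumes the xml2 package is installed.
#' # Unprocessed files come back as NA, so check before parsing.
#' if (!is.na(xml[1])) {
#'   doc <- xml2::read_xml(xml[1])
#' }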
#' }
#' @export
tika_xml <- function(input, ...) {
  tika(input = input, output = "xml", ...)
}

rtika documentation built on May 31, 2023, 8 p.m.