knitr::opts_chunk$set(collapse=TRUE, comment="#>") library(htmltools) library(DiagrammeR) library(DiagrammeRsvg)
Matthias Gondan\ Universität Innsbruck\ Department of Psychology\ Innrain 9\ A-6020 Innsbruck\ Matthias.Gondan-Rochon@uibk.ac.at
Prolog is a classical logic programming language with many applications in
expert systems, computer linguistics and traditional, that is, symbolic
artificial intelligence. The main strength of Prolog is
its concise representation of facts and rules for the representation of
knowledge and grammar, as well as its efficient built-in search engine for
closed world domains. R is a statistical programming language for
data analysis and statistical modeling which is widely used in academia and
industry. Besides the core library, a lot of packages have been developed for
all kinds of statistical problems, including statistics-based artificial
intelligence tools such as neural networks for machine learning and deep
learning. Whereas Prolog is weak in statistical computation, but strong in
symbolic manipulation, the converse may be said for the R language. SWI-Prolog
is a widely used Prolog system that offers a wide range of extensions for real
world applications, and there already exist two Prolog "packs" to invoke
R (rserve-client
, real
) from SWI-Prolog. Given the large user community of
R, there may also be a need for a connection in the reverse direction that
allows invoking Prolog queries in R computations. The R\ package rolog
connects to the SWI-Prolog system, thus enabling deterministic and
non-deterministic queries to the Prolog interpreter. Usage of rolog
is
illustrated by a few examples.
Statistics; Logic Programming; Artificial Intelligence; R; Prolog
The R [@R] programming language and environment is a widely used open source software for statistical data analysis. The basic R is a functional language with lots of support for storage and manipulation of different data types, and a strong emphasis on operations involving vectors and arrays. Moreover, a huge number of packages (e.g., CRAN, https://cran.r-project.org/) have been contributed that cover problems from areas as diverse as bioinformatics, machine learning, specialized statistical methods, web programming and connections to other programming languages.
An interface to Prolog is lacking so far. Based on earlier work by Kowalski, the logic programming language Prolog was invented in the 1970ies by Colmerauer and Roussel [@Kowalski1988], mostly for the purpose of natural language processing. Since then, logic programming has become an important driving force in research on artificial intelligence, natural language processing, program analysis, knowledge representation and theorem proving [@Shoham1994;@Lally2011;@Carro2004;@Hsiang1987]. SWI-Prolog [@Wielemaker2012] is an open-source implementation of Prolog that mainly targets developers of applications, with many users in academia, research and industry. SWI-Prolog includes a large number of libraries for "the real world", for example, a web server, encryption, interfaces to C/C++ and other programming languages, as well as a development environment and debugger. In addition, pluggable extensions (so-called packs) are available for specific tasks to enhance its capabilities.
Unlike R, Prolog is a declarative programming language consisting of facts and
rules that define relations, for example, in a problem space [@Newell1972].
Prolog's major strength is its built-in query-driven search engine that
efficiently deals with complex structured data, with the data not necessarily
being numerical. In fact, Prolog only provides a basic collection of arithmetic
calculations via a purely functional interface (is/2
). More complex
calculations such as matrix algebra, statistical models or machine learning need
help from other systems, for example, from R.
Angelopoulos et al. [-@Angelopoulos2013] summarize work at the intersection of symbolic knowledge representation and statistical inference, especially in the area of model fits [EM algorithms, MCMC, @Sato2001;@Angelopoulos2008] and stochastic logic programs [@Cussens2000;@Kimmig2011]. One of the major strengths of logic programming is handling constraints; and a number of systems for constraint satisfaction tools have been developed (constraint logic programming on booleans, finite domains, reals, and intervals) for that purpose [e.g., @Fruehwirth1998;@Triska2018]. Some constraint handlers exist in R (see the CRAN task view for optimization problems), but more of them would be available via a bridge between R and Prolog.
Earlier approaches to connect Prolog and R have been published as SWI-Prolog
packs [real, rserve_client, @Angelopoulos2013;@Rserve] and as a YAP
module [YapR, @YapR]. Whereas real
establishes a direct link to an embedded
instance of R, rserve-client
communicates with a local or remote R
service [@Urbanek2021]. The former approach emphasizes speed, the latter might
be preferred from a security perspective, especially in systems such as
SWISH [@SWISH] that accept only a set of sandboxed commands for Prolog, but do
not impose restrictions on R. A common feature of the two packages is that they
provide an interface for R calls from Prolog, but not the other way round, that
is, querying Prolog from R is not possible, so far.
The present package fills this gap through Prolog queries in R scripts, for
example, to perform efficient symbolic computations, searches in complex graphs,
parsing natural language and definite clause grammars. In addition, two Prolog
predicates are provided that enable Prolog to ring back to the R system for
bidirectional communication. Similar to real
, tight communication between the
two systems is established by linking to a shared library that embeds the
current SWI-Prolog runtime. The exchange of data is facilitated by the C++
interfaces of the two languages [@Edelbuettel2018;@Wielemaker2021]. A less tight
connection might be established using the recently developed machine query
interface [@Zinda2021] that allows socket-based communication between foreign
languages and SWI-Prolog (and, in fact, the MQI
documentation includes an
example in which R is called).
A bidirectional bridge between R and Prolog might overcome the limitations of
both languages, thereby combining the extensive numerical and statistical power
of the R system with Prolog's skills in the representation of knowledge and
reasoning. In addition to the useful little tools shown in the examples
below, rolog
can therefore contribute to progress at the intersection of
traditional artificial intelligence and contemporary statistical programming.
The next section presents the interface of rolog
in detail. Section\ 3
presents possible extensions of the package at both ends, in R and Prolog.
Section\ 4 is a list of illustrative examples that offer useful extensions to
the R system. Conclusions and further perspectives are summarized in Section\ 5.
rolog
has a rather minimalistic syntax, providing only the basic ingredients
to establish communication with the SWI-Prolog runtime. Ways to extend the
interface are described in Section\ 3.
After installation with install.packages("rolog")
, the package is loaded in
the standard way.
library(rolog)
We can see a short message telling the user which SWI-Prolog was found. The
package searches for SWI-Prolog based on the environment
variable SWI_HOME_DIR
, the registry (Windows only), an executable swipl
in
the PATH
, and if everything fails, R package rswipl
[@rswipl]. The message
can be silenced by the usual option quietly=TRUE
of the library
command.
Most of the work is done using the three R\ functions query
, submit
,
and clear
. The R program in Listing\ 1 illustrates a query to
Prolog's member/2
using rolog
's syntax rules.
# member(1, [1, 2.0, a, "b", X, true]) query(call("member", 1L, list(1L, 2.0, quote(a), "b", expression(X), TRUE))) # returns an empty list, stating that member(1, [1 | _]) is satisfied submit() # returns a list with constraints, stating that the query is also satisfied # if the fifth element of the list, X, is 1 submit() # close the query clear()
Listing\ 1.
: A query to Prolog's member/2
predicate.
query
. The function query(call, options)
is used to create a Prolog
query (without invoking it yet). The first argument is a regular R call that is
created using R's function call(name, ...)
. This call represents the Prolog
query that will be submitted in the later course. The creation of such
predicates and Prolog terms is described below and can become quite
contrived (see the examples in Section\ 4). The second argument, options
, may
be used for ad hoc modifications of the translation between R and Prolog, see
the section below. The function returns TRUE
on success. Note that query
does not check if a Prolog predicate corresponding to call
actually
exists (see submit()
below). Only a single query can be opened at a given
time. If a new query Q is created while another query R is still open, a
warning is shown and R is closed.
submit
. Once a query has been created, it can be submitted using submit()
.
If the query fails, the return value is FALSE
. If the query succeeds, a list
of constraints is returned, with bindings for the variables that satisfy the
query. Repeated calls to submit are possible, returning the different solutions
of a query (until it eventually fails). The distinction between the different
types of return values for success and failure (list vs. FALSE
) is facilitated
by the R function isFALSE(x)
.
clear
. Closes the query. The name of the function was chosen to avoid name
clashes with R's own built-in function close
. The function returns an
invisible TRUE
, even if there is no open query.
Three more functions consult
, once
, and findall
are provided for
convenience.
consult
. In most applications, a number of Prolog facts and rules will be
loaded into the system. To facilitate this recurrent task, the Prolog
directive consult/1
has been mirrored into R, consult(filename)
, with
filename
being a string or a vector of strings if multiple files are to be
consulted. The function returns TRUE
on success; in case of problems, it
returns FALSE
and an error message is shown.
once
and findall
. The function once(call, options)
is a convenience
function that acts as a shortcut for query(call, options)
, submit()
, and
clear()
. Similarly, findall(call, options)
abbreviates the
commands query(call, options)
, repetition of submit()
until failure,
and clear()
, returning a list collecting the return values of the individual
submissions.
Table\ 1 summarizes the rules for the translation from R objects to Prolog. Most rules work in both directions, but a few exceptions exist.
Table\ 1 : Creating Prolog terms from R
|R |Prolog |Note/Alternatives |
|:-----------------------|:-------------------------|:------------------------|
|expression(X)
|Variable X |not necessarily uppercase|
|as.symbol(abc)
|Atom abc |as.name
, quote
|
|TRUE
, FALSE
, NULL
|Atoms true, false, null | |
|"abc"
|String \"abc\" | |
|3L
|Integer 3 | |
|3
|Float 3.0 | |
|call("term", 1L, 2L)
|term(1, 2) | |
|list(1L, 2L, 3L)
|List [1, 2, 3] | |
|list(a=1, b=2, c=3)
|List [a-1, b-2, c-3] | |
|c(1, 2, 3, Inf)
|##(1.0, 2.0, 3.0, 1.0Inf) |vectors of length > 1 |
|c(1L, 2L, 3L)
or 1:3
|\'%%\'(1, 2, 3) | |
|c("a", "b", "c")
|\$\$(\"a\", \"b\", \"c\") | |
|c(TRUE, FALSE, NA)
|!!(true, false, na) | |
|sin
|function(x) :- sin(x) |primitive function |
|function(x) sin(x)
|function(x) :- sin(x) |self-written function |
|matrix(1:4, nrow=2)
|\'%%%\'(\'%%\'(1, 3), ...)|see also ###, \$\$\$, !!!|
In R, the basic elements such as integers, floating point numbers, character strings, and logicals are vectorized, and scalar entities are treated like vectors with one element. Conversely, Prolog does not natively support vectors or matrices. The problem is solved in the following way:
##/N
, %%/N
, $$/N
, and !!/N
for floating point numbers, integers,
strings and logicals, respectively.###/R
, %%%/R
, $$$/R
, and !!!/R
with the respective row vectors
as arguments.In the reverse direction, Prolog terms like ##/N
are translated back to R
vectors of length N, including the terms ##/0
and ##/1
that map to R
vectors of length 0 and 1, respectively. Translation of a polymorphic Prolog
term such as ##(a, 1.5)
to R will fail, since rolog
expects the arguments to be numeric.
If a Prolog object cannot be translated to R (e.g., a cyclic term), an error is
raised. If an R object that lacks a suitable representation in
Prolog (e.g., S4 class), a warning is printed and the result is unified
with na
.
To summarize, the rules for translation are not fully symmetrical. A quick check
for symmetry of the representation is obtained by a query to =/2
or
even r_eval/2
(see also below, subsection Prolog interface):
Q <- call("=", expression(X), c(1, 2, NA, NaN, Inf)) once(Q, options=list(portray=TRUE)) Q <- call("r_eval", c(1, 2, NA, NaN, Inf), expression(X)) once(Q)
The optional argument env
to query, once and findall allows to raise the
query (and, as a consequence, r_eval/1,2 in a specific environment.
A few package-specific options have been defined to allow some fine-tuning of the rules for translation between R and Prolog.
##
)###
)%%
/%%%
)!!
/!!!
)$$
/$$$
). The single dollar cannot be
used because it is the list operator in R.TRUE
(default), R vectors of length\ 1 are translated
to scalars in Prolog. If FALSE
(rarely used), R\ vectors are always
translated to ##/N
, or %%/N
, !!/N
, $$/N
, even if they have only one
element.TRUE
(default in query
), the result
of query
, once
and findall
includes an attribute with a text
representation of the query in Prolog.rolog
's own
preproc
function that maps R's x <= y
to Prolog's x =< y
and !=
to
\=
. Preprocessing can be turned off by assigning the R
function dontCheck
to the preproc option.rolog
's own
postproc
function that reverses the mapping from preproc
.The command rolog_options()
returns a list with all the options. The
options can be globally modified with options()
or in the optional
argument of query
, once
, and findall
.
options(rolog.intvec="iv") Q <- call("member", expression(X), list(c(1L, 2L), c(3.5, 4.5))) query(Q, options=list(realvec="rv")) submit() clear()
rolog
offers some basic support to call R from Prolog, that is, connecting the
two systems in the reverse direction. Two predicates can be used for this
purpose, r_eval(Call)
and r_eval(Function, Result)
. The former just invokes
R with the command Call
(ignoring the result); the latter evaluates Function
and unifies the result with Result
. Note that proper quoting of R functions is
needed at the Prolog end, especially with R functions that start with uppercase
letters and/or contain a dot in their name (see Section\ 4).
Package rolog
has limited support for exception handling. If Prolog raises an
exception, the error string is forwarded to R using the stop
function.
The examples below illustrate this by querying an undefined Prolog predicate.
Q <- call("membr", expression(X), list(1, 2, 3)) query(Q) try(submit()) clear()
See Section\ 4 for another example with an error resulting from a malformed
query to r_eval/2
.
R is a functional language, whereas Prolog is declarative. Obviously, there cannot be a perfect one-to-one correspondence between the syntactic components of two programming languages that follow completely different paradigms. Whereas symbols, functions, numbers and character strings are easily mapped between R and Prolog, there are loose ends at both sides. The package is intentionally kept minimalistic, but can easily be extended by convenience functions at both ends, Prolog and R, to facilitate recurrent tasks and/or avoid cumbersome syntax.
In particular, Prolog variables are translated from and to
R expressions (not to be confused with R symbols), and R vectors of
length greater than 1 are translated to the Prolog terms #/N
, %/N
, !/N
,
and $$/N
, as mentioned above. These rules are, in principle, arbitrary
and can be intercepted at several stages.
as.rolog
)The process is illustrated in Figure\ 1.
HTML(export_svg(grViz( 'digraph G { rankdir=LR Query Result subgraph cluster_0 { style=filled color=lightgrey node [style=filled,color=white] r2rolog -> forth -> rolog_pl } subgraph cluster_1 { style=filled color=lightgrey node [style=filled,color=white] rolog2r -> back [dir=back] back -> pl_rolog [dir=back] } Query -> r2rolog rolog_pl:e -> Prolog pl_rolog:e -> Prolog [dir=back] Result -> rolog2r [dir=back] Query [shape=Mdiamond;width=0.7;height=0.7] r2rolog [shape=rect,label="preproc"] forth [label="(rolog)"] rolog_pl [shape=rect,label="preproc/2"] Prolog [shape=Mcircle] pl_rolog [shape=rect,label="postproc/2"] rolog2r [shape=rect,label="postproc"] back [label="(rolog)"] Result [shape=Msquare] }')))
Figure\ 1 : Workflow in rolog
rolog
uses a default preprocessing function preproc(query)
to map the R
operators <=
and !=
to their Prolog counterparts =</2
and \=/2
,
respectively.
However, we have seen above that raising even simple everyday Prolog queries
such as member(X, [1, 2, 3, a, b])
require complicated R expressions
like call("member", expression(X), list(1, 2, 3, quote(a), quote(b)))
. The R
function as.rolog(query)
is meant to simplify this a bit by translating
symbols starting with a dot to Prolog variables, and calls
like ""[1, 2, 3, a, b]
to lists. In the example below, as.rolog
is added to
the queue of preprocessing functions.
a <- 5 Q <- quote(member(.X, ""[1, 2, 3, a, (a), 1 <= 2])) once(Q, options=list(preproc=list(as.rolog, preproc), portray=TRUE))
Note that the name of the variable will still be X
in the later course,
not "dot-X". As illustrated by the example above, as.rolog
treats the
argument a
as a symbol; to evaluate the respective variable (i.e., "unquote"),
it can be put in parentheses.
Preprocessing can be turned off by setting the option preproc
to the identity
function dontCheck
.
Section\ 3 includes an example for mathematical rendering of R expressions. In that example, a preprocessing function is used to bring function calls with named arguments to a canonical form which is then handled in Prolog. More sophisticated work with quasi-quotations and unquoting expressions is described in "Advanced R" [@Wickham2019].
In most cases, postprocessing will revert the manipulations during
preprocessing, and the default function postproc(query)
actually translates
the Prolog operators =<
and \=
back to their respective counterparts in R.
Many Prolog programmers are used to operate with atoms, whereas character
strings are the preferred representation of symbolic information in R. In the
example below, a second hook is put in the queue that converts the result of a
query like member(X, [a, b, c])
to strings.
stringify <- function(x) { if(is.symbol(x)) return(as.character(x)) if(is.call(x)) x[-1] <- lapply(x[-1], FUN=stringify) if(is.list(x)) x <- lapply(x, FUN=stringify) if(is.function(x)) body(x) <- stringify(body(x)) return(x) } Q <- quote(member(.X, ""[a, b, c])) R <- findall(Q, options=list(preproc=list(as.rolog, preproc), postproc=list(stringify, postproc))) unlist(R)
In other words, the query is satisfied if X
is either "a", or "b", or "c".
Recent versions of SWI-Prolog support so-called dictionaries of the
form Tag{Key1:Value1, Key2:Value2, ...}
. The tag is typically an
atom (but can be a variable, as well), the keys are unique atom or integers; the
values can be anything. Suppose we have a Prolog predicate that does something
with dicts, and we would like to query it from R. The simplest solution is a
wrapper in Prolog that translates key-value
pairs [Key1-Value1, Key2-Value2, ...]
back and forth to dicts:
do_something_with_pairs(Pairs0, Pairs1) :- dict_pairs(Dict0, my_dict, Pairs0), do_something_with_dicts(Dict0, Dict1), dict_pairs(Dict1, my_dict, Pairs1).
do_something_with_pairs/2
can then be queried from R using, for example, lists
with named elements (see Table\ 1).
once(call("do_something_with_pairs", list(a=1, b=2), expression(X)))
In the code above, dict_pairs/2
takes the role of both preproc/2
and postproc/2
in Figure\ 1. It illustrates that complicated syntax on the R
side can be much simplified when doing the conversion at the Prolog end. Ways to
extend Prolog by add-ons ("packs") are shown in the next section.
In this section we present a few usage examples for package rolog
in
increasing complexity. Although the code snippets are mostly self-explanatory,
some familiarity with the Prolog language is helpful.
Prolog's typical hello world example is a search through a directed acyclic graph (DAG), for example, a family tree like the one given in Listing\ 2.
parent(pam, bob). parent(bob, ann). parent(bob, pat). parent(pat, jim). ancestor(X, Z) :- parent(X, Z). ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
Listing\ 2 : A family tree in Prolog (see also family.pl)
Listing\ 2 is included in the package and is accessed using the
function system.file
. Within Prolog, the normal workflow is to consult
the code with [family]
and then to raise queries such
as ancestor(X, jim)
, which returns, one by one, four solutions for the
variable X. In R, we obtain the following results:
library(rolog) consult(system.file(file.path("pl", "family.pl"), package="rolog")) query(call("ancestor", expression(X), quote(jim))) submit() # solutions for X submit() # etc. clear() # close the query
As stated above, consult
loads the facts and rules of Listing\ 2 into the
Prolog database. query
initializes a query, and the subsequent calls
to submit
return the conditions under which the query succeeds. In this
example, the query succeeds if X
is either pat
, pam
, or bob
. A query is
closed with clear()
, or automatically if the query fails. If we are interested
in just the first solution, we can use once(Call)
as a shortcut
to query(Call)
, then submit()
, then clear()
. If we want to collect all
solutions of a query with a finite set of solutions, we can use findall(Call)
.
As mentioned in Section\ 2, a simplified syntax is provided by as.rolog
that accepts quoted expressions with dots indicating Prolog variables:
Q <- quote(ancestor(.X, jim)) findall(Q, options=list(preproc=as.rolog))
A useful application of DAGs is confounder adjustment in causal
analysis [@greenland1999;@ggdag]. The Prolog file backdoor.pl
is an
implementation of Greenland et al.'s criteria for the backdoor test
for d-separation in DAGs, with a predicate minimal/3
that searches for
minimally sufficient sets of variables for confounder adjustment on the causal
path between exposure and outcome. The nodes and arrows refer to Figure\ 12 in
Greenland et al.
consult(system.file(file.path("pl", "backdoor.pl"), package="rolog")) node <- function(N) invisible(once(call("assert", call("node", N)))) node("a"); node("b"); node("c"); node("f"); node("u") node("e") # exposure node("d") # outcome arrow <- function(X, Y) invisible(once(call("assert", call("arrow", X, Y)))) arrow("a", "d"); arrow("a", "f"); arrow("b", "d"); arrow("b", "f") arrow("c", "d"); arrow("c", "f"); arrow("e", "d"); arrow("f", "e") arrow("u", "a"); arrow("u", "b"); arrow("u", "c") R <- findall(call("minimal", "e", "d", expression(S))) unlist(R)
The query to minimal/3
returns two minimally sufficient sets of covariates for
confounder adjustment (namely, {a, b, c} and {f}).
One of the main driving forces of Prolog development was natural language processing [@Dahl1981]. Therefore, the next example is an illustration of sentence parsing using so-called definite clause grammars. As Listing\ 3 shows, rolog can access modules from SWI's standard library (e.g., "dcg/basics.pl").
:- use_module(library(dcg/basics)). s(s(NP, VP)) --> np(NP, C), blank, vp(VP, C). np(NP, C) --> pn(NP, C). np(np(Det, N), C) --> det(Det, C), blank, n(N, C). np(np(Det, N, PP), C) --> det(Det, C), blank, n(N, C), blank, pp(PP). vp(vp(V, NP), C) --> v(V, C), blank, np(NP, _). vp(vp(V, NP, PP), C) --> v(V, C), blank, np(NP, _), blank, pp(PP). pp(pp(P, NP)) --> p(P), blank, np(NP, _). det(det(a), sg) --> `a`. det(det(the), _) --> `the`. pn(pn(john), sg) --> `john`. n(n(man), sg) --> `man`. n(n(men), pl) --> `men`. n(n(telescope), sg) --> `telescope`. v(v(sees), sg) --> `sees`. v(v(see), pl) --> `see`. p(p(with)) --> `with`. % Translate R string to code points and invoke phrase/2 sentence(Tree, Sentence) :- string_codes(Sentence, Codes), phrase(s(Tree), Codes).
Listing\ 3
: Simple grammar and lexicon. sentence/2
preprocesses the R call.
As in the first example, we first consult a little Prolog program with a
minimalistic grammar and lexicon (Listing\ 3, see also pl/telescope.pl
), and
then raise a query asking for the syntactic structure
of "john sees a man with a telescope". Closer inspection of the two results
reveals the two possible
meanings, "john sees a man who carries a telescope"
versus "john sees a man through a telescope". Further Prolog examples of
natural language processing are found in \citet{Blackburn2005}, including the
resolution of anaphoric references and the extraction of semantic meaning.
consult(system.file(file.path("pl", "telescope.pl"), package="rolog")) Q <- quote(sentence(.Tree, "john sees a man with a telescope")) unlist(findall(Q, options=list(preproc=as.rolog)))
In description of the previous example, we noted in passing that rolog
can
access the built-in libraries of SWI-Prolog (e.g., by calls to use_module/1
).
It is also possible to extend the installation by add-ons, including add-ons
that require compilation, if the build tools (essentially, RTools under Windows,
and xcode under macOS) are properly configured. This is illustrated below by the
demo add-on environ
[@Environ] that collects the current environment
variables.
once(call("pack_install", quote(environ), list(quote(interactive(FALSE))))) once(quote(use_module(library(environ)))) once(call("environ", expression(X)))
The query then unifies X with a list with Key=Value
terms. The purpose of
this example is obviously not to mimic the built-in function Sys.getenv()
from
R, but to illustrate the installation and usage of Prolog extensions from within
R. In most situations, the user would install the pack from within Prolog
with pack_install(environ).
.
Prolog is homoiconic, that is, code is data. In this example, we make use of Prolog's ability to match expressions against given patterns and modify these expressions according to a few predefined "buggy rules" [@Brown1978], inspired by recurrent mistakes in the statistics exams of our students. Consider the $t$-statistic for comparing an observed group average to a population mean:
$$ T = \frac{\overline{X} - \mu}{s / \sqrt{N}} $$
Some mistakes may occur in this calculation, for example, omission of the implicit parentheses around the numerator and the denominator when typing the numbers into a calculator, resulting in $\overline{X} - \frac{\mu}{s} \div \sqrt{N}$, or forgetting the square root around $N$, or both. Prolog code for the two buggy rules is given in Listing\ 4.
% Correct steps and mistakes expert(tratio(X, Mu, S, N), frac(X - Mu, S / sqrt(N))). buggy(frac(X - Mu, S / SQRTN), X - frac(Mu, S) / SQRTN). buggy(sqrt(N), N). % Apply expert and buggy rules, or enter expressions step(X, Y) :- expert(X, Y) ; buggy(X, Y). step(X, Y) :- compound(X), mapargs(search, X, Y), dif(X, Y). % Search through problem space search(X, X). search(X, Z) :- step(X, Y), search(Y, Z).
Listing 4 : Manipulating terms in Prolog
The little e-learning system shown in Listing\ 4 produces six response alternatives. The fourth and the sixth result are combinations of the same two buggy rules (parenthesis, then square root, and the other way round). Some additional filters would be needed to eliminate trivial and redundant solutions \citep[see, e.g., the chapter on generate-and-test in][]{Sterling1994}.
consult(system.file(file.path("pl", "buggy.pl"), package="rolog")) Q <- quote(search(tratio(x, mu, s, n), .S)) unlist(findall(Q, options=list(preproc=as.rolog)))
An important feature of such a term manipulation is that the evaluation of the term can be postponed; for example, there is no need to instantiate the variables x, mu, s, and n with given values before raising a query. This is especially helpful for variables that may represent larger sets of data in later steps.
It should be mentioned that R is homoiconic, too, and the Prolog code above can, in principle, be rewritten in R using non-standard evaluation techniques [@Wickham2019]. Prolog's inbuilt pattern matching algorithm simplifies things a lot, though.
The R extension of the markdown language [@Xie2020] enables reproducible
statistical reports with nice typesetting in HTML, Microsoft Word, and Latex.
However, so far, R expressions such as pbinom(k, N, p)
are typeset as-is;
prettier mathematical expressions such as $P_\mathrm{Bi}(X \le k; N, p)$ require
Latex commands like P_\mathrm{Bi}\left(X \le k; N, p\right)
, which are
cumbersome to type in and hard to read even if the expressions are simple.
Since recently, manual pages include support for mathematical
expressions [@Sarkar2022], which already is a big improvement.
Below Prolog's grammar rules are used for an automatic translation of R calls
to MathML. The result can then be used for calculations or it can be rendered on
a web page. A limited set of rules for translation from R to MathML is found
in pl/mathml.pl
of package rolog
. A more comprehensive translator is
provided by the R package mathml
[@mathml]. The relevant code snippets are
shown in the listings below, along with their output.
library(rolog) consult(system.file(file.path("pl", "mathml.pl"), package="rolog")) # R interface to Prolog predicate r2mathml/2 mathml <- function(term) { t <- once(call("r2mathml", term, expression(X))) cat(paste(t$X, collapse="")) }
Listing 4 : Generate MathML from R expressions
The first example is easy. At the Prolog end, there is a handler for pbinom/3
that translates the term into a pretty MathML syntax like P_bi(X <= k; N, pi).
term <- quote(pbinom(k, N, p)) # Pretty print mathml(term) # Do some calculations with the same term k <- 10 N <- 22 p <- 0.4 eval(term)
The next example is interesting because Prolog needs to find out the name of
the integration variable for sin
. For that purpose, rolog provides a
predicate r_eval/2
that calls R from Prolog (i.e., the reverse direction, see
also next example). Here, the predicate is used for the
R\ function formalArgs(args(sin))
, which returns the name of the function
argument of sin
, that is, x
.
term <- quote(integrate(sin, 0L, 2L*pi)) mathml(term) eval(term)
Note that the Prolog end, the handler for integrate/3
is rather rigid; it
accepts only these three arguments in that particular order, and without names,
that is, integrate(sin, lower=0L, upper=2L * pi)
would not print the desired
result.
The extra R function canonical()
applies match.call()
to non-primitive R
calls, basically cleaning up the arguments and bringing them into the correct
order. Moreover, an extra handler maps the extractor function $(Fn, "value")
to Fn
.
canonical <- function(term) { if(is.call(term)) { f <- match.fun(term[[1]]) if(!is.primitive(f)) term <- match.call(f, term) # Recurse into arguments term[-1] <- lapply(term[-1], canonical) } return(term) } g <- function(u) sin(u) # Mixture of (partially) named and positional arguments in unusual order term <- quote(2L * integrate(low=-Inf, up=Inf, g)$value) mathml(canonical(term)) # It is a bit of a mystery that R knows the result of this integral. eval(term)
Note that both sin
nor g
in the above terms are R symbols, not R functions.
In order to render something like call("integrate", low=-Inf, up=Inf, g)
,
or call("integrate", low=-Inf, up=Inf, sin)
, with g
and sin
referring to
the respective functions, one would need to determine its name, which is not
possible in general.
print(g)
The basic workflow of the bridge from R to Prolog is to (A)\ translate an
R\ expression into a Prolog term (i.e., a predicate), (B)\ query the predicate,
and then, (C)\ translate the result (i.e., the bindings of the variables) back
to R (see also Figure\ 1). The reverse direction is straightforward, we start by
translating a Prolog term to an R\ expression (i.e. Step\ C), evaluate the
R\ expression, and then translate the result back to a Prolog\ term (Step\ A).
Package rolog
provides two predicates for that purpose, r_eval(Expr)
and r_eval(Expr, Res)
. The former is used to invoke an R\ expression Expr
for its side effects (e.g., initializing a random number generator); it does not
return a result. The latter is used to evaluate the R\ expression and return the
result Res
. The code snippet in Listing\ 6 (r_eval.pl
) illustrates this
behavior.
r_seed(Seed) :- r_eval('set.seed'(Seed)). r_norm(N, L) :- r_eval(rnorm(N), L).
Listing\ 6
: Calling R from Prolog using r_eval/1
and r_eval/2
. The R\ call set.seed
is quoted because the dot is an operator in Prolog.
consult(system.file(file.path("pl", "r_eval.pl"), package="rolog")) invisible(once(call("r_seed", 123L))) once(call("r_norm", 3L, expression(X)))
The example in Listing\ 6 is a bit trivial, basically illustrating the syntax
and the workflow. More serious applications of are shown in the next two
sections where r_eval/2
is used to evaluate monotonically behaving
R\ functions and to obtain the names of function arguments in R.
As show below, the default environment of rolog
's r_eval/2
is .GlobalEnv
,
this can be changed in an optional argument to once()
, findall()
,
and query()
.
# Set variable in R, read in Prolog env <- new.env() with(env, a <- 1) once(call("r_eval", quote(a), expression(X)), env=env) # Set R variable in Prolog, read in R invisible(once(call("r_eval", call("<-", quote(b), 2)))) cat("b =", b)
If the R call raises an exception, an error is propagated to Prolog
and finally to the rolog
package:
#try(once(quote(r_eval(rnorm(-1))))) # return "-1" random normals
Let $\langle\ell, u\rangle$ denote a number between $\ell$ and $u$, $\ell\le u$.
It is easily verified that the result of the
difference $\langle\ell_1, u_1\rangle - \langle\ell_2, u_2\rangle$ is somewhere
in the interval $\langle \ell_1 - u_2, u_1 - \ell_2\rangle$, and a number of
rules exist for basic arithmetic operations and (piecewise) monotonically
behaving functions [@Hickey2001]. For ratios, denominators with mixed sign yield
two possible intervals, for example,
$\langle 1, 2\rangle / \langle -3, 3\rangle = \langle -\infty, 3\rangle \cup \langle 3, \infty\rangle$,
as shown in Figure 4 in Hickey et al.'s article. The number of possible
candidates increases if more complicated functions are involved, as unions of
intervals themselves appear as arguments (e.g., if $I_1 \cup I_2$ is added
to $I_3 \cup I_4$, the result
is $I_1 + I_3 \cup I_1 + I_4 \cup I_2 + I_3 \cup I_2 + I_4$). As a consequence,
calculations in interval arithmetic are non-deterministic in nature, and the
number of possible results is not foreseeable and cannot, in general, be
vectorized as is often done in R. Use cases for interval arithmetic are the
limitations of floating-point representations in computer hardware, but
intervals can also be used to represent the result of measurements with limited
precision, or truncated intermediate results of students doing hand
calculations. A few rules for basic interval arithmetic are found
in pl/interval.pl
; a few examples are shown below. Again, Prolog rings back to
R via r_eval/2
to determine the result of dbinom(X, Size, Prob, Log)
.
#consult(system.file(file.path("pl", "interval.pl"), package="rolog")) #Q <- quote(int(`...`(1, 2) / `...`(-3, 3), .Res)) #unlist(findall(Q, options=list(preproc=as.rolog))) #D <- quote(`...`(5.7, 5.8)) #mu <- 4 #s <- quote(`...`(3.8, 3.9)) #N <- 24L #tratio <- call("/", call("-", D, mu), call("/", s, call("sqrt", N))) #once(call("int", tratio, expression(Res))) # Binomial density #prob = quote(`...`(0.2, 0.3)) #once(call("int", call("dbinom", 4L, 10L, prob, FALSE), expression(Res)))
The slightly cumbersome syntax for entering an interval $\langle \ell, u\rangle$ is due to the fact that the ellipsis is a reserved symbol in R and cannot be used as an infix operator. A powerful and comprehensive system for constraint logic programming over intervals is available as a Prolog pack [@Workman2021] and can easily be connected to R using, for example, the present package.
R has become the primary language for statistical programming and data science,
but is currently lacking support for traditional, symbolic artificial
intelligence. There are already two add-ons for SWI-Prolog that allow to run R
calculations from Prolog [@Angelopoulos2013;@Rserve], but a connection in the
other direction was missing, so far. rolog
bridges this gap by providing an
interface to a SWI-Prolog distribution in an R package. The communication
between the two systems is mainly in the form of queries from R to Prolog, but
two predicates allow Prolog to ring back and evaluate terms in R. The design of
the package is minimalistic, providing three main
functions query()
, submit()
, and clear()
, and a very limited set of
convenience tools (consult()
, once()
, and findall()
) to facilitate
recurrent everyday actions. As both systems are homoiconic in nature, it was
easy to establish a one-to-one correspondence between many of the elements of
the two languages. Most exceptions (e.g., lack of R support for empty symbols)
can be avoided and/or circumvented by wrapper functions at both ends.
Simple ways to extend the package have been described in Section\ 2; such
extensions could, for example, include R objects and structures like those
returned by lm()
, or S4 classes. In many use cases, this may be realized by
transforming the R object to a list with named elements, and rebuild the object
on the Prolog end on an as-needed basis. After a query, the process is reversed.
If speed is an issue, more of these steps can, in principle, be moved into the
package and implemented in Rcpp
.
rolog
, thus, opens up a wide of applications in logic programming for
statisticians and researchers at the intersection of symbolic and connectionist
artificial intelligence, where concise knowledge representation is combined with
statistical power. Moreover, rolog
provides starting points for useful
small-scale solutions for everyday issues in data science (term transformations,
pretty mathematical output, interval arithmetic, see Section\ 3).
At its present stage, a major limitation of rolog
is its relatively slow
speed. For example, translation of R lists or vectors to the respective elements
of the Prolog language (also lists, #/N
) is done element-wise, in both
directions. The translation is optimized by using Rcpp
[@Edelbuettel2018], but
there remains an upper bound in the efficiency, because Prolog does not support
vectors or matrices. Since Prolog's primary purpose is not vector or matrix
calculation, this limitation may not show up in real-world applications. Another
issue, maybe a bit annoying, is the rather cumbersome syntax of the interface,
with the need for quoted calls and R expressions for representing Prolog
variables. rolog
was deliberately chosen to be minimalistic and, so far, only
depends on base R. A more concise representation might be obtained by tools from
the "Tidyverse" ecosystem, as described in Chapter\ 19 of
Advanced\ R [@Wickham2019]. Finally, at this stage, rolog
is unable to deal
with cyclic
terms (e.g., once(call("=", expression(A), call("f", expression(A))))
,
i.e., A = f(A)
raises an error message).
rolog
is available for R Version 4.2 and later, and can easily be installed
using the usual install.packages("rolog")
. The source code of the package is
found at https://github.com/mgondan/rolog/, including installation instructions
for Unix, Windows and macOS.
Development of the package profited substantially from the Prolog
packs rserve_client
[@Rserve] and real
[@Angelopoulos2013].
The results in this paper were obtained using
R\ r paste(R.Version()[7:8], collapse = ".")
with
the rolog
\ r packageVersion("rolog")
package. R\ itself and all packages
used are available from the Comprehensive R Archive Network (CRAN)
at https://CRAN.R-project.org/.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.