RTypeInference: Infer Types of Inputs and Outputs for R Expressions

Some functions are polymorphic, which means that the types of their parameters are flexible. An example of this is the identity function

function(x) x

where the parameter x is not constrained to any specific type.

We can represent the type of the identity function by $$\forall a. a \to a$$, where $$a$$ is a "type variable" that serves as a placeholder for a concrete type. The quantifier $$\forall a$$ says that $$a$$ is a parameter, which means $$a$$ can be replaced by a different type each time the function is called. Consequently, this kind of polymorphism is called parametric polymorphism.

Type variables are also used when a type is unknown. For instance, in

function(x) {
  y = f(x)
  g(y)
}

the type of x is initially unknown, so x is assigned a new type variable, say $$t_1$$. If it is known that $$ f: \forall a. a \to a g: Integer \to Boolean $$ then we can conclude that $$t_1 = Integer$$. This requires several steps. First, the instance of f in the expression f(x) must have type $$t_1 \to t_1$$ because $$x: t_1$$. In the implementation, this is computed in four steps:

The function f is instantiated to type $$t_2 \to t_2$$ for a new type variable $$t_2$$. Note that $$t_2$$ is not quantified.
The call f(x) says that f must also have type $$t_1 \to t_3$$ for a new type variable $$t_3$$. In other words, f must be a function and must accept an argument with the same type as x; the return type is unknown.
The two types for f must agree, so they are unified. This produces a substitution, for example $$t_1 \mapsto t_2, t_3 \mapsto t_2$$.
The substitution is applied to the type environment, so $$x: t_2$$. This is important because the substitution may affect the original type variables for some of the arguments (as with $$t_1$$ here).

As a result, the type of f(x) is $$t_2$$ and therefore the type of y is $$t_2$$. Since $$t_2$$ is already bound to x in the type environment, it cannot be quantified.

Next, the same procedure is applied to the call g(y). The function g is not polymorphic, so the first step is simplified:

The function g has type $$Integer \to Boolean$$
The call g(x) says that g must also have type $$t_2 \to t_4$$ for a new type variable $$t_4$$.
The two types for g must agree, so they are unified. This produces the substitution $$t_2 \mapsto Integer, t_4 \mapsto Boolean$$.
The substitution is applied to the type environment, so $$x: Integer$$.

duncantl/RTypeInference documentation built on Jan. 16, 2021, 12:30 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

duncantl/RTypeInference
Infer Types of Inputs and Outputs for R Expressions

inst/notes/instantiation.md
In duncantl/RTypeInference: Infer Types of Inputs and Outputs for R Expressions

R Package Documentation

Browse R Packages

We want your feedback!

duncantl/RTypeInference Infer Types of Inputs and Outputs for R Expressions

inst/notes/instantiation.md In duncantl/RTypeInference: Infer Types of Inputs and Outputs for R Expressions

R Package Documentation

Browse R Packages

We want your feedback!

duncantl/RTypeInference
Infer Types of Inputs and Outputs for R Expressions

inst/notes/instantiation.md
In duncantl/RTypeInference: Infer Types of Inputs and Outputs for R Expressions