aline: Calculate aline distances


Description

The main user function for computing ALINE distances. It also provides options for additional outputs, such as the raw alignments and individual distance measurements. Word lists are passed as two vectors (w1, w2) such that the nth element of w1 is compared with the nth element of w2.

Usage

aline(w1, w2, sim = FALSE, m1 = NULL, m2 = NULL, mark = FALSE, alignment = FALSE, ...)

Arguments

w1

A vector of IPA-encoded words.

w2

A second vector of IPA-encoded words to be aligned with w1.

sim

If FALSE (the default), the function calculates the ALINE distance (normalized between word pairs) as defined in Downey et al. (2008). If TRUE, the ALINE similarity scores defined in Kondrak (2000) are returned instead.

m1

User-defined IPA symbol. See map() for details.

m2

User-defined ALINE symbol to which m1 is mapped. See map() for details.

alignment

If TRUE, the function returns the aligned IPA word pairs.

mark

If TRUE, invalid characters are marked with "@" in the result.

...

Other parameters passed to raw.alignment().
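The relationship between the two settings of sim can be sketched in a few lines of base R. This is only an illustration, assuming the normalization in Downey et al. (2008) takes the form 1 - 2s / (s1 + s2), where s is the similarity between the two words and s1, s2 are each word's similarity with itself. The function and argument names are hypothetical and are not part of alineR.

```r
# Hypothetical helper, not part of alineR: converts similarity scores into a
# normalized distance, assuming the form 1 - 2s / (s1 + s2).
# s12: similarity score between word 1 and word 2
# s11: self-similarity score of word 1
# s22: self-similarity score of word 2
aline_distance_from_similarity <- function(s12, s11, s22) {
  1 - (2 * s12) / (s11 + s22)
}

# Identical words (s12 equals both self-similarities) give a distance of 0.
aline_distance_from_similarity(s12 = 100, s11 = 100, s22 = 100)

# A pair whose similarity is half of each self-similarity gives 0.5.
aline_distance_from_similarity(s12 = 50, s11 = 100, s22 = 100)
```

Under this assumption the distance is 0 for identical words and approaches 1 as the similarity between the two words approaches 0.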

Value

If alignment=FALSE, the function returns a vector of scores such that the nth score is the ALINE distance (or the similarity score, if sim=TRUE) between the nth elements of w1 and w2.

If alignment=TRUE, the function returns a data frame in which each word pair is represented by one column, with the following rows:

w1

The original IPA-encoded word from the first vector (w1).

w2

The original IPA-encoded word from the second vector (w2).

scores

The similarity or distance score as defined by argument sim.

a1

The alignment of the first word.

a2

The alignment of the second word.
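Because the alignment=TRUE result stores one word pair per column, it can be convenient to transpose it into one row per pair. The following is a minimal sketch, assuming the result has exactly the five rows listed above, in that order; tidy_alignment is a hypothetical helper and is not part of alineR.

```r
# Hypothetical helper, not part of alineR: turns the column-per-pair result
# into a data frame with one row per word pair.
# Assumes the rows appear in the documented order: w1, w2, scores, a1, a2.
tidy_alignment <- function(res) {
  # as.matrix() yields a character matrix; t() puts word pairs in rows
  out <- as.data.frame(t(as.matrix(res)), stringsAsFactors = FALSE)
  names(out) <- c("w1", "w2", "scores", "a1", "a2")
  rownames(out) <- NULL
  out$scores <- as.numeric(out$scores)
  out
}

# Usage, only run when alineR is installed
if (requireNamespace("alineR", quietly = TRUE)) {
  w1 <- c(intToUtf8(c(100, 105, 331, 331, 105, 114, 97)), "dinnira")
  w2 <- c("dinnira", "dinnira")
  res <- alineR::aline(w1 = w1, w2 = w2, alignment = TRUE)
  print(tidy_alignment(res))
}
```

If additional arguments passed through ... change the shape of the result, the column names assigned in the helper would need to be adjusted.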

Note

If an input word contains unmapped IPA symbols, this function issues a warning and drops the unknown characters. The warning can be eliminated by supplying an additional IPA-to-ASCII character mapping through the m1 and m2 arguments.

Author(s)

Sean Downey and Guowei Sun

References

Kondrak, G. (2000). A new algorithm for the alignment of phonetic sequences. In Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference.

Downey, S. S., Hallmark, B., Cox, M. P., Norquest, P., & Lansing, J. S. (2008). Computational feature-sensitive reconstruction of language relationships: Developing the ALINE distance for comparative historical linguistic reconstruction. Journal of Quantitative Linguistics, 15(4), 340-369.

See Also

raw.alignment, map

Examples

# Build the IPA words from Unicode code points
x <- c(intToUtf8(c(361, 109, 108, 97, 116, 952)), intToUtf8(c(100, 105, 331, 331, 105, 114, 97)))
y <- c(intToUtf8(c(418, 109, 108, 97, 116, 952)), intToUtf8(c(100, 105, 110, 110, 105, 114, 97)))
# For CRAN requirement; to see x and y, print them in the R console
x
y
aline(w1 = x, w2 = y)   # A warning is issued because y[1] contains an unknown character (U+01A2)

# User substitution: map the unknown IPA symbol U+01A2 to the ALINE symbol "o"
aline(w1 = x, w2 = y, m1 = intToUtf8(418), m2 = "o")

Example output

[1] "<U+0169>mlat<U+03B8>" "di<U+014B><U+014B>ira"
[1] "<U+01A2>mlat<U+03B8>" "dinnira"         
Invalid character: <U+01A2> in <U+01A2>mlat<U+03B8>
[1] 0.04615385 0.10810811
[1] 0.07647059 0.10810811

alineR documentation built on May 2, 2019, 11:26 a.m.
