trackPrevContext: Functions related to mentions of the same entity in...
In johnwdubois/rezonateR: A Support Package for Working with Rezonator in R

lastMentionUnit

R Documentation

Functions related to mentions of the same entity in previous/following context in track chains.

Description

Usage

lastMentionUnit(
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

unitsToLastMention(
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

lastMentionToken(
  tokenOrder = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

tokensToLastMention(
  tokenOrder = NULL,
  chain = NULL,
  zeroProtocol = "literal",
  zeroCond = NULL,
  unitSeq = NULL,
  unitTokenSeqName = NULL,
  unitDF = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

countPrevMentions(
  windowSize,
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

countPrevMentionsIf(
  windowSize,
  cond,
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

countPrevMentionsMatch(
  windowSize,
  field,
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

getPrevMentionField(
  field,
  tokenOrder = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

nextMentionUnit(
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

unitsToNextMention(
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

nextMentionToken(
  tokenOrder = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

tokensToNextMention(
  tokenOrder = NULL,
  chain = NULL,
  zeroProtocol = "literal",
  zeroCond = NULL,
  unitSeq = NULL,
  unitTokenSeqName = NULL,
  unitDF = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

countNextMentions(
  windowSize,
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

countNextMentionsIf(
  windowSize,
  cond,
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

countNextMentionsMatch(
  windowSize,
  field,
  unitSeq = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

getNextMentionField(
  field,
  tokenOrder = NULL,
  chain = NULL,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

countPrevBridges(
  windowSize,
  frameMatrix,
  unitSeq = NULL,
  chain = NULL,
  inclRelations = NULL
)

Arguments

`unitSeq`	The vector of unit sequence values where the mentions appeared. You can choose unitSeqFirst, unitSeqLast, or maybe an average of the two. By default it's unitSeqLast.
`chain`	The chain that each mention belongs to.
`exclFrag`	Exclude 'fragments' (i.e. members of a combined chunk which do not serve as meaningful chunks in their own right).
`combinedChunk`	The `combinedChunk` column of the rezrDF. By default, named `combinedChunk`.
`nonFragmentMember`	Vector indicating whether each entry is a non-fragment member, i.e. a member of a combined chunk that also serves as a meaningful chunk in its own right.
`tokenOrder`	The vector of sequence values where the mentions appeared. Common choices are docTokenSeqFirst, docTokenSeqLast, wordTokenSeqFirst and wordTokenSeqLast (the last two are available after running addIsWordField on a rezrObj). By default it's docTokenSeqLast.
`zeroProtocol`	If `literal`, I will take the seq values of the zeroes at face value. (If you set zeroes as non-words and use `docWordSeqFirst` or `discourseWordSeqLast` as your `tokenOrder`, this will lead to meaningless values.) If `unitFinal`, I will treat zeroes as if they were the final word of the unit. If `unitFirst`, I will treat zeroes as if they were the first word of the unit.
`zeroCond`	A condition for determining whether a token is zero. For most people, this should be `(word column) == "<0>"`.
`unitTokenSeqName`	The name of the corresponding tokenSeq column in the unit table (`unitDF`). By default, `docTokenSeqLast` is used.
`windowSize`	The size of the window in which you will be counting.
`cond`	For the `If` functions (`countPrevMentionsIf`, `countNextMentionsIf`), the condition that the previous / next mention must satisfy. It cannot refer to the current mention.
`field`	The field whose value you want to match or extract.
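
The three `zeroProtocol` settings described above can be illustrated with a minimal base-R sketch. This is not the rezonateR implementation: the helper `adjust_zero_positions`, the column names `firstTokenSeq` and `lastTokenSeq`, and the toy data are all hypothetical and only show how the position of a zero mention might be replaced before distances are computed.

```r
# Hypothetical helper, NOT part of rezonateR: relocate zero mentions
# according to a zeroProtocol-style setting before measuring distances.
adjust_zero_positions <- function(tokenOrder, isZero, unitSeq, unitDF,
                                  zeroProtocol = c("literal", "unitFinal", "unitFirst")) {
  zeroProtocol <- match.arg(zeroProtocol)
  # "literal": take the recorded sequence values at face value
  if (zeroProtocol == "literal") return(tokenOrder)
  # Look up the unit each mention belongs to in the (assumed) unit table
  rows <- match(unitSeq, unitDF$unitSeq)
  target <- if (zeroProtocol == "unitFinal") {
    unitDF$lastTokenSeq[rows]   # treat zeroes as the final token of the unit
  } else {
    unitDF$firstTokenSeq[rows]  # treat zeroes as the first token of the unit
  }
  # Only zero mentions are moved; overt mentions keep their own position
  ifelse(isZero, target, tokenOrder)
}

# Toy data (made up): two units covering tokens 1-5 and 6-9
unitDF <- data.frame(unitSeq = c(1, 2), firstTokenSeq = c(1, 6), lastTokenSeq = c(5, 9))
tokenOrder <- c(2, 4, 7)
isZero <- c(FALSE, TRUE, TRUE)
unitSeq <- c(1, 1, 2)

literal_pos <- adjust_zero_positions(tokenOrder, isZero, unitSeq, unitDF, "literal")
final_pos <- adjust_zero_positions(tokenOrder, isZero, unitSeq, unitDF, "unitFinal")
first_pos <- adjust_zero_positions(tokenOrder, isZero, unitSeq, unitDF, "unitFirst")
```

In this toy case the two zero mentions move to the end (or start) of their units, while the overt mention at position 2 is unchanged under every setting.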

Note

The default values do not work with case_when(). I am still figuring out why. In the meantime, please specify unitSeq, combinedChunk etc. within case_when().

Examples

library(rezonateR)

sbc007 = addUnitSeq(sbc007, "track")
# Get the number of units to the last mention
sbc007$trackDF$default = sbc007$trackDF$default %>%
  rez_mutate(unitsToLastMention = unitsToLastMention(unitSeqLast))
# Get the number of words to the last mention
sbc007$trackDF$default = sbc007$trackDF$default %>%
  rez_mutate(wordsToLastMention = tokensToLastMention(
    docWordSeqLast, # What seq to use
    zeroProtocol = "unitFinal", # How to treat zeroes
    zeroCond = (text == "<0>"),
    unitDF = sbc007$unitDF)) # Additional argument for unitFinal protocol
# Get the character length of the previous mention
sbc007$trackDF$default = sbc007$trackDF$default %>%
  addFieldLocal(fieldName = "prevLength",
                expression = nchar(getPrevMentionField(text)),
                fieldaccess = "auto")
# Get the number of zero mentions and zero status-matching mentions in the last 20 units
# (the two counts need distinct column names, otherwise the second overwrites the first)
sbc007$trackDF$default %>%
  rez_mutate(isZero = text == "<0>") %>%
  rez_mutate(noPrevZeroMentionsIn20 = countPrevMentionsIf(20, isZero),
             noPrevZeroMatchesIn20 = countPrevMentionsMatch(20, isZero))
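
To make the difference between the `If` and `Match` counts in the last example concrete, here is a rough base-R sketch on toy data. It is not the rezonateR code: the helpers `count_prev_if` and `count_prev_match` are hypothetical, and the window rule used (earlier mentions in the same chain, between 1 and `windowSize` units back) is an assumption about how the window is bounded.

```r
# Hypothetical helpers, NOT part of rezonateR.
# Assumption: the window covers mentions 1..windowSize units before the
# current one, within the same chain; same-unit mentions are not counted.

# "If"-style count: previous mentions that satisfy a condition of their own
count_prev_if <- function(windowSize, cond, unitSeq, chain) {
  vapply(seq_along(unitSeq), function(i) {
    dist <- unitSeq[i] - unitSeq
    inWindow <- chain == chain[i] & dist > 0 & dist <= windowSize
    sum(inWindow & cond)  # cond is evaluated on the previous mentions only
  }, integer(1))
}

# "Match"-style count: previous mentions whose field equals the current one's
count_prev_match <- function(windowSize, field, unitSeq, chain) {
  vapply(seq_along(unitSeq), function(i) {
    dist <- unitSeq[i] - unitSeq
    inWindow <- chain == chain[i] & dist > 0 & dist <= windowSize
    sum(inWindow & field == field[i])
  }, integer(1))
}

# Toy data (made up): five mentions in two chains
chain <- c("a", "a", "b", "a", "a")
unitSeq <- c(1, 3, 4, 5, 30)
isZero <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

prev_zero <- count_prev_if(20, isZero, unitSeq, chain)
prev_same_status <- count_prev_match(20, isZero, unitSeq, chain)
```

Under these assumptions, the second mention of chain "a" has one previous zero mention in the window but no previous mention sharing its own (non-zero) status, which is why the two columns can differ.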

johnwdubois/rezonateR documentation built on April 17, 2025, 4:08 p.m.