returns the places of one word

Description

From a text, returns the positions of a word, expressed as intervals as defined in the function text3interval. The search can be constrained to specific columns, and the output can be restricted to selected occurrences.

Usage

  text3places8word(text,word,
                   column=c(1,Inf),
                   which=c(1,Inf)
                   ) 

Arguments

text

A character vector containing the text (each component is a line).

word

character(1): the word to be found.

column

The range of columns in which the first character of the word must be found. c(1,1) means that it must be at the very start of a line. c(10,12) means that it must start in the 10th, 11th or 12th column of a line.

which

Which occurrences of word (not line numbers) must be returned, defined by the rank of the first one and the rank of the last one.
So c(2,2) designates the second occurrence and only the second; c(1,5) asks for the first five. When both components are negative, the numbering is done from the end, so c(-1,-1) means the last one and c(-1,-3) asks for the last three, listed starting from the last.
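
The column and which arguments can be pictured with a small base-R sketch. It is not the implementation of text3places8word: the helper names find_starts and pick_occurrences are hypothetical, the matching relies on gregexpr with fixed=TRUE, and the handling of out-of-range requests (silently clamped here) is an assumption.

```r
# Hypothetical helper: (line, column) of every start of `word` in `text`,
# keeping only the starts whose column lies within `column`.
find_starts <- function(text, word, column = c(1, Inf)) {
  res <- matrix(integer(0), nrow = 0, ncol = 2,
                dimnames = list(NULL, c("line", "col")))
  for (i in seq_along(text)) {
    pos <- gregexpr(word, text[i], fixed = TRUE)[[1]]
    pos <- pos[pos > 0]                              # gregexpr gives -1 when nothing is found
    pos <- pos[pos >= column[1] & pos <= column[2]]  # column constraint on the first character
    if (length(pos) > 0) {
      res <- rbind(res, cbind(line = i, col = as.integer(pos)))
    }
  }
  res
}

# Hypothetical helper: ranks of the occurrences selected by `which` among n found.
# Assumption: requests beyond the n available occurrences are clamped.
pick_occurrences <- function(n, which = c(1, Inf)) {
  if (n == 0) return(integer(0))
  if (all(which < 0)) {
    # Both components negative: count from the end, last occurrence first.
    from <- min(abs(which))
    to <- min(max(abs(which)), n)
    if (from > n) return(integer(0))
    return(n + 1 - seq(from, to))
  }
  from <- which[1]
  to <- min(which[2], n)
  if (from > to) return(integer(0))
  seq(from, to)
}

# Usage: the last three starts of "et", listed from the last one.
txt <- c("Il etait une fois", "un petit et rouge chaperon")
starts <- find_starts(txt, "et")
starts[pick_occurrences(nrow(starts), c(-1, -3)), , drop = FALSE]
```

The sketch returns only a line and a starting column per occurrence; the actual function returns a four-column interval matrix as described under Value.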

Details

A word cannot span two successive lines, but the same line can contain more than one occurrence of the word. Be aware that overlapping patterns are not all detected (see the last example).
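
The overlapping caveat can be reproduced with base R alone. The sketch below uses gregexpr, which resumes scanning after the end of each match; whether text3places8word relies on the same mechanism is an assumption, and the lookahead variant is only one possible way to list every start position.

```r
# Non-overlapping scan: after matching "aa" at column 1,
# the search resumes at column 3, so the match at column 2 is missed.
non_overlapping <- as.integer(gregexpr("aa", "aaaa", fixed = TRUE)[[1]])

# A zero-width lookahead (Perl syntax) consumes no characters,
# so every starting column is reported.
overlapping <- as.integer(gregexpr("(?=aa)", "aaaa", perl = TRUE)[[1]])

print(non_overlapping)  # [1] 1 3
print(overlapping)      # [1] 1 2 3
```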

Value

A four-column matrix; each row gives the place of one occurrence of the word as an interval.
For negative values of which, the order of occurrences is reversed: the last one found will be in the first row of the output matrix.

Future

Think of a way to introduce "end of line" as a possible word. Improve the case of overlapping patterns.

Examples

  text3places8word(letters,"j"); 
  text3places8word(letters,"J"); 
  text3places8word(c("Il etait une fois","un petit et rouge chaperon"),"et"); 
  text3places8word(c("Il etait une fois","un petit et rouge chaperon"),"et",which=c(2,3)); 
  text3places8word(c("Il etait une fois","un petit et rouge chaperon"),"et",which=-c(1,3)); 
  text3places8word(c("# Il etait une fois"," #un petit et rouge chaperon"),"#"); 
  text3places8word(c("# Il etait une fois"," #un petit et rouge chaperon"),"#",column=c(1,2)); 
  text3places8word(c("# Il etait une fois"," #un petit et rouge chaperon"),"#",column=c(2,2)); 
  # overlapping pattern 
  text3places8word("aaaa","aa"); 
