parse8text: returns the parsed content of a text
In rbsa: Ancillary Functions for R Programming


From a text comprising paragraphs and items, finds the different components and returns them as a list. When the component is a paragraph, it is a character vector; when the component is an item list, it is a named list.
This function is not intended for standard users.


 
  parse8text(text,item1=c("{","}"),item2=c("<<",">>"),
             numb="#",bull="*",lsep="-")

`text`	The text to be parsed. For the moment just a character vector.
`item1`	character(2) the pair of tags used to delimit the first value of an item. When the first value is numb, the item is interpreted as part of an enumeration; when it is bull, as part of an itemized list; otherwise as part of a description list. The first character of item1[1] must be at the beginning of a line, and the opening and closing tags must be on the same line.
`item2`	character(2) the pair of tags to use to define the second value of an item.
`numb`	character(1) code to indicate automatically numbered items.
`bull`	character(1) code to indicate bullet items.
`lsep`	character(1) Each line starting with lsep is considered a tagging line that separates two paragraphs or two item lists. Such lines can also be placed between a paragraph and an item list, but they are not needed there. Separating lines within list items are not treated as separators. Successive separating lines count as a single separating line. Separating lines are removed from the resulting list.

Each item of a list must comprise two values, delimited by the item1 and item2 tags respectively. When the first value is numb, it is a numbered item; when the first value is bull, it is a bullet item; otherwise it is a labelled item.
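The rule above can be sketched in base R. This is only an illustration of the documented rule, not the code used by parse8text: the helper classify_item is hypothetical, it handles a single line, and it assumes the default tags.

```r
# Hypothetical helper, not part of rbsa: classifies one line according to
# the first value of the item it may contain.
classify_item <- function(line, item1 = c("{", "}"), numb = "#", bull = "*") {
  # The opening tag must be at the very beginning of the line.
  if (!startsWith(line, item1[1])) return("paragraph")
  # The closing tag must be on the same line (literal match, not a regex).
  close <- as.integer(regexpr(item1[2], line, fixed = TRUE))
  if (close < 0) return("paragraph")
  # Text found between the opening and the closing tag.
  first <- substr(line, nchar(item1[1]) + 1, close - 1)
  if (first == numb) {
    "numbered"
  } else if (first == bull) {
    "bullet"
  } else {
    "labelled"
  }
}

classify_item("{#} <<un>>")
classify_item("{*} <<un>>")
classify_item("{AA} <<un>>")
classify_item("1rst paragraph")
```

The four calls cover the three kinds of item plus a line that does not start with the opening tag and is therefore treated as paragraph text.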
Each component, whether a paragraph or an item, is expected to occupy its own lines: two components must not share a line.
Successive items are considered to belong to the same list of items. Empty lines (comprising zero characters) are eliminated first, so they do not interrupt a list; a line containing a single blank is not empty and is treated as a paragraph. Lines starting with lsep are also eliminated; their role is to separate distinct paragraphs and lists.
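The difference between an empty line, a blank line and a separating line can be shown with a few lines of base R. This is a sketch of the filtering described above, not the internal code of parse8text; the vector uu and the variable names are made up for the illustration.

```r
# Illustration only: which input lines survive the preliminary filtering.
uu <- c("1st paragraph", "", " ", "---", "---", "2nd paragraph")
lsep <- "-"

# Step 1: drop strictly empty lines; the line " " has one character and stays.
kept <- uu[nchar(uu) > 0]

# Step 2: flag lines starting with lsep; they only act as separators.
is_sep <- startsWith(kept, lsep)

# Lines that carry actual content.
content <- kept[!is_sep]
content
```

Here content holds "1st paragraph", " " and "2nd paragraph": the blank line is kept as text, while the empty line and both separating lines are gone.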
When the tags for items are not consistent, no error is reported, but the text concerned is interpreted as part of a paragraph.
When two list items have identical labels, an error is reported.

A named list. The names for paragraphs start with P, those for item lists with L.
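Because the kind of component is encoded in the first letter of its name, paragraphs and item lists can be separated with base R. The list res below is a hand-built stand-in, not actual output of parse8text: only the leading P and L are documented, so the numbering in the names and the content of the item list are assumptions.

```r
# Hypothetical result, built by hand for the illustration.
res <- list(
  P1 = "1rst paragraph",
  P2 = "2d paragraph",
  L3 = list(AA = "un", BBB = "deux"),
  P4 = "3rd and last paragraph"
)

# Select components by the first letter of their names.
paragraphs <- res[startsWith(names(res), "P")]
item_lists <- res[startsWith(names(res), "L")]

names(paragraphs)
names(item_lists)
```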

 
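  # parse8text and rbsa0 are provided by the rbsa package
  library(rbsa)
  # a single labelled item whose second value spans several lines
  parse8text(c("{a}","<<","pour voir",">>"))
  # paragraphs surrounding a numbered list
  uu <- c("1rst paragraph","","2d paragraph","",
          "{#} <<un>>","{#}","<<deux>>","","3rd and last paragraph")
  parse8text(uu)
  # paragraphs surrounding a labelled list
  vv <- c("1rst paragraph","","2d paragraph","",
          "{AA} <<un>>","{BBB}","<<deux>>","","3rd and last paragraph")
  parse8text(vv)
  # a text stored in the rbsa0 object
  parse8text(rbsa0$text4$v)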