strsplit1: Split the first field


View source: R/strsplit1.R

Description

Split the first field from x, identified as all the characters preceding the first unquoted occurrence of split.

Usage

strsplit1(x, split=',', Quote='"', ...)

Arguments

x

a character vector to be split

split

the split character

Quote

a quote character: Occurrences of split between pairs of Quote are ignored.

...

optional arguments passed to regexpr (documented on the grep help page)

Details

This function was written to help parse data from the US Department of Health and Human Services on cyber-security breaches affecting 500 or more individuals. As of 2014-06-03, the CSV version of these data included commas inside quoted fields, where they are not separator characters. The function splits off fields one at a time so that each can be inspected manually, which makes parsing errors easier to correct.
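To illustrate the problem described above, the following base R snippet splits a made-up line (not taken from the HHS data) on every comma. The quoted field is broken in two, which is the kind of parsing error a quote-aware split is meant to avoid.

```r
# Hypothetical CSV line with a comma inside a quoted field
line <- '"Smith, John",TX,500'

# A naive split on every comma ignores the quotes
pieces <- strsplit(line, ",", fixed = TRUE)[[1]]

# Four pieces instead of the three intended fields
print(pieces)
print(length(pieces))
```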

Algorithm:

1. spl1 <- regexpr(split, x, ...)

2. Qt1 <- regexpr(Quote, x, ...)

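3. For any (Qt1<spl1), look for Qt2 <- regexpr(Quote, substring(x, Qt1+1)), then look for spl1 <- regexpr(split, substring(x, Qt1+Qt2+1)). Both results are positions within the substring searched, so the position of the split in x is Qt1+Qt2 plus this second result.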

4. out <- list(substr(x, 1, spl1-1), substring(x, spl1+1))
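The steps above can be sketched in base R as follows. This is an illustrative re-implementation, not the Ecfun source. The name split_first_field is hypothetical, the sketch matches split and Quote literally (fixed = TRUE) instead of forwarding ... to regexpr, and it assumes that a string with no unquoted split, or with an unmatched quote, is returned whole with an empty remainder. Like the algorithm as listed, it skips only the first quoted span.

```r
# Hypothetical sketch of the four steps; not the package implementation.
split_first_field <- function(x, split = ",", Quote = '"') {
  nms <- names(x)
  # Steps 1 and 2: first split and first quote positions (-1 if absent)
  spl1 <- as.integer(regexpr(split, x, fixed = TRUE))
  Qt1 <- as.integer(regexpr(Quote, x, fixed = TRUE))

  # Step 3: where an opening quote precedes the first split,
  # look for the closing quote and then for a split after it
  quoted <- which(Qt1 > 0 & spl1 > 0 & Qt1 < spl1)
  for (i in quoted) {
    inner <- substring(x[i], Qt1[i] + 1)
    Qt2 <- as.integer(regexpr(Quote, inner, fixed = TRUE))
    if (Qt2 < 0) {
      # Assumption: an unmatched quote means no split is made
      spl1[i] <- -1L
      next
    }
    after <- substring(x[i], Qt1[i] + Qt2 + 1)
    s <- as.integer(regexpr(split, after, fixed = TRUE))
    # Convert the relative position back to a position in x[i]
    spl1[i] <- if (s > 0) Qt1[i] + Qt2 + s else -1L
  }

  # Step 4: cut at the split; keep the whole string if none was found
  found <- spl1 > 0
  first <- ifelse(found, substr(x, 1, spl1 - 1), x)
  remainder <- ifelse(found, substring(x, spl1 + 1), "")
  names(first) <- names(remainder) <- nms
  list(first, remainder)
}

# The comma inside the quotes in element b is not treated as a split
res <- split_first_field(c(a = 'abc,def', b = '"a,b",c'))
print(res)
```

Here the first component holds 'abc' and '"a,b"', and the second holds 'def' and 'c'.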

Value

A list of length 2: The first component of the list contains the character strings found before the first unquoted occurrence of split. The second component contains the character strings remaining after the characters up to the identified split are removed.

Author(s)

Spencer Graves

See Also

strsplit, substring, grep

Examples

library(Ecfun)

# Test strings: no quotes, unmatched quotes, and commas inside quotes
chars2split <- c(qs00='abcdefg', qs01='abc,def',
   qs10a='"abcdefg', qs10b='abc"defg',
   qs1.1='"abc,def', qs20='"abc" def',
   qs2.1='"ab,c" def', qs21='"abc", def', qs22.1='"a,b",c')

split <- strsplit1(chars2split)

# answer: the first fields, then what remains after each split
split. <- list(
   c(qs00='abcdefg', qs01='abc', qs10a='"abcdefg',
     qs10b='abc"defg', qs1.1='"abc,def', qs20='"abc" def',
     qs2.1='"ab,c" def', qs21='"abc"', qs22.1='"a,b"'),
   c(qs00='', qs01='def', qs10a='',
     qs10b='', qs1.1='', qs20='', qs2.1='',
     qs21=' def', qs22.1='c') )

# compare the result with the expected answer
all.equal(split, split.)

Ecfun documentation built on May 2, 2019, 6:53 p.m.