Description Usage Arguments Details Value Author(s) References See Also Examples
For asNumericChar
, delete leading blanks and a
leading dollar sign plus commas (thousand separators)
and drop information after a blank (other than leadng
blanks), then coerce to numeric or to factors, Dates,
or POSIXct as desired.
For a data.frame
, apply
asNumericChar
to all columns and drop
columns except those in keep
, ignore
,
factors
, Dates
, POSIX
and
MSdates
.
Then order the rows by the
orderBy
column. Some Excel imports include
commas as thousand separators; this replaces any
commas with char(0), ”, before trying to convert
to numeric.
Similarly, if "%" is found as the last character in any field, drop the percent sign and divide the resulting numeric conversion by 100 to convert to proportion.
Also, some character data includes footnote references following the year.
For example Table F-1 from the US Census Bureau
needs all three of these numeric conversion features:
It needs orderBy
, because the most recent
year appears first, just the opposite of most other
data sets where the most recent year appears last.
It has footnote references following a character
string indicating the year. And it includes commas
as thousand separators.
1 2 3 4 5 6 7 |
x |
For |
keep |
something to indicate which columns to keep, in
addition to columns specified in |
orderBy |
Which columns to order the rows of |
ignore |
vector identifying columns of |
factors |
vector indicating columns of |
Dates |
vector indicating columns of |
POSIX |
vector indicating columns of |
format. |
Character vector of length 1 to pass
as argument For |
MSdates |
The names or numbers identifying columns of x
identifying dates as integer numbers of days
since 1899-12-31. In
Microsoft Excel, dates are stored in that
format. Tests on 2016-08-10 revealed that
|
leadingChar |
A regular expression passed to
|
suppressChar |
a regular expression passed to
|
pctChar |
A regular expression passed to
|
For asNumericChar:
1. Replace commas by nothing
2. strsplit on ' ' and take only the first part, thereby eliminating the footnote references.
3. Replace any blanks with NAs
4. as.numeric
for asNumericDF:
1. Copy x
to X
.
2. Confirm that ignore
, factors
,
Dates
, and POSIX
all refer to columns
of x
and do not overlap. [*** NOTE: as of
2016-07-21, these checks have only been implemented
for ignore
.]
3. Convert factors
, Dates
, and
POSIX
.
4. Apply asNumericChar
to all columns
not in ignore
, factors
, Dates
,
or POSIX
.
5. Keep columns specified by keep
.
6. return the result.
a data.frame
Spencer Graves
"Add (sum) or subtract dates; Applies To: Excel 2013". Microsoft. (accessed 2016-08-11)
scan
gsub
Quotes
stripBlanks
as.numeric
, factor
,
as.Date
, as.POSIXct
read.xlsx
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 | ##
## 1. an example
##
xDate <- as.Date('1970-01-01')+c(0, 365)
xPOSIX <- as.POSIXct(xDate)+c(1, 99)
xMSdate <- as.Date(1, as.Date('1899-12-31'))+1:2
fakeF1 <- data.frame(yr=c('1948', '1947 (1)'),
q1=c(' 1,234 ', ''), duh=rep(NA, 2),
dol=c('$1,234', ''), pct=c('1%', '2%'),
xDate=as.character(xDate, format='%m-%d-%Y'),
xPOSIX=as.character(xPOSIX,
format='%m-%d-%Y %H:%M:%S'),
xMSdate=2:3, junk=c('this is','junk'))
# This converts the last 3 columns to NAs and drops them:
str(nF1.1 <- asNumericChar(fakeF1$yr))
str(nF1.2 <- asNumericChar(fakeF1$q1))
str(nF1.3 <- asNumericChar(fakeF1$duh))
nF1 <- asNumericDF(fakeF1)
nF2 <- asNumericDF(fakeF1, Dates=6, MSdate='xMSdate',
ignore=c('junk', 'xPOSIX'), format.='%m-%d-%Y')
nF3 <- asNumericDF(nF2, POSIX='xPOSIX',
ignore=c(5,7:8), format.='%m-%d-%Y %H:%M:%S')
# check
nF1. <- data.frame(yr=asNumericChar(fakeF1$yr),
q1=asNumericChar(fakeF1$q1),
dol=asNumericChar(fakeF1$dol),
pct=c(.01, .02),
xMSdate=2:3)
nF1c <- data.frame(yr=1948:1947, q1=c(1234, NA),
dol=c(1234, NA), pct=c(.01, .02),
xMSdate=2:3)
all.equal(nF1, nF1.)
all.equal(nF1., nF1c)
nF3c <- data.frame(yr=1948:1947, q1=c(1234, NA),
dol=c(1234, NA), pct=c(.01, .02),
xDate=xDate, xPOSIX=xPOSIX,
xMSdate=xMSdate, junk=fakeF1$junk)
all.equal(nF3, nF3c)
##
## 2. orderBy=1:2
##
nF. <- asNumericDF(fakeF1, orderBy=1:2)
all.equal(nF., nF1c[2:1,])
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.