Read a text (e.g., csv) file and find rows with more than 3 separator
characters. Parse the initial contiguous block of those rows into a matrix.
The initial application for this function is to read Table 6.16. Income and employment by industry in the National Income and Product Account tables published by the Bureau of Economic Analysis of the United States Department of Commerce.
the name of a file from which the data are to be read.
Logical: Is the second column of the identified data matrix to be interpreted as variable names?
The field separator character.
Character string(s) to be interpreted as NA.
optional arguments for
1. txt <- readLines(file)
2. Split into fields.
3. Identify headers, Data, footers.
4. Recombine the second component of each Data row if necessary so all have the same number of fields.
5. Extract variable names.
6. Coerce to numeric if that produces no NAs.
7. Return the transpose.
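The steps above can be sketched in base R. This is a simplified illustration, not the function's actual source: the helper name sketch_read_transpose and the sample data are hypothetical, and the recombining in step 4 is replaced by an assumption that every data row already has the same number of fields.

```r
# Hypothetical helper: a simplified sketch of the steps listed above.
sketch_read_transpose <- function(file, sep = ",") {
  txt <- readLines(file)                                   # step 1
  # Count separators per line to find the data rows (more than 3).
  n_sep <- lengths(regmatches(txt, gregexpr(sep, txt, fixed = TRUE)))
  is_data <- n_sep > 3
  stopifnot(any(is_data))
  # Step 3: the initial contiguous block of data rows.
  first <- which(is_data)[1]
  later_gaps <- which(!is_data & seq_along(txt) > first)
  last <- if (length(later_gaps) > 0) later_gaps[1] - 1 else length(txt)
  # Step 2: split the data rows into fields.
  rows <- strsplit(txt[first:last], sep, fixed = TRUE)
  # Assumption: no recombining (step 4) is needed.
  stopifnot(length(unique(lengths(rows))) == 1)
  mat <- do.call(rbind, rows)
  # Steps 5 and 7: second column gives variable names; transpose the rest.
  out <- t(mat[, -(1:2), drop = FALSE])
  colnames(out) <- mat[, 2]
  attr(out, "headers") <- txt[seq_len(first - 1)]
  attr(out, "footers") <- txt[-seq_len(last)]
  out
}

# Made-up data laid out like a table with a title line and a footer.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Table 6.16 (made-up numbers)",
             "Line,Item,2001,2002,2003",
             "1,Profits,10,12,15",
             "2,Finance,3,4,5",
             "Source: illustration only"), tmp)
out <- sketch_read_transpose(tmp)
print(out)
```

Each column of the result holds one original row, so the years run down the rows rather than across the columns.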
A matrix of the transpose of the rows with the max number of fields with attributes 'headers', 'footers', 'other', and 'summary'. If this matrix can be coerced to numeric with no NAs, it will be. Otherwise, it will be left as character.
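The coercion rule described above might look like the following in base R. The helper name to_numeric_if_clean is hypothetical, and this is only one way to implement "numeric if no NAs result", not necessarily how the function does it.

```r
# Hypothetical helper: convert a character matrix to numeric only when
# every entry converts cleanly; otherwise return it unchanged.
to_numeric_if_clean <- function(x) {
  num <- suppressWarnings(as.numeric(x))
  if (anyNA(num)) return(x)        # some entry is not a number: keep character
  attributes(num) <- attributes(x) # restore dim, dimnames and other attributes
  num
}

clean <- matrix(c("1", "2", "3", "4"), nrow = 2)
mixed <- matrix(c("1", "x", "3", "4"), nrow = 2)
res_clean <- to_numeric_if_clean(clean)
res_mixed <- to_numeric_if_clean(mixed)
```

Under this rule a single non-numeric cell, such as a text label, leaves the whole matrix as character.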
Table 6.16. Income and employment by industry in the National
Income and Product Account tables published by the Bureau of
Economic Analysis of the United States Department of Commerce. To
get this table from www.bea.gov, under "U.S. Economic Accounts",
first select "Corporate Profits" under "National". Then next to
"Interactive Tables", select "National Income and Product Accounts
Tables". From there, select "Begin using the data...". Under
"Section 6 - income and employment by industry", select each of the
tables starting "Table 6.16". As of February 2013, there were 4
such tables available: Table 6.16A, 6.16B, 6.16C and 6.16D. Each
of the last three is available in annual and quarterly summaries.
The USFinanceIndustry data combine the first 4 rows of the 4
annual summary tables.