formatDBFs: convert DBF files and data.frames to data.tables and FST...

View source: R/AmerAssocIndividInvestorsAAII.R

Description

Removes unneeded columns, removes duplicated data, and changes column data types.

Usage

formatDBFs(
  From = paste0("C:/DATA/AAIISIPRO/MONTHDATE", "/", 18565),
  FromFiles = c("SETUP.DBF", "SI_CI.DBF", "SI_EXCHG.DBF", "SI_SP.DBF", "SI_PTYP.DBF",
    "SI_DATE.DBF", "SI_UTYP.DBF", "SI_TRBCS.DBF", "SI_MGDSC.DBF", "SI_DOW.DBF",
    "SI_PSD.DBF", "SI_PSDC.DBF", "SI_PSDD.DBF", "SI_ISQ.DBF", "SI_BSQ.DBF", "SI_CFQ.DBF",
    "SI_ISA.DBF", "SI_BSA.DBF", "SI_CFA.DBF", "SI_PSDA.DBF", "SI_PSDH.DBF",
    "SI_PSDL.DBF", "SI_PSDV.DBF", "SI_AVG.DBF", "SI_EE.DBF", "SI_GR.DBF", "SI_VAL.DBF",
    "SI_MGAVG.DBF", "SI_MGAV2.DBF", "SI_PERC.DBF", "SI_RAT.DBF", "SI_MLT.DBF"),
  To = From,
  PrependColFile = "SETUP.DBF",
  PrefixCols = c(SI_CI = "LASTMOD", SI_MGDSC = "MG_CODE", SI_MGDSC = "MG_DESC",
    SI_TRBCS = "MG_CODE", SI_TRBCS = "MG_DESC", SI_PTYP = "TYPE_CODE",
    SI_PTYP = "TYPE_DESCR", SI_UTYP = "TYPE_CODE", SI_UTYP = "TYPE_DESCR",
    SI_MLT = "PE", SI_MGAVG = "*", SI_MGAV2 = "*"),
  RemoveCols = c("^X.*$", "X_NullFlags", "REPNO", "(?<!CI_)LASTMOD", "UPDATED"),
  RemoveDupsColFileExceptions = c(""),
  RemoveDupsColValues = c("COMPANY_ID"),
  ChangeType = list(Date = c("^.*DATE$", "^PRICED.*$", "^.*DT$", "^.*LASTMOD$",
    "^PEREND_.*$", "^DATE_EY0$", "^DATE_EQ0$"), logical = c("^ADR$", "^OPTIONABLE$",
    "^DRP_AVAIL$", "^UPDATED$"), integer = c("^EMPLOYEES$", "^PERLEN_.*$", "^SHRINSTN$"),
    character = c("^COMPANY_ID$", "^COMPANY$", "^TICKER$", "^EXCHANGE$", "^STREET$",
    "^CITY$", "^STATE$", "^ZIP$", "^COUNTRY$", "^PHONE$", "^WEB_ADDR$", "^BUSINESS$",
    "^ANALYST_FN$", "IND_2_DIG", "^IND_3_DIG$", "^SIC$", "^SP$", "^DOW$", "^EXCHG_CODE$",
    "^EXCHG_DESC$", "^.*MG_CODE$", "^.*MG_DESC$", "^PERTYP_.*$", "^UPDTYP_.*$",
    "^SP_CODE$", "^SP_DESC$", "^.*TYPE_CODE$", "^.*TYPE_DESCR$", "^TYPE_SHORT$",
    "^DOW_CODE$", "^DOW_DESC$", "^FIELD_NAME"))
)

Arguments

From

Directory containing the DBF files. (Passing an R list of data.frames instead is NOT IMPLEMENTED.)

FromFiles

If From is a directory, a vector of the DBF files of interest.

To

If From is a directory, the directory in which to place the new files. (Returning an R list of data.frames instead is NOT IMPLEMENTED.)

PrependColFile

DBF file that contains the unique identifier column. This is the first column.

PrefixCols

Named character vector. The names are table names and the values are column names. Each listed column is renamed with a prefix made of the characters that follow the "_" in its table name. A value of "*" means "all columns".
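The renaming rule can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper prefix_cols and the sample data.frames are hypothetical, and joining prefix and column name with "_" is an assumption inferred from the RemoveCols default "(?<!CI_)LASTMOD". Whether "*" also renames the identifier column is not stated in this page; the sketch renames every column.

```r
# Hypothetical helper illustrating the PrefixCols rule; not part of the package.
prefix_cols <- function(df, table, PrefixCols) {
  # The prefix is whatever follows the first "_" in the table name (SI_CI -> CI).
  prefix <- sub("^[^_]*_", "", table)
  # Columns requested for this table; a table name may appear several times.
  wanted <- unname(PrefixCols[names(PrefixCols) == table])
  # "*" selects every column of the table.
  hit <- if ("*" %in% wanted) rep(TRUE, ncol(df)) else names(df) %in% wanted
  # Assumption: prefix and column name are joined with "_".
  if (any(hit)) names(df)[hit] <- paste0(prefix, "_", names(df)[hit])
  df
}

PrefixCols <- c(SI_CI = "LASTMOD", SI_MGDSC = "MG_CODE",
                SI_MGDSC = "MG_DESC", SI_MGAVG = "*")

# Made-up sample tables, for illustration only.
ci    <- data.frame(COMPANY_ID = "A1", LASTMOD = "2021-01-01")
mgavg <- data.frame(MG_CODE = "01", PE = 12.5)

ci_out    <- prefix_cols(ci, "SI_CI", PrefixCols)
mgavg_out <- prefix_cols(mgavg, "SI_MGAVG", PrefixCols)
print(names(ci_out))
print(names(mgavg_out))
```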

RemoveCols

Vector of regular expressions (perl = TRUE) matching the columns to remove.
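As a rough illustration of how such patterns select columns, the sketch below applies the default RemoveCols values to a made-up set of column names using base R's grepl with perl = TRUE. The column names are hypothetical, and the function may combine the patterns differently internally. Note that the lookbehind "(?<!CI_)LASTMOD" needs perl = TRUE and spares a column named CI_LASTMOD.

```r
RemoveCols <- c("^X.*$", "X_NullFlags", "REPNO", "(?<!CI_)LASTMOD", "UPDATED")

# Made-up column names, for illustration only.
cols <- c("COMPANY_ID", "X_NullFlags", "REPNO", "LASTMOD", "CI_LASTMOD", "PRICE")

# A column is dropped when at least one pattern matches its name.
drop <- Reduce(`|`, lapply(RemoveCols, grepl, x = cols, perl = TRUE))
kept <- cols[!drop]
print(kept)
```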

RemoveDupsColFileExceptions

Files that are exempt from duplicate removal. See the next parameter, RemoveDupsColValues.

RemoveDupsColValues

Name of the column whose duplicated values (and any corresponding non-duplicates) are removed.
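One possible reading of this rule is that every row whose key value occurs more than once is dropped, including the first occurrence. The base R sketch below shows that reading only; it is an assumption about the intended behavior, and the sample data are made up.

```r
# Made-up sample data, for illustration only.
df <- data.frame(COMPANY_ID = c("A", "B", "B", "C"), PRICE = c(1, 2, 3, 4))

key <- df$COMPANY_ID
# Flag a row if its key repeats anywhere, scanning from both ends so the
# first occurrence is flagged as well as the later ones.
dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
dedup <- df[!dup, , drop = FALSE]
print(dedup)
```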

ChangeType

Named list of character vectors. Each name is the output data type, and the values of each vector are regular expressions identifying the columns to convert to that type. Any remaining columns not yet converted are converted to numeric.
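The conversion rule can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's code: the converters lookup table, the shortened ChangeType list, and the sample data.frame (including the column name PRICE_DATE and the text form of its values) are all hypothetical.

```r
# Shortened, hypothetical ChangeType list in the same shape as the default.
ChangeType <- list(
  Date      = c("^.*DATE$"),
  logical   = c("^ADR$"),
  integer   = c("^EMPLOYEES$"),
  character = c("^COMPANY_ID$", "^TICKER$")
)

# Made-up input in which every column arrives as text.
df <- data.frame(COMPANY_ID = "A1", PRICE_DATE = "2021-06-30", ADR = "TRUE",
                 EMPLOYEES = "120", PRICE = "10.5", stringsAsFactors = FALSE)

# Assumed mapping from type name to a base R conversion function.
converters <- list(Date = as.Date, logical = as.logical,
                   integer = as.integer, character = as.character)

done <- character(0)
for (type in names(ChangeType)) {
  for (pattern in ChangeType[[type]]) {
    # Columns matching this pattern that have not been converted yet.
    hits <- setdiff(grep(pattern, names(df), perl = TRUE, value = TRUE), done)
    for (col in hits) df[[col]] <- converters[[type]](df[[col]])
    done <- c(done, hits)
  }
}
# Everything left over becomes numeric.
for (col in setdiff(names(df), done)) df[[col]] <- as.numeric(df[[col]])

str(df)
```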

Value

If "From" is a directory, the new files are placed on disk. Alternatively, if "From" is an R list, a list of modified data.tables is returned.

Examples

## Not run: 
formatDBFs()
formatDBFs(paste0("C:\\DATA\\AAIISIPRO\\MONTHDATE","\\", 18627))

## End(Not run)

AndreMikulec/econModel documentation built on June 30, 2021, 9:48 a.m.