* `clean_variable_spelling()` and `clean_spelling()` have been migrated over
  to the {matchmaker} package, and arguments from the aforementioned functions
  are passed to the {matchmaker} functions. Tests and documentation have been
  updated to reflect this.
* `clean_variable_spelling()` and `clean_spelling()` gain the option to specify
  which columns contain the keys (`from`) and values (`to`). These default to
  1 and 2, respectively, which ensures that backwards compatibility is retained
  (this fixes #99).
* `linelist_example()` is a new function that serves as an alias for
  `system.file("extdata", thing, package = "linelist")`, which is much easier
  for new R users to understand.
* `top_values()`
  no longer throws a spurious warning when the levels in the
  subset data are identical to the levels in the full data (#96).
* `top_values()` gains a new `subset` argument that allows the user to retain
  the top levels of a subset of a vector. This is particularly useful for
  retrospective analysis based on current trends (fixes #92 via #94 and #95,
  @thibautjombart).
* `top_values()` gains the explicit ties.method parameter, which defaults to
  "first" to fix issue #88 (thanks to @cwhittaker1000 for spotting the issue
  and providing a detailed explanation).
* `top_values()` issues a warning if one of the top values had a tied value
  that was not included.
* `top_values()` issues a warning if the user uses a ties.method that is not
  guaranteed to return exactly n top values.
* `clean_spelling()`
  gains the `anchor_regex` argument, which will wrap all
  regex keyword entries in "^" and "$" before processing.
* The linelist class and all associated epivars/dictionary functions have been
  removed as out of scope of this package. Without any validation, these
  functions were no more than a fancy wrapper to `dplyr::rename()`, thus they
  are being removed after fda9e18b02f5853cd311ddcc513c427244b21dd7. If the
  linelist class is resurrected (e.g. to implement a hxl validator package),
  it can be taken from that commit. This is related to #29.
* `clean_spelling()` gains the `.regex` keyword that allows the user to
  supply Perl-style regular expressions to change words that may have similar
  spelling.
* `guess_dates()` now processes at double the speed of the previous version.
* `guess_dates()` will now properly constrain date vectors to the start and end
  dates.
* `guess_dates()` correctly parses dates represented as integers from Excel
  (#73).
* `print.data_comparison()` now sets `diff_only = TRUE` by default (#71).
* `compare_data()` gains the option `columns`, which allows users to
  choose which columns they want to compare. Defaults to `TRUE`, which
  compares all columns (#58).
* `guess_dates()`
  can now handle dates that were imported from Excel as
  integers (#66).
* `guess_dates()` gains the argument "modern_excel" to indicate how integer
  dates should be formatted.
* `getOption("linelist_guess_orders")` replaces the explicit list of orders in
  `guess_dates()` for easier access.
* `guess_dates()` no longer throws an error if passed a date class object (#65).
* `guess_dates()` has been better documented to reflect the above changes (#64).
* `clean_spelling()`
  gains a new keyword: `.na` (or should I say "valueword").
  When this keyword is in the values (second) column of the wordlist, the keys
  will be replaced with a missing (`<NA>`) value. This is useful for contrasting
  between presence of an absence and an absence of a presence with the `.missing`
  keyword. See #55 and #57 for details.
* `print.data_comparison()` gains the logical arguments `common_values` and
  `diff_only` to control the length of print output (see #61).
* `compare_data()` now correctly accounts for different values in variables.
  Thanks to @ffinger for finding the bug (#56).
* `compare_data()` now returns a list of variable classes instead of TRUE if the
  classes match (see #53 for details).
* `clean_variable_spelling()`
  will now run global variables before processing
  named variables instead of in tandem. This allows the user to define
  misspellings in the `.global` variable.
  See https://github.com/reconhub/linelist/issues/51 for details.
* `clean_spelling()` will no longer throw a warning if there is no value for
  .default to replace.
* `clean_variable_spelling()`, `clean_variables()`, and `clean_data()` gain the
  `warn` and `warn_spelling` arguments, which will capture all errors and
  warnings issued from `clean_spelling()` for each variable
  (see https://github.com/reconhub/linelist/pull/48 for details).
* `compare_data()` allows users to compare structural changes to data frames.
  This includes names, classes, dimensions, and values in matching categorical
  variables (see https://github.com/reconhub/linelist/pull/50 for details).
* `top_values()`
  will mask all but the top `n` values in a factor.
* The `crayon` package is added to imports.
* `clean_spelling()` wordlists now allow the optional `.missing` keyword to
  replace both `NA` and blank ("") cells in the data. Values that are `NA` will
  be converted to "NA" (character) with a warning.
  See https://github.com/reconhub/linelist/pull/44 and
  https://github.com/reconhub/linelist/pull/45 for details.
* `guess_dates()` can once again parse date formats that are file names:
  `example_format_2019-02-19.xlsx` (see #43 for details).
* `clean_spelling()`
  gains a `quiet` argument to suppress warnings.
* `clean_variable_spelling()` will no longer error if there are variable
  specifications that don't exist in the data. It will also suppress all
  warnings from `clean_spelling()` (see #41 for details).
* `clean_spelling()` will check the spelling of a vector against a wordlist.
* `clean_variable_spelling()` will apply `clean_spelling()` to all specified
  columns in a data frame.
* `clean_variables()` wraps `clean_variable_labels()` and
  `clean_variable_spelling()`.
* `clean_data()` can now optionally check labels against a wordlist (see #38
  for details).
* `mask()` will temporarily replace column names with epivars.
* `unmask()` reverses the effect of mask.
* The `geo` epivar was replaced with `geo_lat` and `geo_lon` (see #35).
* The `lookup()` function can look up the column name corresponding to an epivar
  (see #28).
* `add_epivars()` adds epivars to the global dictionary.
* `add_description()` updates the description of one of the epivars
  (see #26).
* `template_linelist()` function (see #24).
* `get_vars()` can take multiple variables (see #15).
* `guess_dates()` now throws an appropriate error if a vector is passed instead
  of a data frame. See https://github.com/reconhub/linelist/issues/4 for details.
* `NEWS.md`
  file to track changes to the package.