Please report any bugs or suggestions at: https://github.com/christophergandrud/DataCombine/issues.
DataCombine is a set of miscellaneous tools intended to make combining data sets--especially time-series cross-section data--easier. The package is continually being developed as I turn lines of code that I frequently use into single functions. It currently includes the following functions:
CasesTable: reports cases after listwise deletion of
missing values for time-series cross-sectional data.
change: calculates the absolute, percentage, and proportion change from
a specified lag, including within groups.
CountSpell: returns a variable counting the spell number
for an observation. Works with grouped data.
dMerge: merges two data frames and reports, drops, or keeps only duplicates.
DropNA: drops rows from a data frame when they have missing (NA) values on a
given variable.
FillDown: fills in missing (NA) values with the previous non-missing value.
FillIn: fills in missing values of a variable from one data frame with the
values from another variable.
FindDups: finds duplicated values in a data frame and subsets it to either
include or not include them.
FindReplace: replaces multiple patterns found in a character string column
of a data frame.
grepl.sub: subsets a data frame if a specified pattern is found in a
InsertRow: inserts a row into a data frame. Largely
implements Ari B. Friedman's function.
MoveFront: moves variables to the front of a data frame. This can be useful
if you have a data frame with many variables and want to move a variable or
variables to the front.
NaVar: creates new variable(s) indicating if there are missing values in
shift: creates lag and lead variables, including for time-series
cross-sectional data. The shifted variable is returned as a new vector. This
function is largely based on
TszKin Julian's shift function.
slide: creates lag and lead variables, including for time-series
cross-sectional data. The slid variable is added to the original data frame.
This expands the capabilities of
slideMA: creates a moving average for a period before or after each time
point for a given variable.
SpreadDummy: spreads a dummy variable (1's and 0's) over a specified time
period and for specified groups.
StartEnd: finds the starting and ending time points of a spell, including
for time-series cross-sectional data.
rmExcept: removes all objects from a workspace except those specified by the user.
TimeExpand: expands a data set so that it includes an observation for each
time point in a sequence. Works with grouped data.
TimeFill: creates a continuous
Dummy data frame from a data
VarDrop: drops one or more variables from a data frame.
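As a rough illustration of how a couple of these functions fit together, the sketch below builds a small made-up panel, lags a variable within groups using slide, and then drops the rows left incomplete by the lag using DropNA. The data frame and its column names are invented for the example, and the argument names used here (Var, TimeVar, GroupVar, NewVar, slideBy) should be checked against ?slide and ?DropNA in your installed version.

```r
library(DataCombine)

# Hypothetical time-series cross-sectional data: two countries, three years each
panel <- data.frame(
  country = rep(c("A", "B"), each = 3),
  year    = rep(2001:2003, times = 2),
  gdp     = c(10, 11, 12, 20, 22, 24)
)

# Lag gdp by one time point within each country; a negative slideBy
# requests a lag, a positive value a lead
lagged <- slide(panel, Var = "gdp", TimeVar = "year", GroupVar = "country",
                NewVar = "gdp_lag1", slideBy = -1)

# The first observation of each country has no previous value, so its
# lag is NA; drop those rows
complete_cases <- DropNA(lagged, Var = "gdp_lag1")

print(complete_cases)
```

Because slide returns the whole data frame with the new column attached, its result can be passed straight to other functions in the package, such as DropNA here.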
I will continue to add to the package as I build data sets and run across other pesky tasks I do repeatedly that would be simpler if they were completed by a single function.
DataCombine is on CRAN.
You can also install the most recent stable version with
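A minimal sketch of the two installation routes follows. The install.packages() call is the standard way to install any CRAN package. The GitHub route is an assumption: it supposes the devtools package is used and takes the repository path from the issue tracker URL given at the top of this document.

```r
# Released version from CRAN
install.packages("DataCombine")

# Assumption: GitHub install via devtools, with the repository path
# inferred from the issue tracker URL above
# install.packages("devtools")  # uncomment if devtools is not yet installed
devtools::install_github("christophergandrud/DataCombine")
```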