DescTools-package: Tools for Descriptive Statistics and Exploratory Data...

Description Details Warning MS-Office Author(s) Examples

Description

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package contains furthermore functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel.
Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here, was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'CamelStyle' was consequently applied to functions borrowed from contributed R packages as well.

Feedback, feature requests, bugreports and other suggestions are welcome! Please report problems to Stack Overflow mentioning DescTools or directly to the maintainer.
(We're approaching version 1.0, so take the opportunity...)

Details

A grouped list of the functions:

Operators, calculus, transformations:
%()% Between operators determine if a value lies within a range [a,b]
%)(% Outside operators: %)(%, %](%, %)[%, %][%
%nin% "not in" operator
%overlaps% Do two collections have common elements?
%like%, %like any% Simple operator to search for a specified pattern
Interval Calculate the number of days of the overlapping part
of two date periods
AUC Calculate area under the curve
Primes Find all primes less than n
Factorize Prime factorization of integers
GCD Calculate the greatest common divisor
LCM Calculate the least common multiple
Permn Determine all possible permutations of a set
Fibonacci Generates single Fibonacci numbers or a Fibonacci sequence
DigitSum Digit sum of a number
Frac Return the fractional part of a numeric value
Ndec Count decimal places of a number
BoxCox, BoxCoxInv Box Cox transformation and its inverse transformation
BoxCoxLambda Return the optimal lambda for a BoxCox transformation
LogSt, LogStInv Calculate started logarithmic transformation and it's inverse
Logit, LogitInv Generalized logit and inverse logit function
LinScale Simple linear scaling of a vector x
Winsorize Data cleaning by winsorization
Trim Trim data by omitting outlying observations
CutQ Cut a numeric variable into quartiles or other quantiles
Recode Recode a factor with altered levels
Rename Change name(s) of a named object
Sort Sort extension for matrices and data.frames
SortMixed, OrderMixed Mixed sort order
DenseRank Calculate ranks in consecutive order (no ties)
RoundTo Round to a multiple
Large, Small Returns the kth largest, resp. smallest values
HighLow Combines Large and Small.
Rev Reverses the order of rows and/or columns of a matrix or a data.frame
Untable Recreates original list based on a n-dimensional frequency table
CollapseTable Collapse some rows/columns in a table.
Dummy Generate dummy codes for a factor
FisherZ, FisherZInv Fisher's z-transformation and its inverse
Midx Calculate sequentially the midpoints of the elements of a vector
UnitConv Return the most common unit conversions
Vigenere Implements a Vigenere cypher, both encryption and decryption
Information and manipulation functions:
AllDuplicated Find all values involved in ties
Closest Return the value in a vector being closest to a given one
Coalesce Return the first value in a vector not being NA
ZeroIfNA Replace NAs by 0
Impute Replace NAs by the median or another value
CombN Returns the number of subsets out of a list of elements
CombSet Generates all possible subsets out of a list of elements
CombPairs Generates all pairs out of one or two sets of elements
IsWhole Is x a whole number?
IsDichotomous Check if x contains exactly 2 values
IsOdd Is x even or odd?
IsPrime Is x a prime number?
IsZero Is numeric(x) == 0, say x < machine.eps?
Label Get or set the label attribute of an object
Abind Bind matrices to n-dimensional arrays
VecRot Shift the elements of a vector in a circular mode to the right
or to the left by n characters.
Clockwise Transform angles from counter clock into clockwise mode
split.formula A formula interface for the base function split
reorder.factor Reorder the levels of a factor
LOCF Imputation of datapoints following the "last observation
carried forward" rule
Lookup Simple lookup if merge seems cumbersome
ToLong, ToWide Simple reshaping of a vector
String functions:
StrTrim Delete white spaces from a string
StrTrunc Truncate string on a given length and add ellipses if it really
was truncated
StrAbbr Abbreviates a string
StrCap Capitalize the first letter of a string
StrPad Fill a string with defined characters to fit a given length
StrDist Compute Levenshtein or Hamming distance between strings
StrRev Reverse a string
StrCountW Count the words in a string
StrChop Split a string by a fixed number of characters.
StrVal Extract numeric values from a string
StrPos Find position of first occurrence of a string in another one
StrIsNumeric Check whether a string does only contain numeric data
FixToTab Create table out of a running text, by using columns of spaces as delimiter
Conversion functions:
AscToChar, CharToAsc Converts ASCII codes to characters and vice versa
DecToBin, BinToDec Converts numbers from binmode to decimal and vice versa
DecToHex, HexToDec Converts numbers from hexmode to decimal and vice versa
DecToOct, OctToDec Converts numbers from octmode to decimal and vice versa
DegToRad, RadToDeg Convert degrees to radians and vice versa
CartToPol, PolToCart Transform cartesian to polar coordinates and vice versa
CartToSph, SphToCart Transform cartesian to spherical coordinates and vice versa
Colors:
SetAlpha Add transperancy (alpha channel) to a color.
ChooseColorDlg Display the system's color dialog to choose a color
ColPicker Display R colors in a dialog
ColorLegend Add a color legend to a plot
ColToGray, ColToGrey Convert colors to gcrey/grayscale
ColToHex, HexToCol Convert a color into hex string
HexToRgb
ColToHsv R color to HSV conversion
ColToRgb, RgbToCol Color to RGB conversion and back
RgbToLong Convert a rgb color to a long number
FindColor Get color on a defined color range
MixColor Get the mix of two colors
TextContrastColor Choose textcolor depending on background color
Pal Some custom color palettes
Plots:
Canvas Canvas for geometric plotting
Mar Set margins more comfortably.
lines.loess Add a loess smoother and its CIs to an existing plot
lines.lm Add the prediction of linear model and its CIs to a plot
lines.smooth.spline Add the prediction of a smooth.spline and its CIs to a plot
ErrBars Add horizontal or vertical error bars to an existing plot
DrawArc, DrawRegPolygon Draw elliptic, circular arc(s) or regular polygon(s)
DrawCircle, DrawEllipse Draw a circle, a circle annulus or a sector or an annulus
DrawBezier Draw a Bezier curve
DrawBand Draw confidence band
BoxedText Add text surrounded by a box to a plot
Rotate Rotate a geometric structure
SpreadOut Spread out a vector of numbers so that there is a minimum
interval between any two elements. This can be used
to place textlabels in a plot so that they do not overlap.
IdentifyA Helps identifying all the points in a specific area.
identify.formula Formula interface for identify.
PtInPoly Identify all the points within a polygon.
ConnLines Calculate and insert connecting lines in a barplot
AxisBreak Place a break mark on an axis
PlotACF Create a combined plot of a time series including its
autocorrelation and partial autocorrelation
PlotMonth Plot seasonal effects of a univariate time series
PlotArea Create an area plot
PlotBag Create a two-dimensional boxplot
PlotBubble Draw a bubble plot
PlotCandlestick Plot candlestick chart
PlotCirc Create a circular plot
PlotCorr Plot a correlation matrix
PlotDot Plot a dotchart with confidence intervals
PlotFaces Produce a plot of Chernoff faces
PlotFdist Frequency distribution plot, combination of histogram,
boxplot and ecdf.plot
PlotMarDens Scatterplot with marginal densities
PlotMultiDens Plot multiple density curves
PlotPolar Plot values on a circular grid
PlotFun Plot mathematical expression or a function
PolarGrid Plot a grid in polar coordinates
PlotPyramid Pyramid plot (back-back histogram)
PlotTreemap Plot of a treemap.
PlotVenn Plot a Venn diagram
PlotViolin Plot violins instead of boxplots
PlotQQ QQ-plot for an optional distribution
PlotWeb Create a web plot
PlotTernary Create a triangle or ternary plot
PlotMiss Plot missing values
Distributions:
pBenf Benford distribution, including qBenf, dBenf, rBenf
pRevGumbel Reverse Gumbel distribution, including qRevGumbel,
dRevGumbel, rRevGumbel
qRevGumbelExp Expontial reverse Gumbel distribution (quantile only)
Statistics:
Freq Univariate frequency table
PercTable Bivariate percentage table
Margins (Extended) margin tables of a table
ExpFreq Expected frequencies of a n-dimensional table
Mode Mode, the most frequent value
Gmean, Gsd Geometric mean and geometric standard deviation
Hmean Harmonic Mean
Median Extended median function supporting weights and ordered factors
HuberM, TukeyBiweight Huber M-estimator of location and Tukey's biweight robust mean
HodgesLehmann the Hodges-Lehmann estimator
HoeffD Hoeffding's D statistic
MeanSE Standard error of mean
MeanCI, MedianCI Confidence interval for the mean and median
MeanDiffCI Confidence interval for the difference of two means
MoveAvg Moving average
MeanAD Mean absolute deviation
VarCI Confidence interval for the variance
CoefVar Coefficient of variation and its confidence interval
RobScale Robust data standardization
Range (Robust) range
BinomCI, MultinomCI Confidence intervals for binomial and multinomial proportions
BinomDiffCI Calculate confidence interval for a risk difference
BinomRatioCI Calculate confidence interval for the ratio of binomial proportions.
PoissonCI Confidence interval for a Poisson lambda
Skew, Kurt Skewness and kurtosis
YuleQ, YuleY Yule's Q and Yule's Y
TschuprowT Tschuprow's T
Phi, ContCoef, CramerV Phi, Pearson's Contingency Coefficient and Cramer's V
GoodmanKruskalGamma Goodman Kruskal's gamma
KendallTauA Kendall's tau-a
KendallTauB Kendall's tau-b
StuartTauC Stuart's tau-c
SomersDelta Somers' delta
Lambda Goodman Kruskal's lambda
GoodmanKruskalTau Goodman Kruskal's tau
UncertCoef Uncertainty coefficient
Entropy, MutInf Shannon's entropy, mutual information
DivCoef, DivCoefMax Rao's diversity coefficient ("quadratic entropy")
TheilU Theil's U1 and U2 coefficient
Assocs Combines the association measures above.
OddsRatio, RelRisk Odds ratio and relative risk
CohenKappa, KappaM Cohen's Kappa, weighted Kappa and Kappa for
more than 2 raters
CronbachAlpha Cronbach's alpha
ICC Intraclass correlations
KrippAlpha Return Kripp's alpha coefficient
KendallW Compute the Kendall coefficient of concordance
Lc Calculate and plot Lorenz curve
Gini, Atkinson Gini- and Atkinson coefficient
Herfindahl, Rosenbluth Herfindahl- and Rosenbluth coefficient
GiniSimpson Compute Gini-Simpson Coefficient
CorCI Confidence interval for Pearson's correlation coefficient
CorPart Find the correlations for a set x of variables with set y removed
CorPolychor Polychoric correlation coefficient
SpearmanRho Spearman rank correlation and its confidence intervals
ConDisPairs Return concordant and discordant pairs of two vectors
FindCorr Determine highly correlated variables
CohenD Cohen's Effect Size
EtaSq Effect size calculations for ANOVAs
Contrasts Generate pairwise contrasts for using in a post-hoc test
Strata Stratified sampling with equal/unequal probabilities
Outlier Outliers following Tukey's boxplot definition
LOF Local Outlier Factor
Tests:
SignTest Signtest
ZTest Z--test for known population variance
JonckheereTerpstraTest Jonckheere-Terpstra trend test for medians
PageTest Page test for ordered alternatives
CochranQTest Cochran's Q-test
SiegelTukeyTest Siegel-Tukey test for equality in variability
SiegelTukeyRank Calculate Siegel-Tukey's ranks (auxiliary function)
LeveneTest Levene's test for homogeneity of variance
MosesTest Moses Test of extreme reactions
RunsTest Runs test for randomness
DurbinWatsonTest Durbin-Watson test for autocorrelation
BartelsRankTest Bartels rank test for randomness
JarqueBeraTest Jarque-Bera Test for normality
AndersonDarlingTest Anderson-Darling test for normality
CramerVonMisesTest Cramer-von Mises test for normality
LillieTest Lilliefors (Kolmogorov-Smirnov) test for normality
PearsonTest Pearson chi-square test for normality
ShapiroFranciaTest Shapiro-Francia test for normality
MHChisqTest Mantel-Haenszel Chisquare test
StuartMaxwellTest Stuart-Maxwell marginal homogeneity test
LehmacherTest Lehmacher marginal homogeneity test
CochranArmitageTest Cochran-Armitage test for trend in binomial proportions
BreslowDayTest, WoolfTest Test for homogeneity on 2x2xk tables over strata
PostHocTest Post hoc tests by Scheffe, LSD, Tukey for a aov-object
ScheffeTest Multiple comparisons Scheffe test
DunnTest Dunn's test of multiple comparisons
DunnettTest Dunnett's test of multiple comparisons
HotellingsT2Test Hotelling's T2 test for the one and two sample case.
YuenTTest Yuen's robust t-Test with trimmed means and winsorized variances
BarnardTest Barnard's test for 2x2 tables
Date functions:
day.name, day.abb Defined names of the days
AddMonths, AddMonthsYM Add a number of months to a given date
IsDate Check whether x is a date object
IsWeekend Check whether x falls on a weekend
IsLeapYear Check whether x is a leap year
LastDayOfMonth Return the last day of the month of the date x
DiffDays360 Calculate the difference of two dates using the 360-days system
Date Create a date from numeric representation of year, month, day
Day, Month, Year Extract part of a date
Hour, Minute, Second Extract part of time
Week, Weekday Returns ISO week and weekday of a date
Quarter Quarter of a date
YearDay, YearMonth The day in the year of a date
Now, Today Get current date or date-time
HmsToSec, SecToHms Convert h:m:s times to seconds and vice versa
Zodiac The zodiac sign of a date :-)
Finance functions:
OPR One period returns (simple and log returns)
NPV Net present value
IRR Internal rate of return
GUI-Helpers:
ChooseColorDlg Display color dialog to choose a color
FileOpenCmd Get path of a data file to be opened
SaveAsDlg Save a data object by dialog
SelectVarDlg Select elements of a set by click
PasswordDlg Display a dialog containing an edit field, showing only ***.
PlotPar Display the R plot parameters in a dialog
Xplore A breeze of interactive plotting
Reporting, InOut:
CatTable Print a table with the option to have controlled linebreaks
Format Easy format for numbers and dates
Desc Produce a rich description of an object
GetNewWrd, GetNewXL, GetNewPP Create a new Word, Excel or PowerPoint Instance
GetCurrWrd, GetCurrXL, GetCurrPP Get a handle to a running Word, Excel or PowerPoint instance
IsValidWrd Check if the handle to a Word instance is valid or outdated
WrdCaption Insert a title in Word
WrdPlot Insert the active plot to Word
WrdFont, WrdFont<- Get, resp. set the font on the current cursorposition in Word
WrdTable Create a table in Word
ToWrd Mord flexible wrapper to send diverse objects to Word
WrdInsertBookmark Insert a new bookmark in a Word document
WrdGoto Place cursor to a specific bookmark, or another text position.
WrdUpdateBookmark Update the text of a bookmark's range
XLGetRange Get the values of one or several cell range(s) in Excel
XLGetWorkbook Get the values of all sheets of an Excel workbook
XLView Use Excel as viewer for a data.frame
PpPlot Insert active plot to PowerPoint
PpAddSlide Adds a slide to a PowerPoint presentation
PpText Adds a textbox with text to a PP-presentation
ParseSASDatalines Parse a SAS "datalines" statement to read data
Tools:
PairApply Helper for calculating functions pairwise
LsFct, LsObj List the functions (or the data, all objects) of a package
FctArgs Retrieve the arguments of a functions
InDots Check if an argument is contained in ... argument and return it's value
ParseFormula Parse a formula and return the splitted parts of if
Recycle Recycle a list of elements to the maximal found dimension
Keywords Get the keywords of a man page
SysInfo Get some more information about system and environment
DescToolsOptions Get the DescTools specific options
PlotBagPairs Return
PlotGACF Return
PlotMatrix Return
SampleTwins Return

Warning

This package is still under development. Although the code seems meanwhile quite stable, until release of version 1.0 (which is expected in, hmm - there's still so much room for improvement - spring 2017) you should be aware that everything in the package might be subject to change. Backward compatibility is not yet guaranteed. Functions may be deleted or renamed and new syntax may be inconsistent with earlier versions. By release of version 1.0 the "deprecated-defunct process" will be installed.

MS-Office

To make use of MS-Office features you must have Office in one of its variants installed. All Wrd*, XL* and Pp* functions require as well the package RDCOMClient to be installed. Hence the use of these functions is restricted to Windows systems. RDCOMClient is available at:
http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/3.2/ and can be installed with
install.packages("RDCOMClient", repos="http://www.stats.ox.ac.uk/pub/RWin")
RDCOMClient does not exist for Mac or Linux, sorry.

Author(s)

Andri Signorell
Helsana Versicherungen AG, Health Sciences, Zurich
HWZ University of Applied Sciences in Business Administration Zurich.

Includes R source code and/or documentation previously published by (in alphabetical order):
Ken Aho, Nanina Anderegg, Tomas Aragon, Antti Arppe, Adrian Baddeley, Kamil Barton, Ben Bolker, Frederico Caeiro, Stephane Champely, Daniel Chessel, Leanne Chhay, Clint Cummins, Michael Dewey, Harold C. Doran, Stephane Dray, Charles Dupont, Dirk Eddelbuettel, Jeff Enos, Claus Ekstrom, Martin Elff, Kamil Erguler, Richard W. Farebrother, John Fox, Romain Francois, Michael Friendly, Tal Galili, Matthias Gamer, Joseph L. Gastwirth, Yulia R. Gel, Juergen Gross, Gabor Grothendieck, Frank E. Harrell Jr, Richard Heiberger, Michael Hoehle, Christian W. Hoffmann, Torsten Hothorn, Markus Huerzeler, Wallace W. Hui, Pete Hurd, Rob J. Hyndman, Pablo J. Villacorta Iglesias, Matthias Kohl, Mikko Korpela, Max Kuhn, Detlew Labes, Friederich Leisch, Jim Lemon, Dong Li, Martin Maechler, Arni Magnusson, Daniel Malter, George Marsaglia, John Marsaglia, Alina Matei, David Meyer, Weiwen Miao, Giovanni Millo, Yongyi Min, David Mitchell, Markus Naepflin, Daniel Navarro, Henric Nilsson, Klaus Nordhausen, Derek Ogle, Hong Ooi, Nick Parsons, Sandrine Pavoine, Tony Plate, Roland Rapold, William Revelle, Tyler Rinker, Brian D. Ripley, Caroline Rodriguez, Nathan Russell, Venkatraman E. Seshan, Greg Snow, Michael Smithson, Werner A. Stahel, Mark Stevenson, Terry Therneau, Yves Tille, Adrian Trapletti, Kevin Ushey, Jeremy VanDerWal, Bill Venables, John Verzani, Gregory R. Warnes, Stefan Wellek, Hadley Wickham, Rand R. Wilcox, Peter Wolf, Daniel Wollschlaeger, Thomas Yee, Achim Zeileis

Thank you all!

Special thanks go to Beat Bruengger, Mathias Frueh, Daniel Wollschlaeger for their valuable contributions and testing.

The good things come from all these guys, any problems are likely due to my tweaking.

Maintainer: Andri Signorell <[email protected]>

Examples

1
2
3
4
5
6
7
# ******************************************************
# There are no examples defined here. But see the demos:
#
# demo(describe)
# demo(plots))
#
# ******************************************************

DescTools documentation built on Sept. 13, 2017, 9:04 a.m.