- `sift()` is removed. It is now a completely separate package, https://github.com/DesiQuintans/sift.
- `material2014_colblind`: a new dataset containing the Material Design 2014 colour palette, plus simulated colourblind conversions via `khroma:::anomalize()`.
- `chunk_int()`, which tells you how an integer should be divided into `n` chunks (e.g. chunk 1 gets rows 1-10, chunk 2 gets rows 11-20, etc.).
- `na_in_row()` produces two new output columns: `notna_in_row_count` and `notna_in_row_prop`.
- `show_colours()` has a new `n` argument to control how many rows the resulting grid will have.
- `find_dims()` takes a new argument `x` to support the above change in grid sizing.
- `show_colours()` now fills the grid by column (top to bottom, then left to right) instead of by row (left to right, then top to bottom). This makes more pleasing colour palette grids IMO.
- `eval()` from `try.seed()`.
- `drop_empty_cols()` and `drop_empty_rows()` are no longer hard-coded to consider the value `0` as something that counts as 'empty'. This means that columns and rows that are full of `0` are kept, since zeroes can be meaningful. If you want to consider `0` as empty, use the argument `regex = "^0$"` to match it.
- `drop_empty_cols()` and `drop_empty_rows()` can now report their changes with the `report` argument.
- `na_in_row()` finds the count and proportion of `NA` for each row of a dataframe, using tidyselectors to choose which columns to look at.
- `keep_every()`, which uses a string to control which elements of a vector are kept or removed. For example, `"k-"` keeps odd elements, `"-k"` keeps even elements, and `"-kk--"` keeps the 2nd and 3rd element out of every 5.
- `round_to_duration()`, which roughly converts a time duration (e.g. 37 days) into a different unit of time (e.g. 1.215606 months), with optional rounding to a nearest value (e.g. 1.25 months).
- `consecutive_week()`, which counts the number of 7-day weeks between two dates OR the number of calendar weeks starting on Mondays (AKA isoweeks) between two dates.
- `not.na()`, a more noticeable equivalent of `!is.na()`.
- `not.nan()`, a more noticeable equivalent of `!is.nan()`.
- `write_df()`, which writes a dataframe to a .csv and .rds in one shot.
- `save_a4()`, which saves a ggplot to a .png, with defaults suitable for a full A4 page in landscape.
- `percentile()`'s plots are no longer hard-coded to 0-100 %.
- `round_to_nearest()` now accepts `dir = "both"` to be more similar to other functions with a `dir` argument, whose names I can't remember right now.
- `desi_base_theme()` removes the grid lines now, as the description always said.
- `desi_base_theme()` now has a `...` argument for sending further arguments to `ggplot2::theme()`.
- `make_path()` now condenses multiple directory separators (`/`) into one, and also respects file extensions (i.e. `make_path("dir", "file", ".rds")` will output `dir/file.rds` and not `dir/file/.rds`).
- `glue` and `readr` to dependencies.
- `split_size()` to split a vector into chunks of known size (or smaller).
- `ensure()` is for quickly testing your code, returning an Error if a test evaluates to `FALSE`. For more extensive testing needs, look at the `assertr` package.
- `assign_groups()` assigns groups to elements of a vector, with good handling for duplicates.
- `split_n()` is deprecated because its output was not intuitive when it came to unsorted or duplicated input. Use `assign_groups()` instead.
- `drop_invar_cols()` now recognises all column types, including logical columns and list columns.
- `uw0()` unwraps hard-wrapped lines without introducing spaces between the lines. This is a shortcut for `uw(..., collapse = "", join = "")`.
- `rows_with_na()` only keeps rows of a dataframe that contain at least 1 `NA`.
- `show_colours()` argument `arrange`, which lets you arrange the colours as a rectangular panel (by default), or as horizontal or vertical stripes.
- `Show()` is a version of `View()` that can be used inside pipelines and Markdown documents.
- `triangle_num()` calculates the *n*th Triangle Number, which is like factorials but with addition. For example: T3 = 1 + 2 + 3 = 6.
- `IQR_outliers()` marks elements of a vector that are outliers according to the 1.5 * IQR rule.
- `encode_signif()` turns p-values into significance codes (e.g. 0.05 → `*`).
- `unique_n()` keeps the first `n` unique elements in a vector.
- `percentile()` argument `cuts` has two new default levels (0.025 and 0.9975) for easy observation of possible 95 % CI cut-offs.
- `Mode()` has new accepted values for the `break_ties` argument: `median`, `median l`, and `median r`. These return the median of all of the modes, or in the case of an even number of modes (e.g. `c(1, 2, 3, 4)`), the mode to the left or right of the median (e.g. `2` or `3`).
- `show_colors()` is now correctly exported and documented.
- `data(random_integers)` is a vector of 10,000 random integers between 0 and 100, generated by https://www.random.org. It is for specific applications like feature selection in machine learning, where adding a column of "true-random" numbers as a feature lets you draw a clear line between helpful and unhelpful features.
- `na_rm()` which wraps `stats::na.omit()` and hides its annoying printing side-effects when applied to a vector.
- `add_group_size()` adds the size of a grouped dataframe's groups (i.e. `dplyr::n()` under `dplyr::group_by()`) to the dataframe as a new column. Useful for complicated filtering operations such as "Only keep a genus if it has at least 3 species, and each species occurs in at least 4 sites within each region."
- `dots_char()` takes `...` and returns its elements as a character vector (or a string).
- `rev_sentence()` reverses the order of whole words in a string.
- `howmany_df()` fully removed.
- `round_to_places()` fully removed.
- `Mode()` with `break_ties = "no"` explicitly set now returns the correct result.
- `plot_arrange()` arranges base R plots into a grid. It's like `gridExtra::grid.arrange()`, but for base R plots.
- `nth_word()` which grabs the nth part of a string that is delimited by a regular expression.
- `try.seed()` has a `seed` argument that lets you directly set a seed there, instead of having to replace the whole function with the base `set.seed()` when you're done using it.
- `Mode()` has a new argument value, `break_ties = "NA"`. If more than one mode is found in the input, it will return `NA` instead of having to choose between them.
- `palette_distant()` which has 48 colours that are not located adjacent to each other along the RGB and HSV codings.
- `top_tail()` which retrieves first and last rows from a dataframe.
- `try.seed()` for running an expression with a new random generator seed each time. The seed is announced in the console so that when you find one that you like, you can copy it and use it in `set.seed()` in your script.
- `show_colours()` now correctly handles lists of colours shorter than 4.
- `Mode()` so that it is simpler to use and has more tie-breaking options. Old args `ties` and `mean` are deprecated and will throw warnings, but I have done my best to maintain their functionality so that this is not a breaking change.
- `howmany()` is now a generic function that includes methods for dataframes, tables, and vectors.
- `howmany_df()` is defunct. Now, just pass the dataframe to `howmany()` and it will choose the right method.
- `round_to_places()` is defunct.
- `common_stem()` for comparing several strings and returning the 'stem' (a range of characters from the first position to the first mismatch) that is common to all of them.
- `str_rev()` for mirroring every string in a vector of strings.
- `mirror_matrix()` now supports mirroring of both column order and row order.
- `mirror_matrix()` now runs 45 times faster (!).
- `join` argument added to `uw()`. Gives you control over how separate hard-wrapped lines should be joined together.
- `useNA` argument to `count_unique()` so that NAs are also shown in the table.
- `clippy()` correctly copies `.Last.value` to the clipboard if no `x` argument is provided to it.
- `howmany_df()` for quickly summarising the number of unique values in every column of a dataframe.
- `percentile()` help file documents `na.rm` argument now.
- `uw()` documentation were too long.
- `quick_lm()` for fast data exploration using plots of `x ~ y` linear models.
- `split_n()` divides a vector into groups by assigning each entry a grouping number.
- `drop_invar_cols()` drops columns whose values are all the same (for character/factor), or whose values are very close together (for numeric).
- `drop_empty_()` functions to make it clearer that `df` is being subset with the `to`/`from`/`cols` arguments.
- `percentile()` given an `na.rm` argument.
- `show_colours()` given a `pad` argument for padding out the table when the number of colours doesn't perfectly fit a square. By default this is white (because the default plot background is white).
- `uw()` ("unwrap") takes hard-wrapped strings, removes the cosmetic linebreaks and indentation, and outputs them as a single combined string.
- `round_to_places()` is deprecated. Use the base function `round()` with a `digits` argument instead (e.g. `round(n, digits = 2)`).
- `cache = TRUE` because it can sometimes cause annoying bugs.
- `percentile()` now passes `...` to `plot()` if the `plot` argument is `TRUE`.
- `percentile(plot = TRUE)` now shows values on the plotted points.
- `build_palette()` is now a public function so that you can get the functionality of the `palette_...()` family in your own colour lists.
- `spaced` arg to `palette_...()` and `build_palette()` functions. This argument lets you sample `n` colours evenly across the colour list. If you have a list of colours that are sorted by hue, for example, this helps you pick colours that are further away from each other.
- `rcols_as_hex()` converts built-in R colours to hex values (e.g. "goldenrod" → "#DAA520").
- `palette_builtin()` lets you access the list of colours provided by `colours(distinct = TRUE)`, adding features like transparency and random colour selection via the `build_palette()` framework.
- `palette_...()` functions is now bundled together with `build_palette()`.
- `plot` argument for `percentile()` creates a graphical representation of the percentile values.
- `coinflip()` which randomly returns `TRUE` or `FALSE`.
- `collapse_vec()` which lets you concatenate the elements of multiple vectors into one long string.
- `count_unique()` which counts how many times each unique element in a vector is repeated.
- `howmany()` now accepts multiple vectors with `...` instead of only one vector.
- `librarian` is also loaded.

This update is mostly about adding more plotting tools that I needed while I was making network graphs in `igraph`.
- `palette_mrmrs()` which contains 16 web-safe colours from Adam Morse.
- `palette_picked()` which contains 14 colours that I picked as high-contrast replacements from `palette_distinct()`. Nearly all of them are similar to the ones from `palette_mrmrs()`, that's pretty neat!
- `show_colours()` takes a vector of colours and shows them in a nice plot.
- `is.prime()` checks if a number is prime or not.
- `mirror_matrix()` mirrors a matrix horizontally.
- `alpha` argument to `palette_distinct()`. Applies a constant transparency to all of the colours that the function returns, which is useful for generating colours in graphs that are heavily overplotted.
- `desiderata:::find_dims()` that finds the dimensions of a grid that will fit a certain number of cells. Used internally for `show_colours()`.
- `desiderata:::build_palette()`
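
The grid-related entries above (finding dimensions that fit a number of cells, padding leftover cells, and filling by column) can be pictured with a small base R sketch. This is only an illustration under assumptions: `grid_dims_sketch()` and `colour_grid_sketch()` are hypothetical names, and the near-square sizing rule is a guess, not the code behind `desiderata:::find_dims()` or `show_colours()`.

```r
# Hypothetical helper: a near-square grid that can hold n cells.
# The sizing rule is an assumption, not the package's actual logic.
grid_dims_sketch <- function(n) {
  stopifnot(n >= 1)
  rows <- ceiling(sqrt(n))
  cols <- ceiling(n / rows)
  c(rows = rows, cols = cols)
}

# Hypothetical helper: lay colours out in that grid, padding unused cells.
colour_grid_sketch <- function(colours, pad = "white") {
  dims   <- grid_dims_sketch(length(colours))
  cells  <- dims[["rows"]] * dims[["cols"]]
  padded <- c(colours, rep(pad, cells - length(colours)))
  # matrix() fills by column by default: top to bottom, then left to right.
  matrix(padded, nrow = dims[["rows"]], ncol = dims[["cols"]])
}

grid <- colour_grid_sketch(c("red", "green", "blue", "gold", "grey"))
print(grid)
```

Five colours do not fit a square exactly, so the sketch produces a 3 x 2 grid whose last cell holds the padding colour.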
- `set.seed.any()` was deprecated for months and is now removed. Use `set_seed_any()` instead.
- `howmany()` is an alias for `length(unique(x))`.
- `shush()` no longer causes invisibly-returned output to print.
- `desi_theme_base()` doesn't remove the grid lines anymore.
- `palette_distinct()` returns the requested number of colours when `random = TRUE`.
- `se_mean()`, which calculates the standard error of the mean.
- `method` arg to `apply_to_files()` to provide more options than simple row-binding.
- `clippy()`, which copies dataframes, vectors, and the results of expressions to the system clipboard. Tested on Windows, but hopefully also works on Mac!
- `ties` arg to `Mode()`. If `ties == FALSE` and there are multiple modes (e.g. `c(2, 2, 1, 1)`), only the first mode (`2`) will be returned.
- `regex` args to `drop_empty_rows()` and `drop_empty_cols()`.
- `drop_empty_rows()`, which deletes empty rows from a dataframe. A row is empty if every cell is `NA`, `NULL`, `""`, or `0`. You can select which columns to use or omit when making this empty/not-empty decision. For example, columns containing IDs or names will probably never be empty and should be ignored.
- `collapse_df()`, which collapses every cell of a dataframe (or a subset of one) into a vector. Useful for grabbing every number in a table and plotting it on a histogram, for example.
- `drop_empty_cols()`, which deletes empty columns from a dataframe. A column is empty if every row is `NA`, `NULL`, `""`, or `0`.
- `theme_desi_base()`.
- `align_titles()` to horizontally align the title and subtitle of a ggplot.
- `overwrite_df()` no longer returns the overwritten columns as factors.
- `round_to_places()` no longer rounds numbers twice. `round_to_places(16.666667, 2)` used to return `17`, but it now correctly returns `16.67`.
- `vec_to_regex()`, which collapses vectors into a regular expression.
- `basic_colour_names`, a built-in dataset that contains the names of 197 browser-compatible web colours.
- `set_seed_any()` properly checks if the `digest` package is installed.
- `cat_wrap()`, which prints text to the console while wrapping the output.
- `percentile()`, which is an alias of `stats::quantile()` with some useful default percentiles defined.
- `set.seed.any()` deprecated; use `set_seed_any()` instead.
- `digest` from 'Imports' to 'Suggests'. It is only used once by `set_seed_any()`.
- `theme_desi_base()`.
- `rotate_x_text()` and `rotate_y_text()` for rotating the tick labels of ggplot2 plots. Comes with sane default settings.
- `%notin%` which is just `!(x %in% y)` in a more readable form.
- `%pctin%` which returns the percent of `x` that appears in `y`.
- `apply_to_files()`, which applies a function to a list of files that matched a regex search pattern. Use it to import all spreadsheets in a folder, for example. Includes recursive searching.
- `overwrite_df()`, which lets you use regex to match and replace across all cells in a dataframe. This is incredibly convenient as the final step before printing a table in your Rmarkdown document, because you can blank out NAs and other irrelevant values to avoid distracting the reader. Not intended for use with actual data tidying; use `dplyr::recode()` or similar for that.
- `NEWS.md` file to track changes to the package.
- `shush()` lets you run any expression without allowing it to print `cat()`, `print()`, `warning()` or `message()` to the console. Useful for running functions that have `cat()` hard-coded into them.
- `example()` or `R CMD CHECK`.
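
Several entries in this changelog describe small helpers in words only. The base R sketch below illustrates a few of them. Only `%notin%` (given above as `!(x %in% y)`) and the `length(unique(x))` alias come directly from the changelog. Every `*_sketch` name is hypothetical, and details such as the 0-100 scale for the percentage, dropping `NA` before computing the standard error, and treating any character other than `"k"` as "drop" are assumptions, not the package's implementation.

```r
# Stated in the changelog: %notin% is just !(x %in% y).
`%notin%` <- function(x, y) !(x %in% y)

# Assumption: "percent" means a 0-100 scale; the package may differ.
pct_in_sketch <- function(x, y) 100 * mean(x %in% y)

# Stated in the changelog: howmany() is an alias for length(unique(x)).
howmany_sketch <- function(x) length(unique(x))

# Standard error of the mean: sd / sqrt(n). Dropping NAs first is an assumption.
se_mean_sketch <- function(x) {
  x <- x[!is.na(x)]
  stats::sd(x) / sqrt(length(x))
}

# nth triangle number: 1 + 2 + ... + n, via the closed form n(n + 1) / 2.
triangle_sketch <- function(n) n * (n + 1) / 2

# 1.5 * IQR rule: flag values more than 1.5 IQRs outside the quartiles.
iqr_outliers_sketch <- function(x) {
  q <- stats::quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# keep_every-style pattern: "k" keeps, anything else drops,
# and the pattern is recycled along the vector.
keep_every_sketch <- function(x, pattern) {
  keep <- strsplit(pattern, "")[[1]] == "k"
  x[rep_len(keep, length(x))]
}

print(c(1, 5) %notin% c(1, 2, 3))
print(triangle_sketch(3))              # 6, matching T3 = 1 + 2 + 3
print(keep_every_sketch(1:10, "-kk--")) # 2 3 7 8
print(iqr_outliers_sketch(c(1, 2, 3, 4, 100)))
```

Keeping each helper to a line or two of base R makes it easy to compare against the package's documented behaviour before relying on either.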