drop_empty_rows: Drop 'empty' rows in a dataframe

View source: R/dataframe_tools.R

drop_empty_rows    R Documentation

Drop 'empty' rows in a dataframe

Description

Deletes rows from a dataframe if they are 'empty'. A row is empty when every cell in the selected columns is NA, NULL, "", or matches a regular expression.
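
The emptiness rule described here can be sketched in base R. This is a minimal illustration and not the package's implementation: is_empty_row is a hypothetical helper, the sample dataframe is invented for the sketch, and NULL cells (which only arise in list columns) are not handled.

```r
# Hypothetical helper, not part of the package: returns TRUE for each row in
# which every cell is NA, "", or matches `regex`.
is_empty_row <- function(df, regex = "^$") {
  empty_cell <- lapply(df, function(col) {
    # Compare as text so numeric and character columns are tested alike.
    txt <- as.character(col)
    is.na(txt) | txt == "" | grepl(regex, txt)
  })
  # A row is empty only if all of its cells are empty.
  Reduce(`&`, empty_cell, rep(TRUE, nrow(df)))
}

# Invented sample data for this sketch.
df <- data.frame(id = c("a", "b", "c"),
                 x  = c("1", "", NA),
                 y  = c(0, NA, 0),
                 stringsAsFactors = FALSE)

is_empty_row(df[2:3])                  # FALSE TRUE FALSE
is_empty_row(df[2:3], regex = "^0$")   # FALSE TRUE TRUE
```

The id column is left out of the check because it is always filled, which is the same reason the examples further down skip the 'name' column.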

Usage

drop_empty_rows(
  df,
  from = 1,
  to = NULL,
  cols = NULL,
  regex = "^$",
  report = FALSE
)

Arguments

df

(Dataframe) A dataframe.

from, to

(Numeric or NULL) The start and end of a contiguous range of columns in df to check for emptiness. Columns that are always filled, such as a name column, should be left out of the range (see Examples). If to is NULL, it defaults to the last column in df, so from = 2, to = NULL is the same as 2:length(df).

cols

(Numeric or NULL) A numeric vector of the columns to consider. This allows you to select non-contiguous columns. If cols is not NULL, from and to are ignored.

regex

(Character) A regex pattern. Any cell whose value matches it is considered 'empty', in addition to cells that are NA or "" (see Examples).

report

(Logical) If TRUE, print a message with the number of empty rows that were dropped.
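
The way from, to, and cols interact can be sketched as follows. This is only an illustration of the documented precedence, assuming nothing about the package internals: resolve_cols is a hypothetical helper and the dataframe is invented for the sketch.

```r
# Hypothetical helper, not part of the package: works out which column
# indices would be checked for a given combination of arguments.
resolve_cols <- function(df, from = 1, to = NULL, cols = NULL) {
  # A non-NULL `cols` takes precedence; `from` and `to` are ignored.
  if (!is.null(cols)) return(cols)
  # A NULL `to` means "up to the last column".
  if (is.null(to)) to <- length(df)
  from:to
}

# Invented sample data for this sketch.
df <- data.frame(name = c("Jim", "Jane"), a = c(0, NA), b = c(1, NA))

resolve_cols(df)                              # 1 2 3
resolve_cols(df, from = 2)                    # 2 3
resolve_cols(df, from = 2, cols = c(1, 3))    # 1 3
```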

Value

A subset of df with all empty rows removed.
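
As a rough sketch of what the return value looks like, the dropping step can be written as an ordinary row subset. drop_flagged is a hypothetical helper, the empty vector is supplied by hand, and the use of message() is an assumption; the message wording is copied from the example output on this page.

```r
# Hypothetical helper, not part of the package: removes the rows flagged in
# `empty` and optionally reports how many were removed.
drop_flagged <- function(df, empty, report = FALSE) {
  # drop = FALSE keeps the result a dataframe even with a single column.
  kept <- df[!empty, , drop = FALSE]
  if (report) message("Dropped rows: ", sum(empty), " in total")
  kept
}

# Invented sample data for this sketch.
df <- data.frame(name = c("Jim", "Jane", "Janice"),
                 a    = c("0", "", "1"),
                 stringsAsFactors = FALSE)

result <- drop_flagged(df, empty = c(FALSE, TRUE, FALSE), report = TRUE)
rownames(result)   # "1" "3"
```

A subset taken this way keeps the original row names, which matches the row labels (1, 3, 4, 5) in the example output.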

Examples

data <- data.frame(name = c("Jim", "Jane", "Janice", "Joe", "Jay"),
                   a = c(0, "", 1, NA, 0),
                   b = c(1, "", 1, NA, 0),
                   c = c(1, NA, 2, 0, 0),
                   d = c(0, NA, 4, 0, 0),
                   e = c(0, "", 5, 0, 0),
                   f = c(3, "", 0, 0, 0),
                   stringsAsFactors = FALSE)
                   
data

#>     name    a    b  c  d e f
#> 1    Jim    0    1  1  0 0 3
#> 2   Jane           NA NA    
#> 3 Janice    1    1  2  4 5 0
#> 4    Joe <NA> <NA>  0  0 0 0
#> 5    Jay    0    0  0  0 0 0

drop_empty_rows(data)

# Returns the whole dataframe because column 1 ('name') is never empty.
#>     name    a    b  c  d e f
#> 1    Jim    0    1  1  0 0 3
#> 2   Jane           NA NA    
#> 3 Janice    1    1  2  4 5 0
#> 4    Joe <NA> <NA>  0  0 0 0
#> 5    Jay    0    0  0  0 0 0

drop_empty_rows(data, from = 2)

# We get the desired result when 'name' is omitted.
#>     name    a    b c d e f
#> 1    Jim    0    1 1 0 0 3
#> 3 Janice    1    1 2 4 5 0
#> 4    Joe <NA> <NA> 0 0 0 0
#> 5    Jay    0    0 0 0 0 0

drop_empty_rows(data, from = 2, regex = "^0$")

# Regex can be used to match cells that should be treated as 'empty'.
#>     name a b c d e f
#> 1    Jim 0 1 1 0 0 3
#> 3 Janice 1 1 2 4 5 0

drop_empty_rows(data, cols = c(2, 6))

# Non-contiguous columns can be selected with 'cols'.
#>     name    a    b c d e f
#> 1    Jim    0    1 1 0 0 3
#> 3 Janice    1    1 2 4 5 0
#> 4    Joe <NA> <NA> 0 0 0 0
#> 5    Jay    0    0 0 0 0 0

drop_empty_rows(data, cols = c(2, 6), report = TRUE)

#> Dropped rows: 1 in total
#>     name    a    b c d e f
#> 1    Jim    0    1 1 0 0 3
#> 3 Janice    1    1 2 4 5 0
#> 4    Joe <NA> <NA> 0 0 0 0
#> 5    Jay    0    0 0 0 0 0


DesiQuintans/desiderata documentation built on April 9, 2023, 5:43 a.m.