missing: Drop, replace and fill missing values in data.frame

drop_na_dt    R Documentation

Drop, replace and fill missing values in data.frame

Description

A set of tools for dealing with missing values in data.frames. They can drop rows that contain missing values, replace missing values, fill them (with the next or previous observation), or delete rows and columns according to how many of their values are missing.

Usage

drop_na_dt(.data, ...)

replace_na_dt(.data, ..., to)

delete_na_cols(.data, prop = NULL, n = NULL)

delete_na_rows(.data, prop = NULL, n = NULL)

fill_na_dt(.data, ..., direction = "down")

shift_fill(x, direction = "down")

Arguments

.data

data.frame

...

Columns to operate on (checked for NAs, replaced, or filled). If not specified, all columns are used.

to

The value that NAs should be replaced with.

prop

Rows or columns whose proportion of NAs is greater than or equal to "prop" are deleted.

n

Rows or columns whose number of NAs is greater than or equal to "n" are deleted.

direction

Direction in which to fill missing values. Currently either "down" (the default) or "up".

x

A vector with missing values to be filled.

Details

drop_na_dt drops the rows that contain NAs in the specified columns. fill_na_dt fills NAs with the previous observation ("down") or the next observation ("up"); these are also known as last observation carried forward (LOCF) and next observation carried backward (NOCB), respectively.
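To make the LOCF and NOCB terminology concrete, the following is a minimal base R sketch of the two fill directions. The helpers locf() and nocb() are hypothetical names introduced for illustration only; they are not part of tidyfst and do not claim to reproduce how fill_na_dt or shift_fill are implemented. In this sketch, an NA with no earlier (or later) observation to borrow from is left as NA.

```r
# Hypothetical helpers for illustration; not tidyfst functions.

# Last observation carried forward: each NA takes the most recent non-NA value.
locf <- function(x) {
  seen <- cumsum(!is.na(x))          # how many non-NA values have appeared so far
  c(NA, x[!is.na(x)])[seen + 1]      # position 1 is NA for leading gaps
}

# Next observation carried backward: LOCF applied to the reversed vector.
nocb <- function(x) rev(locf(rev(x)))

v <- c(NA, "a", NA, "b", NA)
locf(v)  # direction "down"
nocb(v)  # direction "up"
```

The cumulative count acts as an index into the non-missing values, which avoids an explicit loop over the vector.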

delete_na_cols drops the columns whose proportion of NAs is greater than or equal to "prop", or whose number of NAs is greater than or equal to "n". delete_na_rows works the same way but operates on rows.
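The threshold rule can be illustrated with a short base R sketch. It only mirrors the "greater than or equal to" comparison described above, using colMeans() and rowSums() on the logical matrix returned by is.na(); the variable names are assumptions made for this example, and the code is not the tidyfst implementation.

```r
# Illustration of the threshold rule in base R; not the tidyfst implementation.
dat <- data.frame(x = c(1, 2, NA, 3), y = c(NA, NA, 4, 5), z = rep(NA, 4))

# Proportion of NAs per column: x = 0.25, y = 0.50, z = 1.00
na_prop_cols <- colMeans(is.na(dat))

# Drop columns whose NA proportion is >= 0.5 (keeps only x)
kept_cols <- dat[, na_prop_cols < 0.5, drop = FALSE]

# Number of NAs per row: 2, 2, 2, 1
na_count_rows <- rowSums(is.na(dat))

# Drop rows whose NA count is >= 2 (keeps only the last row)
kept_rows <- dat[na_count_rows < 2, , drop = FALSE]
```

Using drop = FALSE keeps the result a data.frame even when a single column or row survives.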

shift_fill fills the missing values of a vector in the given direction.

Value

data.table

References

https://stackoverflow.com/questions/23597140/how-to-find-the-percentage-of-nas-in-a-data-frame

https://stackoverflow.com/questions/2643939/remove-columns-from-dataframe-where-all-values-are-na

https://stackoverflow.com/questions/7235657/fastest-way-to-replace-nas-in-a-large-data-table

See Also

drop_na, replace_na, fill

Examples

library(tidyfst)  # attach the package that the examples below rely on

df <- data.table(x = c(1, 2, NA), y = c("a", NA, "b"))

# drop rows with NAs, checking all columns or only the named ones
df %>% drop_na_dt()
df %>% drop_na_dt(x)
df %>% drop_na_dt(y)
df %>% drop_na_dt(x, y)

# replace NAs with a given value, in all columns or only the named ones
df %>% replace_na_dt(to = 0)
df %>% replace_na_dt(x, to = 0)
df %>% replace_na_dt(y, to = 0)
df %>% replace_na_dt(x, y, to = 0)

# fill NAs; the default direction is "down"
df %>% fill_na_dt(x)
df %>% fill_na_dt() # not specified, fill all columns
df %>% fill_na_dt(y, direction = "up")

# NA proportion by column: x is 25%, y is 50%, z is 100%
x <- data.frame(x = c(1, 2, NA, 3), y = c(NA, NA, 4, 5), z = rep(NA, 4))
x
x %>% delete_na_cols()
x %>% delete_na_cols(prop = 0.75)
x %>% delete_na_cols(prop = 0.5)
x %>% delete_na_cols(prop = 0.24)
x %>% delete_na_cols(n = 2)

# the same thresholds applied to rows
x %>% delete_na_rows(prop = 0.6)
x %>% delete_na_rows(n = 2)

# shift_fill
y <- c("a", NA, "b", NA, "c")

shift_fill(y) # equivalent to
shift_fill(y, "down")

shift_fill(y, "up")

hope-data-science/tidyfst documentation built on Sept. 23, 2024, 8:05 p.m.