row_sums: Row sums and means for data frames
In strengejacke/sjmisc: Data and Variable Transformation Functions

Description

row_sums() and row_means() compute row sums or means for rows that have at least n valid (i.e. non-missing) values. The functions are designed to work well in a pipe workflow and accept select-helpers for selecting variables.

Usage

row_sums(x, ...)

## Default S3 method:
row_sums(x, ..., n, var = "rowsums", append = TRUE)

## S3 method for class 'mids'
row_sums(x, ..., var = "rowsums", append = TRUE)

row_means(x, ...)

total_mean(x, ...)

## Default S3 method:
row_means(x, ..., n, var = "rowmeans", append = TRUE)

## S3 method for class 'mids'
row_means(x, ..., var = "rowmeans", append = TRUE)

Arguments

`x`	A vector or data frame.
`...`	Optional, unquoted names of variables that should be selected for further processing. Required if `x` is a data frame (and not a vector) and only selected variables from `x` should be processed. You may also use functions like `:` or tidyselect's select-helpers. See 'Examples' or the package vignette.
`n`	May be a numeric value that indicates the number of valid values per row required to calculate the row mean or sum; a value between 0 and 1, indicating the proportion of valid values per row required to calculate the row mean or sum (see 'Details'); or `Inf`. If `n = Inf`, all values per row must be non-missing to compute the row mean or sum. If a row's number of valid (i.e. non-`NA`) values is less than `n`, `NA` is returned as the row mean or sum.
`var`	Name of the new variable with the row sums or means.
`append`	Logical, if `TRUE` (the default) and `x` is a data frame, `x` including the new variables as additional columns is returned; if `FALSE`, only the new variables are returned.

Details

n must be a numeric value from 0 to ncol(x). If a row in x has at least n non-missing values, the row mean or sum is returned; otherwise NA is returned for that row. If n is a non-integer value from 0 to 1, n is taken as the proportion of non-missing values required per row. E.g., if n = .75, a row must have at least ncol(x) * n non-missing values for the row mean or sum to be calculated. See 'Examples'.
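The rule described above can be sketched in base R. This is only an illustration of the threshold logic as stated in this section, not the package's implementation: `row_means_min_valid()` is a hypothetical helper, and the way sjmisc rounds `ncol(x) * n` for fractional `n` may differ from the literal comparison used here.

```r
# Hypothetical helper (not part of sjmisc): row means that are only
# computed when a row has enough non-missing values.
row_means_min_valid <- function(x, n) {
  x <- as.matrix(x)
  if (is.infinite(n)) n <- ncol(x)   # n = Inf: every value must be present
  if (n %% 1 != 0) n <- ncol(x) * n  # fractional n: read as a proportion of columns
  valid <- rowSums(!is.na(x))        # number of non-missing values per row
  out <- rowMeans(x, na.rm = TRUE)
  out[valid < n] <- NA               # too few valid values: return NA
  out
}

dat <- data.frame(
  c1 = c(1, 2, NA, 4),
  c2 = c(NA, 2, NA, 5),
  c3 = c(NA, 4, NA, NA),
  c4 = c(2, 3, 7, 8),
  c5 = c(1, 7, 5, 3)
)

# at least 4 valid values per row
res_count <- row_means_min_valid(dat, n = 4)

# at least 40% valid values in c1 to c4, i.e. at least 1.6 of 4 columns
res_prop <- row_means_min_valid(dat[, 1:4], n = 0.4)
```

With `n = 4`, only the second and fourth rows have enough valid values, so the first and third rows come back as `NA`. With `n = 0.4` on four columns, the third row has a single valid value and is therefore also `NA`.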

Value

For row_sums(), a data frame with a new variable: the row sums from x; for row_means(), a data frame with a new variable: the row means from x. If append = FALSE, only the new variable with the row sums or row means is returned. total_mean() returns the mean of all values from all specified columns in a data frame.
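The two return shapes can be pictured with a base R analogue. This is a sketch only: it uses `rowSums()` and `cbind()` instead of sjmisc, the data frame `scores` is made up for illustration, and the column name `rowsums` mirrors the default of the `var` argument.

```r
# Made-up example data, for illustration only
scores <- data.frame(a = c(1, 2), b = c(3, NA))

# Row sums over the non-missing values
sums <- rowSums(scores, na.rm = TRUE)

# Analogous to append = TRUE: original columns plus the new variable
appended <- cbind(scores, rowsums = sums)

# Analogous to append = FALSE: only the new variable
only_new <- data.frame(rowsums = sums)
```

Here `appended` has the columns `a`, `b` and `rowsums`, while `only_new` has the single column `rowsums`.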

Examples

library(sjmisc)
library(dplyr) # provides %>% and select-helpers such as contains()

data(efc)
efc %>% row_sums(c82cop1:c90cop9, n = 3, append = FALSE)

row_sums(efc, contains("cop"), n = 2, append = FALSE)

dat <- data.frame(
  c1 = c(1,2,NA,4),
  c2 = c(NA,2,NA,5),
  c3 = c(NA,4,NA,NA),
  c4 = c(2,3,7,8),
  c5 = c(1,7,5,3)
)
dat

row_means(dat, n = 4)
row_sums(dat, n = 4)

row_means(dat, c1:c4, n = 4)
# at least 40% non-missing
row_means(dat, c1:c4, n = .4)
row_sums(dat, c1:c4, n = .4)

# total mean of all values in the data frame
total_mean(dat)

# create sum-score of COPE-Index, and append to data
efc %>%
  select(c82cop1:c90cop9) %>%
  row_sums(n = 1)

# if data frame has only one column, this column is returned
row_sums(dat[, 1, drop = FALSE], n = 0)

strengejacke/sjmisc documentation built on May 16, 2024, 4:07 a.m.