# normalize_df_per_dim: Useful functions on data frames

In eilslabs/YAPSA: Yet Another Package for Signature Analysis

## Description

`normalize_df_per_dim`: Normalization is carried out by dividing by `rowSums` or `colSums`; rows with `rowSums=0` and columns with `colSums=0` are not normalized.
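The row case of this behaviour can be sketched in base R. This is an illustration only, not the YAPSA implementation: the helper name `normalize_rows_sketch` is hypothetical, and the sketch covers only division by `rowSums`.

```r
# Hypothetical helper (not part of YAPSA): divide each row by its sum,
# leaving rows whose sum is zero untouched.
normalize_rows_sketch <- function(df) {
  m <- as.matrix(df)
  sums <- rowSums(m)
  nonzero <- sums != 0
  # A matrix divided by a vector of length nrow recycles column-wise,
  # so every entry in row i is divided by sums[i].
  m[nonzero, ] <- m[nonzero, , drop = FALSE] / sums[nonzero]
  as.data.frame(m)
}

test_df <- data.frame(matrix(c(1,2,3,0,5,2,3,4,0,6,0,0,0,0,0,4,5,6,0,7),
                             ncol = 4))
normalized <- normalize_rows_sketch(test_df)
rowSums(normalized)
```

Rows that were normalized sum to 1, while the all-zero fourth row stays at 0 instead of producing `NaN` from a division by zero.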

`average_over_present`: If averaging over columns, zero rows (i.e. those with `rowSums=0`) are left out; if averaging over rows, zero columns (i.e. those with `colSums=0`) are left out.
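The effect of leaving out zero rows can be illustrated with a small base R sketch. It assumes one reading of the description, namely that one mean is computed per column and all-zero rows do not count towards it. The helper name `col_means_over_present_rows` is hypothetical and not part of YAPSA.

```r
# Hypothetical helper (not part of YAPSA): per-column means that ignore
# rows consisting only of zeros.
col_means_over_present_rows <- function(df) {
  present <- rowSums(df) != 0
  colMeans(df[present, , drop = FALSE])
}

test_df <- data.frame(matrix(c(1,2,3,0,5,2,3,4,0,6,0,0,0,0,0,4,5,6,0,7),
                             ncol = 4))
means_present <- col_means_over_present_rows(test_df)
means_all <- colMeans(test_df)  # plain means, zero row included
```

For this data frame the fourth row is all zero, so `means_present` divides by 4 rows where `means_all` divides by 5.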

`sd_over_present`: If computing the standard deviation over columns, zero rows (i.e. those with `rowSums=0`) are left out; if computing the standard deviation over rows, zero columns (i.e. those with `colSums=0`) are left out.

`stderrmean_over_present`: If computing the standard error of the mean over columns, zero rows (i.e. those with `rowSums=0`) are left out; if computing the standard error of the mean over rows, zero columns (i.e. those with `colSums=0`) are left out. Uses the function `stderrmean`.
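As a rough illustration of how the standard deviation and the standard error of the mean relate, the sketch below uses the conventional definition `sd(x) / sqrt(length(x))`. This is an assumption: YAPSA's own `stderrmean` may differ, for example in how it treats missing values. The name `sem_sketch` is hypothetical, and the sketch again assumes one value per column with all-zero rows left out.

```r
# Hypothetical helper (not YAPSA's stderrmean): conventional standard
# error of the mean, i.e. sample standard deviation over sqrt(n).
sem_sketch <- function(x) sd(x) / sqrt(length(x))

test_df <- data.frame(matrix(c(1,2,3,0,5,2,3,4,0,6,0,0,0,0,0,4,5,6,0,7),
                             ncol = 4))

# Keep only rows that are not entirely zero, then work column by column.
present_df <- test_df[rowSums(test_df) != 0, , drop = FALSE]
col_sd  <- sapply(present_df, sd)
col_sem <- sapply(present_df, sem_sketch)
```

With four remaining rows, each entry of `col_sem` is the corresponding entry of `col_sd` divided by 2.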

## Usage

```r
normalize_df_per_dim(in_df, in_dimension)

average_over_present(in_df, in_dimension)

sd_over_present(in_df, in_dimension)

stderrmean_over_present(in_df, in_dimension)
```

## Arguments

| Argument | Description |
|---|---|
| `in_df` | Data frame to be normalized |
| `in_dimension` | Dimension along which the operation will be carried out |

## Value

The normalized numerical data frame (`normalize_df_per_dim`)

A vector of the means (`average_over_present`)

A vector of the standard deviations (`sd_over_present`)

A vector of the standard errors of the mean (`stderrmean_over_present`)

## See Also

`stderrmean`

## Examples

```r
library(YAPSA)

## Test data frame with one all-zero row and one all-zero column
test_df <- data.frame(matrix(c(1,2,3,0,5,2,3,4,0,6,0,0,0,0,0,4,5,6,0,7),
                             ncol=4))

## 1. Normalize over rows:
normalize_df_per_dim(test_df,1)
## 2. Normalize over columns:
normalize_df_per_dim(test_df,2)

## 1. Average over non-zero rows:
average_over_present(test_df,1)
## 2. Average over non-zero columns:
average_over_present(test_df,2)

## 1. Compute standard deviation over non-zero rows:
sd_over_present(test_df,1)
## 2. Compute standard deviation over non-zero columns:
sd_over_present(test_df,2)

## 1. Compute standard error of the mean over non-zero rows:
stderrmean_over_present(test_df,1)
## 2. Compute standard error of the mean over non-zero columns:
stderrmean_over_present(test_df,2)
```