# Furniture In furniture: Furniture for Quantitative Scientists

```knitr::opts_chunk\$set(echo = TRUE)
```

This vignette is current as of `furniture` `r packageVersion("furniture")`.

## Using `furniture`

```library(furniture)
```

We will first make a fictitious data set:

```df <- data.frame(a = rnorm(1000, 1.5, 2),
b = seq(1, 1000, 1),
c = c(rep("control", 400), rep("Other", 70), rep("treatment", 500), rep("None", 30)),
d = c(sample(1:1000, 900, replace=TRUE), rep(-99, 100)))
```

There are four functions that we'll demonstrate here:

1. `washer`
2. `table1`
3. `tableC`
4. `tableF`

## Washer

`washer` is a great function for quick data cleaning. In situations where there are placeholders, extra levels in a factor, or several values need to be changed to another.

```library(dplyr)

df <- df %>%
mutate(d = washer(d, -99),  ## changes the placeholder -99 to NA
c = washer(c, "Other", "None", value = "control")) ## changes "Other" and "None" to "Control"
```

## Table1

Now that the data is "washed" we can start exploring and reporting.

```table1(df, a, b, factor(c), d)
```

The variables must be numeric or factor. Since we use a special type of evaluation (i.e. Non-Standard Evaluation) we can change the variables in the function (e.g., `factor(c)`). This can be extended to making a whole new variable in the function as well.

```table1(df, a, b, d, ifelse(a > 1, 1, 0))
```

This is just the beginning though. Two powerful things the function can do are shown below:

```table1(df, a, b, d, ifelse(a > 1, 1, 0),
splitby=~factor(c),
test=TRUE)
```

The `splitby = ~factor(c)` stratifies the means and counts by a factor variable (in this case either control or treatment). When we use this we can also automatically compute tests of significance using `test=TRUE`.

We can also use it intuitively within the pipe (for more about this, see the "Table 1" vignette):

```df %>%
group_by(c) %>%
table1(a, b, d, ifelse(a > 1, 1, 0),
test=TRUE)
```

In this case, we used the `group_by()` function from `dplyr` (within the `tidyverse`) and `table1()` knows to use that as the grouping variable in place of the `splitby` argument.

If the parametric tests (default) are not appropriate, you can set `param = FALSE`.

```df %>%
group_by(c) %>%
table1(a, b, d, ifelse(a > 1, 1, 0),
test=TRUE,
param=FALSE)
```

Finally, you can polish it quite a bit using a few other options. For example, you can do the following:

```table1(df, a, b, d, ifelse(a > 1, 1, 0),
splitby=~factor(c),
test=TRUE,
var_names = c("A", "B", "D", "New Var"),
type = c("simple", "condensed"))
```

Note that `var_names` can be used for more complex naming (e.g., with spaces, brackets) that otherwise cannot be used with data frames. Alternatively, for more simple naming, we can name them directly.

```table1(df, A = a, B = b, D = d, A2 = ifelse(a > 1, 1, 0),
splitby=~factor(c),
test=TRUE,
type = c("simple", "condensed"))
```

You can also format the numbers (adding a comma for big numbers such as in 20,000 instead of 20000):

```table1(df, a, b, d, ifelse(a > 1, 1, 0),
splitby=~factor(c),
test=TRUE,
var_names = c("A", "B", "D", "New Var"),
format_number = TRUE)
```

The table can be exported directly to a folder in the working directory called "Table1". Using `export`, we provide it with a string that will be the name of the CSV containing the formatted table.

```table1(df, a, b, d, ifelse(a > 1, 1, 0),
splitby=~factor(c),
test=TRUE,
var_names = c("A", "B", "D", "New Var"),
format_number = TRUE,
export = "example_table1")
```

This can also be outputted as a latex, markdown, or pandoc table (matching all the output types of `knitr::kable`). Below shows how to do a latex table (not using `kable` however, but a built-in function that provides the variable name at the top of the table):

```table1(df, a, b, d, "new var" = ifelse(a > 1, 1, 0),
splitby = ~factor(c),
test = TRUE,
output = "latex2")
```

Last item to show you regarding `table1()` is that it can be printed in a simplified and condensed form. This instead of reporting counts and percentages for categorical variables, it reports only percentages and the table has much less white space.

```table1(df, a, b, d, "new var" = ifelse(a > 1, 1, 0),
splitby = ~factor(c),
test = TRUE,
type = c("simple", "condensed"))
```

## Table C

This function is to create simple, beautiful correlation tables. The syntax is just like `table1()` in most respects. Below we include all the numeric variables to see their correlations. Since there are missing values in `d` we will use the natural `na.rm=TRUE`.

```tableC(df,
a, b, d,
na.rm = TRUE)
```

All the adjustments that you can make in `table1()` can be done here as well. For example,

```tableC(df,
"A" = a, "B" = b, "D" = d,
na.rm = TRUE,
output = "html")
```

## Table F

This function is to create simple frequency tables. The syntax is just like `table1()` and `tableC()` in most respects, except that it uses only one variable instead of many.

```tableF(df, a)
```

Similarly to `table1()` we can use a `splitby` argument (or `group_by()`).

```tableF(df, d, splitby = c)
```
```df %>%
group_by(c) %>%
tableF(d)
```

## Table X

Lastly, `tableX()` is a pipe-able two-way version of `table()` with a similar syntax to that of the rest of the `furniture` functions.

```df %>%
tableX(c, ifelse(d > 500, 1, 0))
```

By default, it provides the total counts for the rows and columns with flexibility as to what is displayed and where.

## Conclusion

The four functions: `table1()`, `tableC()`, `tableF()`, and `washer()` add simplicity to cleaning up and understanding your data. Use these pieces of furniture to make your quantitative life a bit easier.

## Try the furniture package in your browser

Any scripts or data that you put into this service are public.

furniture documentation built on Feb. 22, 2021, 9:08 a.m.