In randomchars42/eenv: Import often used functions, settings and packages

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

A package to load packages, functions and variables I frequently use.

Installation

You can install eenv from github with:

# install.packages("devtools")
devtools::install_github("randomchars42/eenv")

Loading

library("eenv")

Data import

Preparing your raw data

Suppose you have an ods / xls(x) file with raw values obtained from a measurement like this:

data <-
  utils::read.csv2(
    system.file("extdata", "values.csv", package = "eenv"),
    header = FALSE)
rownames(data) <- LETTERS[1:6]

knitr::kable(
  data,
  row.names = TRUE,
  col.names = as.character(1:6))

Save them as plate_1.csv- thats like an ods / xls(x) file but its basically a text file with the values separated by commas (or semicolons for languages that use "," to separate decimals).

In the current versions of LibreOffice / OpenOffice / Microsoft office theres an option "Save as" > "csv".

Adding names

Before feeding your samples into your measuring device you most likely drafted some sort of plan which position corresponds to which sample (didn't you?). It may have looked like this:

data <-
  utils::read.csv2(
    system.file("extdata", "names.csv", package = "eenv"),
    header = FALSE)
rownames(data) <- LETTERS[1:6]

knitr::kable(
  data,
  row.names = TRUE,
  col.names = as.character(1:6))

So you had some calibrators (CAL1 - CAL4) and samples A, B, C, D, E, F, H, I, J, K, L, M each in duplicates.

To easily set the names for your samples just copy the names into your new plate_1.csv:

data <-
  utils::read.csv2(
    system.file("extdata", "values_names.csv", package = "eenv"),
    header = FALSE)
rownames(data) <- LETTERS[1:12]

knitr::kable(
  data,
  row.names = TRUE,
  col.names = as.character(1:6))

Tell plates_read() your data contains the names and which column should hold those names by setting additional_vars = c("name").

# do not run this code yet
plates_read(
  additional_vars = c("name")
)

It does not matter which name you select for the column. Pick anything that adequately describes the data in the column.

Adding more information

Suppose samples A, B, C, D, E, F were taken at day 1 and H, I, J, K, L, M were taken from the same rats / individuals / patients on day 2.

It would be more elegant to encode that into the data:

data <-
  utils::read.csv2(
    system.file("extdata", "plate_1.csv", package = "eenv"),
    header = FALSE)
rownames(data) <- LETTERS[1:12]

knitr::kable(
  data,
  row.names = TRUE,
  col.names = as.character(1:6))

So now the name and the day are in the same field, separated by "_". (So make sure the original names do not contain "_" or there will be trouble.. ;) )

Tell plates_read() your data contains the names and day by setting additional_vars = c("name", "day"):

# do not run this code yet
plates_read(
  additional_vars = c("name", "day")
)

Calculating concentrations

So, your measuring device only gave you raw values (extinction rates / relative light units / ...), but you know the concentrations of CAL1, CAL2, CAL3 and CAL4. To get the concentrations for the rest of the samples you need to tell plates_read() which samples are your calibrators and what their actual concentration is.

You do this via cal_names and cal_values. Both expect a vector (that's a series of text strings / numbers / values...). You can create a vector using c(). Just put whatever you like in its arguments:

# try it
like <- c("sunny day", "cake", "coffee", "more coffee", "chocolate", "pizza")
like

# try it
fibonacci <- c(1, 1, 2, 3, 5, 8, 13)
fibonacci

You already created your first vector when you said: additional_vars = c("name", "day") ;) .

So, now create some vectors for your calibrators:

# try it
calibrator_names = c(
  "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
  "CAL10"
)

calibrator_values = c(
  4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
)

Tell plates_read() about it:

# do not run this code yet

calibrator_names = c(
  "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
  "CAL10"
)

calibrator_values = c(
  4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
)

plates_read(
  additional_vars = c("name", "day"),
  cal_names = calibrator_names,
  cal_values = calibrator_values
)

The same, but a bit shorter:

# do not run this code yet
plates_read(
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
    "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
  )
)

Getting the data into the programme

plates_read() automagically reads plate_1.csv in your current directory. If you have data from more than one plate use plates = 2 to read plate1.csv AND plate2.csv. You can read as much plates as you like, as long as their numbers are in a sequence and no numbers are missing.

For now, tell plates_read() to read only your first plate:

# do not run this code yet
plates_read(
  plates = 1,
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
    "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
  )
)

You dont have any data yet but want to play a bit?

This saves the example data into plate_1.csv (but first it checks if there already is a file with that name, else it would overwrite it):

# try it
# (don't worry if it looks weird ;) )
if (! file.exists("plate_1.csv")) {
  write.csv2(
    x = read.csv2(
      file = system.file("extdata", "values_names.csv", package = "eenv"),
      header = FALSE),
    file = "plate_1.csv",
    col.names = FALSE
  )
}

One thing to note for .csv-files: some languages use "." to separate decimals, in those languages .csv usually uses "," to separate values. Some European languages use "," for decimals, in these languages ";" is used for separation of values. You need to tell plates_read() how you would like to have it with sep = "," or sep = ";", respectively:

# do not run this code yet
plates_read(
  plates = 1,
  sep = ";",
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
    "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
  )
)

Working with the results

plates_read() does a lot and wants to tell you about it:

reads the data, sorts it, calculates concentrations, calculates how good your duplicates were, ...
creates a data table (called "tibble" or "tbl" for short) with all the values it calculated
creates a tibble with only one entry per duplicate (with the mean of the duplicate) which is what you will most likely use
creates a graphic to show you how good your calibrators matched the function used to calculate the concentrations

To tell you aout everything it returns a list. lists are like vectors but you can store a lot of different things in them:

# try it
my_list <- list(a = 1, favourite_food = "pizza", age = 20, a_vector = c(1, 2, 3))
my_list

You can access each part of the list in a lot of different ways:

Using "$":

# try it
my_list$age

Using the position:

# try it
my_list[[3]]

Something totally different:

# try it
my_list[["a_vector"]]

This works for setting values as well:

# try it
my_list$age <- 30
my_list$age

Now its time to run plates_read():

# now you may run it :)
result_list <- plates_read(
  plates = 1,
  sep = ";",
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
    "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
  )
)

result_list <- eenv::plates_read(
  plates = 1,
  sep = ";",
  path = system.file("extdata", package = "eenv"),
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
    "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
  ),
  write_data = FALSE,
  use_written_data = FALSE
)

The resulting list is now stored in result_list. In addition, plates_read() created two files in your current directory: data_all.csv and data_samples.csv with all the data ;) .

plates_read()'s list holds the following items:

$all: here you will find all the data , including calibrators, duplicates, ... (saved in data_all.csv)
$samples: only samples here - no calibrators, no duplicates -> most often you will work with this data (saved in data_samples.csv)
$plate1: another list ;)
- $plot: a plot showing you the linear function used to calculate the concentrations for this plate.
The points are the calibrators. They should more or less lie close to the line. $model: the model - you won't need this too often ;) ($plate2): the same information for every plate you have

Take a look:

result_list$all

knitr::kable(result_list$all)

result_list$samples

knitr::kable(result_list$samples)

result_list$plate1$plot

If the calibrators aren't so good...

The lowest calibrator ("CAL9") does not seem to fit to well. So perhaps you get a better result when removing it? You can do that by setting exclude_cals = list(plate1 = c("CAL9")).

(But obviously, you won't be able to interpret values below "CAL8", other than that they are "below CAL8"!).

# now you may run it :)
result_list <- plates_read(
  plates = 1,
  sep = ";",
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
    "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
  ),
  exclude_cals =  list(plate1 = c("CAL9"))
)

result_list <- eenv::plates_read(
  plates = 1,
  sep = ";",
  path = system.file("extdata", package = "eenv"),
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
    "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
  ),
  exclude_cals =  list(plate1 = c("CAL9")),
  write_data = FALSE,
  use_written_data = FALSE
)

Take a look at the fit of the line:

# try it
result_list$plate1$plot

Better! But be careful when interpreting values below "CAL8" (the lowest use calibrator now).

Where to go to from here

First of all, to save a couple of keystrokes and make remembering easier:

# try it
my_data <- result_list$samples

You could take a look which duplicates have a high (> 20 %) coefficient of variation:

# try it
my_data %>%
  filter(concentration_cv > 0.20)

Luckily, there are no samples. Note that the result was not stored.

If there were you could exclude them like:

# try it
my_data <- my_data %>%
  filter(concentration_cv < 0.20)
my_data

# try it
my_data <- my_data %>%
  filter(concentration_cv < 0.20)

knitr::kable(my_data)

This time the result was stored. Be careful when overwriting data. You can always go back and run plates_read() again :).

Now, given the concentration of your calibrators was in "ng / ml" but your editor wants you to use SI units you could convert the concentrations like this:

# try it
my_data <- my_data %>%
  mutate(
    concentration = convert_conc(x = concentration, from = "ng / ml", to = "nmol / l", molar_mass = 52391)
  )
my_data

# try it
my_data <- my_data %>%
  mutate(
    concentration = convert_conc(x = concentration, from = "ng / ml", to = "nmol / l", molar_mass = 52391)
  )

knitr::kable(my_data)

Or you could create a plot like this:

# try it
# use our data
# use column "name" as x-axis
# use column "concentration" as y-axis
ggplot(data = my_data, aes(x = name, y = concentration, colour = name)) +
  # plot each value as a point
  geom_point() +
  # draw a sub-plot (facet) for each day
  facet_wrap(~day)

Or a boxplot:

# try it
# use our data
# use column "name" as x-axis
# use column "concentration" as y-axis
ggplot(data = my_data, aes(x = day, y = concentration)) +
  # draw boxplots
  geom_boxplot() +
  # draw each value as a point over the boxplot
  geom_jitter(width = 0.2)
  # you could try
  # geom_point() as well ;)

Where to get help

The INTERNET ;)

To learn more about working with data:

dplyr, good documentation with many examples : http://dplyr.tidyverse.org/
R for Data Science, by the author of dplyr and ggplot: http://r4ds.had.co.nz/
good packages: http://tidyverse.org/

To learn more about presenting data in plots:

Visualising data: http://r4ds.had.co.nz/data-visualisation.html
Communicating results with plots: http://r4ds.had.co.nz/graphics-for-communication.html
good examples: http://www.cookbook-r.com/Graphs/

Rarely needed

Calibrators do not need to be ln-ln transformed

Your calibrators are linear? You can use: model_func = fit_linear and interpolate_func = interpolate_linear. Basicallly, you can use any function as model_function that returns a model which is understood by your interpolate_func.

# try it but the result wont be as good
result_list <- plates_read(
  plates = 1,
  sep = ";",
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5", "CAL6", "CAL7", "CAL8", "CAL9",
    "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.625, 7.8125
  ),
  exclude_cals =  list(plate1 = c("CAL9")),
  model_func = fit_linear,
  interpolate_func = interpolate_linear
)

result_list$plate1$plot

Data sets are stored somewhere else

Your files are stored somewhere else? Just set path = path/to/your/files as a parameter for plates_read().

randomchars42/eenv documentation built on May 20, 2019, 1:29 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

randomchars42/eenv
Import often used functions, settings and packages

In randomchars42/eenv: Import often used functions, settings and packages

Installation

Loading

Data import

Preparing your raw data

Adding names

Adding more information

Calculating concentrations

Getting the data into the programme

Working with the results

If the calibrators aren't so good...

Where to go to from here

Where to get help

Rarely needed

Calibrators do not need to be ln-ln transformed

Data sets are stored somewhere else

R Package Documentation

Browse R Packages

We want your feedback!

randomchars42/eenv Import often used functions, settings and packages

In randomchars42/eenv: Import often used functions, settings and packages

Installation

Loading

Data import

Preparing your raw data

Adding names

Adding more information

Calculating concentrations

Getting the data into the programme

Working with the results

If the calibrators aren't so good...

Where to go to from here

Where to get help

Rarely needed

Calibrators do not need to be ln-ln transformed

Data sets are stored somewhere else

R Package Documentation

Browse R Packages

We want your feedback!

randomchars42/eenv
Import often used functions, settings and packages