A package to load packages, functions and variables I frequently use.
You can install eenv from GitHub with:
# install.packages("devtools")
devtools::install_github("randomchars42/eenv")
library("eenv")
Suppose you have an ods / xls(x) file with raw values obtained from a measurement like this:
data <- utils::read.csv2(
  system.file("extdata", "values.csv", package = "eenv"),
  header = FALSE)
rownames(data) <- LETTERS[1:6]
knitr::kable(
  data,
  row.names = TRUE,
  col.names = as.character(1:6))
Save them as plate_1.csv - that's like an ods / xls(x) file, but it's basically a text file with the values separated by commas (or semicolons for languages that use "," to separate decimals).
In the current versions of LibreOffice / OpenOffice / Microsoft Office there's an option "Save as" > "csv".
Before feeding your samples into your measuring device you most likely drafted some sort of plan which position corresponds to which sample (didn't you?). It may have looked like this:
data <- utils::read.csv2(
  system.file("extdata", "names.csv", package = "eenv"),
  header = FALSE)
rownames(data) <- LETTERS[1:6]
knitr::kable(
  data,
  row.names = TRUE,
  col.names = as.character(1:6))
So you had some calibrators (CAL1 - CAL4) and samples A, B, C, D, E, F, H, I, J, K, L, M each in duplicates.
To easily set the names for your samples just copy the names into your new plate_1.csv:
data <- utils::read.csv2(
  system.file("extdata", "values_names.csv", package = "eenv"),
  header = FALSE)
rownames(data) <- LETTERS[1:12]
knitr::kable(
  data,
  row.names = TRUE,
  col.names = as.character(1:6))
Tell plates_read() your data contains the names and which column should hold those names by setting additional_vars = c("name").
# do not run this code yet
plates_read(
  additional_vars = c("name")
)
It does not matter which name you select for the column. Pick anything that adequately describes the data in the column.
Suppose samples A, B, C, D, E, F were taken at day 1 and H, I, J, K, L, M were taken from the same rats / individuals / patients on day 2.
It would be more elegant to encode that into the data:
data <- utils::read.csv2(
  system.file("extdata", "plate_1.csv", package = "eenv"),
  header = FALSE)
rownames(data) <- LETTERS[1:12]
knitr::kable(
  data,
  row.names = TRUE,
  col.names = as.character(1:6))
So now the name and the day are in the same field, separated by "_". (So make sure the original names do not contain "_" or there will be trouble.. ;) )
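To make the role of the separator concrete, here is a minimal base R sketch of how a combined field can be split at "_". It only illustrates the idea: it is not taken from eenv, plates_read() may split the fields differently internally, and the vector fields is made up for the example.

```r
# Hypothetical combined fields in the form "<name>_<day>" (made up)
fields <- c("A_1", "B_1", "H_2")

# Split every field at the separator "_"
parts <- strsplit(fields, split = "_", fixed = TRUE)

# Collect the pieces into one column per variable
split_fields <- data.frame(
  name = vapply(parts, function(p) p[1], character(1)),
  day = vapply(parts, function(p) p[2], character(1)),
  stringsAsFactors = FALSE
)
split_fields

# A name that itself contains "_" gives more pieces than variables
pieces <- strsplit("my_sample_1", split = "_", fixed = TRUE)[[1]]
length(pieces)
```

The last two lines show why an extra "_" in a name is a problem: the field no longer splits into exactly one name and one day.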
Tell plates_read() your data contains the names and day by setting additional_vars = c("name", "day"):
# do not run this code yet
plates_read(
  additional_vars = c("name", "day")
)
So, your measuring device only gave you raw values (extinction rates / relative light units / ...), but you know the concentrations of CAL1, CAL2, CAL3 and CAL4. To get the concentrations for the rest of the samples you need to tell plates_read() which samples are your calibrators and what their actual concentration is.
You do this via cal_names and cal_values. Both expect a vector (that's a series of text strings / numbers / values...). You can create a vector using c(). Just put whatever you like in its arguments:
# try it
like <- c("sunny day", "cake", "coffee", "more coffee", "chocolate", "pizza")
like
# try it
fibonacci <- c(1, 1, 2, 3, 5, 8, 13)
fibonacci
You already created your first vector when you said: additional_vars = c("name", "day") ;) .
So, now create some vectors for your calibrators:
# try it
calibrator_names <- c(
  "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
  "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
)
calibrator_values <- c(
  4000, 2000, 1000, 500, 250,
  125, 62.5, 31.25, 15.625, 7.8125
)
Tell plates_read() about it:
# do not run this code yet
calibrator_names <- c(
  "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
  "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
)
calibrator_values <- c(
  4000, 2000, 1000, 500, 250,
  125, 62.5, 31.25, 15.625, 7.8125
)
plates_read(
  additional_vars = c("name", "day"),
  cal_names = calibrator_names,
  cal_values = calibrator_values
)
The same, but a bit shorter:
# do not run this code yet
plates_read(
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
    "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250,
    125, 62.5, 31.25, 15.625, 7.8125
  )
)
plates_read() automagically reads plate_1.csv in your current directory. If you have data from more than one plate use plates = 2 to read plate_1.csv AND plate_2.csv. You can read as many plates as you like, as long as their numbers are in a sequence and no numbers are missing.
For now, tell plates_read() to read only your first plate:
# do not run this code yet
plates_read(
  plates = 1,
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
    "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250,
    125, 62.5, 31.25, 15.625, 7.8125
  )
)
You don't have any data yet but want to play a bit?
This saves the example data into plate_1.csv (but first it checks whether a file with that name already exists, because otherwise it would overwrite it):
# try it
# (don't worry if it looks weird ;) )
if (! file.exists("plate_1.csv")) {
  # write.table() is used here because write.csv2() ignores col.names = FALSE
  utils::write.table(
    x = utils::read.csv2(
      file = system.file("extdata", "values_names.csv", package = "eenv"),
      header = FALSE),
    file = "plate_1.csv",
    sep = ";",
    dec = ",",
    row.names = FALSE,
    col.names = FALSE
  )
}
One thing to note for .csv files: some languages use "." to separate decimals; in those languages .csv usually uses "," to separate values. Some European languages use "," for decimals; in these languages ";" is used to separate values. You need to tell plates_read() which one your file uses with sep = "," or sep = ";", respectively:
# do not run this code yet
plates_read(
  plates = 1,
  sep = ";",
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
    "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250,
    125, 62.5, 31.25, 15.625, 7.8125
  )
)
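If you are unsure what the two conventions look like, the following base R sketch reads the same two numbers written both ways. It uses only utils::read.csv() and utils::read.csv2() on made-up text and says nothing about how plates_read() handles sep internally.

```r
# The same two values written in both conventions (made-up example data)
comma_separated <- "1.5,2.25"      # "." for decimals, "," between values
semicolon_separated <- "1,5;2,25"  # "," for decimals, ";" between values

# read.csv() expects sep = "," and dec = "."
a <- utils::read.csv(text = comma_separated, header = FALSE)

# read.csv2() expects sep = ";" and dec = ","
b <- utils::read.csv2(text = semicolon_separated, header = FALSE)

# Both calls give the same numbers
a
b
```

Reading a file with the wrong convention typically leaves you with a single text column instead of numbers, which is a quick way to spot a wrong sep.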
plates_read() does a lot and wants to tell you about it, including a table ("tibble" or "tbl" for short) with all the values it calculated.
To tell you about everything it returns a list. Lists are like vectors, but you can store a lot of different things in them:
# try it
my_list <- list(a = 1, favourite_food = "pizza", age = 20, a_vector = c(1, 2, 3))
my_list
You can access each part of the list in a lot of different ways:
Using "$":
# try it
my_list$age
Using the position:
# try it
my_list[[3]]
Using the name in double square brackets:
# try it
my_list[["a_vector"]]
This works for setting values as well:
# try it
my_list$age <- 30
my_list$age
Now it's time to run plates_read():
# now you may run it :)
result_list <- plates_read(
  plates = 1,
  sep = ";",
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
    "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250,
    125, 62.5, 31.25, 15.625, 7.8125
  )
)
result_list <- eenv::plates_read(
  plates = 1,
  sep = ";",
  path = system.file("extdata", package = "eenv"),
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
    "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250,
    125, 62.5, 31.25, 15.625, 7.8125
  ),
  write_data = FALSE,
  use_written_data = FALSE
)
The resulting list is now stored in result_list. In addition, plates_read() created two files in your current directory: data_all.csv and data_samples.csv with all the data ;) .
plates_read()'s list holds the following items:
$all: here you will find all the data, including calibrators, duplicates, ... (saved in data_all.csv)
$samples: only samples here - no calibrators, no duplicates -> most often you will work with this data (saved in data_samples.csv)
$plate1: another list ;)
  $plot: a plot showing you the linear function used to calculate the concentrations for this plate. The points are the calibrators. They should lie more or less close to the line.
  $model: the model - you won't need this too often ;)
($plate2): the same information for every plate you have
Take a look:
result_list$all
knitr::kable(result_list$all)
result_list$samples
knitr::kable(result_list$samples)
result_list$plate1$plot
The lowest calibrator ("CAL9") does not seem to fit too well. So perhaps you get a better result when removing it? You can do that by setting exclude_cals = list(plate1 = c("CAL9")).
(But obviously, you won't be able to interpret values below "CAL8", other than that they are "below CAL8"!).
# now you may run it :)
result_list <- plates_read(
  plates = 1,
  sep = ";",
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
    "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250,
    125, 62.5, 31.25, 15.625, 7.8125
  ),
  exclude_cals = list(plate1 = c("CAL9"))
)
result_list <- eenv::plates_read(
  plates = 1,
  sep = ";",
  path = system.file("extdata", package = "eenv"),
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
    "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250,
    125, 62.5, 31.25, 15.625, 7.8125
  ),
  exclude_cals = list(plate1 = c("CAL9")),
  write_data = FALSE,
  use_written_data = FALSE
)
Take a look at the fit of the line:
# try it
result_list$plate1$plot
Better! But be careful when interpreting values below "CAL8" (the lowest calibrator in use now).
First of all, to save a couple of keystrokes and make remembering easier:
# try it
my_data <- result_list$samples
You could check which duplicates have a high (> 20 %) coefficient of variation:
# try it
my_data %>%
  filter(concentration_cv > 0.20)
Luckily, there are no such samples. Note that the result was not stored.
If there were, you could exclude them like this:
# try it
my_data <- my_data %>%
  filter(concentration_cv < 0.20)
my_data
# try it
my_data <- my_data %>%
  filter(concentration_cv < 0.20)
knitr::kable(my_data)
This time the result was stored. Be careful when overwriting data. You can always go back and run plates_read() again :).
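For readers who have not met the coefficient of variation before, here is a small base R sketch of the usual definition (standard deviation divided by the mean) applied to made-up duplicates. It assumes that concentration_cv follows this definition; the exact calculation inside plates_read() is not shown in this document.

```r
# Made-up duplicate measurements of two samples
duplicates <- list(
  sample_a = c(100, 110),
  sample_b = c(100, 160)
)

# Coefficient of variation: standard deviation relative to the mean
cv <- vapply(duplicates, function(x) stats::sd(x) / mean(x), numeric(1))
cv

# Flag samples whose coefficient of variation exceeds 20 %
high_cv <- cv > 0.20
high_cv
```

Here only sample_b is flagged, because its two values are far apart relative to their mean.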
Now, given the concentration of your calibrators was in "ng / ml" but your editor wants you to use SI units you could convert the concentrations like this:
# try it
my_data <- my_data %>%
  mutate(
    concentration = convert_conc(x = concentration, from = "ng / ml", to = "nmol / l", molar_mass = 52391)
  )
my_data
# try it
my_data <- my_data %>%
  mutate(
    concentration = convert_conc(x = concentration, from = "ng / ml", to = "nmol / l", molar_mass = 52391)
  )
knitr::kable(my_data)
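The arithmetic behind such a conversion can be checked by hand. The sketch below uses a hypothetical helper, ng_ml_to_nmol_l(), which is not part of eenv and only covers this one pair of units; it assumes the molar mass is given in g / mol.

```r
# Hypothetical helper (not part of eenv): convert ng / ml to nmol / l
# molar_mass is assumed to be given in g / mol
ng_ml_to_nmol_l <- function(x, molar_mass) {
  # 1 ng / ml equals 1e-6 g / l
  grams_per_litre <- x * 1e-6
  # dividing by the molar mass gives mol / l
  mol_per_litre <- grams_per_litre / molar_mass
  # 1 mol equals 1e9 nmol
  mol_per_litre * 1e9
}

# 52391 ng / ml of a substance with a molar mass of 52391 g / mol
converted <- ng_ml_to_nmol_l(52391, molar_mass = 52391)
converted
```

The example value is chosen so that the result is easy to verify: 52391 ng / ml of a substance with 52391 g / mol is 1 µmol / l, i.e. 1000 nmol / l.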
Or you could create a plot like this:
# try it
# use our data
# use column "name" as x-axis
# use column "concentration" as y-axis
ggplot(data = my_data, aes(x = name, y = concentration, colour = name)) +
  # plot each value as a point
  geom_point() +
  # draw a sub-plot (facet) for each day
  facet_wrap(~day)
Or a boxplot:
# try it
# use our data
# use column "day" as x-axis
# use column "concentration" as y-axis
ggplot(data = my_data, aes(x = day, y = concentration)) +
  # draw boxplots
  geom_boxplot() +
  # draw each value as a point over the boxplot
  geom_jitter(width = 0.2)
# you could try
# geom_point() as well ;)
The INTERNET ;)
To learn more about working with data:
dplyr, good documentation with many examples: http://dplyr.tidyverse.org/
dplyr and ggplot: http://r4ds.had.co.nz/
To learn more about presenting data in plots:
Your calibrators are linear? You can use: model_func = fit_linear and interpolate_func = interpolate_linear. Basically, you can use any function as model_func that returns a model which is understood by your interpolate_func.
# try it but the result won't be as good
result_list <- plates_read(
  plates = 1,
  sep = ";",
  additional_vars = c("name", "day"),
  cal_names = c(
    "CAL1", "CAL2", "CAL3", "CAL4", "CAL5",
    "CAL6", "CAL7", "CAL8", "CAL9", "CAL10"
  ),
  cal_values = c(
    4000, 2000, 1000, 500, 250,
    125, 62.5, 31.25, 15.625, 7.8125
  ),
  exclude_cals = list(plate1 = c("CAL9")),
  model_func = fit_linear,
  interpolate_func = interpolate_linear
)
result_list$plate1$plot
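To give a feeling for what a linear calibration does in general, here is a base R sketch with stats::lm() on made-up calibrator data. It is not the implementation of fit_linear or interpolate_linear, whose internals and expected signatures are not shown in this document; the function signal_to_concentration() is hypothetical.

```r
# Made-up calibrator data: known concentrations and measured raw signals
calibrators <- data.frame(
  concentration = c(500, 250, 125, 62.5),
  signal = c(1.5, 1.0, 0.75, 0.625)
)

# Fit a straight line: signal = intercept + slope * concentration
model <- stats::lm(signal ~ concentration, data = calibrators)
intercept <- unname(stats::coef(model)[1])
slope <- unname(stats::coef(model)[2])

# Hypothetical helper: invert the line to turn a raw signal into a concentration
signal_to_concentration <- function(signal) {
  (signal - intercept) / slope
}

# Interpolate two made-up sample signals
estimated <- signal_to_concentration(c(0.9, 1.2))
estimated
```

The fit is the "model" step and the inversion is the "interpolate" step, which is the same division of labour that model_func and interpolate_func describe.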
Your files are stored somewhere else? Just set path = "path/to/your/files" as a parameter for plates_read().