knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
pepr
This vignette shows how and why to use the subsample table functionality of the pepr package.
For basic information about the PEP concept, visit the project website.
For a broader theoretical description, see the subsample table documentation section.
The examples below demonstrate how and why to use the sample subannotation functionality to provide multiple input files of the same type for a single sample.
This example demonstrates how the sample subannotation functionality is used. Two samples (frog_1 and frog_2) have multiple input files that need merging, while one sample (frog_3) does not. Therefore, frog_3 specifies its file in the sample_table.csv file, while the others leave that field blank and instead specify several files in the subsample_table.csv file.
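The merging described above can be sketched in base R without pepr. The two data frames below are hypothetical stand-ins for sample_table.csv and subsample_table.csv (the file paths are invented for illustration), and the collapsing step only approximates what happens when the Project object is built; it is not pepr's own code.

```r
# Hypothetical stand-in for sample_table.csv: only frog_3 names its file here
samples <- data.frame(
  sample_name = c("frog_1", "frog_2", "frog_3"),
  file = c(NA, NA, "data/frog3_data.txt"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for subsample_table.csv: one row per input file
subsamples <- data.frame(
  sample_name = c("frog_1", "frog_1", "frog_2", "frog_2"),
  file = c("data/frog1a_data.txt", "data/frog1b_data.txt",
           "data/frog2a_data.txt", "data/frog2b_data.txt"),
  stringsAsFactors = FALSE
)

# Collect the subsample files for each sample name
filesBySample <- split(subsamples$file, subsamples$sample_name)

# Samples listed in the subsample table take the collected files;
# the remaining samples keep the value from the sample table
merged <- lapply(seq_len(nrow(samples)), function(i) {
  name <- samples$sample_name[i]
  if (name %in% names(filesBySample)) filesBySample[[name]] else samples$file[i]
})
names(merged) <- samples$sample_name
merged
```

In this sketch frog_1 and frog_2 each end up with two files, and frog_3 keeps the single file from the sample table.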
This example is made up of these components:
branch = "master"
library(pepr)
projectConfig = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable1",
  "project_config.yaml",
  package = "pepr"
)
# Print the project configuration file
.printNestedList(yaml::read_yaml(projectConfig))
library(knitr)
sampleAnnotation = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable1",
  "sample_table.csv",
  package = "pepr"
)
# Display the sample table
sampleAnnotationDF = read.table(sampleAnnotation, sep = ",", header = TRUE)
kable(sampleAnnotationDF, format = "html")
sampleAnnotation = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable1",
  "subsample_table.csv",
  package = "pepr"
)
# Display the subsample table
sampleAnnotationDF = read.table(sampleAnnotation, sep = ",", header = TRUE)
kable(sampleAnnotationDF, format = "html")
Let's create the Project object and check whether multiple files are present:
projectConfig1 = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable1",
  "project_config.yaml",
  package = "pepr"
)
p1 = Project(projectConfig1)
# Check the files
p1Samples = sampleTable(p1)
p1Samples$file
# Check the subsample names
p1Samples$subsample_name
And inspect the whole table in the p1@samples slot:
kable(p1Samples)
You can also access a single subsample by calling the getSubsample method with the appropriate sample_name and subsample_name attribute combination. Note that this is only possible if the subsample_name column is defined in the subsample_table.csv file.
sampleName = "frog_1"
subsampleName = "sub_a"
getSubsample(p1, sampleName, subsampleName)
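Conceptually, this lookup selects the row that matches both identifiers. The base R sketch below illustrates the idea on a hypothetical subsample table; pickSubsample is a made-up helper and the file paths are invented, so this is not how getSubsample is implemented in pepr.

```r
# Hypothetical subsample table with a subsample_name column
subsamples <- data.frame(
  sample_name = c("frog_1", "frog_1", "frog_2"),
  subsample_name = c("sub_a", "sub_b", "sub_a"),
  file = c("data/frog1a_data.txt", "data/frog1b_data.txt", "data/frog2a_data.txt"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the row matching both identifiers
pickSubsample <- function(tbl, sampleName, subsampleName) {
  # The lookup is impossible without a subsample_name column
  if (!"subsample_name" %in% colnames(tbl)) {
    stop("the table has no subsample_name column")
  }
  hit <- tbl[tbl$sample_name == sampleName &
               tbl$subsample_name == subsampleName, , drop = FALSE]
  if (nrow(hit) == 0) {
    stop("no such sample_name and subsample_name combination")
  }
  hit
}

picked <- pickSubsample(subsamples, "frog_1", "sub_a")
picked$file
```

The explicit check for the subsample_name column mirrors the requirement stated above: without that column there is nothing to match the second identifier against.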
This example uses a subsample_table.csv file and derived attributes to point to files. It is a rather complex example. Notice that we must include the file_id column in the sample_table.csv file and leave it blank; it is then populated for only some of the samples (frog_1 and frog_2) from the subsample_table.csv file, and is left empty for the samples that are not merged.
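To make the interplay of file_id and derived attributes more concrete, here is a minimal base R sketch of template filling. It assumes a path template with placeholders in curly braces; the template string, the fillTemplate helper, and the file_id values are all hypothetical and are not taken from the example project.

```r
# Hypothetical path template; placeholders are attribute names in braces
template <- "data/{sample_name}{file_id}_data.txt"

# Hypothetical subsample rows that supply file_id for the merged samples
subsamples <- data.frame(
  sample_name = c("frog_1", "frog_1", "frog_2", "frog_2"),
  file_id = c("a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace each {attribute} with its value
fillTemplate <- function(template, attrs) {
  out <- template
  for (key in names(attrs)) {
    out <- gsub(paste0("{", key, "}"), attrs[[key]], out, fixed = TRUE)
  }
  out
}

# Build one path per subsample row
paths <- vapply(seq_len(nrow(subsamples)), function(i) {
  fillTemplate(template, as.list(subsamples[i, ]))
}, character(1))
paths
```

Each subsample row yields its own path, which is why a sample with several file_id values ends up pointing to several files.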
This example is made up of these components:
projectConfig = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable2",
  "project_config.yaml",
  package = "pepr"
)
# Print the project configuration file
.printNestedList(yaml::read_yaml(projectConfig))
sampleAnnotation = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable2",
  "sample_table.csv",
  package = "pepr"
)
# Display the sample table
sampleAnnotationDF = read.table(sampleAnnotation, sep = ",", header = TRUE)
kable(sampleAnnotationDF, format = "html")
sampleAnnotation = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable2",
  "subsample_table.csv",
  package = "pepr"
)
# Display the subsample table
sampleAnnotationDF = read.table(sampleAnnotation, sep = ",", header = TRUE)
kable(sampleAnnotationDF, format = "html")
Let's load the project config, create the Project object, and check whether multiple files are present:
projectConfig2 = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable2",
  "project_config.yaml",
  package = "pepr"
)
p2 = Project(projectConfig2)
# Check the files
p2Samples = sampleTable(p2)
p2Samples$file
And inspect the whole table in the p2@samples slot:
kable(p2Samples)
This example gives exactly the same results as Example 2, but uses a wildcard for frog_2 instead of listing it in the subsample_table.csv file. Since a wildcard and a subannotation can't be used for the same sample, this requires specifying a second data source class (local_files_unmerged) whose path uses an asterisk (*).
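The effect of an asterisk in a path can be illustrated with base R's Sys.glob. The sketch below creates a few empty, hypothetically named files in a temporary directory and expands a wildcard pattern against them; it only shows how a single pattern can stand for several files and makes no claim about how pepr resolves wildcards internally.

```r
# Create a scratch directory with hypothetical, empty input files
dataDir <- file.path(tempdir(), "wildcard_demo")
dir.create(dataDir, showWarnings = FALSE)
created <- file.create(file.path(
  dataDir,
  c("frog2a_data.txt", "frog2b_data.txt", "frog3_data.txt")
))

# One pattern with an asterisk matches every frog2 file
pattern <- file.path(dataDir, "frog2*_data.txt")
matched <- Sys.glob(pattern)
basename(matched)
```

Here a single path entry for frog_2 expands to both of its files, so no subsample rows are needed for that sample.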
This example is made up of these components:
projectConfig = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable3",
  "project_config.yaml",
  package = "pepr"
)
# Print the project configuration file
.printNestedList(yaml::read_yaml(projectConfig))
sampleAnnotation = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable3",
  "sample_table.csv",
  package = "pepr"
)
# Display the sample table
sampleAnnotationDF = read.table(sampleAnnotation, sep = ",", header = TRUE)
kable(sampleAnnotationDF, format = "html")
sampleAnnotation = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable3",
  "subsample_table.csv",
  package = "pepr"
)
# Display the subsample table
sampleAnnotationDF = read.table(sampleAnnotation, sep = ",", header = TRUE)
kable(sampleAnnotationDF, format = "html")
Let's load the project config, create the Project object, and check whether multiple files are present:
projectConfig3 = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable3",
  "project_config.yaml",
  package = "pepr"
)
p3 = Project(projectConfig3)
# Check the files
p3Samples = sampleTable(p3)
p3Samples$file
And inspect the whole table in the p3@samples slot:
kable(p3Samples)
Merging applies to inputs of the same class (for example, multiple files for read1). Inputs of different classes (for example, read1 versus read2) are handled by separate attributes (columns). This example shows how to handle paired-end data while also merging within each read attribute.
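The distinction between merging within an attribute and keeping attributes separate can be sketched in base R. The subsample table below is a hypothetical stand-in with invented FASTQ file names; each row pairs one read1 file with its read2 mate, and the two columns are collected per sample independently.

```r
# Hypothetical paired-end subsample table: one row per pair of files
subsamples <- data.frame(
  sample_name = c("frog_1", "frog_1", "frog_2", "frog_2"),
  read1 = c("frog1a_R1.fq", "frog1b_R1.fq", "frog2a_R1.fq", "frog2b_R1.fq"),
  read2 = c("frog1a_R2.fq", "frog1b_R2.fq", "frog2a_R2.fq", "frog2b_R2.fq"),
  stringsAsFactors = FALSE
)

# Merge within each attribute, never across attributes
read1BySample <- split(subsamples$read1, subsamples$sample_name)
read2BySample <- split(subsamples$read2, subsamples$sample_name)

# Row order is preserved, so the i-th read1 file of a sample
# still pairs with its i-th read2 file
pairsFrog1 <- data.frame(
  read1 = read1BySample[["frog_1"]],
  read2 = read2BySample[["frog_1"]],
  stringsAsFactors = FALSE
)
pairsFrog1
```

Keeping read1 and read2 as separate columns means a downstream tool can receive the merged read1 files and the merged read2 files as two distinct inputs.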
This example is made up of these components:
project_config = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable4",
  "project_config.yaml",
  package = "pepr"
)
# Print the project configuration file
.printNestedList(yaml::read_yaml(project_config))
sampleAnnotation = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable4",
  "sample_table.csv",
  package = "pepr"
)
# Display the sample table
sampleAnnotationDF = read.table(sampleAnnotation, sep = ",", header = TRUE)
kable(sampleAnnotationDF, format = "html")
sampleAnnotation = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable4",
  "subsample_table.csv",
  package = "pepr"
)
# Display the subsample table
sampleAnnotationDF = read.table(sampleAnnotation, sep = ",", header = TRUE)
kable(sampleAnnotationDF, format = "html")
Let's load the project config, create the Project object, and check whether multiple files are present:
projectConfig4 = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_subtable4",
  "project_config.yaml",
  package = "pepr"
)
p4 = Project(projectConfig4)
# Check the read1 and read2 columns
p4Samples = sampleTable(p4)
p4Samples$read1
p4Samples$read2
And inspect the whole table in the p4@samples slot:
kable(p4Samples)