Set up analysis container for ConcensusGLM using path to data as input. Performs validity checks on the input.
Optionally annotates input with experimental metadata from annotation_filename.
data_filename | Character. Path to a table of counts.
annotation_filename | Character. Optional. Path to experimental annotations. These will be joined to the input data and used for batch correction.
output_path | Character. Path to directory where you want the analysis output. This is also where checkpoints and logs from cluster execution are stored. Default is current working directory.
controls | Named list of Characters with elements
test | Logical. Run in test mode? If so, only reads the first 5 million lines of
checkpoint | Logical. Save intermediate results as checkpoints?
threshold | Numeric. Strains below this total count threshold will be discarded. Default
spike_in | Character. A regular expression to match spike-in controls.
pseudostrains | Logical. Make pseudostrains such as "total", which is the sum of all non-spike-ins.
This creates the concensusDataSet object from data_filename, on which downstream analysis is carried out.
The input CSV should be found at data_filename. This CSV should have at least the headers 'id', 'compound', 'concentration', 'strain', 'plate_name', 'count', 'well', and must have one row per strain-compound-concentration-plate_name-well combination. Together, "id", "plate_name" and "well" define unique experimental samples; "id" refers to sequencing (technical) replicates (if any), and "plate_name" and "well" are biological replicates (recommended). Any given condition should have at least 2 replicates of some kind. If "row" and "column" are present but not "well", then "well" is constructed by concatenating "row" and "column".
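The input requirements above can be sketched as a small check. This is an illustrative Python sketch, not the package's R implementation: the function name `validate_rows` and the constants `REQUIRED` and `KEY` are hypothetical, and the uniqueness key is taken literally from the text (whether "id" also belongs in it is not stated).

```python
# Hypothetical sketch of the input checks described above; not the package's own code.
REQUIRED = ("id", "compound", "concentration", "strain", "plate_name", "count", "well")
# Assumption: the uniqueness key is exactly the combination named in the text.
KEY = ("strain", "compound", "concentration", "plate_name", "well")


def validate_rows(rows, key=KEY):
    """Return copies of rows with 'well' filled in, or raise ValueError."""
    checked, seen = [], set()
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        # Build 'well' from 'row' and 'column' when only those are present.
        if "well" not in row and "row" in row and "column" in row:
            row["well"] = f"{row['row']}{row['column']}"
        missing = [h for h in REQUIRED if h not in row]
        if missing:
            raise ValueError(f"missing required headers: {missing}")
        combo = tuple(row[k] for k in key)
        if combo in seen:
            raise ValueError(f"duplicate combination: {combo}")
        seen.add(combo)
        checked.append(row)
    return checked
```

Rows can come from any source that yields dictionaries, for example `csv.DictReader` from the Python standard library.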
First, this function loads a CSV from data_filename. This may take some time if it is a large file. It then checks that the minimum headers are present.
It then checks for either a negative_control column or a list of control compounds supplied to the controls argument.
Under-represented (assumed to be spurious) strains and plates (as defined by threshold) are removed.
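A minimal Python sketch of this kind of threshold filter follows. The function name `drop_underrepresented` and its `by` parameter are hypothetical, and the exact rule the package applies (for instance whether plates use the same threshold) is an assumption here: a group is kept when its summed count is at least the threshold.

```python
from collections import defaultdict


def drop_underrepresented(rows, threshold, by="strain"):
    """Keep only rows whose group (e.g. strain or plate_name) totals >= threshold."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[by]] += int(row["count"])
    # Assumption: groups whose total is strictly below the threshold are discarded.
    return [row for row in rows if totals[row[by]] >= threshold]
```

Calling it once with `by="strain"` and once with `by="plate_name"` would mirror the two kinds of removal mentioned above.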
Then, pseudo-strains are built if requested (the default). So far, the only pseudo-strain defined is "total", which is the total counts per well of non-spike-in strains.
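The "total" pseudo-strain could be computed along these lines. This is a hedged Python sketch with a hypothetical function name, `add_total_pseudostrain`; it assumes that a "well" here means one experimental sample, identified by "id", "plate_name" and "well" as described earlier, and that spike-ins are recognised by matching the spike_in regular expression against the strain name.

```python
import re
from collections import defaultdict

# Assumption: these three columns identify one experimental sample.
SAMPLE_COLS = ("id", "plate_name", "well")


def add_total_pseudostrain(rows, spike_in=None):
    """Append one 'total' row per sample: the summed count of non-spike-in strains."""
    pattern = re.compile(spike_in) if spike_in else None
    totals = defaultdict(int)
    for row in rows:
        if pattern and pattern.search(row["strain"]):
            continue  # spike-in controls do not contribute to the total
        totals[tuple(row[c] for c in SAMPLE_COLS)] += int(row["count"])
    extra = [
        dict(zip(SAMPLE_COLS, sample), strain="total", count=total)
        for sample, total in totals.items()
    ]
    return list(rows) + extra
```

The appended rows carry only the sample columns, the strain name and the count; copying over other columns such as "compound" is left out to keep the sketch short.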
If defined, annotations (like experimental meta-data) are loaded from annotation_filename. This CSV file needs a column name in common with data_filename (case sensitive), since the next step is a join on the shared columns. Also, every observation in the input data that you want to keep must be annotated.
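The join described above behaves like an inner join on all shared column names, which is why unannotated observations are lost. A rough Python sketch, with the hypothetical name `join_annotations` and the assumption that each key appears at most once in the annotations:

```python
def join_annotations(data_rows, annotation_rows):
    """Inner-join two lists of dicts on every column name they share (case sensitive)."""
    if not data_rows or not annotation_rows:
        return []
    shared = sorted(set(data_rows[0]) & set(annotation_rows[0]))
    if not shared:
        raise ValueError("no column name in common between data and annotations")
    # Assumption: one annotation row per key; later duplicates overwrite earlier ones.
    lookup = {tuple(a[c] for c in shared): a for a in annotation_rows}
    joined = []
    for row in data_rows:
        match = lookup.get(tuple(row[c] for c in shared))
        if match is not None:  # unannotated observations are dropped
            joined.append({**row, **match})
    return joined
```

Because the comparison of column names is case sensitive, an annotation column called `Plate_Name` would not match `plate_name` in the data.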
Finally, it adds a negative_control column if necessary and a positive_control column if defined, and checks that there are at least 2 negative control observations.
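This final step might look roughly like the Python sketch below. The element names of the controls list are not given on this page, so the keys `"negative"` and `"positive"` used here are assumptions, as is the function name `flag_controls`; each is taken to hold a list of control compound names.

```python
def flag_controls(rows, controls=None):
    """Add control flag columns and require at least 2 negative control observations."""
    controls = controls or {}  # hypothetical keys: "negative", "positive"
    flagged = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        if "negative_control" not in row:
            if "negative" not in controls:
                raise ValueError("need a negative_control column or negative control compounds")
            row["negative_control"] = row["compound"] in controls["negative"]
        if "positive" in controls:
            row["positive_control"] = row["compound"] in controls["positive"]
        flagged.append(row)
    # Assumption: an existing negative_control column already holds real booleans,
    # not strings such as "FALSE".
    if sum(bool(r["negative_control"]) for r in flagged) < 2:
        raise ValueError("fewer than 2 negative control observations")
    return flagged
```

An existing negative_control column is left untouched, matching the "if necessary" wording above.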
Value: a list of class "concensusDataSet".
See also: newWorkflow, pipeline, concensusDataSetFromFile