View source: R/QC_fgc_crispr_data.R
Produce Quality Control (QC) metrics from the following FGC CRISPR screen data inputs: an analysis config JSON file, a combined counts csv file, a Bagel results tsv file (Control vs Plasmid), a gRNA library tsv file ('cleanr.tsv'), and one or more bcl2fastq2 output JSON files ('Stats.json'). QC output will be returned for a single named comparison.
QC_fgc_crispr_data(
  analysis_config,
  combined_counts,
  bagel_ctrl_plasmid,
  bagel_treat_plasmid,
  bcl2fastq,
  library,
  comparison_name,
  output,
  output_R_object,
  norm_method = "median_ratio",
  mock_missing_data = FALSE
)
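As a rough illustration of the call signature above, the sketch below collects the arguments in a named list and only invokes the function when it is available and the input files exist. Every file name and the comparison name are hypothetical placeholders, not values taken from the package, and the extension chosen for `output_R_object` is an assumption.

```r
# All paths and names below are hypothetical placeholders.
inputs <- list(
  analysis_config     = "analysis_config.json",
  combined_counts     = "combined_counts.csv",
  bagel_ctrl_plasmid  = "bagel_ctrl_vs_plasmid.tsv",
  bagel_treat_plasmid = "bagel_treat_vs_plasmid.tsv",
  bcl2fastq           = "Stats.json",
  library             = "cleanr.tsv",
  comparison_name     = "my_comparison",
  output              = "qc_metrics.csv",
  output_R_object     = "qc_results.rds",  # extension is an assumption
  norm_method         = "median_ratio",
  mock_missing_data   = FALSE
)

# Arguments that are expected to point at existing input files.
file_args <- c("analysis_config", "combined_counts", "bagel_ctrl_plasmid",
               "bagel_treat_plasmid", "bcl2fastq", "library")
files_present <- all(file.exists(unlist(inputs[file_args])))

# Run only if the package providing the function is loaded and inputs exist.
if (exists("QC_fgc_crispr_data") && files_present) {
  qc <- do.call(QC_fgc_crispr_data, inputs)
  str(qc, max.level = 1)  # inspect the top-level elements of the result
}
```

Building the argument list first makes it easy to check the inputs before running the full QC.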
analysis_config
A path to a valid analysis config JSON file to be used as input for the AZ-CRUK CRISPR analysis pipeline.
combined_counts
A path to a valid combined counts csv file (produced by the AZ-CRUK CRISPR analysis pipeline).
bagel_ctrl_plasmid
A path to a valid Bagel tsv results file for Control vs Plasmid. If
bagel_treat_plasmid
A path to a valid Bagel tsv results file for Treatment vs Plasmid. If
bcl2fastq
A character string giving one or more paths to valid
library
A valid path to a library tsv file in which the first column gives the sgRNA sequence and the second column gives the sgRNA ID (produced by the AZ-CRUK CRISPR reference data generation pipeline). May be
comparison_name
A character string naming a single comparison to extract QC data for (should correspond to the comparison name used in the analysis config JSON file).
output
A character string giving an output file name for the csv results. If
output_R_object
A character string giving an output file name for the returned R object (useful for troubleshooting). If
norm_method
A character string naming a normalization method for the count data (default "median_ratio"). Can be
mock_missing_data
A logical indicating whether any missing inputs should be mocked or not. Defaults to FALSE.
There are 7 main steps in the function:
1. Check inputs (mock missing data, if needed).
2. Read data.
3. QC for sequencing metrics (merge with QC data in analysis config).
4. Normalize counts and calculate logFC data.
5. QC for counts and logFC data.
6. QC for Bagel Bayes Factors.
7. Wrap up and write out results (mask any mocked columns).
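The normalization and logFC step can be sketched in base R as follows. This assumes that "median_ratio" refers to the common median-of-ratios approach (size factors taken as the median ratio of each sample to the per-gRNA geometric mean); the source does not confirm the exact method. The helper names, the pseudocount of 1, the mean-based gene-level summary, and the toy counts are all illustrative assumptions.

```r
# Toy count matrix (invented values): rows are gRNAs, columns are samples.
counts <- matrix(
  c(100, 200,  50,
    200, 400, 100,
    300, 600,  75,
    400, 800, 400),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("g", 1:4), c("plasmid", "control", "treatment"))
)

# Hypothetical helper: median-of-ratios normalization.
median_ratio_normalize <- function(counts) {
  log_counts <- log(counts)
  # Per-gRNA log geometric mean across samples.
  log_geo_mean <- rowMeans(log_counts)
  # Drop gRNAs with a zero count in any sample (log geometric mean is -Inf).
  keep <- is.finite(log_geo_mean)
  # Size factor per sample: median ratio to the geometric mean.
  size_factors <- apply(log_counts[keep, , drop = FALSE], 2, function(x) {
    exp(median(x - log_geo_mean[keep]))
  })
  # Divide each column by its size factor.
  sweep(counts, 2, size_factors, "/")
}

# Hypothetical helper: log2 fold change between two samples.
log2_fold_change <- function(norm, numerator, denominator, pseudocount = 1) {
  log2((norm[, numerator] + pseudocount) / (norm[, denominator] + pseudocount))
}

norm_counts <- median_ratio_normalize(counts)
lfc_ctrl  <- log2_fold_change(norm_counts, "control", "plasmid")
lfc_treat <- log2_fold_change(norm_counts, "treatment", "plasmid")

# Gene-level summary: mean gRNA logFC per gene (mapping is invented).
genes <- c("A", "A", "B", "B")
gene_lfc_treat <- tapply(lfc_treat, genes, mean)
```

In the toy data the control counts are exactly twice the plasmid counts, so a depth-only difference like this is removed by the normalization and the control vs plasmid logFC values come out at zero.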
A list containing the following elements:
qc_metrics
- A data frame containing QC metrics as columns and samples as rows; this data will also be written to the output file, if not NULL.
comparisons
- A data frame of samples belonging to the focal comparison.
seq_metrics
- A data frame of CI sequencing metrics at both flowcell and sample levels.
log2FC
- A list containing normalized counts and logFC data frames at both the gRNA and gene level.
bagel_ROC
- A list containing Bagel Bayes Factor data with True_Positive_Rate and False_Positive_Rate columns for specific gene_sets.
bagel_PrRc
- A list containing Precision-Recall data for different sample comparisons.
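To make the True_Positive_Rate and False_Positive_Rate columns concrete, here is a minimal sketch of how ROC and precision-recall values can be derived from Bayes Factors and two reference gene sets. It is not the package's implementation: the function name, the `gene` and `BF` column names, the `Precision` column, and the toy gene sets are assumptions for illustration, and tied Bayes Factors are not given any special handling.

```r
# Toy Bayes Factor table (invented values).
bf <- data.frame(
  gene = c("E1", "E2", "N1", "E3", "N2", "N3"),
  BF   = c(10, 8, 5, 3, -2, -6),
  stringsAsFactors = FALSE
)
essential    <- c("E1", "E2", "E3")  # hypothetical positive gene set
nonessential <- c("N1", "N2", "N3")  # hypothetical negative gene set

# Hypothetical helper: cumulative rates over genes ranked by Bayes Factor.
roc_from_bf <- function(bf, positives, negatives) {
  # Keep only genes belonging to one of the two reference sets.
  bf <- bf[bf$gene %in% c(positives, negatives), ]
  # Rank genes from strongest to weakest evidence.
  bf <- bf[order(bf$BF, decreasing = TRUE), ]
  is_pos <- bf$gene %in% positives
  bf$True_Positive_Rate  <- cumsum(is_pos) / sum(is_pos)
  bf$False_Positive_Rate <- cumsum(!is_pos) / sum(!is_pos)
  # Precision at each rank; recall equals True_Positive_Rate.
  bf$Precision <- cumsum(is_pos) / seq_along(is_pos)
  bf
}

roc <- roc_from_bf(bf, essential, nonessential)
print(roc)
```

Each row of `roc` corresponds to using that gene's Bayes Factor as the classification threshold, so plotting False_Positive_Rate against True_Positive_Rate traces a ROC curve, and True_Positive_Rate against Precision traces a precision-recall curve.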
Alex T. Kalinka, alex.kalinka@cancer.org.uk