CELP_bias | R Documentation |
Function to compute codon-level stalling bias coefficients and bias-corrected read counts using the CELP (Consistent Excess of Loess Preds) method
CELP_bias(tr_codon_read_count_list, codon_radius = 5, loess_method = "interpolate", gini_moderation = FALSE)
tr_codon_read_count_list |
A list generated by |
loess_method |
Determines whether the fitted surface should be computed exactly ("direct") or via interpolation from a kd tree ("interpolate"). Default: "interpolate". |
gini_moderation |
Logical argument. If set to TRUE, (bias_coefficient)^(gini_index) is used as correction factor. If set to FALSE, bias_coefficient is used as correction factor. Default: FALSE. |
codon_radius |
Number of codons on either side of each codon influencing the loess prediction for the middle codon. Default: 5. |
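As a rough illustration of the gini_moderation argument, the sketch below compares the two correction factors on made-up bias coefficients. The helper gini_index() is hypothetical and uses the textbook Gini formula on sorted values; the package's internal computation may differ.

```r
# Hypothetical helper: Gini index of a non-negative vector
# (standard formula on values sorted in ascending order).
gini_index <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Made-up bias coefficients for the codons of one transcript.
bias_coefficient <- c(0.8, 1.0, 1.1, 0.9, 4.0, 1.2)

# Gini index of the transcript, computed from its bias coefficients.
g <- gini_index(bias_coefficient)

# gini_moderation = FALSE: the bias coefficient itself is the correction factor.
factor_plain <- bias_coefficient

# gini_moderation = TRUE: (bias_coefficient)^(gini_index) is the correction factor.
# Because the Gini index lies between 0 and 1, every factor moves toward 1.
factor_moderated <- bias_coefficient^g

print(round(cbind(factor_plain, factor_moderated), 3))
```

A transcript with evenly distributed coefficients has a Gini index near 0, so its moderated factors stay close to 1 and the correction is weak; a transcript with strongly uneven coefficients has a larger index and is corrected more strongly.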
This function is the heart of the CELP method for stalling bias detection and correction. It first fits a loess curve to the per-codon read counts along each transcript, borrowing information from neighboring codons to mitigate the uncertainty of p-site offset assignment and experimental stochasticity. The loess span parameter is calculated from the user-defined codon radius and the CDS length. Next, a bias coefficient is calculated for each codon by integrating, across all samples, information on the excess of loess-predicted read counts at that codon compared to the transcript's background. Finally, the loess-predicted read counts are divided by the bias coefficients to obtain bias-corrected counts. The "direct" fitting method takes longer to complete but does not run into kd-tree-related memory issues. The Gini index of each transcript is calculated from the bias coefficients of all of its codons. Gini moderation ensures that the strength of bias correction is proportional to the original level of heterogeneity in read distribution along the transcript.
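The steps described above can be sketched on simulated data with base R's loess(). This is not the package's implementation. The span formula, the per-sample excess (loess prediction divided by the transcript median) and the across-sample median used as the bias coefficient are all assumptions made for illustration, as are the simulated counts. The surface argument of loess.control() is the base R option that the loess_method description mirrors.

```r
set.seed(1)

# Simulated per-codon counts for one transcript in three samples:
# a Poisson background with a shared pile-up around codon 100.
cds_length <- 200
codon_number <- seq_len(cds_length)
counts <- sapply(1:3, function(s) {
  lambda <- rep(10 * s, cds_length)
  lambda[98:102] <- lambda[98:102] * 5
  rpois(cds_length, lambda)
})

# ASSUMPTION: span = fraction of the CDS covered by the window of
# codon_radius codons on either side of the middle codon.
codon_radius <- 5
span <- (2 * codon_radius + 1) / cds_length

# "direct" computes the fit exactly; "interpolate" uses a kd tree.
loess_method <- "direct"
smoothed <- apply(counts, 2, function(y) {
  d <- data.frame(codon = codon_number, count = y)
  fit <- loess(count ~ codon, data = d, span = span,
               control = loess.control(surface = loess_method))
  pmax(predict(fit), 0.1)  # guard against non-positive predictions
})

# ASSUMPTION: per-sample excess relative to the transcript's median level,
# summarised across samples by the median to give one coefficient per codon.
excess <- sweep(smoothed, 2, apply(smoothed, 2, median), "/")
bias_coefficient <- apply(excess, 1, median)

# Loess-predicted counts divided by the bias coefficients
# (each column is one sample, each row one codon).
corrected <- smoothed / bias_coefficient
```

In this toy setting the coefficient near codon 100 is well above 1, so the corrected counts there are pulled down toward the background of each sample.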
A list composed of two lists:
1. bias coefficients
2. bias-corrected read counts

The bias coefficient list has the following structure: list$<transcript.ID> data.frame: [1] codon_number [2] codon_type [3] aa_type [4] bias_coefficient.

The bias-corrected read count list has the following structure: list$<sample.name>$<transcript.ID> data.frame: [1] codon_number [2] codon_type [3] aa_type [4] observed_count [5] bias_coefficient [6] corrected_count.

Gini moderation ensures that the strength of correction is proportional to the original level of heterogeneity in read distribution along the transcript.
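To show how the returned structure might be navigated, the sketch below builds a small mock object with the documented layout and then indexes into it. The names "sample_1" and "ENST_example" and all numeric values are hypothetical. Positional indexing is used for the two top-level lists because this page does not state whether they are named.

```r
# Mock bias coefficient table for one transcript (made-up values).
coeff_df <- data.frame(
  codon_number = 1:3,
  codon_type = c("ATG", "GCC", "AAA"),
  aa_type = c("M", "A", "K"),
  bias_coefficient = c(1.0, 2.5, 0.8)
)

# Mock bias-corrected count table for the same transcript in one sample.
count_df <- data.frame(
  codon_number = 1:3,
  codon_type = c("ATG", "GCC", "AAA"),
  aa_type = c("M", "A", "K"),
  observed_count = c(12, 40, 9),
  bias_coefficient = c(1.0, 2.5, 0.8),
  corrected_count = c(11.5, 15.2, 10.9)
)

# Mock return value: [[1]] bias coefficients, [[2]] bias-corrected counts.
result <- list(
  list(ENST_example = coeff_df),
  list(sample_1 = list(ENST_example = count_df))
)

bias_coefficients <- result[[1]]
corrected_counts <- result[[2]]

# Per-transcript coefficients, then per-sample, per-transcript counts.
tr_coeff <- bias_coefficients[["ENST_example"]]
tr_counts <- corrected_counts[["sample_1"]][["ENST_example"]]

# Codons whose coefficient exceeds an arbitrary threshold of 2.
high_bias <- tr_coeff[tr_coeff$bias_coefficient > 2, ]
print(high_bias)
```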
tr_codon_bias_coeff_corrected_count_LMCN <- CELP_bias(tr_codon_read_count_LMCN)
tr_codon_bias_coeff_corrected_count_LMCN_gini_moderated <- CELP_bias(tr_codon_read_count_LMCN, gini_moderation = TRUE)
tr_codon_bias_coeff_corrected_count_LMCN_direct_fit <- CELP_bias(tr_codon_read_count_LMCN, loess_method = "direct")