model3: Empirical error model for Illumina Genome Analyzer IIx with...
In mikelove/polyesterAlpineMs: Simulate RNA-seq reads


For each position in a single-end read generated with the specified Illumina chemistry, this data frame contains the probability of not making a sequencing error and the probability of making each of the 4 possible types of sequencing error. The reference base (truth) is in column 1, the read position is in column 7, and the probabilities of sequencing the reference base at that position as each of the 5 possible bases (A, T, G, C, and N) are given in columns 2 through 6, respectively. For example, at position 8 of a read where the true base is A, the probability of correctly calling that base an A is 0.9998, the probability of making an error by sequencing a T is 2.95e-05, by sequencing a G is 1.27e-04, and by sequencing a C is 1.85e-05, and the probability of reading an 'N' at position 8 is 0. This can be seen by looking at model3[model3$pos == 8,]. Note that position indexing is 1-based, though a 0 position is included as described in the GemSIM documentation.
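The layout described above can be sketched with a small stand-in data frame, so that the example runs without the package installed. This is only an illustration: the single row reuses the probabilities quoted above for a true base of A at position 8, and every column name except `pos` (in particular `refbase`) is an assumption, which is why the code indexes columns by position.

```r
# Toy stand-in for one row of model3, built from the values quoted above.
# Column names other than `pos` are assumptions; the real object may differ.
toy <- data.frame(
  refbase = "A",
  A = 0.9998, T = 2.95e-05, G = 1.27e-04, C = 1.85e-05, N = 0,
  pos = 8,
  stringsAsFactors = FALSE
)

# Select the row for a true base of A (column 1) at read position 8 (column 7).
row <- toy[toy$pos == 8 & toy[[1]] == "A", ]

# Columns 2 through 6 hold the call probabilities for A, T, G, C, and N.
probs <- unlist(row[1, 2:6])

# Probability of any sequencing error at this position.
p_error <- 1 - probs[["A"]]

# Draw one observed base call. sample() rescales `prob` internally,
# so the rounded values do not need to sum to exactly 1.
set.seed(1)
observed <- sample(names(probs), size = 1, prob = probs)
```

With the real data, the same positional indexing should apply to the rows returned by model3[model3$pos == 8,], provided the columns are ordered as described.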

data frame named model3, 7 columns, 505 rows

Processed from the Illumina v4 error model that ships with GemSIM (see References).

McElroy KE, Luciani F, Thomas T (2012). GemSIM: general, error-model based simulator of next-generation sequencing data. BMC Genomics 13(1), 74.

mikelove/polyesterAlpineMs documentation built on May 22, 2019, 10:52 p.m.