View source: R/generate_random_db.R
This function generates a database of random sequences using a restricted randomization procedure that shuffles the amino acids of the input database over its peptide length distribution. It also respects the N-terminal pyroglutamination and C-terminal amidation frequencies. If we plot the sorted molecular weights of the mock database against the sorted molecular weights of the input database, we expect the points to lie approximately on y = x. The generate_random_db function is used in the false discovery estimation of pep.id.
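The y = x expectation can be checked with a few lines of base R. The sketch below is only an illustration: compare_sorted_mw is a hypothetical helper (not part of the package), and it assumes the molecular weights of both databases are already available as numeric vectors of equal length.

```r
# Hypothetical helper, not part of the package: pairs the sorted molecular
# weights of the real and mock databases and optionally plots them.
compare_sorted_mw <- function(mw_real, mw_mock, draw = TRUE) {
  # Sorted values can only be paired one-to-one if both sets have equal size
  stopifnot(is.numeric(mw_real), is.numeric(mw_mock),
            length(mw_real) == length(mw_mock))
  pairs <- data.frame(real = sort(mw_real), mock = sort(mw_mock))
  if (draw) {
    plot(pairs$real, pairs$mock,
         xlab = "Sorted MW, input database",
         ylab = "Sorted MW, mock database")
    abline(0, 1, lty = 2)  # reference line y = x
  }
  invisible(pairs)
}

# Toy values for illustration only (not real peptide masses)
set.seed(42)
toy_real <- runif(50, min = 500, max = 3000)
toy_mock <- toy_real + rnorm(50, sd = 20)
res <- compare_sorted_mw(toy_real, toy_mock, draw = FALSE)
```

If the mock database follows the mass distribution of the input, the points should scatter closely around the dashed reference line.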
generate_random_db(db, size = 1, plot = F, verbose = F)
db | A database with the first 3 columns
size | Numeric. The desired mock database size as a proportion of the original database size.
plot | Logical. If TRUE, the mock database is plotted against the original database. Only works if the mock and real databases are of equal size.
verbose | Logical. If TRUE, some properties of the mock database are printed in the terminal.
A mock database is generated based on the input database. This function works as follows: all peptide lengths (in number of amino acids) of the input database are stored in one vector, and all amino acids of the input are stored in a second vector. Next, samples with replacement are taken from the length distribution, and peptides with these lengths are generated by sampling with replacement from the amino acid vector. In a last step, amidations and pyroglutaminations are added with a probability equal to their proportion in the input database. As a result, all masses in the newly generated database are realistic peptide masses.
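The procedure described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: sketch_random_db, its arguments (a character vector of sequences plus two logical vectors flagging the modifications) and the output column names are all hypothetical, and the sketch assigns pyroglutamination without checking the N-terminal residue.

```r
# Hypothetical sketch of the described procedure; not the package's code.
# sequences:  character vector of peptide sequences (one-letter codes)
# nterm_pyro: logical vector, TRUE where the input peptide is pyroglutaminated
# cterm_amid: logical vector, TRUE where the input peptide is amidated
sketch_random_db <- function(sequences, nterm_pyro, cterm_amid, size = 1) {
  # Vector 1: peptide lengths (number of amino acids) of the input
  lengths_in <- nchar(sequences)
  # Vector 2: all amino acids of the input, pooled together
  aa_pool <- unlist(strsplit(sequences, ""))
  n_out <- round(length(sequences) * size)

  # Sample lengths with replacement (indexing avoids the special case
  # where sample() treats a single number n as 1:n)
  new_lengths <- lengths_in[sample.int(length(lengths_in), n_out, replace = TRUE)]

  # Build each peptide by sampling residues with replacement from the pool
  new_seqs <- vapply(new_lengths, function(len) {
    paste(aa_pool[sample.int(length(aa_pool), len, replace = TRUE)], collapse = "")
  }, character(1))

  # Add modifications with a probability equal to their input proportion
  data.frame(
    sequence = new_seqs,
    pyroglu  = runif(n_out) < mean(nterm_pyro),
    amidated = runif(n_out) < mean(cterm_amid),
    stringsAsFactors = FALSE
  )
}

# Usage with three toy peptides, doubling the database size
set.seed(1)
seqs <- c("QDLDHVFLRF", "AAK", "GPSGFLGMR")
mock <- sketch_random_db(seqs, c(TRUE, FALSE, FALSE), c(TRUE, FALSE, TRUE), size = 2)
```

Because both lengths and residues are drawn from the input itself, every mock peptide has a length that occurs in the input and contains only residues that occur in the input.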
A database of random peptides.
Rik Verdonck