View source: R/G4iM.Grinder.Funs.R
A function to detect and analyse quadruplex sequences in a genome. G4iM Grinder can be used to identify and characterize possible G-quadruplex, i-Motif and higher-order structures, and to score them by their probability of in vitro formation and biological relevance. Every step of the search algorithm is highly configurable.
G4iMGrinder(Name, Sequence, DNA = TRUE, Complementary = TRUE, RunComposition = "G", BulgeSize = 1, MaxIL = 3, MaxRunSize = 5, MinRunSize = 3, MinNRuns = 4,
            MaxNRuns = 0, MaxPQSSize = 33, MinPQSSize = 15, MaxLoopSize = 10, MinLoopSize = 0, LoopSeq = c("G", "T", "A", "C"), Method2 = TRUE, Method3 = TRUE, G4hunter = TRUE,
            cGcC = FALSE, PQSfinder = TRUE, Bt = 14, Pb = 17, Fm = 3, Em = 1, Ts = 4, Et = 1, Is = -19, Ei = 1, Ls = -16, ET = 1, WeightParameters = c(50, 50, 0),
            FreqWeight = 0, KnownQuadruplex = TRUE, KnownNOTQuadruplex = FALSE, RunFormula = FALSE, NCores = 1, Verborrea = TRUE)
Name: character, name of the DNA or RNA sequence to grind.
Sequence: character, nucleotide sequence (DNA or RNA) to grind.
DNA: logical, controls if the sequence is DNA or RNA. The factory-fresh default is TRUE.
Complementary: logical, controls if the complementary strand should be created and analyzed. The factory-fresh default is TRUE.
RunComposition: character, nucleotide that composes the runs.
BulgeSize: integer, number of acceptable non-
MaxRunSize: integer, max. number of
MinRunSize: integer, min. number of
MaxLoopSize: integer, max. number of nucleotides that may exist between runs for them to be considered related. The factory-fresh default is 10.
MinLoopSize: integer, min. number of nucleotides that may exist between runs for them to be considered related. The factory-fresh default is 0.
MaxNRuns: integer, max. number of runs that compose a structure. The factory-fresh default is 0.
MinNRuns: integer, min. number of runs that compose a structure. The factory-fresh default is 4.
MaxPQSSize: integer, max. number of nucleotides that compose a structure. The factory-fresh default is 33.
MinPQSSize: integer, min. number of nucleotides that compose a structure. The factory-fresh default is 15.
MaxIL: integer, total number of nucleotides to allow to exist in between all the
Method2: logical, to apply method 2 (M2A) of analysis to the sequence search results. This will search for structures with defined size and number of runs. Depends on variables:
Method3: logical, to apply method 3 (M3A) to the sequence search results. This will search for structures with unrestricted size and number of runs. Useful for searching higher-order structures. Depends on variables:
LoopSeq: character, vector that defines which nucleotides and/or nucleotide patterns to quantify in each structure detected. The factory-fresh default is c("G", "T", "A", "C").
WeightParameters: vector of three integers, where each integer is the weight given to one of the possible scoring systems: G4hunter, PQSfinder and cGcC (in that order). Depends on the scoring system to be
G4hunter: logical, to apply the G4hunter algorithm as a scoring mechanism of in vitro probability of formation. The factory-fresh default is TRUE.
cGcC: logical, to apply the cGcC algorithm as a scoring mechanism of in vitro probability of formation. Only for RNA sequences. The factory-fresh default is FALSE.
PQSfinder: logical, to apply an adaptation of the PQSfinder algorithm as a scoring mechanism of in vitro probability of formation. The factory-fresh default is TRUE.
Bt: integer, tetrad stacking bonus constant used for the PQSfinder adaptation calculations. The factory-fresh default is 14.
Pb: integer, inter-Loop penalization constant used for the PQSfinder adaptation calculations. The factory-fresh default is 17.
Fm: integer, loop length penalization constant used for the PQSfinder adaptation calculations. The factory-fresh default is 3.
Em: integer, loop length exponential constant used for the PQSfinder adaptation calculations. The factory-fresh default is 1.
Ts: integer, tetrad supplement constant used for the PQSfinder adaptation calculations. The factory-fresh default is 4.
Et: integer, inter-Loop supplement constant used for the PQSfinder adaptation calculations. The factory-fresh default is 1.
Is: integer, loop supplement constant used for the PQSfinder adaptation calculations. The factory-fresh default is -19.
Ei: integer, tetrad exponential constant used for the PQSfinder adaptation calculations. The factory-fresh default is 1.
Ls: integer, inter-Loop exponential constant used for the PQSfinder adaptation calculations. The factory-fresh default is -16.
ET: integer, total formula exponential constant used for the PQSfinder adaptation calculations. The factory-fresh default is 1.
KnownQuadruplex: logical, controls if G4iM Grinder should compare the results with a list of known sequences that have already been demonstrated to form in vitro. Only for
KnownNOTQuadruplex: logical, controls if G4iM Grinder should compare the results with a list of known sequences that have already been demonstrated to NOT form in vitro. Only for
FreqWeight: integer, an arbitrary constant used to weigh the importance of the structure frequency. Useful only for M2B and M3B, where the frequency of each structure is calculated and a new score is computed that takes it into account. The factory-fresh default is 0.
RunFormula: logical, controls if the formula of the PQS should be calculated. The factory-fresh default is FALSE.
NCores: integer, number of cores to cede to the function for parallel computation. The factory-fresh default is 1.
Verborrea: logical, allows the function to update the user with its progress. The factory-fresh default is TRUE.
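To make the run and loop arguments more concrete, the following is a minimal base R sketch of what a search constrained by run size, loop size and number of runs can look like. It is only an illustration: the helper build_pqs_regex and its argument names are hypothetical, it ignores bulges (BulgeSize, MaxIL) and the total size limits (MinPQSSize, MaxPQSSize), and it is not the algorithm G4iM Grinder actually uses.

```r
# Hypothetical helper (not part of G4iMGrinder): builds a regular expression
# for n_runs runs of `run`, separated by loops of min_loop..max_loop nucleotides.
build_pqs_regex <- function(run = "G", min_run = 3, max_run = 5,
                            min_loop = 0, max_loop = 10, n_runs = 4) {
  run_part  <- sprintf("%s{%d,%d}", run, min_run, max_run)
  # Lazy quantifier so each loop is kept as short as possible
  loop_part <- sprintf("[ACGTU]{%d,%d}?", min_loop, max_loop)
  paste0(run_part, strrep(paste0(loop_part, run_part), n_runs - 1))
}

pattern <- build_pqs_regex()

# Toy sequence containing four G-runs separated by TTA loops
dna  <- "TTAGGGTTAGGGTTAGGGTTAGGGTT"
hits <- regmatches(dna, gregexpr(pattern, dna, perl = TRUE))[[1]]
print(hits)
```

Changing run to "C" in this sketch mirrors what RunComposition = "C" expresses for i-Motif searches, and larger max_loop values correspond to the more flexible searches that MaxLoopSize allows.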
The result of G4iM Grinder is a list with the elements Configuration, FunTime, PQSM2a, PQSM2b, PQSM3a and PQSM3b.
Start: integer, start position of the sequence in the genome. Only for M2A and M3A.
Finish: integer, end position of the sequence in the genome. Only for M2A and M3A.
Freq: integer, sequence frequency of appearance in the genome. Only for M2B and M3B.
Runs: integer, number of runs (G-runs for PQS, C-runs for PiQS) in the sequence.
IL: integer, number of bulges in the sequence.
mRun: numeric, average run size.
Sequence: character, sequence nucleotide arrangement.
Length: integer, size in nucleotides of the sequence.
Strand: character, strand position of the sequence. "+" is the original and "-" is the complementary strand. Only if Complementary = TRUE.
G4Hunter: numeric, sequence score by G4Hunter. Only if G4hunter = TRUE.
pqsfinder: numeric, sequence score by GiG's PQSfinder. Only if PQSfinder = TRUE.
cGcC: numeric, sequence score by GiG's cGcC. Only if cGcC = TRUE.
Score: numeric, sequence score combining all selected scores and sequence frequency.
Conf.Quad.Seqs: character, name and times found (in parentheses) of known-to-form quadruplexes in the sequence. DNA known-to-form structures have an asterisk (*) after the number of times detected. RNA known-to-form structures have a circumflex (^) after the number of times detected. Only if KnownQuadruplex = TRUE.
Conf.NOT.Quad.Seqs: character, name and times found (in parentheses) of known-NOT-to-form quadruplexes in the sequence. DNA known-NOT-to-form structures have an asterisk (*) after the number of times detected. RNA known-NOT-to-form structures have a circumflex (^) after the number of times detected. Only if KnownNOTQuadruplex = TRUE.
M1 is Method 1. M2 is Method 2. M3 is Method 3.
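As a rough guide to reading the G4Hunter column, the sketch below reproduces the general idea of a G4Hunter-style score in base R: every G is scored by the length of the run it belongs to (capped at 4), every C gets the equivalent negative value, other nucleotides score 0, and the sequence score is the mean. The function name g4hunter_sketch is hypothetical, and this is an assumed simplification rather than the code G4iM Grinder runs internally.

```r
# Hypothetical helper (not part of G4iMGrinder): G4Hunter-style mean score.
g4hunter_sketch <- function(sequence) {
  bases <- strsplit(toupper(sequence), "")[[1]]
  runs  <- rle(bases)
  # Score per run: +run length for G, -run length for C (both capped at 4), 0 otherwise
  per_run <- ifelse(runs$values == "G", pmin(runs$lengths, 4),
             ifelse(runs$values == "C", -pmin(runs$lengths, 4), 0))
  # Expand run scores back to one score per nucleotide and average them
  mean(rep(per_run, times = runs$lengths))
}

g_rich <- g4hunter_sketch("GGGTTAGGGTTAGGGTTAGGG")  # positive: G-rich strand
c_rich <- g4hunter_sketch("CCCTAACCCTAACCCTAACCC")  # negative: C-rich strand
print(c(g_rich, c_rich))
```

Under this scheme G-rich candidates score positive and C-rich candidates (such as potential i-Motifs) score negative, which is worth keeping in mind when sorting results by this column.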
Efres Belmonte-Reche
Belmonte-Reche,E. and Morales,J.C. (2019) G4-iM Grinder: when size and frequency matter. G-Quadruplex, i-Motif and higher order structure search and analysis tool. NAR Genomics and Bioinformatics, 2.
library(G4iMGrinder)

# Retrieving a sequence (seqinr is installed if missing, then attached)
if (!requireNamespace("seqinr", quietly = TRUE)) install.packages("seqinr")
library(seqinr)
Name <- "LmajorESTs"
Sequence <-
  paste0(read.fasta(file = url("http://tritrypdb.org/common/downloads/release-36/Lmajor/fasta/TriTrypDB-36_Lmajor_ESTs.fasta"),
                    as.string = TRUE, legacy.mode = TRUE, seqonly = TRUE, strip.desc = TRUE, seqtype = "DNA"), collapse = "")

# For G-quadruplex search in DNA.
Rs <- G4iMGrinder(Name = Name, Sequence = Sequence)

# For G-quadruplex search in RNA.
Rs <- G4iMGrinder(Name = Name, Sequence = Sequence, DNA = FALSE, cGcC = TRUE)

# For i-Motif search in DNA.
Rs <- G4iMGrinder(Name = Name, Sequence = Sequence, RunComposition = "C")

# For flexible G-quadruplex search in DNA.
Rs <- G4iMGrinder(Name = Name, Sequence = Sequence, BulgeSize = 2, MaxLoopSize = 20, MaxIL = 10)

# Visualization of results
View(Rs$PQSM2a) # To view M2A (Method2a) results. Size dependent structures with overlapping.
View(Rs$PQSM2b) # To view M2B (Method2b) results. Method2 with frequency considerations.
View(Rs$PQSM3a) # To view M3A (Method3a) results. Size independent structures without overlapping.
View(Rs$PQSM3b) # To view M3B (Method3b) results. Method3 with frequency considerations.
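Besides viewing the result tables, they can be filtered and sorted with base R. The sketch below uses a small mock table with some of the documented column names (Start, Finish, Strand, Score); every value in it is invented for illustration, and the Score threshold of 40 is an arbitrary assumption, not a recommended cut-off.

```r
# Mock result table: column names follow the Column Meanings section,
# but all values are made up for this example.
mock <- data.frame(
  Start  = c(10L, 250L, 900L),
  Finish = c(30L, 275L, 925L),
  Strand = c("+", "-", "+"),
  Score  = c(62.5, 18.0, 47.3),
  stringsAsFactors = FALSE
)

# Keep structures at or above the (arbitrary) threshold, highest score first
top <- mock[mock$Score >= 40, ]
top <- top[order(-top$Score), ]
print(top)
```

With real results, the mock table would be replaced by one of the returned tables, for example Rs$PQSM2a.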