Description Usage Arguments Details Value References Examples
Finds mutational clusters after reordering the protein using the traveling salesman approach.
1 2 3 | GraphClust(mutation.data, position.data, insertion.type = "cheapest_insertion", alpha = 0.05,
MultComp = "Bonferroni", fix.start.pos = "Y", Include.Culled = "Y",
Include.Full = "Y")
|
mutation.data |
A matrix of 0's (no mutation) and 1's (mutation) where each column represents an amino acid in the protein and each row represents an individual sample (test subject, cell line, etc). Thus if column i in row j had a 1, that would mean that the ith amino acid for person j had a nonsynonomous mutation. |
position.data |
A dataframe consisting of six columns: 1) Residue Name, 2) Amino Acid number in the protein, 3) Side Chain, 4) X-coordinate, 5) Y-coordinate and 6) Z-coordinate. Please see get.Positions and get.AlignedPositions in the iPAC package for further information on how to construct this matrix. |
insertion.type |
Specifies the type of insertion method used. Please see the TSP package for more details. |
alpha |
The significance level required in order to find a mutational cluster significance. Please see the NMC package for further information. |
MultComp |
The multiple comparison adjustment required as all pairwise mutations are considered. Options are: “Bonferroni", "BH", or "None". |
fix.start.pos |
The TSP package starts the path at a random amino acid. Such that the results are easily reproducible, the default starts the path on the first amino acid in the protein. |
Include.Culled |
If "Y", the standard NMC algorithm will be run on the protein after removing the amino acids for which there is no positional data. |
Include.Full |
If "Y", the standard NMC algorithm will be run on the full protein sequence. |
The protein reordering is done using the TSP package available on CRAN. This hamiltonian path then serves as the new protein ordering.
The position data can be created via the “get.AlignedPositions" or the “get.Positions" functions available via the imported iPAC package.
The mutation matrix must have the default R column headings “V1", “V2",...,“VN", where N is the last amino acid in the protein. No positions should be skipped in the mutaion matrix.
When unmapping back to the original space, the end points of the cluster in the mapped space are used as the endpoints of the cluster in the unmapped space.
Remapped |
This shows the clusters found while taking the 3D structure into account and remapping the protein using a traveling salesman approach. |
OriginalCulled |
This shows the clusters found if you run the NMC algorithm on the canonical linear protein, but with the amino acids for which we don't have 3D positional data removed. |
Original |
This shows the clusters found if you run the NMC algorithn on the canonical linear protein with all the amino acids. |
candidate.path |
This shows the path found by the TSP package that heuristically minimizes the total distance through the protein. |
path.distance |
The length of the candidate path if traveled from start to finish. |
linear.path.distance |
The length of the sequential path 1,2,3...,N (where N is the total number of amino acids in the protein). |
protein.graph |
A graph object created by the igraph package that has edges between amino acids on the candidate.path. This can be passed to plotting functions to create visual represnetations. |
missing.positions |
This shows which amino acids are present in the mutation matrix but for which we do not have positions. These amino acids are cut from the protein when calculating the Remapped and OriginalCulled results. |
Ye et. al., Statistical method on nonrandom clustering with application to somatic mutations in cancer. BMC Bioinformatics. 2010. doi:10.1186/1471-2105-11-11.
Michael Hahsler and Kurt Hornik (2011). Traveling Salesperson Problem (TSP) R package version 1.0-7. http://CRAN.R-project.org/.
Csardi G, Nepusz T: The igraph software package for complex network research, InterJournal, Complex Systems 1695. 2006. http://igraph.sf.net
Gregory Ryslik and Hongyu Zhao (2012). iPAC: Identification of Protein Amino acid Clustering. R package version 1.1.3. http://www.bioconductor.org/.
1 2 3 4 5 6 7 8 9 10 11 12 | ## Not run:
#Load the positional and mutatioanl data
CIF<-"https://files.rcsb.org/view/3GFT.cif"
Fasta<-"https://www.uniprot.org/uniprot/P01116-2.fasta"
KRAS.Positions<-get.Positions(CIF,Fasta, "A")
data(KRAS.Mutations)
#Calculate the required clusters
GraphClust(KRAS.Mutations,KRAS.Positions$Positions,insertion.type = "cheapest_insertion",
alpha = 0.05, MultComp = "Bonferroni")
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.