View source: R/identifyClusters.R
A function that finds and annotates clusters in a genomic data tibble.
identifyClusters(dataTable, maxDistance, chromHeader = "chrom",
sampleIdHeader = "sampleIDs", positionHeader = "start",
refHeader = "ref", altHeader = "alt",
contextHeader = "surrounding", mutationSymbol = ".",
linkPatterns = TRUE, reverseComplement = FALSE,
searchPatterns = NULL, searchRefHeader = "ref",
searchAltHeader = "alt", searchContextHeader = "surrounding",
searchIdHeader = "process", searchDistanceHeader = "maxDistance",
searchMutationSymbol = ".", searchReverseComplement = TRUE,
linkClustersOnly = TRUE, renameReverse = FALSE, asTibble = TRUE)
dataTable |
A data.frame or tibble that contains at least chromosome
name, sample ID and position information. The data cannot contain any NA.
For an example use |
maxDistance |
A number giving the maximum distance between DNA mutations for them to be considered part of the same cluster. |
chromHeader |
A string with the name of the column with the chromosome name. (The values in this column need to be written as e.g. "chr2".) |
sampleIdHeader |
A string with the name of the column with the sample ID. |
positionHeader |
A string with the name of the column with the position of the mutation. (The data in the column needs to be numeric.) |
refHeader |
A string with the name of the column with the reference nucleotides. |
altHeader |
A string with the name of the column with the alternative nucleotides. |
contextHeader |
A string with the name of the column with the context.
The data inside this column looks like e.g. "C.G", where the "." stands for
the location of the mutation. Which symbol is used to mark this location is
arbitrary, but be sure to adjust the |
mutationSymbol |
A string with the symbol that stands for the mutated
nucleotide location in the |
linkPatterns |
A Boolean that tells whether to try to link the
mutations to patterns. If FALSE then the |
reverseComplement |
A Boolean to tell if the |
searchPatterns |
A tibble with the known mutation patterns. The
|
searchRefHeader |
A string with the name of the column that holds the reference nucleotide in the searchPatterns table. |
searchAltHeader |
A string with the name of the column that holds the alternative nucleotide in the searchPatterns table. |
searchContextHeader |
A string with the name of the column that holds the context nucleotides in the searchPatterns table. |
searchIdHeader |
A string with the name of the column that holds the pattern IDs. |
searchDistanceHeader |
A string with the name of the column that holds the maximum distance between clustered mutations. Not needed if the distance parameter is NULL. NA values within this column are allowed. |
searchMutationSymbol |
A string with the symbol that stands for the mutated
nucleotide location in the column of the |
searchReverseComplement |
A boolean to also search the patterns in the reverse complement of the searchPatterns tibble. |
linkClustersOnly |
A Boolean that tells whether only the clustered mutations
need to be linked with the patterns in the |
renameReverse |
A Boolean that tells whether the ID of the process needs to be renamed.
As a result, the cMut functions will no longer treat the
reverse complement and the non-reverse complement as the same. This parameter
will be irrelevant if |
asTibble |
A Boolean that tells whether the result table has to be a tibble. When it is FALSE, a data.frame is returned. |
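The reverse-complement options above can be illustrated with a small base R sketch. It shows what taking the reverse complement of a context string such as "C.G" could look like. The helper name reverseComplementContext is hypothetical and not part of cMut, and the sketch assumes that the mutation symbol is not itself a nucleotide letter; how cMut handles this internally is not stated on this page.

```r
# Hypothetical helper, NOT part of cMut: reverse complement of context strings.
# Assumes plain A/C/G/T letters; any other character (such as the mutation
# symbol ".") is left unchanged by the complement step.
reverseComplementContext <- function(context) {
  # Swap every nucleotide for its complement
  complemented <- chartr("ACGTacgt", "TGCAtgca", context)
  # Reverse the order of the characters in each string
  vapply(strsplit(complemented, "", fixed = TRUE),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

reverseComplementContext("C.G")   # "C.G"  (this context is its own reverse complement)
reverseComplementContext("TC.A")  # "T.GA"
```

Because the mutation symbol passes through untouched, it still marks the mutated position after the string is reversed.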
The tibble that was passed to this function, with extra columns: clusterId, is.clustered and the distance to the nearest mutation that lies within the maximum distance.
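To make the idea of distance-based clustering more concrete, here is a minimal base R sketch. It is not the implementation used by cMut: the helper sketchClusters is hypothetical, the cluster ID format is invented for illustration, and it assumes that mutations are grouped per sample and per chromosome and that a gap of at most maxDistance keeps two neighbouring mutations in the same cluster.

```r
# Hypothetical helper, NOT the cMut implementation: flag mutations that lie
# within maxDistance of a neighbour in the same sample and chromosome.
sketchClusters <- function(dataTable, maxDistance,
                           chromHeader = "chrom",
                           sampleIdHeader = "sampleIDs",
                           positionHeader = "start") {
  stopifnot(nrow(dataTable) >= 1, !anyNA(dataTable))

  # Sort so that neighbouring rows are neighbouring mutations
  ord <- order(dataTable[[sampleIdHeader]], dataTable[[chromHeader]],
               dataTable[[positionHeader]])
  dataTable <- dataTable[ord, , drop = FALSE]

  group <- paste(dataTable[[sampleIdHeader]], dataTable[[chromHeader]])
  pos <- dataTable[[positionHeader]]
  n <- nrow(dataTable)

  # A new run starts when the sample/chromosome changes or the gap is too large
  newRun <- c(TRUE, (group[-1] != group[-n]) | (diff(pos) > maxDistance))
  runId <- cumsum(newRun)

  # Runs with more than one mutation count as clusters
  runSize <- ave(seq_len(n), runId, FUN = length)
  dataTable$is.clustered <- runSize > 1
  # The ID format below is an assumption made for this sketch only
  dataTable$clusterId <- ifelse(dataTable$is.clustered,
                                paste0("cluster", runId), NA_character_)
  dataTable
}

# Toy data (made up for illustration)
toy <- data.frame(chrom = c("chr1", "chr1", "chr1", "chr2"),
                  sampleIDs = c("s1", "s1", "s1", "s1"),
                  start = c(100, 150, 90000, 120))
res <- sketchClusters(toy, maxDistance = 20000)
res$is.clustered  # TRUE TRUE FALSE FALSE
```

In the toy data only the first two mutations on chr1 are close enough to form a cluster; the third is too far away and the fourth sits on another chromosome.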
testDataSet for an example of data as input for the parameter dataTable
mutationPatterns for looking at the default pattern search table
Use the following code to access the vignette with detailed examples of how to use the functions of cMut: vignette("analysis_of_clusterpattterns",package = "cMut")
# Example data set:
data <- testDataSet
# Example for just clustering:
results <- identifyClusters(dataTable = data,
maxDistance = 20000,
linkPatterns = FALSE)
# Example for clustering and linking patterns with the default searchPattern table:
results <- identifyClusters(dataTable = data,
maxDistance = 20000,
linkPatterns = TRUE)
# For more information about the added columns, use:
cat(comment(results))
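The header arguments exist so that a table with differently named columns can be used without renaming them first. The following sketch shows how that might look for clustering only. The data frame myMutations and its column names are made up for illustration, and the call assumes that with linkPatterns = FALSE only the chromosome, sample ID and position columns are required, as the description of dataTable suggests. The call is guarded so that the snippet also runs when cMut is not installed.

```r
# Made-up example data with non-default column names (illustration only)
myMutations <- data.frame(
  chromosome = c("chr1", "chr1", "chr2"),
  sample     = c("patientA", "patientA", "patientA"),
  pos        = c(1000, 1500, 42000),
  stringsAsFactors = FALSE
)

# Only call identifyClusters when cMut is available
if (requireNamespace("cMut", quietly = TRUE)) {
  results <- cMut::identifyClusters(dataTable = myMutations,
                                    maxDistance = 20000,
                                    chromHeader = "chromosome",
                                    sampleIdHeader = "sample",
                                    positionHeader = "pos",
                                    linkPatterns = FALSE)
}
```

Passing the column names this way leaves the input table untouched, which can be convenient when the same table is also used by other tools.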