groupClusters: groupClusters

View source: R/groupClusters.R

Description

Groups the mutations in a table by cluster and, if requested, finds the intersection of the patterns shared by the mutations within each cluster. It can also search for cluster patterns.

Usage

groupClusters(dataTable, clusterIdHeader = "clusterId",
  refHeader = "ref", altHeader = "alt",
  contextHeader = "surrounding", mutationSymbol = ".",
  asTibble = TRUE, patternIntersect = TRUE,
  searchClusterPatterns = TRUE, patternHeader = "linkedPatterns",
  showWarning = TRUE, searchPatterns = NULL, searchRefHeader = "ref",
  searchAltHeader = "alt", searchIdHeader = "process",
  searchDistanceHeader = "maxDistance", searchReverseComplement = TRUE,
  renameReverse = FALSE, reverseComplement = FALSE)

Arguments

dataTable

A table with columns containing the cluster IDs and the reference and alternative nucleotides. See the output of the identifyClusters function for more information about the table.

clusterIdHeader

Contains the name of the column with the cluster IDs.

refHeader

Contains the name of the column with the reference nucleotides.

altHeader

Contains the name of the column with the alternative nucleotides.

contextHeader

A string with the name of the column that holds the context. The data in this column look like "C.G", where the "." marks the location of the mutation. The symbol used to mark this location is arbitrary, but be sure to set mutationSymbol accordingly when searching for patterns. The contextHeader is irrelevant if linkPatterns is FALSE.

mutationSymbol

A string with the symbol that marks the location of the mutated nucleotide in the context (e.g. "." in "G.C").
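As a small illustration of this notation, the sketch below uses base R only to show how a context such as "C.G" relates to the reference and alternative nucleotides. The variable values are hypothetical, and this is not meant to reflect the package's internal code.

```r
# Hypothetical values for a single mutation (not taken from the package data)
context        <- "C.G"  # surrounding nucleotides; "." marks the mutated site
mutationSymbol <- "."
ref            <- "A"
alt            <- "T"

# fixed = TRUE makes sub() treat "." as a literal character
# instead of a regular-expression wildcard
refSequence <- sub(mutationSymbol, ref, context, fixed = TRUE)
altSequence <- sub(mutationSymbol, alt, context, fixed = TRUE)

print(refSequence)
print(altSequence)
```

If a different symbol is used in the context column, only the value of mutationSymbol in this sketch would need to change.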

asTibble

A Boolean indicating whether the result table has to be a tibble. When FALSE, a data.frame is returned.

patternIntersect

A Boolean indicating whether the table contains patterns that need to be processed as well.

searchClusterPatterns

A Boolean indicating whether to search for cluster patterns (e.g. GA > TT).

patternHeader

A string with the name of the column that holds the patterns found by the identifyClusters function. Only used when patternIntersect is TRUE.

showWarning

A Boolean indicating whether a warning should be shown when the number of rows (nrow) is 0.

searchPatterns

A tibble with the known mutation patterns. The mutationPatterns table is the default search table.

searchRefHeader

A string with the name of the column in the searchPatterns table that holds the reference nucleotides.

searchAltHeader

A string with the name of the column in the searchPatterns table that holds the alternative nucleotides.

searchIdHeader

A string with the name of the column that holds the pattern IDs.

searchDistanceHeader

A string with the name of the column that holds the maximum distance between clustered mutations. Not needed if the distance parameter is NULL. NA values within this column are allowed.

searchReverseComplement

A Boolean indicating whether to also search for the reverse complement of the patterns in the searchPatterns tibble.

renameReverse

A Boolean indicating whether the ID of the process needs to be renamed. When TRUE, the cMut functions no longer treat the reverse complement and the non-reverse complement as the same. This parameter is irrelevant if searchReverseComplement is FALSE.

reverseComplement

A Boolean indicating whether the ref and alt need to be searched as their reverse complement. Irrelevant if searchClusterPatterns = FALSE or searchReverseComplement = TRUE.
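To make the term "reverse complement" in the last three arguments concrete, here is a minimal base R sketch. The helper name reverseComplementDna is hypothetical and is not part of cMut; how the package computes this internally is not shown here.

```r
# Hypothetical helper: reverse complement of a single DNA string
reverseComplementDna <- function(sequence) {
  # Swap each nucleotide for its complement (A <-> T, C <-> G)
  complemented <- chartr("ACGT", "TGCA", sequence)
  # Split into single characters, reverse their order and join them again
  paste(rev(strsplit(complemented, "", fixed = TRUE)[[1]]), collapse = "")
}

# The cluster pattern GA > TT seen from the opposite strand
reverseComplementDna("GA")
reverseComplementDna("TT")
```

Under this sketch, the pattern GA > TT corresponds to TC > AA on the opposite strand, which is the kind of match that searchReverseComplement = TRUE is described as including.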

See Also

See mutationPatterns help page for a full explanation of the differences between mutation patterns and cluster patterns.

Examples

# Example of a table containing the right columns and data for the
# identifyClusters function:
test <- testDataSet

# Example of using this function with data that contain patterns:
mutations <- identifyClusters(dataTable      = test,
                              maxDistance    = 20000,
                              chromHeader    = "chrom",
                              sampleIdHeader = "sampleIDs",
                              positionHeader = "start",
                              linkPatterns   = TRUE)
clusters <- groupClusters(dataTable             = mutations,
                          patternIntersect      = TRUE,
                          searchClusterPatterns = FALSE)
# searchClusterPatterns = FALSE is used here to emphasise the effect of the
# patterns found by the identifyClusters function.


# Example of using this function when it is needed to search for
# cluster patterns. Use ?mutationPatterns to learn about the
# difference between mutation patterns and cluster patterns.
clusters <- groupClusters(dataTable             = mutations,
                          patternIntersect      = TRUE,
                          searchClusterPatterns = TRUE)

# For more information about the table:
cat(comment(clusters))
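
The searchPatterns argument expects a table of known patterns. The sketch below builds a hypothetical one in base R whose column names match the default searchRefHeader, searchAltHeader, searchIdHeader and searchDistanceHeader values from the Usage section. The process names and distances are invented for illustration, and whether groupClusters accepts exactly this layout is an assumption.

```r
# Hypothetical search table; every value below is made up for illustration
customPatterns <- data.frame(
  process     = c("hypothetical process A", "hypothetical process B"),
  ref         = c("GA", "CC"),
  alt         = c("TT", "TT"),
  maxDistance = c(NA, 2),  # NA is allowed according to the argument description
  stringsAsFactors = FALSE
)

# Inspect the structure of the table
str(customPatterns)
```

Because the argument is documented as a tibble, tibble::as_tibble(customPatterns) may be needed before supplying the table as searchPatterns = customPatterns.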

AlexJanse/cMut documentation built on May 25, 2019, 4 a.m.