intersectClonesets: Intersection between sets of sequences or any elements.

Description

Functions for the intersection of data frames with TCR / Ig data. See the repOverlap function for a general interface to all overlap analysis functions.

intersectClonesets - returns the number of similar elements in two given clonesets / data frames or, when given a list, a matrix with the counts of similar elements for each pair of objects in the list.

intersectCount - similar to tcR::intersectClonesets, but with fewer parameters and only for two objects.

intersectIndices - returns a matrix M with two columns, where the element with index M[i, 1] in the first given object is similar to the element with index M[i, 2] in the second given object.

intersectLogic - returns a logical vector with TRUE at each position where the element of the first given data frame is found in the second given data frame.

Usage

intersectClonesets(.alpha = NULL, .beta = NULL, .type = "n0e", .head = -1, .norm = F,
          .verbose = F)

intersectCount(.alpha, .beta, .method = c('exact', 'hamm', 'lev'), .col = NULL)

intersectIndices(.alpha, .beta, .method = c('exact', 'hamm', 'lev'), .col = NULL)

intersectLogic(.alpha, .beta, .method = c('exact', 'hamm', 'lev'), .col = NULL)

Arguments

.alpha

Either the first vector or data frame, or a list of data frames.

.beta

Second vector or data frame, or, if .alpha is a list, the type of intersection procedure (see the .type parameter).

.type

Type of intersection procedure if .alpha and .beta are data frames. String with 3 characters (see 'Details' for more information).

.head

Parameter for the head function, applied before intersecting.

.norm

If TRUE, then normalise the result by the product of the lengths or numbers of rows of the given data.

.verbose

If TRUE, then print output while processing the data.

.method

Method to use for intersecting string elements: 'exact' for exact matching, 'hamm' for matching strings within a Hamming distance of at most 1, 'lev' for matching strings within a Levenshtein (edit) distance of at most 1.

.col

Which columns to use for fetching values to intersect. The first supplied column is matched with .method, the others are matched exactly.
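
The three .method values can be illustrated with a minimal base R sketch. The helper is_similar is hypothetical and not part of tcR; it assumes that strings of different lengths never match under 'hamm', since the Hamming distance is only defined for equal lengths, and it uses utils::adist for the Levenshtein distance. The package's own implementation may differ.

```r
# Hypothetical helper, not part of tcR: decides whether two strings match
# under the three documented methods.
is_similar <- function(a, b, method = c("exact", "hamm", "lev")) {
  method <- match.arg(method)
  switch(method,
    exact = a == b,
    hamm = {
      # Assumption: strings of unequal length never match under 'hamm'.
      if (nchar(a) != nchar(b)) {
        FALSE
      } else {
        chars_a <- strsplit(a, "")[[1]]
        chars_b <- strsplit(b, "")[[1]]
        sum(chars_a != chars_b) <= 1
      }
    },
    # utils::adist computes the generalized Levenshtein distance.
    lev = drop(utils::adist(a, b)) <= 1
  )
}

is_similar("CASSLGLHYEQYF", "CASSLGLNYEQYF", "exact")  # FALSE
is_similar("CASSLGLHYEQYF", "CASSLGLNYEQYF", "hamm")   # TRUE: one substitution
is_similar("CASSLGLHYEQYF", "CASSLGLHYEQY", "hamm")    # FALSE: lengths differ
is_similar("CASSLGLHYEQYF", "CASSLGLHYEQY", "lev")     # TRUE: one deletion
```

Every pair that matches under 'hamm' in this sketch also matches under 'lev', because a single substitution is one edit.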

Details

Parameter .type of the intersectClonesets function is a string of length 3 [0an][0vja][ehl], where:

  1. The first character defines which elements to intersect ("a" for elements from the column "CDR3.amino.acid.sequence", "n" for elements from the column "CDR3.nucleotide.sequence"; with any other character the elements are intersected as specified);

  2. The second character defines which additional columns to use ('0' for no additional columns, 'v' for the "V.gene" column, 'j' for the "J.gene" column, 'a' for both the "V.gene" and "J.gene" columns);

  3. The third character defines the method used to search for similar sequences: "e" stands for an exact match of sequences, "h" for matching elements whose Hamming distance is at most 1, "l" for matching elements whose Levenshtein distance is at most 1.
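
The encoding above can be made concrete with a small decoder written in base R. The function parse_type is hypothetical and not part of tcR; mapping the third character to the .method names 'exact', 'hamm' and 'lev' is an assumption based on the matching descriptions in the Arguments section.

```r
# Hypothetical helper, not part of tcR: decodes a 3-character .type string
# into the columns and the matching method it describes.
parse_type <- function(type) {
  stopifnot(is.character(type), length(type) == 1, nchar(type) == 3)
  chars <- strsplit(type, "")[[1]]
  # First character: which sequence column to intersect.
  seq_col <- switch(chars[1],
    a = "CDR3.amino.acid.sequence",
    n = "CDR3.nucleotide.sequence",
    NULL)  # any other character: no column, elements are used as supplied
  # Second character: additional gene columns, matched exactly.
  extra_cols <- switch(chars[2],
    "0" = character(0),
    v = "V.gene",
    j = "J.gene",
    a = c("V.gene", "J.gene"),
    stop("second character must be one of '0', 'v', 'j', 'a'"))
  # Third character: matching method (assumed to mirror .method).
  method <- switch(chars[3],
    e = "exact",
    h = "hamm",
    l = "lev",
    stop("third character must be one of 'e', 'h', 'l'"))
  list(columns = c(seq_col, extra_cols), method = method)
}

parse_type("ave")
parse_type("n0h")
```

Under these assumptions, "ave" decodes to the columns "CDR3.amino.acid.sequence" and "V.gene" with exact matching, and "n0h" to the single column "CDR3.nucleotide.sequence" with Hamming matching.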

Value

intersectClonesets returns the (optionally normalised) number of similar elements, or a matrix with the numbers of similar elements for each pair of objects.

intersectCount returns the number of similar elements.

intersectIndices returns a 2-column matrix in which the first column holds the index of an element in the first object (.alpha) and the second column holds the index of an element in the second object (.beta) that is similar to it.

intersectLogic returns a logical vector of length(.alpha) or nrow(.alpha), where TRUE at position i means that the element with index i has been found in the second object (.beta).
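
For exact matching on plain character vectors, the shapes of the last two return values can be reproduced with base R alone. This is only an analogue under that assumption, not the package's implementation: x and y stand in for .alpha and .beta, and the row order of the index matrix returned by intersectIndices may differ.

```r
# Toy vectors standing in for .alpha (x) and .beta (y).
x <- c("CASSLG", "CAWSRQ", "CASSPT", "CASSLG")
y <- c("CASSLG", "CASSPT")

# Analogue of intersectLogic: logical vector of length(x),
# TRUE where x[i] occurs somewhere in y.
found <- x %in% y
found  # TRUE FALSE TRUE TRUE

# Analogue of intersectIndices: two-column matrix of index pairs (i, j)
# such that x[i] == y[j].
pairs <- which(outer(x, y, "=="), arr.ind = TRUE)
pairs  # rows: (1, 1), (4, 1), (3, 2)
```

Each row of pairs links one element of x to one element of y, so an element of x that occurs several times in y would appear in several rows.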

See Also

repOverlap, vis.heatmap, ozScore, permutDistTest, vis.group.boxplot

Examples

## Not run: 
data(twb)
# Equivalent to intersectClonesets(twb[[1]]$CDR3.nucleotide.sequence,
#                         twb[[2]]$CDR3.nucleotide.sequence)
# or intersectCount(twb[[1]]$CDR3.nucleotide.sequence,
#                    twb[[2]]$CDR3.nucleotide.sequence)
# First "n" stands for a "CDR3.nucleotide.sequence" column, "e" for exact match.
twb.12.n0e <- intersectClonesets(twb[[1]], twb[[2]], 'n0e')
stopifnot(twb.12.n0e == 46)
# First "a" stands for "CDR3.amino.acid.sequence" column.
# Second "v" means that intersect should also use the "V.gene" column.
intersectClonesets(twb[[1]], twb[[2]], 'ave')
# Works also on lists, performs all possible pairwise intersections.
intersectClonesets(twb, 'ave')
# Plot results.
vis.heatmap(intersectClonesets(twb, 'ave'), .title = 'twb - (ave)-intersection', .labs = '')
# Get elements which are in both twb[[1]] and twb[[2]].
# Elements are tuples of CDR3 amino acid sequence and corresponding V-segment.
imm.1.2 <- intersectLogic(twb[[1]], twb[[2]],
                          .col = c('CDR3.amino.acid.sequence', 'V.gene'))
head(twb[[1]][imm.1.2, c('CDR3.amino.acid.sequence', 'V.gene')])
# Compute overlaps with repOverlap, group them by pairs of subjects and plot the groups.
ov <- repOverlap(twb)
sb <- matrixSubgroups(ov, list(tw1 = c('Subj.A', 'Subj.B'), tw2 = c('Subj.C', 'Subj.D')))
vis.group.boxplot(sb)

## End(Not run)

Example output

[1] 158
       Subj.A Subj.B Subj.C Subj.D
Subj.A     NA    158     65     58
Subj.B    158     NA     56     47
Subj.C     65     56     NA    131
Subj.D     58     47    131     NA
Warning: Ignoring unknown aesthetics: fill
   CDR3.amino.acid.sequence V.gene
8             CASSLGLHYEQYF TRBV28
14            CAWSRQTNTEAFF TRBV30
17            CASSLGVGYEQYF TRBV28
19            CASSLGLHYEQYF TRBV28
30            CASSLGLNYEQYF TRBV28
66            CASSLGVSYEQYF TRBV28

tcR documentation built on July 2, 2020, 3:18 a.m.