Repertoire overlap

Repertoire overlap is the most common way to measure repertoire similarity. It is computed from statistics on the clonotypes shared between the given repertoires, also called “public” clonotypes. immunarch provides several indices:

  • number of public clonotypes (.method = "public") - a classic measure of overlap similarity.

  • overlap coefficient (.method = "overlap") - a normalised measure of overlap similarity. It is defined as the size of the intersection divided by the size of the smaller of the two sets.

  • Jaccard index (.method = "jaccard") - it measures similarity between finite sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets.

  • Tversky index (.method = "tversky") - an asymmetric similarity measure on sets that compares a variant to a prototype. With the default arguments, it is similar to Dice’s coefficient.

  • cosine similarity (.method = "cosine") - a measure of similarity between two non-zero vectors, defined as the cosine of the angle between them.

  • Morisita’s overlap index (.method = "morisita") - a statistical measure of dispersion of individuals in a population. It is used to compare overlap among samples.

  • incremental overlap - overlaps of the N most abundant clonotypes with incrementally growing N (.method = "inc+METHOD", e.g., "inc+public" or "inc+morisita").
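
To make the definitions above concrete, here is a minimal base-R sketch that computes each index by hand for two toy repertoires. The data and the helper names (rep_a, rep_b, tversky, align, simpson, top_n) are hypothetical and not part of immunarch. The Morisita formula shown is the classic textbook form, which is an assumption here; the exact formula and normalisation used inside repOverlap may differ.

```r
# Toy repertoires (hypothetical data): clonotype -> clone count
rep_a <- c(CASSLEETQYF = 5, CASSFQETQYF = 3, CASSLGETQYF = 2)
rep_b <- c(CASSLEETQYF = 1, CASSLGETQYF = 4, CASSDSSGGANEQFF = 2, CASSYSTSGVGQFF = 3)

a <- names(rep_a)
b <- names(rep_b)

# Set-based indices use only the presence or absence of clonotypes
n_public     <- length(intersect(a, b))
overlap_coef <- n_public / min(length(a), length(b))
jaccard      <- n_public / length(union(a, b))

# Tversky index; alpha = beta = 0.5 reduces it to Dice's coefficient
tversky <- function(a, b, alpha = 0.5, beta = 0.5) {
  shared <- length(intersect(a, b))
  shared / (shared + alpha * length(setdiff(a, b)) + beta * length(setdiff(b, a)))
}
dice <- 2 * n_public / (length(a) + length(b))

# Count-based indices need both repertoires as vectors over the same clonotypes;
# clonotypes missing from a repertoire get a count of zero
align <- function(v, keys) {
  out <- unname(v[keys])
  out[is.na(out)] <- 0
  out
}
keys <- union(a, b)
x <- align(rep_a, keys)
y <- align(rep_b, keys)

cosine <- sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Classic Morisita overlap index (assumed form, see the note above)
simpson  <- function(v) sum(v * (v - 1)) / (sum(v) * (sum(v) - 1))
morisita <- 2 * sum(x * y) / ((simpson(x) + simpson(y)) * sum(x) * sum(y))

# Incremental overlap: number of public clonotypes among the N most abundant
# clonotypes of each repertoire, for growing N
top_n <- function(v, n) names(sort(v, decreasing = TRUE))[seq_len(min(n, length(v)))]
inc_public <- sapply(1:4, function(n) length(intersect(top_n(rep_a, n), top_n(rep_b, n))))
```

The set-based indices ignore clone counts entirely, while cosine similarity and Morisita’s index weight shared clonotypes by their abundance, so the two families can rank the same pair of repertoires quite differently.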

All of these methods are available through the repOverlap function. As before, the output can be passed to the vis() function for visualisation:

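# Compute the number of public clonotypes and Morisita's overlap index
# for every pair of repertoires
imm_ov1 = repOverlap(immdata$data, .method = "public", .verbose = FALSE)
imm_ov2 = repOverlap(immdata$data, .method = "morisita", .verbose = FALSE)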

grid.arrange(vis(imm_ov1), vis(imm_ov2, .text.size=2), ncol = 2)

vis(imm_ov1, "heatmap2")

You can easily change the number of significant digits:

grid.arrange(vis(imm_ov2, .text.size=2.5, .signif.digits=1), vis(imm_ov2, .text.size=2, .signif.digits=2), ncol = 2)

To analyse the computed overlap measures, apply the repOverlapAnalysis function.

## Standard deviations (1, .., p=4):
## [1] 0 0 0 0
## 
## Rotation (n x k) = (12 x 2):
##                [,1]       [,2]
## A2-i129 -20.2308709  22.431389
## A2-i131   8.3055445 -26.779321
## A2-i133  45.9341813  -6.893304
## A2-i132 -55.0903957 -18.572513
## A4-i191  23.7461189  -1.118162
## A4-i192  -4.4041243  38.028858
## MS1     -19.5494165 -12.836320
## MS2      -1.9063188  -6.075283
## MS3      -9.8321059  11.217724
## MS4       0.9127103   1.154627
## MS5       5.6552254 -27.415676
## MS6      26.4594518  26.857981
##              DimI      DimII
## A2-i129  66.63816 -114.58184
## A2-i131  13.03745  -30.09978
## A2-i133 -49.17857   80.77521
## A2-i132 -41.13009   87.09387
## A4-i191 -45.49009   96.35028
## A4-i192  63.34893 -123.78398
## MS1      74.31461 -120.62902
## MS2     -52.80716   89.85864
## MS3      67.77735 -118.54062
## MS4     -48.48749   88.83537
## MS5      11.26974  -26.67354
## MS6     -59.29284   91.39541
## attr(,"class")
## [1] "matrix"      "immunr_tsne"

## Standard deviations (1, .., p=4):
## [1] 0 0 0 0
## 
## Rotation (n x k) = (12 x 2):
##                [,1]       [,2]
## A2-i129 -20.2308709  22.431389
## A2-i131   8.3055445 -26.779321
## A2-i133  45.9341813  -6.893304
## A2-i132 -55.0903957 -18.572513
## A4-i191  23.7461189  -1.118162
## A4-i192  -4.4041243  38.028858
## MS1     -19.5494165 -12.836320
## MS2      -1.9063188  -6.075283
## MS3      -9.8321059  11.217724
## MS4       0.9127103   1.154627
## MS5       5.6552254 -27.415676
## MS6      26.4594518  26.857981
##              DimI      DimII
## A2-i129 -247.9701  -56.01593
## A2-i131  229.6228  350.57072
## A2-i133 -322.4567 -194.78082
## A2-i132  214.2786  -53.78506
## A4-i191  127.0437  -21.76749
## A4-i192 -166.0795  -31.99389
## MS1     -245.8420   28.87181
## MS2      146.5506 -109.38137
## MS3     -225.5800  -26.96249
## MS4      155.5883  -72.40200
## MS5      195.0063  351.17525
## MS6      139.8378 -163.52873
## attr(,"class")
## [1] "matrix"      "immunr_tsne"
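
As a rough illustration of what this kind of analysis does with an overlap matrix, the following base-R sketch embeds samples in two dimensions with classical multidimensional scaling and groups them with hierarchical clustering. The matrix values, the sample names and the choice to fill the diagonal with 40 are all hypothetical. This is not immunarch's implementation, which may use different distance definitions and algorithms.

```r
# Hypothetical symmetric matrix of public clonotype counts between 4 samples
samples <- c("S1", "S2", "S3", "S4")
ov <- matrix(c(NA, 30,  5,  4,
               30, NA,  6,  3,
                5,  6, NA, 25,
                4,  3, 25, NA),
             nrow = 4, byrow = TRUE, dimnames = list(samples, samples))

# Self-overlap is undefined in the matrix; assume a value above every
# pairwise overlap so each sample is most similar to itself
diag(ov) <- 40

# Treat each row as the overlap profile of a sample and measure how
# different two profiles are with the Euclidean distance
d <- dist(ov)

# Classical MDS: two coordinates per sample that approximately preserve
# the pairwise distances
coords <- cmdscale(d, k = 2)
colnames(coords) <- c("DimI", "DimII")

# Hierarchical clustering on the same distances, cut into two groups
hc <- hclust(d)
groups <- cutree(hc, k = 2)
```

With these toy values, S1 and S2 share many clonotypes and fall into one group, while S3 and S4 form the other. Note that t-SNE is stochastic, so repeated runs on the same matrix generally produce different coordinates unless the random seed is fixed.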

Public repertoire

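To build a single large table of all clonotypes found across a list of repertoires, use the pubRep function. Each row is a clonotype, the Samples column gives the number of repertoires that contain it, and the remaining columns give its count in each repertoire (NA if absent).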

##                                                    CDR3.nt Samples A2-i129
##     1:                   TGCGCCAGCAGCTTGGAAGAGACCCAGTACTTC       8       2
##     2:                   TGTGCCAGCAGCTTCCAAGAGACCCAGTACTTC       7      NA
##     3:                   TGTGCCAGCAGTTACCAAGAGACCCAGTACTTC       7       1
##     4:                   TGCGCCAGCAGCTTCCAAGAGACCCAGTACTTC       6       2
##     5:                      TGTGCCAGCAGCCAAGAGACCCAGTACTTC       6       5
##    ---                                                                    
## 86979:             TGTGCTTCACAACTCTTATTGGACGAGACCCAGTACTTC       1      NA
## 86980: TGTGCTTCACAAGCCCTACAGGGCACTTTCCATAATTCACCCCTCCACTTT       1      NA
## 86981:                   TGTGCTTCAGGGCGGGCCTACGAGCAGTACTTC       1      NA
## 86982:             TGTGCTTCCGCCGGACCGGACCGGGAGACCCAGTACTTC       1      NA
## 86983:                TGTGCTTGCGGGACAGATAACTATGGCTACACCTTC       1      NA
##        A2-i131 A2-i133 A2-i132 A4-i191 A4-i192 MS1 MS2 MS3 MS4 MS5 MS6
##     1:      NA       2       1      NA       1  NA  NA   1   1   1   1
##     2:       1       1       2       1      NA   2  NA  NA   2  NA   1
##     3:       1       1      NA       1       1   1  NA   2  NA  NA  NA
##     4:      NA       1       1      NA      NA  NA   1  NA   1  NA   1
##     5:       3      NA       2       3       2  NA  NA  NA  NA   5  NA
##    ---                                                                
## 86979:       1      NA      NA      NA      NA  NA  NA  NA  NA  NA  NA
## 86980:      NA      NA      NA      NA      NA  NA  NA  NA  NA   1  NA
## 86981:      NA      NA      NA      NA      NA   1  NA  NA  NA  NA  NA
## 86982:      NA       1      NA      NA      NA  NA  NA  NA  NA  NA  NA
## 86983:      NA      NA      NA      NA       1  NA  NA  NA  NA  NA  NA
##                    CDR3.aa   V.name Samples A2-i129 A2-i131 A2-i133 A2-i132
##     1:         CASSLEETQYF  TRBV5-1       8       2      NA       3       1
##     2:     CASSDSSGGANEQFF  TRBV6-4       6       1       1       2      NA
##     3:     CASSDSSGSTDTQYF  TRBV6-4       6      NA      NA      NA       4
##     4:         CASSFQETQYF  TRBV5-1       6       3      NA       1       1
##     5:         CASSLGETQYF TRBV12-4       6       2      NA      NA       4
##    ---                                                                     
## 86181:     CTSSRPTQGAYEQYF  TRBV7-2       1      NA      NA      NA      NA
## 86182:    CTSSSRAGAGTDTQYF  TRBV7-2       1      NA      NA      NA      NA
## 86183: CTSSYPGLAGLKRKETQYF  TRBV7-2       1      NA      NA      NA       1
## 86184:    CTSSYRQRPYQETQYF  TRBV7-2       1      NA      NA      NA      NA
## 86185:      CTSSYSTSGVGQFF  TRBV7-2       1      NA      NA      NA      NA
##        A4-i191 A4-i192 MS1 MS2 MS3 MS4 MS5 MS6
##     1:      NA       2  NA  NA   1   1   1   1
##     2:       5      NA  NA  NA   2  NA  NA  15
##     3:       1       1  NA  NA   1   1  NA   2
##     4:      NA      NA  NA   1  NA   1  NA   1
##     5:       3      NA   1  NA  NA  NA   2   3
##    ---                                        
## 86181:      NA      NA  NA  NA  NA  NA  NA   1
## 86182:      NA      NA  NA  NA   1  NA  NA  NA
## 86183:      NA      NA  NA  NA  NA  NA  NA  NA
## 86184:      NA      NA  NA  NA   1  NA  NA  NA
## 86185:      NA      NA  NA  NA  NA   1  NA  NA
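
The structure of the tables above can be reproduced on a small scale in base R. The sketch below assumes that each repertoire is a data frame with one row per clonotype; the sample names, sequences and counts are hypothetical, and pubRep itself may build the table differently.

```r
# Hypothetical repertoires: one data frame per sample, one row per clonotype
reps <- list(
  S1 = data.frame(CDR3.aa = c("CASSLEETQYF", "CASSFQETQYF"), Clones = c(2, 3)),
  S2 = data.frame(CDR3.aa = c("CASSLEETQYF", "CASSLGETQYF"), Clones = c(1, 4)),
  S3 = data.frame(CDR3.aa = c("CASSLEETQYF", "CASSFQETQYF", "CASSDSSGGANEQFF"),
                  Clones = c(1, 1, 5))
)

# Rename each count column after its sample so the merged columns stay distinct
per_sample <- lapply(names(reps), function(s) setNames(reps[[s]], c("CDR3.aa", s)))

# Full outer join on the clonotype: a clonotype absent from a sample gets NA
pub <- Reduce(function(x, y) merge(x, y, by = "CDR3.aa", all = TRUE), per_sample)

# Number of samples in which each clonotype occurs
pub$Samples <- rowSums(!is.na(pub[names(reps)]))

# Most widely shared clonotypes first, with Samples next to the sequence
pub <- pub[order(-pub$Samples, pub$CDR3.aa), c("CDR3.aa", "Samples", names(reps))]
rownames(pub) <- NULL
```

Here CASSLEETQYF is found in all three samples and therefore appears first, mirroring how the most public clonotypes lead the pubRep output.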