Calculate the kmer statistics of immune repertoires

getKmers(.data, .k, .col = c("aa", "nt"), .coding = TRUE)



.data: The data to be processed. Can be a data.frame, a data.table, or a list of these objects.

Every object must have columns in the immunarch-compatible format (see immunarch_data_format).

Advanced users may provide other data representations: DBI database connections, Apache Spark DataFrames created with copy_to, or a list of these objects. They are supported with the same limitations as the basic objects.

Note: each connection must represent a separate repertoire.


.k: Integer. Length of the kmers.


.col: Character. Which column to use: pass "aa" (the default) for CDR3 amino acid sequences or "nt" for CDR3 nucleotide sequences.


.coding: Logical. If TRUE (the default), all non-coding sequences are removed from the input data before kmers are counted.
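
As a rough illustration of what removing non-coding sequences means, the base R sketch below drops amino acid sequences that contain a stop codon or a frameshift. It is not immunarch's implementation: the helper keep_coding_sketch is hypothetical, and the markers "*" (stop codon) and "~" (frameshift) are an assumption about how such sequences are encoded in the input.

```r
# Hypothetical helper, not part of immunarch.
# Assumption: stop codons appear as "*" and frameshifts as "~"
# in the amino acid sequence.
keep_coding_sketch <- function(aa_seqs) {
  # Keep only non-missing sequences without either marker
  aa_seqs[!is.na(aa_seqs) & !grepl("[*~]", aa_seqs)]
}

toy_aa <- c("CASSLG", "CAS*LG", "CAS~LG", NA)
coding_only <- keep_coding_sketch(toy_aa)
print(coding_only)
```

With .coding = FALSE no such filtering would take place, so sequences like the second and third toy entries would contribute kmers as well.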


Data frame with two columns (kmers and their counts).
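
To make the shape of the result concrete, here is a minimal base R sketch of overlapping kmer counting on a toy vector of sequences. It is an illustration under stated assumptions, not the package's code: count_kmers_sketch is a hypothetical helper, the column names Kmer and Count are chosen for the example, and each input sequence is counted once regardless of clone abundance.

```r
# Hypothetical helper, not part of immunarch: count all overlapping
# kmers of length k across a character vector of sequences.
count_kmers_sketch <- function(seqs, k) {
  # Sequences shorter than k (or missing) cannot contribute any kmer
  seqs <- seqs[!is.na(seqs) & nchar(seqs) >= k]
  kmers <- as.character(unlist(lapply(seqs, function(s) {
    starts <- seq_len(nchar(s) - k + 1)
    substring(s, starts, starts + k - 1)
  })))
  if (length(kmers) == 0) {
    # Nothing to count: return an empty result with the same columns
    return(data.frame(Kmer = character(0), Count = integer(0),
                      stringsAsFactors = FALSE))
  }
  counts <- table(kmers)
  data.frame(Kmer = names(counts), Count = as.integer(counts),
             stringsAsFactors = FALSE)
}

toy_seqs <- c("CASSLG", "CASSQG")
toy_kmers <- count_kmers_sketch(toy_seqs, 3)
print(toy_kmers)
```

Each sequence of length n contributes n - k + 1 kmers, so the two six-letter toy sequences yield eight 3-mers in total, with shared prefixes such as "CAS" counted twice.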


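library(immunarch)
data(immdata)  # example dataset shipped with immunarch
kmers <- getKmers(immdata$data[[1]], 5)  # 5-mers from the first repertoire
kmers %>% vis()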