Calculate the k-mer statistics of immune repertoires

getKmers(.data, .k, .col = c("aa", "nt"), .coding = TRUE)



.data: The data to be processed. Can be a data.frame, a data.table, or a list of these objects.

Every object must have columns in the immunarch-compatible format; see immunarch_data_format.

Advanced users may provide other data representations: DBI database connections, Apache Spark DataFrames created with copy_to, or a list of these objects. They are supported with the same limitations as basic objects.

Note: each connection must represent a separate repertoire.


.k: Integer. Length of the k-mers, i.e. the number of consecutive characters in each subsequence.
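To make the role of the k-mer length concrete, here is a minimal base R sketch of how overlapping k-mers can be read off a single sequence with a sliding window. The helper kmers_of is hypothetical and is not part of immunarch; it only illustrates the concept and may differ from what getKmers does internally.

```r
# Hypothetical helper (not an immunarch function): return all overlapping
# k-mers of one sequence using a sliding window of width k.
kmers_of <- function(s, k) {
  # Sequences that are missing or shorter than k contain no k-mers.
  if (is.na(s) || nchar(s) < k) return(character(0))
  starts <- seq_len(nchar(s) - k + 1)
  substring(s, starts, starts + k - 1)
}

kmers_of("CASSLG", 5)
# [1] "CASSL" "ASSLG"
```

A sequence of length n yields n - k + 1 k-mers, so larger values of k produce fewer, more specific k-mers.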


.col: Character. Which column to use: pass "aa" (the default) for CDR3 amino acid sequences or "nt" for CDR3 nucleotide sequences.


.coding: Logical. If TRUE (the default), removes all non-coding sequences from the input data first.
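As a rough illustration of what removing non-coding sequences can mean, the base R sketch below drops amino acid sequences that carry a stop codon or a frameshift marker. The markers "*" and "~" and the helper is_coding are assumptions made for this example; the exact rule immunarch applies is not stated on this page.

```r
# Assumption for illustration: a CDR3 amino acid sequence is non-coding if it
# contains a stop codon ("*") or a frameshift marker ("~").
# is_coding is a hypothetical helper, not an immunarch function.
is_coding <- function(aa) !is.na(aa) & !grepl("[*~]", aa)

aa <- c("CASSLGQETQYF", "CASS*GQETQYF", "CAS~SLGETQYF")
aa[is_coding(aa)]
# [1] "CASSLGQETQYF"
```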


Data frame with two columns (k-mers and their counts).
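The shape of the result can be sketched in base R as follows. This is only an approximation under stated assumptions: count_kmers is a hypothetical helper, the column names Kmer and Count are chosen for the example, and each input sequence is counted once regardless of clonotype abundance, which may not match getKmers.

```r
# Hypothetical helper (not an immunarch function): count all overlapping
# k-mers across a character vector of sequences.
count_kmers <- function(seqs, k) {
  # Keep only sequences long enough to contain at least one k-mer.
  seqs <- seqs[!is.na(seqs) & nchar(seqs) >= k]
  all_kmers <- unlist(lapply(seqs, function(s) {
    starts <- seq_len(nchar(s) - k + 1)
    substring(s, starts, starts + k - 1)
  }))
  counts <- table(all_kmers)
  # Two columns: the k-mer and how many times it was seen.
  data.frame(
    Kmer = names(counts),
    Count = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

res <- count_kmers(c("CASSLG", "CASSL"), 5)
res
```

Here "CASSL" occurs in both sequences and "ASSLG" in one, so the result has two rows with counts 2 and 1.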


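library(immunarch)
data(immdata)  # example repertoires bundled with immunarch
kmers <- getKmers(immdata$data[[1]], 5)  # 5-mers from the first repertoire
kmers %>% vis()  # plot the k-mer counts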