Calculate the k-mer statistics of immune repertoires

getKmers(.data, .k, .col = c("aa", "nt"), .coding = TRUE)

Arguments

.data

The data to be processed. Can be a data.frame, a data.table, or a list of these objects.

Every object must have columns in the immunarch-compatible format; see immunarch_data_format.

Advanced users may also pass other data representations: DBI database connections, Apache Spark DataFrames created with copy_to, or a list of these objects. They are supported with the same limitations as the basic objects.

Note: each connection must represent a separate repertoire.

.k

Integer. Length of k-mers.

.col

Character. Which column to use: "aa" (the default) for CDR3 amino acid sequences, or "nt" for CDR3 nucleotide sequences.

.coding

Logical. If TRUE (the default), all non-coding sequences are removed from the input data first.
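
As a rough illustration of what removing non-coding sequences can mean, the sketch below filters a character vector of amino acid sequences in base R. It assumes the common convention that "*" marks a stop codon and "~" marks an out-of-frame position; this page does not state the rule getKmers applies, and keep_coding is a hypothetical helper, not an immunarch function.

```r
# Hypothetical helper (not part of immunarch): keep only sequences that
# look coding, ASSUMING "*" marks a stop codon and "~" a frameshift.
keep_coding <- function(seqs) {
  # Drop missing values and any sequence containing "*" or "~".
  seqs[!is.na(seqs) & !grepl("[*~]", seqs)]
}

toy_seqs <- c("CASSLG", "CAS*LG", "CAS~LG", NA)
coding_only <- keep_coding(toy_seqs)
```

With .coding = FALSE, getKmers skips this kind of filtering and processes the input as given.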

Value

Data frame with two columns (k-mers and their counts).
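
To make the shape of such a result concrete, here is a minimal base R sketch that counts overlapping k-mers in a character vector. It is not immunarch's implementation: count_kmers and the column names kmer and count are hypothetical, and each sequence is counted once regardless of clone abundance, which may differ from what getKmers does.

```r
# Hypothetical helper (not part of immunarch): count overlapping k-mers.
count_kmers <- function(seqs, k) {
  # Keep only sequences long enough to contain at least one k-mer.
  seqs <- seqs[!is.na(seqs) & nchar(seqs) >= k]
  # Slide a window of width k over every sequence.
  kmers <- unlist(lapply(seqs, function(s) {
    starts <- seq_len(nchar(s) - k + 1)
    substring(s, starts, starts + k - 1)
  }))
  # Tabulate the k-mers; table() orders them alphabetically.
  counts <- table(kmers)
  data.frame(
    kmer = names(counts),
    count = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

toy <- c("CASSLG", "CASSL")
res <- count_kmers(toy, 5)
```

Here "CASSLG" contributes the 5-mers CASSL and ASSLG, and "CASSL" contributes CASSL once more, so res has one row per distinct k-mer with its total count.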

Examples

library(immunarch)
data(immdata)
# Count 5-mers in the CDR3 amino acid sequences of the first repertoire
kmers <- getKmers(immdata$data[[1]], 5)
kmers %>% vis()