Annotate clonotypes in immune repertoires using clonotype databases such as VDJDB and MCPAS

Annotate clonotypes using immune receptor databases of known condition-associated receptors. Download the database files before using this function. For more details, see the tutorial: https://immunarch.com/articles/web_only/v11_db.html.

dbAnnotate(.data, .db, .data.col, .db.col)

Arguments

.data

The data to process. It can be a data.frame, a data.table, or a list of these objects.

Every object must have columns in the immunarch-compatible format; see immunarch_data_format.

Advanced users may provide other data representations: DBI database connections, Apache Spark DataFrames from copy_to, or a list of these objects. They are supported with the same limitations as basic objects.

Note: each connection must represent a separate repertoire.

.db

A data frame or a data table with an immune receptor database. See dbLoad on how to load databases into R.

.data.col

Character vector. Vector of columns in the input repertoires to use for clonotype search. E.g., `"CDR3.aa"` or `c("CDR3.aa", "V.name")`.

.db.col

Character vector. Vector of columns in the database to use for clonotype search. The order must match the order of ".data.col". E.g., if ".data.col" is `c("CDR3.aa", "V.name")`, then ".db.col" must list the corresponding database columns in the same order: the first column must contain CDR3 amino acid sequences, and the second column must contain V gene segment names.
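
The positional pairing of ".data.col" and ".db.col" can be illustrated with a small base R sketch. It does not call dbAnnotate and is not its implementation; it only uses merge on toy data to show why the order of the two vectors matters. The data values and the database column names `cdr3` and `v.segm` are assumptions made for illustration.

```r
# Toy repertoire in an immunarch-like layout (values are made up).
repertoire <- data.frame(
  CDR3.aa = c("CASSLGQAYEQYF", "CASSPTGGYEQYF", "CASSLGQAYEQYF"),
  V.name  = c("TRBV7-9", "TRBV5-1", "TRBV12-3"),
  Clones  = c(10, 4, 2),
  stringsAsFactors = FALSE
)

# Toy database; the column names "cdr3" and "v.segm" are assumptions.
database <- data.frame(
  cdr3   = c("CASSLGQAYEQYF", "CASSIRSSYEQYF"),
  v.segm = c("TRBV7-9", "TRBV19"),
  stringsAsFactors = FALSE
)

data_col <- c("CDR3.aa", "V.name")
db_col   <- c("cdr3", "v.segm")

# Keep distinct key combinations so duplicated database records
# would not multiply the matched rows.
db_keys <- unique(database[, db_col, drop = FALSE])

# Columns are paired by position: data_col[1] with db_col[1], and so on.
matched <- merge(repertoire, db_keys, by.x = data_col, by.y = db_col)

# Reversing one vector compares CDR3 sequences with V gene names,
# so nothing matches.
swapped <- merge(repertoire, db_keys, by.x = data_col, by.y = rev(db_col))

print(matched)
print(nrow(swapped))
```

In this toy case only the clonotype whose CDR3 sequence and V gene both agree with a database record is kept; the third row shares the CDR3 sequence but has a different V gene, so it is dropped.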

Value

Data frame with input sequences and counts or proportions for each of the input repertoires.

Examples

data(immdata)

# Example file path
file_path <- paste0(system.file(package = "immunarch"), "/extdata/db/vdjdb.example.txt")

# Load the database with human-only TRB-only receptors for all known antigens
db <- dbLoad(file_path, "vdjdb", "HomoSapiens", "TRB")

res <- dbAnnotate(immdata$data, db, "CDR3.aa", "cdr3")
res
#> Empty data.table (0 rows and 14 cols): CDR3.aa,Samples,A2-i129,A2-i131,A2-i133,A2-i132...
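
A search on two paired columns could look like the sketch below. It assumes that immunarch is installed and that the loaded database contains a `v.segm` column with V gene segment names; that column name is an assumption here, so the code checks names(db) before calling dbAnnotate. A match requires identical strings in the paired columns, so different gene naming conventions (for instance, allele suffixes) in the repertoire and the database may yield no matches.

```r
# Sketch: annotate by CDR3 amino acid sequence and V gene together.
# The result stays NULL if immunarch, the example file, or the assumed
# database columns are not available.
res_cdr3_v <- NULL

if (requireNamespace("immunarch", quietly = TRUE)) {
  library(immunarch)
  data(immdata)

  file_path <- system.file("extdata", "db", "vdjdb.example.txt", package = "immunarch")

  if (nzchar(file_path)) {
    db <- dbLoad(file_path, "vdjdb", "HomoSapiens", "TRB")

    # "v.segm" is an assumed column name; verify it before use.
    if (all(c("cdr3", "v.segm") %in% names(db))) {
      res_cdr3_v <- dbAnnotate(
        immdata$data, db,
        c("CDR3.aa", "V.name"),  # columns in the repertoires
        c("cdr3", "v.segm")      # matching columns in the database, same order
      )
    }
  }
}

res_cdr3_v
```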