R/annotation.R
dbAnnotate.Rd
Annotate clonotypes using immune receptor databases of known condition-associated receptors. You need to download the database files before using this function. For more details, see the tutorial at https://immunarch.com/articles/web_only/v11_db.html.
dbAnnotate(.data, .db, .data.col, .db.col)
The data to process. It can be a data.frame, a data.table, or a list of these objects.
Every object must have columns in the immunarch-compatible format; see immunarch_data_format.
Competent users may provide advanced data representations: DBI database connections, Apache Spark DataFrames created with copy_to, or a list of these objects. They are supported with the same limitations as basic objects.
Note: each connection must represent a separate repertoire.
A data frame or a data table with an immune receptor database. See dbLoad on how to load databases into R.
Character vector. Vector of columns in the input repertoires to use for clonotype search. E.g., `"CDR3.aa"` or `c("CDR3.aa", "V.name")`.
Character vector. Vector of columns in the database to use for clonotype search. The order must match the order of `.data.col`. E.g., if `.data.col` is `c("CDR3.aa", "V.name")`, then `.db.col` must list the corresponding database columns in the same order, i.e., the first column must contain CDR3 amino acid sequences and the second must contain V gene segment names.
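The position-wise pairing of the two column vectors can be illustrated with a minimal base-R sketch. This is not how dbAnnotate is implemented; it uses `merge()` on two toy data frames only to show why the order of the columns matters. All sequences, the `v.segm` and `antigen` column names, and the antigen labels are made up for illustration.

```r
# Toy repertoire and toy database; all values are hypothetical.
repertoire <- data.frame(
  CDR3.aa = c("CASSLGQAYEQYF", "CASSPGTGELFF", "CASSLGQAYEQYF"),
  V.name  = c("TRBV7-9", "TRBV5-1", "TRBV12-3"),
  Clones  = c(10, 4, 1),
  stringsAsFactors = FALSE
)
database <- data.frame(
  cdr3    = c("CASSLGQAYEQYF", "CASSIRSSYEQYF"),
  v.segm  = c("TRBV7-9", "TRBV19"),
  antigen = c("hypothetical-antigen-1", "hypothetical-antigen-2"),
  stringsAsFactors = FALSE
)

data_col <- c("CDR3.aa", "V.name")  # plays the role of .data.col
db_col   <- c("cdr3", "v.segm")     # plays the role of .db.col, same order

# merge() pairs by.x[i] with by.y[i]: CDR3.aa with cdr3, V.name with v.segm.
hits <- merge(repertoire, database, by.x = data_col, by.y = db_col)
print(hits)

# With the database columns in the wrong order, sequences are compared
# against V gene names, so no row can match.
swapped <- merge(repertoire, database, by.x = data_col, by.y = rev(db_col))
print(nrow(swapped))
```

In the sketch, only the clonotype whose CDR3 sequence and V gene both agree with a database entry is kept; the third repertoire row shares the sequence but not the V gene, so it is dropped.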
Data frame with input sequences and counts or proportions for each of the input repertoires.
data(immdata)
# Example file path
file_path <- system.file("extdata/db/vdjdb.example.txt", package = "immunarch")
# Load the database with human-only TRB-only receptors for all known antigens
db <- dbLoad(file_path, "vdjdb", "HomoSapiens", "TRB")
# Match the repertoire column "CDR3.aa" against the database column "cdr3"
res <- dbAnnotate(immdata$data, db, "CDR3.aa", "cdr3")
res
#> Empty data.table (0 rows and 14 cols): CDR3.aa,Samples,A2-i129,A2-i131,A2-i133,A2-i132...
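An empty result like the one above indicates that no clonotype in the input repertoires matched a database entry on the chosen columns. A quick way to check this is to count the shared key values per sample before annotating. The sketch below does so in base R on toy stand-ins; the `repertoires` list, the `toy_db` table, and all sequences are hypothetical and only mimic the shape of `immdata$data` and a loaded database.

```r
# Toy stand-ins for a named list of repertoires and a database table.
repertoires <- list(
  sample_A = data.frame(
    CDR3.aa = c("CASSLGQAYEQYF", "CASSPGTGELFF"),
    stringsAsFactors = FALSE
  ),
  sample_B = data.frame(
    CDR3.aa = c("CASSQETQYF"),
    stringsAsFactors = FALSE
  )
)
toy_db <- data.frame(
  cdr3 = c("CASSIRSSYEQYF", "CASSLGQAYEQYF"),
  stringsAsFactors = FALSE
)

# Count, for every sample, how many distinct CDR3 sequences also occur
# in the database key column.
shared_per_sample <- vapply(
  repertoires,
  function(rep) length(intersect(rep$CDR3.aa, toy_db$cdr3)),
  integer(1)
)
print(shared_per_sample)
```

If every count is zero, an empty annotation table is the expected outcome rather than a sign of a wrong column name.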