R/annotation.R
dbLoad.Rd
The function automatically detects the database format and loads it into R. Additionally, the function provides a general query interface to databases that allows filtering by species, chain types (i.e., locus) and pathology (i.e., antigen species).
Currently we support three popular databases:
VDJDB - https://github.com/antigenomics/vdjdb-db
McPAS-TCR - http://friedmanlab.weizmann.ac.il/McPAS-TCR/
TBAdb from PIRD - https://db.cngb.org/pird/
dbLoad(.path, .db, .species = NA, .chain = NA, .pathology = NA)
.path
Character. A path to the database file, e.g., "/Users/researcher/Downloads/McPAS-TCR.csv".
.db
Character. A database type: either "vdjdb", "vdjdb-search", "mcpas" or "tbadb".
"vdjdb" for VDJDB; "vdjdb-search" for a search table obtained from the web interface of VDJDB; "mcpas" for McPAS-TCR; "tbadb" for PIRD TBAdb.
.species
Character. A string or a vector of strings specifying which species to keep in the database, e.g., "HomoSapiens". Pass NA (the default) to load all available species.
.chain
Character. A string or a vector of strings specifying which chains to keep in the database, e.g., "TRB". Pass NA (the default) to load all available chains.
.pathology
Character. A string or a vector of strings specifying which diseases, viruses, bacteria or other conditions to keep in the database, e.g., "CMV". Pass NA (the default) to load all available conditions.
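The filter arguments described above can be illustrated with a small base-R sketch. It is not the immunarch implementation: `toy_db`, `filter_or_all` and `toy_filter` are hypothetical names, the records are invented, and the sketch assumes that a vector of strings means "keep rows matching any of the values" and that NA disables a filter.

```r
# Toy stand-in for a loaded database (invented records). The Species, Chain and
# Pathology column names mirror the ones shown in the example output.
toy_db <- data.frame(
  cdr3 = c("CASSAAAF", "CASSBBBF", "CAVCCCF", "CASSDDDF"),
  Species = c("HomoSapiens", "HomoSapiens", "MusMusculus", "HomoSapiens"),
  Chain = c("TRB", "TRB", "TRA", "TRB"),
  Pathology = c("EBV", "CMV", "CMV", "CMV"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose value in `column` is one of `values`.
# A single NA is treated as "no filter" (an assumption about the semantics).
filter_or_all <- function(df, column, values) {
  if (length(values) == 1 && is.na(values)) {
    return(df)
  }
  df[df[[column]] %in% values, , drop = FALSE]
}

# Hypothetical helper that applies the three filters one after another.
toy_filter <- function(df, .species = NA, .chain = NA, .pathology = NA) {
  df <- filter_or_all(df, "Species", .species)
  df <- filter_or_all(df, "Chain", .chain)
  filter_or_all(df, "Pathology", .pathology)
}

# Human TRB records for any pathology
human_trb <- toy_filter(toy_db, "HomoSapiens", "TRB")

# Records for either of two pathologies, any species and chain
cmv_or_ebv <- toy_filter(toy_db, .pathology = c("CMV", "EBV"))
```

Treating NA as a wildcard keeps the call short when only one or two of the filters matter, which is the pattern the usage line suggests.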
Data frame with the input database records.
# Example file path
file_path <- system.file("extdata/db/vdjdb.example.txt", package = "immunarch")
# Load the database with human-only TRB-only receptors for all known antigens
db <- dbLoad(file_path, "vdjdb", "HomoSapiens", "TRB")
db
#> # A tibble: 10 × 19
#> gene cdr3 species antigen.epitope antigen.gene antigen.species complex.id
#> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
#> 1 TRB CASSQD… HomoSa… RLRAEAQVK EBNA3A EBV 19268
#> 2 TRB CSASIL… HomoSa… KLGGALQAK IE1 CMV 8584
#> 3 TRB CASSYF… HomoSa… KLGGALQAK IE1 CMV 3445
#> 4 TRB CASSAF… HomoSa… NLVPMVATV pp65 CMV 0
#> 5 TRB CASSLW… HomoSa… KLGGALQAK IE1 CMV 19396
#> 6 TRB CASSLT… HomoSa… NLVPMVATV pp65 CMV 0
#> 7 TRB CASTAK… HomoSa… KLGGALQAK IE1 CMV 10972
#> 8 TRB CASSGA… HomoSa… KLGGALQAK IE1 CMV 6231
#> 9 TRB CASSLI… HomoSa… KLGGALQAK IE1 CMV 12587
#> 10 TRB CATSSS… HomoSa… KLGGALQAK IE1 CMV 13267
#> # ℹ 12 more variables: v.segm <chr>, j.segm <chr>, v.end <dbl>, j.start <dbl>,
#> # mhc.a <chr>, mhc.b <chr>, mhc.class <chr>, reference.id <chr>,
#> # vdjdb.score <dbl>, Species <chr>, Chain <chr>, Pathology <chr>
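As a further sketch, the same example file could be loaded with a pathology filter as well. The call uses only the arguments listed in the usage line. The `requireNamespace()` guard and the `db_cmv` name are additions for illustration, so the snippet still runs where immunarch is not installed. No output is shown because it depends on the bundled example file.

```r
# db_cmv stays NULL when immunarch is not available (illustrative guard).
db_cmv <- NULL

if (requireNamespace("immunarch", quietly = TRUE)) {
  # Example file shipped with the package
  file_path <- system.file("extdata/db/vdjdb.example.txt", package = "immunarch")

  # Human-only, TRB-only receptors restricted to CMV
  db_cmv <- immunarch::dbLoad(
    file_path, "vdjdb",
    .species = "HomoSapiens", .chain = "TRB", .pathology = "CMV"
  )

  # Count the remaining records per pathology
  print(table(db_cmv$Pathology))
}
```

The Pathology column used here is one of the columns listed at the end of the printed tibble above.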