The function automatically detects the database format and loads it into R. It also provides a general query interface that filters the loaded records by species, chain type (i.e., locus) and pathology (i.e., antigen species).

Currently we support three popular databases:

VDJDB - https://github.com/antigenomics/vdjdb-db

McPAS-TCR - http://friedmanlab.weizmann.ac.il/McPAS-TCR/

TBAdb from PIRD - https://db.cngb.org/pird/

dbLoad(.path, .db, .species = NA, .chain = NA, .pathology = NA)

Arguments

.path

Character. A path to the database file, e.g., "/Users/researcher/Downloads/McPAS-TCR.csv".

.db

Character. The database type, one of "vdjdb", "vdjdb-search", "mcpas" or "tbadb".

"vdjdb" for VDJDB; "vdjdb-search" for a search table obtained from the web interface of VDJDB; "mcpas" for McPAS-TCR; "tbadb" for PIRD TBAdb.

.species

Character. A string or a vector of strings specifying which species to keep from the database, e.g., "HomoSapiens". Pass NA (the default) to load all available species.

.chain

Character. A string or a vector of strings specifying which chains to keep from the database, e.g., "TRB". Pass NA (the default) to load all available chains.

.pathology

Character. A string or a vector of strings specifying which diseases, viruses, bacteria or other conditions to keep from the database, e.g., "CMV". Pass NA (the default) to load all available conditions.
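The filter arguments share one convention: NA means "no filter", and a character vector means "keep records matching any of these values". The base R sketch below illustrates that convention on a toy data frame. It is not the implementation used by dbLoad. The helpers `keep_rows` and `filter_db` and the toy data are hypothetical, and combining the three filters with a logical AND is an assumption suggested by the "human-only TRB-only" example.

```r
# Toy stand-in for a loaded database. The Species / Chain / Pathology
# column names mirror those shown in the example output; the rows are made up.
toy_db <- data.frame(
  cdr3      = c("CASSA", "CASSB", "CAVRA", "CASSC"),
  Species   = c("HomoSapiens", "HomoSapiens", "HomoSapiens", "MusMusculus"),
  Chain     = c("TRB", "TRB", "TRA", "TRB"),
  Pathology = c("CMV", "EBV", "CMV", "CMV"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: a single NA disables the filter,
# otherwise keep rows whose value is one of `allowed`.
keep_rows <- function(values, allowed) {
  if (length(allowed) == 1 && is.na(allowed)) {
    return(rep(TRUE, length(values)))
  }
  values %in% allowed
}

# Hypothetical helper: apply all three filters (assumed to combine with AND).
filter_db <- function(db, .species = NA, .chain = NA, .pathology = NA) {
  keep <- keep_rows(db$Species, .species) &
    keep_rows(db$Chain, .chain) &
    keep_rows(db$Pathology, .pathology)
  db[keep, , drop = FALSE]
}

# Human TRB records for any condition
human_trb <- filter_db(toy_db, .species = "HomoSapiens", .chain = "TRB")

# CMV records for any species and chain
cmv_any <- filter_db(toy_db, .pathology = "CMV")
```

Passing a vector such as `.chain = c("TRA", "TRB")` keeps records from either chain under this sketch.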

Value

A data frame with the input database records that match the requested species, chains and pathologies.

Examples

# Example file path
file_path <- system.file("extdata/db/vdjdb.example.txt", package = "immunarch")

# Load the database with human-only TRB-only receptors for all known antigens
db <- dbLoad(file_path, "vdjdb", "HomoSapiens", "TRB")
db
#> # A tibble: 10 × 19
#>    gene  cdr3    species antigen.epitope antigen.gene antigen.species complex.id
#>    <chr> <chr>   <chr>   <chr>           <chr>        <chr>                <dbl>
#>  1 TRB   CASSQD… HomoSa… RLRAEAQVK       EBNA3A       EBV                  19268
#>  2 TRB   CSASIL… HomoSa… KLGGALQAK       IE1          CMV                   8584
#>  3 TRB   CASSYF… HomoSa… KLGGALQAK       IE1          CMV                   3445
#>  4 TRB   CASSAF… HomoSa… NLVPMVATV       pp65         CMV                      0
#>  5 TRB   CASSLW… HomoSa… KLGGALQAK       IE1          CMV                  19396
#>  6 TRB   CASSLT… HomoSa… NLVPMVATV       pp65         CMV                      0
#>  7 TRB   CASTAK… HomoSa… KLGGALQAK       IE1          CMV                  10972
#>  8 TRB   CASSGA… HomoSa… KLGGALQAK       IE1          CMV                   6231
#>  9 TRB   CASSLI… HomoSa… KLGGALQAK       IE1          CMV                  12587
#> 10 TRB   CATSSS… HomoSa… KLGGALQAK       IE1          CMV                  13267
#> # ℹ 12 more variables: v.segm <chr>, j.segm <chr>, v.end <dbl>, j.start <dbl>,
#> #   mhc.a <chr>, mhc.b <chr>, mhc.class <chr>, reference.id <chr>,
#> #   vdjdb.score <dbl>, Species <chr>, Chain <chr>, Pathology <chr>
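A further sketch, using only the arguments documented above, shows how named arguments and a vector of chains could be combined with a pathology filter. It assumes immunarch is installed and that the bundled example file is available. Which chains and conditions the example file actually contains is not stated here, so no output is shown.

```r
# Runs only when immunarch is available; otherwise the block is skipped.
if (requireNamespace("immunarch", quietly = TRUE)) {
  library(immunarch)

  file_path <- system.file("extdata/db/vdjdb.example.txt", package = "immunarch")

  # Human TRA or TRB receptors associated with CMV
  db_cmv <- dbLoad(
    file_path,
    .db = "vdjdb",
    .species = "HomoSapiens",
    .chain = c("TRA", "TRB"),
    .pathology = "CMV"
  )

  # Inspect what survived the filters
  print(nrow(db_cmv))
  print(unique(db_cmv$Chain))
}
```

Naming the arguments makes the call easier to read than positional matching once more than two filters are in play.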