Introduction

immunarch is an R package designed to analyse T-cell receptor (TCR) and B-cell receptor (BCR) repertoires, aimed at medical scientists and bioinformaticians. The mission of immunarch is to make immune sequencing data analysis as effortless as possible and help you focus on research instead of coding. Follow us on Twitter for news and updates.

Installation

To install immunarch, run the following R command:

install.packages("immunarch")

That’s it, you can start using immunarch now! See the Quick Start section below to dive into immune repertoire data analysis. If you run into any trouble with installation, take a look at the Installation Troubleshooting section below.

Note that the package has quite a lot of dependencies, because it installs the widely used packages for data analysis and visualisation. You get both the AIRR data analysis framework and the data science package ecosystem with a single command!

You can find the list of releases of immunarch here: https://github.com/immunomind/immunarch/releases

Pre-release version installation

Since CRAN limits releases to roughly one every one to two months, you can install the latest pre-release version, with bleeding-edge features and optimisations, directly from the code repository. Installing the latest pre-release version takes only two commands:

install.packages("devtools") # skip this if you already installed devtools
devtools::install_url("https://github.com/immunomind/immunarch/raw/master/immunarch.tar.gz")

Features

  1. Fast and easy manipulation of immune repertoire data:

    • The package automatically detects the format of your files, so there is no need to guess which format a file is in: just pass the files to the package;

    • Supports all popular TCR and BCR analysis and post-analysis formats, including single-cell data: ImmunoSEQ, IMGT, MiTCR, MiXCR, MiGEC, MigMap, VDJtools, tcR, AIRR, 10x Genomics, ArcherDX. More are coming in the future;

    • Works on any data source you are comfortable with: R data frames, data tables from data.table, databases like MonetDB, Apache Spark data frames via sparklyr;

    • Tutorial is available here.

  2. Immune repertoire analysis made simple:

    • Most methods are incorporated in a couple of main functions with clear names, so there is no need to remember dozens of functions with obscure names. For details see link;

    • Repertoire overlap analysis (common indices including overlap coefficient, Jaccard index and Morisita’s overlap index). Tutorial is available here;

    • Gene usage estimation (correlation, Jensen-Shannon Divergence, clustering). Tutorial is available here;

    • Diversity evaluation (ecological diversity index, Gini index, inverse Simpson index, rarefaction analysis). Tutorial is available here;

    • Tracking of clonotypes across time points, widely used in vaccination and cancer immunology domains. Tutorial is available here;

    • Kmer distribution measures and statistics. Tutorial is available here;

    • Coming in the next releases: CDR3 amino acid physical and chemical properties assessment, mutation networks.

  3. Publication-ready plots with a built-in tool for visualisation manipulation:

    • Rich visualisation procedures with ggplot2;

    • Built-in tool FixVis makes your plots publication-ready: easily change font sizes, text angles, titles, legends and much more with a clear-cut GUI;

    • Tutorial is available here.
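
To make a few of the indices named above concrete, here is a minimal base R sketch on toy data. It uses the textbook definitions of the Jaccard index, the overlap coefficient and the inverse Simpson index; it is not taken from immunarch's source, and the clonotype names and counts are made up for illustration.

```r
# Toy clonotype counts for two hypothetical repertoires
# (vector names stand in for CDR3 amino acid sequences).
rep_a = c(CASSLGTDTQYF = 10, CASSPGQGYEQYF = 5, CASSIRSSYEQYF = 1)
rep_b = c(CASSLGTDTQYF = 4, CASSPGQGYEQYF = 4, CASSQDRGNTEAFF = 2)

shared = intersect(names(rep_a), names(rep_b))
all_clones = union(names(rep_a), names(rep_b))

# Jaccard index: shared clonotypes divided by all distinct clonotypes.
jaccard = length(shared) / length(all_clones)

# Overlap coefficient: shared clonotypes divided by the size
# of the smaller repertoire.
overlap_coef = length(shared) / min(length(rep_a), length(rep_b))

# Inverse Simpson index: 1 / sum of squared clonotype proportions.
inverse_simpson = function(counts) {
  p = counts / sum(counts)
  1 / sum(p^2)
}

print(c(jaccard = jaccard, overlap = overlap_coef))
print(sapply(list(a = rep_a, b = rep_b), inverse_simpson))
```

In the package itself these measures are selected through the `.method` argument of the main analysis functions rather than computed by hand.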

Quick start

The gist of a typical TCR or BCR data analysis workflow can be reduced to the following few lines of code.

1) Load the package and the data
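
A minimal sketch of this step, assuming immunarch is installed. The file path is a placeholder, not a real location; the example dataset `immdata` that ships with the package is used so that the snippet runs without your own files.

```r
# 1.1) Load the package into R:
library(immunarch)

# 1.2) To load your own data, point repLoad at a file or folder.
#      The path below is a placeholder; replace it with your own.
# file_path = "path/to/your/data"
# immdata = repLoad(file_path)

# 1.3) Or use the example dataset bundled with the package:
data(immdata)

# The result is a list holding the repertoires and the sample metadata.
names(immdata)
head(immdata$meta)
```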

2) Analyse repertoire similarity at the clonotype level
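
A sketch of this step using the bundled example data. The method names passed to `.method` follow the package documentation as far as we know and may differ between versions, so check `?repOverlap` and `?repOverlapAnalysis` if a call fails.

```r
library(immunarch)
data(immdata)

# 2.1) Count the clonotypes shared by each pair of samples and plot the matrix:
ov = repOverlap(immdata$data, .method = "public", .verbose = FALSE)
vis(ov)

# 2.2) Cluster samples by their similarity:
ov.kmeans = repOverlapAnalysis(ov, .method = "mds+kmeans")
vis(ov.kmeans)
```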

3) Find repertoire differences in the Variable gene usage
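
A sketch of this step using the bundled example data, which we assume here to be human TCR beta chains (hence the `"hs.trbv"` gene alias). The `Status` column comes from the example metadata; substitute a column from your own metadata table.

```r
library(immunarch)
data(immdata)

# 3.1) Compute V gene usage and compare it between Status groups:
gu = geneUsage(immdata$data, .gene = "hs.trbv")
vis(gu, .by = "Status", .meta = immdata$meta)

# 3.2) Cluster samples by V gene usage similarity
#      (Jensen-Shannon divergence followed by hierarchical clustering):
gu.clust = geneUsageAnalysis(gu, .method = "js+hclust")
vis(gu.clust)
```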

4) Find differences in the diversity of repertoires
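
A sketch of this step using the bundled example data. The grouping columns `Status` and `Lane` come from the example metadata and are not required names; use whichever columns your own metadata provides.

```r
library(immunarch)
data(immdata)

# 4.1) Estimate repertoire diversity with the Chao1 estimator and
#      visualise the samples grouped by two metadata columns:
div = repDiversity(immdata$data, .method = "chao1")
vis(div, .by = c("Status", "Lane"), .meta = immdata$meta)
```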

5) Manipulate plots to make them publication-ready

# 5.1) Manipulate the visualisation of diversity estimates to make the plot publication-ready:
div = repDiversity(immdata$data, .method = "chao1")
div.plot = vis(div, .by=c("Status", "Lane"), .meta=immdata$meta)
fixVis(div.plot)

6) Advanced methods

For advanced methods such as clonotype tracking, kmer analysis and public repertoire analysis see “Tutorials”.
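
As a rough orientation before reading the tutorials, the sketch below calls one function for each of the three areas on the bundled example data. The argument values (top 5 clonotypes, k-mer length 5, amino acid sequences) are arbitrary choices for illustration, and signatures may vary between package versions.

```r
library(immunarch)
data(immdata)

# Clonotype tracking: follow the 5 most abundant clonotypes
# of the first sample across all samples.
tc = trackClonotypes(immdata$data, list(1, 5), .col = "aa")
vis(tc)

# K-mer analysis: count amino acid 5-mers in the first sample.
kmers = getKmers(immdata$data[[1]], 5)
vis(kmers)

# Public repertoire: a table of clonotypes and the samples they occur in.
pr = pubRep(immdata$data, "aa", .verbose = FALSE)
head(pr)
```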

Bugs and Issues

The mission of immunarch is to make immune repertoires painless to analyse. All bug reports, documentation improvements, enhancements and ideas are welcome.

If through using immunarch you have an idea of your own or are looking for something in the documentation and thinking ‘this can be improved’… you can do something about it! Just let us know via GitHub or support@immunomind.io.

Bug reports are an important part of making immunarch more stable. A complete bug report allows us to reproduce the bug and gives us insight into how to fix it.

Bug reports must:

  1. Include a short, self-contained R snippet reproducing the problem.
  2. Add a minimal data sample for us to reproduce the problem. If for some reason you don’t want to share it publicly on GitHub, we are always available through support@immunomind.io.
  3. Explain why the current behavior is wrong/not desired and what you expect instead.
  4. If the issue is somehow connected with plotting or visualization, please attach a picture. It’ll be much simpler for us to see what you see.
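
A possible shape for such a snippet is sketched below. It is only a template: `repOverlap` stands in for whichever call misbehaves for you, and the bundled example data stands in for your own minimal sample.

```r
library(immunarch)
data(immdata)

# Shrink the input: two samples, at most the first 100 clonotypes of each.
small = lapply(immdata$data[1:2], head, 100)

# The call that shows the problem (a stand-in here).
res = repOverlap(small, .verbose = FALSE)
print(res)

# Version details that help us reproduce the issue.
packageVersion("immunarch")
sessionInfo()
```

Pasting the printed output together with the code makes it clear what you observed and what you expected instead.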

Citation

ImmunoMind Team. (2019). immunarch: An R Package for Painless Analysis of Large-Scale Immune Repertoire Data. Zenodo. http://doi.org/10.5281/zenodo.3367200

BibTeX:

@misc{immunomind_team_2019_3367200,
  author       = {{ImmunoMind Team}},
  title        = {{immunarch: An R Package for Painless Analysis of 
                   Large-Scale Immune Repertoire Data}},
  month        = aug,
  year         = 2019,
  doi          = {10.5281/zenodo.3367200},
  url          = {https://doi.org/10.5281/zenodo.3367200}
}

For EndNote citation import the immunarch-citation.xml file.

A preprint on bioRxiv is coming soon.

License

The package is freely distributed under the Apache v2 license. You can read more about it here.

Additionally, we provide an annual subscription that includes the following services:

  • Prompt package modifications and feature implementations;
  • Access to the immunarch team's expertise in your projects;
  • Priority email and call support;
  • 100+ hours of consultations on TCR and BCR repertoire analysis;
  • Setup of a cloud or cluster installation of immunarch, including the development of cloud immunarch-based software;
  • If you need a license other than the current one, contact us.

Contact us for more information.