The curious case of the missing T cell receptor transcripts, part 3

[ maehrlab  single_cell  ]

This is part 3 of a three-part post on T-cell receptors & RNA data. Here are parts 1 (intro / summary) and 2 (technical details).

Bonus post: TCR summary statistics

In part 1, I showed that a simple approach allows for specific detection of TCR transcripts in mature thymocytes. In part 2, I gave the exact details of the method. The important work is done, at least as befits our lab’s priorities for that project, and this post is purely for amusement.

As long as I had all the human TCR segments in a tidy dataframe, I decided to display some basic properties. Here’s the length and number of known segments by type and by locus.

Counts of known human TCR segments …

… from downloaded TCR recombinome

  C V D J
A 1 119 0 68
B 1 142 3 16
G 1 22 0 6
D 1 6 3 4

… from a textbook (Mak and Saunders, “Primer to the Immune Response”)

  C V D J
A 1 42 0 61
B 1 48 2 13
G 1 14 0 5
D 1 10 3 3

These counts are different from what my immunology textbook gives. (I’m reading “Primer to the Immune Response” by Tak Mak and Mary Saunders, and the TCR segment counts are in table 8-2, page 137.) Most counts pulled straight from the data are bigger, and a few are much bigger. For instance, Mak and Saunders say there are only 42 V segments in the human TCRA locus, whereas the table above has 119.

Why? Because the TRACER recombinome lists a different segment for every allele, whereas the textbook counts two different alleles as belonging to the same segment. I found this out because the names look like TRAV1-1*02, and the IMGT IG/TR nomenclature page says “Allele names comprise the IMGT gene name followed by an asterisk and a two-figure number.” (See footnote.)

This brings up another potential improvement for TCR alignment: don’t build the reference out of near-identical segments that differ by only a single variant!

Mak and Saunders claim there are 10 V segments in TCRD locus, whereas the recombinome only lists 6. This is the only category for which Mak and Saunders indicate more segments than TRACER does. I still don’t know why. Maybe the best available information has changed since 2008 when this copy of the textbook was printed – but it seems unlikely that the number of known segments would decrease.

Footnote

That IMGT site is amazing. If you ever need an IGK allele table for the common brush-tailed possum, look no further.

Written on November 23, 2018