Fig. 1 | Genome Medicine

Fig. 1

From: Exploring the pre-immune landscape of antigen-specific T cells

Estimating baseline T cell frequencies using a VDJ rearrangement model. a Schematic description of the TCRβ baseline frequency estimator. CDR3 sequences were sampled from the pre-trained probabilistic model of Murugan et al. for each VJ segment combination, translated, and matched to a given CDR3 sequence (allowing at most one amino acid substitution, see the “Methods” section) to estimate its theoretical rearrangement probability. Resulting probabilities were corrected for the sample-specific VJ segment frequency profile. b The observed (Y-axis) versus estimated (X-axis) rearrangement frequencies for 6853 human TCR sequences with known antigen specificities selected from VDJdb in 786 immune repertoire samples from Emerson et al. containing 151,020,646 unique rearrangements (identical TCRβ nucleotide sequences observed in different donors were counted as distinct). Observed frequencies were computed as the total number of unique rearrangements encoding a given CDR3 amino acid sequence in the pooled dataset (with at most one substitution) divided by the total number of unique rearrangements. The red line displays the linear model fit for log-transformed frequencies. c Density plot showing the probability of rearranging the same nucleotide sequence in different individuals versus the theoretical rearrangement probability for VDJdb TCR variants (amino acid sequences). The red curve displays the smoothing fit.
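
The matching step described in panel a can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy CDR3 sequences, and the VJ usage values are hypothetical. The sampled sequences are hard-coded here rather than drawn from the model of Murugan et al. The VJ correction is assumed, for illustration only, to be a weighted sum of per-VJ match fractions; the caption does not specify its exact form.

```python
def within_one_substitution(a: str, b: str) -> bool:
    """True if a and b have equal length and differ at no more than one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) <= 1


def match_fraction(query: str, sampled: list[str]) -> float:
    """Fraction of sampled CDR3 amino acid sequences that match the query."""
    if not sampled:
        return 0.0
    hits = sum(within_one_substitution(query, s) for s in sampled)
    return hits / len(sampled)


def estimate_frequency(query, samples_by_vj, vj_freq):
    """Weight per-VJ match fractions by VJ usage (assumed form of the correction)."""
    return sum(
        vj_freq.get(vj, 0.0) * match_fraction(query, seqs)
        for vj, seqs in samples_by_vj.items()
    )


# Toy stand-in for sequences sampled from a rearrangement model (hypothetical data).
samples_by_vj = {
    ("TRBV9", "TRBJ2-1"): [
        "CASSVGQGNEQFF", "CASSVGQGYEQFF", "CASSLGQGNEQFF", "CASSPDRGNEQFF",
    ],
    ("TRBV19", "TRBJ2-7"): ["CASSIRSSYEQYF", "CASSIRSAYEQYF"],
}
# Hypothetical sample-specific VJ usage frequencies.
vj_freq = {("TRBV9", "TRBJ2-1"): 0.25, ("TRBV19", "TRBJ2-7"): 0.75}

estimate = estimate_frequency("CASSVGQGNEQFF", samples_by_vj, vj_freq)
print(estimate)
```

In this toy setup, three of the four TRBV9/TRBJ2-1 sequences lie within one substitution of the query and none of the TRBV19/TRBJ2-7 sequences do, so the estimate is dominated by the first VJ combination. A realistic estimator would need far more sampled sequences per VJ combination to resolve low rearrangement probabilities.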
