HLA Evolutionary Divergence Project
Purpose
This was my first personal project, undertaken to actively improve my Python programming skills. Using alignment data from the IPD-IMGT/HLA Database, I wrote a program that calculates the evolutionary divergence of HLA genes from a file containing a patient's or donor's HLA typing results.
The project has also been turned into a Flask web application: the typing file can be uploaded, and the application returns a JSON file containing the evolutionary divergence of the HLA genes in that file.
Academic Background
HLA Evolutionary Divergence (HED) has previously been used as a way to describe the immunopeptidome presented by HLA molecules on the cell surface, and there is evidence that the HED scores of HLA class I alleles can affect the efficacy of cancer immunotherapies.
The HED score for each locus is calculated using the Grantham amino acid distance matrix, which scores the difference between the two amino acids found at the same position in the protein sequences of the two alleles at that locus.
The HED score for each class is the mean of the HED scores of the loci in that class.
For further information regarding the HED score please refer to the papers referenced towards the bottom of the page.
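The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this project: the function names are hypothetical, the per-locus score is assumed to be the summed pairwise distance divided by the aligned sequence length, and `TOY_MATRIX` uses made-up residue symbols and placeholder values rather than real Grantham distances.

```python
from statistics import mean

# Placeholder distances for three made-up residue symbols.
# These are NOT real Grantham values; a real run would load the full matrix.
TOY_MATRIX = {("X", "Y"): 10, ("X", "Z"): 20, ("Y", "Z"): 30}


def pair_distance(aa1, aa2, matrix):
    """Return the distance between two residues; identical residues score 0."""
    if aa1 == aa2:
        return 0.0
    # The matrix is symmetric, so accept either key order.
    if (aa1, aa2) in matrix:
        return float(matrix[(aa1, aa2)])
    return float(matrix[(aa2, aa1)])


def locus_hed(seq1, seq2, matrix):
    """Score one locus from two aligned, equal-length protein sequences.

    Assumption: the summed distance is normalised by the sequence length.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    if not seq1:
        raise ValueError("sequences must not be empty")
    total = sum(pair_distance(a, b, matrix) for a, b in zip(seq1, seq2))
    return total / len(seq1)


def class_hed(locus_scores):
    """Average the per-locus scores of one HLA class."""
    return mean(locus_scores.values())


# Differences: Y/Z at position 2 (30) and X/Y at position 4 (10).
print(locus_hed("XYZX", "XZZY", TOY_MATRIX))            # 40 / 4 = 10.0
print(class_hed({"A": 10.0, "B": 5.0, "C": 0.0}))      # 5.0
```

Passing the matrix in as an argument keeps the scoring logic separate from the data, so the same functions work once the real Grantham matrix is loaded.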
Assumptions and Limitations
While this project is a work in progress, I have made some assumptions that I would like to address:
- All null alleles contribute 0 to the overall score for the relevant gene.
- Homozygous loci contribute 0 to the overall score for the relevant gene.
- The ARD is classified as Exons 2 and 3 for HLA class I genes and Exon 2 for HLA class II genes.
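The assumptions listed above could be encoded roughly as follows. This is a hedged sketch rather than the project's actual logic: `ARD_EXONS`, `is_null_allele`, and `locus_score` are hypothetical names, the `divergence` argument stands in for whatever function scores two expressed alleles, and a locus is assumed to score 0 if either of its alleles is null. Null alleles are recognised by the `N` expression suffix used in HLA nomenclature.

```python
# Exons treated as the ARD for each supported locus, following the list above.
ARD_EXONS = {
    "A": (2, 3),
    "B": (2, 3),
    "C": (2, 3),
    "DRB1": (2,),
    "DQB1": (2,),
}


def is_null_allele(allele):
    """Return True for alleles with the 'N' (null) suffix, e.g. 'A*24:09N'."""
    return allele.strip().upper().endswith("N")


def locus_score(allele1, allele2, divergence):
    """Apply the stated assumptions before scoring a locus.

    `divergence` is a hypothetical callable that returns the score
    for two distinct, expressed alleles.
    """
    # Assumption: a null allele on either side zeroes the locus.
    if is_null_allele(allele1) or is_null_allele(allele2):
        return 0.0
    # Homozygous typing: both alleles are identical, so there is no divergence.
    if allele1 == allele2:
        return 0.0
    return divergence(allele1, allele2)


def fake_divergence(a, b):
    # Stand-in scorer that returns a fixed, made-up value.
    return 7.5


print(locus_score("A*24:09N", "A*02:01", fake_divergence))  # 0.0
print(locus_score("A*02:01", "A*02:01", fake_divergence))   # 0.0
print(locus_score("A*02:01", "A*03:01", fake_divergence))   # 7.5
```

The homozygosity check compares allele names as plain strings, so typings reported at different resolutions (for example two-field versus three-field) would need to be normalised first.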
The main limitation of this project is that it currently only works on the ARD of expressed genes at the five key loci: A, B, C, DRB1 and DQB1.
I am still working on the logic to accurately calculate the evolutionary divergence of all HLA genes and HLA-related genes that a patient or donor may have.
Future Work
The current iteration of this work is a small side project that I undertook to apply my bioinformatics and Python skills, with the research goal of better understanding HLA variation. I am currently working on a more advanced version whose logic will be able to calculate the evolutionary divergence of all HLA genes and HLA-related genes that a patient or donor may have, as provided to the program.
Code
The code for this project is publicly available in the GitHub repositories linked below. For instructions on using these tools, please follow the README.md file.
References
- Chowell, D., Krishna, C., Pierini, F., Makarov, V., Rizvi, N. A., Kuo, F., … Chan, T. A. (2019). Evolutionary divergence of HLA class I genotype impacts efficacy of cancer immunotherapy. Nature Medicine, 25(11), 1715–1720. https://doi.org/10.1038/s41591-019-0639-4
- Pierini, F., & Lenz, T. L. (2018). Divergent Allele Advantage at Human MHC Genes: Signatures of Past and Ongoing Selection. Molecular Biology and Evolution, 35(9), 2145–2158. http://dx.doi.org/10.1093/molbev/msy116
Disclaimer
The code for this project is a work in progress and is absolutely not intended for clinical use. The code is provided as is, and I take no responsibility for any errors or omissions in the code or in the results it produces.
The code is provided for educational purposes only and should not be used for any other purpose.