Article Information
History
- July 10, 2024.
Article Versions
- Version 1 (November 22, 2023 - 08:54).
- Version 2 (the most recent version of this article).
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Author Information
- (a) Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06510, USA
- (b) Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- (c) Department of Computer Science, Yale University, New Haven, CT 06511, USA
- (d) Department of Statistics and Data Science, Yale University, New Haven, CT 06511, USA
- *Corresponding author; email: pi{at}gersteinlab.org
- Dr. Mor Frank: developed the end-to-end modeling framework, from concept to methodology, for both the machine learning and language models; wrote the code for the entire pipeline, from feature engineering and modeling approaches to analysis of the modeling results; and wrote the manuscript.
- Dr. Pengyu Ni: interpreted the LLM, managed the genetic analysis, and reviewed the manuscript.
- Dr. Matthew Jensen: preprocessed the AMP-AD dataset and obtained the gene expression matrix.
- Prof. Mark B Gerstein: contributed expertise in biophysics-related computational models through scientific and professional discussions, and reviewed the manuscript.