Introduction to LLMs
Bolus Dose EP. 03
Finally! We’re talking about large language models (LLMs), including ChatGPT. We discuss the following topics:
Brief explanation of transformer models and how they work, including the attention mechanism and the context window
A brief history of LLM development to this point
How LLMs are trained
Some limitations - privacy, accuracy, provenance
Some resources and papers we discuss:
Attention is all you need: Vaswani, Ashish et al. “Attention is All you Need.” Neural Information Processing Systems (2017).
Discussion of the environmental impact of LLMs
Discussion of hallucination in LLMs
Use of LLMs to make inpatient discharge summaries more interpretable to patients; 56% were deemed complete and 18% had critical omissions: Zaretsky J, Kim JM, Baskharoun S, Zhao Y, Austrian J, Aphinyanaphongs Y, Gupta R, Blecker SB, Feldman J. Generative Artificial Intelligence to Transform Inpatient Discharge Summaries to Patient-Friendly Language and Format. JAMA Netw Open. 2024 Mar 4;7(3):e240357.
The “lost in the middle” aspect of context windows: Liu, Nelson F., et al. “Lost in the Middle: How Language Models Use Long Contexts.” Transactions of the Association for Computational Linguistics, vol. 12, 2024, pp. 157–73. https://doi.org/10.1162/tacl_a_00638.