Introduction to LLMs
Bolus Dose EP. 03
Finally! We’re talking about large language models (LLMs), including ChatGPT. We discuss the following topics:
Brief explanation of transformer models and how they work, including the attention mechanism and the context window
A brief history of LLM development to this point
How LLMs are trained
Some limitations - privacy, accuracy, provenance
Some resources and papers we discuss:
Attention is all you need: Vaswani, Ashish et al. “Attention is All you Need.” Neural Information Processing Systems (2017).
Discussion of the environmental impact of LLMs
Discussion of hallucination in LLMs
Use of LLMs to make inpatient discharge summaries more interpretable to patients; 56% were deemed complete and 18% had critical omissions: Zaretsky J, Kim JM, Baskharoun S, Zhao Y, Austrian J, Aphinyanaphongs Y, Gupta R, Blecker SB, Feldman J. Generative Artificial Intelligence to Transform Inpatient Discharge Summaries to Patient-Friendly Language and Format. JAMA Netw Open. 2024 Mar 4;7(3):e240357.
The “lost in the middle” aspect of context windows: Liu, Nelson F., et al. “Lost in the Middle: How Language Models Use Long Contexts.” Transactions of the Association for Computational Linguistics, vol. 12, 2024, pp. 157–73. https://doi.org/10.1162/tacl_a_00638.