Introduction to LLMs

 

Bolus Dose EP. 03

Finally! We’re talking about large language models (LLMs), including ChatGPT. We discuss the following topics:

  • A brief explanation of transformer models and how they work, including the attention mechanism and the context window

  • A brief history of how LLMs developed to this point

  • How LLMs are trained

  • Some limitations: privacy, accuracy, and provenance
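The attention mechanism mentioned above can be sketched in a few lines of NumPy. This is a hedged toy illustration of scaled dot-product attention (the core operation from the "Attention is All You Need" paper cited below); the shapes and random values are ours, not anything from the episode:

```python
# Minimal sketch of scaled dot-product attention.
# Toy shapes for illustration only.
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Each query scores every key; the scaled scores become weights
    # over the value vectors. The "context window" is simply how many
    # key/value positions the model can attend over at once.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (n_queries, n_keys)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # (n_queries, d_v)

# Toy example: 3 tokens, 4-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = attention(x, x, x)  # self-attention: Q, K, V all come from the same input
print(out.shape)  # (3, 4)
```

In a real transformer the queries, keys, and values are learned linear projections of the input, and many attention "heads" run in parallel, but the core arithmetic is this weighted average.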

Some resources and papers we discuss: 

Attention is all you need: Vaswani, Ashish, et al. “Attention Is All You Need.” Advances in Neural Information Processing Systems (2017).

Discussion of the environmental impact of LLMs 

Discussion of hallucination in LLMs:

Use of LLMs to make inpatient discharge summaries more interpretable to patients (56% were deemed complete; 18% had critical omissions): Zaretsky J, Kim JM, Baskharoun S, Zhao Y, Austrian J, Aphinyanaphongs Y, Gupta R, Blecker SB, Feldman J. “Generative Artificial Intelligence to Transform Inpatient Discharge Summaries to Patient-Friendly Language and Format.” JAMA Netw Open. 2024 Mar 4;7(3):e240357.

The “lost in the middle” aspect of context windows: Liu, Nelson F., et al. “Lost in the Middle: How Language Models Use Long Contexts.” Transactions of the Association for Computational Linguistics, vol. 12, 2024, pp. 157–73. https://doi.org/10.1162/tacl_a_00638.

 
 