Artificial Intelligence in Medical Education
09/01/2024
Since the public release of ChatGPT on Nov. 30, 2022, there has been widespread use of artificial intelligence (AI) in many fields. The past, present and future of AI, particularly large language models (LLMs), in medical education are emerging topics that are important for the modern medical educator. This blog post will discuss the basics of how LLMs work, how they are currently being used, and the potential possibilities and pitfalls of their use in medical education.
Past
LLMs are a type of AI algorithm that works by analyzing a large amount of data in the form of text, for example, internet text including Wikipedia, programming code, and public domain books. Training data are processed through a neural network, a computational model in hardware and software that mimics the human brain's network of neurons for pattern recognition and problem-solving. Deep learning uses neural networks with multiple layers to process data, recognize patterns, and form abstract representations, learning context and meaning within the training data set. LLMs undergo a process called pretraining to generate a model that can predict natural language patterns, including the appropriate context (surrounding information), syntax (word order) and semantics (word meaning). Once the model is fine-tuned through human-supervised learning and deployed in a chat interface, a user can input a query and the model will predict the sequence of words most likely to be an appropriate response as the output. Experts describe LLMs as a form of advanced autocompletion, similar to the text-prediction feature on smartphones. As such, outputs can range from extremely helpful and insightful to completely incoherent, much like a phone's autocompletion. Until recently, LLMs were not widely available for public use.
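To make the autocompletion analogy concrete, the following is a minimal, purely illustrative Python sketch (not how production LLMs are built) that "pretrains" a next-word predictor by counting which word tends to follow which in a tiny invented corpus. Real LLMs replace this simple counting with deep neural networks trained on vastly more text, but the underlying task of predicting the most likely continuation is the same.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for the web-scale text an LLM is pretrained on.
corpus = (
    "the patient was admitted to the hospital . "
    "the patient was discharged from the hospital . "
    "the resident was assigned to the clinic ."
).split()

# "Pretraining" here is simply counting which word follows which.
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word: str) -> str:
    """Return the word most often observed after `word` in the corpus."""
    candidates = next_word_counts.get(word)
    return candidates.most_common(1)[0][0] if candidates else "<unknown>"

print(predict_next("patient"))  # -> "was"
print(predict_next("to"))       # -> "the"
```

Like a phone's autocompletion, this predictor is only as good as the text it has seen, which is why scale and quality of training data matter so much for LLMs.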
In November 2022, OpenAI released ChatGPT, powered by GPT-3.5 (generative pre-trained transformer 3.5), which was the first publicly available general purpose LLM. The model has been updated since then, with the most recent version being GPT-4o. The specifics of the training data are proprietary, but they are understood to draw on the publicly available information on the internet as well as other privately obtained data sources. A general purpose LLM will respond to almost any topic with a model-generated response. For example, if the following prompt were input into ChatGPT: "Rewrite the Hippocratic Oath in iambic pentameter," the output would be a rearrangement of the classical version of the oath in the requested rhythmic meter.
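For readers curious about what the same interaction looks like outside the chat window, the following is a rough sketch using OpenAI's Python client; the model name and prompt are illustrative only, and it assumes an API key is already configured.

```python
# Illustrative sketch using OpenAI's Python client (pip install openai).
# Assumes an API key is configured in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # a current general purpose model at the time of writing
    messages=[
        {"role": "user",
         "content": "Rewrite the Hippocratic Oath in iambic pentameter."},
    ],
)

# The reply is the sequence of words the model predicts as most appropriate.
print(response.choices[0].message.content)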
Since the release of ChatGPT, several other general purpose LLMs from other developers have become available. Developers have also released specialized LLMs trained on specific content areas. As learners and educators begin to use this technology, understanding its strengths and limitations will become crucial. Under expert supervision, LLMs can enhance medical educators' capabilities by processing and summarizing large amounts of data, freeing educators from minutiae so they can focus on refining and implementing big ideas.
Present
A recent review summarized the growing role of AI in medical education, highlighting its ability to outperform humans on standardized medical examinations, its potential in personalized learning, and its applications in academic writing and summarization. There is a need for a new AI literacy among learners and educators, including understanding the technology's capabilities and integrating it responsibly. The following are some immediate applications for educators starting to use LLMs.
Specialized LLMs to Query the Literature
In academics generally, and in medicine specifically, developing the skill of querying the vast published literature to answer specific questions is crucial. Since this is essentially a language-processing task, LLMs trained on peer-reviewed published literature can answer specific natural language queries with model-generated responses. These models are in their infancy, but one resource currently available to licensed healthcare practitioners is OpenEvidence, which is produced through the Mayo Clinic Platform Accelerate program.
General Purpose LLMs for Journal Club
As powerful text-processing tools, general purpose LLMs can consume large amounts of text and accurately summarize and respond to queries about the input. A practical application in medical education could be the critical appraisal of publications in journal clubs. The manuscript, along with all published supplementary materials including entire protocols, could be input into an LLM, and questions could be asked as needed. For example, the LLM could articulate the study question of a clinical trial in PICO format, summarize the study protocol, and find details such as outcome results published in any of the uploaded documents. This tool could be valuable for educators and learners in journal clubs, enabling them to query the LLM to answer discussion questions instead of sifting through large amounts of text.
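As a purely hypothetical sketch (the file name and prompt are invented for illustration), a journal club facilitator might pass a deidentified manuscript to a general purpose LLM and request a PICO summary along these lines:

```python
# Hypothetical journal club sketch: the file name and prompt are invented for
# illustration. Assumes OpenAI's Python client and a deidentified manuscript
# (plus supplementary materials) saved as plain text.
from openai import OpenAI

client = OpenAI()

with open("trial_manuscript.txt", encoding="utf-8") as f:
    manuscript = f.read()

prompt = (
    "Using only the text provided below, state the study question in PICO "
    "format (Population, Intervention, Comparison, Outcome) and briefly "
    "summarize the study protocol.\n\n" + manuscript
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)  # always review for accuracy
```

The same pattern of "paste the source text, then ask a focused question" works equally well through the chat interface without any code.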
General Purpose LLMs for Administrative Tasks
Any administrative task that requires the generation of text could be aided heavily by general purpose LLMs. For example, our Internal Medicine Residency Program recently used ChatGPT to generate the dates on which internal medicine residents will have their continuity clinics in the next academic year, based on specific parameters such as which weeks they are in clinic, which days of the week they are assigned clinic, and excluding days the clinics are closed, such as during conferences and vacations. After reviewing the output for accuracy, we were able to use the dates to open clinic schedules on the appropriate dates for the entire year. A task that previously took one to two hours was completed by the LLM in a matter of minutes. A multitude of similar tasks could be enhanced through the use of LLMs, including summarizing feedback and evaluations for learners, generating summaries or flashcards from lecture material, and developing teaching cases or questions for clinical learning.
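To illustrate the kind of logic the LLM was asked to perform, here is a hypothetical deterministic sketch of the same scheduling task; all parameters (clinic weekday, assigned weeks, closure dates) are invented for illustration and do not reflect our program's actual schedule.

```python
# Hypothetical sketch of the scheduling logic described above: list the dates a
# resident has continuity clinic given an assigned weekday, assigned weeks of a
# repeating four-week block, and dates the clinic is closed.
from datetime import date, timedelta

ACADEMIC_YEAR_START = date(2024, 7, 1)
ACADEMIC_YEAR_END = date(2025, 6, 30)
CLINIC_WEEKDAY = 2                      # Wednesday (Monday = 0)
ASSIGNED_WEEKS = {1, 3}                 # weeks 1 and 3 of each four-week block
CLINIC_CLOSURES = {date(2024, 11, 28), date(2024, 12, 25)}  # e.g., holidays

def clinic_dates():
    """Yield every date the resident is scheduled for continuity clinic."""
    day = ACADEMIC_YEAR_START
    while day <= ACADEMIC_YEAR_END:
        week_of_block = ((day - ACADEMIC_YEAR_START).days // 7) % 4 + 1
        if (day.weekday() == CLINIC_WEEKDAY
                and week_of_block in ASSIGNED_WEEKS
                and day not in CLINIC_CLOSURES):
            yield day
        day += timedelta(days=1)

for clinic_day in clinic_dates():
    print(clinic_day.isoformat())
```

In practice the parameters were described to ChatGPT in plain language, and the output was reviewed against exactly this kind of rule set before the clinic schedules were opened.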
Future
Though the possible applications for LLMs are compelling, their limitations must be considered. The first is that the outputs of available models are not generated by content experts and thus require expert review to ensure accuracy. In medical education, educators with expertise in the specific area that pertains to the input are best equipped to write a specific prompt as well as review the output for accuracy. One source of inaccurate outputs is that training data contain inherent bias, so outputs are expected to be biased as well, though the bias may be subtle and unrecognizable to nonexperts. Another peculiar source of inaccurate outputs is that LLMs will on occasion generate completely erroneous outputs for unclear reasons, a phenomenon termed "hallucinations." Educators without content expertise may not be able to differentiate valid outputs from these hallucinations. Due to these flaws, it is crucial for educators to verify all LLM outputs to confirm accuracy and usability. The other major concern is privacy. None of the publicly available LLMs are HIPAA compliant, and the privacy of input information cannot be ensured. Therefore, use should not include any protected health information, and inputs should be deidentified to maintain privacy. In academic settings, misuse of LLMs could raise concerns about academic integrity, necessitating clear policies on their use. When available, institutional policies should be strictly adhered to and appropriate credit disclosed.
AI is an emerging technology that appears to be a significant leap forward in the computerized performance of cognitive tasks. Though the current and possible future applications are compelling, the academic medical community must adopt it with careful consideration in order to preserve the unique and irreplaceably human elements of medical education and practice.
Disclosure: GPT-4o proofread the first draft of this post for grammar and clarity.
Paul Kunnath, M.D., is assistant professor of internal medicine, associate program director of the Internal Medicine Residency program and co-director of the ambulatory medicine clerkship at SSM Health and the 91女神 School of Medicine. He is a fellow of the American College of Physicians. Kunnath's areas of professional interest include learner self-reflection, evidence-based medicine, and patient-centered teaching. Kunnath can be contacted via email.