As long as accuracy is maintained, generative artificial intelligence (AI) may help enhance patient-physician communication, according to research from NYU Langone Health.
The study, published in JAMA Network Open, found that a large language model (LLM), a form of generative AI, could convert patient discharge summaries, which are legally required to be made immediately available to patients, into a more comprehensible and user-friendly format.
“Increased patient access to their clinical notes through widespread availability of electronic patient portals has the potential to improve patient involvement in their own care, as well as confidence in their care from their care partners,” write lead author Jonah Zaretsky, a physician and researcher at NYU Langone Health, and colleagues.
“However, clinical notes are typically filled with technical language and abbreviations that make notes difficult to read and understand for patients and their care partners. This issue can create unnecessary anxiety or potentially delay care recommendations or follow-up for patients and their families.”
Zaretsky and colleagues investigated whether an LLM could efficiently convert standard patient discharge summaries into a format that is easier to understand and more practical. The study included a total of 50 patient discharge summaries.
The Flesch-Kincaid Grade Level test was used to determine the LLM output’s readability, and the Patient Education Materials Assessment Tool (PEMAT) was used to determine its understandability. Two doctors evaluated the AI-generated summaries’ accuracy using a six-point rating system.
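For context, the Flesch-Kincaid Grade Level maps a passage to an approximate U.S. school grade based on average sentence length and syllables per word. The short Python sketch below illustrates the standard formula; the naive syllable counter and the sample sentences are illustrative assumptions, not material from the study.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count runs of consecutive vowels, at least one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # Approximate sentence and word splitting with simple regexes.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid Grade Level formula.
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

clinical = ("The patient was admitted with an exacerbation of chronic obstructive "
            "pulmonary disease and treated with intravenous corticosteroids.")
plain = "You came in because your lung disease got worse. We gave you steroid medicine."

print(round(flesch_kincaid_grade(clinical), 1))  # higher grade level, harder to read
print(round(flesch_kincaid_grade(plain), 1))     # lower grade level, easier to read
```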
The AI-generated discharge summaries had a lower Flesch-Kincaid Grade Level (6.2) than the original discharge summaries (11), which suggests greater readability. Additionally, the AI summaries’ PEMAT understandability scores were significantly greater than the originals’—81% vs. 13%, respectively.
Of the 100 physician reviews (two per summary), 54 rated the AI-generated summary as completely accurate, earning the top score of six. However, 18 reviews raised safety concerns, stemming from omissions and inaccurate statements in the AI output. Physicians also deemed 44% of the AI-generated summaries incomplete.
“We think a major source of inaccuracy due to omission and incompleteness comes from prompt engineering that optimized for readability and understandability. For example, limiting the number of words in a sentence or a document is considered more understandable,” write the authors.
“This makes it difficult to provide a detailed, comprehensive description of a complex patient condition. Future iterations will have to explore the trade-off between readability and understandability on one hand and completeness on the other.”
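To make that trade-off concrete, the hypothetical sketch below shows how readability constraints such as sentence- and word-count limits might be expressed in a prompt. The wording and the specific limits are assumptions for illustration only, not the prompt the researchers used.

```python
# Hypothetical prompt illustrating readability constraints like those the authors describe.
READABILITY_PROMPT = (
    "Rewrite this hospital discharge summary for the patient. "
    "Use plain, everyday language at about a 6th-grade reading level. "
    "Keep every sentence under 15 words. "      # boosts readability scores...
    "Keep the whole summary under 200 words."   # ...but can force omissions
)

def build_prompt(discharge_summary: str) -> str:
    # Combine the readability instruction with the clinician's original note.
    return f"{READABILITY_PROMPT}\n\nDischarge summary:\n{discharge_summary}"
```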
The study suggests that generative AI holds promise for improving patient-physician communication if its accuracy problems can be resolved, even though it may not be ready for general adoption just yet.
“LLMs may not be ready for widespread unsupervised use to generate patient facing discharge summaries, given real safety risks and formidable, if solvable, technological and workflow barriers,” Charumathi Raghu Subramanian, a physician and researcher at the University of California, and colleagues wrote in an associated commentary article in the same journal. “But perhaps in the near future, with better safety profiles, more automated inputs and outputs, and strict clinician oversight, they may become important tools in enhancing health care communication.”