Systematic Review of Reporting guidelines for large language models used in healthcare research

Saif Aldeen Alryalat; Iyad Sultan

doi:10.59707/hymrUXPX7081

Vol. 3 No. 2 (2025), Reviews

Published 2025-12-01

Iyad Sultan

King Hussein Cancer Center

https://orcid.org/0000-0002-2664-1565

Keywords

Large Language Models
Reporting Guidelines
CHART
TRIPOD

How to Cite

Alryalat, S. A., & Sultan, I. (2025). Systematic Review of Reporting guidelines for large language models used in healthcare research. High Yield Medical Reviews, 3(2). https://doi.org/10.59707/hymrUXPX7081

Abstract

This systematic review aims to synthesize existing reporting guidelines for large language models (LLMs) in healthcare research and to evaluate how well they address gaps in transparency, reproducibility, and clinical applicability. We systematically searched the PubMed database for studies on reporting guidelines for LLMs used in healthcare research and included 18 studies. These studies primarily aimed to develop or evaluate reporting frameworks that improve transparency, reproducibility, and methodological rigor in LLM applications, and several created structured reporting checklists. The Chatbot Assessment Reporting Tool (CHART) was developed across multiple studies. Similarly, TRIPOD-LLM extended the TRIPOD+AI framework with 19 main items and 50 subitems, emphasizing modular reporting for diverse LLM tasks. While existing reporting guidelines are an important step toward standardizing LLM research, their long-term impact will depend on broad adoption and iterative refinement as artificial intelligence continues to evolve.

This work is licensed under a Creative Commons Attribution 4.0 International License.
