Temporal evolution of large language models in oncology: performance trends of ChatGPT-3.5, ChatGPT-4, and Gemini

November 9, 2025 | FAR Publishing Limited


Large language models (LLMs) have emerged as transformative tools in healthcare, offering potential value in oncology for information retrieval, clinical decision support, and patient communication. However, the dynamic nature of oncological knowledge—including evolving treatment guidelines and diagnostic standards—raises questions about how LLMs’ performance holds up over time, especially as these models are relied on for increasingly nuanced clinical tasks.

This study, conducted in adherence to PRISMA guidelines, systematically collected relevant literature through 2025 from PubMed, Google Scholar, and Web of Science databases. The research focused on three prominent LLMs: ChatGPT-3.5, ChatGPT-4, and Gemini. Researchers analyzed 614 oncology questions spanning common malignancies (e.g., lung, breast, colorectal cancer) and rare tumors (e.g., glioma, multiple myeloma), using both original study scoring criteria and a standardized five-point Likert scale to assess response accuracy.

The key findings reveal clearly divergent temporal trends across the three models.

Subjective questions—those requiring complex analysis, integration of clinical context, and nuanced judgment—were far more susceptible to temporal performance degradation than objective, fact-based queries. This disparity highlights the unique challenges LLMs face in applying evolving clinical knowledge to real-world oncology scenarios, where flexibility and alignment with the latest standards are critical.

The study’s results provide vital guidance for the responsible deployment of LLMs in oncology. As healthcare systems increasingly adopt these AI tools to support patient care and clinical decision-making, ongoing performance monitoring, standardized evaluation protocols, and strategies to integrate up-to-date clinical data will be essential to ensure safety and reliability.

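Journal: Journal of Translational Medicine
DOI: 10.1186/s12967-025-07227-2
Method of research: Meta-analysis
Subject of research: People
Article title: Temporal Evolution of Large Language Models (LLMs) in Oncology
Article publication date: 4-Nov-2025
COI statement: The authors declare that they have no competing interests.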

Contact Information

Chris Zhou
FAR Publishing Limited
editorial@fargroups.com

How to Cite This Article

APA:
FAR Publishing Limited. (2025, November 9). Temporal evolution of large language models in oncology: performance trends of ChatGPT-3.5, ChatGPT-4, and Gemini. Brightsurf News. https://www.brightsurf.com/news/L7VMK948/temporal-evolution-of-large-language-models-in-oncology-performance-trends-of-chatgpt-35-chatgpt-4-and-gemini.html
MLA:
"Temporal evolution of large language models in oncology: performance trends of ChatGPT-3.5, ChatGPT-4, and Gemini." Brightsurf News, Nov. 9 2025, https://www.brightsurf.com/news/L7VMK948/temporal-evolution-of-large-language-models-in-oncology-performance-trends-of-chatgpt-35-chatgpt-4-and-gemini.html.