ChatGPT demonstrates ‘impressive’ accuracy in clinical decision making



ChatGPT was about 72 per cent accurate in overall clinical decision making in a recent study, from coming up with possible diagnoses to making final diagnoses and care management decisions.

The large-language model (LLM) AI chatbot performed equally well in both primary care and emergency settings across all medical specialties.

The findings are published in the Journal of Medical Internet Research.

Corresponding author Marc Succi, MD, associate chair of innovation and commercialisation and strategic innovation leader at Mass General Brigham and executive director of the MESH Incubator, said:

“Our paper comprehensively assesses decision support via ChatGPT from the very beginning of working with a patient through the entire care scenario, from differential diagnosis all the way through testing, diagnosis, and management.

“No real benchmarks exist, but we estimate this performance to be at the level of someone who has just graduated from medical school, such as an intern or resident.

“This tells us that LLMs in general have the potential to be an augmenting tool for the practice of medicine and support clinical decision making with impressive accuracy.”

Changes in AI technology are occurring at a fast pace and transforming many industries, including healthcare.

However, the capacity of LLMs to assist in the full scope of clinical care has not yet been studied.

In this comprehensive, cross-specialty study of how LLMs could be used in clinical advisement and decision making, Succi and his research team tested the hypothesis that ChatGPT would be able to work through an entire clinical encounter with a patient and recommend a diagnostic workup, decide the clinical management course, and ultimately make the final diagnosis.

The study was done by pasting successive portions of 36 standardised, published clinical vignettes into ChatGPT.

The AI tool first was asked to come up with a set of possible, or differential, diagnoses based on the patient’s initial information, which included age, gender, symptoms, and whether the case was an emergency.

ChatGPT was then given additional pieces of information and asked to make management decisions as well as give a final diagnosis to simulate the entire process of seeing a real patient.

The team compared ChatGPT’s accuracy on differential diagnosis, diagnostic testing, final diagnosis, and management in a structured blinded process, awarding points for correct answers and using linear regressions to assess the relationship between the technology’s performance and the vignette’s demographic information.

The researchers found that overall, ChatGPT was about 72 per cent accurate and that it was best in making a final diagnosis, where it was 77 per cent accurate.

It was lowest-performing in making differential diagnoses, where it was only 60 per cent accurate.

It was 68 per cent accurate in clinical management decisions, such as deciding which medications to treat the patient with after arriving at the correct diagnosis.

Other notable findings from the research included that ChatGPT’s answers did not show gender bias and that its overall performance was steady across both primary and emergency care.

Succi said:

“ChatGPT struggled with differential diagnosis, which is the meat and potatoes of medicine when a physician has to figure out what to do.

“That is important because it tells us where physicians are truly experts and adding the most value—in the early stages of patient care with little presenting information, when a list of possible diagnoses is needed.”

The researchers note that before tools like ChatGPT can be considered for integration into clinical care, more benchmark research and regulatory guidance is needed.

Next, the research team is looking at whether AI tools can improve patient care and outcomes in hospitals’ resource-constrained areas.
