Research suggests AI does not support radiologists’ performance

By News Editor
Published On: March 19, 2024
Last Updated: March 19, 2024

New research suggests that artificial intelligence (AI) does not always improve the performance of radiologists.

It is commonly assumed that AI can support clinicians by helping them interpret images such as X-rays and CT scans with greater precision to make more accurate diagnoses.

However, new research, published in Nature Medicine and led by Harvard Medical School, suggests this may not always be the case.

Working with researchers at MIT and Stanford, the team found that individual clinician differences shape the interaction between humans and AI in ways that researchers do not yet fully understand.

The analysis is based on data from an earlier working paper by the same research group released by the National Bureau of Economic Research.

Humans and AI

The research showed that in some instances the use of AI can interfere with a radiologist’s performance and the accuracy of their interpretation.

Co-senior author Pranav Rajpurkar, assistant professor of biomedical informatics in the Blavatnik Institute at HMS, stated: “We find that different radiologists, indeed, react differently to AI assistance — some are helped while others are hurt by it.

“What this means is that we should not look at radiologists as a uniform population and consider just the ‘average’ effect of AI on their performance. To maximise benefits and minimise harm, we need to personalise assistive AI systems.”

The findings underscore the importance of carefully calibrated implementation of AI into clinical practice, but the researchers emphasise that they should in no way discourage the adoption of AI in radiologists’ offices and clinics.

Instead, the results should signal the need to better understand how humans and AI interact and to design carefully calibrated approaches that boost human performance rather than hurt it.

“Clinicians have different levels of expertise, experience, and decision-making styles, so ensuring that AI reflects this diversity is critical for targeted implementation,” said Feiyang “Kathy” Yu, who conducted the work while at the Rajpurkar lab and is co-first author on the paper with Alex Moehring of the MIT Sloan School of Management.

“Individual factors and variation would be key in ensuring that AI advances rather than interferes with performance and, ultimately, with diagnosis.”

AI tools affected different radiologists differently

While previous research has shown that AI assistants can boost radiologists’ diagnostic performance, these studies have looked at radiologists as a whole without accounting for variability from radiologist to radiologist.

This new study looks at how individual clinician factors — area of specialty, years of practice, prior use of AI tools — come into play in human-AI collaboration.

The researchers examined how AI tools affected the performance of 140 radiologists on 15 X-ray diagnostic tasks — how reliably the radiologists were able to spot telltale features on an image and make an accurate diagnosis.

To determine how AI affected doctors’ ability to spot and correctly identify problems, the researchers used advanced computational methods that captured the magnitude of change in performance when using AI and when not using it.

The effect of AI assistance was inconsistent and varied across radiologists: some radiologists’ performance improved with AI, while others’ worsened.

For instance, factors such as how many years of experience a radiologist had, whether they specialised in thoracic, or chest, radiology, and whether they’d used AI readers before did not reliably predict how an AI tool would affect a doctor’s performance.

Another finding that challenged the prevailing wisdom was that clinicians who had low performance at baseline did not benefit consistently from AI assistance.

Some benefited more, some less, and some none at all. Overall, however, lower-performing radiologists at baseline had lower performance with or without AI. The same was true among radiologists who performed better at baseline. They performed consistently well, overall, with or without AI.

Additionally, more accurate AI tools boosted radiologists’ performance, while poorly performing AI tools diminished the diagnostic accuracy of human clinicians.

While the analysis was not done in a way that allowed researchers to determine why this happened, the finding points to the importance of testing and validating AI tool performance before clinical deployment, the researchers said. Such pre-testing could ensure that inferior AI doesn’t interfere with human clinicians’ performance and, therefore, patient care.

The researchers cautioned that their findings do not explain why and how AI tools seem to affect performance differently across human clinicians, but noted that understanding why would be critical to ensuring that AI radiology tools augment human performance rather than hurt it.

The team noted that AI developers should work with physicians who use their tools to understand and define the precise factors that come into play in the human-AI interaction, adding that the radiologist-AI interaction should be tested in experimental settings that mimic real-world scenarios and reflect the actual patient population for which the tools are designed.

Apart from improving the accuracy of the AI tools, it’s also important to train radiologists to detect inaccurate AI predictions and to question an AI tool’s diagnostic call, the research team said. To achieve that, AI developers should ensure that they design AI models that can “explain” their decisions.

“Our research reveals the nuanced and complex nature of machine-human interaction,” said study co-senior author Nikhil Agarwal, professor of economics at MIT. “It highlights the need to understand the multitude of factors involved in this interplay and how they influence the ultimate diagnosis and care of patients.”
