Research suggests AI does not support radiologists’ performance



New research has shown that Artificial Intelligence (AI) may not always improve the performance of radiologists.

It is commonly assumed that AI can support clinicians by helping them interpret images such as X-rays and CT scans with greater precision to make more accurate diagnoses.

However, new research, published in Nature Medicine and led by Harvard Medical School, suggests this may not always be the case.

Working with researchers at MIT and Stanford, the team found that individual clinician differences shape the interaction between humans and AI in ways that researchers do not yet fully understand.

The analysis is based on data from an earlier working paper by the same research group released by the National Bureau of Economic Research.

Humans and AI

The research showed that in some instances the use of AI can interfere with a radiologist’s performance and the accuracy of their interpretation.

Co-senior author Pranav Rajpurkar, assistant professor of biomedical informatics in the Blavatnik Institute at HMS, stated: “We find that different radiologists, indeed, react differently to AI assistance — some are helped while others are hurt by it.

“What this means is that we should not look at radiologists as a uniform population and consider just the ‘average’ effect of AI on their performance. To maximise benefits and minimise harm, we need to personalise assistive AI systems.”

The findings underscore the importance of carefully calibrated implementation of AI into clinical practice, but the researchers emphasise that they should in no way discourage the adoption of AI in radiologists’ offices and clinics.

Instead, the results should signal the need to better understand how humans and AI interact and to design carefully calibrated approaches that boost human performance rather than hurt it.

“Clinicians have different levels of expertise, experience, and decision-making styles, so ensuring that AI reflects this diversity is critical for targeted implementation,” said Feiyang “Kathy” Yu, who conducted the work while at the Rajpurkar lab and is co-first author on the paper with Alex Moehring of the MIT Sloan School of Management.

“Individual factors and variation would be key in ensuring that AI advances rather than interferes with performance and, ultimately, with diagnosis.”

AI tools affected different radiologists differently

While previous research has shown that AI assistants can boost radiologists’ diagnostic performance, these studies have looked at radiologists as a whole without accounting for variability from radiologist to radiologist.

This new study looks at how individual clinician factors — area of specialty, years of practice, prior use of AI tools — come into play in human-AI collaboration.

The researchers examined how AI tools affected the performance of 140 radiologists on 15 X-ray diagnostic tasks — how reliably the radiologists were able to spot telltale features on an image and make an accurate diagnosis.

To determine how AI affected doctors’ ability to spot and correctly identify problems, the researchers used advanced computational methods that captured the magnitude of change in performance when using AI and when not using it.

The effect of AI assistance was inconsistent and varied across radiologists: some radiologists’ performance improved with AI, while others’ worsened.

For instance, factors such as how many years of experience a radiologist had, whether they specialised in thoracic, or chest, radiology, and whether they’d used AI readers before did not reliably predict how an AI tool would affect a doctor’s performance.

Another finding that challenged the prevailing wisdom was that clinicians who had low performance at baseline did not benefit consistently from AI assistance.

Some benefited more, some less, and some none at all. Overall, however, lower-performing radiologists at baseline had lower performance with or without AI. The same was true among radiologists who performed better at baseline. They performed consistently well, overall, with or without AI.

Additionally, more accurate AI tools boosted radiologists’ performance, while poorly performing AI tools diminished the diagnostic accuracy of human clinicians.

While the analysis was not done in a way that allowed researchers to determine why this happened, the finding points to the importance of testing and validating AI tool performance before clinical deployment, the researchers said. Such pre-testing could ensure that inferior AI doesn’t interfere with human clinicians’ performance and, therefore, patient care.

The researchers cautioned that their findings do not provide an explanation for why and how AI tools seem to affect performance across human clinicians differently, but noted that understanding why would be critical to ensuring that AI radiology tools augment human performance rather than hurt it.

The team noted that AI developers should work with physicians who use their tools to understand and define the precise factors that come into play in the human-AI interaction, adding that the radiologist-AI interaction should be tested in experimental settings that mimic real-world scenarios and reflect the actual patient population for which the tools are designed.

Apart from improving the accuracy of the AI tools, it’s also important to train radiologists to detect inaccurate AI predictions and to question an AI tool’s diagnostic call, the research team said. To achieve that, AI developers should ensure that they design AI models that can “explain” their decisions.

“Our research reveals the nuanced and complex nature of machine-human interaction,” said study co-senior author Nikhil Agarwal, professor of economics at MIT. “It highlights the need to understand the multitude of factors involved in this interplay and how they influence the ultimate diagnosis and care of patients.”
