OpenAI Tool Used by Doctors, ‘Whisper,’ Is Hallucinating: Study
ChatGPT maker OpenAI launched Whisper two years ago as an AI tool that transcribes speech to text. Now the tool is being used by the AI healthcare company Nabla to help more than 45,000 clinicians document medical conversations at over 85 organizations, such as the University of Iowa Health Care.
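For context on what the tool actually does, the open-source Whisper model can be run locally with a few lines of Python. This is only a minimal sketch, assuming the `openai-whisper` package and ffmpeg are installed, and it is not necessarily how Nabla deploys the model; the audio file name is hypothetical:

```python
# Minimal sketch of local Whisper transcription.
# Assumes `pip install openai-whisper` and ffmpeg on the system;
# "visit_recording.wav" is a hypothetical file name.
import whisper

model = whisper.load_model("base")               # a small multilingual checkpoint
result = model.transcribe("visit_recording.wav")  # returns a dict with text and segments
print(result["text"])                             # the full transcript as one string
```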
However, a new study shows that Whisper has been “hallucinating,” or adding statements that no one actually said, to transcripts, raising questions about how quickly medical institutions should adopt AI when it is shown to make errors.
According to the Associated Press, a researcher at the University of Michigan found hallucinations in 80% of the Whisper transcriptions he examined. Another developer found hallucinated content in more than 100 hours of Whisper transcriptions he analyzed, and a third found hallucinations in almost all of the 26,000 transcripts he created with Whisper.
Poor transcription of conversations between doctors and patients can have “really serious consequences,” Alondra Nelson, a professor at the Institute for Advanced Study in Princeton, NJ, told the AP.
“No one wants to be misdiagnosed,” Nelson said.
Related: AI Is Not ‘Revolutionary,’ And Its Benefits Are ‘Overblown,’ MIT Economist Says.
Earlier this year, researchers from Cornell University, New York University, the University of Washington, and the University of Virginia published a study that tracked how often OpenAI’s Whisper speech-to-text service hallucinated when transcribing 13,140 audio segments with an average length of 10 seconds. The audio was taken from TalkBank’s AphasiaBank, a database of voices of people with aphasia, a language disorder that makes communication difficult.
The researchers found 312 cases of “entire hallucinated phrases or sentences, which did not exist in any form in the underlying audio” when they conducted the study in the spring of 2023.
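To give a sense of what counting hallucinations involves (the study’s own tooling is not described here), a minimal sketch in Python flags words that appear in a Whisper transcript but not in a human reference transcript, using the standard-library difflib; the example strings are made up:

```python
# Minimal sketch: flag text present in an ASR transcript but absent from a
# human reference transcript. Not the study's actual methodology or code.
import difflib

def hallucinated_spans(reference: str, hypothesis: str) -> list[str]:
    """Return word runs that the ASR output added or substituted."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    matcher = difflib.SequenceMatcher(a=ref_words, b=hyp_words)
    inserted = []
    for tag, _, _, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):  # words with no counterpart in the reference
            inserted.append(" ".join(hyp_words[j1:j2]))
    return inserted

# Hypothetical example:
ref = "I took the medication this morning"
hyp = "I took the medication this morning and then he grabbed the knife"
print(hallucinated_spans(ref, hyp))  # ['and then he grabbed the knife']
```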
Related: Google’s New AI Search Results Are Already Hallucinating — Telling Users to Eat Rocks and Make Pizza Sauce With Glue
Among the hallucinated texts, 38% contained harmful language, such as violence or stereotypes, that did not fit the context of the conversation.
“Our work demonstrates that there are serious concerns regarding Whisper’s inaccuracy due to unpredictable hallucinations,” the researchers wrote.
The researchers say the study may also indicate a hallucination bias in Whisper, or a tendency for it to make errors more often for a particular group, not just people with aphasia.
“Based on our findings, we suggest that this kind of hallucination bias could also arise for any demographic group with speech impairments yielding more disfluencies (such as speakers with other speech impairments like dysphonia [disorders of the voice], the very elderly, or non-native language speakers),” the researchers said.
Related: OpenAI Reportedly Used Over a Million Hours of YouTube Videos to Train Its Latest AI Model
Whisper has transcribed an estimated seven million medical conversations through Nabla, per The Verge.