aiOla's Speech AI Technology Outperforms OpenAI's Whisper in Recognizing Jargon
aiOla's model automates the creation of customized processes and workflows for completing reports and inspections across industries such as manufacturing, supply chain and logistics, pharma, and more
TEL AVIV, Israel, April 18, 2024 /PRNewswire/ -- aiOla, an AI-powered technology that automates business workflows by capturing spoken data, has announced a major milestone in speech recognition. aiOla's solution, powered by a novel keyword spotting model, has advanced to match human proficiency in understanding industry-specific jargon. The patented AdaKWS model achieved 95% accuracy in keyword spotting, surpassing OpenAI's industry-leading Whisper model, which reached 88% accuracy.
Keyword spotting is an essential aspect of speech recognition that tackles the problem of identifying jargon by detecting predefined words and phrases. "Think about a courier delivery where your package arrives damaged. The courier needs to file a report using specific codes and acronyms that describe the situation — those codes and acronyms are keywords. Industry jargon is everywhere and in many fields, it dominates communication, comprising up to half of workers' speech," said aiOla's CEO and co-founder, Amir Haramaty. "The ability to spot keywords enables automation of everyday processes across a wide range of industries, from filing a parcel damage report to completing a safety inspection in a food manufacturing plant, transforming speech into actions."
aiOla's process automation applications can accurately understand speech, jargon and acronyms in more than 100 languages, regardless of accent or background noise. aiOla achieves this by combining its state-of-the-art keyword spotting model with a speech recognition model. The onboarding process takes mere hours: clients provide examples of their checklists or forms, and aiOla automatically generates custom language models for the use case. Workers are then able to complete their operations verbally using the aiOla app while keeping their eyes and hands on the equipment. aiOla's exceptional ability to spot rare industry terms with high accuracy allows the platform to easily distinguish between speech related to work processes and everyday conversation.
The app leverages a proprietary model, developed by aiOla's team of scientists, that recognizes a predefined list of keywords within speech. This enables aiOla's solution to be instantly adapted to the jargon of any industry without needing to retrain its AI model. On a benchmark of keyword and jargon detection that includes 16 languages, Whisper's largest model achieves 88% accuracy, compared with 95% for aiOla's model. Additionally, on a recent benchmark composed of hard-to-detect keywords taken from English-language audiobooks, the CED model from a team of Apple researchers reaches 92.7% accuracy, whereas aiOla's AdaKWS reaches 95.1%.
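To make the idea of a swappable, predefined keyword list concrete, here is a minimal illustrative sketch in Python. It is not aiOla's AdaKWS model, which works on audio. It assumes a text transcript is already available and simply checks which keywords appear in it. The function names, the sample transcript, and the codes "DMG-2" and "POD" are all hypothetical.

```python
import re
from typing import Iterable


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def spot_keywords(transcript: str, keywords: Iterable[str]) -> dict[str, bool]:
    """Report, for each keyword or phrase, whether it occurs in the transcript.

    Toy text-level stand-in for keyword spotting: a keyword counts as
    present when its tokens appear as a contiguous run in the transcript.
    """
    tokens = normalize(transcript)
    results: dict[str, bool] = {}
    for keyword in keywords:
        kw_tokens = normalize(keyword)
        n = len(kw_tokens)
        results[keyword] = n > 0 and any(
            tokens[i:i + n] == kw_tokens for i in range(len(tokens) - n + 1)
        )
    return results


def accuracy(predictions: dict[str, bool], labels: dict[str, bool]) -> float:
    """Fraction of labeled keywords whose prediction matches the label."""
    correct = sum(predictions[k] == labels[k] for k in labels)
    return correct / len(labels)


# Hypothetical courier report and keyword list; changing the list needs no retraining.
transcript = "Parcel arrived with code DMG-2, outer box crushed, no POD signature."
keywords = ["DMG-2", "POD", "cold chain"]
found = spot_keywords(transcript, keywords)
print(found)
```

In this sketch, adapting to another industry only means passing a different keyword list, which mirrors the "no retraining" property described above at a much simpler level. The accuracy helper shows how a per-keyword detection score of the kind quoted in the benchmarks could be tallied, under the assumption that each keyword has a ground-truth present/absent label.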
"Keyword spotting poses significant challenges due to the scarcity of training data, especially across diverse languages and dialects. It typically requires industry-specific fine-tuning to enable models to recognize jargon not commonly found in everyday speech," said aiOla's Chief Scientist, Professor Joseph Keshet. "Our model consistently surpassed the OpenAI Whisper baselines by a significant margin, achieving a substantial improvement compared to the top-performing baseline. Furthermore, our model is far more efficient, using 15x fewer parameters."
To learn more about aiOla's technology, visit: https://aiola.com
Explore aiOla's keyword spotting research: https://arxiv.org/pdf/2309.08561.pdf
About aiOla:
aiOla's patented technology comprehends over 100 languages, and discerns jargon, abbreviations and acronyms, demonstrating a low error rate even in noisy environments. aiOla's technology converts manual processes in critical industries into data-driven, paperless, AI-powered workflows through cutting-edge speech recognition.