Watsonx Orders reveals innovative AI technology

You head to your favorite drive-thru for some fries and a cheeseburger. It’s a simple order, and as you pull in, you notice that there isn’t much of a line. What could go wrong? Plenty.

The restaurant is located near a busy highway with lots of traffic noise, and planes fly low overhead as they approach the nearby airport. It’s very windy. The stereo is blaring in the car behind you, and the customer in the next lane is trying to order at the same time as you. The cacophony would challenge even the most experienced human order taker.

With IBM® watsonx™ Orders, we created an AI-powered voice agent that can take drive-thru orders without human intervention. The product uses cutting-edge technology to isolate and understand human voices in noisy situations while supporting natural, free-flowing conversations between customers placing orders and voice agents.

Watsonx Orders understands your voice and delivers your orders

IBM watsonx Orders starts the process when it detects a vehicle approaching the speaker post. It greets the customer and asks what they would like to order. It then processes the incoming audio and isolates the human voice. From that audio, it identifies the order and its items, then shows the customer what it heard on the digital menu board. If the customer says everything looks good, watsonx Orders sends the order to the point of sale (POS) and the kitchen. Finally, the kitchen prepares the food. The entire ordering process is shown in the picture below.
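The flow described above can be pictured as a small state machine. The following is only an illustrative sketch: the step and event names (`Step`, `vehicle_detected`, `customer_confirmed`, and so on) are hypothetical and are not taken from the product.

```python
from enum import Enum, auto


class Step(Enum):
    WAIT_FOR_VEHICLE = auto()
    GREET = auto()
    LISTEN = auto()
    CONFIRM = auto()
    SEND_TO_POS = auto()
    DONE = auto()


# Hypothetical (step, event) -> next step table for the ordering flow.
TRANSITIONS = {
    (Step.WAIT_FOR_VEHICLE, "vehicle_detected"): Step.GREET,
    (Step.GREET, "greeting_played"): Step.LISTEN,
    (Step.LISTEN, "items_recognized"): Step.CONFIRM,
    (Step.CONFIRM, "customer_changed_order"): Step.LISTEN,
    (Step.CONFIRM, "customer_confirmed"): Step.SEND_TO_POS,
    (Step.SEND_TO_POS, "order_sent"): Step.DONE,
}


def next_step(step, event):
    """Return the next step; unknown events leave the step unchanged."""
    return TRANSITIONS.get((step, event), step)


def run(events):
    """Feed a sequence of events through the flow and return the final step."""
    step = Step.WAIT_FOR_VEHICLE
    for event in events:
        step = next_step(step, event)
    return step
```

In this sketch, a customer who changes the order at the confirmation screen simply returns to the listening step, so the order only reaches the POS after an explicit confirmation.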

There are three parts to understanding customer orders. The first is isolating the human voice and ignoring competing environmental sounds. The second is understanding speech, including the complexities of accents, colloquialisms, emotions, and misstatements. The third is converting voice data into actions that reflect customer intent.

Isolating Human Voices

When you call your bank or utility company, a voice agent chatbot will answer your call first and ask you why you’re calling. The chatbot expects relatively quiet audio from your phone with little to no background noise.

There is always background noise in the drive-thru. No matter how good your audio hardware is, loud noises, such as the horn of a passing train, can drown out human voices.

As watsonx Orders captures audio in real time, it uses machine learning techniques to perform digital noise and echo cancellation. It ignores wind, rain, highway traffic, and airport noise. Other challenges include unexpected background sounds and crosstalk from people talking in the background while a customer is ordering. Watsonx Orders uses advanced technology to minimize these disruptions.
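To give a feel for the problem, here is a much simpler stand-in than the machine learning techniques mentioned above: an energy-based gate that flags audio frames noticeably louder than an estimated noise floor. It is not the product's method. The function names, the frame size, and the assumption that the first frames contain only background noise are all illustrative choices.

```python
import math


def frame_rms(samples, frame_size):
    """Split samples into frames and return the RMS energy of each frame."""
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    return [math.sqrt(sum(x * x for x in f) / len(f)) for f in frames]


def speech_frames(samples, frame_size=4, noise_frames=2, margin=3.0):
    """Flag frames whose energy exceeds the noise floor by a factor of `margin`.

    Assumes the first `noise_frames` frames contain background noise only.
    """
    energies = frame_rms(samples, frame_size)
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    return [e > margin * noise_floor for e in energies]


quiet = [0.01, -0.01, 0.01, -0.01]
loud = [0.5, -0.4, 0.6, -0.5]
flags = speech_frames(quiet + quiet + loud + quiet)
```

A fixed-threshold gate like this fails in exactly the situations the article describes, such as a train horn that is louder than the speaker, which is why learned noise and echo cancellation is used instead.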

Understanding Speech

Most voice chatbots started out as text chatbots. Traditional voice agents first convert speech into written text and then analyze the written sentence to figure out what the speaker wants.

This approach is computationally slow and wasteful. Instead of first converting sounds into words and sentences, watsonx Orders turns speech into phonemes (the smallest sound units of speech that convey distinct meaning). For example, if you say “shake”, watsonx Orders parses that word into “sh”, “ay”, and a hard “k”. Converting speech to phonemes instead of full English text improves accuracy across different accents and reduces latency within conversations, which supports a real-time conversational flow.
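One way to see why phonemes help with accents is to match a heard phoneme sequence against a small lexicon by edit distance, so that a slightly different vowel still lands on the right item. This is a minimal sketch under stated assumptions: the lexicon, its phoneme spellings, and the use of Levenshtein distance are illustrative and are not described in the article as the product's actual matching method.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical lexicon; the phoneme labels follow the article's informal style.
LEXICON = {
    "shake": ("sh", "ay", "k"),
    "fries": ("f", "r", "eye", "z"),
    "coke": ("k", "oh", "k"),
}


def closest_item(phonemes):
    """Return the lexicon word whose phonemes are closest to the heard ones."""
    return min(LEXICON, key=lambda word: edit_distance(phonemes, LEXICON[word]))
```

For example, `closest_item(("sh", "eh", "k"))` still resolves to "shake" even though the vowel differs, because it is one substitution away from "shake" and two away from "coke".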

Putting Understanding into Action

Next, watsonx Orders identifies the intent, such as “I want” or “Cancel.” It then identifies items related to the command, such as “cheeseburger” or “apple pie.”

There are several machine learning techniques for intent recognition. State-of-the-art techniques use foundation models and large language models that can theoretically understand any question and respond with an appropriate answer. That approach is too slow and computationally expensive for hardware-limited use cases. It may be impressive to have a drive-thru voice agent answer “Why is the sky blue?”, but doing so would slow down the drive-thru, frustrate the people in line, and reduce revenue.

Watsonx Orders uses a very specific model optimized to understand the hundreds of millions of ways you can order a cheeseburger, including “no onions, light on special sauce, extra tomatoes,” and so on. This model also allows customers to modify their orders mid-order: “Actually, no tomatoes on that burger.”
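Once an intent, an item, and any modifiers have been recognized, turning them into an order is mostly bookkeeping. The sketch below assumes recognition has already happened and shows only that last step. The `OrderItem` type, the `apply_intent` function, and the intent labels `add`, `modify`, and `cancel` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OrderItem:
    name: str
    # Maps an ingredient to "no", "light", or "extra".
    modifiers: dict = field(default_factory=dict)


def apply_intent(order, intent, item=None, modifiers=None):
    """Apply one recognized intent to the order (a list of OrderItem) in place."""
    modifiers = modifiers or {}
    if intent == "add":
        order.append(OrderItem(item, dict(modifiers)))
    elif intent == "modify" and order:
        # With no item named, assume the customer means the most recent one.
        target = next((o for o in reversed(order) if item in (None, o.name)), None)
        if target is not None:
            target.modifiers.update(modifiers)
    elif intent == "cancel":
        order[:] = [o for o in order if o.name != item]
    return order


order = []
apply_intent(order, "add", "cheeseburger",
             {"onions": "no", "special sauce": "light", "tomatoes": "extra"})
# "Actually, no tomatoes on that burger."
apply_intent(order, "modify", modifiers={"tomatoes": "no"})
```

Storing modifiers per ingredient means a later correction overwrites the earlier request instead of producing a contradictory pair such as "extra tomatoes" and "no tomatoes" on the same item.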

In production, watsonx Orders can complete more than 90% of orders on its own, without human intervention. It’s worth noting that other vendors in this space use contact centers with human operators to take over when AI agents get stuck, and still count the interaction as “automated.” By the IBM watsonx Orders standard, “automation” means processing orders end-to-end without human intervention.

Make money with real implementations

During peak hours, watsonx Orders can handle more than 150 cars per hour in a dual-lane restaurant, which is better than most human order takers. More vehicles per hour equals more revenue and profit, so our engineering and modeling approaches are continuously optimized for this metric.
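To put the peak figure in per-car terms, the short calculation below converts it into a time budget per vehicle. It assumes, purely for illustration, that cars are split evenly between the two lanes; the article does not state how traffic is distributed.

```python
cars_per_hour = 150  # peak figure quoted above
lanes = 2            # dual-lane restaurant

# Assumption: traffic is split evenly across lanes.
cars_per_lane_per_hour = cars_per_hour / lanes
seconds_per_car_per_lane = 3600 / cars_per_lane_per_hour

print(seconds_per_car_per_lane)  # 48.0
```

Under that assumption, each lane has about 48 seconds per car to greet the customer, take the order, confirm it, and send it on.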

Watsonx Orders has processed 60 million real orders from dozens of restaurants despite challenging noise, crosstalk, and order complexity. We want to be able to work with every quick service restaurant chain around the world, so we’ve built a platform that can easily adapt to new menus, restaurant technology stacks, and centralized menu management systems.

Keep your restaurant running smoothly with AI that handles even the most challenging orders
