Best Practices in Speech Recognition: A Blueprint for Successful Speech Applications

by Maria Simonton

While SMS, webchat, and other digital channels have risen in popularity, the phone remains the preferred contact method for 80% of users surveyed in a recent industry study.

Given that, it would be remiss to ignore the role that voice can, and should, play in customer handling. One way that companies can vastly improve customer usage and perception of interactive voice response (IVR) systems is to provide a natural, efficient experience using speech recognition.

Deploying speech applications may seem daunting to the uninitiated. By following these suggested best practices, companies can leverage speech technologies in the contact center with confidence and positive results.

Designing for Speech

At the heart of all good user interfaces is a great design.  But before design can begin, business needs must be identified and analyzed.  Answers to questions such as “What are our biggest call drivers?” and “What information do our customers need most?” will help inform the design process.  Analyzing contact center data and interviewing agents are excellent ways to gain insight into caller behavior.

Once the application tasks and goals have been outlined, the next step is to render the caller interaction as a visual flow diagram.  A dialogue specification that spells out message and menu verbiage will allow the designer to walk through various use cases and ensure that the application covers all major functional requirements. Detailed documentation is the key deliverable of this phase—it is the blueprint on which the application is built, and on which all future phases depend.
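
As a rough illustration of walking through a use case, the sketch below stores a tiny dialogue specification as data and replays a caller's responses against it. The states, prompts, and the `walk` helper are hypothetical examples invented for this illustration, not part of any particular IVR platform.

```python
# Hypothetical dialogue specification: each state has a prompt and the
# caller responses that lead to the next state. All names are illustrative.
CALL_FLOW = {
    "main_menu": {
        "prompt": "You can say 'account balance' or 'store hours'.",
        "next": {"account balance": "balance", "store hours": "hours"},
    },
    "balance": {"prompt": "Your balance is on its way by text message.", "next": {}},
    "hours": {"prompt": "We are open nine to five, Monday through Friday.", "next": {}},
}


def walk(flow, start, responses):
    """Replay one use case and return the prompts the caller would hear."""
    state = start
    heard = [flow[state]["prompt"]]
    for response in responses:
        transitions = flow[state]["next"]
        if response not in transitions:
            # The spec has no path for this response: a gap to fix in design.
            heard.append(f"UNHANDLED: {response!r} in state {state!r}")
            break
        state = transitions[response]
        heard.append(flow[state]["prompt"])
    return heard


if __name__ == "__main__":
    for line in walk(CALL_FLOW, "main_menu", ["store hours"]):
        print(line)
```

Replaying a response the specification does not cover, such as "billing" at the main menu, surfaces an UNHANDLED entry, which is the kind of gap a walkthrough is meant to expose before development starts.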

Off to See the Wizard

With a mature draft of the call flow in hand, the designer can then set off down the road of simulation testing. Often referred to as Wizard of Oz testing, this process involves someone “playing” the role of the application while interacting with a live group of users. During this phase, it becomes obvious where callers will stumble or become confused and frustrated. It also lets the designer note the unexpected things that users try to do or say, and revise the design accordingly.

Recipe for Success

The brains behind a speech application are its vocabulary lists—called grammars—which specify exactly what a user is allowed to say. Grammar design is half art, half science, and arguably the most important part of the speech recognition puzzle.  Accurately predicting what a caller will say is tricky, of course, so a well-researched grammar specification is imperative.

The grammar spec should incorporate both official business language and colloquialisms that real-world callers use, including findings from Wizard of Oz tests.  Common acronyms, slang, and regional pronunciations must be considered for inclusion as well.  For more natural-style applications, grammars should also allow the caller to speak optional polite and request phrases (“please”, “I’d like”) in order to improve recognition success.  A well-documented grammar specification will become an indispensable tool for tuning efforts down the road.
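
To make the idea of optional polite and request phrases concrete, here is a minimal sketch that models a grammar with regular expressions. Production speech engines use their own grammar formats (the W3C SRGS standard is one example); the option names, phrase lists, and helper functions below are assumptions made up for illustration, and the matcher works on text rather than audio.

```python
import re

# Hypothetical grammar spec: each menu option maps to the phrasings callers use.
GRAMMAR = {
    "balance": ["account balance", "balance", "how much do i owe"],
    "agent": ["agent", "representative", "customer service", "rep"],
}
# Optional request and polite phrases allowed around the core utterance.
PREFIXES = ["i'd like", "i would like", "i want", "give me", "can i get"]
SUFFIXES = ["please", "thanks", "thank you"]


def compile_grammar(grammar, prefixes, suffixes):
    """Build one regular expression per option with optional filler phrases."""

    def alt(phrases):
        # Longest first so multi-word phrases are tried before shorter ones.
        ordered = sorted(phrases, key=len, reverse=True)
        return "|".join(re.escape(p) for p in ordered)

    compiled = {}
    for option, phrases in grammar.items():
        pattern = (
            rf"^(?:(?:{alt(prefixes)})\s+)?"   # optional request phrase
            rf"(?:(?:my|the|an|a)\s+)?"        # optional article or "my"
            rf"(?:{alt(phrases)})"             # required core phrase
            rf"(?:\s+(?:{alt(suffixes)}))?$"   # optional polite phrase
        )
        compiled[option] = re.compile(pattern)
    return compiled


def interpret(utterance, compiled):
    """Return the matching option, or None if the utterance is out of grammar."""
    text = re.sub(r"[^a-z' ]", "", utterance.lower())
    text = " ".join(text.split())
    for option, pattern in compiled.items():
        if pattern.match(text):
            return option
    return None


if __name__ == "__main__":
    compiled = compile_grammar(GRAMMAR, PREFIXES, SUFFIXES)
    print(interpret("I'd like my account balance, please", compiled))
    print(interpret("pizza", compiled))
```

Keeping the filler phrases in separate lists means they can be maintained once and reused across every option, rather than being repeated in each entry of the grammar.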

Design Meets Reality

Even the most well-thought-out designs aren’t always perfectly implemented; what looked good on paper doesn’t necessarily sound good on the phone.  That’s why it’s important to assess usability metrics internally before releasing the application to customers.  The practice of usability testing is quite different from quality assurance or functional testing.  It focuses solely on the caller experience and how usable the application is from the tester’s perspective.

Typical measures of usability are time to task completion, ease of navigation, clarity of prompts and messaging, and general satisfaction. All of these except time to completion are subjective. Because customer satisfaction is itself subjective, it makes sense to give credence to feedback gained from usability testing and revise the application design as needed.

Care and Feeding

Your application is up and running and taking live calls. Although you may start enjoying the ROI right away, a speech recognition application requires occasional fine-tuning in order to remain effective.  Customer needs change over time, as do business goals, and a speech application should be evaluated and optimized periodically.

Application tuning typically involves capturing a large sample of recorded utterances over a multi-week period.  Each utterance, in the form of an audio file, is then listened to, categorized, and compared to the results generated by the speech recognition engine.  It’s a laborious process, but one that reaps significant benefits as it provides insight into what callers are actually doing and saying.  Grammars and prompts are then updated based on these findings.
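
The comparison step can be sketched in a few lines. The example below assumes each utterance has already been transcribed by a human listener and marked as in or out of grammar; the sample data, field names, and category labels are hypothetical, chosen to illustrate the bookkeeping rather than to mirror any vendor's tuning tool.

```python
from collections import Counter

# Hypothetical tuning sample: what a human transcriber heard, what the
# recognizer returned (None means it rejected the utterance), and whether
# the phrase is covered by the current grammar.
SAMPLE = [
    {"heard": "account balance", "recognized": "account balance", "in_grammar": True},
    {"heard": "representative", "recognized": "account balance", "in_grammar": True},
    {"heard": "store hours", "recognized": None, "in_grammar": True},
    {"heard": "talk to a human", "recognized": None, "in_grammar": False},
    {"heard": "talk to a human", "recognized": "store hours", "in_grammar": False},
]


def categorize(utt):
    """Label one utterance by comparing the transcript to the engine result."""
    if utt["in_grammar"]:
        if utt["recognized"] is None:
            return "false_reject"
        if utt["recognized"] == utt["heard"]:
            return "correct_accept"
        return "misrecognition"
    # Out-of-grammar speech should be rejected, not matched to something else.
    return "correct_reject" if utt["recognized"] is None else "false_accept"


def tuning_report(sample):
    """Count outcome categories and list out-of-grammar phrases by frequency."""
    counts = Counter(categorize(u) for u in sample)
    out_of_grammar = Counter(u["heard"] for u in sample if not u["in_grammar"])
    return counts, out_of_grammar.most_common()


if __name__ == "__main__":
    counts, candidates = tuning_report(SAMPLE)
    print(dict(counts))
    print(candidates)
```

The frequency list of out-of-grammar phrases is the part that feeds grammar updates: phrases that many callers say but the grammar does not cover are natural candidates for inclusion.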

The Time is Now

Given today’s technology advances, there is no need, or excuse, for contact centers to be stuck in the 1990s. Never before has speech-enabling an IVR been so attainable and affordable. Tiered licensing options are available for companies wanting to test the waters with speech, making it possible to start small and dream big. Customers demand better service channels and readily lose brand loyalty when they are unhappy. So, why wait? Schedule a consultation with one of INI’s solution specialists to learn more about elevating your customer engagement strategy with automated speech recognition.

Learn more about ASR Solutions from INI: https://www.interactivenw.com/schedule-a-consultation/