Will Voice Replace Your Website?: Six Tips For Content Management in a Voice-First World

Published by Nate Treloar | President & COO at Orbita on Nov 1, 2018 11:51:01 PM

No, not really. Voice will not replace your website, at least not any time soon. But by now it should be obvious that we’re entering a new era of digital experiences: an era in which voice assistants, chatbots, and other conversational virtual assistants will work in concert with traditional web and mobile applications.

One thing that hasn’t changed in the digital world is the importance of content. More than 20 years ago, Bill Gates declared “Content is King”. This reality brought about the concept of digital content management and all its variants – along with an industry full of tools, technologies, and approaches to support content management strategies. If voice is the next user interface, and conversational experiences the next digital channel, then it follows that content management strategies need to evolve to reflect the unique requirements of conversational, voice-powered user experiences.

In this post, we’ll explore some of the nuances of content management in voice-based applications. Additionally, we’ll offer specific tips on how to prepare your content management strategy for the approaching voice-first digital world.

The Rise of Voice
Businesses across a variety of industries, from healthcare to finance, retail, travel, and hospitality, are rushing to build virtual assistants and chatbot applications that promise to improve customer engagement and provide new operational efficiencies.

According to Gartner, by 2020, 50% of all internet searches will be done through voice.

Voice and, more generally, conversational AI applications represent a new paradigm of digital experiences - a new channel for organizations to engage their customers, employees, and other stakeholders simply through the power of voice and conversation. Luminary Labs refers to voice as the next wave of digital disruption, on par with past major digital disruptions, starting with the invention of web browsing and continuing through search, social, and mobile.

[Image: Luminary Labs, “Going Voice First”]

As with previous digital disruptions, organizations adopting voice need to think about how their brand will be delivered and business goals achieved through these new voice applications.

It all begins with content
The disciplines of content management, and its cousin, digital experience management, are as old as the web – arguably older - and have been critical components of every digital experience disruption since the invention of the first web browsers. And with each disruption, from the first web sites, search sites, mobile apps, and social, we witness the initial awkward attempts and outright failures to deliver content to the new digital channel.

In the early days of the web, businesses rushed to build web sites without being thoughtful about how to truly cultivate the new medium. Many “brochure-ware” web sites resulted that didn’t take advantage of the unique features of the web, did not scale, and were costly to build and maintain. Fast forward 10 to 15 years to the age of smartphones and the same pattern repeats. Early attempts to bring digital experiences for smartphones were simply web sites shrunk down to the smaller screen size. Unusable eye tests.

With the rise of voice applications, we see history repeat itself yet again.

Content Management for Voice
The problem with existing digital content is not just that it’s not designed for voice-first applications. The problem is that organizations lack the tools to develop, manage, migrate, curate, or otherwise prepare their existing content for these new conversational channels. For most organizations working in voice, the state-of-the-art for voice content development involves a series of awkward and time-consuming steps to convert existing content to voice-ready form. These steps include a lot of copying, pasting, and editing by software engineers, not subject matter experts or content specialists. At best, this means having to develop and maintain two sets of content “truth” – one for traditional channels like web, mobile apps, and print, and one for conversational channels. At worst, content developed for voice applications may bypass existing processes and checks that ensure its quality and validity.

Again, history repeats itself. In the early days of web and mobile, tools to streamline content for these new digital channels did not exist. The result was a lot of hacking and a lot of poorly designed experiences that did not deliver. In time, content management systems evolved that allowed non-technical people to manage content and, eventually, the complete user experience of web site and mobile app users.

Best Practices for Content Management in Voice
Much has been written about designing user experiences for voice applications, but less attention has been paid to preparing existing content resources for the many unique aspects of voice and conversational applications. Consider this example from healthcare.

A standard assessment for rheumatoid arthritis asks the patient to select from a list of 50 symptoms that they may have recently experienced. In a web or mobile app, the approach to this question is to simply provide a list of symptoms with check boxes. Attempting to map this sort of question, as is, to a voice assistant doesn’t work. A voice assistant must mimic a human interaction. In this case, you wouldn’t expect a nurse to ask about symptoms by rattling off a list of them (“Do you have weakness, numbness, stomach ache…”). Instead, a voice-based version of this survey needs to support a more natural form of input. So, just as a nurse would phrase it, the virtual assistant might instead say, “Please tell me any symptoms you may have experienced since your last assessment”. A well-designed voice application can then apply natural language processing techniques to capture and encode the patient’s responses. For example, “I’m feeling a little dizzy” could be encoded as “mild vertigo”.
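
To make the encoding step concrete, here is a minimal sketch in Python. It is not how any particular platform does it: a simple keyword lookup stands in for real natural language processing, and the tables `SYMPTOM_TERMS` and `SEVERITY_TERMS` and the function `encode_symptoms` are hypothetical names with made-up sample entries. A real assessment would rely on clinically vetted vocabularies.

```python
import re

# Hypothetical lookup tables, for illustration only.
SYMPTOM_TERMS = {
    "dizzy": "vertigo",
    "lightheaded": "vertigo",
    "numb": "numbness",
    "weak": "weakness",
    "stomach ache": "stomach ache",
}
SEVERITY_TERMS = {
    "a little": "mild",
    "slightly": "mild",
    "very": "severe",
    "really": "severe",
}


def encode_symptoms(utterance: str) -> list[tuple[str, str]]:
    """Return (severity, symptom) pairs found in a free-form spoken reply."""
    text = utterance.lower()

    # Simplification: the first severity phrase found applies to every symptom.
    severity = "unspecified"
    for phrase, level in SEVERITY_TERMS.items():
        if re.search(rf"\b{re.escape(phrase)}\b", text):
            severity = level
            break

    found = []
    for phrase, symptom in SYMPTOM_TERMS.items():
        # Match at a word start so "weak" also catches "weakness".
        if re.search(rf"\b{re.escape(phrase)}", text):
            if (severity, symptom) not in found:
                found.append((severity, symptom))
    return found


print(encode_symptoms("I'm feeling a little dizzy"))  # [('mild', 'vertigo')]
```

The point of the sketch is that the mapping tables are content. Someone has to author, review, and maintain them, just like the symptom checklist on the web form.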

Consider Answers, Not Just Content – Developing and maintaining content for voice is different. For example, it’s not practical to have a voice assistant simply read two pages of text off your web site in response to a user’s request for information. Yet those two pages may contain valuable information - content that has been carefully created, vetted, and approved to proactively answer many possible questions a visitor might have. Anyone exploring voice in their organization will end up confronting the fundamental challenge of adapting existing digital content to the particular requirements of conversational applications. Tools for creating and maintaining voice-ready content derived from existing sources are just now entering the market. Orbita Answers™ is one example.
Optimize for Automatic Speech Recognition (ASR) – Any voice application, whether an assessment survey, like the earlier example, or a simple question answering (FAQ) application, must be able to accurately recognize a user’s spoken words to properly interpret their input. In non-voice modalities, like web or mobile apps, those inputs can be explicitly encoded in a form to remove all ambiguity, but in a voice app that accepts freeform spoken input, the logic for resolving ambiguity must be developed and managed. For example, a patient answering the question “how many hours did you sleep last night?” may answer “eight”. The application needs a rule that, in this context, the patient is saying the number 8, not the word “ate” or any other homophone.
Consider the Content for “Intent Handling” – Conversational technologies provide a way to map user “utterances” - what they actually say - to specific “intents” - what they generally mean. For example, the utterances “I’m thirsty”, “I’d like a drink of water”, and “I need some water” are all variations of the same intent. This is a place where AI/machine learning comes in. Most virtual assistant technologies provide a way to train intent recognition with example utterances so that the assistant can learn how to handle variations. Managing intent handling is a content and experience management exercise with characteristics similar to managing full-text search. Look for tools that will allow your content managers and digital experience managers to control the rules and training inputs of these intent handlers.
Plan for Speech Synthesis – The other side of the coin from ASR is speech synthesis. Content management for voice includes ensuring that words and phrases are pronounced properly by the voice synthesis engine. This is never a guarantee with any of the commercial voice engines on the market. For example, speech engines from Amazon, Google, IBM, and others are not guaranteed to pronounce all the various and sometimes bizarre commercial drug names. The problem is more obvious when it comes to organization-specific concepts (e.g. in healthcare, the names of doctors, facilities, wings, acronyms, etc.). Controlling how concepts are pronounced, and simplifying this control through intuitive tools that non-technical staff can manage, is a priority for any voice application development project.
Use Empathy Modeling – Although it’s clear that conversational virtual assistants are artificial, we generally expect them to be human-like in how they interact with the end user. This goes beyond just making sure the virtual assistant is pronouncing words properly – speech synthesis. It involves adjusting content to provide a more natural, less clinical, response. Cadence of speech, inflection, and tone are elements of speech that should be controllable. Also, including normally throw-away words and phrases like “Ok”, “I understand”, or even just “Mmm hmm” creates a more natural experience that can improve user engagement in voice-powered applications.
Remember Context – Voice assistants often serve multiple purposes. It’s not unusual in a healthcare application, for example, for a voice assistant to provide a combination of health assessment and health education in a single application. In other words, an assistant that can both ask and answer questions. Like a human, a voice assistant should be able to hold a conversation by providing the right response (or question) in the right context, handling context-shifts in the middle of a conversation, and disambiguating context when necessary (“Do you mean…?”).
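
Several of the tips above boil down to fairly small pieces of logic once the content behind them is in place. As a rough sketch in Python (not how Orbita or any particular voice platform implements these features), the code below shows three of them: resolving a homophone from the expected answer type, matching an utterance to an intent from example utterances, and wrapping a hard-to-pronounce term in SSML. All names (`NUMBER_HOMOPHONES`, `interpret_answer`, `INTENT_EXAMPLES`, `match_intent`, `PRONUNCIATIONS`, `to_ssml`) and the sample data are hypothetical. The word-overlap score is a deliberately simple stand-in for the trained intent recognition that real platforms provide. The `<speak>` and `<sub alias="...">` elements come from the SSML standard, though support varies by speech engine.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# --- ASR: resolve homophones using the expected answer type ---
# Hypothetical table of words a recognizer might return for a spoken number.
NUMBER_HOMOPHONES = {
    "one": 1, "won": 1,
    "two": 2, "to": 2, "too": 2,
    "four": 4, "for": 4,
    "eight": 8, "ate": 8,
}


def interpret_answer(transcript, expected_type):
    """Interpret a one-word transcript in light of what the question asked for."""
    word = transcript.strip().lower()
    if expected_type == "number":
        if word.isdigit():
            return int(word)
        return NUMBER_HOMOPHONES.get(word)  # None signals "ask again"
    return word


# --- Intent handling: example utterances act as the training content ---
INTENT_EXAMPLES = {
    "request_water": ["I'm thirsty", "I'd like a drink of water", "I need some water"],
    "report_sleep": ["I slept eight hours", "I got six hours of sleep"],
}


def _tokens(text):
    return set(re.findall(r"[a-z']+", text.lower()))


def match_intent(utterance, threshold=0.5):
    """Pick the intent whose example shares the most words (Jaccard overlap)."""
    words = _tokens(utterance)
    best_intent, best_score = None, 0.0
    for intent, examples in INTENT_EXAMPLES.items():
        for example in examples:
            example_words = _tokens(example)
            score = len(words & example_words) / len(words | example_words)
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None


# --- Speech synthesis: a pronunciation table that content staff could maintain ---
PRONUNCIATIONS = {"NICU": "nick you"}  # illustrative entry


def to_ssml(text):
    """Wrap known terms in SSML <sub> so the engine speaks the alias instead."""
    ssml = escape(text)
    for term, spoken in PRONUNCIATIONS.items():
        safe_term = escape(term)
        replacement = f"<sub alias={quoteattr(spoken)}>{safe_term}</sub>"
        ssml = re.sub(rf"\b{re.escape(safe_term)}\b", lambda _m: replacement, ssml)
    return f"<speak>{ssml}</speak>"


print(interpret_answer("ate", "number"))  # 8
print(match_intent("I need water"))       # request_water
print(to_ssml("Please go to the NICU."))
```

In each case the code is trivial and the tables are what matter: the homophone list, the example utterances, and the pronunciation entries are all content that non-technical staff should be able to edit without a software engineer.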

Enabling content for voice assistants has many challenges, but careful planning and the right tools will ensure that your content is ready for the voice revolution.

Orbita is the only conversational platform focused on helping organizations simplify how high-value content is delivered through conversational channels, securely and at scale. Orbita provides tools for voice-first content management, ranging from authoring to integration, and including intuitive tools and templates to simplify recognition of concepts, intent mapping, speech synthesis, empathy modeling, and conversational flow design.