Recently I made one of the biggest decisions of my career: to leave a stable role at one of the Big Four global firms and join a Boston-based start-up, Orbita, specializing in conversational AI in healthcare. Many of you saw this coming for a while, but for those just learning about it, I want to share some of my thinking on what has me so excited about voice, the revolution in user interfaces, and the way we work and interact with technology.
Over the past 20-plus years I have worked in operational transformation programs, developing expertise in Lean Six Sigma, digital transformation, and most recently cognitive and conversational AI. My specialization was in helping field workers rethink how they work every day and find more efficient and more engaging approaches to their work. This took me from hangar floors to telephone exchanges, from defense force operations to hospitals and senior living facilities.
Throughout these experiences, I've been passionate about the evolving technology landscape, and I have worked closely with a number of leading experts who are changing the way we work. Without doubt, the pace of technological development is faster than ever, and the disruption faced at the front line of business operations has gone from the ‘once-a-decade upgrade’ to an annual system disruption.
With all of this technology evolving, why focus on voice? What makes voice so special?
I’ve come to the conclusion that there are four main drivers:
1. A natural interface that is inevitable
Think of films and characters such as HAL from 2001: A Space Odyssey, Star Trek, C-3PO, Jarvis, Samantha (Her), and Minority Report; the list goes on. For generations we have envisioned a day when we would be able to talk to our technology, and it would be able to talk back. We saw the inevitability of not just transactional communication, but also emotional interactions, personalized advice, unusual requests, and education from our technological innovations.
This inevitability is partly because voice is fast. We speak at about 125 words per minute, while the average person types around 40 words per minute. Studies of how many words we speak per day vary widely, with estimates ranging from 7,000 words all the way up to 47,000 words.
Also, voice is intuitive. We have been speaking conversationally for most of our lives. Inflections, accents, tone, volume, emphasis, and other elements of language and linguistics provide an incredibly rich data set that we don’t even notice but respond to in our relationships.
Conversational AI technology can understand a complex sentence and these linguistic nuances and perform activities in response. When speaking to a conversational AI interface, the user doesn’t need to learn how to write code. In fact, when we are unsure of how to provide instructions, a voice assistant can re-prompt to help us complete an interaction. How we say or ask things is just as important as what we say, and incredible work is being done to take advantage of these intricacies of language.
Recently, I worked with a team of caregivers who used the native translation capability of a smart speaker to support a patient who could only speak Mandarin. The relief the patient felt was immediately evident in their demeanor.
3. An emotional connection that we trust
There are numerous articles written about the importance of verbal connection in a society overrun with text-based communications. The University of Glasgow analyzed how people felt after hearing the word ‘hello’ said in different ways. Interestingly, one variable was level of trust. A lively ‘hello’ (one with varied tones: beginning high, dropping in the middle, then rising at the end) was perceived as more trustworthy.
Applying this approach to voice interfaces, we are able to develop linguistic models which could put the user at ease when in a stressful environment, such as an ER. Conversely, this approach can also be used to provide warnings and cautions when we need to capture the attention of the user.
As we become more reliant on the voice interface, developing trust models will become a competitive advantage for capturing and retaining customers.
4. The sheer challenge of getting it right
There is joy in seeing a user interact with a voice interface for the first time, such as seeing a spinal injury patient reconnecting with the world after a tragic accident.
I once read that designing a voice interaction is like writing a script for a dialogue when you only know what one person is going to say. That statement starts to hint at the complexity in voice interface design.
When it comes down to it, I love complex and challenging problems. I feel incredibly lucky to be part of Orbita’s team, building out one of the most exciting voice platforms that I've ever seen. The team is made up of brilliant people who are passionate about the future of healthcare enabled by conversational AI and the natural language interface.
As the pioneers of voice-based solutions in healthcare, we have an incredible opportunity to change the patient-provider experience, and to lay the foundations for how the healthcare industry communicates for generations to come.
Nick has deep expertise in helping transform operations via technology from a 17-year career with Deloitte Consulting in Australia, the UK, and Southeast Asia. While managing Deloitte’s Australia-based Smart Healthcare Solution Team, he co-created and incubated the award-winning OrbitaASSIST solution.