Et tu Apple Watch: Integrating Siri with an AI engine to reduce patient wait time

The Patient is in apps integrate with Siri to provide a voice interface for messaging between the doctor and the charge nurse. This includes listening to patient assignments, rejecting an assignment, accepting an assignment and providing an estimated time of arrival, and finally notifying the charge nurse of assignment completion, which allows the staff to start cleaning the room and reduces the wait time for the next patient.

On both the iPhone and Apple Watch, this enables the doctor to use AirPods or one of a few other Bluetooth headsets to remotely manage patient assignments exclusively with her voice. At shorter distances, the “Hey Siri” voice trigger also works well, and it is the only technique for hands-free voice control on Apple Watch as of watchOS 3.2.

In this article, we will look at the underlying natural language processing (NLP) engine built for Siri integration with the Patient is in messaging features.

Natural Language Processing Theory

NLP is one of the core AI speech technologies, along with text-to-speech (TTS) and speech-to-text, which is also known as speech recognition. Advances in text-to-speech have led to more natural-sounding computer voices, and advances in speech recognition have led to better audio transcription.

NLP attempts to understand the meaning or intent of a sentence. To do that, an NLP engine must first determine which language the sentence is in. Next, it must identify the part of speech of each word in the sentence, as well as other components of the language such as word stems and contractions. For example, the doctor’s sentence “I’ll go to the front office” would be decomposed, or parsed, into the pronoun “I”, the verb “will”, the verb “go”, the preposition “to”, the determiner “the”, the adjective “front”, and the noun “office”. So even though the doctor spoke the word “I’ll”, the NLP engine had to understand the concept of American English contractions and process the two words “I” and “will”. When the doctor enunciates correctly, everything works well, as we see in these screen shots:

Siri in Healthcare: Combinators and the Patient is in
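The contraction-expansion and part-of-speech step described above can be sketched in a few lines of Python. The tables below are toy stand-ins for illustration only, not the app’s actual engine:

```python
# Illustrative sketch of contraction expansion and part-of-speech tagging.
# These lookup tables are toy stand-ins, not the app's real NLP engine.
CONTRACTIONS = {"i'll": ["I", "will"], "i've": ["I", "have"]}
PARTS_OF_SPEECH = {"I": "pronoun", "will": "verb", "go": "verb",
                   "to": "preposition", "the": "determiner",
                   "front": "adjective", "office": "noun"}

def decompose(sentence):
    """Expand contractions, then tag each word with a part of speech."""
    words = []
    for token in sentence.split():
        words.extend(CONTRACTIONS.get(token.lower(), [token]))
    return [(w, PARTS_OF_SPEECH.get(w, "unknown")) for w in words]

print(decompose("I'll go to the front office"))
# [('I', 'pronoun'), ('will', 'verb'), ('go', 'verb'), ('to', 'preposition'),
#  ('the', 'determiner'), ('front', 'adjective'), ('office', 'noun')]
```

A production engine would of course use a trained tagger rather than a lookup table, but the shape of the output is the same: the single spoken token “I’ll” becomes the two tagged words “I” and “will”.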

Beyond contractions, advanced natural language processing algorithms must also consider the accent with which a user speaks. From the NLP engine’s point of view, everyone has an accent, whether it’s the slow-talking southerner’s accent, the neutral mid-Atlantic accent, the fast-talking New Yorker’s accent, a Bostonian accent, or perhaps the accent of a partially deaf therapist. And of course, when we are tired, we all tend to mumble a bit, making us sound inarticulate. This also happens to surgeons and anesthesiologists after a middle-of-the-night emergency surgery. So it should not be surprising that the intended word “I’ll” can be misspoken just enough to be transcribed as the word “all”, as we see in the following screen shot:


The NLP engine used in the Patient is in apps on both Apple Watch and iPhone compensates for these common transcription errors because the algorithms were tailored for a doctor’s use of Siri.

Other transcription errors can make deducing the doctor’s intent seem insurmountable. For example, can you fix the phrase “go to low or one”? The app’s NLP engine was able to correctly determine that the doctor intended “I’ll go to OR 1”, as we see in these screen shots:
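One way such a repair can work, sketched below under the assumption that the engine keeps a list of phrases it expects a doctor to say, is to normalize spoken number words in room positions and then pick the closest known phrase. The candidate list, word tables, and similarity metric here are illustrative, not the app’s actual implementation:

```python
import difflib

# Illustrative: number words that often appear where a digit was intended.
NUMBER_WORDS = {"one": "1", "won": "1", "two": "2", "too": "2",
                "four": "4", "for": "4", "eight": "8", "ate": "8"}

# Toy list of phrases the engine expects; a real engine would be far richer.
CANDIDATES = ["i'll go to or 1", "i'll go to or 2", "i cannot go to or 1"]

def normalize(transcript):
    """Rewrite a number word as a digit when it follows a room word."""
    toks = transcript.lower().split()
    for i in range(1, len(toks)):
        if toks[i - 1] in ("or", "room") and toks[i] in NUMBER_WORDS:
            toks[i] = NUMBER_WORDS[toks[i]]
    return " ".join(toks)

def best_intent(transcript):
    """Pick the expected phrase closest to the normalized transcript."""
    t = normalize(transcript)
    return max(CANDIDATES,
               key=lambda c: difflib.SequenceMatcher(None, t, c).ratio())

print(best_intent("go to low or one"))  # i'll go to or 1
```

Even though “low” bears little resemblance to “I’ll”, the rest of the garbled phrase is close enough to a known assignment response that the right candidate wins.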

Notice that the NLP engine also understands American English homophones, as in “8 and eight and ate”, “4 and for and four”, et tu “2 and to and too and two”.
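A minimal sketch of that homophone handling, assuming rooms are numbered and using the homophone groups listed above:

```python
# Homophone groups from the examples above, mapped back to their digits.
HOMOPHONES = {"2": {"to", "too", "two"},
              "4": {"for", "four"},
              "8": {"ate", "eight"}}
WORD_TO_DIGIT = {w: d for d, ws in HOMOPHONES.items() for w in ws}

def fix_room_number(phrase):
    """Rewrite a homophone that follows the word 'Room' as its digit."""
    toks = phrase.split()
    for i in range(1, len(toks)):
        if toks[i - 1].lower() == "room" and toks[i].lower() in WORD_TO_DIGIT:
            toks[i] = WORD_TO_DIGIT[toks[i].lower()]
    return " ".join(toks)

print(fix_room_number("Exam Room to"))    # Exam Room 2
print(fix_room_number("Recovery Room ate"))  # Recovery Room 8
```

The key point is context: “to” is only rewritten as “2” when it appears where a room number belongs, so ordinary prepositions elsewhere in the sentence are left alone.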

If we reexamine the classic NLP pipeline described at the start of this article, it starts by identifying the language of the text and then parses the text into parts of speech. When we applied this to the doctor’s message, the phrase “front office” was decomposed and identified as the adjective “front” and the noun “office”. While linguistically correct, that does not help us, because the app has to notify the charge nurse that the doctor has accepted the assignment to go to a specific named room to treat a patient or perform surgery.

Unlike the Siri messaging support in apps such as WhatsApp or Messages, which simply relay the transcription provided by Siri’s speech recognition, the Patient is in must determine which room the doctor is discussing and whether the doctor is rejecting an assignment, completing an assignment, or accepting an assignment and providing an estimated time of arrival, which may be given in hours or minutes.

The NLP engine uses multiple algorithms and text processing techniques to best ensure that the doctor’s intent is correctly captured even if the doctor’s words were incorrectly transcribed. This approach of composing many small parsing algorithms and extraction techniques is relatively new and is called parser combinators. The NLP engine used in the Patient is in apps on both the iPhone and Apple Watch adds many Siri-specific and doctor-specific algorithms to the classic NLP approach.
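To make the idea concrete, here is a toy parser-combinator sketch (in Python, rather than the app’s own code): each small parser consumes tokens and returns a value plus the remaining tokens, and `seq` and `alt` compose small parsers into larger ones. The grammar shown is a hypothetical fragment, not the app’s actual rules:

```python
# Toy parser combinators: a parser maps a token list to
# (value, remaining_tokens), or None on failure.

def word(w):
    """Parser that matches one exact token."""
    def parse(toks):
        return (toks[0], toks[1:]) if toks and toks[0] == w else None
    return parse

def number(toks):
    """Parser that matches a digit token and converts it to an int."""
    return (int(toks[0]), toks[1:]) if toks and toks[0].isdigit() else None

def seq(*parsers):
    """Run parsers in order; succeed only if all succeed."""
    def parse(toks):
        values = []
        for p in parsers:
            result = p(toks)
            if result is None:
                return None
            value, toks = result
            values.append(value)
        return values, toks
    return parse

def alt(*parsers):
    """Try parsers in order; return the first success."""
    def parse(toks):
        for p in parsers:
            result = p(toks)
            if result is not None:
                return result
        return None
    return parse

# A hypothetical grammar fragment for an acceptance message.
room = seq(alt(word("exam"), word("recovery"), word("therapy")),
           word("room"), number)
eta = seq(word("in"), number, word("minutes"))
accept = seq(word("i'll"), word("go"), word("to"), room, eta)

values, rest = accept("i'll go to exam room 1 in 15 minutes".split())
print(values)
# ["i'll", 'go', 'to', ['exam', 'room', 1], ['in', 15, 'minutes']]
```

The appeal of this style is that each tiny parser is easy to test in isolation, and alternative phrasings or transcription repairs can be added with `alt` without rewriting the whole grammar.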

Due to the uncertainty in processing speech, a new user interface idiom has evolved. Called the conversational user interface, it enables Siri to mediate a conversation between the user and the app and, more abstractly, between the doctor and the charge nurse.

Introducing Conversational Interfaces

As in real life, information is rarely unambiguous. If you ask a taxi driver to drive you from John Wayne Airport in Santa Ana, California to your office “on Main and MacArthur in the next town over”, which is a few blocks away in the city of Irvine, he may instead drive you a few miles further to Main and MacArthur in the city of Costa Mesa. Both cities are adjacent to Santa Ana, and not only do they share the same street names, they are actually the exact same streets, which intersect in two different cities a few miles apart.

With voice interfaces, the app has to support a conversation to clarify the user’s words by asking for more information from the user either because the user has not provided enough information or has provided ambiguous information. The app must also confirm its understanding before taking action on behalf of the user.

Siri mediates this conversation between the user and the app: it handles the speech recognition and passes the text transcript to the app, which processes it with natural language algorithms and other text processing technologies. If the app needs additional information, it asks Siri to prompt the user by providing Siri with text to read aloud; Siri uses text-to-speech to ask those questions. Once the app is satisfied that it understands the user’s request, it asks Siri to have the user confirm or reject its assumptions, and with the user’s permission, the app finally processes the request.
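The prompt/confirm/act cycle just described can be sketched as a small decision function. The slot names and prompt strings below are hypothetical, chosen only to illustrate the flow:

```python
# Hypothetical slots the app fills from the transcript:
# 'intent', 'room', and (for acceptances) 'eta_minutes'.
def next_step(slots, confirmed=False):
    """Return (action, payload): prompt for more info, confirm, or send."""
    if "intent" not in slots:
        return ("prompt", "Are you accepting, rejecting, or completing an assignment?")
    if "room" not in slots:
        return ("prompt", "Which room?")
    if slots["intent"] == "accept" and "eta_minutes" not in slots:
        return ("prompt", "When will you arrive?")
    if not confirmed:
        return ("confirm", slots)  # ask Siri to read back the understanding
    return ("send", slots)         # user approved; deliver the message

print(next_step({"intent": "accept", "room": "OR 1"}))
# ('prompt', 'When will you arrive?')
```

Each turn of the conversation runs this function again with the newly filled slots, so the dialog naturally ends in either a confirmation or another clarifying question.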

What’s next in Conversational Interfaces?

Voice technologies and conversational interfaces empower app developers to use AI to provide alternative ways for users to access their apps. These technologies also extend access to a larger set of users. Through a voice interface, perhaps app developers will finally realize that accessibility and usability are fundamentally intertwined, or as Tim Cook recently said in an interview marking Global Accessibility Awareness Day 2017, accessibility is a human right.

Siri integration for a voice interface

The Patient is in apps are integrated with Siri on both the iPhone and Apple Watch to provide a voice interface for messaging.

Conversations with the charge nurse using Siri on the iPhone: “Hey Siri, read my Patient messages” allows the doctor to hear her patient assignments and “Hey Siri, send a Patient message saying I’ll go to Therapy Room 2 in 10 minutes” allows the doctor to respond to assignments from either her Apple Watch or iPhone with an estimated time of arrival. Notice of assignment completion is supported from both the Apple Watch and iPhone with “Hey Siri, send a Patient message saying I’ve completed my assignment in Recovery Room 2”.

All of these statements are processed by the Patient is in natural language processing (NLP) engine, fixing homophones (“Exam Room 2” vs. “exam room to” vs. “exam room too” vs. “exam room two”) and other linguistic and transcription impediments to create a structured message that the charge nurse’s iPad app can visually display and use to drive real-world processes, such as cleaning the room so that the next patient’s wait time is significantly reduced.
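Putting the pieces together, a simplified end-to-end sketch of transcript-to-structured-message might look like the following. The field names, room list, and patterns are illustrative, not the app’s actual schema:

```python
import re

# Homophones that follow a room word are rewritten as digits.
HOMOPHONES = {"to": "2", "too": "2", "two": "2", "for": "4", "four": "4"}

def parse_message(transcript):
    """Turn a raw Siri transcript into an illustrative structured message."""
    t = transcript.lower()
    t = re.sub(r"(room) (to|too|two|for|four)\b",
               lambda m: m.group(1) + " " + HOMOPHONES[m.group(2)], t)
    message = {}
    if "completed" in t:
        message["intent"] = "complete"
    elif "cannot" in t:
        message["intent"] = "reject"
    else:
        message["intent"] = "accept"
    room = re.search(r"(exam|recovery|therapy) room (\d+)", t)
    if room:
        message["room"] = room.group(1).title() + " Room " + room.group(2)
    eta = re.search(r"in (\d+) (minute|hour)", t)
    if eta:
        factor = 60 if eta.group(2) == "hour" else 1
        message["eta_minutes"] = int(eta.group(1)) * factor
    return message

print(parse_message("I'll go to Therapy Room 2 in 10 minutes"))
# {'intent': 'accept', 'room': 'Therapy Room 2', 'eta_minutes': 10}
```

The resulting dictionary, rather than the raw transcript, is what a charge-nurse app could display and act on, which is the essential difference from relaying Siri’s transcription verbatim.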

For a truly hands-free experience on the iPhone, the doctor needs only to use AirPods or another Bluetooth headset and say “Read my Patient messages” after activating Siri, either with a double tap on either AirPod or with the “Hey Siri” trigger phrase.

The doctor may then respond to the assignment using phrases such as “Hey Siri, send a Patient message…”, as seen in this Apple Watch screen shot:


And on the iPhone, as seen in this screen shot, we see the natural language processing (NLP) engine of the Patient is in fixing homophones (“Exam Room 2” was transcribed by Siri as “exam room to”) and other linguistic and transcription impediments to understand the doctor’s intent:


To ensure the best experience, the internal NLP engine used in the Patient is in apps has been optimized for Siri in a medical environment.

The following are some sample phrases which the doctor may use with Siri on the iPhone, Apple Watch, and HomePod:

  • Send a Patient message saying I’ll go to Exam Room 1 in 15 minutes
  • Send a Patient message saying I’ve completed my assignment in Recovery Room 2
  • Send a Patient message saying I cannot go to Exam Room 2

Exclusive to the iPhone and HomePod, the doctor may ask Siri the following:

  • Read my Patient messages
  • What are my messages on Patient?

Understanding the AI integration with Siri

Click below to learn more about the technology powering the integration with Siri:

Videos of using Siri with the Patient is in app

In the following HomePod video, a group of anesthesiologists use Siri to respond to patient assignments:

In the following iPhone video, the doctor uses Siri to listen to her patient assignments: