A big part of artificial intelligence technology is the area of natural language processing or an LP NLP. Take for example the now popular personal assistant applications like Amazon Echo or apples sire I Apple Siri. These tools are made possible by a subset of natural language processing called natural language understanding. Speech to text software, otherwise called voice recognition software, is perhaps the most basic function for natural language understanding.
As a matter of fact, we’re writing this article right now using only the voice recognition software we bought from a company called Nuance. The product is called Dragon NaturallySpeaking and costs $69.99.
This is the world’s best-selling voice recognition software and, we would argue, the world’s best voice recognition software based on who has adopted it. We were surprised to find out that in fact Nuance partnered with Apple and provides the natural language processing technology that drives apples sire right Apple Siri. We can only assume that Apple chose the best voice recognition technology out there. Nuance, the company that built Dragon NaturallySpeaking, has made numerous acquisitions over the years and even bought technology from IBM. While this voice recognition software may be perfect for courtroom transcribing or writing articles, can we use it to communicate with the computer using natural language, the same natural language that we use to communicate with each other?
There’s an interesting term used in the artificial intelligence world called AI complete. Also referred to as a I hard AI-hard, this term refers to problems so difficult that solving them would require artificial intelligence with a human level of understanding. Once machines reach that level, we should not even need a keyboard or speech, as “brain to computer interfaces” should eventually allow you to give commands with your thoughts. Realistically though, we will still need natural language processing to interact with all the objects around us in the same way that we do among ourselves.
The only thing slightly more frustrating than trying to interact with a mom by Mumbai call center agent is trying to interact with an “artificial intelligence enabled” customer support line. If you’re calling about your checking account, say yes. Yes. We’ve all had these awkward conversations with “artificially intelligent” call centers. The area of natural language processing in focus here is called natural language understanding. Voice recognition software needs to be perfect for natural language understanding to become the preferred input method as opposed to a keyboard, mouse, or telephone keypad.
Getting back to our experience so far with the Dragon NaturallySpeaking voice recognition software, the setup was quite easy. The brief tutorial was useful, and included some real gems such as this one:
While we all had a good chuckle when we first read that, it wasn’t that long before we tried to do that exact same thing. Here are a few more gems:
to who the hell thinks of the entire sentence they’re going to say before they say it? And how do I speak “clearly but naturally”? Anyways.
So far we are impressed with this voice recognition software aside from having to say the word . At period at the end of every sentence. So there’s a pretty obvious flaw. How can we say the P – word without the plug-in assuming were we’re done finishing her our sentence? So the first rule of punctuation is “don’t talk about punctuation”. Speaking of which, you can just say the words and the punctuation appears (? “”-$, Etc.). As for commas, it would be nice if the voice recognition software knew where to insert comments commas.
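To make the punctuation problem concrete, here is a minimal Python sketch of how spoken punctuation words might be turned into symbols, with an escape word for the times you actually want the word itself. This is purely illustrative: the `SPOKEN_PUNCTUATION` table, the `literal` escape word, and the `render_dictation` function are our own assumptions, not how Dragon NaturallySpeaking works.

```python
# Hypothetical table of spoken command words and the marks they produce.
# Only single-word commands are handled in this sketch.
SPOKEN_PUNCTUATION = {
    "period": ".",
    "comma": ",",
    "colon": ":",
    "hyphen": "-",
}

# Hypothetical escape word (an assumption, not a real Dragon command):
# the word that follows it is kept as a plain word.
ESCAPE_WORD = "literal"


def render_dictation(words):
    """Join recognized words into text, replacing spoken punctuation."""
    out = []
    escape_next = False
    for word in words:
        if escape_next:
            # Previous word was the escape word: keep this one verbatim.
            out.append(word)
            escape_next = False
        elif word == ESCAPE_WORD:
            escape_next = True
        elif word in SPOKEN_PUNCTUATION:
            mark = SPOKEN_PUNCTUATION[word]
            if out:
                out[-1] += mark  # attach the mark to the previous word
            else:
                out.append(mark)
        else:
            out.append(word)
    return " ".join(out)


spoken = "say the word literal period at the end period".split()
print(render_dictation(spoken))
```

With an escape word like this, the first "period" survives as a word and only the second one becomes a full stop. The cost is that the escape word itself now needs escaping, which is the same problem one level up.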
So let’s say we wanted to have a discussion about natural language understanding startups. One such startup is called Swift Keys. Note that the voice recognition software did not recognize Swift Keys as a company and capitalize it. We had to capitalize it after the fact using the statement “Swift Keys” “capitalize swift keys”. Another such startup is called alien. Because the entire context of this conversation is about natural language processing and we use the word startups in a sentence, we would expect that the voice recognition software recognizes that we were referring to a startup called Alyen which is spelt with a why y. We were easily able to correct the incorrect spelling of alien without having to touch the keyboard, just using vocal commands but it took about ten seconds.
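The kind of context awareness we were hoping for can be sketched in a few lines of Python. The idea is to keep a list of names that are relevant to the topic and snap any recognized word that looks close enough to one of them. Everything here is an assumption for illustration: the `snap_to_vocabulary` helper, the vocabulary list, and the 0.75 similarity cutoff are ours, and a real recognizer would more likely use acoustic and language-model scores than plain spelling similarity.

```python
from difflib import get_close_matches


def snap_to_vocabulary(word, vocabulary, cutoff=0.75):
    """Return the closest topic-specific name for `word`, or `word` unchanged.

    Comparison is case-insensitive; `cutoff` is a similarity threshold
    between 0 and 1 (an arbitrary choice for this sketch).
    """
    by_lower = {name.lower(): name for name in vocabulary}
    matches = get_close_matches(word.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[matches[0]] if matches else word


# Hypothetical vocabulary for an article about NLP startups.
startup_names = ["Alyen", "Intrexon"]

print(snap_to_vocabulary("alien", startup_names))     # close to a known name
print(snap_to_vocabulary("keyboard", startup_names))  # nothing close, left as-is
```

The obvious trade-off is the cutoff: set it too low and ordinary words get rewritten into company names, set it too high and "alien" stays an alien.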
Another obvious problem could be people’s names. If I want to refer to the founder of Founders Fund, I would say Peter seal Thiel. The voice recognition software should have matched Founders Fund with Peter seal Thiel and figured out who I was talking about. What about Donald Trump Obama and Bill Gates? All good. What about George Stephanopoulos? That works to too. What about Randall J Kirk who is the CEO of in tracks on Intrexon? That kind of worked. What if we start to throw some diversity into the mix? What about the name she shut today Shisha Apte taken from the “random Indian name generator”? Now we can see where this becomes somewhat problematic.
it and analyze at Nanalyze, we are always interested in finding startups that have solved difficult problems. If you think your startup has developed NLP voice recognition technology that can handle some of the problems we have encountered while writing this article, drop us a line.
If you haven’t figured it out yet, the red text shows the problems we encountered with the Dragon NaturallySpeaking voice recognition software while writing this article during which time we didn’t touch the keyboard once tried not to touch the keyboard. Our goal is to transcribe this article using voice recognition software with the resulting content having almost no red text. We’ll continue using Nuance’s voice recognition software as we believe we can use it to write our articles much faster. Eventually. Microphone off.
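If you wanted to put a number on "almost no red text", one common yardstick for speech recognition is word error rate: the number of word substitutions, insertions, and deletions needed to turn what the software heard into what was actually said, divided by the number of words actually said. Below is a minimal Python sketch of that calculation. The `word_error_rate` function name is our own, and the sample phrases are taken from the mistakes in this article; we did not actually measure the article this way.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between two strings, divided by the
    number of words in `reference` (what was actually said)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref) if ref else 0.0


# What we said versus what the software typed.
print(word_error_rate("that works too", "that works to"))
print(word_error_rate("Apple Siri", "apples sire I"))
```

Note that the rate can exceed 1.0 when the software types more wrong words than were spoken, as in the second example.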
Want to pick up a copy of the software we used to write this article? Right now, Dragon NaturallySpeaking Home Version is 30% off and you can download a copy for $69.99 or even go for a 1-week free trial on your smartphone.