Whither Voice Command Over BTh?

I had a request to do a blog entry explaining why Voice Command doesn't work over Bluetooth (BTh) headsets.  The long story is a sordid tale of love, betrayal, and the number 8. 

 

The short story is that the current version of Voice Command relies on microphones that sample at 16 kHz, but BTh headsets only sample at 8 kHz.

 

A sample by any other name...
So, you want the long story, huh?  Okay, first let's talk about how computers collect data from microphones.  Computers do almost everything in discrete chunks.  When you listen to an audio recording, it sounds like a continuous stream of sound.  But it's actually a bunch of small bits of sound played so close together that your ears can't tell the difference.  A microphone is continuous.  It continually outputs a value that represents what it's currently hearing.  But the computer that is hooked up to that microphone is discrete.  The computer generally ignores what the microphone is saying.  Then, every so often, it checks the microphone's current value and writes it down.  Then it goes back to ignoring the microphone again.  When the computer checks the microphone, you could say that it is "taking a sample."  Most people just call this "sampling."
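If you like seeing this sort of thing in code, here is a minimal sketch of "ignore the microphone, then check it every so often." Everything in it is made up for illustration: the microphone function is just a pretend 440 Hz tone, and the names microphone and sample are mine, not anything from Voice Command.

```python
import math


def microphone(t):
    """Hypothetical continuous microphone: the value it reports at time t (seconds).

    Assumption for illustration: it is "hearing" a steady 440 Hz tone.
    """
    return math.sin(2 * math.pi * 440.0 * t)


def sample(signal, rate_hz, duration_s):
    """Check the signal rate_hz times per second and write down each value."""
    count = int(round(rate_hz * duration_s))
    return [signal(n / rate_hz) for n in range(count)]


# Record the same hundredth of a second at two different sampling rates.
samples_8k = sample(microphone, 8000, 0.01)
samples_16k = sample(microphone, 16000, 0.01)

# Doubling the rate doubles how many values get written down.
print(len(samples_8k), len(samples_16k))
```

The computer never sees the sound between two checks. All it has is the list of values it wrote down.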

 

The more often you sample, the more accurate your recording is.  If you took one sample every second and then played it back, it wouldn't sound anything at all like what actually happened.  If you sampled a thousand times a second, you'd end up with something that somewhat resembled the original sound.  Do it 8 thousand times a second, and you'll do a pretty decent job of reproducing simple sounds like speech (this is roughly telephone quality), but will do a poor job of reproducing more complex sounds like music.  The reason is that a recording can only capture pitches up to half its sampling rate, so 8 thousand samples a second loses everything above 4 thousand cycles a second.  You need to sample at least 16 thousand times a second to reproduce music reasonably, and CDs go all the way to 44.1 thousand.

 

We talk about this in terms of "kilohertz" or "kHz."  Kilohertz means "a thousand times a second," so 8 kHz is 8 thousand samples a second and 16 kHz is 16 thousand.
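To make the gap between 8 kHz and 16 kHz concrete, here is a small sketch of what goes wrong when you sample too slowly. It uses two pure tones, 3 kHz and 5 kHz, which I picked purely for illustration. Sampled at 8 kHz, the two tones produce the same list of numbers (apart from a flipped sign), so nothing downstream can tell them apart. Sampled at 16 kHz, they come out different. The function names are hypothetical, and real recording hardware is more involved than this.

```python
import math


def sample_tone(freq_hz, rate_hz, count):
    """Write down `count` samples of a pure tone, taken rate_hz times per second."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def max_difference(a, b):
    """Largest gap between two equally long lists of samples."""
    return max(abs(x - y) for x, y in zip(a, b))


def looks_identical(freq_a, freq_b, rate_hz, count=64):
    """True if the two tones yield the same samples, allowing for a sign flip."""
    a = sample_tone(freq_a, rate_hz, count)
    b = sample_tone(freq_b, rate_hz, count)
    inverted = [-y for y in b]
    return min(max_difference(a, b), max_difference(a, inverted)) < 1e-9


# At 8 kHz the 5 kHz tone is indistinguishable from the 3 kHz tone.
print(looks_identical(5000, 3000, 8000))   # True

# At 16 kHz both tones are below half the sampling rate, so they stay distinct.
print(looks_identical(5000, 3000, 16000))  # False
```

This is the sense in which a slower sampling rate throws information away: anything above half the rate gets mistaken for something lower.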

 

The best laid plans
Some software takes a very long time to write, often multiple years.  A problem facing any software developer on a multiyear project is that the world tends to shift out from under us.  Assumptions we make at the start of development might prove to be incorrect by the time we're done.

 

This is what happened to the Voice Command team.  They set out to make the best voice recognition software they could.  Their original analysis said that they could do much better recognition if they designed their database around the assumption that audio would be sampled at 16 kHz instead of 8 kHz.

 

At this point in history, BTh's future was murky.  Although it had strong backers, it wasn't doing very well.  There were people who promised that it would align the planets, create spiritual harmony, and bring about world peace.  There were others who felt that it was a flash in the pan that wouldn't go anywhere.  And there were a million more opinions that covered pretty much everything in between.  Publicly, Microsoft was pretty cold on BTh at the start, though a number of people in the company were strong proponents. 

So the Voice Command team had a tough decision to make.  Should they assume BTh would continue to flounder and write a better recognizer that relied on built-in microphones (which can sample at 16 kHz)?  Or should they assume BTh would take off and everyone would want to do Voice Command over BTh headsets?

 

They decided that BTh would succeed, but that it was going to take a while to do it.  So they chose to go with the better recognizer first.  Then BTh took off more quickly than they expected.  It hasn't aligned the planets yet, but it's clearly going to be around for a long time to come.  So, in the end, they made the wrong choice.  And, as a result, Voice Command doesn't work on BTh headsets today.

 

Are we learning yet?
Have we learned our lesson, and will we never fail to foresee the future again?  Nope.  This isn't the first time this has happened to us, and it's not going to be the last time either.  We are constantly put in a position where there are competing technologies on the horizon but we only have the resources to support one of them.  Sometimes we'll pick the right one, and sometimes we won't.

 

If our history has shown anything, though, it's that we won't give up.  We'll continue to work on our products and fix our past mistakes.  I can't announce anything with respect to future versions of Voice Command because announcing features is marketing's jurisdiction, and I'm a developer.  But I will say that, yes, we understand that Voice Command over BTh is an important feature.  You'll have to look to our history to figure out where we'll go from there.

 

Mike Calligaro