A quick test and tip for OneNote audio search

Looking through the newsgroup this weekend, I came across this comment: a user was having some difficulty getting audio search to work well. I have had problems with this in the past and wanted to share a tip that may get it working better for you.

First, the quality of the audio should be as high as possible. Dan wrote about this during the OneNote 2007 development phase and documented some tips at his blog.

This tip might get you a little further along the line. Here's the testing I performed along with the tip.

The first step was to get an audio file. I downloaded an audio recording of Patrick Henry's "Give me liberty or give me death" speech (read by Richard Shulman) from http://www.history.org/media/audio.cfm. It's 2.76MB and about as high quality as you can get, so search results for this file should be very accurate.

I added it to a page and made sure audio indexing was enabled in Tools | Options | Audio and Video:


Since the audio indexer only runs when OneNote is idle, and at low priority, I went away for a day's worth of meetings. That gave the indexer plenty of time to run. The rule of thumb is to let the indexer run for 2-3 times the length of the recording. If you have a day's worth of audio, letting the indexer run overnight is probably a good idea.
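To make the rule of thumb concrete, here is a small Python sketch. The function name is my own invention, and the 8 minute recording length is an assumption based on the matches near the 8 minute mark in this test file, not a measured value.

```python
def suggested_idle_minutes(recording_minutes, low=2, high=3):
    """Return the (shortest, longest) idle time suggested by the 2-3x rule of thumb."""
    return recording_minutes * low, recording_minutes * high


# Assumed length: roughly 8 minutes for the speech used in this test.
shortest, longest = suggested_idle_minutes(8)
print(f"Let the indexer idle for about {shortest} to {longest} minutes")
```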

Now, I know the words "I know not" (what course others may take) are in this speech, so I searched for them. OneNote found two results:


Already we have some erratic results. The phrase "I know not" appears in the speech only at the 7:54 mark; the text of the speech at 7:35 is "idle what is". Still, this shows the inexact nature of phonetic matching.

Now for the tip: what to do if no results are found and you really expected some. When no matches are found, the bottom of the results page has a "View More" link that lets you change the threshold for audio searches. Click that.


If some results were found, the UI will look like this:


Obviously, you want to click the "Click here to view matches" link…

Down at the bottom of this task pane is the dropdown to let you lower the confidence level the audio indexer uses to find a match:


Lowering this threshold should cause more potential results to be found, but the quality of the results may be lower. In other words, you may get more "false positive" results. Conversely, raising it to 0.8 will use a higher threshold and narrow results accordingly. In this case, changing it to 0.3 gives the results I expected:


Lowering it to 0.3 finds one extra result, this time the word "inviolate" at the 5:05 mark. Its confidence level was 0.4, which explains why it was not shown with the default of 0.5. At 0.1, a result from "there is no longer any room for hope" gets returned at the 4:58 mark. Lowering the threshold did find more results, but with this test file, they were clearly not correct.
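If it helps to picture what the dropdown is doing, here is a minimal Python sketch of threshold filtering. This is not OneNote's code. The timestamps come from the test above, but only the 0.4 score is real; the other scores are placeholders I picked to be consistent with what showed up at each setting.

```python
from dataclasses import dataclass


@dataclass
class Match:
    timestamp: str
    confidence: float


CANDIDATES = [
    Match("7:54", 0.9),  # assumed score; shown at the default 0.5
    Match("7:35", 0.6),  # assumed score; shown at the default 0.5
    Match("5:05", 0.4),  # score reported in the test above
    Match("4:58", 0.2),  # assumed score; only appeared at 0.1
]


def matches_at(threshold, candidates=CANDIDATES):
    """Keep only the candidates whose confidence meets the threshold."""
    return [m for m in candidates if m.confidence >= threshold]


for threshold in (0.5, 0.3, 0.1):
    found = ", ".join(m.timestamp for m in matches_at(threshold))
    print(f"threshold {threshold}: {found}")
```

Each step down in the threshold can only add results, never remove them, which is why the extra hits tend to be the weaker ones.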

One last quick test here. Since audio searching is based on the phonetic representation of words, a phrase like "I know not" should produce the same results as "eye no knot." A search for that second phrase finds:


Which is exactly what I expected. The test passed!
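Here is a toy Python illustration of why the two phrases are equivalent to a phonetic search. The pronunciation table is hand-written by me for these six words in an ARPAbet-like notation; it is not how OneNote stores or matches audio, just a sketch of the idea that different spellings can reduce to the same sounds.

```python
# Hand-written pronunciations for illustration only (an assumption, not OneNote data).
PRONUNCIATIONS = {
    "i": ("AY",),
    "eye": ("AY",),
    "know": ("N", "OW"),
    "no": ("N", "OW"),
    "not": ("N", "AA", "T"),
    "knot": ("N", "AA", "T"),
}


def to_phonemes(phrase):
    """Convert a phrase to a flat tuple of phonemes using the toy table above."""
    phonemes = []
    for word in phrase.lower().split():
        word = word.strip(".,!?\"'")
        phonemes.extend(PRONUNCIATIONS[word])  # raises KeyError for unknown words
    return tuple(phonemes)


print(to_phonemes("I know not") == to_phonemes("eye no knot."))
```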

Finally, to answer the last question from the original thread: how can you tell whether audio indexing has finished? An easy way (from the test point of view) is to look in the audio cache folder for an FI file for each embedded audio or video file. On my Vista machine, that folder is C:\Users\John\AppData\Local\Microsoft\OneNote\12.0\Audio Cache. To help find pages with embedded files, you can get my powertoy.
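If you would rather not count files by hand, a few lines of Python can do it. This is only a sketch: the function name is mine, it assumes the index files end in .fi, and the folder it checks is the OneNote 12.0 location shown above under the current user's profile, so adjust the path for your version.

```python
from pathlib import Path


def count_fi_files(cache_dir):
    """Count files ending in .fi (case-insensitive) directly inside cache_dir."""
    return sum(
        1
        for entry in Path(cache_dir).iterdir()
        if entry.is_file() and entry.suffix.lower() == ".fi"
    )


if __name__ == "__main__":
    # Assumed location, based on the Vista path shown above.
    cache = (Path.home() / "AppData" / "Local" / "Microsoft"
             / "OneNote" / "12.0" / "Audio Cache")
    if cache.is_dir():
        print(f"{count_fi_files(cache)} .fi file(s) in {cache}")
    else:
        print(f"No audio cache folder found at {cache}")
```

Compare the count against the number of recordings you have embedded; if it is lower, the indexer probably has not finished yet.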

Questions, comments, concerns and criticisms always welcome,


Comments (4)

  1. jsgoodrich says:

    I have read your blog looking for an answer to the following question.

    How can I get audio indexing to work again? This year my OneNote 2010 64-bit will not index any audio. It says that I have 290 pages that have been skipped.

    In C:\Users\jsgoodrich\AppData\Local\Microsoft\OneNote\14.0\Audio Cache

    I have only three audio files with the WMA extension

    I have one main.ri which was updated today.

    There are 13 .fi files, with dates ranging from 4-29-2011

    I have left my computer on, with sleep mode off and the screen saver on, for one week.

    I have looked on every website, newsgroup, and blog I can find to fix the problem. Any thoughts?

    I have also turned off audio indexing, restarted the computer, then turned audio indexing back on. Same number.

  2. John says:

    Can you email me via the link at the upper right?  I want to get some details about your audio card (in control panel sound) that may help isolate the problems.

  3. jsgoodrich says:

    So I have sent you the information. It is now a month later. Any update?

  4. John says:

    Sorry – I have been out of the office and dropped the ball here.  Can you re-send me the email?
