Responding to a question in a tweet

I’m still reading your tweets and saw this one the other day:

Is there a way to "purge" MS Onenote? Kinda like purging AutoCAD of unused blocks, layers, and other things to make the file lighter?

Pasted from <https://twitter.com/archiwiz>

It's kind of hard to tell what Archiwiz is asking. He may want to optimize the notebook section file sizes, or the cache file sizes, or may just be idly speculating on optimization in general. There is a way to "purge" some data from OneNote to minimize file sizes, and the settings that control it can all be found in Tools | Options.

First, the cache file is almost certainly the largest single file OneNote creates on your machine. This is a mirror of all the notebooks you have open. When you make a change to a page, only the data from that change is written to this cache file. This individual change would make no sense except as it applies to the file it is modifying. When the cache syncs to the file, the changes are applied to the resulting section file in the notebook. As you can imagine, that list of changes can grow, and as that list grows, the file size grows right along with it. You can even imagine that if the third change merely removes the second change in the list, then an optimization would be to remove the second and third changes from the list completely. Wouldn't it be great to be able to control how long those lists get before the logic to optimize the file size kicks in? We expose those settings and here's where they are.
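Before getting to the dialog, here is a toy sketch of the change-list idea described above. To be clear, this is not how OneNote actually stores or optimizes anything; the `Change` class, its fields, and the `compact` function are all names I made up for illustration. It only shows the "third change removes the second change, so drop both" idea in a few lines of Python.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Change:
    """One entry in a hypothetical change list (not OneNote's real format)."""
    change_id: int
    kind: str                      # "add" or "remove"
    target: Optional[int] = None   # for "remove": the change_id being undone


def compact(changes: List[Change]) -> List[Change]:
    """Drop each 'remove' change together with the earlier change it undoes."""
    seen = set()
    cancelled = set()
    for change in changes:
        if change.kind == "remove" and change.target in seen:
            # The removal and the change it removes cancel each other out.
            cancelled.add(change.target)
            cancelled.add(change.change_id)
        seen.add(change.change_id)
    return [c for c in changes if c.change_id not in cancelled]


# Three changes, where the third merely removes the second.
log = [
    Change(1, "add"),
    Change(2, "add"),
    Change(3, "remove", target=2),
]

compacted = compact(log)
print([c.change_id for c in compacted])  # only change 1 survives
```

The list went from three entries to one with no change in the end result, which is the kind of saving the optimization pass is after.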

In Tools | Options | Save you can see these settings in the middle of the dialog:

[Image: screenshot of the optimization settings in the Save options dialog]

The second setting there is easy to explain. If you let OneNote sit idle for 30 minutes, it will optimize the files. Apart from file size, the other thing to consider before changing this is power consumption, especially on laptops. If the hard drive has gone to sleep, waking it up and spinning it just to optimize this file might not be the best use of the battery.

The percentage of unused space to allow in the files without optimizing is a little trickier to explain. Think of the cache as needing a little extra "padding" in it to allow for changes to be immediately saved as you enter them. If there were no padding, then every time you added any content to a page, OneNote (technically, the OS) would need to resize that file to add the new data to it. If there is some blank padding allowed, the file won't need to be resized; instead, some of that blank space can be used to hold the changes you make. If you make a lot of deletes, you can wind up with a large amount of blank space, so you can change the OneNote optimize behavior here. I don't recommend changing it, by the way, but feel free to experiment. If you have a laptop, you may want to keep an eye on hard drive access. Tweaking these settings could cause the drive to spin more, which may drain your battery faster. For OneNote 2007, one of the team actually hooked up a power meter to a laptop to test battery consumption on this. I’m looking for photos of that and will post them if I can find them.
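If it helps to see the unused space setting as arithmetic, here is a minimal sketch. Again, this is only my illustration and not OneNote's code: the function names are made up, the byte counts are invented, and the 15 percent figure is just an example value I picked for the demo.

```python
def unused_percent(file_size: int, bytes_in_use: int) -> float:
    """Percentage of the file that is blank padding (hypothetical model)."""
    if file_size == 0:
        return 0.0
    return 100.0 * (file_size - bytes_in_use) / file_size


def should_optimize(file_size: int, bytes_in_use: int,
                    allowed_unused_percent: float) -> bool:
    """True once the blank space exceeds the allowed percentage."""
    return unused_percent(file_size, bytes_in_use) > allowed_unused_percent


ALLOWED = 15.0  # example threshold only, chosen for this demo

# 1000-byte file with 900 bytes in use: 10% unused, so leave it alone.
print(should_optimize(1000, 900, ALLOWED))

# After a lot of deletes only 800 bytes are in use: 20% unused, so optimize.
print(should_optimize(1000, 800, ALLOWED))
```

A lower threshold means less wasted space but more frequent optimizing, which is the hard drive and battery tradeoff mentioned above.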

To get back to a different interpretation of the original question, though, there is not much you can do to "purge" OneNote content. You could probably get a few bytes back by removing author information with the privatizer powertoy at https://blogs.msdn.com/johnguin/archive/2007/12/15/notebook-cleaner-and-privatizer-powertoy.aspx, but you would lose some search functionality. Since we already store your changes as differences, there is not much more optimization possible. Details on the sync mechanism and how this works can be found at https://blogs.msdn.com/descapa/archive/2007/02/20/a-teaser-on-how-onenote-storage-and-replication-works.aspx

Now on to Twitter to see if I can find Archiwiz and let him know I blogged about this.

(Oh, and here's a small teaser. Now that Office 2010 is out in the technical preview, I can finally start talking about testing it. One of the features I test is the Equation support, so expect to be hearing more about that. I'm OOF next week, but intend to talk about it when I am back. Start here if you want to get a jump on this: https://blogs.msdn.com/murrays/)


Questions, comments, concerns and criticisms always welcome,

John