Testing OneNote performance with huge amounts of text on a page

 

I was reading the newsgroups a few weeks ago and saw this question: "Will onenote run better if I keep my pages shorter?"  OneNote performance is pretty good overall, but I had also noticed when I was creating my Project Gutenberg importer that things "went haywire" if I pasted the entire text of Pride and Prejudice into a single OneNote page.  Testers hate the phrase "went haywire" since it doesn't tell us anything specific about what happened - we need to push for details so that when we create a bug report, we have firm data on what actually happened and what we expected to happen.  Normally, the expected behavior comes from the design specification for the feature, so I checked the specification to see what testing had been done.

 

There was no detailed test case for pasting an entire book of prose onto a single OneNote page.  This did not surprise me too much - this is not a common operation at all, and I doubt any significant number of people would do this.  In test terms, we call this a "corner case" - clearly not a mainstream scenario, but something that needs to be tested and understood at a bare minimum.

 

When I noticed performance was poor with a huge amount of text, I wanted to enter a bug.  The first step was to define "huge amount of text."  I use OneNote for my blog entries and have noticed that two pages of text (two screens' worth, if you will) is about 900 words or so.  That seems reasonable.  My notes from the math class I was testing were handwritten, and ran four to six screens in length.  The largest page I could find in the real notebooks we use at work was 11 screens long.  I decided to use A Room with a View as my test case.  When copied and pasted as is from the Gutenberg site, it takes 179 pages to display.  My first test was a quick comparison of pasting the entire contents of the book from the clipboard into Notepad and into OneNote.  Notepad took 22 seconds, OneNote 14 seconds.  Not too bad (and I'm not worried about the specs of the machine I'm using to test at this point).

 

Scrolling through Notepad went fairly quickly, but scrolling in OneNote was noticeably slower.  I looked at the data on the page (using OMSpy) and saw there were 9462 OEs (outline elements).  Each line of text from the file became its own individual element when pasted into OneNote.  Here's what the page XML looks like for the first two lines of the novel, with the actual text inside the CDATA sections:

      <one:OE creationTime="2008-02-03T17:21:09.000Z" lastModifiedTime="2008-02-03T17:21:21.000Z" objectID="{E20F877D-1184-487B-8DB8-F11DF96F03C8}{240}{B0}" alignment="left">

        <one:T><![CDATA[The Signora had no business to do it,&quot; said Miss Bartlett, &quot;no]]></one:T>

      </one:OE>

      <one:OE creationTime="2008-02-03T17:21:09.000Z" lastModifiedTime="2008-02-03T17:21:21.000Z" objectID="{E20F877D-1184-487B-8DB8-F11DF96F03C8}{242}{B0}" alignment="left">

        <one:T><![CDATA[business at all. She promised us south rooms with a view close]]></one:T>

      </one:OE>

 

And a blank line would be stored in OneNote something like this:

      <one:OE creationTime="2008-02-03T17:21:09.000Z" lastModifiedTime="2008-02-03T17:21:21.000Z" objectID="{E20F877D-1184-487B-8DB8-F11DF96F03C8}{243}{B0}" alignment="left">

        <one:T><![CDATA[]]></one:T>

      </one:OE>

 

 

Since each line of text from the original file becomes its own element, we track the last modified time for each element and assign it a unique ID.  This way, if you (or someone else in a shared notebook) make a change, we can show who modified which element and at what time the change took place.  It does add some overhead to the file.  Task Manager showed OneNote's memory usage rising by about 8 MB when displaying the page.  My last quick check: opening the page with the book on it took about 3 seconds.
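
For anyone who wants to repeat the element count without OMSpy, here is a minimal sketch in Python that counts the OE elements in a page's XML.  It assumes you already have the page XML as a string; the namespace URI is my assumption for the OneNote 2007 schema, and SAMPLE_PAGE and count_outline_elements are names made up for this illustration, not part of any OneNote API.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the OneNote 2007 page schema.
# Adjust it if your exported XML declares a different one.
ONE_NS = "http://schemas.microsoft.com/office/onenote/2007/onenote"

# Hypothetical, heavily trimmed page: two text lines and one blank line.
SAMPLE_PAGE = f"""<one:Page xmlns:one="{ONE_NS}">
  <one:Outline>
    <one:OEChildren>
      <one:OE alignment="left"><one:T><![CDATA[The Signora had no business to do it,]]></one:T></one:OE>
      <one:OE alignment="left"><one:T><![CDATA[business at all.]]></one:T></one:OE>
      <one:OE alignment="left"><one:T><![CDATA[]]></one:T></one:OE>
    </one:OEChildren>
  </one:Outline>
</one:Page>"""


def count_outline_elements(page_xml: str) -> tuple[int, int]:
    """Return (total OE count, number of OEs whose text element is empty)."""
    root = ET.fromstring(page_xml)
    total = 0
    blank = 0
    for oe in root.iter(f"{{{ONE_NS}}}OE"):
        total += 1
        text_node = oe.find(f"{{{ONE_NS}}}T")
        # Only an OE that has a text element with no content counts as blank.
        if text_node is not None and not (text_node.text or "").strip():
            blank += 1
    return total, blank


total, blank = count_outline_elements(SAMPLE_PAGE)
print(f"{total} outline elements, {blank} of them blank lines")
```

Counting the blank elements separately gives a rough idea of how much of the overhead comes from empty lines alone.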

 

In this case, OneNote has more elements to deal with than necessary.  Even the blank lines between paragraphs become elements.  Since these text files are formatted for display on 80 column terminals (remember them?) with no word wrap capability, I decided to reflow the text by removing the bogus line feeds at the end of lines within paragraphs and removing what I feel are unneeded blank lines between paragraphs.  For Chapter 1 alone, this reduced the number of outline elements from 515 to 123.  That was a 76% reduction in OE overhead alone.  There are 20 chapters and some preface and footer text, so the number of elements went from about 9600 to about 2400.  Memory usage went to 5.8 MB, down from the original 8 MB.  A big improvement to be sure, but there was still some noticeable sluggishness when switching to the page.
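
The cleanup described above can be sketched in a few lines of Python.  This is not the code from my importer, just an illustration under one assumption: paragraphs in the source file are separated by one or more blank lines.  The function names are made up for this example.

```python
import re


def unwrap_paragraphs(text: str) -> list[str]:
    """Join hard-wrapped lines into one string per paragraph.

    Assumes paragraphs are separated by one or more blank lines.
    The blank separator lines themselves are dropped.
    """
    normalized = text.replace("\r\n", "\n").strip()
    if not normalized:
        return []
    paragraphs = re.split(r"\n\s*\n", normalized)
    return [
        " ".join(line.strip() for line in para.split("\n") if line.strip())
        for para in paragraphs
    ]


def percent_reduction(before: int, after: int) -> float:
    """Percentage by which a count dropped from `before` to `after`."""
    return 100 * (before - after) / before


wrapped = "first line of a paragraph\nsecond line of it\n\nnext paragraph\n"
print(unwrap_paragraphs(wrapped))
print(f"{percent_reduction(515, 123):.0f}% fewer elements in Chapter 1")
```

Each string that comes back would become a single outline element when pasted, and OneNote can then word wrap it to whatever width the outline has.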

 

The delay was less than a second, so I couldn't use my watch to time it very well (this is very informal testing), but I still wanted to improve the performance.  I decided to put one chapter per page.  Now OneNote doesn't have to deal with ~2400 elements all at once, and only has to handle 100-200 at a time.  Memory usage was about 100K at the most - it was such a small change that without writing an app to track it, I couldn't measure it.  Switching to a page with an individual chapter is the same speed as switching to any other page.  And I like the formatting changes as well - if I resize the outline horizontally, the text word wraps as I expect.
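
Splitting the book into one chunk per chapter could look something like the Python sketch below.  The heading pattern is an assumption: it expects each chapter to start with a line like "Chapter I" or "CHAPTER 12", which may not match every Gutenberg file.  Text before the first heading (the preface) is skipped, and any footer text stays attached to the last chapter.

```python
import re

# Assumption: a chapter starts with a line beginning "Chapter" or "CHAPTER",
# followed by a Roman or Arabic numeral and an optional title.
CHAPTER_HEADING = re.compile(
    r"^(?:Chapter|CHAPTER)\s+(?:[IVXLC]+|\d+)\b.*$", re.MULTILINE
)


def split_into_chapters(text: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs, one per chapter found in `text`."""
    matches = list(CHAPTER_HEADING.finditer(text))
    chapters = []
    for i, match in enumerate(matches):
        # A chapter body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        heading = match.group(0).strip()
        body = text[match.end():end].strip()
        chapters.append((heading, body))
    return chapters


book = "Some preface text.\n\nChapter I\nFirst chapter text.\n\nChapter II\nSecond chapter text.\n"
for heading, body in split_into_chapters(book):
    print(heading, "->", len(body), "characters")
```

Each (heading, body) pair would then go onto its own page, with the heading as a natural page title.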

 

Questions, comments, concerns and criticisms always welcome,

John