Testing OneNote performance with huge amounts of text on a page


 

I was reading the newsgroups a few weeks ago and saw this question: “Will onenote run better if I keep my pages shorter?”  OneNote performance is pretty good overall, but while creating my Project Gutenberg importer I had noticed that things “went haywire” when I pasted the entire text of Pride and Prejudice into a single OneNote page.  Testers hate the phrase “went haywire” because it doesn’t tell us anything specific about what happened. We need to push for details so that when we create a bug report, we have firm data on what actually happened and what we expected to happen.  Normally, the expected behavior comes from the design specification for the feature, so I checked the specification to see what testing had been done.


 


There was no detailed test case for pasting an entire book of prose onto a single OneNote page.  This did not surprise me too much: it is not a common operation at all, and I doubt any significant number of people would do it.  In test terms, we call this a “corner case”: clearly not a mainstream scenario, but something that needs to be tested and understood at a bare minimum.


 


When I noticed performance was poor with a huge amount of text, I wanted to enter a bug.  The first step was to define “huge amount of text.”  I use OneNote for my blog entries and have noticed that two pages of text (two screens’ worth, if you will) is about 900 words or so.  That seems reasonable.  My notes from the math class I was testing were handwritten, and ran four to six screens in length.  The largest page I could find in the real notebooks we use at work was 11 screens long.  I decided to use A Room with a View as my test case.  When copied and pasted as is from the Gutenberg site, it takes 179 pages to display.  My first test was a quick comparison of pasting the entire contents of the book from the clipboard into Notepad and into OneNote.  Notepad took 22 seconds, OneNote 14 seconds.  Not too bad (and I’m not worried about the specs of the machine I’m using to test at this point).


 


Scrolling through Notepad went fairly quickly, but scrolling in OneNote was noticeably slower.  I looked at the data on the page (using OMSpy) and saw there were 9462 OEs (outline elements).  Each line of text from the file became its own individual element when pasted into OneNote.  Here’s what the XML looks like for the first two lines of the novel, with the actual text inside the CDATA sections:


<one:OE creationTime="2008-02-03T17:21:09.000Z" lastModifiedTime="2008-02-03T17:21:21.000Z" objectID="{E20F877D-1184-487B-8DB8-F11DF96F03C8}{240}{B0}" alignment="left">


        <one:T><![CDATA[The Signora had no business to do it,&quot; said Miss Bartlett, &quot;no]]></one:T>


      </one:OE>


      <one:OE creationTime="2008-02-03T17:21:09.000Z" lastModifiedTime="2008-02-03T17:21:21.000Z" objectID="{E20F877D-1184-487B-8DB8-F11DF96F03C8}{242}{B0}" alignment="left">


        <one:T><![CDATA[business at all. She promised us south rooms with a view close]]></one:T>


      </one:OE>


 


And a blank line would be stored in OneNote something like this:


      <one:OE creationTime="2008-02-03T17:21:09.000Z" lastModifiedTime="2008-02-03T17:21:21.000Z" objectID="{E20F877D-1184-487B-8DB8-F11DF96F03C8}{243}{B0}" alignment="left">


        <one:T><![CDATA[]]></one:T>


      </one:OE>
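
If you want to count outline elements yourself rather than eyeball them in OMSpy, something like the following Python sketch would work on page XML you have already exported.  It is only an illustration: the namespace URI is my assumption for the OneNote 2007 schema, the sample page wrapper is trimmed down by hand, and the function name is made up for this example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the OneNote 2007 page schema; adjust it to match your export.
ONE_NS = "http://schemas.microsoft.com/office/onenote/2007/onenote"


def count_outline_elements(page_xml: str) -> tuple[int, int]:
    """Return (total OE count, count of OEs whose text is empty)."""
    root = ET.fromstring(page_xml)
    total = 0
    blank = 0
    for oe in root.iter(f"{{{ONE_NS}}}OE"):
        total += 1
        text = oe.find(f"{{{ONE_NS}}}T")
        # An empty CDATA section parses as None or "", so treat both as blank.
        if text is not None and not (text.text or "").strip():
            blank += 1
    return total, blank


# Hand-trimmed sample: one line of text and one blank line, as in the post.
SAMPLE = f"""<one:Page xmlns:one="{ONE_NS}">
  <one:Outline>
    <one:OEChildren>
      <one:OE alignment="left"><one:T><![CDATA[business at all. She promised us south rooms with a view close]]></one:T></one:OE>
      <one:OE alignment="left"><one:T><![CDATA[]]></one:T></one:OE>
    </one:OEChildren>
  </one:Outline>
</one:Page>"""

print(count_outline_elements(SAMPLE))  # (2, 1)
```

The second number is the interesting one here: it tells you how many elements exist only to hold a blank line.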


 


 


Since each line of text from the original file becomes its own element, we track the last modified time for each element and assign it a unique ID.  This way, if you (or someone else in a shared notebook) make a change, we can show who modified which element and at what time the change took place.  It does add some overhead to the file.  Task Manager showed OneNote memory usage rose by about 8 MB when displaying the page.  As a last quick check, I timed opening the page with the book on it: about 3 seconds.


 


In this case, OneNote has more elements to deal with than necessary.  Even the blank lines between paragraphs become elements.  Since these text files are formatted for display on 80-column terminals (remember them?) with no word wrap capability, I decided to reflow the text by removing the bogus line feeds at the end of lines within paragraphs and removing what I feel are unneeded blank lines between paragraphs.  For Chapter 1 alone, this reduced the number of outline elements from 515 to 123.  That was a 76% reduction in OE overhead alone.  There are 20 chapters plus some preface and footer text, so the number of elements went from about 9600 to about 2400.  Memory usage went to 5.8 MB, down from the original 8 MB.  A big improvement to be sure, but there was still some noticeable sluggishness when switching to the page.
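
The reflow step can be sketched in a few lines of Python.  This is not the code from my importer, just a minimal illustration that assumes paragraphs are separated by one or more blank lines; the function name and the placeholder second paragraph are made up for the example.

```python
def unwrap_paragraphs(text: str) -> list[str]:
    """Join hard-wrapped lines into one string per paragraph and drop blank lines."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line ends the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


# The first two lines come from the page XML shown earlier; the last line is a placeholder.
sample = (
    'The Signora had no business to do it," said Miss Bartlett, "no\n'
    "business at all. She promised us south rooms with a view close\n"
    "\n"
    "(placeholder for the next paragraph)\n"
)

before = len(sample.splitlines())       # every source line would become its own OE
after = len(unwrap_paragraphs(sample))  # one OE per paragraph
print(before, after)                    # 4 2

# The Chapter 1 numbers from the post: 515 elements down to 123.
print(round(100 * (515 - 123) / 515))   # 76
```

Each string returned would become a single outline element, and OneNote can then word wrap it to whatever width the outline has.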


 


The delay was less than a second, so I couldn’t use my watch to time it very well (this is very informal testing), but I still wanted to improve the performance.  I decided to put one chapter per page.  Now OneNote doesn’t have to deal with ~2400 elements all at once, and only has to handle 100-200 at a time.  The memory increase was about 100K at most, such a small change that I couldn’t measure it without writing an app to track it.  Switching to a page with an individual chapter is the same speed as switching to any other page.  And I like the formatting changes as well: if I resize the outline horizontally, the text word wraps as I expect.
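
Splitting the book into one page per chapter could look roughly like the Python sketch below.  The heading pattern is a guess on my part (a line starting with "Chapter" followed by a Roman or Arabic numeral), so check it against the actual text file before relying on it; the function name and the tiny sample book are hypothetical.

```python
import re

# Hypothetical heading pattern: a line that starts with "Chapter" plus a numeral.
CHAPTER_HEADING = re.compile(r"^Chapter\s+[IVXLC\d]+\b.*$", re.MULTILINE)


def split_into_chapters(text: str) -> list[tuple[str, str]]:
    """Return (title, body) pairs, one per future page.

    Any text before the first heading is returned under the title "Front matter".
    """
    matches = list(CHAPTER_HEADING.finditer(text))
    if not matches:
        return [("Front matter", text.strip())] if text.strip() else []

    pages = []
    front = text[: matches[0].start()].strip()
    if front:
        pages.append(("Front matter", front))
    for i, match in enumerate(matches):
        # A chapter body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pages.append((match.group(0).strip(), text[match.end():end].strip()))
    return pages


book = "Preface text\n\nChapter I\nFirst chapter body.\n\nChapter II\nSecond chapter body.\n"
for title, body in split_into_chapters(book):
    print(title, "->", len(body), "characters")
```

Note that in this sketch any footer text after the last chapter stays attached to that chapter, so you may want to trim it separately.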


 


Questions, comments, concerns and criticisms always welcome,


John

Comments (6)

  1. Kathy Jacobs says:

    John,

    I would be curious to know how the huge pages look in print preview (or save as PDF for that matter). I am wondering if the old "double line when printing" problem will come back with this scenario….

    Just me being a pain in the ____ again 🙂

  2. My last update was about indirect performance testing of OneNote when I was creating the Gutenberg addin.

  3. JohnGuin says:

    I tested the print preview with an entire text.  It took almost exactly the same amount of time to render the print preview as to open the page initially.  Memory usage went up an equivalent amount.  Good test case, though.

    I did not see the "double line" behavior you mention.  I see that when working with HTML now and then.  These books are "just text," so should not show that behavior.

    John

  4. My last update was about indirect performance testing of OneNote when I was creating the Gutenberg addin

  5. Allen says:

    What do you mean by ‘went haywire?’ 🙂

    I have this problem where I have tabs that are >100mb with mostly embedded images and sometimes they take forever to sync over the network (seem to hang) or function very slowly.

    I can’t tell if this is an issue with OneNote or our filesystem. I am considering moving the notebook to Sharepoint, but am afraid that performance with WebDAV will be worse.

    Suggestions?

  6. JohnGuin says:

    By "went haywire" I mean performance (responsiveness) of OneNote for that page went downhill to the point using it was difficult.  This is different than your case, though.

    OneNote works mostly in "whole file mode" when using notebook sections on shares.  So if you have a 100MB notebook and add a page, we sync the section (not the entire notebook) when the update happens.  100MB is not unreasonable.

    I don’t have enough data to tell if the perf would improve, degrade or remain the same after a move.  The only thing I can think to do is give it a try to see if you can detect the changes for your system.  I don’t know the speed of your LAN connection and that would have the greatest effect on performance for this scenario.

    Let me know how it turns out, and feel free to shoot me an email on this,

    John