A few performance tips for using the OneNote API

Hello world,

On Stack Overflow and Twitter, we often hear questions about how to make queries to the API faster. Here are a few recommendations:

Always log the "X-CorrelationId", "Date" and "Request-Processing-Time" headers on the OneNote API response

First, make sure you log the API's performance-related information. Every response from the OneNote API includes these three headers. Here's an example:

 HTTP/1.1 200 OK
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/xml
Expires: -1
Server: Microsoft-IIS/10.0
X-CorrelationId: d0ff1746-6b9d-456f-ae18-3c7bc822d53e
X-UserSessionId: d0ff1746-6b9d-456f-ae18-3c7bc822d53e
X-OfficeFE: deployment29(8).MSOffice.ClassNotebookApiSingleBox_IN_0
X-OfficeVersion: 16.0.8418.3004
X-OfficeCluster: localhost
P3P: CP="CAO DSP COR ADMa DEV CONi TELi CUR PSA PSD TAI IVDi OUR SAMi BUS DEM NAV STA UNI COM INT PHY ONL FIN PUR"
X-Content-Type-Options: nosniff
Request-Processing-Time: 838.9842 ms
OData-Version: 4.0
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Fri, 28 Jul 2017 16:54:58 GMT
Content-Length: 32552

X-CorrelationId: This uniquely identifies any API request. In case there are any issues with the API, we can use this for debugging.
Request-Processing-Time: How long the API took to process the request, in milliseconds.
Date: The date when the request was made.

Having this in your logs will make debugging performance issues easier.
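
As a rough illustration of the logging advice above, here is a minimal Python sketch. It assumes your HTTP client exposes response headers as a plain name-to-value mapping; the helper names (extract_perf_headers, processing_time_ms, log_perf) and the logger name are made up for this example and are not part of the OneNote API.

```python
import logging
from typing import Dict, Mapping, Optional

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("onenote.perf")

PERF_HEADERS = ("X-CorrelationId", "Date", "Request-Processing-Time")


def extract_perf_headers(headers: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Pick out the three performance-related headers, ignoring header-name case."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered.get(name.lower()) for name in PERF_HEADERS}


def processing_time_ms(value: Optional[str]) -> Optional[float]:
    """Parse a value such as '838.9842 ms' into a float; None if missing or malformed."""
    if not value:
        return None
    try:
        return float(value.split()[0])
    except (ValueError, IndexError):
        return None


def log_perf(headers: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Write the three headers to the log and return them for further use."""
    info = extract_perf_headers(headers)
    logger.info(
        "correlation_id=%s date=%s processing_time_ms=%s",
        info["X-CorrelationId"],
        info["Date"],
        processing_time_ms(info["Request-Processing-Time"]),
    )
    return info
```

Calling log_perf with the headers of every response keeps the correlation id next to the latency in your logs, which is usually what you need when reporting a slow request.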

Use $select to select the minimum set of properties you need

When querying for a resource (say for example, sections inside a notebook), you make a request like:

 GET ~/notebooks/{id}/sections

This retrieves all the properties of the sections, but you may not need them all. If you only need the id and the name of each section, it is better to request:

 GET ~/notebooks/{id}/sections?$select=id,name

The same applies to other APIs - the fewer properties you select, the better.
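
If you build request URLs in code, a tiny helper can make it harder to forget the $select. The sketch below is only an illustration: with_select is a hypothetical name, and the "~" prefix stands in for your API root just as in the requests above.

```python
from typing import Sequence


def with_select(path: str, properties: Sequence[str]) -> str:
    """Append an OData $select clause listing only the given properties."""
    if not properties:
        # Nothing to select: leave the path untouched.
        return path
    # Use "&" if the path already carries a query string, "?" otherwise.
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}$select={','.join(properties)}"
```

For example, with_select("~/notebooks/{id}/sections", ["id", "name"]) produces the same request path as the one shown above.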

Use $expand instead of making multiple API calls

Suppose you want to retrieve all of the user’s notebooks, sections, and section groups in a hierarchical view.

It can be tempting to do the following:

  1. Call GET ~/notebooks to get the list of notebooks
  2. For every retrieved notebook, call GET ~/notebooks/{notebookId}/sections to retrieve the list of sections
  3. For every retrieved notebook, call GET ~/notebooks/{notebookId}/sectionGroups to retrieve the list of section groups
  4. … optionally recursively iterate through section groups

While this will work (with a few extra sequential roundtrips to our service), there is a much better alternative. See our blog post about expand for more details.

 GET ~/api/v1.0/me/notes/notebooks?$expand=sections,sectionGroups($expand=sections)

This will yield the same results in one network roundtrip with way better performance – yay!
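
Once the expanded response is parsed, the hierarchy can be walked locally without further calls. The sketch below assumes the JSON body has the usual OData shape: a top-level "value" array of notebooks, each with nested "sections" and "sectionGroups" arrays and a "name" property. The function name iter_section_paths is made up for illustration.

```python
from typing import Iterator, Tuple


def iter_section_paths(notebooks_response: dict) -> Iterator[Tuple[str, ...]]:
    """Yield (notebook, section) or (notebook, section group, section) name tuples."""
    for notebook in notebooks_response.get("value", []):
        # Sections that sit directly under the notebook.
        for section in notebook.get("sections", []):
            yield (notebook["name"], section["name"])
        # Sections nested one level down, inside section groups.
        for group in notebook.get("sectionGroups", []):
            for section in group.get("sections", []):
                yield (notebook["name"], group["name"], section["name"])
```

This only covers the levels requested by the $expand shown above; deeper nesting of section groups would need either a deeper $expand or follow-up calls.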

When getting all pages for a user, do so for each section separately

While we do expose an endpoint to retrieve all pages, it isn't the best way to get all the pages the user has access to. When the user has many sections, it can lead to timeouts or poor performance. It is better to iterate over the sections and get the pages for each one separately.

So it is significantly better to make several calls like this one, one per section (especially if you don't need all sections):

 GET ~/sections/{id}/pages

than to make several calls like this one (this API is paged, so you won't be able to fetch all pages at once anyway):

 GET ~/pages
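
The per-section approach can be sketched in Python as follows. The HTTP call is injected as get_json (a hypothetical callable that takes a URL and returns the parsed JSON body), so the sketch stays independent of any particular HTTP library. It also assumes that a paged response carries its items in "value" and the link to the next page in "@odata.nextLink", as is conventional for OData v4 responses.

```python
from typing import Callable, Iterable, Iterator


def iter_section_pages(
    section_ids: Iterable[str],
    get_json: Callable[[str], dict],
) -> Iterator[dict]:
    """Yield page metadata section by section, following paging links."""
    for section_id in section_ids:
        url = f"~/sections/{section_id}/pages"
        while url:
            body = get_json(url)
            yield from body.get("value", [])
            # Assumption: the next page, if any, is advertised here.
            url = body.get("@odata.nextLink")
```

Because this is a generator, you can stop early once you have found the pages you need and skip the remaining sections entirely.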

When getting page metadata, override the default lastModifiedTime ordering

It is faster for us to get pages when we don't have to sort them by lastModifiedTime (which is the default ordering). You can achieve this by sorting by any other property:

 GET ~/sections/{id}/pages?$select=id,title,createdTime&$orderby=createdTime

Happy coding,
Jorge