I decided to check out the size of Eric’s DIT …

... take some time, measuring the exact dimensions of Eric's DIT
... and I must say, I've seen a fair number of DITs in my time, and I can say with a
fair amount of certainty, Eric's DIT is the biggest I've ever seen!  I can't believe what a massively huge DIT Eric has.

Note: while this is about an Active Directory database, Exchange is
based on the same database technology, so it would (and does) have a similar space hierarchy.

Table of ESE Space usage:

The black in this table is just the output from 3 of the columns of esentutl /ms adamntds.dit (original report), the blue are columns/rows I've added to break out the space usage in a clearer way:

Name  Friendly name               Owned  Available  Owned(GB)  Avail(GB)  b-tree lvls  % of DB  % of Total Idxs
      <calc: DB real>         268179344       6088   2046.046      0.046
                              268179344        218   2046.046      0.002
      <calc: datatable real>  268178751       5689   2046.041      0.043
                              268178751       2260   2046.041      0.017
      <calc: Row Data>        186707905       2260   1424.468      0.017            5   69.62%
      <Long Values>                  42         18      0.000      0.000            2    0.00%
      <sum: Idx Totals>        81470804       3411    621.573      0.026            0   30.38%          100.00%
      Int: PDNT + Name         11482791          7     87.607      0.000            5    4.28%           14.09%
      Int: NC + objGuid        10870892          5     82.938      0.000            4    4.05%           13.34%
      Att: objectGuid          10791866          3     82.335      0.000            4    4.02%           13.25%
      Att: cn                   9658638         51     73.690      0.000            4    3.60%           11.86%
      Att: name                 8999729         83     68.662      0.001            4    3.36%           11.05%
      Int: Ancestry             7917036         10     60.402      0.000            4    2.95%            9.72%
      Int: Repl USN             7083583        251     54.043      0.002            4    2.64%            8.69%
      Att: objectCategory       5274627         21     40.242      0.000            4    1.97%            6.47%
      Int: Repl Created USN     4479144         34     34.173      0.000            4    1.67%            5.50%
      Att: uSNChanged           4279711          0     32.652      0.000            4    1.60%            5.25%
      Int: deltime               371267         15      2.833      0.000            4    0.14%            0.46%
      Att: isDeleted             261493       2929      1.995      0.022            4    0.10%            0.32%

... deleted about a dozen small indices ...

I'll walk through the calculations I performed on the esentutl /ms output, in the hope that it will be clear ...

First I summed up the owned space for all indices in the datatable; this comes out to
81470804 pages (the "<sum: Idx Totals>" line).
Note the #'s above may not add up exactly because I deleted a dozen or
so super small indices.  I summed up all the indices because it
makes the next calculation easier, and also so we can get the "% of
Total Idxs" column as well.
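As a quick check of that sum, here is a minimal Python sketch that uses only the index rows shown in the table. The names `listed_indices`, `reported_total` and `omitted_pages` are mine, not esentutl output; the page counts are copied from the table.

```python
# Owned page counts for the index rows shown in the table above.
listed_indices = {
    "Int: PDNT + Name": 11482791,
    "Int: NC + objGuid": 10870892,
    "Att: objectGuid": 10791866,
    "Att: cn": 9658638,
    "Att: name": 8999729,
    "Int: Ancestry": 7917036,
    "Int: Repl USN": 7083583,
    "Att: objectCategory": 5274627,
    "Int: Repl Created USN": 4479144,
    "Att: uSNChanged": 4279711,
    "Int: deltime": 371267,
    "Att: isDeleted": 261493,
}
reported_total = 81470804  # the "<sum: Idx Totals>" line

listed_sum = sum(listed_indices.values())
# Pages owned by the small indices that were left out of the table.
omitted_pages = reported_total - listed_sum

# "% of Total Idxs" for one index: its share of all index-owned pages.
pdnt_share = 100 * listed_indices["Int: PDNT + Name"] / reported_total

print(listed_sum, omitted_pages, f"{pdnt_share:.2f}%")
```

The gap between `listed_sum` and `reported_total` is what the dozen or so omitted small indices account for.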

The first thing to understand is that ESE's "owned" space is hierarchical: the "datatable" owns all the space owned by each of
the indices and by the LV B-Tree in the datatable.  But the primary B-Tree for the datatable
also contains (and thus owns) the normal row data.  So the space actually used by the
regular row data for the datatable is
268178751 (datatable) - 42 (datatable's LV B-Tree owned) - 81470804 (owned by sum of all datatable indices) = 186707905 (i.e. the "<calc: Row Data>" line).
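The same subtraction as a small Python sketch. The variable names are just labels I picked for the three page counts; the numbers come straight from the table.

```python
datatable_owned = 268178751  # pages owned by the datatable, including everything beneath it
long_value_owned = 42        # pages owned by the datatable's LV B-Tree
index_owned = 81470804       # pages owned by all datatable indices combined

# What is left after removing the children is the primary B-Tree's own row data.
row_data_pages = datatable_owned - long_value_owned - index_owned
print(row_data_pages)
```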

I then created a couple of columns to turn these page counts into a usable unit (GB), i.e. <# of pages> * 8 / 1024 / 1024, since each page is 8 KB.
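Here is that conversion as a short Python sketch. `pages_to_gb` and `PAGE_SIZE_KB` are hypothetical helper names of mine; the 8 KB page size is taken from the "* 8" in the formula.

```python
PAGE_SIZE_KB = 8  # page size in KB, per the "* 8" in the formula above


def pages_to_gb(pages: int) -> float:
    """Convert a count of 8 KB pages to GB (KB -> MB -> GB)."""
    return pages * PAGE_SIZE_KB / 1024 / 1024


for label, pages in [
    ("whole DB", 268179344),
    ("row data", 186707905),
    ("all indices", 81470804),
]:
    print(f"{label}: {pages_to_gb(pages):.3f} GB")
```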

Finally I added a friendly name column, so you'd know roughly what the index was indexing.

Some analysis:

From the above table we can easily see that the row data is 1,424 GB (69.62% of the database) and all
the indices combined are 621 GB (30.38%).  This breaks out like so:

Based on the table above, a full 30% of this
database is indices!!!  That's a huge amount.  This isn't a
common space breakdown for most AD objects; it happens here because the objects making up
Eric's DIT are very, very small / lightweight.  He was just
creating containers w/ minimal attributes (see Eric's initial post),
so just the base set of indices on a basic object adds up
to a significant portion of each object's overall "footprint" in the DB.

As for the breakdown of the individual index usage, it looks something like this:

Of the secondary indices on the datatable,
10 are always updated!  Another 2 (the very slender ones) are
only updated on delete.  Since there are over 2 billion objects in
this database, that means we inserted about 22 billion B-Tree entries,
kind of neat.
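For the curious, here is one way the 22 billion figure comes out, as a rough Python sketch. Counting one primary B-Tree entry per object on top of the 10 always-updated secondary indices is my reading of the arithmetic, and the variable names are made up for illustration.

```python
objects = 2_000_000_000          # "over 2 billion" objects, rounded down
always_updated_secondary = 10    # secondary indices touched on every insert
primary_entries_per_object = 1   # assumption: one primary B-Tree row per object

entries_per_object = always_updated_secondary + primary_entries_per_object
total_entries = objects * entries_per_object
print(f"{total_entries:,}")
```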

One last, somewhat technical thing that I think a few of you might find
interesting is that even the
largest B-Tree, the 1,424 GB primary B-Tree, is only 5 levels deep.  This means that
locating a specific row (by DNT) will take only 5 disk seeks in the worst
case (cold cache).  B-Trees have a very high fan-out, which keeps disk seeks minimal.

Interestingly, I dumped the root page, and it has only 3 nodes (TAG 0
doesn't count).  This means that we could add about 100x more
data to this B-Tree with no increase in the # of disk seeks needed to fetch a row
from this table.
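A back-of-the-envelope Python sketch of why that headroom exists. It assumes every level below the root has the same average fan-out and treats all row-data pages as leaf pages; both are simplifications of mine, not something esentutl reports.

```python
levels = 5                # depth of the primary B-Tree, from the table
root_children = 3         # nodes found on the root page
leaf_pages = 186_707_905  # assumption: treat all row-data pages as leaf pages

# With a uniform fan-out f below the root:
#   leaf_pages ~= root_children * f ** (levels - 2)
avg_fanout = (leaf_pages / root_children) ** (1 / (levels - 2))

# If the root page filled up to that same fan-out, the tree could hold
# this many times more data without growing a 6th level.
headroom = avg_fanout / root_children

print(f"average fan-out ~ {avg_fanout:.0f}, headroom ~ {headroom:.0f}x")
```

Under these assumptions the estimate lands in the same ballpark as the "about 100x" figure above.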

Anyway that seems like enough for now ...

Comments (17)

  1. Lee Flight says:

    Did you look at the performance hit of adding a new index on this data (assuming there was something to index on these lightweight objects)?


  2. BrettSh says:

    Unfortunately we were already over (time) budget by a week, so we weren’t able to try anything else.  We also wanted to see the rate of offline defrag, and a few other experiments.  We didn’t think of this idea; it is a good one, and we will keep it in mind next time we do a big test.


  3. Some time ago interested thread was started on ActiveDir.org regarding  maximum number of objects supported…

  4. Jim McBee says:

    Yep, that is one whopping big DIT.  Didn’t Compaq/HP create a massive Active Directory using W2K several years back that included every phone directory in the country?   Or was it the State of California.  I forget who did it, but they were trying to create the largest AD database that would feasibly ever be created.  

  6. So I was talking with Brett the other day (yes, that Brett , the one whose blog is only occasionally
