What is the size of a GMail inbox now, and in the near future?


I just heard that Gmail is now finally open for subscriptions, so I headed to gmail.com. While looking at their web page, one weird thing caught my attention: an ever-increasing counter of “megabytes” that was continuously updated. Being a curious guy, I looked at the source. Would this query the server in real time? Surprise, no! The counter is actually a “fake” one.


Initially, I said to myself: How lame… But later I realized something very interesting that can give some insight into how GMail will evolve in the near future. Read on.


Here is the relevant code sequence:



var CP = [
 [ 1122879600000, 2450 ],
 [ 1125558000000, 2550 ],
 [ 1136102400000, 2950 ]
];


var quota;


[…]


function OnLoad() {
  gaia_setFocus();


  MaybePingUser();
  el("gaia_loginform").Passwd.onfocus = MaybePingUser;


  LogRoundtripTime();
  if (!quota) {
    quota = el("quota");
    updateQuota();
  }

}


function updateQuota() {
  if (!quota) {
    return;
  }
 
  var now = (new Date()).getTime();
  var i;
  for (i = 0; i < CP.length; i++) {
    if (now < CP[i][0]) {
      break;
    }
  }
  if (i == 0) {
    setTimeout(updateQuota, 1000);
  } else if (i == CP.length) {
    quota.innerHTML = CP[i - 1][1];
  } else {
    var ts = CP[i - 1][0];
    var bs = CP[i - 1][1];
    quota.innerHTML = format(((now-ts) / (CP[i][0]-ts) * (CP[i][1]-bs)) + bs);
    setTimeout(updateQuota, 1000);
  }
}


It looks like it’s just using the local CP array to compose what looks like a somewhat random number, which is then displayed on the screen. Right now it says 2529.482494, and it seems to grow by about 30 bytes per second. But the algorithm gives some insight into how the storage will grow over time. The key is the CP array. The algorithm assumes linear growth between the three timestamps below and interpolates between the associated capacities:



var CP = [
 [ 1122879600000, 2450 ],
 [ 1125558000000, 2550 ],
 [ 1136102400000, 2950 ]
];
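
As a rough sketch, the same interpolation can be pulled out into a standalone function that takes a timestamp instead of reading the clock. The names quotaAt and mbPerSecond are made up for this illustration and do not appear in Google's page; only the CP values come from the source above.

```javascript
// Checkpoints copied from the page source: [ms since the Unix epoch, quota in MB].
const CP = [
  [1122879600000, 2450],
  [1125558000000, 2550],
  [1136102400000, 2950],
];

// Hypothetical helper (not from the page): interpolated quota in MB for a
// given timestamp. Returns null before the first checkpoint, where the
// page's script displays nothing and simply retries a second later.
function quotaAt(nowMs) {
  let i = 0;
  while (i < CP.length && nowMs >= CP[i][0]) {
    i++;
  }
  if (i === 0) return null;                  // before the first checkpoint
  if (i === CP.length) return CP[i - 1][1];  // past the last one: fixed value
  const [ts, bs] = CP[i - 1];
  const [te, be] = CP[i];
  return bs + ((nowMs - ts) / (te - ts)) * (be - bs);
}

// Hypothetical helper: growth rate in MB per second for segment k,
// that is, between CP[k] and CP[k + 1].
function mbPerSecond(k) {
  return (CP[k + 1][1] - CP[k][1]) / ((CP[k + 1][0] - CP[k][0]) / 1000);
}

console.log(quotaAt(1124218800000)); // halfway through the first segment: 2500
console.log(mbPerSecond(0));         // rate during the first segment
```

For the first segment the rate is 100 MB spread over 31 days, which is a few tens of bytes per second, in the same ballpark as the rate observed on the page.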


Which means, after a little reverse engineering, that at the first timestamp the size of the GMail mailbox will be 2450 MB, then 2550 MB at the second, and 2950 MB at the third. But what are the timestamps? Let’s write a small JScript program to find out:



var d = new Date();
var t = d.getTime();


var CP = [
 [ 1122879600000, 2450 ],
 [ 1125558000000, 2550 ],
 [ 1136102400000, 2950 ]
];


// Convert each checkpoint timestamp to a readable date and print it with its quota.
for (var i = 0; i < CP.length; i++) {
  d.setTime(CP[i][0]);
  WScript.Echo(d, " – ", CP[i][1], " MB ");
}


This displays the following:



Mon Aug 1 00:00:00 PDT 2005  –  2450  MB
Thu Sep 1 00:00:00 PDT 2005  –  2550  MB
Sun Jan 1 00:00:00 PST 2006  –  2950  MB


Which means that GMail will offer 2950 MB, or roughly 3 GB, around Jan 1 2006.
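
To see how far 2950 MB is from a full 3 GB, here is a small sketch that extrapolates past the last checkpoint. This rests on a big assumption: that growth continues at the rate of the last segment. The page's own script does no such thing (it just stops at the final value), and extrapolateDate is a name invented for this example.

```javascript
// Checkpoints copied from the page source: [ms since the Unix epoch, quota in MB].
const CP = [
  [1122879600000, 2450],
  [1125558000000, 2550],
  [1136102400000, 2950],
];

const MB_PER_GB = 1024;

// Hypothetical helper: date at which targetMb would be reached, ASSUMING the
// growth rate of the last segment simply continues after the final checkpoint.
function extrapolateDate(targetMb) {
  const [t0, q0] = CP[CP.length - 2];
  const [t1, q1] = CP[CP.length - 1];
  // Time needed = missing MB divided by (MB per ms) of the last segment.
  return new Date(t1 + ((targetMb - q1) * (t1 - t0)) / (q1 - q0));
}

const last = CP[CP.length - 1];
console.log("Last checkpoint:", new Date(last[0]).toISOString(), last[1], "MB");
console.log("Shortfall from 3 GB:", 3 * MB_PER_GB - last[1], "MB");
console.log("3 GB reached (extrapolated):", extrapolateDate(3 * MB_PER_GB).toISOString());
```

Under that assumption the 3072 MB mark falls in early February 2006, not on Jan 1.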


Hmm… Interesting way for Google to expose their strategic GMail growth details 🙂

Comments (11)

  1. Joe Chung says:

    gmail.com isn’t open yet, is it?

  2. tzagotta says:

    Cool analysis!

  3. Ron Krauter says:

    —quote—

    Would this query the server in real time? Surprise, no! The counter is actually a "fake" one.

    Initially, I said to myself: How lame…

    —quote—

    Not at all lame…if it was done in real time, it would turn out to be a perf problem for them so they probably decided to precalculate the values.

  4. VSIDE says:

    Pretty cool stuff. My math might be off, but I think 3GB = 3072MB and since they’re growing at 100MB/month, won’t they actually reach 3GB about one week into Feb. 2006?

    -Jon

    (p.s. please help me improve file tabs for Visual Studio 2007: http://blogs.msdn.com/vside/archive/2005/08/24/455998.aspx)

  5. AdiOltean says:

    You are right, Jon. I "rounded" 2950 MB to 3 GB.

  6. Here is something from Antimail which I do not pretend to understand, except that your gmail inbox will be close to 3GB by Jan 1st. Cool analysis!! Antimail : What is the size of a GMail inbox now, and in…

  7. max says:

    Actually the formula is not random: it takes the last known size (i-1), then adds the difference between the next known size and the last one, scaled by the fraction of the interval between the two timestamps that has already elapsed since the last known size went live.

    Basically it is a linear progression updated once a second to give the impression that storage is continuously being added.

    The rate at which the number changes is just a function of the 2 sizes involved and the 2 dates (both in the CP array).

    I agree that it is lame and useless; they might as well just post the fixed number and maybe have a backend routine update the script with it from time to time (every few days, say).

    Well, if a company that lives on advertising does false advertising …. well.

  8. AdiOltean says:

    Correct – initially I thought that it was random, but then I figured out the linear interpolation trick.

    I think that whoever wrote this code did this intentionally so that the code can be easily "hacked".