Riffing on Raymond - Network performance...

I keep on doing this; clearly it's evidence of a lack of imagination on my part...

Raymond's post a while ago discussed some of the problems with network latency (no, I'm not going to touch that particular can of worms).

It's amazing how many people don't understand how big a deal this problem is.  When I joined the Exchange team back in the mid 1990s, the perf team was spending a HUGE amount of time analyzing the Exchange store RPC traces trying to figure out ways of squeezing out every single byte from the RPC traffic.

They'd defined compressed forms of Exchange EntryIDs, and they were considering encoding Unicode strings using some neutral encoding (UTF-8 wasn't in widespread use at that point, so they were trying to roll their own).

I came on the team and looked at what they were doing and was astounded.  They were sweating bricks trying to figure out how to squeeze out individual bytes of data from each packet.

The reality was that in the vast majority of cases, all that work didn't actually make a difference.

The reason has to do with the basic nature of Ethernet-based networking (token ring and ATM have different characteristics, but my comments here apply to them as well; it's just that the numbers and behavior are slightly different).

In general, on any LAN, it takes essentially the same time to send one byte of data as it does to send 1K of data.  When you start sending more than 1K of data, the numbers will start to grow (because you're sending more than one packet), but even then, the overhead of sending 10K of data isn't significantly higher than that of sending 1K.

On the other hand, round trips will KILL your performance.  So if you've got a choice between sending 100 messages with 1K in each message and a single message with a 100K payload, you want to send the single 100K message every time.

Needless to say, I'm MASSIVELY glossing over the issues associated with sending data across a network.  The above is simply a reasonable rule of thumb - the round trips are what matters, not the bytes being sent.
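
To put rough numbers on that rule of thumb, here's a back-of-the-envelope sketch in Python.  The model is deliberately naive (each message waits for one full round trip, then pays for its bytes on the wire), and the 2 ms round trip and 10 Mbit/s link speed are figures I've assumed purely for illustration, not measurements from Exchange or anything else.

```python
def transfer_time(messages, bytes_per_message, rtt_s, bandwidth_bps):
    """Estimate seconds to send `messages` sequential request/response exchanges.

    Naive model: every message costs one round trip plus the time needed
    to put its payload on the wire. Ignores headers, framing, and windowing.
    """
    wire_time_s = bytes_per_message * 8 / bandwidth_bps
    return messages * (rtt_s + wire_time_s)


# Assumed figures, for illustration only.
RTT_S = 0.002                # 2 ms round trip
BANDWIDTH_BPS = 10_000_000   # 10 Mbit/s LAN

# Same total payload, sent two different ways.
chatty = transfer_time(100, 1024, RTT_S, BANDWIDTH_BPS)       # 100 x 1K
batched = transfer_time(1, 100 * 1024, RTT_S, BANDWIDTH_BPS)  # 1 x 100K

print(f"100 x 1K messages: {chatty * 1000:.1f} ms")
print(f"1 x 100K message:  {batched * 1000:.1f} ms")
```

In this model the bytes on the wire cost the same either way, so the entire difference between the two cases is the 99 extra round trips.  Shaving a few bytes off each message changes only the wire-time term and leaves those round trips untouched.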

Now, having said all that, when you're dealing with dial-up networks, the rules are completely different.  On a 9600 baud connection, it takes roughly one millisecond to send one byte (each byte is about 10 bits on the wire once you count the start and stop bits), which means that every single byte counts.  In the Exchange case, since Exchange was designed for corporations with wired networks, it made sense to design the client/server protocol for the LAN environment.  But when you're designing a feature that's intended to be used over dial-up, you have to design for that environment instead.  Among other things to consider, on a dial-up network the modems themselves do compression, so compressing the data before transmission isn't always a benefit: compressing already-compressed data tends to make it bigger, assuming the first compression algorithm is worth its salt.
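
Both of those dial-up claims are easy to check with a few lines of Python.  This is only a sketch: the 10-bits-per-byte figure assumes ordinary start/stop-bit framing, the sample text is made up, and zlib is just a convenient stand-in here (it is not the algorithm a modem actually uses).

```python
import random
import zlib


def seconds_per_byte(bits_per_second, bits_per_byte=10):
    """Time to send one byte on a serial link.

    Assumes 10 bits on the wire per byte (8 data bits plus start and stop bits).
    """
    return bits_per_byte / bits_per_second


def compress_twice(data):
    """Compress `data`, then compress the result again; return both outputs."""
    once = zlib.compress(data, 9)
    twice = zlib.compress(once, 9)
    return once, twice


# Made-up, repetitive "message" text so the first pass has something to squeeze.
rng = random.Random(0)  # fixed seed so the run is repeatable
WORDS = ["mail", "store", "packet", "server", "client", "folder",
         "message", "header", "trip", "round", "byte", "modem"]
sample = " ".join(rng.choice(WORDS) for _ in range(5000)).encode("ascii")

once, twice = compress_twice(sample)

print(f"9600 bits/s: {seconds_per_byte(9600) * 1000:.2f} ms per byte")
print(f"raw: {len(sample)} bytes, compressed once: {len(once)}, twice: {len(twice)}")
```

The first pass shrinks the text a lot.  The second pass has nothing left to work with, so all it can do is add its own header and bookkeeping on top.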