Over the past 20 years or so, I’ve written both (I wrote the first NT networking client and I wrote the IMAP and POP3 servers for Microsoft Exchange), so I think I can state this with some authority. I want to be clear – it’s NOT easy to write a server – especially a high-performance server. But it’s a heck of a lot easier to write a server than it is to write a client.
Way back when, when I joined the NT project (back in 1989ish), my job was to write the network file system (redirector) for NT 3.1.
Before that work item was assigned to me, it was on the plate of one of the team’s senior developers. The server was assigned to another senior developer.
When I first looked at the schedules, I was surprised. The development schedule for both the server AND the client was estimated to be about 6 months of work.
Now I’ve got the utmost respect for the senior developers involved. I truly do. And the schedule for the server was probably pretty close to being correct.
But the client numbers were off. Way off. Not quite an order of magnitude off, but close.
You see, the senior developer who had done the scheduling had (IMHO) forgotten one of the cardinal rules of software engineering:
Writing servers is easy, writing clients is hard.
If you think about it for a while, it actually makes sense. When you’re writing a server, the work involved is just to ensure that you implement the semantics in the specification – that you issue correct responses for the correct inputs.
But when you write a client, you need to interoperate with a whole host of servers. Each of which was implemented to ensure that it implements the semantics in the specification.
But the thing is, the vast majority of protocol specifications out there don’t fully describe the semantics of the protocol. There are almost always implementation specifics that leak through the protocol abstraction. And that’s what makes the life of a client author so much fun.
These leaks can be things like the UW IMAP server not allowing more than one connection to SELECT a mailbox at a time when the mailbox is in the MBOX format. This is a totally reasonable architectural restriction (the MBOX file format doesn’t allow the server to support multiple clients simultaneously connecting to the mailbox), and the IMAP protocol is silent on this (this is not quite true: there are several follow-on RFCs that clarify this behavior). So when you’re dealing with an IMAP server, you need to be careful to only ever use a single TCP connection (or to ensure that you never SELECT the same mailbox on more than one TCP connection).
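To make that second option concrete, here is a minimal sketch of the bookkeeping a client might keep so that no two of its connections ever have the same mailbox selected. It is only an illustration: the class name MailboxSelectionGuard, its methods, and the connection ids are all made up for this example, it does no network I/O, and it treats mailbox names as plain case-sensitive strings.

```python
import threading


class MailboxSelectionGuard:
    """Hypothetical helper: remembers which connection has each mailbox SELECTed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = {}  # mailbox name -> id of the connection that selected it

    def _drop(self, conn_id):
        # Forget whatever mailbox this connection had selected before.
        for name, owner in list(self._owner.items()):
            if owner == conn_id:
                del self._owner[name]

    def try_select(self, conn_id, mailbox):
        """Return True if conn_id may SELECT mailbox, False if another connection holds it."""
        with self._lock:
            holder = self._owner.get(mailbox)
            if holder is not None and holder != conn_id:
                return False
            # A connection has at most one selected mailbox at a time.
            self._drop(conn_id)
            self._owner[mailbox] = conn_id
            return True

    def release(self, conn_id):
        """Call when the connection closes or deselects its mailbox."""
        with self._lock:
            self._drop(conn_id)
```

The client would call try_select before sending SELECT on a connection, and either wait or reuse the connection that already holds the mailbox when it returns False.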
They can be more subtle. For example, the base HTML specification doesn’t really allow for accurate placement of elements. But web site authors often really want to be able to place their visual elements exactly. Some author figured out that inserting certain elements in a particular order got the web site laid out the way they wanted. Unfortunately, they were depending on an ambiguity in the HTML protocol (and yes, HTML is a protocol), and that ambiguity had been resolved in one particular way by one particular browser.
But every other browser had to deal with that ambiguity in the same way as the first browser if it wanted to render the web site properly. It’s all well and good to say to the web site author “Fix your darned code”, but the reality is that it doesn’t work. The web site author might not give a hoot about whether the site looks good in your browser; as long as it looks good on the browser that’s listed on the site, they’re happy campers.
The server (in this case the web site author) simply pushes the problem onto the client. It’s easier – if the client wants to render the site correctly, they need to be ambiguity-for-ambiguity compatible with the existing browser.
Ambiguity is a huge part of what makes writing clients so much fun. In fact, I’m willing to bet that every single client for every single network protocol implemented by more than one vendor has had to make compromises in design forced by ambiguities in the design of the protocol (this may not be true for protocols like DCE RPC where the specification is so carefully written, but it’s certainly true for most other protocols). Even a well-specified protocol like IMAP has had 114 clarifications made to the protocol between RFC 2060 and RFC 3501 (the two most recent versions of the protocol). Not all the clarifications were to resolve ambiguities (some resolved spelling errors and typos), but the majority of them were to deal with ambiguities.
Clients also have to deal with multiple versions of a protocol. A CIFS client needs to be able to understand how to talk to at least 7 different versions of the protocol, and it needs to be able to implement its host OS semantics on every one of those versions. For the original NT 3.1 redirector, more than three quarters of the specification for the redirector was taken up with how each and every single Win32 API would be implemented against various versions of the server. And each and every one of those needed specific code paths (and test cases) in the client. For the server, each of the protocol dialects was essentially the same – you needed to know how to implement the semantics of the protocol on the server’s OS.
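To give a feel for what “specific code paths” means, here is a rough sketch of the kind of per-dialect dispatch a client ends up carrying around. Everything in it is invented for illustration: the dialect names, the API names, and the command tuples are placeholders, not the real CIFS dialects or the way the NT redirector was actually structured.

```python
# Placeholder dialect names, ordered oldest to newest.
DIALECTS = ["dialect-a", "dialect-b", "dialect-c"]

# For each client-side API, the dialects at which a new wire encoding appeared.
# The command tuples are made up; a real client would build protocol requests here.
HANDLERS = {
    "create_file": {
        "dialect-a": lambda path: ("OPEN", path),
        "dialect-c": lambda path: ("CREATE_EX", path),
    },
    "set_file_times": {
        "dialect-b": lambda path: ("SET_INFO", path),
    },
}


def resolve(api, negotiated):
    """Pick the newest handler for `api` that the negotiated dialect can express."""
    limit = DIALECTS.index(negotiated)
    for dialect in reversed(DIALECTS[: limit + 1]):
        handler = HANDLERS[api].get(dialect)
        if handler is not None:
            return handler
    # The API has no encoding in this dialect; the client must emulate it or fail.
    raise NotImplementedError(f"{api} cannot be expressed in {negotiated}")
```

Every entry in a table like this is a code path that has to be written and tested, and the NotImplementedError branch is where the client has to decide how to fake the host OS semantics against an older server.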
For the client, on the other hand, you had to pick and choose which of the protocol elements was most appropriate given the circumstances. As a simple example, for the IMAP protocol, clients have two different access mechanisms – you can access the messages in a mailbox by UID or by sequence number. UIDs have some interesting semantics (especially if the client’s going to access the mailbox offline): a UID stays attached to its message, while a sequence number is just the message’s current position in the mailbox and shifts when earlier messages are expunged. The design of the client heavily depends on this choice – there are things you can’t do if you use UIDs, but there’s a different set of things you can’t do if you use sequence numbers. It’s a really tough design decision that will quite literally reflect the quality of your client – is your IMAP client nothing more than a POP3 client on steroids, or does it fully take advantage of the protocol? Another decision made by clients: Do they fetch the full RFC 2822 header from the server and parse it on the client, or do they fetch only the elements of the header that they’re going to display?
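The UID versus sequence number difference is easier to see with a toy model. The sketch below is not an IMAP client; ToyMailbox and its methods are invented for this example, and it ignores UIDVALIDITY entirely. It only models the one rule that matters here: sequence numbers are 1-based positions that get renumbered when a message is expunged, while UIDs do not change.

```python
class ToyMailbox:
    """In-memory stand-in for a mailbox: an ascending list of message UIDs."""

    def __init__(self, uids):
        self.uids = sorted(uids)

    def uid_of(self, seq):
        # Sequence numbers are 1-based positions in the current mailbox.
        return self.uids[seq - 1]

    def seq_of(self, uid):
        return self.uids.index(uid) + 1

    def expunge(self, seq):
        # Removing a message shifts every later message down by one position.
        del self.uids[seq - 1]


mailbox = ToyMailbox([101, 105, 109])
cached_seq = mailbox.seq_of(109)   # a client remembers "message 3"
mailbox.expunge(1)                 # some other session deletes the first message
current_seq = mailbox.seq_of(109)  # the same message is now "message 2"
```

A client that cached the sequence number is now pointing past the end of the mailbox, while a client that cached UID 109 can still find its message, which is why the offline case pushes you toward UIDs.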
So when you’re thinking about writing networking software, just remember the rule:
Writing servers is easy, writing clients is hard.
You’ll be happy you did.