Should vs Must

This morning, my co-worker and I had another disagreement.  This time, it was on the nature of words and their meaning, as well as best practices for mailers.

Now, if my regular readers haven’t figured it out by now, my position when it comes to technology is that it’s a tool used by humans to save time and effort, and that it ultimately serves humans.  It is also prone to failure, and as humans we should expect that.  Because of this, humans will have to go in and diagnose errors, and the easier an error is to trace, the faster a human can resolve it.  This means that error logs should be a bit “wordier” than they strictly need to be, and easily accessible.  This often means sacrificing storage space or hardware efficiency for the sake of making it easier for humans to diagnose problems after the fact.

For example, if you were logging information about emails that you scan, there is certain information that you might log: sending IP, size of message, etc.  You might decide to save the subject line of each message, or you might not.  The reason why you wouldn’t (outside of preserving privacy) is to save on storage costs.  The more you save from the message, the more hard disk space or database storage you need.  However, if you ever have to go back and figure out why a certain message was or was not classified as spam, and you dig through logs or database tables, then knowing the subject line helps tremendously.  Thus, sacrificing hardware efficiency (using more physical storage) makes it easier for a human to troubleshoot issues, in this case why a message was misclassified.
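To make the trade-off concrete, here is a minimal sketch of a scan log entry where the subject line is an optional extra.  The function name, the field names and the JSON format are all my own assumptions for illustration; they are not taken from any real filter.

```python
import json
from typing import Optional


def scan_log_entry(sending_ip: str, size_bytes: int, verdict: str,
                   subject: Optional[str] = None) -> str:
    """Build one log line for a scanned message (hypothetical format)."""
    entry = {
        "sending_ip": sending_ip,
        "size_bytes": size_bytes,
        "verdict": verdict,
    }
    # Storing the subject costs extra bytes per message (and has privacy
    # implications), but it is what a human wants when investigating a
    # misclassification later.
    if subject is not None:
        entry["subject"] = subject
    return json.dumps(entry, sort_keys=True)


terse = scan_log_entry("192.0.2.10", 4096, "spam")
wordy = scan_log_entry("192.0.2.10", 4096, "spam", subject="Your invoice")
print(len(terse), len(wordy))
```

The wordier entry is longer on disk, which is exactly the cost being traded for easier troubleshooting.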

That sets the backdrop for the discussion.  We were discussing the behavior of a certain MTA made by Microsoft.  I don’t know if my co-worker is right regarding the behavior of this MTA, but we were discussing backscatter and how to fix it.  Backscatter is an issue and has been for a long time, but our team addressed it way back in early 2009.  The solution we implemented was pretty clever but also very simple.  In this discussion, I was saying that MTAs that bounce messages should include the reason for the bounce and the original message in the bounce, and that this is required by the RFC (we are talking about DSNs/NDRs, not rejections in SMTP).  My co-worker insisted that this was not required by the RFC and that no part of the original message that bounced (either the full message or just the headers) needs to be included at all.  This is where the discussion gets interesting.

According to RFC 3461, which has to do with Delivery Status Notifications, section 6.2:

As described in [5], the first component of a multipart/report content-type is a human readable explanation of the report.  For a DSN, the second component of the multipart/report is of content-type message/delivery-status (defined in [3]).  The third component of the multipart/report consists of the original message or some portion thereof.   When the value of the RET parameter is FULL, the full message SHOULD be returned for any DSN which conveys notification of delivery failure.  (However, if the length of the message is greater than some implementation-specified length, the MTA MAY return only the headers even if the RET parameter specified FULL.)  If a DSN contains no notifications of delivery failure, the MTA SHOULD return only the headers.
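To picture the three components, here is a rough sketch that assembles a failure DSN with Python's standard email package.  The function name, the example hostnames and addresses, and the choice to return only the headers (as a text/rfc822-headers part) are my own assumptions; a real MTA would fill in many more fields.

```python
from email.message import Message
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_dsn(explanation: str, reporting_mta: str, recipient: str,
              status: str, original_headers: str) -> MIMEMultipart:
    """Assemble a three-part multipart/report (hypothetical helper)."""
    dsn = MIMEMultipart("report", report_type="delivery-status")

    # Component 1: human-readable explanation of the report.
    dsn.attach(MIMEText(explanation, "plain"))

    # Component 2: message/delivery-status, made of a per-message block
    # followed by a per-recipient block.
    per_message = Message()
    per_message["Reporting-MTA"] = f"dns; {reporting_mta}"
    per_recipient = Message()
    per_recipient["Final-Recipient"] = f"rfc822; {recipient}"
    per_recipient["Action"] = "failed"
    per_recipient["Status"] = status
    delivery_status = MIMEBase("message", "delivery-status")
    delivery_status.attach(per_message)
    delivery_status.attach(per_recipient)
    dsn.attach(delivery_status)

    # Component 3: the original message or some portion thereof; this
    # sketch returns the headers only.
    dsn.attach(MIMEText(original_headers, "rfc822-headers"))
    return dsn


dsn = build_dsn(
    "Your message could not be delivered.",
    "mx.example.com",
    "user@example.org",
    "5.1.1",
    "From: sender@example.net\nSubject: Hello\n",
)
print([part.get_content_type() for part in dsn.get_payload()])
```

The third part is the one under dispute in this post: it is what lets a human see which message actually bounced.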

I wasn’t really familiar with the RET parameter, but it, too, is defined:

4.3 The RET parameter of the ESMTP MAIL command

The RET esmtp-keyword on the extended MAIL command specifies whether or not the message should be included in any failed DSN issued for this message transmission.  If the RET esmtp-keyword is used, it MUST have an associated esmtp-value, which is one of the following keywords:

- FULL requests that the entire message be returned in any "failed" Delivery Status Notification issued for this recipient.

- HDRS requests that only the headers of the message be returned.

If no RET parameter is supplied, the MTA MAY return either the headers of the message or the entire message for any DSN containing indication of failed deliveries.

To summarize:

  1. If you specify RET=FULL, return the full message unless it’s too long, in which case you may return only the headers.
  2. If you specify RET=HDRS, return the headers only.
  3. If the DSN contains no notifications of delivery failure (for example, it only reports a successful or delayed delivery), you should return the headers.
  4. If there is no RET value, you can optionally return either the headers or the entire message.
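The four rules above can be sketched as a small decision function.  This is only my reading of the RFC text quoted earlier, not code from any MTA: the function name, the 64 KB size limit, the choice of headers when RET is absent, and the case-insensitive handling of the RET value are all assumptions.

```python
from typing import Optional

# Hypothetical "implementation-specified length" above which only the
# headers are returned even when RET=FULL was requested.
MAX_FULL_RETURN_BYTES = 64 * 1024


def portion_to_return(ret: Optional[str], has_failure: bool,
                      message_size: int,
                      max_full_size: int = MAX_FULL_RETURN_BYTES) -> str:
    """Return "FULL" or "HDRS" for the third part of a DSN."""
    # Rule 3: no failure notifications, so return only the headers.
    if not has_failure:
        return "HDRS"
    # Rule 4: no RET parameter; the MTA MAY pick either, and this sketch
    # picks headers. Note that "nothing" is not one of the options.
    if ret is None:
        return "HDRS"
    value = ret.upper()
    # Rule 2: RET=HDRS requests headers only.
    if value == "HDRS":
        return "HDRS"
    # Rule 1: RET=FULL returns the full message unless it is too long.
    if value == "FULL":
        return "FULL" if message_size <= max_full_size else "HDRS"
    raise ValueError(f"unknown RET value: {ret!r}")


print(portion_to_return("FULL", True, 2_000))
print(portion_to_return("FULL", True, 5_000_000))
```

Every branch returns either the headers or the full message, which is the point of the summary: none of the rules end with the original message being dropped entirely.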

Note that in all of these cases, you really should return something (at no time does the RFC imply that it is perfectly reasonable to not include at least a portion of the original message).  My co-worker, by contrast, takes the position that the term MUST means that it has to be implemented no matter what (the same as me) and that the term SHOULD means that it is optional; it’s not a priority, and if you don’t do it, it’s no big deal.  It’s recommended, but anyone may or may not bother to follow said recommendations.

I really don’t think that’s what the term SHOULD means.  According to RFC 2119:

1. MUST - This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification.

3. SHOULD - This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

5. MAY - This word, or the adjective "OPTIONAL", mean that an item is truly optional.  One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item.

Going by this language, SHOULD (RECOMMENDED) is a stronger term than the merely optional one that my co-worker interprets it as.  The MTA in question, he insists, need not include either the full original message or its headers.  This is done in order to conserve CPU time and bandwidth.  Furthermore, even though the RFC says that you SHOULD do this, you don’t have to.  It’s “merely” a recommendation, and if you don’t follow it, it is no big deal.  Saving bandwidth is more important than <insert anything I could possibly say in an attempt to show him the error of his ways>.  Furthermore still, the term MAY doesn’t even enter into his vocabulary.  Anything in an RFC that says MAY might as well not even be there.  So long as you do all of the MUST stuff, you are in compliance with the RFC.

That’s not how I see things.  In my view, anything in an RFC that is a SHOULD really ought to be implemented.  It’s there because a lot of people have thought this through, and it is so much easier to diagnose a problem with the SHOULD stuff than if you leave it out.  In this case, including the headers or the full original message is valuable in diagnosing backscatter or other issues with a DSN.  Yes, you use up more bandwidth and resources, but it makes troubleshooting much easier and faster than if you don’t include the original message.  Being technically compliant with the RFC while leaving out the SHOULDs is playing around with semantics, in my opinion.  The RFC says that if you are going to leave out a SHOULD, you need a very good reason for leaving it out.  You might be able to get by without it when everything is working fine, but technology has errors and breakages in it.  When that happens, you need an audit trail for humans to follow.  And the more information you have, the easier and faster it will be to resolve.  Technology is supposed to serve humans.  Perhaps this seems a bit counterintuitive, but redundancy and verbosity can be helpful, and what you lose in some areas (bandwidth) you make up in others (troubleshooting).