Why does the access violation error message put the operation in quotation marks? Is it some sort of euphemism?

When an application crashes with an access violation, the error message says something like

The instruction at "XX" referenced memory at "YY". The memory could not be "read".

Why is the operation in quotation marks? Is this some sort of euphemism?

The odd phrasing is a consequence of globalization. The operation name is a verb in the infinitive ("read", "write"), but depending on how the containing message is localized, it may need to take a different form. Since the kernel doesn't understand grammar, it just puts the words in quotation marks to avoid having to learn every language on the planet. Imagine if it tried:

The memory could not be readed.

The kernel tried to form the passive, which is normally done in English by adding "–ed" to the end of the verb. Too bad "read" and "write" are irregular verbs!
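To make the failure concrete, here is a minimal sketch in Python of what "just add –ed" looks like. The helper names `naive_passive` and `build_message` are made up for illustration; nothing like this exists in the kernel.

```python
def naive_passive(verb: str) -> str:
    """Hypothetical helper: form the participle by blindly appending "ed"."""
    return verb + "ed"


def build_message(operation: str) -> str:
    # The sentence is assembled at runtime from a verb the code does not
    # understand grammatically.
    return f"The memory could not be {naive_passive(operation)}."


if __name__ == "__main__":
    for operation in ("read", "write", "execute"):
        print(build_message(operation))
```

None of the three operations comes out right, and that is before anyone tries to translate the sentence into a second language.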

The more conventional solution for this type of problem is to create a separate error message for each variant, rather than building sentences at runtime, so that the text can be translated independently.

The access violation error message is in a pickle, though, because the underlying status code is STATUS_ACCESS_VIOLATION, and that message contains three insertions, one for the instruction address, one for the address being accessed, and one for the operation. If there were three different status codes, like STATUS_ACCESS_VIOLATION_READ, STATUS_ACCESS_VIOLATION_WRITE, and STATUS_ACCESS_VIOLATION_EXECUTE, then a separate string could be created for each. But that's not how the status codes folks decided to do things, and the translation team was stuck having to use the ugly quotation marks.
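The difference between the two designs can be sketched roughly as follows. This is Python for illustration only; the template wording for the per-operation variants, the dictionary, and the function names are all assumptions, not the actual message table.

```python
# One shared template with three insertions, the shape described above.
# The operation word is inserted as-is, so it gets wrapped in quotation marks.
SHARED_TEMPLATE = (
    'The instruction at "{instruction}" referenced memory at "{address}". '
    'The memory could not be "{operation}".'
)

# Hypothetical alternative: one complete sentence per operation, each with
# only two insertions, so a translator can phrase every variant independently.
PER_OPERATION_TEMPLATES = {
    "read": (
        'The instruction at "{instruction}" referenced memory at "{address}". '
        "The memory could not be read."
    ),
    "write": (
        'The instruction at "{instruction}" referenced memory at "{address}". '
        "The memory could not be written."
    ),
    "execute": (
        'The instruction at "{instruction}" referenced memory at "{address}". '
        "The memory could not be executed."
    ),
}


def format_shared(instruction: str, address: str, operation: str) -> str:
    """Single status code: the operation is just a third insertion."""
    return SHARED_TEMPLATE.format(
        instruction=instruction, address=address, operation=operation
    )


def format_per_operation(instruction: str, address: str, operation: str) -> str:
    """Separate string per operation: the operation selects the template."""
    template = PER_OPERATION_TEMPLATES[operation]
    return template.format(instruction=instruction, address=address)


if __name__ == "__main__":
    print(format_shared("XX", "YY", "read"))
    print(format_per_operation("XX", "YY", "write"))
```

In the second design the operation never appears as an insertion at all, which is why it needs one string (and, in the status code world, one code) per operation.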

Comments (24)
  1. pc says:

    I really have always wondered about the awkward phrasing there. I would always read them as some kind of scare quotes or something. Makes you appreciate how good most of Windows is at being localized, that the rare case like this sticks out at you.

  2. Joshua says:

    I always thought it was intended to warn the end user not to try to interpret it, since it is no fewer than two levels of abstraction below what they are working with.

  3. David says:

    So the other forms of this message are as below?

    The instruction at "XX" referenced memory at "YY". The memory could not be "write".

    The instruction at "XX" referenced memory at "YY". The memory could not be "execute".

  4. Lockwood says:

    Ah, so the "correct" pronunciation of that error is in fact

    The instruction at XX referenced memory at YY. The memory could not be reed

    rather than

    The instruction at XX referenced memory at YY. The memory could not be red

    since it is talking about the operation "to read" rather than the statement "was read"?

  5. Medinoc says:

    This is weird, because I remember the action word actually being in the past tense:

    My memory may be tricking me, but I remember seeing, on a French Windows 9x, "La mémoire ne peut pas être "written"." And Google agrees with me.

    That would mean the problem is not tense, but rather that the action word is not localized.

  6. David Bakin says:

    Lisp string formatting has a code to form a plural, see ~P at http://www.cs.cmu.edu/…/node200.html where the example shows it can even handle "try"/"tries".

    It would sure be convenient to have that facility in C#, but … globalization.

  7. RobSiklos says:

    Why not this:

    "The instruction at "XX" referenced memory at "YY". The "read" operation failed."

  8. I can't remember whether it's ReSharper or Visual Studio's native ability, but one of them, when suggesting variable names in code templates, generally works plural collection variables back to their singular form correctly.  Although I guess working backwards from a plural to a singular might be a bit easier.  I'm tempted to find out what it suggests when the plural and singular are the same.

  9. anyfoo says:

    While I understand the problem in general, in this specific instance, I am wondering why the error message is not phrased to something like

       The instruction at "XX" referenced memory at "YY". The "read"-operation could not be performed due to an access violation.

    Which prevents the problem in the same way, but sounds way less awkward.

  10. I'll jump on the bandwagon.

    Here's my "most natural language" attempt:

    Instruction XX said to (read from/write to/execute) YY, but we couldn't.

    Here's my "most friendly to the status folks and localizer folks" attempt.

    Access violation. Source instruction: XX. Target instruction: YY. Attempted access: (read/write/execute)

    Both of these fix the use of passive tense (ick.)

  11. @anyfoo says:

    > Which prevents the problem in the same way, but sounds way less awkward

    True, but you've access violated so the situation is already awkward and won't be smoothed over by correct grammar. This is hopefully a rare situation. And doesn't Windows now say something generic to the user like 'the application has stopped working… blah blah other stuff about looking for a fix online, reporting it, or whatevs'.

  12. Andy says:

    I'd guess that there's an aversion to changing the text of error messages where the benefit is low, since it reduces the number of hits when searching the knowledge base or the wider internet. I agree that there could have been better choices of phrase originally.

  13. JJJ says:

    @Maurits:  Normally I'm a fan of the royal "we", but I don't think it works in this situation.  Who is "we" here?  It's the operating system.  The operating system didn't try to do the bad operation.

    And it just doesn't sit well with me to present "we" as the nebulous concept of "the computer".  I'm not sure I can articulate why.

    Anyway, my personal preference would be not to try to make such a technical error message friendly.  I'd rather the message say "The program encountered a problem.  Diagnostic info: [at <INSTRUCTION> <ACCESS>:<LOCATION>]".

  14. Anon says:

    The error message is perfectly clear. It has never been unclear.

    Why is everyone re-writing it with awkward English constructs? Some of the re-writes don't even contain all of the necessary data!

  15. lucidfox says:

    Actually, the message for the write operation is The memory could not be "written".

  16. Azarien says:

    @Maurits: this colon-style is often used in Microsoft software in my native Polish, probably to avoid complex noun inflections. Imagine having such messages all over the place: "Found files: 4" instead of more natural "Found 4 files". Just because there's more than one form for plural "files", depending on the number used.

    I understand that localization is a very hard task, but after all those years I think it's time to finally do it right, language by language.

  17. pc says:

    I'll certainly take it over the mess of things like one of my "favorite" errors I used to see regularly years ago, -2147418105 (80010007) "The callee (server [not server application]) is not available and disappeared; all connections are invalid. The call may have executed."

  18. Kythyria says:

    @Azarien: I suspect there's no sane way to build the infrastructure to do that considering how complex, in aggregate, the world's languages are.

    And I prefer "Found: 4" to anything more wordy anyway. Ditto for the original message.

  19. cheong00 says:

    @Medinoc: Agreed in the old days these were not localized, but from what I remember in WinXP or later they have been localized too. (In CHT version anyway.) Anyway since I used only English version of Win2k and WinME, I don't know "how localized" they are.

  20. Myria says:

    It's interesting that Windows NT smashes most general protection faults into STATUS_ACCESS_VIOLATION and not just page faults, despite being quite different at a processor level.

    Windows NT presumably puts considerable effort toward opcode decoding after an exception, because determining whether a #GP should be a STATUS_ACCESS_VIOLATION or a STATUS_PRIVILEGED_INSTRUCTION requires decoding the opcode.  It's the same for turning #DE into STATUS_INTEGER_DIVIDE_BY_ZERO versus STATUS_INTEGER_OVERFLOW, where the difference depends on the value of the operand of the "div" instruction.

  21. MazeGen says:

    @Myria: Correct me if I'm wrong but privileged instruction should be INT 6 (#UD) and access violation either INT 13 (#GP) or INT 14 (#PF).

  22. ender says:

    @Kythyria: plural forms can be localized as gettext shows us. Windows 7 used some really awkward phrasing in Slovenian translation because it can't handle plurals.

  23. Matt says:

    @Myria: Yeah. The code behind page faults is messy like that. It decodes the instruction and then resolves the divisor, which might be a register, a memory location or even a SIB byte. It's not pretty, but it's also quite nice in that it gives the programmer a unified status code across different platforms that may have rather different types of "something went wrong" exceptions under the hood.

  24. dave says:

    >It's interesting that Windows NT smashes most general protection faults into

    >STATUS_ACCESS_VIOLATION and not just page faults, despite being quite different

    > at a processor level.

    Processor independence, I suppose.  Generally speaking, the only commonality you get is "the MMU says no".
