Again, I am remiss in my postings...too many irons in the fire these days. Two weeks ago, I posted a challenge to decompose a set of character data (The ANSI Latin 1 Character Set) into valid and invalid equivalence class subsets in order to test the base filename parameter of a filename passed to COMDLG32.DLL on the Windows Xp platform from the user interface using the File Save As... dialog of Notepad.
As illustrated below the filename on a Windows platform is composed of two separate parameters. Although the file name parameter of the Save As... dialog will accept a base filename, a base filename with an extension, or a path with a filename with or without an extension, the purpose of the challenge was to decompose the limited set of characters into equivalence class subsets for the base filename component only (the part outlined with green). (Of course, complete testing will include testing with and without extensions, but let's first focus on building a foundation of tests to adequately evaluate the base filename parameter first, then we can expand our tests from there to include extensions.)
As suggested in the earlier post, in order to adequately decompose this set of data within the defined, real world context (and not in alternate philosophical alternate universes) a professional tester would need to understand programming concepts, file naming conventions on a Windows platform, Windows Xp file system, basic default character encoding on the Windows Xp operating system (Unicode), some historical knowledge of the FAT file system, and even a bit of knowledge of the PC/AT architecture. The following is a table illustrating how I would decompose the data set into equivalence class subsets.
|Valid Class |
V1 – escape sequence literal strings
V2 – space character (0x20) (but not as
V3 – period character (0x2E) (but not as
V4 – ASCII characters
alpha (0x41 – 0x5A, 0x61 – 0x7A)
V5 – Ox80 through 0xFF
V6 – 0x81, 0x8D, 0x8F, 0x90, 0x9D
V7 – Component length between 1 – 251
V8 – Literal string CLOCK$ (NT 4.0 code
V9 – a valid string with a reserved
I1 – control codes
I2 – escape sequence literal string NUL
I3 – Tab character
I4 – reserved words
I5 – reserved words
I6 – reserved characters (/ : < > | )
I7 – reserved character 0x22 as the only
I8 – a string composed of > 1 reserved character
I9 – a string containing only 2 reserved
I10 – period character (0x2E) as only
I11 – two period characters (0x2E) as only
I12 – > 2 period characters (0x2E) as only
I13 – reserved character 0x5C as the only
I14 – space character (0x20) as only character in
I15 – space character (0x20) as first character in
I16 – space character (0x20) as last character in a
I17 – reserved characters (* ?) (0x2A, 0x3F)
I18 – a string of valid characters that contains at least one reserved characters (* ?) (0x2A, 0x3F)
I19 – a string of valid characters that contains at
I20 – string > 251 characters
I21 – character case sensitivity
I22 – empty
Discussion of valid equivalence class subsets
- Valid subset V1 is composed of the literal strings for control characters (or escape sequences) between 0x01 and 0x1F, and including 0x7F. The literal strings for control characters may cause problems under various configurations or unique situations. The book How to Break Software: A Practical Guide to Testing goes into great detail explaining various fault models for these various character values. The literal strings in this subset should be tested as the base filename component and possibly in a separate test as an extension component. However, on the Windows platform the probability of one particular string in this subset behaving or being handled differently than any of the others is very low negating the need to test every string in this subset; although the overhead to test all would be minimal and once complete would not likely require repeated testing of all literal strings in this subset during a project cycle.
- Valid subset V2 provides guidance on the use of the space character in valid filenames. On the Windows operating system a space character (0x20) is allowed in a base filename, but is not permitted as the only character as a file name. Typical behavior on the Windows platform also truncates the space character if it is used as the first character of a base filename or the last character of a base filename. However, if the extension is appended to the base filename in the Filename edit control on the Save or Save As… dialog a space character can be the last character in the base filename. Also note that a space character by itself or as the first character in a filename is acceptable on a UNIX based operating system. Also, although we can force the Windows platform to save a file name with only a space character by typing “ .txt” (including the quotes) in the Filename edit control on the Save/Save As… dialog this practice is not typical of reasonable Windows users’ expectations.
- Valid subset V3 is the period character (0x2E) which is allowed in a base filename, but it is not a valid filename if it is the only character in the base filename (see Invalid subset for the period character).
- Valid subset V4 is composed of ‘printable’ ASCII characters that are valid ASCII characters in a Windows filename. The subset includes punctuation characters, numeric characters, and alpha characters. We could also decompose this subset further into additional subsets including valid punctuation characters, numbers, upper case, and lower case characters if we wanted to ensure that we test at least one element from the superset at least once.
- Valid subset V5 is the set of character code points between 0x80 and 0xFF.
- Valid subset V6 is a superset of subset V5 and are separated only because they are code points that do not have character glyphs assigned to those code point values. These would be interesting especially if we needed to test filenames for backwards compatibility on Windows 9x platforms.
- Valid subset V7 is the minimum and maximum component length assuming the filename is saved in the root directory (C:\).
- Valid subset V8 is a probably a red-herring. On the NT 4 platform the string CLOCK$ was a reserved word. On an older application first created for the Windows NT 4 platform that does not use the system Save/Save As dialog we might want to test this string just to make sure the developer did not hard code the string in an error handling routine.
- Valid subset V9 is an interesting case because this invalid reserved character (0x22) is handled differently when used in first and last character positions of a base filename. When used in the first and last positions of a base filename the characters are truncated and if the remaining string is valid the filename is saved. If only one 0x22 character is used, or if two or more 0x22 characters are used in a string other than the first and last character positions the result will be an error message.
Discussion of invalid equivalence class subsets
- Invalid subset I1 consists of the control code inputs for escape sequences in the range of 0x01 through 0x1F, and also includes 0x7F. Pressing the control key (CTRL) and any of the control codes keys will cause a system beep.
- Invalid subset I2 is the literal string “nul”. Nul is a reserved word but could be processed differently than other reserved words on the Windows platform because it is also used in many coding languages as a character for string termination.
- Invalid subset I3 is the tab character which can be copied and pasted into the Filename textbox control. Pasting a tab into the and pressing the save button will generate an error message.
- The invalid subset I4 includes literal strings for reserved device names on the PC/AT machine and the Windows platform. Using any string in this subset result in an error message indicating the filename is a reserved device name.
- Invalid subset I5 also includes reserved device names for LPT5 – LPT9 and COM5 – COM9. However these must be separated into a unique subset because using these specific device names as the base filename on the Windows Xp operating system result in an error message indicating the filename is invalid.
- Invalid subsets I6, I7, and I8, include reserved characters on a Windows platform. When characters in this subset are used by themselves or in any position in a string of characters the result is an error message indicating the above file name is invalid.
- Invalid subsets I9, I10, I13, also include reserved characters and the space and period characters. When these subsets are tested as defined no error message displayed and focus is restored to the File name control on the Save/Save As… dialog.
- Invalid subsets I11, I12, also include the reserved character (0x2E) as 2 characters in the string and greater than 2 characters in a string. The state machine changes are different.
- Invalid subsets I15 and I16 define the space character when used in the first or last character position of a string. These are placed in the invalid class because Windows normal behavior is to truncate a leading or trailing space character in a file name. If the leading or trailing space character was not truncated and saved as part of the file name on a Windows platform that would constitute a defect.
- Invalid subset I17 and I18 contains two additional reserved characters; the asterisk and the question mark (0x2A and 0x3F respectively). If these characters are used by themselves or as a character in a string of other valid characters a file will not be saved, and no error message will occur. However, the state of the Save/Save As… dialog does change. If the default file type is .txt and there are text files displayed in the Folder View control on the Save As… dialog the files with the .txt extension will be removed after the Save button is depressed. If the default file type is All files then all files will be removed from the Folder View control on the Save As… dialog after the Save button is depressed.
- Invalid subset I19 is a string of valid characters which contains at least backslash character except as the lead character in the string. (Of course, this assumes the string is random and the position of the backslash character in the string is not in a position which would resolve to a valid path.) The backslash character is a reserved character for use as a path delimiter in the file system. An error message will appear indicating the path is invalid.
- Invalid subset I20 tests for extremely long base file name lengths of greater than 252 characters. Note that an interesting anomaly occurs with string lengths. A base file name string length which tests the boundaries of 252 or 253 valid characters will cause an error message to display indicating the file name is invalid. However, a base file name string length of 254 or 255 characters will actually get saved as file name but is not associated with any file type. Any base file name string longer than 255 characters again instantiates an error message.
- Invalid subset I21 describes the tests for case sensitivity. The Windows platform does not consider character case of characters that have an upper case and a lower case representation. For example, a file name with a lower case Latin character ‘a’ is considered the same as a file name with the upper case Latin character ‘A’.
- Invalid subset I22 is, of course, an empty string
Of course, this is a partial list of the complete data set since the filename on a Windows Xp operating system can be any valid Unicode value of which there are several thousand character code points, including surrogate pair characters.
The first and by far the most complex step in the application of the functional technique of equivalence class partitioning is data decomposition. This requires an incredible amount of knowledge about the system. Data decomposition is an exercise in modeling data. The less one understands the data set, or the system under test the greater the probability of missing something. Next week we will analyze the equivalence class subsets to define are baseline set of tests to evaluate the base filename component.