We recently released a new version of HANGMAN.VBS and published it in SAP note 948633 (see the attachment below). HANGMAN is not a game. It is a tool for analyzing any kind of hanging situation in SAP systems running on SQL Server. It has been used to analyze many issues over the last few years. So far, HANGMAN has been mentioned and used in 300 official SAP customer messages (service requests).
Giving the tool a more respectable name would probably have been a good idea. However, the tool grew over the years and the name is already well known at SAP. Everything started in 2002 with a quick and dirty Windows batch file called HANGMAN.BAT. Over the last few years, HANGMAN has turned from a batch file that depended on several executables into an easy-to-use but comprehensive VB-Script.
What is a hanging situation?
Various kinds of issues can result in a hanging situation of an SAP system. The user experience is always the same: the user sees an hourglass and simply talks about a hanging situation or a hiccup of the system. If the issue disappears within a few minutes, it is rather difficult to analyze the root cause of the problem. The same applies if an inexperienced administrator simply restarts the system to solve the issue and thereby covers the tracks. Hanging situations are among the most severe issues but also the hardest to investigate, especially when they occur frequently but not periodically.
A typical hanging situation is caused by resource bottlenecks or by waits on special, exclusively used resources. The first thing a database person typically has in mind is a blocking database lock. On the SAP side, a work process could also be waiting for a semaphore. There could be no free SAP dialog work processes or even no free worker threads in SQL Server. Sometimes a high CPU or disk I/O workload slows down the whole system. It is quite evident that you have to check different areas (SAP, SQL, Windows) on different servers (database and application servers) at the same point in time in order to analyze such an issue.
How does HANGMAN work?
The idea of HANGMAN is to collect information from SAP, SQL and Windows on all servers of an SAP system and consolidate it into a single log file. Once a hanging situation occurs, HANGMAN can be started simply by double-clicking a file. The log file created by HANGMAN can be analyzed later by an administrator or by SAP support.
The minimum requirements for HANGMAN are typically fulfilled by all SAP customers nowadays: Windows 2000 or newer, SQL Server 2000 SP3 or newer, SAP kernel 4.5B or newer, and osql.exe (or sqlcmd.exe) installed. Once you double-click HANGMAN.VBS, you have to fill in a few input boxes (SAP CI, SAP SID, database instance, …; see SAP note 948633 for details). You can avoid these input boxes by passing parameters to the VB-script or by creating a batch file that includes the parameters. The batch file could look like this:
cscript.exe HANGMAN.VBS -SQLSERVER mysrv -DBNAME PRD -SAPCI mysrv -SAPSID PRD -SAPPORT 3600
cscript.exe HANGMAN.VBS -JAVAONLY -SQLSERVER mysrv -DBNAME PRD
When you use the parameter JAVAONLY, you do not get the SAP (ABAP) work process list. This is useful for SAP systems without an ABAP stack or even for non-SAP systems.
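If you start HANGMAN from your own tooling rather than a batch file, the same command lines can be assembled programmatically. The following is only a minimal sketch in Python: the helper name build_hangman_command is hypothetical, and it uses nothing beyond the parameters shown in the two example lines above.

```python
def build_hangman_command(params, java_only=False):
    """Return the argument list for one HANGMAN run.

    params maps parameter names (without the leading dash) to values.
    Only the parameters from the example batch file are assumed here.
    """
    cmd = ["cscript.exe", "HANGMAN.VBS"]
    if java_only:
        # JAVAONLY is a switch without a value
        cmd.append("-JAVAONLY")
    for name, value in params.items():
        cmd += ["-" + name, str(value)]
    return cmd


abap_run = build_hangman_command({
    "SQLSERVER": "mysrv",
    "DBNAME": "PRD",
    "SAPCI": "mysrv",
    "SAPSID": "PRD",
    "SAPPORT": 3600,
})
java_run = build_hangman_command(
    {"SQLSERVER": "mysrv", "DBNAME": "PRD"}, java_only=True
)

print(" ".join(abap_run))
print(" ".join(java_run))
```

The resulting lists can be passed to subprocess.run on a Windows host where cscript.exe and HANGMAN.VBS are available.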
Once it has all parameters, HANGMAN connects to SQL Server to determine its version. This is necessary because the SQL scripts used for SQL 2000 are completely different from the ones for SQL 2005, which depend on the SQL Server 2005 Dynamic Management Views. Then it asynchronously collects the Windows and SAP process lists from each SAP application server and writes them into a single log file. The SAP process list is retrieved (remotely) using the COM interface of the SAP dispatcher, so no free SAP dialog work process is needed. The Windows process list is retrieved (remotely) using WMI.
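The two ideas in this step, choosing a script set by SQL Server version and collecting from several servers at once into one log, can be illustrated with a short Python sketch. This is not HANGMAN's actual code: the function names, the script set labels and the placeholder collector are assumptions for illustration. The only outside fact used is that SQL Server 2000 reports major version 8 and SQL Server 2005 reports major version 9.

```python
from concurrent.futures import ThreadPoolExecutor


def script_set_for(product_version):
    """Pick a script set from a version string such as '9.00.3042.00'."""
    major = int(product_version.split(".")[0])
    # Major version 8 is SQL Server 2000; from version 9 (SQL Server 2005)
    # on, the Dynamic Management Views are available.
    return "sql2000" if major <= 8 else "sql2005_dmv"


def collect_from_server(server):
    """Hypothetical stand-in for the remote SAP and Windows process list queries."""
    return (f"[{server}] SAP process list (placeholder)\n"
            f"[{server}] Windows process list (placeholder)\n")


def collect_all(servers, collector=collect_from_server):
    """Query all servers concurrently and join the results in input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        # pool.map returns results in the order of the inputs,
        # regardless of which server answers first
        sections = list(pool.map(collector, servers))
    return "".join(sections)


print(script_set_for("8.00.2039"))
print(script_set_for("9.00.3042.00"))
print(collect_all(["app01", "app02"]), end="")
```

Collecting concurrently matters here because a single slow or hanging application server would otherwise delay the snapshot of all the others.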
After that, all SQL sessions, all blocking requests and all database locks are retrieved using OSQL (or SQLCMD). For the OSQL database connection you need a free worker thread in SQL Server. Alternatively, you may start HANGMAN with the DAC parameter. In this case HANGMAN uses SQLCMD and the SQL Server dedicated admin connection. To make sure that HANGMAN itself does not become a hanging application, all HANGMAN tasks have an internal timeout. As a result, the list of database locks may be truncated. Typically this happens after a few hundred thousand database locks.
Finally, the SQL Server Error Log and, if configured, an extract of all SAP developer traces of all application servers are added to the HANGMAN log file.
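The effect of such an internal timeout, a list that is cut off once the time budget is used up, can be shown with a small Python sketch. It is an illustration only and makes no claim about how HANGMAN implements its timeouts; the function name and the injectable clock parameter are assumptions.

```python
import time


def collect_with_timeout(rows, timeout_seconds, clock=time.monotonic):
    """Copy rows until the time budget is used up.

    Returns (collected_rows, truncated). The clock argument exists only
    so that the behavior can be checked without real waiting.
    """
    deadline = clock() + timeout_seconds
    collected = []
    for row in rows:
        if clock() >= deadline:
            # Budget exhausted: keep what we have and flag the truncation
            return collected, True
        collected.append(row)
    return collected, False


# Five placeholder lock rows fit easily into a 60 second budget
locks, truncated = collect_with_timeout(
    (f"lock {i}" for i in range(5)), timeout_seconds=60
)
print(len(locks), truncated)
```

Returning the partial result together with a truncation flag lets the reader of the log see that the lock list is incomplete rather than mistake it for the full picture.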
This week we wanted to give an overview of what HANGMAN is and how it works in principle. Next week we want to discuss how to analyze a hanging situation by looking into a HANGMAN log file.