The SAP and Windows specific sections of a HANGMAN log file have already been explained in a previous blog entry. Today we want to have a look on the SQL Server specific sections. For the explanation I will use the new terms session/connection/request introduced in SQL Server 2005. However, you may still see the old terms connection/SPID in a HANGMAN log file for SQL Server 2000.
Blocking SQL process list
This list contains all blocking and blocked SQL sessions. The following example is from an SAP JAVA system running on SQL Server 2000. The output of HANGMAN is different on SQL 2000 compared to SQL Server 2005 or newer (due to the different architecture and system tables). For all SQL Server versions you can get the blocked SQL statements and the current SQL statement running on the blocking session. Exclusive locks are held until the end of a transaction, which may consists of many SQL statements. Therefore the current running SQL statement in the blocking session may not necessarily be the SQL statement holding the blocking lock(s). This is a common misunderstanding at customer sites and even sometimes in SAP support. It may even be the case that there currently is no running SQL statement on the blocking session. The only thing a session needs to block another is an open transaction and an lock which conflicts with lock requests of other transactions.
The following list is an extract from a HANGMAN log file of a SQL Server 2000 system. All further lists are extracted from the attached HANGMAN log file of a SQL Server 2005 system.
spid ecid blocked wait[ms] input_buffer stmt_text
—- —- ——- ——– ————– —————————-
153 0 0 0 sp_execute;1 –>SELECT “VSE”.”OBJECTID”,”
144 0 153 3313813 sp_prepexec;1 —->SELECT VSE.OBJECTID, VO
123 0 144 73797 sp_prepexec;1 ——>SELECT VSE.OBJECTID,
139 0 144 1873797 sp_prepexec;1 ——>SELECT VSE.OBJECTID,
143 0 144 2773813 sp_prepexec;1 ——>SELECT VSE.OBJECTID,
The input buffer shows the last statement sent to SQL Server on this particular session. Having a look at the input buffer in the example above you can only see a “sp_execute”. This means that the SQL statement is a prepared statement. It does not tell you which SQL statement is running. SAP always uses prepared statements for JAVA. For ABAP prepared statements are used in certain cases as of NetWeaver release 6.20 and in (almost) all cases as of NetWeaver release 7.00 running against SQL Server 2005 and 2008. To figure out the SQL statement you have to look at the text column (“stmt_text” or “sql_text”). In some cases you will not see the SQL statement. There could simply be no running SQL statement on the blocking session (while there is always a running SQL statement on the blocked session). In rare cases you will not see the SQL statement text while the optimizer is creating an execution plan for this statement.
Blocking SQL process list of a SQL Server 2005 system:
session blocking wait
id id sec wait_type wait_resource sql_text
——- ——– —- ———– ————- —————————–
81 0 NULL NULL NULL > delete from BLK_TEST3
86 81 425 LCK_M_U KEY: 5:720575 –> delete from BLK_TEST3 whe
87 81 423 LCK_M_X KEY: 5:720575 –> insert into BLK_TEST1 val
89 87 422 LCK_M_SCH_S OBJECT: 5:212 —-> (@P0 int,@P1 int)begin
95 89 418 LCK_M_SCH_S OBJECT: 5:212 ——> (@P0 int,@P1 int)begi
97 89 417 LCK_M_SCH_S OBJECT: 5:212 ——> (@P0 int,@P1 int)begi
101 89 415 LCK_M_SCH_S OBJECT: 5:212 ——> (@P0 int,@P1 int)begi
102 89 414 LCK_M_SCH_S OBJECT: 5:212 ——> (@P0 int,@P1 int)begi
88 81 423 LCK_M_X KEY: 5:720575 –> insert into BLK_TEST1 val
90 0 414 WAITFOR > WAITFOR DELAY ’20:00′
92 90 419 LCK_M_S KEY: 5:720575 –> select * from BLK_TEST3 w
104 92 412 LCK_M_S KEY: 5:720575 —-> select * from BLK_TEST3
93 90 419 LCK_M_S KEY: 5:720575 –> select * from BLK_TEST3 w
100 90 416 LCK_M_S KEY: 5:720575 –> select * from BLK_TEST3 w
For SQL Server 2005 and more recent releases the blocking process list also contains the wait type, wait resource and the wait time in seconds. A blocking situation as such is not an issue. It may become a problem once it remains for many seconds or minutes and starts blocking more and more requests. In the example above you can see two independent blocking situations. To figure out who blocks whom you can either lock at the session ID and blocking ID or at the indentation (–>) of the SQL statements. Let’s have a look on the first blocking situation caused by session ID 81. This session is not blocked by anyone else (blocking ID = 0), but blocks sessions 86, 87 and 88. Session 87 blocks session 89 which in turn blocks the sessions 95, 97, 101 and 102. HANGMAN recursively sorts the hanging SQL process list, which makes it easy to identify the independent blocking trees just by looking at the indentation of the SQL statements.
Some people confuse a blocking situation with a deadlock. A blocking situation happens if session B waits for a lock which is hold by session A. Such a blocking situation resolves without further interaction at the point in time session A releases the blocking lock. The blocking situation turns into a deadlock once session A waits for another lock which is hold by session B. SQL Server automatically detects such a deadlock and automatically resolves it by terminating one of the two sessions. In an SAP system you typically see an ABAP short dump as the result of a deadlock. SAP configures the update work processes to be always the victim of a deadlock. The SAP update work process can repeat a deadlocked database transaction without any business impact. Therefore you will not see a short dump or any other error in the SAP system log once a deadlock occurs on an update work process.
SQL process list
For SQL Server 2005 and more recent releases this section consist of 2 independent lists. The first list is actually a join of the SQL Server sessions (S), connections (C) and requests (R) retrieved from the SQL Server dynamic management views. This list is ordered by the SAP work process number (using the program name). It contains the recent SQL statement which was executed on the connection, even when there is currently no active request. The second list contains all SQL requests ordered by the corresponding session ID. This list contains the current SQL statement which is being executed.
For an ABAP system, SAP sets the program name when connecting to SQL Server, for example “R3D00 comm rd ODBC”. The first 2 characters of the program name are “R3” followed by the work process type (“D” for dialog work process, “B” for batch, “U” for update, …). The next two characters show the work process number as you can see in the SAP work process list. The following characters indicate the SQL Server isolation level (“comm rd” for committed read, “unc rd” for uncommitted read) and the used database interface (“ODBC” or “OLEDB”). You should be aware that the type of an SAP work process may change without reconnecting to the database (when changing SAP’s operation mode). Therefore the type displayed in the application name is not always reliable. However, the work process number does not change.
A session is in status running if the work process which opened the session has sent an SQL statement to SQL Server and is waiting for the results. For running sessions a request exists which is using one of SQL Servers worker threads. A sleeping session simply means that there is still an open session, but SQL Server has nothing to do for this session. A sleeping (user) session has no corresponding request.
The number of threads which can concurrently run on Windows depends on the installed CPUs. This limits also the number of SQL Server requests which can be in status ‘running’. For a two processor, dual core, non-hyperthreading machine there can only be 4 requests in status running at one point in time. When looking into a HANGMAN log file one running request always comes from HAMGAN itself, respectively OSQL.EXE. It is consequential that the time the SQL process list is retrieved, it is the retrieving request which runs.
ID Host Program OS Status delta delta delta Status
(S) (S) Name (S) Proc (S) CPU(R)Reads LogReads delta (R)
— —– ——– —– ——- —– —– ——– ——– ——-
1 sleepin 0 0 0 1000 ms backgroun
2 sleepin 0 0 0 1000 ms backgroun
86 WSI64 0 running 0 0 0 1000 ms suspended
87 WSI64 0 running 0 0 0 1000 ms suspended
88 WSI64 0 running 0 0 0 1000 ms suspended
89 WSI64 0 running 0 0 0 1000 ms suspended
95 WSI64 0 running 0 0 0 1000 ms suspended
96 WSI64 0 sleepin
97 WSI64 0 running 0 0 0 1000 ms suspended
101 WSI64 0 running 0 0 0 1000 ms suspended
102 WSI64 0 running 0 0 0 1000 ms suspended
81 WSI64 Microsof 2920 sleeping
106 WSI64 OSQL-32 6808 running 15 0 291 1000 ms running
61 WSI64 R3D00 co 3080 sleepin
62 WSI64 R3D02 co 3096 sleepin
72 WSI64 R3D02 un 3096 running 0 -65 -1356 1000 ms suspended
66 WSI64 R3D03 un 3104 sleepin
56 WSI64 R3D04 co 2540 running 0 1 3
Similar to the Windows process list, a delta of CPU usage, physical reads/writes and logical reads is calculated for each request. The time frame used for the delta values is displayed in a separate column. The currently running hangman/osql request in the example above was consuming 15ms CPU time within the last 1000ms. In this time frame 0 physical reads and 291 logical reads (data buffer accesses) were performed. In rare cases you might see negative values which can be explained with inaccurate values in the SQL Server dynamic management views.
In the section “Blocking SQL process list” above we have seen a blocking situation caused by session 81. The first thing for analyzing this is to have a look at the SQL process list and figure out what is going on in this session. The first surprise is that session 81 is sleeping. Therefore there is no corresponding request which could consume CPU time. Secondly session 81 was opened by the program “Microsoft SQL Server Management Studio”. This blocking situation is therefore not caused by any SAP report.
If a blocking session was opened by a SAP work process then the session could also be sleeping. The SAP work process might be busy with executing ABAP calculations while an open database transaction persists. For further analysis you should check what the SAP work process is doing, in particular whether the work process consumes CPU or not. To figure out which work process opened the session you can use the server name “Host(S)” and the SAP work process number as part of “Program Name(S)”. Alternatively you can also use the Windows process ID from the column “OS Proc(S)”. Unfortunately you can see neither the program name nor the Windows process ID for a JAVA system.
SAP uses multiple active result sets (MARS) of SQL Server 2005 and newer. Therefore you always have at least two SQL connections per SQL session. The first connection is internally used for MARS and will never be related to any SQL request. That’s the reason why HANGMAN does not show this first SQL connection per session in the SQL process list.
Database lock list
The list of all held and requested locks at any given point in time is one of the most powerful features of HANGMAN. For performance reasons (of HANGMAN) the list may be incomplete. However, this is typically not the case when there are less than a few hundred of thousand locks. The list is definitely complete if it ends with the number of locks in the list (“xxx rows affected”). In addition to the object ID you can see the object name (table name) for each database lock. Since the translation form object ID to object name can be very time-consuming, it is only performed for the database HANGMAN is connected to. The database locks of other databases are also contained in the list, but without the object name.
The locks are ordered by the session ID which is holding (status = “G” for GRANTED) or requesting (status = “WAIT”) a database lock. The lock granularity of SQL Server can be a row, a page or a table (type = “OBJECT”). Row locks are either row IDs (type = “RID”) or a hash value calculated on the clustered index key (type = “KEY”). Since almost all tables in SAP have a primary clustered index you will see lots of KEY locks and very seldom PAGE or OBJECT locks in a HANGMAN log file. SQL Server has a special lock mode, called intent locks (mode = “IX”, “IS”, “IU”, …). Intent locks are an indication for an existing lock with lower granularity. For example, a shared lock (“S”) on a row results in an intent shared lock (“IS”) on the page and on the table. The lock list of HANGMAN does not contain granted intent locks, but it contains all waits on intent locks.
In our example above we have seen a blocking situation caused by session 81. Let’s have a closer lock on the related locks: Sessions 87 is requesting an exclusive row lock (KEY 0500d1d065e9) on index ID 1 of table BLK_TEST1. The blocking session 81 is holding an exclusive lock on exactly this row. Hereby we have proven that this works as designed. There are no page or table locks involved. It is not the lock granularity of SQL Server but the application logic, which caused the blocking situation.
session db ind
id object id object name id id type mode status description
—— ———- ———– — — ——– —- —— ————–
81 . . 5 . METADATA Sch-S G principal_id =
81 . . 5 . METADATA Sch-S G schema_id = 6
81 2060495065 BLK_TEST1 5 1 KEY X G (020068e8b274)
81 2060495065 BLK_TEST1 5 1 KEY X G (010086470766)
81 2060495065 BLK_TEST1 5 1 KEY X G (0400b4b7d951)
81 2060495065 BLK_TEST1 5 1 KEY X G (03000d8f0ecc)
81 2060495065 BLK_TEST1 5 1 KEY X G (0500d1d065e9)
81 41011873 BLK_TEST3 5 1 KEY X G (010086470766)
86 . . 5 . METADATA Sch-S G principal_id =
86 . . 5 . METADATA Sch-S G schema_id = 6
86 41011873 BLK_TEST3 5 1 KEY U WAIT (010086470766)
87 . . 5 . METADATA Sch-S G schema_id = 6
87 . . 5 . METADATA Sch-S G class = 0, maj
87 . . 5 . METADATA Sch-S G principal_id =
87 . . 5 . RID X G 1:334232:0
87 2060495065 BLK_TEST1 5 1 KEY X WAIT (0500d1d065e9)
87 2124495293 BLK_TEST2 5 1 PAGE X G 1:334232
87 2124495293 BLK_TEST2 5 1 PAGE X G 4:333927
87 2124495293 BLK_TEST2 5 . OBJECT Sch-M G
87 5 sysrowsets 5 1 KEY X G (0000ca0144f3)
87 7 sysallocun 5 1 KEY X G (0000c96b563e)
88 . . 5 . METADATA Sch-S G schema_id = 6
88 . . 5 . METADATA Sch-S G principal_id =
88 2060495065 BLK_TEST1 5 1 KEY X WAIT (0400b4b7d951)
89 . . 5 . METADATA Sch-S G schema_id = 6
89 2124495293 BLK_TEST2 5 . OBJECT Sch-S WAIT
95 . . 5 . METADATA Sch-S G schema_id = 6
95 2124495293 BLK_TEST2 5 . OBJECT Sch-S WAIT
97 . . 5 . METADATA Sch-S G schema_id = 6
97 2124495293 BLK_TEST2 5 . OBJECT Sch-S WAIT
101 . . 5 . METADATA Sch-S G schema_id = 6
101 2124495293 BLK_TEST2 5 . OBJECT Sch-S WAIT
102 . . 5 . METADATA Sch-S G schema_id = 6
102 2124495293 BLK_TEST2 5 . OBJECT Sch-S WAIT
(11154 rows affected)
Session 87 is holding a schema modification lock (mode = “Sch-M”) on table BLK_TEST2. This lock mode is used when changing the table structure, for example when creating or dropping indexes. It is also used when truncating a table. Session 89 simply wants to select data from table BLK_TEST2 and therefore requests a schema stability lock (mode = “Sch-S”) for the duration of the execution. This lock only can be granted once the schema modification lock held by session 87 on the same table is released. Sessions 95, 97, 101 and 102 also request a schema stability lock on the same table. Logically they also wait for session 87. However, due to the internal monitoring structures of SQL Server they wait on session 89, which in turn waits on session 87. Anyway, schema stability locks can be granted to multiple sessions at the same time.
SQL Server statistics
HANGMAN adds at the end of its log file additional SQL Server statistics regarding schedulers, memory usage, cache hit ratio and the SQL Server Errorlog. The most interesting part is probably the I/O response time per file. In the attached example the response time is horribly bad. Currently HANGMAN calculates the response time as an average since SQL Server startup. There is no delta calculation on the last seconds as it is the case for the CPU usage in the SQL process list. This feature will be added in a future version of HANGMAN.
db file ms per ms per database path
id id read write
—- —- —— —— ——– ——————————
1 1 12.84 58.00 master C:\Program Files\Microsoft SQL
1 2 16.82 25.00 master C:\Program Files\Microsoft SQL
2 1 36.39 200.12 tempdb E:\TEMPDB\tempdb.mdf
2 2 17.11 23.90 tempdb E:\TEMPDB\templog.ldf
3 1 13.30 24.50 model C:\Program Files\Microsoft SQL
3 2 20.86 34.38 model C:\Program Files\Microsoft SQL
4 1 14.41 32.66 msdb C:\Program Files\Microsoft SQL
4 2 18.13 10.78 msdb C:\Program Files\Microsoft SQL
5 1 129.47 136.71 SR3 E:\SR3DATA1\SR3DATA1.mdf
5 2 35.50 32.20 SR3 E:\SR3LOG1\SR3LOG1.ldf
5 3 136.23 159.89 SR3 E:\SR3DATA2\SR3DATA2.ndf
5 4 144.60 151.73 SR3 E:\SR3DATA3\SR3DATA3.ndf
5 5 154.84 149.97 SR3 E:\SR3DATA4\SR3DATA4.ndf
If you want to have a look at the complete HANGMAN log file of our example then just open the attached file HANGMAN__2008-10-07__06-15-37__650.log.