Context in perspective 6: Taking Sessions to Task

A few weeks ago, Kendra Little did a very entertaining Dear SQL DBA episode about stack dumps. That gave me a light bulb moment when I tied the preamble of her stack dump example to a DBCC command that I’ve had on the back burner for a while. The command puts some session-related objects on stage together, and those objects bridge the gap between a DMV-centric view of the SQL Server execution model and the underlying implementation.

Let’s revisit the purpose of a dump. When you catch one of these happening on your server, it’s sensible to be alarmed, but it is rather negative to say “Oh no, something horrible went wrong!” A more positive angle is that a disturbance was detected, BUT the authorities arrived within milliseconds, pumped the whole building full of Halon gas, and then took detailed forensic notes. Pity if innocent folks were harmed by that response, but there was cause to presume the involvement of a bad hombre, and that is the way we roll around here.

Today we’ll be learning lessons from those forensic notes.

Meet DBCC TEC

Boring old disclaimer: What I am describing here is undocumented, unsupported, likely to change between versions, and will probably make you go blind. In fact, the depth of detail exposed illustrates one reason why Microsoft would not want to document it: if end users of SQL Server found a way to start relying on this not changing, it would hamstring ongoing SQL Server improvement and refactoring.

With that out of the way, let’s dive right into DBCC TEC, a command which can dump a significant chunk of the object tree supporting a SQL Server session. This result is the same thing that shows up within a dump file, namely the output of the CSession::Dump() function – it’s just that you can invoke this through DBCC without taking a dump (cue staring match with Kendra). Until corrected, I shall imagine that TEC stands for Thread Execution Context.

In session 53 on a 2014 instance, I start a batch running the boring old WAITFOR DELAY ’01:00:00′, and while it is running, I use another session to run DBCC TEC against this session 53. The DBCC command requires traceflag 3604, and I’m going to ignore the first and third parameters today; my SQL batch then looks like this:

 
DBCC TRACEON(3604);
DBCC TEC(0,53,0);

Here is the output I got, reformatted for readability, and with the bulk of the long output buffer omitted:

TEC:


CSession @0x0000000038772470

m_sessionId = 53                      m_cRef = 13                         
m_rgcRefType[0] = 1                   m_rgcRefType[1] = 1                 
m_rgcRefType[2] = 9                   m_rgcRefType[3] = 1
m_rgcRefType[4] = 0                   m_rgcRefType[5] = 1                 
m_pmo = 0x0000000038772040            m_pstackBhfPool = 0x0000000000000000
m_dwLoginFlags = 0x100083e0           m_fBackground = 0
m_eConnResetOption = 0                m_fUserProc = 1                     
m_fConnReset = 0                      m_fIsConnReset = 0                  
m_fInLogin = 0                        m_fAuditLoginSent = 1
m_fAuditLoginFailedSent = 0           m_fReplRelease = 0                  
m_fKill = 0                           m_ulLoginStamp = 110                
m_eclClient = 7                       m_protType = 6
m_hHttpToken = FFFFFFFFFFFFFFFF    

 

CUEnvTransient @0x0000000038772A60

m_ululRowcount = 0                    m_cidLastOpen = 0                   
m_lFetchStat = 0                      m_lRetStat = 0                      
m_lLastError = 0                      m_lPrevError = 0
m_cNestLevel = 0                      m_lHoldRand1 = 0                    
m_lHoldRand2 = 0                      m_cbContextInfo = 0                 
m_rgbContextInfo = 0000000038772C28 



CDbAndSetOpts @0x0000000038772A80

m_fHasDbRef = 0                       m_fDateFirstSet = 0                 
m_fDateFormatSet = 0                  m_lUserOpt1 = 0x14817000            
m_lUserOpt2 = 0x40                    m_lUserOpt1SetMask = 0xfffff7f7
m_idtInsert.objid = 0                 m_idtInsert.state = 0               
m_idtInsert.dbid = 0                  m_llRowcnt = 0                      
m_lStatList = 0                       m_lTextSize = 2147483647
m_lOffsets = 0x0                      m_ulLockTimeout = 4294967295        
m_ulQueryGov = 0                      m_eDtFmt = 1                        
m_eDayDateFirst = 7                   m_eDdLckPri = 0
m_eIsoLvl = 2                         m_eFipsFlag = 0x0                   
m_sLangId = 0


m_pV7LoginRec

0000000000000000:   00000000 00000000 00000000 00000000 00000000  ....................
0000000000000014:   00000000 00000000 00000000 00000000 00000000  ....................
0000000000000028:   00000000 00000000 00000000 00000000 00000000  ....................
000000000000003C:   00000000 00000000 00000000 00000000 00000000  ....................
0000000000000050:   00000000 00000000 00000000 0000               ..............


CPhysicalConnection @0x0000000038772240

m_pPhyConn->m_pmo = 0x0000000038772040                                   
m_pPhyConn->m_pNetConn = 0x0000000038772DF0                              
m_pPhyConn->m_pConnList = 0x0000000038772440                             
m_pPhyConn->m_pSess = 0x00000000387724C8                                 
m_pPhyConn->m_fTracked = -1
m_pPhyConn->m_cbPacketsize = 4096   
m_pPhyConn->m_fMars = 0             
m_pPhyConn->m_fKill = 0



CBatch @0x0000000038773370

m_pSess = 0x0000000038772470          m_pConn = 0x0000000038773240        
m_cRef = 3                            m_rgcRefType[0] = 1                 
m_rgcRefType[1] = 1                   m_rgcRefType[2] = 1
m_rgcRefType[3] = 0                   m_rgcRefType[4] = 0                 
m_pTask = 0x000000003868F468


EXCEPT (null) @0x000000004AD8F240

exc_number = 0                        exc_severity = 0                    
exc_func = sqllang.dll!0x00007FFD60863940                                


Task @0x000000003868F468

CPU Ticks used (ms) = 31              Task State = 4
WAITINFO_INTERNAL: WaitResource = 0x0000000000000000                     
WAITINFO_INTERNAL: WaitType = 0x20000CA
WAITINFO_INTERNAL: WaitSpinlock = 0x0000000000000000                     
SchedulerId = 0x1                     ThreadId = 0xb60                    
m_state = 0                           m_eAbortSev = 0



EC @0x000000003875E9A0

spid = 0                              ecid = 0                            
ec_stat = 0x0                         ec_stat2 = 0x0                        
ec_atomic = 0x0                       ecType = 0
__pSETLS = 0x00000000387732B0         __pSEParams = 0x0000000038773500   


 
SEInternalTLS @0x00000000387732B0

	m_flags = 0                        	
        m_TLSstatus = 3                    	
        m_owningTask = 0x00000000336CB088
	m_activeHeapDatasetList = 0x00000000387732B0                            
	m_activeIndexDatasetList = 0x00000000387732C0                           	
        m_pDbccContext = 0x0000000000000000
	m_pAllocFileLimit = 0x0000000000000000                                  	
        m_dekInstanceIndex = 0x-1
	m_pImbiContext = 0x0000000000000000


SEParams @0x0000000038773500

	m_lockTimeout = -1                 	
        m_isoLevel = 4096                  	
        m_logDontReplicate = 0
	m_neverReplicate = 0               	  
        m_XactWorkspace = 0x000000003A158040
	m_execStats = 0x000000003A877780   

CCompExecCtxtBasic @0x000000003875E040


CConnection @0x0000000038773240


CNetConnection @0x0000000038772DF0

m_pNetConn->status = 8              m_pNetConn->bNewPacket = 0          
m_pNetConn->fClose = 0              m_pNetConn->pmo = 0x0000000038772040
m_pNetConn->m_lConnId = 0           client threadId = 0x38f11827

m_pNetConn->m_pFirstInputBuffer

0000000000000000:   16000000 12000000 02000000 00000000 00000100  ....................
0000000000000014:   00007700 61006900 74006600 6f007200 20006400  ..w.a.i.t.f.o.r. .d.
0000000000000028:   65006c00 61007900 20002700 30003100 3a003000  e.l.a.y. .'.0.1.:.0.
000000000000003C:   30003a00 30003000 27003b00 0d000a00           0.:.0.0.'.;.....

m_pNetConn->srvio.outbuff

0000000000000000:   04010025 00350100 e40b0000 00030000 0001fe04  ...%.5..ä.........þ.
0000000000000014:   00000000 fd0000f3 00000000 00000000 00000000  ....ý..ó............
... and so on - this is rather long.



CUEnvTransient @0x000000003875E0B8

m_ululRowcount = 0                  m_cidLastOpen = 0                   
m_lFetchStat = 0                    m_lRetStat = 0                      
m_lLastError = 0                    m_lPrevError = 0
m_cNestLevel = 0                    m_lHoldRand1 = 0                    
m_lHoldRand2 = 0                    m_cbContextInfo = 0                 
m_rgbContextInfo = 000000003875E280 




CDbAndSetOpts @0x000000003875E0D8

m_fHasDbRef = 0                     m_fDateFirstSet = 0                 
m_fDateFormatSet = 0                m_lUserOpt1 = 0x14817000            
m_lUserOpt2 = 0x40                  m_lUserOpt1SetMask = 0xfffff7f7
m_idtInsert.objid = 0               m_idtInsert.state = 0               
m_idtInsert.dbid = 0                m_llRowcnt = 0                      
m_lStatList = 0                     m_lTextSize = 2147483647
m_lOffsets = 0x0                    m_ulLockTimeout = 4294967295        
m_ulQueryGov = 0                    m_eDtFmt = 1                        
m_eDayDateFirst = 7                 m_eDdLckPri = 0
m_eIsoLvl = 2                       m_eFipsFlag = 0x0                   
m_sLangId = 0




CExecLvlIntCtxt @0x000000003875E330

m_pBatch = 0x0000000038773370             m_pxlvl = 0x000000003875F800        
m_pOAInfo = 0x0000000000000000            m_pOAUnInitFunc = 0x0000000000000000  
m_pXPUserData = 0x0000000000000000        m_pRpcInfo = 0x0000000000000000
m_pSharedRpcInfo = 0x0000000000000000     m_pCeiInfo = 0x0000000000000000
m_etts = 0x00000000                       m_eQueryProps = 00000000            
m_ulNotificationSweeperMode = 0           m_MsqlXactYieldMode = 2               
m_dbidPubExec = 0                         m_pBulkCtx = 0x0000000000000000
m_pBulkImp = 0x0000000000000000           m_pInsExec = 0x0000000000000000     
m_pbhfdCurrentlyActive = 0x000000003875E438 


                             
IBHFDispenser @0x000000003875E438

m_pbhfdPrev = 0x0000000000000000          m_pbhf = 0x0000000000000000         
m_bErrAction = 0                          m_ulStmtId = 0                          
m_ulCompStmtId = 1                        m_pShowplanTable = 0x0000000000000000
m_pShowplanXMLCtxt = 0x0000000000000000   m_pStmtAlterTable = 0x0000000000000000
m_pxliStackTop = 0x000000003875F9E0       m_pqni = 0x0000000000000000         
m_ullTransactionDescriptor = 0            m_dwOpenResultSetCount = 1          
m_InternalFlags = 0x00001002        



CExecLvlRepresentation @0x000000003875F890


CStmtCurState @0x000000004AD8E670

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

First observation: this is a treasure trove, or perhaps a garbage dump. Second observation: the output could actually get a lot more verbose, depending on what the session under scrutiny is doing.

All in all, the fact of its inclusion in dump files tells us that it contains useful information to support personnel, and helps to add context in a crash investigation.

A glimpse of the object tree

You certainly can’t accuse DBCC TEC of being easy on the eyes. But with a bit of digging (my homework, not yours) it helps to piece together how some classes fit together, tying up tasks, sessions and execution contexts.

Recall that Part 5 demonstrated how one can find the CSession for a given thread. DBCC TEC walks this relationship in the other direction, by deriving tasks and threads – among other things – from sessions.

While this isn’t made obvious in the text dump, below diagram illustrates the classes involved, and their linkages.

The object tree revealed by DBCC TEC

Of course these are very chunky classes, and I’m only including a few members of interest, but you should get the idea.

Observations on DBCC TEC and class relationships

Let’s stick to a few interesting tidbits. CSession is the Big Daddy, and it executes CBatches. Although we’d normally only see a single batch, under MARS there can be more; this manifests by the session actually having a linked list of batches, even though normally the list would contain a single item (I simplified to a scalar on the diagram).

A session keeps state and settings within a CUEnvTransient, which in turn refers to a CDbAndSetOpts. You should recognise many of those attributes from the flattened model we see in DMVs.

However, to make it more interesting, a task (indirectly) has its own CUenvTransient; this is one reflection on the difference between batch-scoped and session-scoped state. And on top of that, even a session actually has two of them, with the ability to revert to a saved copy.

Although a CBatch may execute CStatements, we don’t see this here. The actual work done by a batch is of course expressed as a SOS_Task, which is a generic SQLOS object with no knowledge of SQL Server.

At this point the DBCC output layout is a bit misleading. Where we see EXCEPT (null) it is SOS_Task state being dumped, and we’re not within the CBatch anymore. However, the SOS_Task is actually just the public interface that SQLOS presents to SQL Server, and some of the state it presents live in the associated Worker. The exception information lives within an ExcHandler pointed to by the Worker, and things like CPU Ticks used come directly from the Worker. Task State is the same rather convoluted derivation exposed in sys.dm_os_tasks, as outlined in Worker, Task, Request and Session state. ThreadID is an attribute of the SystemThread pointed to by the task’s Worker, and similarly SchedulerID comes indirectly via the Task’s SOS_Scheduler.

WAITINFO_INTERNAL and (where applicable) WAITINFO_EXTERNAL describe cooperative and preemptive waits respectively. The latter can actually get nested, giving us a little stack of waits, as described in Going Preemptive.

Now we start peering deeper into the task, or rather through it into its associated worker’s LocalStorage as described in Part 5. The first two slots, respectively pointing to a CCompExecCtxtBasic and an ExecutionContext, have their contents dumped at this point.

The ExecutionContext points to one instance each of SEInternalTLS and SEParams. Pause here for a moment while I mess with your mind: should the task in question be executing a DBCC command, for instance DBCC TEC, the m_pDbccContext within the SEInternalTLS will be pointing to a structure that gives DBCC its own sense of context.

Another fun observation: while SEInternalTLS claims with a straight face to have an m_owningTask member, I call bull on that one. Probably it did in the past, and some poor developer was required to keep exposing something semi-meaningful, but the address reported is the address of the ambient task, i.e. the one running DBCC TEC. The idea of ownership is simply not modelled in the class.

Then (still down in the LocalStorage of a Task performed by the Batch under scrutiny) we get the CCompExecCtxtBasic we saw in Part 5. This now shows us a few more members, starting with a CConnection which in turn points to a CNetConnection getting its buffers dumped for out viewing pleasure. This CConnection is in fact the same one pointed to directly by the Batch as m_pConn.

Now, as already mentioned, we see the CUEnvTransient with associated CDbAndSetOpts belonging to the task at hand (through its Worker’s LocalStorage->CCompExecCtxtBasic).

The next CCompExecCtxtBasic member is a pointer to a CEvevLvlIntCtxt which in turn references an IBHFDispenser; in case you’re wondering, BHF stands for BlobHandleFactory, but I’m not going to spill the beans on that yet, because I am fresh out of beans.

Finally, CCompExecCtxtBasic’s performance ends with the duo of CExecLvlRepresentation and a CStmtCurState, neither of whom has much to say.

Whew. That was the superficial version.

Are we all on the same page?

While we could spend a lot of time on what all these classes really do, that feels like material for multiple separate blog posts. Let’s finish off by looking at the pretty memory layout.

Recall that addresses on an 8k SQLOS page will cover ranges ending in 0000 – 1FFF, 2000 – 3FFF, 4000 – 5FFF etc, and that all addresses within a single such page will have been allocated by the same memory object. Now note that the objects which expose a pmo (pointer to memory object) all point to 0x38772040. This sits on (and manages) a page stretching from 0x38772000 to 0x38773FFF.

Have a look through those object addresses now. Do you see how many of them live on that page? This is a great example of memory locality in action. If a standard library’s heap allocation was used, those addresses would be all over the place. Instead, we see that the memory was allocated, and the objects instantiated, with a clear sense of family membership, even though the objects are notionally independent entities.

As a final note on the subject of memory addresses, the eagle-eyed may have noticed that the CPhysicalConnection has a m_pSess pointer which is close to, but not equal to the address of the CSession at the top of the object tree. This is because it is an ILogonSession interface pointer to this CSession instance, with the interface’s vftable living at an offset of 0x58 within the CSession class. See Part 3 for more background.

Further reading

Go on and re-read Remus Rusanu’s excellent How SQL Server executes a query. If you’ve never read it, do so now – it’s both broad and deep, and you can trust Remus on the details.

While reading it, see in how many places you can replace e.g. “Session” with “CSession”, and you will be that bit closer to Remus’s source code-flavoured subtext.

2 thoughts on “Context in perspective 6: Taking Sessions to Task”

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.