Unsung SQLOS: the classic SOS_RWLock

Moving along with our bestiary of synchronisation classes, the SOS_RWLock, a reader-writer lock, feels like a logical next stop. It has been in the news recently, it has fairly simple semantics, and it is built upon primitives that we have already explored, namely spinlocks, linked lists and the EventInternal class. Its implementation is quite a leap from the simple SOS_Mutex and there is more scope for alternative implementations providing the same functionality. And, would you believe it, as called out by Bob Dorr, the 2012/2014 implementation has now been found wanting and got rewritten for 2016. Today we’re looking at the “classic” version though, because we then get the chance to understand the 2016 rewrite in terms of concrete design decisions. (Update: I examine the 2016 update here).
Continue reading “Unsung SQLOS: the classic SOS_RWLock”

Unsung SQLOS: the SOS_Mutex

A mutex, short for “mutual exclusion”, is arguably the simplest waitable synchronisation construct you can imagine. It exposes methods for acquisition and release, and the semantics are straightforward:

  • Initially it isn’t “owned”, and anybody who asks to acquire it is granted ownership
  • While owned, anybody else who comes around to try and acquire it must wait her turn
  • When the owner is done with it, the mutex is released, which then transfers ownership to one waiter (if any) and unpends that waiter

A mutex can also validly be referred to as a critical section, in the sense that it protects a critical section of code, or more accurately, data. When programming libraries expose both a mutex and a critical section, as Windows does, it really just reflects different implementations of synchronisation objects with the same semantics. You could also consider a spinlock to be a flavour of mutex: while the name “spinlock” describes the mechanism by which competing threads jostle for exclusive ownership (it can’t be politely waited upon), the idea of mutual exclusion with at most one concurrent owner still applies.

SOS_Mutex class layout and interface

This class is directly derived from EventInternal<SuspendQSlock>, with three modifications:

  1. The addition of an ExclusiveOwner member.
  2. The override of the Wait() method to implement mutex-specific semantics, although the main act of waiting is still delegated to the base class method.
  3. The addition of an AddAsOwner() method, called by Wait(), which crowns the ambient task as the exclusive owner after a successful wait.

Continue reading “Unsung SQLOS: the SOS_Mutex”

Unsung SQLOS: the EventInternal

Today we’re taking a step towards scheduler territory by examining the EventInternal class, the granddaddy of SQLOS synchronisation objects. At the outset, let’s get one formality out of the way: although it is a template class taking a spinlock type as template parameter, we only see it instantiated as EventInternal<SuspendQSLock> as of SQL Server 2014. What this means is that spins on its internal spinlock is always going to be showing up as SOS_SUSPEND_QUEUE.

It’s a very simple class (deceptively so even) which can implement a few different event flavours, doing its waiting via SQLOS scheduler methods rather than directly involving the Windows kernel. The desire to keep things simple and – as far as possible – keep control flow out of kernel mode is a very common goal for threading libraries and frameworks. .Net is a good frame of reference here, because it is well documented, but the pattern exists within OS APIs too, where the power and generality of kernel-mode code has to be weighed off against the cost of getting there.
Continue reading “Unsung SQLOS: the EventInternal”

Unsung SQLOS: the SystemThread

SystemThread, a class within sqldk.dll, can be considered to be at the root of SQLOS’s scheduling capabilities. While it doesn’t expose much obviously exciting functionality, it encapsulates a lot of the state that is necessary to give a thread a sense of self in SQLOS, serving as the beacon for any code to find its way to an associated SQLOS scheduler etc. I won’t go into much of the SQLOS object hierarchy here, but suffice it to say that everything becomes derivable by knowing one’s SystemThread. As such, this class jumps the gap between a Windows thread and the object-oriented SQLOS.
Continue reading “Unsung SQLOS: the SystemThread”

Windows, mirrors and a sense of self

In my previous post, Threading for humans, I ended with a brief look at TLS, thread-local storage. Given its prominent position in SQLOS, I’d like to take you on a deeper dive into TLS, including some x64 implementation details. Far from being a dry subject, this gets interesting when you look at how TLS helps to support the very abstraction of a thread, as well as practical questions like how cleanly any/or efficiently SQLOS encapsulates mechanisms peculiar to Windows on Intel, or for that matter Windows as opposed to Linux.
Continue reading “Windows, mirrors and a sense of self”

Threading for humans

I used to think of threading as a complicated subject that everybody except me had a grip on. Turns out that it’s actually simple stuff with complicated repercussions, and no, many people don’t really get it, so I was in good company all along. Because I’m heading into some SQLOS thread internals, this is a good time to take stock and revisit a few fundamentals. This will be an idiosyncratic explanation, and one that ignores many complexities inside the black box – CPU internals, the added abstraction of a hypervisor, and programming constructs like thread pooling and tasks – for the sake of focusing on functionality directly exposed to the lower levels of software.
Continue reading “Threading for humans”

Unsung SQLOS: linked lists

I’m slowly working towards some more juicy subjects again, but for the moment it is sensible to get fundamentals out of the way. The use of linked lists by SQL Server and SQLOS is a great place to start, because so much is built upon them. Grok this and you’re set for countless hours of fun.

From general exposure to SQLOS scheduling, many of us are familiar with concepts like “the worker is put on the runnable list”. At a high level this is simple to grasp, and the good news is that not much is hidden from us in the implementation detail. Nonetheless, it seems that it’s only systems programmers who deal with these details nowadays, but it’s still useful for the rest of us to get a chance to get comfortable with such internals.

Linked list implementation in SQLOS

A SQLOS doubly linked list follows a common Windows pattern based on a ListEntry structure. This is remarkably simple, containing only two pointer-sized members, i.e. taking up 16 bytes on x64:

  • flink (forward link) – a pointer to the next entry in the list
  • blink (backward link) – a pointer to the previous entry in the list

Continue reading “Unsung SQLOS: linked lists”

Anatomy and psychology of a SQLOS spinlock

SQL Server spinlocks are famously elusive little beasties, tending to stay in the shadows except when they come out to bother you in swarms. I’m not going to add to the documentation of where specific spinlock types show up, or how to respond to contention on different types; the interested reader likely already knows where to look. Hint: Chris Adkin is quite the spinlock exterminator of the day.

In preparation for Bob Ward’s PASS Summit session, I figured it would make sense to familiarise myself a bit more with spinlock internals, since I have in the past found it frustrating to try and get a grip on it. Fact is, these are actually simple structures that are easy to understand, and as usual, a few public symbols go a long way. Being undocumented stuff, the usual caveats apply that one should not get too attached to implementation details.

Spinlock structure

It doesn’t get any simpler. A SQLOS spinlock is just a four-byte integer, embedded as a member variable in various classes, with two states:

  • Not acquired – the value is zero
  • Acquired – the value is the Windows thread ID of the owner

Continue reading “Anatomy and psychology of a SQLOS spinlock”

The road to UTC is paved with DatetimeOffset

Here is a dirty little secret: most of us who live in single-time-zone countries are not comfortable with time zones. Hardly surprising given that we tend to treat wall clock time as a law of nature that only get adjusted twice a year. Now relating one’s day-to-day life to local time is all very well, but is this a good way to store timestamps in your data?

The simple solution is to store all times in UTC, and many systems are set up like that. The beauty of UTC is that it can be unambiguously translated to anything else you’d want as display time. However, once you are in the habit of storing local time – and more importantly, thinking in it – making the move could be a tricky culture shift.

We have a peculiar pitfall here in the UK, where local time happens to correspond to UTC during the winter months. This means that code written during the winter has a chance of showing up with a time zone bug when summer rolls around, either through straightforward confusion about what the code is persisting or as an integration problem where you thought both systems were speaking local time but it turns out that that one was UTC and one was local. Make no mistake, consistent use of UTC is a solid, simple solution, but that is easier said than done outside of greenfield development.
Continue reading “The road to UTC is paved with DatetimeOffset”

Milliseconds 10, ticks 3

Since the year dot, SQL Server’s native Datetime type has given us a resolution of 300 ticks per second. With the introduction of Datetime2 and DatetimeOffset in 2008, we now get up to 100ns resolution. For those who like the finer things in life, this appears to be cause for rejoicing.

Image by frankieleon CC Attribution 2.0

Fun observation: if 100ns corresponded to 1 inch, 3.3ms translates to just over half a mile. That’s a pretty big difference in accuracy. Or should that be precision? Hmmmm, here’s a can of worms.
Continue reading “Milliseconds 10, ticks 3”