Until an hour or two ago, I had written off the idea of contributing to this month’s T-SQL Tuesday, hosted by Matt Gordon with the theme of “Fixing Old Problems with Shiny New Toys”. I’ve lately had a bee in my bonnet about a new blog series, but the introduction keeps receding into the distance as I continue to encounter the dreaded “further research needed” syndrome. (Update: And here is part 1!)
Now it is 8pm UTC on Valentine’s day with four hours to go before the T-SQL Tuesday deadline, and I’m thinking about Sessions. Which is a good thing, because as of today we can all stop worrying about Flynn.
SESSION_CONTEXT() as Swiss Army knapsack
So the shiny new tool I came across is session context, the family-sized successor to the old CONTEXT_INFO. Aaron Bertrand has written a great blog post about it: Phase out CONTEXT_INFO() in SQL Server 2016 with SESSION_CONTEXT().
SESSION_CONTEXT() brings two major innovations. Firstly, it replaces a 128-byte scalar payload with a key-value structure that can accommodate 256kB of data. You can really go to town filling this thing up.
The second change is less glamorous, but possibly more significant: it is possible to set an entry to read-only, meaning that it can safely be used for the kind of contextual payload you don’t want tampered with. This makes me happy, not because I currently have a great need for it, but because it neatly ties in with things I have been thinking about a lot lately.
The rise of the kernel
Something that comes up time and again in multi-layered architectures is the Inner Platform Effect. Just when is it justified to use a programming framework to recreate a function that said framework already fulfils?
An OS kernel is sacred ground. When designed sanely and safely, it doesn’t allow clients (application code) to execute arbitrary code in kernel context. This is because the deeper kernel layer has privileges that could be abused, and user code must be kept at arm’s length within user processes, sandboxed in such a way that they can attack neither the kernel nor each other. Quite simply, we provide the means to enforce the principle of least privilege.
This separation is enforced on the hardware level through things like virtual memory mapping, whereby different processes can’t see each other’s memory. And while the remapping of memory (a simple attack vector) is just another software function exposed by the CPU, the ability to modify these mappings is reserved for a higher privilege level than common application code.
On top of this, we build the notion of threads, each having a distinct identity. This is a far more hazy concept than memory mapping, in that the CPU provides the barest minimum of support functionality to support the illusion. Switching between threads may involve changing memory mappings (when the outgoing and incoming threads belong to different processes), but it always includes changing a tiny bit of thread identity which user code might be able to read, but can’t overwrite. Since user identity, and hence permissions checking, is tied into thread identity, this makes perfect sense. A thread which is allowed to muck about with its own identity, or the identity of other threads, is a security risk.
Now we go and build a multithreaded server application like SQL Server on top of these abstractions. The code which is trusted to have the run of its process space is the code that shipped with the executable, assuming we temporarily blank out the terror of extended stored procedures. This code in turn maintains a cosy environment for user-supplied code in the form of T-SQL, which plays in a memory space consisting of access-controlled global objects (tables), plus some session-scoped objects (temp tables) and batch-scoped ones (variables).
In simple textbook cases, it stops there. Ahmed is restricted to audited querying of North-East region sales data, while Ivanka gets to be security admin.
The database is the application’s kernel
That simple textbook case falls flat when you move permissions and identity out to the application layer, with all application/database users getting represented by a single database user. All understandable, especially in web apps, but now the burden is upon application code, whether inside or outside the database, to find ways of continually answering that thorny “Who Am I?” question.
Outside the database (the web app) this is a solved problem, but externally defined identity and permission sets aren’t accessible to stored procedures and triggers, unless the user identity is pushed through as a parameter. And clearly this is something that can be faked, which brings us neatly back to the potential need for secure session-scoped metadata that is non-editable after being set up.
Think of these stored procedures and triggers as kernel code. We need only the tiniest smidgen of an identifier that represents a trusted identity token. For actual OS kernel code, this can be (and is!) reduced to a single internal CPU register. Anything beyond that can be derived by allocating storage and passing payload through bit by bit, as long as all the communication is over this trusted connection with its identifying metadata token.
The context of context
This frames where I hope to be heading with my next set of blog posts. You already know the punch line: in some form or another, we are always reliant on thread-local storage. It’s just a question of how many extra layers get piled on top of that basic thread abstraction until we get to a SQL Server session.
And then, just when you think you have a good abstraction, along comes a programming pattern that strips the session back to an anonymous connection, and uses SESSION_CONTEXT() to build something new on top. Be that as it may, session context is a great user-visible touch point for some juicy internals!