The Latch Files 4: Superlatch promotion logic

In Part 3 I explained how this beast called the cycle-based buffer latch promotion threshold gets calculated and broadly what it means, but I didn’t tackle the obvious question of “who does what with this information?”. This post ties some global settings together with per-buffer tracking to unravel the mystery of when a buffer latch is deemed hot enough to deserve a promotion. What I describe applies identically to SQL Server 2014 and SQL Server 2016. It probably hasn’t changed much from earlier versions, although I haven’t confirmed this.

[Image: Mugatu from Zoolander. Caption: Mugatu assesses breferences]

Meet the cast: globals

There are quite a few pieces of machinery that are involved in our little drama. First, I’ll introduce some instance-global settings:

A flag that controls whether latch promotion is enabled at all. Although I don’t have any information about this, let’s assume that it will be enabled on any system that “warrants it”.
A flag that controls whether cycle-based promotion is enabled. Again, I can’t currently tell you what determines this setting.
sm_promotionThreshold, the current calculated cycle-based promotion threshold described in Part 3.
sm_promotionUpperBoundCpuTicks, used as a ceiling value to prevent outliers from skewing stats. As described in Part 3, this is simply sm_promotionThreshold * 5.
Trace flag 844, which lowers the threshold for non-cycle-based promotions.
Trace flag 827, which causes each latch promotion to be noted in the SQL Server log (“Latch promotion, page %u:%u in database %u, objid %u.”)

Assume that the first flag is set on our system of interest, otherwise promotions won’t happen and we have nothing to talk about.

Meet the cast: per-page members

Then we have a few members within the BUF structure itself, serving as additional metadata for a specific buffer latch. Strictly speaking the BUF contains the page latch, but an intuitively attractive way to think of the relationship between the two is that a page latch is an extended derivation of the generic (non-page) latch. In this mental model, a buffer latch is a richer form of latch that includes the ability to track how often it is accessed, and how long it takes on average to be acquired, i.e. how contended it is. It is kind of immaterial whether we speak of contention on the page, the BUF structure or the latch – since they have a 1:1:1 relationship for a page in memory, the distinction is academic, even though the latch is the part that is designed to accept the blame.

The BUF members of interest here are a small subset of the complete structure:

breferences is a count of how many times the page has been touched since the counters were last reset. While not strictly true (a situation that warrants its own post) this definition is good enough for today.
bsamplecount is a count of how many latch acquisitions had their acquisition time sampled since the last counter reset. The decision whether or not to sample lies in LatchBase::AcquireInternal(), and it is done about 10% of the time.
bcputicks is the sum of the actual tick counts returned by the aforementioned samples; however, if a given tick count sample exceeds sm_promotionUpperBoundCpuTicks, we add that ceiling value to the aggregate rather than using the actual outlier value.
bUse1 is the timestamp (in seconds) when the above counters were last reset to zero. The actual number isn’t meaningful, only serving as a clock hand, and is maintained by the lazywriter. For our purposes, think of the source domain as a type of wall clock.
FAST_PROMOTE is one of the bitflags in bstat. Again something I can’t tell you much about yet, but when set, the latch qualifies for a discounted promotion threshold.
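To make the counter bookkeeping concrete, here is a minimal Python sketch of how a sampled acquisition might be folded into bsamplecount and bcputicks, including the outlier ceiling. The class `BufCounters` and the function `record_sample` are names I made up for illustration, and the tick values are arbitrary; none of this is the actual SQL Server code.

```python
from dataclasses import dataclass


@dataclass
class BufCounters:
    """Hypothetical stand-in for the BUF members discussed above."""
    breferences: int = 0         # touches since the last counter reset
    bsamplecount: int = 0        # sampled latch acquisitions since the last reset
    bcputicks: int = 0           # sum of sampled tick counts, outliers clamped
    buse1: int = 0               # clock value (seconds) at the last reset
    fast_promote: bool = False   # models the FAST_PROMOTE bit in bstat


def record_sample(buf: BufCounters, ticks: int, upper_bound_ticks: int) -> None:
    """Fold one sampled acquisition time into the counters.

    A sample above the ceiling contributes the ceiling value instead of
    its real tick count, so a single outlier cannot dominate the average.
    """
    buf.bsamplecount += 1
    buf.bcputicks += min(ticks, upper_bound_ticks)


# Illustrative values only: a made-up threshold and three made-up samples.
promotion_threshold = 1_000
upper_bound = promotion_threshold * 5   # sm_promotionUpperBoundCpuTicks
demo = BufCounters()
for sample in (800, 1_200, 50_000):     # the last sample is an outlier
    record_sample(demo, sample, upper_bound)
print(demo.bsamplecount, demo.bcputicks)
```

With the ceiling in place the outlier counts as 5,000 ticks rather than 50,000, which keeps the average within shouting distance of the typical samples.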

The logic described here all lives in the method BPool::Touch which is called every time a page – and by implication the associated BUF and latch – is accessed. Most of the time the method only updates those counters and returns. However, if four or more seconds have elapsed since the last counter reset (measured by comparing the saved value of bUse1 against our notional wall clock) we go into the latch promotion decision tree and finish off by resetting the counters, irrespective of whether or not the latch gets promoted. In other words, no more than four seconds of page access history goes into the promotion decision.

Upon entering BPool::Touch, we increment breferences and check whether or not enough time has elapsed since the last counter reset. If so, and latch promotions are enabled at all, we enter one of two decision trees. In both cases, promotion is only considered if the page isn’t dirty (the DIRTY flag in bstat is clear).
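The overall shape of that flow can be sketched in Python as follows. This is only my reading of the behaviour described above, not the real method: `Buf`, `touch`, `now`, `promotion_enabled` and the `should_promote` callback are all hypothetical names, and the clock is passed in as a plain integer number of seconds instead of coming from the lazywriter-maintained source.

```python
from dataclasses import dataclass
from typing import Callable

RESET_INTERVAL_SECONDS = 4  # "four or more seconds" from the text


@dataclass
class Buf:
    """Hypothetical stand-in for the BUF members used by the sketch."""
    breferences: int = 0
    bsamplecount: int = 0
    bcputicks: int = 0
    buse1: int = 0       # clock value (seconds) at the last counter reset
    dirty: bool = False  # models the DIRTY flag in bstat


def touch(buf: Buf, now: int, promotion_enabled: bool,
          should_promote: Callable[[Buf], bool]) -> bool:
    """Count one page access; return True if this call decides to promote."""
    buf.breferences += 1
    if now - buf.buse1 < RESET_INTERVAL_SECONDS:
        return False  # the common case: update the counter and leave

    # Enough time has passed: run the decision tree (if allowed at all).
    promote = (promotion_enabled
               and not buf.dirty
               and should_promote(buf))

    # Counters are reset whether or not the latch was promoted.
    buf.breferences = 0
    buf.bsamplecount = 0
    buf.bcputicks = 0
    buf.buse1 = now
    return promote
```

Passing the decision tree in as a callback keeps the sketch neutral about which of the two trees applies; the real code presumably branches on the cycle-based flag directly.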

Decision tree 1: cycle-based promotion

If cycle-based promotion is enabled, the following additional prerequisites apply:

breferences > 2,000.
bsamplecount > 10.
Average bcputicks per sampled acquisition (i.e. aggregate bcputicks / bsamplecount) is greater than sm_promotionThreshold. However, if FAST_PROMOTE is set for this page, the average only needs to be greater than sm_promotionThreshold / 2, i.e. we get a 50% discount on the threshold.
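As a quick sanity check on those three conditions, here is how I would express them in Python. The function name and parameter names are hypothetical, and I use floating-point division for the average; the real code may well use integer arithmetic, which could differ right at the boundary.

```python
def cycle_based_should_promote(breferences: int,
                               bsamplecount: int,
                               bcputicks: int,
                               promotion_threshold: int,
                               fast_promote: bool = False) -> bool:
    """Sketch of the cycle-based prerequisites (beyond 'page not dirty')."""
    if breferences <= 2_000:
        return False
    if bsamplecount <= 10:
        return False  # also guarantees we never divide by zero below

    # FAST_PROMOTE halves the threshold the average has to beat.
    threshold = promotion_threshold / 2 if fast_promote else promotion_threshold
    average_ticks = bcputicks / bsamplecount
    return average_ticks > threshold


# Illustrative numbers only: 20 samples averaging 600 ticks against a
# made-up threshold of 1,000 ticks.
print(cycle_based_should_promote(3_000, 20, 12_000, 1_000))
print(cycle_based_should_promote(3_000, 20, 12_000, 1_000, fast_promote=True))
```

In this example the page fails the normal test (600 is not above 1,000) but passes with FAST_PROMOTE set (600 is above 500).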

Decision tree 2: reference-based promotion

If cycle-based promotion is disabled, a simpler set of additional prerequisites applies:

Either breferences > 200,000
Or trace flag 844 is set and breferences > 4,000
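Written out as a Python sketch (again with a function name and a `tf844` parameter of my own invention), the reference-based rule is a single expression:

```python
def reference_based_should_promote(breferences: int, tf844: bool = False) -> bool:
    """Sketch of the reference-based prerequisites (beyond 'page not dirty')."""
    return breferences > 200_000 or (tf844 and breferences > 4_000)


# Illustrative: 10,000 touches in the sampling window.
print(reference_based_should_promote(10_000))
print(reference_based_should_promote(10_000, tf844=True))
```

Put differently, trace flag 844 drops the bar from 200,000 references to 4,000, a fifty-fold reduction.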

Conclusion

And that’s that! Not rocket science, but clearly a good amount of history went into these rules stabilising into their current form. And I’d venture a guess that TF844 reflects a desire to satisfy an actual or anticipated need for customers who are desperate to see latch promotions happening. Whether or not that actually solves a problem…

Again, I have skirted the issue of what happens inside the act of latch promotion, or what a superlatch looks like. Bob Dorr (as is to be expected!) has an excellent exposition on superlatches and I’d consider this the canonical explanation. I might do a further deep dive into superlatches, but I feel it’s time to step into other areas for my next few blog posts.