
Concurrency Control 4: Tree Locking Protocols

Kung and Robinson said we can do optimistic concurrency control on B-trees: each lookup touches only h pages (h = height of the tree), while a tree with fanout f has on the order of f^{h} pages, so two operations rarely touch the same page.

What can happen if you proceed optimistically?

Problem: Design a locking protocol that allows highly concurrent access to a B-tree.

Solution 1: release locks early (non-2PL!). "Latches". OK, but when?

B-link Trees (Lehman/Yao)

A super-high concurrency solution, at the expense of a little extra complexity in the data structure.
Search

    current = root;
    A = get(current);
    while (current is not a leaf) {
        current = scannode(v, A);    // see footnote 1
        A = get(current);
    }
    while ((t = scannode(v, A)) == link pointer of A) {
        current = t;
        A = get(current);
    }
    if (v is in A) return(success);
    else return(failure);

Simple! Only trick is to have scannode know about high keys and rightlinks.

(Footnote 1: The scannode(v, A) routine examines memory page A and finds the appropriate pointer for value v. Note that it may return a rightlink pointer instead of a child pointer.)

Insert

First, we find a leaf node, and keep a stack of the rightmost node we visited at each level:

    initialize stack;
    current = root;
    A = get(current);
    while (current is not a leaf) {
        t = current;
        current = scannode(v, A);
        if (current not link pointer in A)
            push t;
        A = get(current);
    }

When we get to the leaf level, we may need to search right for the appropriate leaf. The move_right procedure scans right across the bottom, with lock coupling (i.e. if you have to move right, first lock the right neighbor, then release the lock on current).

    lock(current);
    A = get(current);
    move_right();

Now, assuming the key/ptr pair is not already in the tree, we proceed to insert and possibly split:

    Doinsertion:
    if A is safe {
        insert new key/ptr pair on A;
        put(A, current);
        unlock(current);
    } else {    // gonna have to split
        u = allocate(1 new page for B);
        redistribute A over A and B;
        y = max value on A now;
        make high key of B equal old high key of A;
        make rightlink of B equal old rightlink of A;
        make high key of A equal y;
        make rightlink of A point to B;
        put(B, u);
        put(A, current);
        oldnode = current;
        new key/ptr pair = (y, u);    // high key of new page, new page
        current = pop(stack);
        lock(current);
        A = get(current);
        move_right();
        // at this point we may have 3 locks: oldnode,
        // and two at the parent level while moving right
        unlock(oldnode);
        goto Doinsertion;
    }

Note the worst-case multiple locking here.
(Improvement later proposed by Sagiv, and by Lanin & Shasha: unlock current after put(A, current).)

Delete

Just remove from the leaf. They punt on underflow: just let leaves get empty and never delete them (hence never do deletion from internal nodes). If you think your tree is too empty, then reorganize it offline.

Why does this all work?
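One way to build intuition is to run the protocol on a toy tree. The following is a minimal Python sketch, not Lehman/Yao's code. The names Node, covers, place, split, BLinkTree and the capacity ORDER are all assumptions made for illustration. Pages live in memory (no get/put), root splits are handled by simply installing a new root, and the sketch is only exercised from a single thread. The locks are there to show the order of acquire and release, not to claim thread safety in Python.

```python
import bisect
import threading

ORDER = 4  # assumed capacity: a node holding more than ORDER keys must split

class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []         # sorted keys (separators in internal nodes)
        self.children = []     # internal nodes: len(children) == len(keys) + 1
        self.high_key = None   # None stands for +infinity
        self.right = None      # rightlink to the next node on this level
        self.lock = threading.Lock()

def covers(node, v):
    return node.high_key is None or v <= node.high_key

def scannode(v, node):
    # Follow the rightlink if v is past the high key, else pick a child.
    if not covers(node, v):
        return node.right
    return node.children[bisect.bisect_left(node.keys, v)]

def move_right(node, v):
    # Lock node, then lock-couple rightward until some node covers v.
    node.lock.acquire()
    while not covers(node, v):
        nxt = node.right
        nxt.lock.acquire()     # lock the right neighbor first...
        node.lock.release()    # ...then release the current node
        node = nxt
    return node

def place(node, key, child=None):
    i = bisect.bisect_left(node.keys, key)
    node.keys.insert(i, key)
    if child is not None:
        node.children.insert(i + 1, child)  # new page goes right of the old one

def split(a):
    # Move the upper half of a to a new right sibling b; return (y, b).
    b, mid = Node(a.leaf), len(a.keys) // 2
    if a.leaf:
        b.keys, a.keys = a.keys[mid:], a.keys[:mid]
        y = a.keys[-1]
    else:
        y = a.keys[mid]
        b.keys, b.children = a.keys[mid + 1:], a.children[mid + 1:]
        a.keys, a.children = a.keys[:mid], a.children[:mid + 1]
    b.high_key, b.right = a.high_key, a.right
    a.high_key, a.right = y, b
    return y, b

class BLinkTree:
    def __init__(self):
        self.root = Node(leaf=True)

    def search(self, v):
        node = self.root
        while not node.leaf:
            node = scannode(v, node)
        while not covers(node, v):      # chase rightlinks along the leaves
            node = node.right
        return v in node.keys

    def insert(self, v):
        stack, node = [], self.root
        while not node.leaf:
            if covers(node, v):         # remember where we went down
                stack.append(node)
            node = scannode(v, node)
        node = move_right(node, v)
        if v in node.keys:              # already present: nothing to do
            node.lock.release()
            return
        key, child = v, None
        while True:
            place(node, key, child)
            if len(node.keys) <= ORDER: # node was safe
                node.lock.release()
                return
            key, child = split(node)
            old = node
            if not stack:               # assumption: old is the root
                new_root = Node(leaf=False)
                new_root.keys, new_root.children = [key], [old, child]
                self.root = new_root
                old.lock.release()
                return
            node = move_right(stack.pop(), key)
            old.lock.release()

# A split whose separator never reached the parent is still searchable.
tree = BLinkTree()
for k in (10, 20, 30, 40, 50, 60):
    tree.insert(k)                      # the fifth insert splits the root leaf
split(tree.root.children[1])            # parent is deliberately left stale
print(tree.search(60), tree.search(15)) # True False
```

The last lines split a leaf by hand and leave the parent untouched. The search for 60 lands on the old leaf, sees that 60 is past its high key, and follows the rightlink to the new sibling. That is the window a concurrent reader would see between put(A, current) and the parent update.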
Interesting potential problem: Livelock.

What’s missing from this discussion?

Alternative techniques are used in ARIES: ARIES/KVL & ARIES/IM. They don’t require rightlinks, but add a little more constraint on the ordering of operations. On occasion they need to "reposition" (i.e. find the appropriate spot on a level to continue). The papers handle lots more details than Lehman/Yao, including degree-3 consistency, deletion, logging/recovery, and savepoints.

Extensions for R-trees & GiSTs

A CS286 class project in ’94, published in VLDB ’95 (Kornacker & Banks). Improved for GiST with concurrency, degree-3 consistency, deletion, and savepoints in SIGMOD ’97 (Kornacker, Mohan & Hellerstein).

Main differences to focus on:
Raises 2 questions:
Idea: impose an ordering that has nothing to do with the data. Each page gets a Node Sequence Number (NSN), like a timestamp. On a page split, the new right sibling gets the original NSN, the left sibling gets a new NSN, and the parent’s NSN is updated when the pointer to the new sibling is inserted.

Split detection: if the child’s NSN is greater than the NSN in the parent entry, the child has since split.

Limiting right traversal: only scan until a lower NSN.

Some extra details:

© 1998, Joseph M. Hellerstein. Last
modified 08/18/98. 