redbookcover.gif (13955 bytes) Readings in Database Systems, 3rd Edition

Stonebraker & Hellerstein, eds.

Concurrency Control 1: Locking and Degrees of Consistency

Transaction Refresher

Statement of problem:

  • Database: a fixed set of named resources (e.g. tuples, pages, files, whatever)
  • Consistency Constraints: must be true for DB to be considered "consistent". Examples:
    • sum(account balances) = sum(assets)
    • P is index pages for R
    • ACCT-BAL > 0
    • each employee has a valid department
  • Transaction: a sequence of actions bracketed by begin and end statements. Each transaction ("xact") is assumed to maintain consistency (this can be guaranteed by the system).
  • Goal: Concurrent execution of transactions, with high throughput/utilization, low response time, and fairness.


The ACID test for transaction management:

  • Atomicity: Either all actions within a transaction occur, or none do ("all or nothing at all")
  • Consistency: Each transaction takes the database from one consistent state to another. No intermediate states are observable. (We will refine this definition today)
  • Isolation: Events within a transaction must be invisible to other transactions. This allows transactions to be aborted in isolation (i.e. avoids cascaded aborts)
  • Durability: Once a transaction is committed, its results must be preserved even in the case of failures.


C & I guaranteed by concurrency control. A & D guaranteed by recovery.


A transaction schedule:

T0 T1
A = A - 50
temp = A*0.1
A = A - temp
B = B + 50
B = B + temp


The system "understands" only reads and writes; cannot assume any semantics of other operations.

Arbitrary interleaving can lead to:

  • temporary inconsistencies (ok, unavoidable)
  • "permanent" inconsistencies, that is, inconsistencies that remain after transactions have completed.


Some definitions:

  • Schedule: A "history" or "audit trail" of all actions in the system, the xacts that performed them, and the objects they affected.
  • Serial Schedule: A schedule is serial if all the actions of each single xact appear together.
  • Equivalence of schedules: Two schedules S1, S2 are considered computationally equivalent if:
  1. The set of transactions that participate in S1 and S2 are the same.
  2. For each data item Q in S1, if transaction Ti executes read(Q) and the value of Q read by Ti was written by Tj, then the same will hold in S2. [reads are all the same]
  3. For each data item Q in S1, if transaction Ti executes the last write(Q) instruction, then the same holds in S2. [the same writers "win"]
  • Serializability: A schedule S is serializable if there exists a serial schedule S’ such that S and S’ are computationally equivalent.


One way to think about concurrency control – in terms of dependencies:

  1. T1 reads N ... T2 writes N: a RW dependency
  2. T1 writes N ... T2 reads N: a WR dependency
  3. T1 writes N ... T2 writes N: a WW dependency

Can construct a "serialization graph" for a schedule S (SG(S)):

  • nodes are transactions T1, ..., Tn
  • Edges: Ti -> Tj if there is a RW, WR, or WW dependency from Ti to Tj

Theorem: A schedule S is serializable iff SG(S) is acyclic.


A technique to ensure serializability, but hopefully preserve high concurrency as well. The winner in industry.

  • A "lock manager" records what entities are locked, by whom, and in what "mode". Also maintains wait queues.
  • A well-formed transaction locks entities before using them, and unlocks them some time later.

Multiple lock modes: Some data items can be shared, so not all locks need to be exclusive.
Lock compatibility table 1:
Assume two lock modes: shared (S) and exclusive (X) locks.

  S X

If you request a lock in a mode incompatible with an existing lock, you must wait. 

Two-Phase Locking (2PL):

  • Growing Phase: A transaction may obtain locks but not release any lock.
  • Shrinking Phase: A transaction may release locks, but not obtain any new lock. (in fact, locks are usually all released at once to avoid "cascading aborts".)

Theorem: If all xacts are well-formed and follow 2PL, then any resulting schedule is serializable
(note: this is if, not if and only if!)

Gray, et al.: Granularity of Locks

Theme: Correctness and performance

  • Granularity tradeoff: small granularity (e.g. field of a tuple) means high concurrency but high overhead. Large granularity (e.g. file) means low overhead but low concurrency.
  • Possible granularities:
    • DB
    • Areas
    • Files
    • Pages
    • Tuples (records)
    • fields of tuples
  • Want hierarchical locking, to allow "large" xacts to set large locks, "small" xacts to set small locks
  • Problem: T1 S-locks a record in a file, then T2 X-locks the whole file. How can T2 discover that T1 has locked the record?
  • Solution: "Intention" locks
      NL IS IX S SIX X


    • IS and IX locks
    • T1 obtains S lock on record in question, but first gets IS lock on file.
    • Now T2 cannot get X lock on file
    • However, T3 can get IS or S lock on file (the reason for distinguishing IS and IX: if there were only I, T3 couldn’t get an S lock on file)
    • For higher concurrency, one more mode: SIX. Intuitively, you read all of the object but only lock some subparts. Allows concurrent IS locks (IX alone would not). Note: gives S access, so disallows IX to others.
    • requires that xacts lock items root to leaf in the hierarchy, unlock leaf to root
    • generalization to DAG of resources: X locks all paths to a node, S locks at least one.

Some implementation background (not in paper):

  • maintain a lock table as hashed main-mem structure
  • lock/unlock must be atomic operations (protected by critical section)
  • typically costs several hundred instructions to lock/unlock an item
  • suppose T1 has an S lock on P, T2 is waiting to get X lock on P, and now T3 wants S lock on P. Do we grant T3 an S lock?

No! (starvation, unfair, etc.) So...

    • Manage FCFS queue for each locked object with outstanding requests
    • all xacts that are adjacent and compatible are a compatible group
    • The front group is the granted group
    • group mode is most restrictive mode amongst group members
  • Conversions: often want to convert (e.g. S to X for "test and modify" actions). Should conversions go to back of queue?
  • No! Instant deadlock (more notes on deadlock later). So put conversions right after granted group.

Gray, et al.: Degrees of Consistency

First, a definition: A write is committed when transaction if finished; otherwise, the write is dirty.

A Locking-Based Description of Degrees of Consistency:

This is not actually a description of the degrees, but rather of how to achieve them. But it’s easier to understand (?)

  • Degree 0: set short write locks on updated items
  • Degree 1: set long write locks on updated items ("long" = EOT)
  • Degree 2: set long write locks on updated items, and short read locks on items read
  • Degree 3: set long write and read locks

A Dirty-Data Description of Degrees of Consistency

Transaction T sees degree X consistency if...

  • Degree 0: T does not overwrite dirty data of other transactions
  • Degree 1:
    1. T sees degree 0 consistency, and
    2. T does not commit any writes before EOT
  • Degree 2:
    1. T sees degree 1 consistency, and
    2. T does not read dirty data of other transactions
  • Degree 3:
    1. T sees degree 2 consistency, and
    2. Other transactions do not dirty any data read by T before T completes.

Examples of Inconsistencies prevented by Various Degrees

  • Garbage reads:
  • T1: write(X); T2: write(X)

    Who knows what value X will end up being?

    Solution: set short write locks (degree 0)

  • Lost Updates:
  • T1: write(X)

    T2: write(X)

    T1: abort (restores X to pre-T1 value)

    At this point, the update to T2 is lost (note: log contains T1.X.oldval, newval)

    Solution: set long write locks (degree 1)

  • Dirty Reads:
  • T1: write(X)

    T2: read(X)

    T1: abort

    Now T2’s read is bogus.

    Solution: set long X locks and short S locks (degree 2)

    Many systems do long-running queries at degree 2.

  • Unrepeatable reads:
  • T1: read(X)

    T2: write(X)

    T2: end transaction

    T1: read(X)

    Now T2 has read two different values for X.

    Solution: long read locks (degree 3)

  • Phantoms:
  • T1: read range [x - y]

    T2: insert z, x < z < y

    T2: end transaction

    T1: read range [x - y]

    Z is a "phantom" data item (eek!)

    Solution: ??

    NOTE: two-phase well-formed implies degree-3 consistency. (Why?)

Further reading on consistency levels: Berenson, et al., SIGMOD 1995.

Notes on Deadlock

  • In OS world, deadlock usually due to errors or overloads
  • In DB/xact world with 2PL, they’re inherent.
  • Most common causes:
    • Differing access orders

      T1: X-lock P

      T2: X-lock Q

      T1: X-lock Q // block waiting for T2

        T2: X-lock P // block waiting for T1

    • lock-mode upgrades

      T1: S-lock P

      T2: S-lock P

      T1: convert S-lock on P to X-lock // block

      T2: convert S-lock on P to X-lock // block

  • Usual DB solution: deadlock detection (other option: deadlock avoidance)
    • Use "waits-for" graph and look for cycles
    • Empirically, in actual systems the waits-for graph shows:
      • cycles fairly rare
      • cycle length usually 2, sometimes 3, virtually never >3
      • use DFS to find cycles
    • When to look for cycles?
        1. whenever a xact blocks
        2. periodically
        3. never (use timeouts)

Typically, in centralized systems, run deadlock detection whenever blocking occurs.   Arguably this is cheap: most recently blocked transaction T must be the one that caused the deadlock, so just DFS starting from T

In distributed systems, even this can be expensive, so often do periodic detection (more later)

  • Who to restart ("victim selection")
    1. current blocker
    2. youngest XACT
    3. least resources used
    4. fewest locks held (common)
    5. fewest number of restarts

1998, Joseph M. Hellerstein.  Last modified 08/18/98.
Feedback welcomed.