Concurrency Control 1: Locking and Degrees of Consistency
Statement of problem:
- Database: a fixed set of named resources (e.g. tuples, pages, files, whatever)
- Consistency Constraints: must be true for DB to be considered "consistent".
- sum(account balances) = sum(assets)
- P is index pages for R
- ACCT-BAL > 0
- each employee has a valid department
- Transaction: a sequence of actions bracketed by begin and end statements. Each
transaction ("xact") is assumed to maintain consistency (this can be guaranteed
by the system).
- Goal: Concurrent execution of transactions, with high throughput/utilization, low
response time, and fairness.
The ACID test for transaction management:
- Atomicity: Either all actions within a transaction occur, or none do ("all or
nothing at all")
- Consistency: Each transaction takes the database from one consistent state to another.
No intermediate states are observable. (We will refine this definition today)
- Isolation: Events within a transaction must be invisible to other transactions. This
allows transactions to be aborted in isolation (i.e. avoids cascaded aborts)
- Durability: Once a transaction is committed, its results must be preserved even in the
case of failures.
C & I guaranteed by concurrency control. A & D guaranteed by recovery.
A transaction schedule:
|A = A - 50
||temp = A*0.1
||A = A - temp
|B = B + 50
||B = B + temp
The system "understands" only reads and writes; cannot assume any semantics
of other operations.
Arbitrary interleaving can lead to:
- temporary inconsistencies (ok, unavoidable)
- "permanent" inconsistencies, that is, inconsistencies that remain after
transactions have completed.
- Schedule: A "history" or "audit trail" of all actions in the system,
the xacts that performed them, and the objects they affected.
- Serial Schedule: A schedule is serial if all the actions of each single xact appear
- Equivalence of schedules: Two schedules S1, S2 are considered computationally equivalent
- The set of transactions that participate in S1 and S2 are the same.
- For each data item Q in S1, if transaction Ti executes read(Q) and the value of Q read
by Ti was written by Tj, then the same will hold in S2. [reads are all the same]
- For each data item Q in S1, if transaction Ti executes the last write(Q) instruction,
then the same holds in S2. [the same writers "win"]
- Serializability: A schedule S is serializable if there exists a serial schedule S
such that S and S are computationally equivalent.
One way to think about concurrency control in terms of dependencies:
- T1 reads N ... T2 writes N: a RW dependency
- T1 writes N ... T2 reads N: a WR dependency
- T1 writes N ... T2 writes N: a WW dependency
Can construct a "serialization graph" for a schedule S (SG(S)):
- nodes are transactions T1, ..., Tn
- Edges: Ti -> Tj if there is a RW, WR, or WW dependency from Ti to Tj
Theorem: A schedule S is serializable iff SG(S) is acyclic.
A technique to ensure serializability, but hopefully preserve high concurrency as well.
The winner in industry.
- A "lock manager" records what entities are locked, by whom, and in what
"mode". Also maintains wait queues.
- A well-formed transaction locks entities before using them, and unlocks them some time
Multiple lock modes: Some data items can be shared, so not all locks need to be
Lock compatibility table 1:
Assume two lock modes: shared (S) and exclusive (X) locks.
If you request a lock in a mode incompatible with an existing lock, you must
Two-Phase Locking (2PL):
- Growing Phase: A transaction may obtain locks but not release any lock.
- Shrinking Phase: A transaction may release locks, but not obtain any new lock. (in fact,
locks are usually all released at once to avoid "cascading aborts".)
Theorem: If all xacts are well-formed and follow 2PL, then any resulting schedule is
(note: this is if, not if and only if!)
Gray, et al.: Granularity of Locks
Theme: Correctness and performance
- Granularity tradeoff: small granularity (e.g. field of a tuple) means high concurrency
but high overhead. Large granularity (e.g. file) means low overhead but low concurrency.
- Possible granularities:
- Tuples (records)
- fields of tuples
- Want hierarchical locking, to allow "large" xacts to set large locks,
"small" xacts to set small locks
- Problem: T1 S-locks a record in a file, then T2 X-locks the whole file. How can T2
discover that T1 has locked the record?
- Solution: "Intention" locks
- IS and IX locks
- T1 obtains S lock on record in question, but first gets IS lock on file.
- Now T2 cannot get X lock on file
- However, T3 can get IS or S lock on file (the reason for distinguishing IS and IX: if
there were only I, T3 couldnt get an S lock on file)
- For higher concurrency, one more mode: SIX. Intuitively, you read all of the object but
only lock some subparts. Allows concurrent IS locks (IX alone would not). Note: gives S
access, so disallows IX to others.
- requires that xacts lock items root to leaf in the hierarchy, unlock leaf to root
- generalization to DAG of resources: X locks all paths to a node, S locks at least one.
Some implementation background (not in paper):
- maintain a lock table as hashed main-mem structure
- lock/unlock must be atomic operations (protected by critical section)
- typically costs several hundred instructions to lock/unlock an item
- suppose T1 has an S lock on P, T2 is waiting to get X lock on P, and now T3 wants S lock
on P. Do we grant T3 an S lock?
No! (starvation, unfair, etc.) So...
- Manage FCFS queue for each locked object with outstanding requests
- all xacts that are adjacent and compatible are a compatible group
- The front group is the granted group
- group mode is most restrictive mode amongst group members
- Conversions: often want to convert (e.g. S to X for "test and modify"
actions). Should conversions go to back of queue?
- No! Instant deadlock (more notes on deadlock later). So put conversions right after
Gray, et al.: Degrees of Consistency
First, a definition: A write is committed when transaction if finished; otherwise, the
write is dirty.
A Locking-Based Description of Degrees of Consistency:
This is not actually a description of the degrees, but rather of how to achieve them.
But its easier to understand (?)
- Degree 0: set short write locks on updated items
- Degree 1: set long write locks on updated items ("long" = EOT)
- Degree 2: set long write locks on updated items, and short read locks on items read
- Degree 3: set long write and read locks
A Dirty-Data Description of Degrees of Consistency
Transaction T sees degree X consistency if...
- Degree 0: T does not overwrite dirty data of other transactions
- Degree 1:
- T sees degree 0 consistency, and
- T does not commit any writes before EOT
- T sees degree 1 consistency, and
- T does not read dirty data of other transactions
- T sees degree 2 consistency, and
- Other transactions do not dirty any data read by T before T completes.
Examples of Inconsistencies prevented by Various Degrees
- Garbage reads:
T1: write(X); T2: write(X)
Who knows what value X will end up being?
Solution: set short write locks (degree 0)
- Lost Updates:
T1: abort (restores X to pre-T1 value)
At this point, the update to T2 is lost (note: log contains T1.X.oldval, newval)
Solution: set long write locks (degree 1)
- Dirty Reads:
Now T2s read is bogus.
Solution: set long X locks and short S locks (degree 2)
Many systems do long-running queries at degree 2.
- Unrepeatable reads:
T2: end transaction
Now T2 has read two different values for X.
Solution: long read locks (degree 3)
T1: read range [x - y]
T2: insert z, x < z < y
T2: end transaction
T1: read range [x - y]
Z is a "phantom" data item (eek!)
NOTE: two-phase well-formed implies degree-3 consistency. (Why?)
Further reading on consistency levels: Berenson, et al., SIGMOD 1995.
Notes on Deadlock
- In OS world, deadlock usually due to errors or overloads
- In DB/xact world with 2PL, theyre inherent.
- Most common causes:
- Usual DB solution: deadlock detection (other option: deadlock avoidance)
- Use "waits-for" graph and look for cycles
- Empirically, in actual systems the waits-for graph shows:
- cycles fairly rare
- cycle length usually 2, sometimes 3, virtually never >3
- use DFS to find cycles
- When to look for cycles?
- whenever a xact blocks
- never (use timeouts)
Typically, in centralized systems, run deadlock detection whenever blocking occurs.
Arguably this is cheap: most recently blocked transaction T must be the one that
caused the deadlock, so just DFS starting from T
In distributed systems, even this can be expensive, so often do periodic detection
- Who to restart ("victim selection")
- current blocker
- youngest XACT
- least resources used
- fewest locks held (common)
- fewest number of restarts