EECS 262a Advanced Topics in Computer Systems Lecture 9 Transactions and Isolation Levels October 1st, 2014 Alan Fekete Based on slides by Alan Fekete, Uwe Roehm and Michael Cahill (University of Sydney), updated by John Kubiatocwicz and Anthony D. Joseph Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs262 Todays Papers Granularity of Locks and Degrees of Consistency in a Shared Database (2-up version) J.N. Gray, R.A. Lorie, G.R. Putzolu, I.L. Traiger. Appears In IFIP Working Conference on Modeling of Data Base Management Systems. 1975
Serializable Isolation for Snapshot Databases Michael J. Cahill, Uwe Rhm, Alan D. Fekete. Appears in ACM SIGMOD08, June 912, 2008 Thoughts? 10/1/2014 cs262a-S14 Lecture-09 2 Overview Transactions ACID properties Isolation Examples and counter-examples Classic Implementation Techniques Weak isolation issues
10/1/2014 cs262a-S14 Lecture-09 3 Definition A transaction is a collection of one or more operations on one or more databases, which reflects a single real-world transition In the real world, this happened (completely) or it didnt happen at all (Atomicity) Once it has happened, it isnt forgotten (Durability) Commerce examples Transfer money between accounts Purchase a group of products Student record system Register for a class (either waitlist or allocated)
10/1/2014 cs262a-S14 Lecture-09 4 Consistency Each transaction can be written on the assumption that all integrity constraints hold in the data, before the transaction runs It must make sure that its changes leave the integrity constraints still holding However, there are allowed to be intermediate states where the constraints do not hold A transaction that does this, is called consistent This is an obligation on the programmer Usually the organization has a testing/checking and sign-off mechanism before an application program is allowed to get
installed in the production system 10/1/2014 cs262a-S14 Lecture-09 5 Threats to data integrity Need for application rollback System crash Provided by the log Concurrent activity Todays lecture is about this 10/1/2014 cs262a-S14 Lecture-09 6
Concurrency When operations of concurrent threads are interleaved, the effect on shared state can be unexpected Well known issue in operating systems, thread programming see OS textbooks on critical section Java use of synchronized keyword 10/1/2014 cs262a-S14 Lecture-09 7 Famous anomalies Dirty data One task T reads data written by T while T is running, then T aborts (so its data was not appropriate)
Lost update Two tasks T and T both modify the same data T and T both commit Final state shows effects of only T, but not of T Inconsistent read One task T sees some but not all changes made by T The values observed may not satisfy integrity constraints This was not considered by the programmer, so code moves into absurd path 10/1/2014 cs262a-S14 Lecture-09 8 Serializability To make isolation precise, we say that an execution is serializable when
There exists some serial (ie batch, no overlap at all) execution of the same transactions which has the same final state Hopefully, the real execution runs faster than the serial one! NB: different serial txn orders may behave differently; we ask that some serial order produces the given state Other serial orders may give different final states 10/1/2014 cs262a-S14 Lecture-09 9 Serializability Theory There is a beautiful mathematical theory, based on formal languages
Model an execution as a sequence of operations on data items eg r1[x] w1[x] r2[y] r2[x] c1 c2 Serializability of an execution can be defined by equivalence to a rearranged sequence (view serializability) Treat the set of all serializable executions as an object of interest (called SR) Thm: SR is in NP-Hard, i.e. the task of testing whether an execution is serializable seems unreasonably slow Does it matter? The goal of practical importance is to design a system that produces some subset of the collection of serializable executions Its not clear that we care about testing arbitrary executions that dont arise in our system 10/1/2014 cs262a-S14 Lecture-09 10
Conflict serializability There is a nice sufficient condition (ie a conservative approximation) called conflict serializable, which can be efficiently tested Draw a precedes graph whose nodes are the transactions Edge from Ti to Tj when Ti accesses x, then later Tj accesses x, and the accesses conflict (not both reads) The execution is conflict serializable iff the graph is acyclic Thm: if an execution is conflict serializable then it is serializable Pf: the serial order with same final state is any topological sort of the precedes graph Most people and books use the approximation, usually without mentioning it! 10/1/2014
cs262a-S14 Lecture-09 11 ACID Atomic State shows either all the effects of txn, or none of them Consistent Txn moves from a state where integrity holds, to another where integrity holds Isolated Effect of txns is the same as txns running one after another (ie looks like batch mode) Durable Once a txn has committed, its effects remain in the database
10/1/2014 cs262a-S14 Lecture-09 12 Big Picture If programmer writes applications so each txn is consistent And DBMS provides atomic, isolated, durable execution i.e. actual execution has same effect as some serial execution of those txns that committed (but not those that aborted) Then the final state will satisfy all the integrity constraints NB true even though system does not know all integrity constraints!
Automatic lock management DBMS requests the appropriate lock whenever the app program submits a request to read or write a data item If lock is available, the access is performed If lock is not available, the whole txn is blocked until the lock is obtained After a conflicting lock has been released by the other txn that held it 10/1/2014 cs262a-S14 Lecture-09 15 Lock modes Locks can be for writing (X), reading (S)
Refinements have extra modes Standard conflict rules: two X locks on the same data item conflict, so do one X and one S lock on the same data However, two S locks do not conflict X=exclusive S=shared 10/1/2014 Held/Requested X S X Block
Block S Block OK cs262a-S14 Lecture-09 16 Strict two-phase locking Locks that a txn obtains are kept until the txn completes Once the txn commits or aborts, then all its locks are released (as part of the commit or rollback processing) Two phases:
NB. This is different from when locks are released in O/S (while txn runs) or threaded code Locks are being obtained Locks are released (when txn finished) 10/1/2014 cs262a-S14 Lecture-09 17 Serializability If each transaction does strict two-phase locking (requesting all appropriate locks), then executions are serializable However, performance does suffer, as txns
can be blocked for considerable periods Deadlocks can arise, requiring system-initiated aborts 10/1/2014 cs262a-S14 Lecture-09 18 Proof sketch Suppose all txns do strict 2PL If Ti has an edge to Tj in the precedes graph That is, Ti accesses x before Tj has conflicting access to x Ti has lock at time of its access, Tj has lock at time of its access Since locks conflict, Ti must release its lock before Tjs access to x Ti completes before Tj accesses x Ti completes before Tj completes
So the precedes graph is subset of the (acyclic) total order of txn commit Conclusion: the execution has same final state as the serial execution where txns are arranged in commit order 10/1/2014 cs262a-S14 Lecture-09 19 Lock manager A structure in (volatile memory) in the DBMS which remembers which txns have set locks on which data, in which modes It does not allow a request to get a new lock if a conflicting lock is already held by a different txn NB: a lock does not actually prevent access to the data, it only prevents getting a conflicting lock
So data protection only comes if the right lock is requested before every access to the data 10/1/2014 cs262a-S14 Lecture-09 20 Lock manager API Access mainly based on items unique name eg tupleid, or primary key, for records Lock(name, txn, mode) Block until lock is available RemoveTxn(txn) Unlock(name, txn) LockUpgrade(name, txn, newmode)
ConditionalLock(name, txn, mode) Returns error immediately if unsuccessful 10/1/2014 cs262a-S14 Lecture-09 21 Granularity What is a data item (on which a lock is obtained)? Most times, in most modern systems: item is one tuple in a table Sometimes (especially in early 1970s): item is a page (with several tuples) Sometimes: item is a whole table 10/1/2014 cs262a-S14 Lecture-09
22 Granularity trade-offs Larger granularity: fewer locks held, so less overhead; but less concurrency possible false conflicts when txns deal with different parts of the same item Smaller fine granularity: more locks held, so more overhead; but more concurrency is possible System usually gets fine grain locks until there are too many of them; then it replaces them with larger granularity locks 10/1/2014 cs262a-S14 Lecture-09
23 Multigranular locking Care needed to manage conflicts properly among items of varying granularity Note: conflicts only detectable among locks on a given itemname System gets intention mode locks on larger granules before getting actual S/X locks on smaller granules Conflict rules arranged so that activities that do not commute must get conflicting locks on some item 10/1/2014 cs262a-S14 Lecture-09 24
Lock Mode Conflicts Held\Request IS IX S SIX X IS Yes Yes
Yes Yes Block IX Yes Yes Block Block Block S
Yes Block Yes Block Block SIX Yes Block Block Block
Block X Block Block Block Block Block Other modes also used for special purposes, like select-for-later-update, gaplocks for phantom prevention Lock manager may allow higher level to introduce its own conflict table
10/1/2014 cs262a-S14 Lecture-09 25 Lock manager internals Hash table, keyed by hash of item name Each item has a mode and holder (set) Wait queue of requests All requests and locks in linked list from transaction information Transaction table To allow thread rescheduling when blocking is finished Deadlock detection Either cycle in waits-for graph, or just timeouts 10/1/2014
cs262a-S14 Lecture-09 26 Explicit lock management With most DBMS, the application program can include statements to set or release locks on a table Details vary e.g. LOCK TABLE InStore IN EXCLUSIVE MODE 10/1/2014 cs262a-S14 Lecture-09 27
Overview Transactions Classic Implementation Techniques Weak isolation issues Especially Snapshot Isolation 10/1/2014 cs262a-S14 Lecture-09 28 Problems with serializability The performance reduction from isolation is high Transactions are often blocked because they want to read data that another txn has changed For many applications, the accuracy of the data they read is not crucial e.g. overbooking a plane is ok in practice
e.g. your banking decisions would not be very different if you saw yesterdays balance instead of the most up-to-date 10/1/2014 cs262a-S14 Lecture-09 29 A and D matter! Even when isolation isnt needed, no one is willing to give up atomicity and durability These deal with modifications a txn makes Writing is less frequent than reading, so log entries and write locks are considered worth the effort 10/1/2014
cs262a-S14 Lecture-09 30 Explicit isolation levels A transaction can be declared to have isolation properties that are less stringent than serializability However SQL standard says that default should be serializable (Gray75 called this level 3 isolation) In practice, most systems have weaker default level, and most txns run at weaker levels! 10/1/2014 cs262a-S14 Lecture-09 31
Browse SET TRANACTION ISOLATION LEVEL READ UNCOMMITTED Do not set S-locks at all Of course, still set X-locks before updating data If fact, system forces the txn to be read-only unless you say otherwise Allows txn to read dirty data (from a txn that will later abort) 10/1/2014 cs262a-S14 Lecture-09 32 Read Committed Gray75 called this degree 2
SET TRANACTION ISOLATION LEVEL READ COMMMITTED Most common in practice! Set S-locks but release them after the read has happened e.g. when cursor moves onto another element during scan of the results of a multirow query i.e. do not hold S-locks till txn commits/aborts Data is not dirty, but it can be inconsistent (between reads of different items, or even between one read and a later one of the same item) Especially, weird things happen between different rows returned by a cursor 10/1/2014 cs262a-S14 Lecture-09
33 Repeatable read SET TRANACTION ISOLATION LEVEL REPEATABLE READ Set S-locks on data items, and hold them till txn finished, but release locks on indices as soon as index has been examined Allows phantoms, rows that are not seen in a query that ought to have been (or vice versa) Problems if one txn is changing the set of rows that meet a condition, while another txn is retrieving that set 10/1/2014 cs262a-S14 Lecture-09 34
Snapshot Isolation (SI) A multiversion concurrency control mechanism was described in SIGMOD 95 by H. Berenson, P. Bernstein, J. Gray, J. Melton, E. ONeil, P. ONeil Does not guarantee serializable execution! Supplied by Oracle DB for Isolation Level Serializable (also in PostgreSQL before rel 9.1) Available in Microsoft SQL Server 2005 as Isolation Level Snapshot, and in PostgreSQL (since rel 9.1) as Isolation Level Repeatable Read 10/1/2014 cs262a-S14 Lecture-09 35 Snapshot Isolation (SI)
Read of an item may not give current value Instead, use old versions (kept with timestamps) to find value that had been most recently committed at the time the txn started Exception: if the txn has modified the item, use the value it wrote itself The transaction sees a snapshot of the database, at an earlier time Intuition: this should be consistent, if the database was consistent before 10/1/2014 cs262a-S14 Lecture-09 36 First committer wins (FCW) T will not be allowed to commit a
modification to an item if any other transaction has committed a changed value for that item since Ts start (snapshot) T must hold write locks on modified items at time of commit, to install them. In practice, commit-duration write locks may be set when writes execute. These simplify detection of conflicting modifications when T tries to write the item, instead of waiting till T tries to commit. 10/1/2014 cs262a-S14 Lecture-09 37 Benefits of SI Reading is never blocked, and reads dont block writes Avoids common anomalies
No dirty read No lost update No inconsistent read Set-based selects are repeatable (no phantoms) Matches common understanding of isolation: concurrent transactions are not aware of one anothers changes 10/1/2014 cs262a-S14 Lecture-09 38 Is every execution serializable? For any set of txns, if they all run with Two Phase Locking, then every interleaved execution is serializable For some sets of txns, if they all run with SI, then every execution is serializable
Eg the txns making up TPC-C For some sets of txns, if they all run with SI, there can be non-serializable executions Undeclared integrity constraints can be violated 10/1/2014 cs262a-S14 Lecture-09 39 Example Table Duties(Staff, Date, Status) Undeclared constraint: for every Date, there is at least 1 Staff with Status=Y Transaction TakeBreak(S, D) running at SI SELECT COUNT(*) INTO :tmp FROM Duties WHERE Date=:D AND Status=Y; IF tmp < 2 ROLLBACK;
UPDATE Duties SET Status = N WHERE Staff =:S AND Date =:D; COMMIT; 10/1/2014 cs262a-S14 Lecture-09 40 Example (continued) Possible execution, starting when two staff (S101, S103) are on duty for 2004-06-01 Concurrently perform TA: TakeBreak(S101, 2004-06-01) TB: TakeBreak(S103, 2004-06-01) Each succeeds, as each sees snapshot with 2 on duty
No problem committing, as they update different rows! End with no staff on duty for that date! RA(r1) RA(r3) RB(r1) RB(r3) WA(r1) CA WB(r3) CB S101 2004-06-01 Y S102 2004-06-01 N
Write Skew SI breaks serializability when txns modify different items in each others read sets Neither txn sees the other, but in a serial execution one would come later and so see the others impact This is fairly rare in practice Eg the TPC-C benchmark always runs correctly under SI whenever its txns conflict (eg read/write same data), there is also a ww-conflict: a shared item they both modify (like a total quantity) so SI will abort one of them 10/1/2014 cs262a-S14 Lecture-09 42 Interaction effects
You cant think about one program, and say this program can use SI The problems have to do with the set of application programs, not with each one by itself Example where T1, T2, T3 can all be run under SI, but when T4 is present, we need to fix things in T1 Non-serializable execution can involve read-only transactions, not just updaters 10/1/2014 cs262a-S14 Lecture-09 43 Multiversion Serializability Theory From Y. Raz in RIDE93 WW-conflict from T1 to T2
T1 writes a version of x, T2 writes a later version of x In our case, succession (version order) defined by commit times of writer txns WR-conflict from T1 to T2 T1 writes a version of x, T2 reads this version of x (or a later version of x) RW-conflict from T1 to T2 Adya et al ICDE00 called this antidependency T1 reads a version of x, T2 writes a later version of x Serializability tested by acyclic conflict graph 10/1/2014 cs262a-S14 Lecture-09 44
Interference Theory From Fekete et al, TODS 2005 We produce the static dependency graph Node for each application program Draw directed edges each of which can be either Non-vulnerable interference edge, or Vulnerable interference edge Based on looking at program code, to see what sorts of conflict situations can arise More complicated with programs whose accesses are controlled by parameters A close superset of SDG can be calculated automatically in some cases 10/1/2014 cs262a-S14 Lecture-09 45
Edges in the SDG Non-vulnerable interference edge from T1 to T2 Conflict, but it cant arise transactions can run concurrently Eg ww conflict Concurrent execution prevented by FCW Or wr conflict conflict wont happen in concurrent execution due to reading old version Eg T1 = R1(x) R1(y) W1(x) T2 = R2(x) R2(y) W2(x) W2(y)
10/1/2014 Vulnerable interference edge from T1 to T2 Conflict can occur when transactions run concurrently Eg rw without ww: rset(T1) intersects wset(T2), and wset(T1) disjoint from wset(T2) Eg T1 = R1(x) R1(y) W1(x) T2 = R2(x) R2(y) W2(y) Shown as dashed edge on diagram cs262a-S14 Lecture-09 46
Paired edges In SDG, an edge from X to Y implies an edge from Y to X But the type of edge is not necessarily the same Both vulnerable, or Both non-vulnerable, or One vulnerable and one non-vulnerable 10/1/2014 cs262a-S14 Lecture-09 47 Dangerous Structures A dangerous structure is two edges linking three application programs, A, B, C such that There are successive vulnerable edges (A,B) and (B,C) (A, B, C) can be completed to a cycle in SDG
Call B a pivot Special case: pair A, B with vulnerable edges in both directions Path through zero or more edges from C to A A B Pivo t 10/1/2014 C Dangerous structure cs262a-S14 Lecture-09 48
The main result Theorem: If the SDG does not contain a dangerous cycle, then every execution is serializable (with all transactions using SI for concurrency control) Applies to TPC-C benchmark suite 10/1/2014 cs262a-S14 Lecture-09 49 Serializable Isolation for Snapshot Databases [Sigmod08 Best paper, then ACM TODS 2009] Michael Cahill, Alan Fekete, Uwe Rhm University of Sydney
Serializable SI If we can alter the DBMS, we could provide a new algorithm for serializable isolation Online, dynamic Modifications to standard Snapshot Isolation Keep versions, read from snapshot, FCW (like SI) Detect read-write conflicts at runtime Abort transactions with consecutive rw-edges Much less often than traditional optimistic CC Dont do full cycle detection 10/1/2014 cs262a-S14 Lecture-09 51 Challenges
During runtime, rw-pairs can interleave arbitrarily Have to consider begin and commit timestamps: which snapshot is a transaction reading? can conflict with committed transactions Want to use existing engines as much as possible Low runtime overhead But minimize unnecessary aborts 10/1/2014 cs262a-S14 Lecture-09 52 SI anomalies: a simple case pivot commits last
10/1/2014 cs262a-S14 Lecture-09 53 Algorithm in a nutshell Add two flags to each transaction (in & out) Set T0.out if rw-conflict T0 T1 Set T0.in if rw-conflict TN T0 Abort T0 (the pivot) if both T0.in and T0.out are set If T0 has already committed, abort the conflicting transaction
10/1/2014 cs262a-S14 Lecture-09 54 Detection: write before read read old y T1.in = true T0.out = true 10/1/2014 cs262a-S14 Lecture-09 55 Detection: read before write lock x, SIREAD
How can we detect this? SIREAD mode lock doesnt block anything write lock x TN.out = true T0.in = true Just for record keeping Kept even after transaction commits 10/1/2014 cs262a-S14 Lecture-09 56 Main Disadvantage: False positives
no cycle unnecessary abort 10/1/2014 cs262a-S14 Lecture-09 57 Prototype in Oracle InnoDB Not in SIGMOD08; added work for TODS09 Implemented in Oracle InnoDB plugin 1.0.1 Most popular transactional backend for MySQL Already includes multiversion concurrency control Serializable SI, including phantom detection (uses InnoDBs next-key locking)
Also (for comparison) True Snapshot Isolation with firstcommitter-wins (InnoDBs repeatable read isolation has non-standard semantics) Added 230 lines of code to 130K lines in InnoDB Most changes related to transaction lifecycle management 10/1/2014 cs262a-S14 Lecture-09 58 Experimental scenarios sibench synthetic microbenchmark conflict between sequential scan and updating a row table size determines write-write conflict probability and CPU time required for scan TPC-C++ - modified TPC-C to introduce an SI
anomaly added a credit check transaction type to the mix measured throughput under a variety of conditions most not sensitive to choice of isolation level, but we found a mix favoring stock level transactions that demonstrates the tradeoff 10/1/2014 cs262a-S14 Lecture-09 59 sibench: 10 reads per write 60 10/1/2014 cs262a-S14 Lecture-09 60
But SI is NOT serializable! 63 10/1/2014 cs262a-S14 Lecture-09 63 Serializable SI: Lessons New algorithm for serializable isolation Online, dynamic, and general solution Modification to standard Snapshot Isolation Keeps the features that make SI attractive: Readers dont block writers, much better scalability than S2PL In most cases, performance is comparable with SI Never worse than locking serializable isolation Feasible to add to an RDBMS using Snapshot Isolation (such as Oracle) with modest changes
PostgreSQL release 9.1 did this Isolation Level Serializable now executes serializably! See Serializable Snapshot Isolation in PostgreSQL by D. Ports and K. Grittner, PVLDB 5(12):1850-1861 (2012) 10/1/2014 cs262a-S14 Lecture-09 64 Were these good papers? What were the authors goals? What about the evaluation / metrics? Did they convince you that this was a good system /approach? Were there any red-flags? What mistakes did they make? Does the system/approach meet the Test of Time challenge?
How would you review this paper today? 10/1/2014 cs262a-S14 Lecture-09 65 Further Reading Big picture: Principles of Transaction Processing by P. Bernstein and E. Newcomer Theory: Transactional Information Systems by G. Weikum and G. Vossen The gory details: Transaction Processing by J. Gray and A. Reuter Also, Architecture of a Database System by J. Hellerstein, M. Stonebraker, and J. Hamilton, 10/1/2014
MAGI. Modified Adjusted Gross Income Methodology. This will be used to determine how income is counted and how household composition and a family size is constructed when determining eligibility for Medicaid/NCHC. For Medicaid purposes, the applicant's countable income will be...
In the last three years, we have seen the growth of Affinity Groups. Right now, there are three types of approved Affinity Groups: Women in Engineering, Consultants' Networks and Graduates of the Last Decade, or GOLD. Like Chapters, the Affinity...
Characterization of magnetic nanoparticles with potential applications in cancer hyperthermia Jeffrey Camp Advisor: Dr. Cindi Dennis, NIST Hysteresis loop model fitting The Langevin Function: M = N*µ*( coth( µ*H/ kb*T) - 1/( µ*H/ kbT)) N = number of atoms per...
Psalm 24 verse 1 (NRSV) What does the Old Testament say about the environment and stewardship? Then God said, 'Let us make humankind in our image, according to our likeness; and let them have dominion over the fish of the...
BSL 1 handshape variation study This study replicates work on ASL (Bayley, Lucas & Rose, 2002) in the BSL signing community Variation in the handshape of BSL signs that are produced in citation form with a 1/G handshape (e.g., PEOPLE,...
Thermo King units use color coded sensing tubes. Carriers differentiate tubes by referring one as the high pressure and the other as the low side pressure.Testing and adjusting a Carrier air defrost switch is identical to the procedure previously described,...
Ch. 11 - Chemical Bonds I. Why Atoms Combine (p.298-302) Chemical Formula Chemical Bond Stability H2O 2 hydrogen atoms 1 oxygen atom A. Chemical Formula Shows: 1) elements in the compound 2) ratio of their atoms B. Chemical Bond Strong...
Ready to download the document? Go ahead and hit continue!