TRANSACTION PROCESSING

 

Transaction: is a logical unit of database processing that includes one or more database access operations (Insert, Delete, Modify and Retrieval). Database Applications may be embedded or not embedded like PL/SQL, Pro*C, Pro*Cobol etc.

 

Transaction Processing Systems: Large databases and hundreds of concurrent users managing transactions.

 

Single & Multi-users Systems

 

 

 

 

 

 

 

 

 


Volatile Storage: Contents will not survive a system crash. e.g., cache and/ or main memory.

 

Non-volatile storage: Contents will normally survive processing crashes. e.g., disk and/or tape but could also be battery backed RAM.

 

      TAPE > DISK > Battery backed RAM

 

Stable Storage: A theoretical ideal where data content is never lost.

 

 

 

 

 

 

 

 

 

 


                                    Main memory                                         Disk

 

                                                            Storage Model

Standard Definitions

 

input(X): which transfers the physical block in which data item X resides to main memory.

output(X): which transfers the buffer block on which X resides to the disk and replaces the appropriate physical block there.

 

read(X): Reads a database item X into a program variable.

write(X): Writes the value of program variable X into the database item named X.

 

Transaction boundaries: begin transaction and end transaction.

Basic unit of data transfer from disk to main memory is one block.

 

Executing read(X) command includes:

  1. Find the address of the disk block that contains item X.
  2. Copy that disk block into a buffer in main memory.
  3. Copy item X from the buffer to the program variable named X.

 

Executing write(X) command includes:

  1. Find the address of the disk block that contains item X.
  2. Copy that disk block into a buffer in main memory.
  3. Copy item X from the program variable named X into its correct location in the buffer.
  4. Store the update block from the buffer back to disk (either immediately or at some later point in time).

 

NOTE: If the system crashes after the write(X) operation was executed but before output(X) was executed, the new value of X is never written to disk, thus, is lost.

 

 

Transaction Model:

 

            Customers C1, C2 have balances stored in accounts B1, B2 (on disk):

 

                        T:read(B1,X)

                        X := X-100

write(B1,X)

read(B2,Y)

Y := Y+100

write(B2,Y)

 

  1. If output(B1) is executed and system failed before execution of output(B2). Then value $100 will destroy. Then B1+B2 is no longer preserved.
  2. If we replace the values of B1 and B2, to get B1+B2=3000 then the state is called consistent.

 

Then either both output()s must be performed neither to make the database consistent.

 

If the system fails at a point between the two output()s the database will be in an inconsistent.

 

Atomic if both sides of the transfer take place or neither half of the transfer take place.

 

CONCURRENCY CONTROL

 

Consider

 

(a)        T1:       read(X)                        (b)        T2:       read(X)

                        X=X-N                                                X=X+M

                        write(X)                                               write(X)

                        read(Y)

                        Y=Y+N

                        write(Y)

Several problems can occur when concurrent transactions execution in an uncontrolled manner.

 

The Lost Update Problem

 

Consider above transaction scenario.

 

If          X = 80 (reservations)

            N = 5   (T1 transfers from flight X to Y) and

            M = 4  (T2 reserves 4 seats on X)

Final result should be 79, but following fig. contains X = 84 and T1 transaction is lost.

 

      T1

      T2

read(X)

 

X := X - N

 

 

read(X)

 

X := X + M

write(X)

 

read(Y)

X is incorrect - overwritten

 

 

write(X)

Y := Y +  N

 

write(Y)

 

 

The Temporary Update Problem (Dirty Read):

 

This problem occurs when one transaction updates a database item and then the transaction fails fo some reason.

      T1

      T2

read(X)

 

X := X - N

 

write(X)

 

 

 

 

read(X)

 

X := X +M

 

write(X)

read(Y)

 

T1 fails and must change the value of X back to its old value; meanwhile T2 has read the temp. incorrect value of X

 

 

 

 

 

 

The values of item X that is read by T2 is called dirty data that transaction is not yet committed.

 

Incorrect summary problem:

If one transaction is calculating an aggregate summary function on a number of records while other transactions are updating some of these records, the aggregate function may calculate some values before they are updated and others after they are updated.

 

T3 reads X after N is subtracted and reads Y before N is added; a wrong summary is the result (off by N).

 

 

      T1

      T2

 

sum := 0

 

read(A)

 

sum := sum + A

 

……….

read(X)

 

X := X - N

 

write(X)

 

 

read(X)

 

sum := sum + X

 

read(Y)

 

sum := sum + Y

read(Y)

 

Y := Y + N

 

write(Y)

 

 

 

Recovery

 

  1. Hardware Failure
  2. A transaction or system error
  3. Local error or exception conditions detected by the transaction
  4. Concurrency control enforcement
  5. Disk failure
  6. Physical problems and catastrophes

 

 

Operations and Transaction States

 

For “successful completion”, the transaction should have five transaction states.

 

Active, is processing

 

Partially committed, after the last statement has been executed

 

Abort, after the transaction has been rolled back and the database restored to its state prior to the start of

the transaction

 

Committed, after successful completion.

 

 

 

These transactions are allowed as

 

                                                            Part. Commit                                        Commit

 


 


                        Active

 

 


                                                                Failed                                               Abort

 

 

 

REDO: If results are calculated and then fail useful to be able to REDO

 

UNDO: To go from failed state to abort must be able to UNDO

 

The System Log

 

LOG BASED RECOVERY

 

            The system as a whole must be able to ensure that it is possible to recover to a consistent state.

 

            Technique to implement recovery is the use of a file of log records

 

            The types of record will be

 

                        <T, start>                     transaction T has started

           

                        <T, X, old, new>          T has changed the value of x from old to new

 

                        <T, commit>                T has committed

 

            Decide where to keep log, when creates

 

1.      To recover from loss of volatile data, keep the log in non-volatile storage

2.      I loss of non-volatile must be recovered then log must be on stable storage

3.      A comprehensive policy must allow for further crash during recovery

 

EXAMPLE-1:

 

            Customers C1, C2 have balances stored in accounts B1, B2 (on disk), T transfers $100 from B1 to B2. Then evaluate

 

1.      Log-based recovery

2.      Immediate update

3.      Deferred update

 

Where transaction T is

 

 

                        T:         read(B1, X)

                                    X := X – 100

                                    write(B1, X)

                                    read(B2, Y)

                                    Y := Y + 100

                                    write(B2, Y)

 

Log-based recovery

 

            T:   BEGIN Create-log(<T, start>);

read(B1, X);

oldx := X;

                                    X := X – 100;

                                    Create-log(<T, B1, oldx, X>);

                                    write(B1, X);

                                    read(B2, Y);

oldy := y;

                                    Y := Y + 100;

Create-log(<T, B1, oldx, X>);

                                    write(B2, Y);

                               Create-log(<T, commit>);

                 END;

 

 

Immediate Update

 

            T:   BEGIN Create-log(<T, start>);

read(B1, X);

oldx := X;

                                    X := X – 100;

                                    Create-log(<T, B1, oldx, X>);

                                    write(B1, X);

                                    output(B1);

                                    read(B2, Y);

oldy := y;

                                    Y := Y + 100;

Create-log(<T, B1, oldx, X>);

                                    write(B2, Y);

                                    output(B2);

                               Create-log(<T, commit>);

                 END;

                       

On the loss of volatile storage, it might have to UNDO(T) or REDO(T)

 

If the log contains <T, start> but not <T, commit> then must have UNDO(T).

 

 

Deferred Update

 

            T:   BEGIN Create-log(<T, start>);

read(B1, X);

                                    X := X – 100;

                                    Create-log(<T, B1, X>);

                                    write(B1, X);

                                    read(B2, Y);

                                    Y := Y + 100;

Create-log(<T, B1, X>);

                                    write(B2, Y);

                               Create-log(<T, commit>);

                                    output(B1);

                                    output(B2);

                 END;

 

After a loss of volatile storage, REDO(T) performed if and only if log contains both

<T, start> and <T, commit>.

 

 

Scenario (Immediate)

 

            (a)                                (b)

            <T, start>                     <T, start>

            <T, B1, oldx, X>          <T, B1, oldx, X>

            <T, B2, oldy, Y>          <T, B2, oldy, Y>

                                                <T, commit>

 

Assume that the crash occurs just after the log record for the step write(B2,y).

In (a), when the systems comes back T must be UNDO i.e., UNDO(T) values of B1 & B2 unchanged.

 

Assume that the crash occurs just after the record for the step <T, cimmit>.

In (b), when the system comes back T must be REDO i.e., REDO(T)

 

As there is no other transaction so there is no UNDO operation.

 

 

 

 

EXERCISE:

 

                        T0:       read(A, a1)

                                    a1 := a1 – 50

                                    write(A, a1)

                                    read(B, b1)

                                    b1 := b1 + 50

                                    write(B, b1)

                        T1:       read(C, c1)

                                    c1 := c1 – 100

                                    write(C, c1)     

           

            where A = $1000, B = $2000, C = $700

 

Evaluate

            Log-based Recovery

            Immediate update

            Deferred update and

            Possible scenario for Immediate update