Spring Boot HandBook

    Concurrent Transaction Management and Transaction Isolation Levels

    Introduction#

    Concurrent Transaction Management ensures multiple transactions can run simultaneously without conflicts, maintaining data integrity and system performance.

    Transaction Isolation Levels determine how transactions are isolated from changes made by others to ensure data consistency. In Spring Boot, the @Transactional annotation supports various isolation levels to manage concurrent transaction issues like dirty reads, non-repeatable reads, and phantom reads.

    1. Read Uncommitted: Least isolation, fastest, but allows dirty reads.
    2. Read Committed: Prevents dirty reads by allowing only committed data to be read.
    3. Repeatable Read: Prevents inconsistent data reads during a transaction.
    4. Serializable: Highest isolation, avoids all anomalies by running transactions sequentially.

    These mechanisms are crucial for efficient and reliable database operations.

    Isolation Levels#

    Isolation Level
    1. Read Uncommitted:
      • Key Characteristics:
        • Allows dirty reads, where uncommitted changes by other transactions are visible.
        • Offers the least amount of isolation, leading to high performance at the cost of consistency.
      • Example:
        • Transaction A updates a value but hasn’t committed it. Transaction B reads this value. If A rolls back, B has read invalid data.
      • Implications:
        • Best for analytics or logging systems where stale or inconsistent data doesn't harm outcomes.
      • Behavior: Dirty reads are allowed.
        • Transaction A:
          • SQL command:
            BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
            UPDATE accounts SET balance = 50 WHERE id = 1;
            -- Transaction is not committed yet
        • Transaction B (running concurrently):
          • SQL command:
            BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
            SELECT balance FROM accounts WHERE id = 1; 
            -- Output: 50 (dirty read; uncommitted data is visible)
        • If Transaction A rolls back:
          • SQL command:
            ROLLBACK;

    The update is undone, but Transaction B has already seen the incorrect value.

    Note: "PostgreSQL does not implement dirty reads; though it will accept a request for READ UNCOMMITTED isolation level, it maps it to READ COMMITTED."

    Read Committed (psql)
    1. Read Committed:
      • Key Characteristics:
        • Prevents dirty reads by only allowing access to committed data.
        • Does not prevent non-repeatable reads, meaning data read once may differ on subsequent reads within the same transaction.
      • Example:
        • Transaction A commits changes to a record. Transaction B reads the record and sees the committed value. However, if B reads again later, A may have changed the value again.
      • Use Case:
        • Order processing: Reading only confirmed orders but tolerating potential updates during the transaction.
      • Behavior: Dirty reads are prevented.
        • Transaction A:
          • SQL command:
            BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
            UPDATE accounts SET balance = 50 WHERE id = 1;
            -- Transaction is not committed yet
          • Transaction B (running concurrently):
            • SQL command:
              BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
              SELECT balance FROM accounts WHERE id = 1;
              -- Output: 100 (dirty read prevented; only committed data is visible)
          • If Transaction A commits:
            • SQL command:
              COMMIT;
          • Now, Transaction B will see the updated value:
            • SQL command:
              SELECT balance FROM accounts WHERE id = 1;
              -- Output: 50
    • Example Behavior in PostgreSQL (Read Committed):
    Read Committed
    1. Repeatable Read:
      • Key Characteristics:
        • Prevents dirty reads and non-repeatable reads, ensuring the same value is read multiple times within a transaction.
        • Does not prevent phantom reads where other transactions insert or delete rows that alter query results.
      • Example:
        • Transaction A queries for all rows in a table. Transaction B inserts a new row. If A queries again, the new row may or may not appear, depending on the query structure.
      • Use Case:
        • Inventory management: Ensures consistent views of product quantities during updates.
      • Behavior: Dirty reads and non-repeatable reads are prevented.
        • Transaction A:
          • SQL command:
            BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
            UPDATE accounts SET balance = 50 WHERE id = 1;
            -- Transaction is not committed yet
        • Transaction B (running concurrently):
          • SQL command:
            BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
            SELECT balance FROM accounts WHERE id = 1;
            -- Output: 100 (dirty read prevented)
            -- Transaction A commits
            COMMIT;
            SELECT balance FROM accounts WHERE id = 1;
            -- Output: 100 (non-repeatable read prevented; consistent view within transaction)
    • Example Behavior in PostgreSQL (Repeatable Read):
    Repeatable Read
    1. Serializable:
      • Key Characteristics:
        • The most strict and consistent isolation level.
        • Transactions are executed as if they were completely sequential.
        • Prevents dirty reads, non-repeatable reads, and phantom reads.
      • Example:
        • Two transactions trying to update the same inventory stock. One will block until the other is completed, ensuring sequential consistency.
      • Use Case:
        • Financial transactions: Prevents anomalies in banking systems where high data integrity is required.
      • Behavior: Dirty reads, non-repeatable reads, and phantom reads are all prevented.
        • Transaction A:
          • SQL command:
            BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
            INSERT INTO accounts (balance) VALUES (300);
             
        • Transaction B (running concurrently):
          • SQL command:
            BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
            SELECT SUM(balance) FROM accounts;
            -- Output: 300 (before A commits)
            --Transaction A commits
            COMMIT;
            SELECT SUM(balance) FROM accounts; -- Output: 300 (phantom read prevented; Transaction B does not see new rows).
        • If Transaction B tries to insert or modify the same rows, it will block until Transaction A completes.
    • Example Behavior in PostgreSQL (Serializable Isolation):
    Serializable

    Each isolation level provides a trade-off between data consistency and performance, and the choice depends on the specific needs of the application.

    How Serializable Isolation works#

    Serializable Isolation Level is the highest level of transaction isolation, ensuring that transactions execute in a way that the results would be the same as if they were executed sequentially, one after the other. It prevents all anomalies, such as dirty reads, non-repeatable reads, and phantom reads.

    • PostgreSQL (Serializable Snapshot Isolation - SSI):
      • PostgreSQL uses Serializable Snapshot Isolation (SSI), which does not rely on traditional locking mechanisms. Instead, it uses a snapshot of the database for each transaction.
      • SSI ensures transactions are serializable by detecting conflicts that would lead to non-serializable behavior (such as conflicting updates or inserts) and aborting the conflicting transactions.
      • Advantage: This approach avoids the performance overhead caused by traditional locking and maintains high concurrency without compromising transaction isolation.
    • Oracle, MySQL, SQL Server, DB2 (Lock-based Isolation):
      • These databases rely on row and range locks to enforce serializable behavior. When a transaction accesses data, it locks the rows or ranges to prevent other transactions from modifying them concurrently.
      • Advantage: Guarantees serializable behavior by ensuring no other transaction can access locked data.
      • Disadvantage: The need for locks can reduce concurrency, as transactions may have to wait for locks to be released, leading to performance degradation in high-traffic systems.

    When multiple transactions run simultaneously, they may conflict, leading to issues like:

    Dirty Read#

    • Definition: A dirty read occurs when a transaction reads data that has been modified by another transaction but not yet committed. If the second transaction is rolled back, the first transaction has read data that doesn't exist in the final state of the database.
    • Example: Transaction A updates a record but has not yet committed the changes. Transaction B reads that record before Transaction A commits. If Transaction A is rolled back, Transaction B has read invalid data.
    Dirty read

    Non-repeatable Read#

    • Definition: A non-repeatable read occurs when a transaction reads the same row multiple times but gets different values each time because another transaction modifies the data in between the reads.
    • Example: Transaction A reads a record, and Transaction B updates that record. When Transaction A reads the same record again, the value has changed due to Transaction B's update.
    Non Repeatable Read

    Phantom Read#

    • Definition: A phantom read happens when a transaction executes the same query twice but retrieves a different set of rows each time because another transaction has inserted, updated, or deleted records that match the query criteria.
    • Example: Transaction A queries all records with a certain condition (e.g., age > 30). Transaction B inserts a new record that meets this condition while Transaction A is still running. When Transaction A repeats the query, it retrieves a different set of records due to the new insertion by Transaction B.
    Phantom Read

    Conclusion#

    This article discusses the importance of selecting the right transaction isolation level in concurrent transaction management. It covers how Read Uncommitted, Read Committed, Repeatable Read, and Serializable balance data consistency with performance. By understanding these levels, developers can optimize transaction management to address concurrency issues like dirty reads, non-repeatable reads, and phantom reads.

    Last updated on Jan 14, 2025