DB-008: Phantom Read
· One min read
Transaction Timeline: Phantom Read Anomaly
T1 scans for 'Engineering' employees. T2 inserts and commits a new 'Engineering' employee. T1's subsequent identical scan finds this new 'phantom' row.
Step: 0 / 7
Time: 0
Committed Database State
employees
id | name | department | salary |
---|---|---|---|
1 | Alice | Engineering | 90000 |
2 | Bob | Sales | 75000 |
4 | David | Engineering | 85000 |
Uncommitted Inserts
None
Scan Log
No scans yet
Current Operation
None
Transaction Timelines
0
10
20
30
40
50
60
T1
B
S
S
C
T2
B
I
C
Legend
B
BeginS
Scan (Range Query)I
InsertC
CommitA
AbortUnderstanding Phantom Reads
A Phantom Read occurs when a transaction executes a range query (e.g., "find all employees in Engineering") twice, and the set of rows returned by the second query is different from the first. This happens because another transaction has inserted or deleted rows that match the query's criteria and committed those changes in between the two queries of the first transaction. The newly appeared (or disappeared) rows are called "phantoms."
Key Observation Points:
- Step 2: T1's first scan of 'Engineering' employees finds Alice and David.
- Step 4 & 5: T2 inserts 'Charlie' into 'Engineering' and commits this change.
- Step 6: T1's second scan of 'Engineering' employees now finds Alice, David, AND 'Charlie'. 'Charlie' is the phantom row.
- This anomaly can lead to inconsistencies if T1 was performing calculations or making decisions based on the assumption that the set of rows matching its criteria would not change during its execution.
- Phantom reads are typically prevented by higher isolation levels like Serializable, or by specific locking strategies (e.g., predicate locks, though not widely implemented, or careful range locking).