🚨 ORA-01555 on Standby (Active Data Guard)

📘 RCA + Troubleshooting + Prevention Guide

🎯 Objective

To analyze and resolve ORA-01555 snapshot too old error occurring on a physical standby (Active Data Guard) and establish preventive best practices.

🧭 1. Problem Summary

ORA-01555 observed on standby database
Same queries working on primary
Occurring intermittently across multiple SQL IDs
Query duration varied from seconds to hours

🔍 2. Key Observations

Multiple SQL IDs → ❌ Not SQL-specific issue
Query duration inconsistent → ❌ Not purely long-running query issue
Same SQL works on primary → ✅ Indicates standby-specific behavior
Undo retention = 900 sec (15 mins)

🧠 3. Core Concept

🔑 What is ORA-01555?

Occurs when Oracle cannot reconstruct consistent read (CR) due to missing undo.

👉 Based on consistent read mechanism

⚠️ 4. Why ORA-01555 Happens on Standby

❗ Important Difference

Primary	Standby
User DML generates undo	Redo apply updates undo
Controlled locally	Driven by primary

👉 Even without user activity:

Redo apply modifies undo blocks
Old undo gets overwritten

🔥 5. Critical Insight (Most Important)

🚨 UNDO_RETENTION on standby has NO EFFECT

✔ Effective undo retention = Primary database setting

📊 6. Root Cause Analysis

🧩 Root Cause 1: Insufficient Undo Retention

Query duration > undo retention
Example:
- Query: 316846 sec (~88 hours)
- Undo retention: 900 sec

👉 Clearly insufficient

🧩 Root Cause 2: Redo Apply Lag (SCN Jump Issue)

❗ Problem Scenario:

Missing / insufficient Standby Redo Logs (SRLs)
No Real-Time Apply
Batch redo apply

📉 SCN Jump Behavior

Primary SCN:   200 → 210 → 220 → 230 → 240 → 250
Standby SCN:   200 ───────────────→ 250 (jump)

Impact:

Query starts at old SCN (200)
Undo for SCN 200 already overwritten
❌ ORA-01555 triggered

⚙️ 7. Real-Time Apply Importance

👉 With proper configuration:

Continuous SCN progression
Better read consistency
Reduced ORA-01555 risk

🧪 8. Scenario Explanation

🟢 Primary

Query SCN: 240
Undo available → Query succeeds

🔴 Standby

Query SCN: 200
Undo expired → Query fails

🛠️ 9. Troubleshooting Checklist

✅ Step 1: Validate Data Guard Configuration

Use:

DGMGRL> validate database verbose <standby_db>;

🚨 Check:

Insufficient SRLs
Apply lag
Sync status

✅ Step 2: Verify Standby Redo Logs

✔ Requirements:

Same size as online redo logs
At least +1 group per thread

✅ Step 3: Check Real-Time Apply

SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;

✔ Expected:

MRP running in real-time apply mode

✅ Step 4: Check Undo Retention

SHOW PARAMETER undo_retention;

👉 Change on PRIMARY ONLY

✅ Step 5: Check Tuned Undo Retention

SELECT MAX(TUNED_UNDORETENTION) FROM V$UNDOSTAT;

✅ Step 6: Check Apply Lag

SELECT NAME, VALUE FROM V$DATAGUARD_STATS;

🔧 10. Resolution Steps

✔ Fix 1: Configure SRLs Properly

Add missing standby redo logs
Match size with primary redo logs

✔ Fix 2: Enable Real-Time Apply

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

✔ Fix 3: Increase Undo Retention (PRIMARY)

ALTER SYSTEM SET undo_retention=3600 SCOPE=BOTH;

✔ Fix 4: Resize Undo Tablespace

Ensure space supports higher retention

🚀 11. Prevention Strategy

Area	Action
Data Guard	Always use Real-Time Apply
SRL	Proper sizing & count
Undo	Size for longest query
Monitoring	Track apply lag
Reporting	Avoid extremely long queries

💡 12. Key Takeaways

ORA-01555 on standby ≠ normal undo issue
Undo retention controlled by primary only
Redo apply lag can trigger early failures
SCN jumps break consistent reads

🧠 13. Interview-Ready Explanation (🔥 Must Use)

“We encountered ORA-01555 on Active Data Guard standby where queries were failing despite working on primary.
Root cause was twofold: insufficient undo retention on primary for long-running queries, and improper standby redo log configuration causing SCN jumps due to batch redo apply.
We resolved it by configuring SRLs correctly, enabling real-time apply, and resizing undo tablespace on primary to support longer retention. This stabilized consistent reads on standby.”

🏁 Conclusion

In **Oracle Active Data Guard environments:

👉 Consistent read depends on:

Primary undo availability
Redo apply behavior

👉 Proper configuration ensures:

Reliable reporting
No ORA-01555 surprises

Monday, April 13, 2026