Oracle Data Guard is one of the most critical disaster recovery solutions used in enterprise environments. But one issue every DBA eventually faces is:

“Standby Redo Logs are shipping, but logs are not applying.”

This problem can quickly lead to:

Apply lag
Data synchronization delays
Increased RPO risk
Failover inconsistency
Production escalation calls at 2 AM

In this article, we’ll walk through:

Real-time production scenario
MRP troubleshooting
Standby Redo Log (SRL) validation
Apply lag diagnosis
Parallel apply tuning
Step-by-step fixes
Best practices used in production

Real-Time Production Scenario

A banking application running on Oracle 19c RAC + Data Guard suddenly reports:

Standby database lag increasing continuously
Archive logs shipping successfully
But redo apply is stuck

Primary database generates heavy redo during:

End-of-day batch jobs
Large payment processing
ETL loads

The monitoring team raises a critical alert:

ORA-16853: apply lag has exceeded specified threshold

At first glance:

Transport lag = 0
Network looks fine
Archive logs are arriving

But standby is still behind by 45 minutes.

This is where deeper Data Guard troubleshooting begins.

Understanding the Redo Flow

Before troubleshooting, let’s quickly understand the redo flow.

Primary Database

LGWR writes redo
Redo shipped via LNS process
RFS receives redo on standby

Standby Database

Redo written into Standby Redo Logs (SRL)
MRP (Managed Recovery Process) applies redo

If any component breaks:

Apply lag increases
Recovery stops
Synchronization fails

Step 1 — Check Data Guard Status

First step is always checking overall Data Guard health.

SELECT 
    DEST_ID,
    STATUS,
    ERROR,
    DESTINATION
FROM V$ARCHIVE_DEST
WHERE STATUS <> 'INACTIVE';

Check Apply Lag

SELECT 
    NAME,
    VALUE,
    TIME_COMPUTED
FROM V$DATAGUARD_STATS
WHERE NAME IN ('apply lag','transport lag');

Typical output:

apply lag      +00 00:45:12
transport lag  +00 00:00:00

This confirms:

Redo shipping is fine
Apply process is delayed

Step 2 — Verify MRP Process

MRP is the heart of redo apply.

Check recovery process:

SELECT 
    PROCESS,
    STATUS,
    THREAD#,
    SEQUENCE#
FROM V$MANAGED_STANDBY;

Problematic output:

MRP0 WAIT_FOR_LOG

or:

MRP0 APPLYING_LOG

but sequence not progressing.

Common Reasons MRP Stops Applying

1. Missing Standby Redo Logs

One of the most common issues.

Check SRL configuration:

SELECT 
    GROUP#,
    THREAD#,
    SEQUENCE#,
    ARCHIVED,
    STATUS
FROM V$STANDBY_LOG;

Production Issue Example

In a 2-node RAC environment:

Primary redo logs:

Thread 1 → 4 redo groups
Thread 2 → 4 redo groups

But standby had only:

4 standby redo logs total

This is incorrect.

Oracle recommends:

(Number of Online Redo Log Groups per thread + 1)
for each thread

Correct SRL Configuration

Example:

If each RAC thread has:

4 redo log groups

Then standby should have:

5 SRLs for Thread 1
5 SRLs for Thread 2

Add Missing SRLs

ALTER DATABASE ADD STANDBY LOGFILE THREAD 1
GROUP 11 ('/u01/oradata/STD/srl11.log') SIZE 2G;

Repeat for all required groups.

Step 3 — Check Alert Logs

Always inspect standby alert log.

Common errors:

ORA-16037: user requested cancel of managed recovery

ORA-00313: open failed for members of log group

ORA-01153: an incompatible media recovery is active

Step 4 — Restart MRP Properly

Sometimes MRP hangs internally.

Cancel and restart recovery:

Stop Recovery

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Start Recovery Again

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

For Oracle 12c and above:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT NODELAY;

Real-Time Incident Example

In one production incident:

Redo transport was healthy
SRLs existed
No network issue

But MRP was frozen on one sequence for hours.

Root cause:

Corrupted archived log received on standby

Solution:

Remove corrupted archive
Re-transfer archive from primary
Restart MRP

Apply resumed immediately.

Step 5 — Check Apply Rate

Heavy redo generation can overwhelm standby apply.

Monitor apply performance:

SELECT 
    ITEM,
    SOFAR,
    UNITS
FROM V$RECOVERY_PROGRESS;

Identify Apply Bottleneck

Look for:

Slow disk I/O
CPU saturation
Single-threaded apply
Excessive checkpoints
Storage latency

Step 6 — Enable Parallel Apply

One major improvement for high redo environments is parallel apply.

Enable parallel recovery:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE PARALLEL 16 DISCONNECT;

Or dynamically:

ALTER SYSTEM SET RECOVERY_PARALLELISM=16;

Real Production Performance Improvement

A telecom production standby experienced:

Apply lag = 2 hours
Redo generation = 150 GB/hour

After enabling:

RECOVERY_PARALLELISM=32

Apply lag reduced to:

Less than 5 minutes

Step 7 — Validate Real-Time Apply

Check whether standby is using:

Archive apply
or
Real-time apply

Query:

SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS;

Desired output:

MANAGED REAL TIME APPLY

If not enabled:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

Step 8 — Monitor Archive Gap

Check for missing archive logs.

SELECT * FROM V$ARCHIVE_GAP;

If gaps exist:

Transfer missing archives manually
Register logs if needed

ALTER DATABASE REGISTER LOGFILE '/tmp/1_45566.arc';

Advanced Troubleshooting Scenarios

Scenario 1 — Standby Redo Logs in ACTIVE State Forever

Cause:

MRP stuck
SRL corruption
I/O freeze

Fix:

Restart MRP
Clear standby logfile

ALTER DATABASE CLEAR LOGFILE GROUP 11;

Scenario 2 — High Apply Lag but No Transport Lag

Cause:

Standby CPU bottleneck
Slow storage
Insufficient parallelism

Fix:

Increase parallel apply
Tune I/O
Add faster storage

Scenario 3 — Frequent Log Switches

Cause:

Small redo logs

Fix:

Increase redo log size.

Recommended:

15–20 minute log switch frequency

Scenario 4 — MRP Waiting for Log

Output:

WAIT_FOR_LOG

Possible causes:

Network delay
Archive gap
Transport issue
RFS not receiving logs

Useful Data Guard Monitoring Queries

Check Apply Status

SELECT PROCESS, STATUS, THREAD#, SEQUENCE# FROM V$MANAGED_STANDBY;

Check Lag

SELECT NAME, VALUE FROM V$DATAGUARD_STATS;

Check SRLs

SELECT GROUP#, THREAD#, STATUS FROM V$STANDBY_LOG;

Check Archive Gap

SELECT * FROM V$ARCHIVE_GAP;

Best Practices for Data Guard Apply Performance

1. Configure Proper SRLs

Always follow Oracle recommendation:

Redo Log Groups + 1 per thread

2. Use Real-Time Apply

Reduces failover data loss significantly.

3. Enable Parallel Apply for Heavy Workloads

Especially for:

Batch systems
Banking
Telecom
Large OLTP databases

4. Monitor Lag Continuously

Use:

OEM
Custom scripts
Data Guard Broker

5. Separate Redo and Data Disks

Avoid I/O contention.

Data Guard Broker Validation

Using DGMGRL:

show configuration;

show database verbose standbydb;

Look for:

Apply Lag
Transport Lag
Real Time Query
LogXptMode

Final Thoughts

Standby redo apply issues are among the most common Oracle Data Guard production incidents.

The key to fast resolution is understanding the entire redo pipeline:

LGWR
LNS
RFS
SRL
MRP

Most apply lag issues usually come down to:

Improper SRL configuration
MRP hangs
Archive gaps
Insufficient parallelism
Storage bottlenecks

A strong Oracle DBA not only fixes the issue quickly but also proactively designs Data Guard environments to avoid these failures entirely.

In enterprise systems where every second matters, properly tuned redo apply can make the difference between:

Seamless failover
or
Major business outage.

Monday, May 25, 2026

Standby Redo Logs Not Applying in Oracle Data Guard

Real-Time Production Scenario

Understanding the Redo Flow

Primary Database

Standby Database

Step 1 — Check Data Guard Status

Check Apply Lag

Step 2 — Verify MRP Process

Common Reasons MRP Stops Applying

1. Missing Standby Redo Logs

Production Issue Example

Correct SRL Configuration

Add Missing SRLs

Step 3 — Check Alert Logs

Step 4 — Restart MRP Properly

Stop Recovery

Start Recovery Again

Real-Time Incident Example

Step 5 — Check Apply Rate

Identify Apply Bottleneck

Step 6 — Enable Parallel Apply

Real Production Performance Improvement

Step 7 — Validate Real-Time Apply

Step 8 — Monitor Archive Gap

Advanced Troubleshooting Scenarios

Scenario 1 — Standby Redo Logs in ACTIVE State Forever

Scenario 2 — High Apply Lag but No Transport Lag

Scenario 3 — Frequent Log Switches

Scenario 4 — MRP Waiting for Log

Useful Data Guard Monitoring Queries

Check Apply Status

Check Lag

Check SRLs

Check Archive Gap

Best Practices for Data Guard Apply Performance

1. Configure Proper SRLs

2. Use Real-Time Apply

3. Enable Parallel Apply for Heavy Workloads

4. Monitor Lag Continuously

5. Separate Redo and Data Disks

Data Guard Broker Validation

Final Thoughts

No comments:

Post a Comment

Contact Form

Total Pageviews