Thursday, April 9, 2026

Automated Health Check Script (Shell + HTML report)

I’ve created a production-ready automated health check script in the canvas for you ๐Ÿ‘

https://github.com/shashidba1208/shashidba/blob/master/oracle_automated_health_check_script_shell_html

๐Ÿ”ฅ What this script does:

  • Connects via sqlplus / as sysdba

  • Generates a timestamped HTML report

  • Covers:

    • ✅ Uptime

    • ✅ DB Status

    • ✅ Tablespace usage (with color coding ๐ŸŸข๐ŸŸก๐Ÿ”ด)

    • ✅ ASM usage

    • ✅ RMAN backup status

    • ✅ Active sessions

    • ✅ Invalid objects

    • ✅ Data Guard stats


๐Ÿš€ How to Use

1. Save script

vi db_health_check.sh

2. Give permission

chmod +x db_health_check.sh

3. Run manually

./db_health_check.sh

4. Schedule via cron

crontab -e

Example (run daily at 8 AM):

0 8 * * * /path/db_health_check.sh

๐Ÿ“Š Output

  • HTML report generated in:

/tmp/db_health/

Open in browser:

xdg-open report.html


Oracle Database Health Check & Monitoring


# ๐Ÿ“Š Oracle Database Health Monitoring SOP


## ๐ŸŽฏ Objective


To proactively monitor Oracle database health, detect issues early, and ensure high availability, performance, and data integrity across environments (Standalone / RAC / Data Guard).


---


# ๐Ÿงญ 1. Daily Health Check Overview


| Category    | Check                             | Frequency |

| ----------- | --------------------------------- | --------- |

| Instance    | Uptime, Status                    | Daily     |

| Database    | Role, Open Mode                   | Daily     |

| PDB         | Open Status                       | Daily     |

| Backup      | Last RMAN Backup                  | Daily     |

| Storage     | ASM, Tablespace                   | Daily     |

| Performance | Active Sessions                   | Daily     |

| DR          | Data Guard Sync                   | Daily     |

| Integrity   | Block Corruption, Invalid Objects | Daily     |


---


# ๐ŸŸข 2. Instance & Database Status


## ✅ Database Uptime


```sql

SELECT 

    'UPTIME: ' || 

    FLOOR(SYSDATE - STARTUP_TIME) || ' days, ' ||

    FLOOR(MOD((SYSDATE - STARTUP_TIME) * 24, 24)) || ' hours, ' ||

    FLOOR(MOD((SYSDATE - STARTUP_TIME) * 24 * 60, 60)) || ' minutes'

FROM GV$INSTANCE;

```


### ✔️ Expected:


* Should be stable (no frequent restarts)


### ๐Ÿšจ Alert If:


* Unexpected restart observed


---


## ✅ Database Role & Status


```sql

SELECT database_role, name, db_unique_name, open_mode, log_mode, 

       flashback_on, protection_mode, protection_level 

FROM gv$database;

```


### ✔️ Expected:


* PRIMARY → READ WRITE

* STANDBY → READ ONLY / MOUNTED


---


# ๐Ÿงฉ 3. PDB Health Check


## ✅ PDB Status


```sql

SELECT inst_id, con_id, name, open_mode, restricted,

       TO_CHAR(open_time,'DD-MON-YYYY HH24:MI:SS') open_time

FROM gv$pdbs;

```


### ✔️ Expected:


* All required PDBs in **READ WRITE**


---


## ✅ PDB Size


```sql

SELECT con_id, name, open_mode,

       total_size/1024/1024/1024 "PDB_SIZE_GB"

FROM v$pdbs;

```


---


# ๐Ÿ’พ 4. Backup Monitoring


## ✅ Last Backup Status


```sql

SELECT 

    'LAST BACKUP: ' || 

    INPUT_TYPE || ' - ' || 

    STATUS || ' - ' || 

    TO_CHAR(START_TIME, 'DD-MON-YYYY HH24:MI') || ' to ' ||

    TO_CHAR(END_TIME, 'DD-MON-YYYY HH24:MI')

FROM V$RMAN_BACKUP_JOB_DETAILS

WHERE ROWNUM = 1;

```


### ✔️ Expected:


* Status = COMPLETED


### ๐Ÿšจ Alert If:


* FAILED / RUNNING too long / No backup


---


# ⚡ 5. Performance Monitoring


## ✅ Active Sessions


```sql

SELECT count(*), inst_id, status 

FROM gv$session 

GROUP BY inst_id, status;

```


### ✔️ Expected:


* Balanced ACTIVE vs INACTIVE sessions


### ๐Ÿšจ Alert If:


* Too many ACTIVE sessions → Possible load issue


---


# ๐Ÿ’ฝ 6. Storage Monitoring


## ✅ ASM Diskgroup Usage


```sql

SELECT NAME,

       TOTAL_MB/1024 TOTAL_GB,

       FREE_MB/1024 FREE_GB,

       (FREE_MB/TOTAL_MB)*100 FREE_PER

FROM v$asm_diskgroup;

```


### ✔️ Threshold:


* Free space > 20%


---


## ✅ Tablespace Usage


```sql

SELECT file_name, tablespace_name, bytes/1024/1024 MB, status 

FROM dba_data_files;

```


---


## ✅ TEMP Tablespace


```sql

SELECT tablespace_name, 

       GB_total, 

       GB_used, 

       GB_free

FROM ...

```


---


## ✅ UNDO Tablespace


```sql

SELECT tablespace_name, 

       tot_ts_size_GB, 

       free_ts_size_GB, 

       usedspace 

FROM ...

```


---


# ๐Ÿ” 7. Data Guard Monitoring


## ✅ Archive Sync Status


```sql

SELECT a.thread#, b.last_seq, a.applied_seq,

       b.last_seq-a.applied_seq ARC_DIFF

FROM ...

```


### ✔️ Expected:


* ARC_DIFF ≈ 0


### ๐Ÿšจ Alert If:


* Lag increasing


---


## ✅ Data Guard Stats


```sql

SELECT name, value, unit, time_computed 

FROM v$dataguard_stats;

```


---


# ๐Ÿ”’ 8. Locks & Blocking Sessions


## ✅ Blocking Locks


```sql

SELECT * FROM v$lock WHERE block != 0;

```


### ๐Ÿšจ Action:


* Identify blocking session

* Kill if required


---


# ⚠️ 9. Database Integrity Checks


## ❌ Block Corruption


```sql

SELECT * FROM v$database_block_corruption;

```


### ๐Ÿšจ Action:


* Restore block using RMAN


---


## ❌ Invalid Objects


```sql

SELECT owner, object_type, COUNT(*) 

FROM dba_objects 

WHERE status = 'INVALID'

GROUP BY owner, object_type;

```


### ๐Ÿšจ Action:


```sql

EXEC UTL_RECOMP.RECOMP_SERIAL();

```


---


# ๐Ÿšจ 10. Alert Threshold Summary


| Metric           | Warning      | Critical            |

| ---------------- | ------------ | ------------------- |

| ASM Free Space   | < 20%        | < 10%               |

| Tablespace Usage | > 80%        | > 90%               |

| Active Sessions  | Sudden spike | System slowdown     |

| Data Guard Lag   | > 5 logs     | Continuous increase |

| Backup           | Delayed      | Failed              |


---

##################


Sunday, April 5, 2026

ORACLE DATA GUARD 19c Daily Monitoring Guide

ORACLE DATA GUARD 19c Daily Monitoring Guide

 

 

1.  Check standby database role and status (run on STANDBY)

select INST_ID,INSTANCE_NAME, name DB_NAME, db_unique_name,HOST_NAME,DATABASE_ROLE,OPEN_MODE,status, to_char(STARTUP_TIME,'DD-MON-YYYY HH24:MI:SS') "DB UP TIME" from v$database,gv$instance;

 

2.  Check if redo apply (MRP) is running on standby

 

3.  Check apply lag (run on STANDBY)

4.  Check lag using v$managed_standby (MRP process)

 

 

 

5.  Archive log GAP

 

6.  Check all Data Guard processes (run on STANDBY)

 

7.  Verify Data Guard Configuration Status

 

Dgmgrl / as sysdba

Show configuration;

8.  DGMGRL>Check Database Status (Primary & Standby)

dgmgrl -silent / <<EOF >> $LOGFILE

SHOW CONFIGURATION;

DGMGRL> SHOW DATABASE VERBOSE <PRIMARY_DB>;

DGMGRL> SHOW DATABASE VERBOSE <STANDBY_DB>;

 

9.  DGMGRL>Validate Database Health

dgmgrl -silent / <<EOF >> $LOGFILE

SHOW CONFIGURATION;

DGMGRL> VALIDATE DATABASE VERBOSE <PRIMARY_DB>;

DGMGRL> VALIDATE DATABASE VERBOSE <STANDBY_DB>;

 

 

10.              Daily Data Guard Monitoring Checklist

 

 

SQL Script (dg_check.sql)

SET LINES 200

SET PAGES 200

PROMPT === DATABASE STATUS ===

SELECT NAME, DATABASE_ROLE, OPEN_MODE FROM V$DATABASE;

 

PROMPT === DATAGUARD STATS ===

SELECT NAME, VALUE FROM V$DATAGUARD_STATS;

 

PROMPT === ARCHIVE DEST ===

SELECT DEST_ID, STATUS, ERROR FROM V$ARCHIVE_DEST;

 

PROMPT === MRP STATUS ===

SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;

 

PROMPT === ARCHIVE GAP ===

SELECT * FROM V$ARCHIVE_GAP;

 

PROMPT === FRA USAGE ===

SELECT SPACE_LIMIT, SPACE_USED FROM V$RECOVERY_FILE_DEST;

 

Data Guard Daily Monitoring Automation Script

 

Shell Script (dg_health_check.sh)

 

#!/bin/bash

 

# Set Oracle Environment

export ORACLE_SID=YOUR_DB_SID

export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1

export PATH=$ORACLE_HOME/bin:$PATH

 

# Date

DATE=$(date +"%Y-%m-%d_%H-%M")

LOGFILE=/tmp/dg_health_$DATE.log

 

echo "=========================================" >> $LOGFILE

echo " Data Guard Daily Health Check - $DATE " >> $LOGFILE

echo "=========================================" >> $LOGFILE

 

# Run SQL Checks

sqlplus -s / as sysdba<<EOF >> $LOGFILE

 

SET LINESIZE 200

SET PAGESIZE 200

 

PROMPT === DATABASE ROLE & STATUS ===

SELECT NAME, DATABASE_ROLE, OPEN_MODE, SWITCHOVER_STATUS FROM V\\$DATABASE;

 

PROMPT === DATAGUARD STATS ===

SELECT NAME, VALUE, UNIT FROM V\\$DATAGUARD_STATS;

 

PROMPT === ARCHIVE DEST STATUS ===

SELECT DEST_ID, STATUS, ERROR FROM V\\$ARCHIVE_DEST WHERE STATUS <> 'INACTIVE';

 

PROMPT === ARCHIVE GAP ===

SELECT * FROM V\\$ARCHIVE_GAP;

 

PROMPT === MANAGED STANDBY ===

SELECT PROCESS, STATUS, THREAD#, SEQUENCE# FROM V\\$MANAGED_STANDBY;

 

PROMPT === STANDBY REDO LOGS ===

SELECT GROUP#, THREAD#, SEQUENCE#, STATUS FROM V\\$STANDBY_LOG;

 

PROMPT === FLASHBACK STATUS ===

SELECT FLASHBACK_ON FROM V\\$DATABASE;

 

PROMPT === FRA USAGE ===

SELECT NAME, SPACE_LIMIT, SPACE_USED FROM V\\$RECOVERY_FILE_DEST;

 

EXIT;

EOF

 

# Run DGMGRL Checks

echo "=== DGMGRL CONFIGURATION ===" >> $LOGFILE

 

dgmgrl -silent / <<EOF >> $LOGFILE

SHOW CONFIGURATION;

DGMGRL> SHOW DATABASE VERBOSE <PRIMARY_DB>;

DGMGRL> SHOW DATABASE VERBOSE <STANDBY_DB>;

DGMGRL> VALIDATE DATABASE VERBOSE <PRIMARY_DB>;

DGMGRL> VALIDATE DATABASE VERBOSE <STANDBY_DB>;

EXIT;

EOF

 

# Basic Alert Check

echo "=== ALERT LOG ERRORS (last 50 lines) ===" >> $LOGFILE

tail -50 $ORACLE_BASE/diag/rdbms/*/*/trace/alert_*.log >> $LOGFILE

 

# Output location

echo "Health check completed. Log: $LOGFILE"


Thursday, April 2, 2026

ACE Program


Announcement! Thrilled and proud to share that I’ve been selected as an Oracle ACE Apprentice!

 


Friday, March 27, 2026

Top Oracle GoldenGate Commands Every DBA Must Master

 GoldenGate is not just replication.


It’s real-time data movement with precision.
But most DBAs only scratch the surface.
If you’re serious about production-grade replication, these are the commands you must know inside GGSCI ๐Ÿ‘‡


1. INFO ALL
Your first diagnostic command.
GGSCI> INFO ALL
Shows status of Extract, Replicat, Manager.
Also gives checkpoint lag.
๐Ÿ‘‰ If lag increases, you investigate immediately.
This is your heartbeat monitor.


2. VIEW REPORT
When something fails, this is where truth lives.
GGSCI> VIEW REPORT EXT1
Gives detailed execution logs.
Error codes, trail issues, checkpoint problems.
๐Ÿ‘‰ Never guess. Always read report files.


3. STATS
Understand throughput and performance.
GGSCI> STATS EXTRACT EXT1
Shows number of operations processed.
Insert / Update / Delete counts.
๐Ÿ‘‰ Helps validate replication correctness during testing.


4. LAG
Real-time latency check.
GGSCI> LAG EXTRACT EXT1
or
GGSCI> LAG REPLICAT REP1
๐Ÿ‘‰ Critical in real-time systems (banking, telecom).
Even seconds matter.


5. START / STOP
Basic but dangerous if misused.
GGSCI> STOP EXTRACT EXT1
GGSCI> START EXTRACT EXT1
๐Ÿ‘‰ Always coordinate with downstream systems.
Stopping Extract blindly can cause backlog explosion.


6. ALTER EXTRACT / REPLICAT
Used for repositioning.
GGSCI> ALTER EXTRACT EXT1, BEGIN NOW
or
GGSCI> ALTER REPLICAT REP1, EXTSEQNO 15, EXTRBA 12345
๐Ÿ‘‰ Used during recovery scenarios.
Requires precision. One mistake = data inconsistency.


7. INFO DETAIL
Deep dive into process state.
GGSCI> INFO EXTRACT EXT1, DETAIL
๐Ÿ‘‰ Shows checkpoint, trail file, SCN position.
Essential during troubleshooting.


8. SEND Command
Real-time interaction with running process.
GGSCI> SEND EXTRACT EXT1 STATUS
๐Ÿ‘‰ No need to stop process.
Useful for live debugging.


9. LOGDUMP (Advanced)
Not GGSCI, but critical tool.
logdump> open ./dirdat/aa
๐Ÿ‘‰ Lets you read trail files.
This is where elite DBAs operate.


10. DELETE / ADD / REGISTER
Lifecycle management.
GGSCI> ADD EXTRACT EXT1, INTEGRATED TRANLOG
GGSCI> REGISTER EXTRACT EXT1 DATABASE
๐Ÿ‘‰ Especially important in integrated capture mode.


๐Ÿ”น Quick Takeaway / Summary
GoldenGate is simple… until it breaks.
These commands are not optional—they’re survival tools.
Master them, and you control replication.
Ignore them, and replication controls you.

Sunday, March 22, 2026

Oracle database is slow

when someone says:

๐Ÿšจ “Oracle database is slow”


⚡ Oracle Performance Troubleshooting – Step-by-Step Playbook


๐Ÿง  Step 0: Stay Calm & Don’t Guess

Before doing anything:

๐Ÿ‘‰ Don’t jump into AWR immediately
๐Ÿ‘‰ Don’t restart database
๐Ÿ‘‰ Don’t assume it’s “CPU issue”

First rule:

Find the bottleneck, not the symptom


๐Ÿ” Step 1: Identify Where Time is Spent (Your Query ๐Ÿ‘‡)

SELECT event, total_waits, time_waited
FROM v$system_event
ORDER BY time_waited DESC
FETCH FIRST 5 ROWS ONLY;

๐Ÿ‘‰ What to check:

If you seeMeaning
db file sequential readIndex I/O
db file scattered readFull table scan
log file syncCommit issue
CPU time highCPU pressure
enq: TX row lockBlocking

✅ Now you know WHERE to focus


Step 2: Check Active Sessions (Real-Time)

SELECT sid, serial#, username, sql_id, event, wait_class, seconds_in_wait
FROM v$session
WHERE status = 'ACTIVE'
ORDER BY seconds_in_wait DESC;

๐Ÿ‘‰ Identify:

  • Long running sessions

  • Blocking sessions

  • Top wait events


๐Ÿšซ Step 3: Check Blocking / Locks

SELECT blocking_session, sid, serial#, wait_class, event
FROM v$session
WHERE blocking_session IS NOT NULL;

If blocking exists:

๐Ÿ‘‰ That’s your issue (most critical)


๐Ÿงฎ Step 4: Identify Top SQL (Heavy Queries)

SELECT sql_id, executions, elapsed_time/1000000 total_sec
FROM v$sql
ORDER BY elapsed_time DESC
FETCH FIRST 10 ROWS ONLY;

๐Ÿ‘‰ Look for:

  • High elapsed time

  • High executions

  • Same SQL repeated


๐Ÿ“Š Step 5: Check CPU Usage

SELECT stat_name, value
FROM v$sysstat
WHERE stat_name LIKE '%CPU%';

๐Ÿ‘‰ Or OS level:

top

If CPU is high:

  • Bad SQL

  • Missing indexes

  • High concurrency


๐Ÿ’พ Step 6: Check I/O Bottleneck

SELECT file#, phyrds, phywrts
FROM v$filestat;

๐Ÿ‘‰ OR check wait events:

  • db file sequential read → index reads

  • db file scattered read → full scans


๐Ÿ”„ Step 7: Check Redo / Commit Issues

SELECT event, time_waited
FROM v$system_event
WHERE event = 'log file sync';

๐Ÿ‘‰ If high:

  • Too many commits

  • Slow disk


๐ŸŒ Step 8: Check Network Issues

Look for:

  • SQL*Net message from client

  • SQL*Net more data

๐Ÿ‘‰ Usually application-side delay


๐Ÿ“ˆ Step 9: Check AWR (If Needed)

Only AFTER initial diagnosis:

@?/rdbms/admin/awrrpt.sql

Focus on:

  • Top SQL

  • Wait events

  • Load profile


๐Ÿงช Step 10: Drill Down into Problem SQL

SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('<SQL_ID>', NULL, 'ALLSTATS LAST'));

๐Ÿ‘‰ Check:

  • Full table scans

  • Bad joins

  • Wrong cardinality


๐Ÿš€ Step 11: Take Action (Based on Root Cause)

IssueAction
BlockingKill session / fix transaction
Bad SQLTune query / add index
CPU highOptimize SQL
I/O highAdd indexes / check storage
Redo issueReduce commits
Stats staleGather stats

๐Ÿ”ฅ Real DBA Flow (Golden Rule)

When DB is slow:

  1. Check wait events

  2. Check active sessions

  3. Check blocking

  4. Identify top SQL

  5. Drill into execution plan


๐Ÿง  Interview-Level Answer (Important)

If interviewer asks:

๐Ÿ‘‰ “Database is slow, what will you do?”

Say:

“First I check system wait events to identify where time is spent, then I analyze active sessions and blocking, followed by identifying top SQL and reviewing execution plans before taking corrective action.”


Pro Tips (From Real Production Experience)

  • 80% issues = bad SQL or missing index

  • 10% = locking

  • 10% = infrastructure (CPU/I/O)


๐ŸŽฏ Final Thought

“Performance tuning is not about fixing everything…
It’s about fixing the right thing first.”



Real-Time Troubleshooting Scenario



๐Ÿšจ Real-Time Troubleshooting Scenario (Hands-On Practice)

๐ŸŽฏ Situation

Time: 11:15 AM
Environment: Production
Users complaint:

“Application is very slow. Pages are hanging.”

You have no prior info.


⏱️ Your Goal (as DBA)

  • Identify root cause in 5–10 minutes

  • Avoid guesswork

  • Take correct action


๐Ÿ” Step 1: Where is time spent?

You run:

SELECT event, total_waits, time_waited
FROM v$system_event
ORDER BY time_waited DESC
FETCH FIRST 5 ROWS ONLY;

๐Ÿ“Š Output:

EVENT                          TIME_WAITED
----------------------------   -----------
enq: TX - row lock contention   9854321
db file sequential read         234567
CPU time                        123456

Question 1 (Your Move)

๐Ÿ‘‰ What does this indicate?

✅ Expected Thinking:

  • Top wait = row lock contention

  • This is blocking issue, not CPU or I/O


๐Ÿ”Ž Step 2: Find Blocking Session

You run:

SELECT blocking_session, sid, serial#, event
FROM v$session
WHERE blocking_session IS NOT NULL;

๐Ÿ“Š Output:

BLOCKING_SESSION   SID   EVENT
---------------   ----  -----------------------------
145               201   enq: TX - row lock contention
145               305   enq: TX - row lock contention
145               410   enq: TX - row lock contention

Question 2

๐Ÿ‘‰ What do you conclude?

✅ Answer:

  • Session 145 is blocking multiple sessions

  • This is a critical production issue


๐Ÿ” Step 3: Identify Blocking SQL

SELECT sid, serial#, username, sql_id, status
FROM v$session
WHERE sid = 145;

Output:

SID   SERIAL#   USERNAME   SQL_ID     STATUS
145   6789      APPUSER    abc123xyz  INACTIVE

Question 3

๐Ÿ‘‰ Why is this dangerous?

✅ Answer:

  • Session is INACTIVE but holding locks

  • Likely:

    • Uncommitted transaction

    • Application issue


๐Ÿ”ฅ Step 4: Business Decision Time

You have 2 options:

Option A:

๐Ÿ‘‰ Wait for user to commit/rollback

Option B:

๐Ÿ‘‰ Kill the session


Question 4

๐Ÿ‘‰ What will you do?

✅ Best Answer:

  • If production is impacted → Kill session


Step 5: Kill Blocking Session

ALTER SYSTEM KILL SESSION '145,6789' IMMEDIATE;

๐Ÿ“Š Step 6: Validate Fix

Check again:

SELECT blocking_session FROM v$session
WHERE blocking_session IS NOT NULL;

Output:

No rows selected

๐ŸŽ‰ Result

  • Blocking removed

  • Application restored

  • Issue resolved in minutes


๐Ÿง  Now Let’s Level Up (Scenario 2)


๐Ÿšจ New Situation

Users say:

“Reports are very slow”


You run:

SELECT event, time_waited
FROM v$system_event
ORDER BY time_waited DESC;

Output:

db file scattered read   9999999
CPU time                 8888888

Question 5

๐Ÿ‘‰ What does this indicate?

✅ Answer:

  • Heavy Full Table Scans

  • Likely:

    • Missing indexes

    • Bad SQL


๐Ÿ” Next Step

SELECT sql_id, elapsed_time, executions
FROM v$sql
ORDER BY elapsed_time DESC
FETCH FIRST 5 ROWS ONLY;

Question 6

๐Ÿ‘‰ What will you do next?

✅ Answer:

  • Pick top SQL_ID

  • Check execution plan:

SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('<SQL_ID>', NULL, 'ALLSTATS LAST'));

๐ŸŽฏ Final Diagnosis

  • Full table scan detected

  • Missing index


Fix

CREATE INDEX idx_col ON table_name(column_name);

๐Ÿ Final Learning

๐Ÿ”ฅ Golden Troubleshooting Flow:

  1. Check wait events

  2. Identify type of issue

  3. Find sessions / SQL

  4. Take targeted action


๐ŸŽฏ Mock Interview Question (Based on Scenario)

๐Ÿ‘‰ “How will you handle blocking in production?”

Perfect Answer:

“I first check wait events to confirm locking, identify blocking session using V$SESSION, verify the SQL and session state, and if it’s impacting users, I kill the blocking session and inform application team.”