🔧 1. Configure Metric Thresholds (First & Most Important)
👉 This is where most alert noise comes from.
📍 Navigation:
Targets → Databases → Select your DB → Monitoring → Metric and Collection Settings
🪜 Steps:
Search for a metric (e.g., CPU Utilization, Tablespace Used (%))
Click Edit (pencil icon)
Set thresholds properly:
Warning: e.g., 80%
Critical: e.g., 90%
Set Occurrences (very important):
Example: Trigger only if 3 consecutive collections breach the threshold
Click OK → Save
💡 Tip:
Use “Occurrences > 1” so a single transient spike does not raise an alert (this reduces flapping)
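To see why Occurrences matters, the consecutive-breach idea can be sketched in a few lines of Python. This is only an illustrative model, not OEM's internal logic; the function name `first_alert_index`, the `>=` comparison, and the sample readings are assumptions made for the example.

```python
from typing import Optional, Sequence


def first_alert_index(samples: Sequence[float], threshold: float,
                      occurrences: int) -> Optional[int]:
    """Return the index of the collection that raises the alert, or None.

    An alert is raised only once `occurrences` consecutive samples
    are at or above `threshold`.
    """
    streak = 0
    for i, value in enumerate(samples):
        # Any sample below the threshold resets the streak.
        streak = streak + 1 if value >= threshold else 0
        if streak >= occurrences:
            return i
    return None


# Hypothetical CPU utilization readings, one per collection interval.
cpu = [85, 60, 91, 70, 92, 93, 95]
spike_alert = first_alert_index(cpu, threshold=90, occurrences=1)
sustained_alert = first_alert_index(cpu, threshold=90, occurrences=3)
```

With Occurrences = 1 the single spike at the third reading already alerts; with Occurrences = 3 the alert waits until the last reading, after three breaches in a row.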
🔁 2. Configure Incident Rules (Event Grouping / Compression)
👉 This is the real “alert compression engine”
📍 Navigation:
Setup → Incidents → Incident Rules
🪜 Steps:
Click Create Rule Set
Name it (e.g., DB_ALERT_COMPRESSION)
Click Create Rule
🔹 Rule Configuration:
Condition:
Target Type = Database Instance
Severity = Critical / Warning
Actions:
✔ Create Incident
✔ Add to existing open incident (IMPORTANT)
✔ Set Incident Priority
👉 This ensures:
Same issue → 1 incident instead of many alerts
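The effect of "Add to existing open incident" can be pictured with a small Python model. It is a sketch under assumptions, not how OEM stores incidents: the `Incident` class, the `group_events` function, and the choice of (target, metric) as the grouping key are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Incident:
    target: str
    metric: str
    events: List[str] = field(default_factory=list)


def group_events(events: List[Tuple[str, str, str]]) -> Dict[Tuple[str, str], Incident]:
    """Attach each (target, metric, message) event to one open incident per key."""
    open_incidents: Dict[Tuple[str, str], Incident] = {}
    for target, metric, message in events:
        key = (target, metric)
        # Reuse the open incident for this key instead of creating a new one.
        incident = open_incidents.setdefault(key, Incident(target, metric))
        incident.events.append(message)
    return open_incidents


# Hypothetical events: two for the same tablespace issue, one for CPU.
events = [
    ("ORCL", "Tablespace Used (%)", "USERS at 91%"),
    ("ORCL", "Tablespace Used (%)", "USERS at 93%"),
    ("ORCL", "CPU Utilization", "CPU at 95%"),
]
incidents = group_events(events)
```

Three events collapse into two incidents here, because both tablespace events land on the same open incident.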
🔕 3. Configure Notification Rules (Avoid Spam Emails)
📍 Navigation:
Setup → Notifications → Notification Rules
🪜 Steps:
Click Create
Define:
Target (DB / Host)
Event Type (Metric Alert)
Severity (Critical only recommended)
Configure:
Send email only for Critical
Suppress Warning alerts (optional)
💡 Pro Tip:
Send:
Warning → Dashboard only
Critical → Email/SMS
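The routing in the Pro Tip amounts to a severity-to-channel lookup. A minimal Python sketch follows; the channel names and the `channels_for` helper are hypothetical, and falling back to the dashboard for unknown severities is an assumption rather than OEM behavior.

```python
from typing import List

# Hypothetical channel names; map them to whatever your site actually uses.
ROUTES = {
    "CRITICAL": ["email", "sms"],
    "WARNING": ["dashboard"],
}


def channels_for(severity: str) -> List[str]:
    """Return delivery channels for a severity.

    Unknown severities fall back to the dashboard only (an assumption).
    """
    return ROUTES.get(severity.upper(), ["dashboard"])
```

For example, `channels_for("Critical")` returns the email and SMS channels, while `channels_for("Warning")` returns only the dashboard.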
⛔ 4. Configure Blackouts (Maintenance Mode)
👉 Prevent alerts during planned work
📍 Navigation:
Enterprise → Monitoring → Blackouts
🪜 Steps:
Click Create Blackout
Select Target (DB / Host)
Define:
Duration (e.g., 2 hours)
Enable:
✔ Stop monitoring
✔ Suppress alerts
✅ Result:
No alerts during:
Patching
Restart
Maintenance
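Conceptually, a blackout is a time window during which events for the target are ignored. The Python sketch below models only that window check; the `in_blackout` function and the dates are made up for illustration, and treating the end of the window as exclusive is an assumption.

```python
from datetime import datetime, timedelta


def in_blackout(event_time: datetime, start: datetime, duration: timedelta) -> bool:
    """True if the event falls inside the window [start, start + duration)."""
    return start <= event_time < start + duration


# Hypothetical 2-hour patching window starting at 22:00.
start = datetime(2024, 1, 6, 22, 0)
window = timedelta(hours=2)
during_patching = in_blackout(datetime(2024, 1, 6, 23, 15), start, window)
after_window = in_blackout(datetime(2024, 1, 7, 0, 0), start, window)
```

An event at 23:15 is suppressed, while one at midnight falls outside the window and would alert normally.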
🔄 5. Configure Corrective Actions (Auto-Healing)
👉 Stops repeated alerts automatically
📍 Navigation:
Targets → Databases → Select your DB → Monitoring → Metric and Collection Settings
🪜 Steps:
Select a metric (e.g., Listener Down)
Click Edit
Go to Corrective Actions
Add script:
Example:
lsnrctl start
💡 Result:
Issue auto-fixed
Alert doesn’t repeat
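A bare `lsnrctl start` will also run when the listener is already up, so a corrective script often checks first. The Python sketch below shows one way to do that. It assumes the corrective action can call a script on the host, and it assumes `lsnrctl status` exits non-zero when the listener is down; verify that on your platform before relying on it. The function names are hypothetical.

```python
import subprocess
from typing import Callable, List


def ensure_listener_running(run: Callable[[List[str]], int]) -> str:
    """Start the listener only if the status check reports a problem.

    `run` executes a command and returns its exit code. Assumption: a
    non-zero exit code from `lsnrctl status` means the listener is down.
    """
    if run(["lsnrctl", "status"]) == 0:
        return "already running"
    if run(["lsnrctl", "start"]) == 0:
        return "started"
    return "start failed"


def run_command(cmd: List[str]) -> int:
    """Real runner: execute the command and return its exit code."""
    return subprocess.run(cmd, capture_output=True).returncode
```

On the database host this would be invoked as `ensure_listener_running(run_command)`. Passing the runner in as a parameter keeps the check-then-start logic testable without a real listener.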
🔍 6. Enable Event De-duplication & Correlation
👉 Mostly automatic but configurable
📍 Navigation:
Setup → Incidents → Incident Rules → Advanced Settings
🪜 Steps:
Enable:
✔ Event de-duplication
✔ Event correlation
Define time window (e.g., 5–10 mins)
💡 Example:
Same alert every minute
➡️ Only one incident shown
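The time-window idea can be sketched as follows. This is a simplified Python model, not OEM's actual de-duplication algorithm; the `deduplicate` function, the string key format, and the rule that a new window starts once the previous one has elapsed are all assumptions.

```python
from typing import Dict, List, Tuple


def deduplicate(events: List[Tuple[int, str]], window_seconds: int) -> List[Tuple[int, str]]:
    """Keep an event only if its key was not kept within the last `window_seconds`.

    Each event is (timestamp_in_seconds, key); events are assumed sorted by time.
    """
    last_kept: Dict[str, int] = {}
    kept = []
    for ts, key in events:
        previous = last_kept.get(key)
        if previous is None or ts - previous >= window_seconds:
            kept.append((ts, key))
            last_kept[key] = ts
    return kept


# The same alert every minute for ten minutes, with a 5-minute window.
alerts = [(minute * 60, "ORCL:CPU Utilization") for minute in range(10)]
shown = deduplicate(alerts, window_seconds=300)
```

Ten raw alerts shrink to two entries in this model, one per 5-minute window.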
📊 7. Validate Configuration
📍 Navigation:
Enterprise → Monitoring → Incidents
Check:
Alerts grouped properly
No duplicate incidents
Reduced alert count
🎯 Real Production Setup (Recommended)
| Feature | Setting |
|---|---|
| Threshold Occurrence | 3 |
| Incident Grouping | Enabled |
| Notifications | Critical only |
| Blackouts | Mandatory |
| Auto-healing | Enabled |
🧠 Interview-Ready Answer
👉 “How do you reduce alert noise in OEM?”
Answer:
“I tune metric thresholds with occurrence settings, configure incident rules to group alerts, use notification filtering to avoid unnecessary emails, apply blackouts during maintenance, and enable corrective actions to auto-resolve recurring issues.”
🚀 Final Thought
“OEM is powerful… but without tuning, it becomes noisy.
A good DBA makes OEM quiet but intelligent.”