Monitoring and Observability
Overview
Monitoring Philosophy
Databricks Monitoring Overview
Built-in Monitoring
Pipeline Monitoring
DLT System Tables
Monitoring Pipeline Health
Alerting on Pipeline Failures
Job Monitoring
Job Run Metrics
Model Performance Monitoring
Model Serving Metrics
Model Prediction Quality
Cost Monitoring
Tracking Compute Costs
Cost Optimization Opportunities
Log Aggregation
Centralized Logging Strategy
Accessing Logs
Alert Configuration
Alerting Strategy
Setting Up Alerts (Databricks SQL)
Dashboard Creation
Recommended Dashboards
1. Pipeline Health Dashboard
2. Model Performance Dashboard
3. Cost Dashboard
Key Metrics to Track
Pipeline Metrics
Metric | Description | Target | Alert Threshold
Model Metrics
Metric | Description | Target | Alert Threshold
Data Metrics
Metric | Description | Target | Alert Threshold
Production Monitoring Checklist
Monitoring Tools and Integrations
Databricks Native Tools
External Tools (Recommended)
Integration Examples
Troubleshooting Monitoring Issues
Alerts Not Triggering
Dashboard Loading Slowly
Missing Metrics
Related Documentation
Last updated