Delta Live Tables (DLT) Pipeline Development Guide
Overview
DLT Benefits
Why Use DLT?
When to Use DLT
Pipeline Structure (Bronze/Silver/Gold)
Medallion Architecture
Directory Structure
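One plausible layout, shown as a sketch (directory and file names here are assumptions, not a prescribed structure):

```
pipelines/
  bronze/
    ingest_events.py
  silver/
    transform_events.py
  gold/
    aggregates.py
table_schemas/
  bronze.py
  silver.py
```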
Schema Definitions Using table_schemas Package
Bronze Schema Example
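As a sketch of how such a module might look (the field names are illustrative assumptions, not the project's actual columns):

```python
# Hypothetical module: table_schemas/bronze.py
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

# Illustrative bronze schema; real tables define their own columns.
BRONZE_EVENTS_SCHEMA = StructType([
    StructField("event_id", StringType(), nullable=False),
    StructField("event_type", StringType(), nullable=True),
    StructField("payload", StringType(), nullable=True),
    StructField("ingested_at", TimestampType(), nullable=True),
    # Populated by Auto Loader when rescue mode is enabled (see Error Handling).
    StructField("_rescued_data", StringType(), nullable=True),
])
```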
Using Schemas in DLT
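A minimal sketch of a bronze table that consumes the centralized schema; the import path, table name, and source path are assumptions:

```python
import dlt

from table_schemas.bronze import BRONZE_EVENTS_SCHEMA  # hypothetical import path

@dlt.table(
    name="bronze_events",
    comment="Raw events with the centrally defined bronze schema.",
    schema=BRONZE_EVENTS_SCHEMA,
)
def bronze_events():
    # Auto Loader ingest; the volume path is a placeholder.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/raw/events/")
    )
```

Declaring `schema=` on the table keeps the target definition pinned to the package rather than to whatever Auto Loader infers.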
Benefits of Centralized Schemas
Data Quality with Expectations
Expectation Types
Common Expectations
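A sketch showing all three expectation severities on one table (rule and table names are illustrative):

```python
import dlt

@dlt.table(name="silver_events")
# Warn: violating rows are kept but counted in pipeline metrics.
@dlt.expect("has_event_type", "event_type IS NOT NULL")
# Drop: violating rows are removed from the target table.
@dlt.expect_or_drop("has_event_id", "event_id IS NOT NULL")
# Fail: a single violating row stops the update.
@dlt.expect_or_fail("no_future_events", "ingested_at <= current_timestamp()")
def silver_events():
    return dlt.read_stream("bronze_events")
```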
Monitoring Expectations
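Expectation results are recorded in the pipeline event log. A hedged sketch of pulling them out with the `event_log()` table-valued function, following Databricks' documented event-log layout (the fully qualified table name is a placeholder):

```python
# Expectation metrics live under details:flow_progress.data_quality.
expectation_metrics = spark.sql("""
    SELECT
      timestamp,
      details:flow_progress.data_quality.expectations AS expectations
    FROM event_log(TABLE(main.pipeline_schema.silver_events))
    WHERE event_type = 'flow_progress'
      AND details:flow_progress.data_quality.expectations IS NOT NULL
""")
expectation_metrics.show(truncate=False)
```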
ai_query Integration Patterns
The ai_query Challenge
Two-Stage AI Processing Pattern
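A minimal sketch of the pattern, assuming a hypothetical serving endpoint named `my-endpoint` and a placeholder response schema: stage one persists the raw `ai_query` response as a plain string so model calls never fail on parsing, and stage two parses that string separately.

```python
import dlt
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

# Stage 1: call the model and keep the raw response as a string.
@dlt.table(name="silver_ai_raw")
def silver_ai_raw():
    return dlt.read_stream("bronze_events").withColumn(
        "ai_response",
        F.expr("ai_query('my-endpoint', payload)"),  # endpoint name is a placeholder
    )

# Stage 2: parse against an explicit schema; unparseable rows yield NULL
# instead of failing the pipeline.
AI_RESPONSE_SCHEMA = StructType([
    StructField("label", StringType()),
    StructField("confidence", DoubleType()),
])

@dlt.table(name="silver_ai_parsed")
def silver_ai_parsed():
    return dlt.read_stream("silver_ai_raw").withColumn(
        "parsed", F.from_json("ai_response", AI_RESPONSE_SCHEMA)
    )
```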
Benefits of Two-Stage Pattern
AI Query Configuration
Handling AI Query Errors
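Building on the sketch above, one hedged way to handle parse failures is an expectation that keeps bad responses out of gold while the raw strings stay queryable upstream for inspection and replay:

```python
import dlt

@dlt.table(name="gold_ai_scored")
# Rows whose response failed to parse are dropped here but remain
# available in silver_ai_raw.
@dlt.expect_or_drop("response_parsed", "parsed IS NOT NULL")
def gold_ai_scored():
    return dlt.read_stream("silver_ai_parsed")
```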
Streaming vs Batch Pipelines
Streaming Pipelines
Batch Pipelines
Hybrid Approach
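A sketch of the hybrid shape: `dlt.read_stream()` keeps the silver hop incremental, while `dlt.read()` makes the gold aggregate a fully recomputed batch table (table and column names are illustrative):

```python
import dlt
from pyspark.sql import functions as F

# Streaming silver table: processes only new bronze rows each update.
@dlt.table(name="silver_events_stream")
def silver_events_stream():
    return dlt.read_stream("bronze_events").where("event_id IS NOT NULL")

# Batch gold table: recomputed in full, which aggregations tolerate well.
@dlt.table(name="gold_daily_counts")
def gold_daily_counts():
    return (
        dlt.read("silver_events_stream")
        .groupBy(F.to_date("ingested_at").alias("event_date"))
        .count()
    )
```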
Catalog and Schema References
Dynamic Catalog Resolution
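A sketch of resolving the catalog from pipeline configuration; the configuration key and table names are assumptions, and the same pattern serves cross-catalog reads:

```python
import dlt

# Pipeline configuration values surface through spark.conf; the key name
# ("mypipeline.source_catalog") is set in the pipeline settings.
source_catalog = spark.conf.get("mypipeline.source_catalog", "dev_catalog")

@dlt.table(name="bronze_orders")
def bronze_orders():
    # Cross-catalog read resolved once at pipeline start.
    return spark.read.table(f"{source_catalog}.sales.orders_raw")
```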
SQL Catalog References
Cross-Catalog Reads
Error Handling
Rescue Columns (Bronze Layer)
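A sketch of enabling Auto Loader's rescue mode so nonconforming data lands in the `_rescued_data` column rather than failing the load (path and table name are placeholders):

```python
import dlt

@dlt.table(name="bronze_events_rescued")
def bronze_events_rescued():
    # In rescue mode, data that does not match the inferred or declared
    # schema is preserved in _rescued_data instead of being dropped.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaEvolutionMode", "rescue")
        .load("/Volumes/main/raw/events/")
    )
```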
Quarantine Tables
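A sketch of the split-on-a-flag quarantine pattern (rules and table names are illustrative): rows are flagged in a view rather than dropped, then routed into clean and quarantine tables downstream:

```python
import dlt
from pyspark.sql import functions as F

RULES = {"valid_event_id": "event_id IS NOT NULL"}
QUARANTINE_CONDITION = " AND ".join(f"({r})" for r in RULES.values())

# Flag every row instead of dropping violations.
@dlt.view(name="silver_events_flagged")
def silver_events_flagged():
    return dlt.read_stream("bronze_events").withColumn(
        "is_valid", F.expr(QUARANTINE_CONDITION)
    )

@dlt.table(name="silver_events_clean")
def silver_events_clean():
    return dlt.read_stream("silver_events_flagged").where("is_valid").drop("is_valid")

@dlt.table(name="silver_events_quarantine")
def silver_events_quarantine():
    return dlt.read_stream("silver_events_flagged").where("NOT is_valid")
```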
Testing DLT Pipelines
Unit Testing Schema Definitions
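A pytest sketch against the hypothetical `table_schemas` module from the schema examples above:

```python
from pyspark.sql.types import StringType

from table_schemas.bronze import BRONZE_EVENTS_SCHEMA  # hypothetical import path

def test_bronze_events_schema_has_required_fields():
    names = [f.name for f in BRONZE_EVENTS_SCHEMA.fields]
    assert "event_id" in names
    assert "_rescued_data" in names

def test_event_id_is_non_nullable_string():
    field = BRONZE_EVENTS_SCHEMA["event_id"]
    assert isinstance(field.dataType, StringType)
    assert field.nullable is False
```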
Integration Testing Pipelines
Testing in Sandbox
Common Pitfalls and Solutions
1. Schema Evolution Failures
2. Catalog Variable Not Propagating
3. Streaming State Conflicts
4. ai_query Schema Errors
5. Memory Errors with Large Batches
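One hedged mitigation is to cap the work Auto Loader admits per micro-batch; the option names are standard `cloudFiles` options, but the values here are illustrative, not tuned recommendations:

```python
import dlt

@dlt.table(name="bronze_events_throttled")
def bronze_events_throttled():
    # Bound files and bytes per micro-batch to keep memory predictable.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.maxFilesPerTrigger", 100)
        .option("cloudFiles.maxBytesPerTrigger", "1g")
        .load("/Volumes/main/raw/events/")
    )
```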
6. Slow Performance
Best Practices
1. Pipeline Organization
2. Naming Conventions
3. Comments and Documentation
4. Performance Optimization
5. Data Quality Strategy
Pipeline Configuration Reference
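For orientation, a sketch of pipeline settings as the JSON-style payload accepted by the Databricks pipelines API, written here as a Python dict (names, paths, and the configuration key are placeholders):

```python
pipeline_settings = {
    "name": "events_pipeline",
    "catalog": "main",            # Unity Catalog target catalog
    "target": "pipeline_schema",  # target schema for published tables
    "continuous": False,          # triggered rather than continuous mode
    "configuration": {
        # Surfaces in pipeline code via spark.conf.get(...)
        "mypipeline.source_catalog": "dev_catalog",
    },
    "libraries": [{"notebook": {"path": "/Repos/team/pipelines/events"}}],
}
```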
Related Documentation
External Resources