ADR-001: Four-Tier Deployment Architecture
Status: Accepted
Date: 2025-09-15
Decision Makers: CTO
Technical Story: Initial platform architecture design
Context
The ML pipelines platform needed a deployment architecture that would:
Enable rapid developer iteration without conflicts
Provide a shared integration testing environment
Support pre-production validation identical to production
Ensure production stability and safety
Key constraints:
Multiple developers working simultaneously
Need for isolated experimentation
Regulatory requirements for production data protection
Cost sensitivity (can't duplicate entire environments per developer)
Unity Catalog supports catalog-level isolation
Decision
Implement a four-tier deployment architecture with environment-specific Unity Catalog catalogs:
Sandbox: {username}_sandbox - Individual developer catalogs
Dev: dev - Shared development environment
Staging: staging - Pre-production validation
Prod: prod - Production workloads
Each environment has:
Dedicated catalog for data isolation
Environment-specific service principal for CI/CD
Workspace deployment (sandbox uses ref-dev workspace, others have dedicated workspaces)
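As an illustration of the catalog naming convention above, the following minimal sketch resolves an environment name to its Unity Catalog catalog. The function name catalog_for and the error handling are assumptions made for this example, not part of the platform code; username sanitization is not covered by this ADR and is left out.

```python
from typing import Optional

# Shared environments map one-to-one onto catalog names (from the Decision section).
SHARED_CATALOGS = {"dev": "dev", "staging": "staging", "prod": "prod"}


def catalog_for(environment: str, username: Optional[str] = None) -> str:
    """Return the catalog name for an environment (hypothetical helper)."""
    if environment == "sandbox":
        if not username:
            raise ValueError("sandbox catalogs require a username")
        # Sandbox catalogs follow the {username}_sandbox pattern.
        return f"{username}_sandbox"
    if environment not in SHARED_CATALOGS:
        raise ValueError(f"unknown environment: {environment!r}")
    return SHARED_CATALOGS[environment]
```

For example, catalog_for("sandbox", "alice") yields alice_sandbox, while catalog_for("prod") yields prod.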
Consequences
Positive
Zero developer conflicts: Each developer has isolated sandbox catalog
Cost efficient: Sandboxes read shared data from dev (no duplication)
Clear promotion path: Sandbox → Dev → Staging → Prod
Production safety: Multiple validation stages before prod
Governance compliant: Clear data ownership and access controls
Scalable: Adding developers doesn't impact others
Negative
Catalog proliferation: One sandbox per developer
Added complexity: More environments to manage than a two-tier or three-tier setup
Permission management: Need to configure access for each environment
Cost: More environments mean more infrastructure (mitigated by serverless compute)
Neutral
Learning curve: Developers need to understand 4 environments
Documentation: Requires clear documentation (addressed)
Cleanup: Sandbox catalogs need periodic cleanup (addressed in runbooks)
Alternatives Considered
Option 1: Three-Tier (Dev → Staging → Prod)
Pros:
Simpler (one less tier)
Industry standard for many teams
Cons:
No developer isolation
Shared dev environment causes conflicts
Slower iteration cycles
Why rejected: Developer conflicts and the lack of isolation would significantly reduce development velocity. The cost of an additional tier (sandbox) is minimal with Unity Catalog's zero-copy data sharing.
Option 2: Two-Tier (Dev → Prod)
Pros:
Simplest architecture
Minimal infrastructure
Cons:
No pre-production validation
High risk of prod issues
No developer isolation
Why rejected: Insufficient safety for production deployments. Staging is critical for validating changes before prod.
Option 3: Per-Developer Full Environments
Pros:
Complete isolation
Each developer has own dev/staging/prod
Cons:
Extremely expensive (3 environments × N developers)
Data duplication issues
Complex to manage
Why rejected: Cost prohibitive and unnecessary complexity. Sandbox + shared dev provides sufficient isolation at much lower cost.
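To make the cost argument concrete, the sketch below compares environment counts for Option 3 against the chosen design. It counts environments only; actual cost per environment is not stated in this ADR, and the function names are hypothetical.

```python
from typing import Tuple


def per_developer_environments(n_developers: int) -> int:
    """Option 3: every developer gets a full dev/staging/prod set."""
    return 3 * n_developers


def four_tier_footprint(n_developers: int) -> Tuple[int, int]:
    """Chosen design: (shared environments, sandbox catalogs)."""
    # dev, staging, and prod are shared; each developer adds one sandbox catalog.
    return 3, n_developers
```

With 10 developers, Option 3 needs 30 full environments, whereas the four-tier design needs 3 shared environments plus 10 sandbox catalogs that read from dev without copying data.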
Implementation Notes
Sandbox Data Sharing:
Sandbox reads from dev.bronze.* and dev.silver.* (no data copy)
Sandbox writes to {username}_sandbox.gold.* (isolated)
Implemented via Unity Catalog permissions
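One way the sandbox permissions described above could be expressed is as Unity Catalog GRANT statements generated per developer. This is a sketch under assumptions: the exact privilege set (USE CATALOG, USE SCHEMA, SELECT, CREATE TABLE, MODIFY) and the helper name sandbox_grants are illustrative choices, not the grants the platform is confirmed to use.

```python
from typing import List


def sandbox_grants(username: str, principal: str) -> List[str]:
    """Build GRANT statements for one developer's sandbox (hypothetical helper)."""
    sandbox = f"{username}_sandbox"
    who = f"`{principal}`"
    stmts = [f"GRANT USE CATALOG ON CATALOG dev TO {who}"]
    # Read-only access to the shared bronze and silver schemas in dev.
    for schema in ("bronze", "silver"):
        stmts.append(f"GRANT USE SCHEMA ON SCHEMA dev.{schema} TO {who}")
        stmts.append(f"GRANT SELECT ON SCHEMA dev.{schema} TO {who}")
    # Read and write access to the developer's own gold schema (assumed privilege set).
    stmts.append(f"GRANT USE CATALOG ON CATALOG {sandbox} TO {who}")
    for privilege in ("USE SCHEMA", "SELECT", "CREATE TABLE", "MODIFY"):
        stmts.append(f"GRANT {privilege} ON SCHEMA {sandbox}.gold TO {who}")
    return stmts
```

The generated statements grant no write privilege on dev, which matches the read-from-dev, write-to-sandbox split described above.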
Environment Promotion:
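The promotion path Sandbox → Dev → Staging → Prod can be sketched as an ordered list with a check that a change only moves to the next tier. Disallowing skipped tiers is an assumption of this example; the ADR states the path but not how it is enforced, and the names below are hypothetical.

```python
from typing import Optional

# Promotion order taken from the stated path: Sandbox → Dev → Staging → Prod.
PROMOTION_ORDER = ["sandbox", "dev", "staging", "prod"]


def next_environment(current: str) -> Optional[str]:
    """Return the tier that follows `current`, or None for prod."""
    idx = PROMOTION_ORDER.index(current)  # raises ValueError for unknown tiers
    if idx + 1 < len(PROMOTION_ORDER):
        return PROMOTION_ORDER[idx + 1]
    return None


def can_promote(source: str, target: str) -> bool:
    """Allow promotion only to the immediately following tier (assumed rule)."""
    return next_environment(source) == target
```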
Service Principal Isolation:
ml-pipelines-dev: Full access to dev catalog
ml-pipelines-staging: Full access to staging catalog
ml-pipelines-prod: Full access to prod catalog
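The one-principal-per-catalog rule can be captured in a small lookup, for instance as a pre-deployment guard in CI/CD. The function may_deploy is a hypothetical helper for illustration; actual enforcement would come from Unity Catalog permissions.

```python
# Service principal for each shared catalog, as listed above.
SERVICE_PRINCIPALS = {
    "dev": "ml-pipelines-dev",
    "staging": "ml-pipelines-staging",
    "prod": "ml-pipelines-prod",
}


def may_deploy(principal: str, catalog: str) -> bool:
    """True only when the principal is the one assigned to the catalog."""
    return SERVICE_PRINCIPALS.get(catalog) == principal
```

Under this mapping, may_deploy("ml-pipelines-dev", "prod") is False, so a dev pipeline credential cannot be used against the prod catalog.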
Cost Optimization:
Sandbox/dev/staging pipelines PAUSED by default
Production pipelines UNPAUSED (continuous)
Serverless compute for all environments
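The pause defaults above reduce to a single rule: only production runs continuously. A minimal sketch, assuming the status values are the strings PAUSED and UNPAUSED used in this section (the function name is hypothetical):

```python
ENVIRONMENTS = ("sandbox", "dev", "staging", "prod")


def default_pause_status(environment: str) -> str:
    """Return the default pipeline pause status for an environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    # Only production pipelines run continuously; all others start paused.
    return "UNPAUSED" if environment == "prod" else "PAUSED"
```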
Related Decisions
References
Internal architecture discussions (Sept 2025)