Naming Conventions
Overview
This document provides comprehensive naming standards for all assets in the ML Pipelines project. Consistent naming improves discoverability, maintainability, and organizational standards.
General Principles
Be Descriptive: Names should clearly indicate purpose
Be Consistent: Follow the same pattern across similar resources
Be Concise: Avoid unnecessary verbosity
Use Standard Formats: Follow industry conventions (snake_case, PascalCase, kebab-case)
Include Context: Add environment/scope when needed
Unity Catalog Naming
Catalogs
Format: {environment} or {username}_{environment}
Environments:
dev- Shared development catalogstaging- Pre-production catalogprod- Production catalog{username}_sandbox- Personal developer catalog (e.g.,taylorlaing_sandbox)
Examples:
Rules:
Environment catalogs: lowercase, no underscores
Sandbox catalogs:
{username}_sandboxformatNo special characters except underscore
Schemas
Format: {layer} or {domain}
Medallion Architecture Layers:
bronze- Raw data, minimal transformationsilver- Cleaned and validated datagold- Business-ready aggregates and features
Domain Schemas:
models- ML model artifacts and metadatamonitoring- Monitoring and observability tablestesting- Test data and validation results
Examples:
Rules:
Lowercase only
No underscores in layer names
Domain names can use underscores if needed
Tables
Format: {domain}_{entity} or {descriptive_name}
Pattern: Use snake_case for multi-word names
Examples:
Bronze Layer (source system tables):
Silver Layer (processed data):
Gold Layer (business aggregates):
Rules:
snake_case format
Plural for collection tables (e.g.,
messages)Singular for dimension tables (e.g.,
user_profile)Include layer context if ambiguous
Avoid abbreviations unless standard (e.g.,
idok,msgnot ok)
Models (MLflow)
Format: {catalog}.models.{model_name}
Pattern: Descriptive name indicating model purpose
Examples:
Model Aliases:
champion- Current production modelchallenger- Model being testedarchive- Previous production model
Rules:
snake_case format
Descriptive of model's function
Internal models: descriptive name (e.g.,
sentiment_analysis)External models:
{source}_{model_name}(e.g.,roberta_base_go_emotions)
Volumes
Format: {catalog}.{schema}.{volume_name}
Pattern: Descriptive name for data storage location
Examples:
Rules:
snake_case format
Prefix with layer if helpful (e.g.,
raw_messages)Match corresponding table name when possible
Pipeline Naming
DLT Pipeline Configurations
File Format: {pipeline_name}.pipeline.yml
Pipeline Name Format: {descriptive_name} (in YAML) → {descriptive_name}-{environment} (deployed)
Examples:
File names:
Deployed pipeline names:
Rules:
File names: snake_case
Deployed names: kebab-case with environment suffix
Environment suffix added automatically by databricks.yml variables
Descriptive of pipeline's primary function
Pipeline Scripts
Format: {pipeline_name}.py
Examples:
Rules:
Match corresponding pipeline YAML file name
snake_case format
Located in
/resources/pipelines/{schema}/{pipeline_name}/
Job Naming
Job Configurations
File Format: register_{model_name}.job.yml or {job_type}_{purpose}.job.yml
Job Name Format: {resource_prefix}_{job_name} (deployed)
Examples:
File names:
Deployed job names:
Rules:
File names: snake_case
Model registration:
register_{model_name}.job.ymlInference:
inference_{model_name}.job.ymlDeployed names:
{resource_prefix}_{job_name}(prefix from databricks.yml)
Job Scripts
Format: register_{model_name}.py or {job_type}_{purpose}.py
Examples:
Rules:
Match corresponding job YAML file name
snake_case format
Located in
/resources/jobs/{category}/{job_name}/
Code Naming
Python Modules and Packages
Package Names: lowercase, no underscores
Examples:
Module Names: snake_case
Examples:
Python Variables and Functions
Variables: snake_case
Examples:
Functions: snake_case, verb-based
Examples:
Boolean Variables: Prefix with is_, has_, should_
Examples:
Python Classes
Format: PascalCase
Examples:
Rules:
Use nouns or noun phrases
Descriptive of class purpose
Avoid generic names like
Manager,Handlerunless specific
Python Constants
Format: UPPER_SNAKE_CASE
Examples:
Configuration Naming
Environment Variables
Format: UPPER_SNAKE_CASE
Examples (in databricks.yml):
Usage in code:
Spark Configuration
Format: Dot notation, lowercase
Examples:
File and Directory Naming
Python Files
Format: snake_case.py
Examples:
YAML Files
Format: kebab-case.yml
Examples:
Notebooks
Format: snake_case.ipynb or snake_case.py
Examples:
Rules:
snake_case format
Descriptive of notebook purpose
Include date for temporary notebooks (e.g.,
debug_2025_01_15.ipynb)
Directories
Format: snake_case (lowercase)
Examples:
Rules:
Lowercase only
No special characters except underscore
Descriptive of contents
Follow project structure conventions
Git Branch Naming
Format: ref-{ticket-number}-{brief-description}
Branch names are generated by Linear based on ticket ID and title.
Examples:
Rules:
Always starts with
ref-prefixTicket number from Linear
kebab-case for description
Brief but descriptive (2-5 words)
Lowercase only
Pull Request Naming
Format: ref-{ticket} | {Title}
PR titles use Linear ticket reference followed by descriptive title.
Examples:
Rules:
Always starts with
ref-{ticket}followed by|Title case for PR title
Descriptive but concise
Workspace Paths
Bundle Deployment Paths
Format: /Workspace/{location}/.bundle/{bundle_name}/{target}
Examples:
Rules:
Sandbox: Under user workspace directory
Shared environments: Under
/Shared/Bundle name from databricks.yml
Target name from deployment
S3 Bucket Paths
Format: s3://{bucket}/{catalog}/volumes/{schema}/{volume}/
Examples:
Rules:
Bucket name includes environment
Catalog-specific paths for isolation
Follow Unity Catalog volume structure
Quick Reference Table
Catalog
{environment}
dev, prod
Sandbox Catalog
{username}_sandbox
taylorlaing_sandbox
Schema
{layer}
bronze, silver, gold, models
Table
{domain}_{entity}
sentiment_analysis
Model
{model_name}
sentiment_analysis
Pipeline File
{name}.pipeline.yml
sentiment_analysis.pipeline.yml
Job File
register_{model}.job.yml
register_sentiment_analysis.job.yml
Python File
{name}.py
sentiment_analysis.py
Class
PascalCase
SentimentAnalysisModel
Function
snake_case
calculate_sentiment_score
Variable
snake_case
sentiment_score
Constant
UPPER_SNAKE_CASE
MAX_MESSAGE_LENGTH
Branch
{type}/{description}
feature/add-emotion-detection
Validation Checklist
Before merging code, verify naming follows conventions:
Related Documentation
Code Standards - Coding conventions and style guide
Configuration Reference - databricks.yml and configuration
CLI Commands - Makefile and Databricks CLI usage
Last updated