ML Pipelines Overview

Welcome to the ML Pipelines documentation. It is organized by audience so you can quickly find what you need.

Quick Start

New to the project? Start with the Project README for a quick overview, then proceed to the Developer Getting Started Guide.

Deploying to production? See the Operations Deployment Guide.

Understanding the architecture? Read the System Architecture document at the root level (ARCHITECTURE.md).


Documentation by Role

Developers

Start here: Getting Started Guide

Daily development workflows, coding standards, and troubleshooting:

Architects

Start here: System Architecture (root level)

System design, data flow, and architectural decisions:

Jobs and Orchestration

Start here: Jobs Overview

Orchestration, scheduling, and batch job documentation:

DevOps/SRE

Start here: Deployment Guide

Deployment, infrastructure, and operational procedures:

Executives

Start here: Platform Overview

High-level understanding of platform value and governance:


Reference Documentation

Quick reference for configurations, commands, and terminology:


Common Tasks

Local Development

Deploying to Production

Monitoring Orchestration

Troubleshooting


Additional Resources

This Repository

For complete system understanding, see documentation in related repositories:

  • infra-core - Terraform infrastructure, VPC, networking, Databricks workspace setup

    • Path: /Users/taylorlaing/Development/refresh-os/infra-core/

    • Manages: VPC, subnets, security groups, service principals, Unity Catalog

  • api-core - Backend REST API services

    • Path: /Users/taylorlaing/Development/refresh-os/api-core/

    • Consumes: ML pipeline outputs (sentiment scores, features, insights)

    • Provides: API endpoints for web application

  • app-web - Frontend web application

    • Path: /Users/taylorlaing/Development/refresh-os/app-web/

    • Displays: Analytics dashboards, real-time insights from ML pipelines


Documentation Principles

This documentation follows these principles:

  1. Audience-First: Organized by who needs the information

  2. Task-Oriented: Focused on what you need to accomplish

  3. Current: Kept up to date with the actual codebase

  4. Cross-Referenced: Linked to related documents

  5. Searchable: Clear headings and consistent terminology

Need Help?

  • Documentation Issues: Create an issue in the repository

  • Technical Questions: Reach out to the development team

  • Emergency: See Troubleshooting


Last updated: October 2025
