How We Automated Pipeline Orchestration to Save 40+ Hours/Month

Manual data delivery was eating up engineering time and creating reliability problems. We built an automated orchestration system that handles daily delivery to multiple clients — hands-free.

The Challenge

The client was manually preparing and delivering data files to multiple pharma partners every day. Each partner had its own security requirements, file formats, and delivery schedules. The result was an error-prone process that consumed 40+ hours of engineering time per month.

What We Built

  • Modular Python DAGs (Directed Acyclic Graphs) that standardize ETL tasks across all clients.
  • YAML-based pipeline configuration, so pipelines can be modified, scaled, and extended to new clients without code changes (see the config and DAG-factory sketch after this list).
  • Scheduled jobs (daily/historical) that join de-identified claims with tokenized tables to produce client-ready analytic datasets (the join step is sketched below as well).
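
To make this concrete, here is a minimal sketch of a pipeline config and the DAG factory that consumes it. The client names, schedules, and S3 paths are illustrative, and it assumes Airflow 2.4+ (for the `schedule` argument) with PyYAML available, as on recent MWAA versions:

```yaml
# pipelines.yaml -- illustrative; client names and paths are hypothetical
clients:
  - name: partner_a
    schedule: "0 6 * * *"          # daily at 06:00 UTC
    source_prefix: s3://warehouse/claims/
    dest_bucket: partner-a-dropbox
  - name: partner_b
    schedule: "30 7 * * *"
    source_prefix: s3://warehouse/claims/
    dest_bucket: partner-b-dropbox
```

```python
# dag_factory.py -- a minimal sketch of config-driven DAG generation.
# Assumes Airflow 2.4+ and PyYAML; task bodies are placeholders.
from datetime import datetime
from pathlib import Path

import yaml
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**context):
    """Placeholder: pull the day's de-identified claims extract."""

def transform(**context):
    """Placeholder: join claims with the client's tokenized tables."""

def deliver(**context):
    """Placeholder: ship the dataset cross-account (see Security section)."""

config = yaml.safe_load(Path(__file__).with_name("pipelines.yaml").read_text())

for client in config["clients"]:
    with DAG(
        dag_id=f"delivery_{client['name']}",
        schedule=client["schedule"],
        start_date=datetime(2024, 1, 1),
        catchup=False,                 # historical backfills run separately
        params=client,                 # config is available to every task
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_deliver = PythonOperator(task_id="deliver", python_callable=deliver)
        t_extract >> t_transform >> t_deliver

    # Expose each generated DAG at module level so Airflow's parser finds it.
    globals()[dag.dag_id] = dag
```

Onboarding a new client then means adding one YAML stanza; the factory picks it up on the next DAG parse with no Python changes.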
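The daily join step might look like the following. The column names, paths, and the use of pandas with s3fs are assumptions for illustration, not the production implementation:

```python
# Hypothetical daily join: combine de-identified claims with the client's
# token crosswalk to produce a client-ready analytic dataset.
# Assumes pandas + s3fs; all paths and column names are illustrative.
import pandas as pd

RUN_DATE = "2024-06-01"

claims = pd.read_parquet(f"s3://warehouse/claims/{RUN_DATE}/")
tokens = pd.read_parquet("s3://warehouse/tokens/partner_a/")

# Inner join keeps only claims that resolve to a token this client may see.
dataset = claims.merge(tokens, on="patient_token", how="inner")
dataset.to_parquet(f"s3://staging/partner_a/{RUN_DATE}.parquet", index=False)
```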

Security & Infrastructure

  • Secure Cross-Account Delivery: Flexible authentication patterns to meet diverse client security needs (these combine in the delivery sketch after this list):
    • AWS STS AssumeRole for temporary, least-privilege access
    • Secrets Manager integration for secure key rotation
    • Scoped IAM policies restricting access to specific S3 paths
  • Automated Event Scheduling: AWS EventBridge triggers Lambda functions for reliable, unattended data shipments (a handler sketch follows below).
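
Here is a minimal sketch of how those three patterns can combine in a single delivery call. The role ARN, secret name, and bucket/prefix are hypothetical; the inline session policy narrows the assumed role's permissions to one S3 path for the life of the session:

```python
# Hypothetical cross-account delivery: assume a partner-scoped role with a
# short-lived session, pulling the external ID from Secrets Manager and
# attaching an inline session policy that restricts access to one S3 prefix.
import json

import boto3

secrets = boto3.client("secretsmanager")
external_id = secrets.get_secret_value(
    SecretId="partner-a/external-id"       # rotated via Secrets Manager
)["SecretString"]

session_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::partner-a-dropbox/incoming/*",
    }],
}

creds = boto3.client("sts").assume_role(
    RoleArn="arn:aws:iam::123456789012:role/partner-a-delivery",
    RoleSessionName="daily-delivery",
    ExternalId=external_id,             # guards against confused-deputy misuse
    Policy=json.dumps(session_policy),  # intersects with the role's own policy
    DurationSeconds=900,                # temporary, least-privilege session
)["Credentials"]

s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
s3.upload_file("dataset.parquet", "partner-a-dropbox", "incoming/dataset.parquet")
```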
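On the scheduling side, an EventBridge rule fires on a cron expression and invokes a Lambda handler along these lines. The event shape is an assumption (EventBridge rules can pass a constant JSON input to a target), and `ship()` is a hypothetical stand-in for the delivery flow above:

```python
# Hypothetical Lambda entry point invoked by an EventBridge schedule.
# Assumes the rule passes a constant JSON input naming the client,
# e.g. {"client": "partner_a"}.
import json

def ship(client: str) -> None:
    """Placeholder for the assume-role + upload flow sketched earlier."""

def handler(event, context):
    client = event.get("client", "partner_a")
    # Structured log line lands in CloudWatch Logs for the audit trail.
    print(json.dumps({"message": "delivery started", "client": client}))
    ship(client)
    return {"status": "delivered", "client": client}
```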

Results

  • 40+ hours/month saved: secure data transfers for multiple clients now run daily with no manual effort.
  • New client onboarding reduced from days to hours: new schedules or data splits need only a configuration change, not code.
  • Improved compliance and auditability across all data deliveries.

Tech: Apache Airflow (MWAA) · Python · YAML · AWS Lambda · S3 · EventBridge · IAM