Medical Lab Achieves 50% Compute Savings via Databricks Migration

Sagis Diagnostics, a leading U.S. pathology lab, replaced its fragmented Azure SQL setup with a unified Databricks Lakehouse built by Dataforest. The migration consolidated 21 data sources, automated analytics, and ensured HIPAA compliance — delivering full data transparency, pay-per-use efficiency, and a ~50% reduction in compute costs.

~50% compute cost reduction through optimized architecture

21 integrated data sources unified under Medallion Architecture

3 Genie spaces deployed for self-service BI

Sagis Diagnostics, a US-based, physician-owned subspecialty diagnostic pathology laboratory, partners with healthcare providers and insurance companies to deliver precise diagnostic data analysis and pathology services.

Python

Spark

Azure SQL

Databricks

Genie (LLM)

THE CHALLENGE

Migrating from Azure SQL to Databricks for Scalability, Compliance, and Cost Efficiency

Sagis Diagnostics needed to migrate from a legacy Azure SQL Server environment to Databricks to unify diagnostics and billing data, enable advanced analytics, and ensure compliance with healthcare data standards. 

The Azure system covered only 20–25% of the required functionality, lacked scalability for growing data volumes, and cost around $20,000 per year while utilizing only a small portion of its capacity.

The new solution also needed to support long-term data growth, improve observability, and consolidate all BI, AI, and compliance workflows into a single Lakehouse platform.

Transform Legacy SQL Scripts into Functional Jobs

The previous environment relied on static SQL scripts that lacked automation and consistency across data workflows. This limited scalability and increased maintenance overhead.

Ensure Data Compliance in Databricks (Patient Data)

Sagis Diagnostics processes sensitive patient data, requiring strict compliance with HIPAA and healthcare data protection standards.

Achieve Full Observability Across Data Pipelines

Before the migration, the client lacked visibility into how data was transformed, validated, and consumed, making it difficult to track issues or verify accuracy. This limited visibility, from ingestion through to BI/AI output, created blind spots around data freshness, schema drift, quality regressions, and the downstream blast radius of any change.

Implementation Challenges During Migration

At the start of the project, there was no clear documentation of the existing Azure setup. The client also faced limited access rights, missing connectors, and Databricks platform updates that required continuous adaptation and consultation with Databricks support.

THE SOLUTION

Unified Migration to a Modern, Compliant Lakehouse Architecture

We migrated all data and pipelines from Azure SQL to a Databricks-based enterprise data warehouse, implemented a Medallion Architecture (Bronze/Silver/Gold), and rewrote legacy SQL scripts into automated, production-ready Databricks jobs. The new platform provides governed data storage, real-time observability, and cost-efficient compute scaling.

Automated Databricks Job Conversion

All legacy SQL scripts were converted into fully automated Databricks jobs with error handling, scheduling, and integration into the business logic layer. 

Key Deliverables:

  • Delta Lake-based medallion architecture with CDC ingestion
  • Production-ready Databricks environment
  • 3-tier medallion pipeline (Bronze/Silver/Gold) processing 30+ tables
  • 3 fully functional LLM (Genie) spaces tailored for AI/BI business needs
  • Cost-monitoring dashboard for precise compute control
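To make the Bronze/Silver/Gold flow with CDC ingestion more concrete, the sketch below walks a handful of change events through the three layers in plain Python. It is an illustration under stated assumptions, not the project's code: the event format, the `claim_id`, `amount`, and `status` columns, and the sample values are all hypothetical. On Databricks, the Silver step would more typically be expressed as a Delta Lake `MERGE` rather than a Python dictionary.

```python
from typing import Dict, List

# Bronze: raw CDC events kept as received (hypothetical sample data).
bronze_events: List[dict] = [
    {"op": "insert", "claim_id": 1, "amount": "100.0", "status": "open", "seq": 1},
    {"op": "insert", "claim_id": 2, "amount": "250.5", "status": "open", "seq": 2},
    {"op": "update", "claim_id": 1, "amount": "100.0", "status": "denied", "seq": 3},
    {"op": "delete", "claim_id": 2, "amount": None, "status": None, "seq": 4},
    {"op": "insert", "claim_id": 3, "amount": "75.0", "status": "paid", "seq": 5},
]


def build_silver(events: List[dict]) -> Dict[int, dict]:
    """Silver: apply events in sequence order to get the current state per key."""
    state: Dict[int, dict] = {}
    for ev in sorted(events, key=lambda e: e["seq"]):
        if ev["op"] == "delete":
            state.pop(ev["claim_id"], None)
        else:  # insert and update are both treated as upserts
            state[ev["claim_id"]] = {
                "claim_id": ev["claim_id"],
                "amount": float(ev["amount"]),  # cast the raw string to a number
                "status": ev["status"],
            }
    return state


def build_gold(silver: Dict[int, dict]) -> Dict[str, float]:
    """Gold: aggregate cleaned rows into a reporting-ready total per status."""
    totals: Dict[str, float] = {}
    for row in silver.values():
        totals[row["status"]] = totals.get(row["status"], 0.0) + row["amount"]
    return totals


silver = build_silver(bronze_events)
gold = build_gold(silver)
print(gold)
```

Keeping the raw events untouched in Bronze means Silver and Gold can be rebuilt at any time if the cleaning or aggregation rules change.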
Data Lineage and Monitoring Dashboards

Dataforest implemented automated data lineage and monitoring dashboards within Databricks, featuring real-time data refresh tracking, anomaly detection, and event-based alerts. This provided full transparency, faster troubleshooting, and greater confidence in data reliability. Clear runbooks and domain-level SLOs ensured faster incident resolution, safer change management, and reliable, compliant analytics.
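The kinds of checks that sit behind refresh tracking and event-based alerts can be sketched in a few lines. The example below is a simplified, hypothetical illustration in plain Python: the 24-hour freshness window, the column names, and the alert messages are assumptions, and the actual dashboards were built inside Databricks.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def check_freshness(last_refresh: datetime, now: datetime, max_age: timedelta) -> bool:
    """Return True when the table was refreshed within the allowed window."""
    return now - last_refresh <= max_age


def detect_schema_drift(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """List human-readable differences between expected and observed columns."""
    issues = []
    for col, dtype in expected.items():
        if col not in actual:
            issues.append(f"missing column: {col}")
        elif actual[col] != dtype:
            issues.append(f"type change: {col} {dtype} -> {actual[col]}")
    for col in actual:
        if col not in expected:
            issues.append(f"unexpected column: {col}")
    return issues


# Hypothetical table metadata for one monitored table.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
last_refresh = now - timedelta(hours=30)
expected = {"claim_id": "bigint", "amount": "double", "status": "string"}
actual = {"claim_id": "bigint", "amount": "string", "payer": "string"}

alerts: List[str] = []
if not check_freshness(last_refresh, now, max_age=timedelta(hours=24)):
    alerts.append("stale table: last refresh older than 24h")
alerts.extend(detect_schema_drift(expected, actual))

for alert in alerts:
    print(alert)
```

Each alert string could feed a dashboard tile or a notification channel; which one is a design choice the case study does not specify.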

Compliance-First Databricks Environment

Patient data used for AI/BI and ML training was fully anonymized to maintain HIPAA compliance while enabling advanced analytics.
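The case study does not describe how the anonymization was performed. One common building block is to drop direct identifiers and replace the patient key with a keyed hash, sketched below using only the Python standard library. The field names, the sample record, and the key handling are hypothetical, and this step on its own is pseudonymization; it should not be read as a complete HIPAA de-identification procedure.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical field names; a real project would derive these from its own schema.
DIRECT_IDENTIFIERS = {"patient_name", "ssn", "phone"}
KEY_FIELD = "patient_id"


def pseudonymize(record: Dict[str, str], secret_key: bytes) -> Dict[str, str]:
    """Drop direct identifiers and replace the patient key with a keyed hash."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(secret_key, record[KEY_FIELD].encode("utf-8"), hashlib.sha256)
    cleaned[KEY_FIELD] = digest.hexdigest()
    return cleaned


# Example only: in practice the key would come from a secret store, not source code.
secret = b"example-key-keep-in-a-secret-store"
raw = {
    "patient_id": "P-001",
    "patient_name": "Jane Doe",
    "ssn": "000-00-0000",
    "phone": "555-0100",
    "diagnosis_code": "DX-EXAMPLE",
}
safe = pseudonymize(raw, secret)
print(sorted(safe))
```

Because the hash is keyed and deterministic, the same patient maps to the same token across tables, so joins still work for analytics without exposing the original identifier.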

Incremental Implementation and Knowledge Transfer

Our engineers reconstructed undocumented logic by analyzing legacy SQL patterns and rebuilding missing connectors. Access management was standardized, and an adaptive update policy was implemented to synchronize with Databricks’ frequent releases. Continuous documentation and proactive communication ensured smooth handover and maintainability.

THE RESULT

Unified, Compliant, and Scalable Data Platform with Pay-per-Use Compute, Reducing Costs from $20k to ~$10k Annually

Sagis Diagnostics migrated from Azure SQL to a Databricks-based enterprise data warehouse, unifying diagnostics and billing data in a single governed environment. The Medallion Architecture (Bronze/Silver/Gold) ensured data quality, scalability, and traceability.

All BI, AI, and ETL workflows were consolidated into Databricks, enabling transparent data management and efficient collaboration across teams. An AutoML pipeline enabled a triage denial prediction model, with automated training for underutilized claims prediction.

The result is an AI-ready Lakehouse that accelerates reporting, enhances visibility, and cuts compute costs by nearly 50% through pay-per-use pricing, establishing a future-ready foundation for predictive analytics and LLM-powered BI in healthcare.
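The headline percentage follows directly from the annual figures reported in this case study (about $20,000 before and roughly $10,000 after). A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage saved when moving from `before` to `after`."""
    return (before - after) / before * 100


annual_before = 20_000  # USD per year on Azure SQL, as stated in the case study
annual_after = 10_000   # USD per year on Databricks, approximate
saving = percent_reduction(annual_before, annual_after)
print(f"{saving:.0f}% reduction")  # prints "50% reduction"
```

Since the post-migration figure is approximate, the result should be read as "about 50%" rather than an exact value.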

Key outcomes included:
  • Integration of 21 data sources into a governed Medallion Architecture (Bronze/Silver/Gold).
  • Consolidation of all data from two vendors into one governed platform with real-time CDC ingestion.
  • Deployment of 2 dashboards, delivery of functional SQL code for 20 dashboards and widgets, and creation of 3 Genie spaces for self-service BI.
  • ML-ready feature store with automated denial prediction model, ready for evaluation and deployment.
  • Full data observability and compliance readiness with automated lineage tracking and schema documentation for future AI-driven analytics.
  • ~50% compute cost reduction through optimized, pay-per-use architecture.
Additional Value Delivered (Client Feedback):
  • Fast onboarding: Seamless adoption of the Databricks environment with guided support.
  • Proactive documentation: All pipelines and jobs were fully documented without the need for client follow-ups.
  • Engineering excellence: High technical quality, structured communication, and timely delivery ensured a smooth migration and reliable system performance.

Why Sagis Diagnostics Chose Dataforest as Their Software Development Partner

“Dataforest got us off the ground really quickly, and they even provided documentation without us having to ask for it — that was really impressive.”

How we provide data integration solutions

Step 1 of 5

Free consultation

It's a good time to get to know each other, share values, and discuss your project in detail. We will advise you on a solution and help you understand whether we are the right match for you.

Step 2 of 5

Discovering and feasibility analysis

One of our core values is flexibility, so we can work from either a one-page set of high-level requirements or a full pack of technical documentation.

At this stage, we need to ensure that we understand the full scope of the project. We either receive the following documents from you or prepare them ourselves through a set of interviews: the integration pipeline (which data we should get and where to upload it), the process logic (how the system should work), use cases and acceptance criteria, and the solution architecture. Finally, we draw up a project plan, which we follow strictly.

Step 3 of 5

Solution development

At this stage, we build the ETL pipelines and the APIs needed to automate the process. We involve our DevOps team to build the most efficient and scalable solution, and we finish with unit tests and quality assurance tests to ensure that the solution works properly. Focus on results is another of our core values.

Step 4 of 5

Solution delivery

After quality assurance tests are completed, we deliver the solution to the client. Although we have over 15 years of expertise in data engineering, we expect the client to participate in the project. While developing the integration system, we share interim results so you can always see where we are and give us feedback. A high level of communication is also one of our core values.

Step 5 of 5

Support and continuous improvement

We understand how crucial the solutions we build are for our clients. Our goal is to build long-term relationships, so we provide guarantees and support agreements. We are also always happy to assist with further development: our statistics show that 97% of our clients return to us with new projects.

Success stories

Streamlined Data Analytics

We helped a digital marketing agency consolidate and analyze data from multiple sources to generate actionable insights for their clients. Our delivery used a combination of data warehousing, ETL tools, and APIs to streamline the data integration process. The result was an automated system that collects and stores data in a data lake and utilizes BI for easy visualization and daily updates, providing valuable data insights which support the client's business decisions.
1.5 mln DB entries

4+ integrated sources

Charlie White, Senior Software Developer Team Lead, LaFleur Marketing, digital marketing agency

Their communication was great, and their ability to work within our time zone was very much appreciated.

Optimise e-commerce with modern data management solutions

An e-commerce business uses reports from multiple platforms to inform its operations but had been storing data manually in various formats, which caused inefficiencies and inconsistencies. To improve its analytical capabilities and drive decision-making, the client required an automated process for regular collection, processing, and consolidation of its data into a unified data warehouse. We streamlined the flow of the client's critical metrics data into a centralized data repository. The final solution helps the client quickly and accurately assess business performance, optimize operations, and stay ahead of the competition in the dynamic e-commerce landscape.
450k DB entries daily

10+ source integrations

Lesley D., Product Owner, E-commerce business

We are extremely satisfied with the automated and streamlined process that DATAFOREST has provided for us.

Reporting Solution for the Financial Company

Dataforest created a convenient reporting solution for a financial company that reduced manual daily operations, changed how access is shared, and maintains more than 200 reports.
1 solution to handle more than 200 reports

5 seconds to load a report

Enra Group is the UK's leading provider and distributor of specialist property finance.
