A leading podcast platform partnered with DATAFOREST to replace manual recommendations with an AI-powered personalization engine. The new system analyzes user behavior and context in real time to deliver tailored suggestions in under 0.5 seconds, handling up to 20 recommendations per second. The result was 7× higher user engagement, a better listener experience, and a significant increase in the client’s revenue.
7× higher user engagement
< 0.5 secs average recommendation delivery speed
~20 recommendations/sec throughput

Databricks
TensorFlow
Spark
PostgreSQL
Databricks Vector Search
THE CHALLENGE
The podcast platform relied on static, manually curated recommendations that couldn’t adapt to user behavior. This restricted engagement, slowed revenue growth, and left the platform behind competitors who leveraged dynamic personalization.
Recommendations weren’t responsive to user preferences, resulting in low engagement and poor discovery of new content.
The system lacked pipelines to process and unify data across sources, preventing real-time insights and consistent content delivery. It also caused data duplication: because content came from multiple sources, users often received recommendations for podcasts they had already seen on other platforms.
With no recommendation system for users without history, new listeners had a poor first experience.
The legacy approach couldn’t support rapid growth in users and content, limiting future expansion.
THE SOLUTION
We built a flexible recommendation model that processes diverse user signals in real time.
The solution architecture included four interconnected recommendation models based on two-tower architectures, each designed to process different user signals and content types.
The system delivers highly relevant podcast suggestions, improving user engagement 7× and enabling scalable growth.
We developed automated ETL pipelines that continuously collect, clean, and synchronize data from all media channels — including the website, social platforms, and radio. This eliminated data duplication and enabled instant processing for consistent, high-quality recommendations across all user touchpoints.
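As a rough illustration of the deduplication step, the sketch below collapses records that describe the same podcast but arrive from different channels. The record fields (`title`, `publisher`, `source`) and the matching rule (normalized title plus publisher) are assumptions made for this example, not the production logic.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def deduplicate(records):
    """Keep the first record seen for each (normalized title, publisher) key."""
    seen = {}
    for record in records:
        key = (normalize_title(record["title"]),
               record["publisher"].strip().lower())
        seen.setdefault(key, record)  # later duplicates are ignored
    return list(seen.values())


# Hypothetical records from three channels; the first two are the same episode.
records = [
    {"title": "Tech Talk: Episode 1", "publisher": "Studio A", "source": "website"},
    {"title": "tech talk - episode 1 ", "publisher": "studio a", "source": "social"},
    {"title": "Morning News", "publisher": "Studio B", "source": "radio"},
]
unique = deduplicate(records)
print([r["source"] for r in unique])
```

A real pipeline would more likely match on stable identifiers or fuzzy similarity, since exact matching on normalized titles misses re-titled episodes.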
For new listeners without behavioral history, we introduced a context-aware recommendation module that leverages signals like time of day, device type, location, and language. This ensured that even first-time users received meaningful, personalized recommendations — improving retention from the very first session.
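One simple way to picture a context-aware cold-start module is popularity counted per context bucket, with a fallback to a broader bucket when the exact context has no data. The sketch below uses time of day, device type, and language (location is omitted for brevity). The class name, bucket boundaries, and podcast IDs are all hypothetical.

```python
from collections import Counter, defaultdict


def time_bucket(hour: int) -> str:
    """Map an hour (0-23) to a coarse time-of-day bucket (assumed boundaries)."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "evening"


class ContextRecommender:
    """Popularity-by-context recommender for users with no history."""

    def __init__(self):
        self.by_context = defaultdict(Counter)   # (bucket, device, language) -> plays
        self.by_language = defaultdict(Counter)  # language -> plays (fallback)

    def record_play(self, podcast_id, hour, device, language):
        self.by_context[(time_bucket(hour), device, language)][podcast_id] += 1
        self.by_language[language][podcast_id] += 1

    def recommend(self, hour, device, language, k=3):
        counts = self.by_context.get((time_bucket(hour), device, language))
        if not counts:  # unseen context: fall back to language-level popularity
            counts = self.by_language.get(language, Counter())
        return [pid for pid, _ in counts.most_common(k)]


rec = ContextRecommender()
plays = [
    ("news_ar", 8, "mobile", "ar"),
    ("news_ar", 9, "mobile", "ar"),
    ("comedy_ar", 8, "mobile", "ar"),
    ("tech_en", 20, "desktop", "en"),
]
for podcast_id, hour, device, language in plays:
    rec.record_play(podcast_id, hour, device, language)

morning_mobile = rec.recommend(7, "mobile", "ar")
fallback = rec.recommend(22, "smart_speaker", "ar")
print(morning_mobile, fallback)
```

The fallback keeps first-session recommendations non-empty for a device or time slot that has not been observed yet.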
To address low adaptability to user behavior, we implemented four recommendation models built on two-tower architectures. These models analyze multiple user signals such as listening history, engagement patterns, and metadata, dynamically adjusting to user preferences and delivering relevant podcast suggestions in real time.
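The core idea of a two-tower model can be sketched in a few lines: one tower turns user features into an embedding, a second tower does the same for podcasts, and relevance is the dot product of the two. The pure-Python example below uses hand-picked weights purely for illustration. In a real system the towers would be trained neural networks (for example in TensorFlow), and every feature name and value here is an assumption.

```python
import math


def tower(features, weights):
    """One linear layer plus L2 normalization: feature vector -> embedding."""
    raw = [sum(w * x for w, x in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0
    return [v / norm for v in raw]


def score(user_vec, item_vec):
    """Dot product of two embeddings; higher means more relevant."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


# Hypothetical weights; a trained model would learn these from interactions.
USER_WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 3 user features -> 2 dims
ITEM_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]            # 2 item features -> 2 dims

# Assumed user features: share of news listening, share of comedy, session length.
user_features = [0.9, 0.1, 0.5]
# Assumed item features: news-ness, comedy-ness.
items = {"daily_news": [1.0, 0.0], "standup_hour": [0.0, 1.0]}

user_vec = tower(user_features, USER_WEIGHTS)
ranked = sorted(
    items,
    key=lambda name: score(user_vec, tower(items[name], ITEM_WEIGHTS)),
    reverse=True,
)
print(ranked)
```

Because item embeddings do not depend on the user, they can be computed ahead of time and looked up with a vector index, which is what makes low-latency serving practical with this design.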
We designed the system as a modular, multi-model framework capable of expanding with the platform’s growing user base and content library. Each recommendation model functions independently but integrates seamlessly through a centralized ranking layer, ensuring consistent performance and scalability as data volume increases.
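To illustrate how independent models might feed a centralized ranking layer, the sketch below merges per-model candidate scores into one ranked list using fixed weights. The fixed weights are a stand-in for a learned ranker, and the model names, weights, and scores are hypothetical.

```python
from collections import defaultdict


def merge_candidates(model_outputs, model_weights, k=3):
    """Combine per-model scores into one ranked list of podcast IDs."""
    combined = defaultdict(float)
    for model_name, scored_items in model_outputs.items():
        weight = model_weights.get(model_name, 0.0)  # unknown models contribute nothing
        for podcast_id, item_score in scored_items.items():
            combined[podcast_id] += weight * item_score
    # Sort by descending score, breaking ties alphabetically for stable output.
    ranked = sorted(combined.items(), key=lambda pair: (-pair[1], pair[0]))
    return [podcast_id for podcast_id, _ in ranked[:k]]


# Hypothetical outputs from two of the independent models.
outputs = {
    "history_model": {"daily_news": 0.9, "tech_weekly": 0.6},
    "context_model": {"tech_weekly": 0.8, "morning_show": 0.7},
}
weights = {"history_model": 0.5, "context_model": 0.5}

top = merge_candidates(outputs, weights)
print(top)
```

Adding a new model then only requires registering its output and weight, which is the kind of modularity the paragraph above describes.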
THE RESULT
A leading podcast platform in Saudi Arabia and the MENA region needed to replace its static, manually curated recommendations to drive growth and user satisfaction. DATAFOREST delivered a scalable AI-powered recommendation engine that personalizes podcast suggestions in real time based on user behavior, preferences, and context.
We developed a modular system with a learning-based ranking model and a real-time data pipeline to process user activity efficiently. Key challenges included scaling the architecture, integrating diverse interaction signals, handling data duplication across multiple content sources, and solving the cold start problem for new users. These were addressed by unifying data streams through real-time ETL pipelines, applying deduplication logic to eliminate repeated content, and combining behavioral data with contextual metadata (e.g., time, language, location). Recommendations now auto-update every 48 hours, ensuring ongoing relevance and eliminating manual work.
This transformation enabled the podcast platform to personalize content at scale, increase user satisfaction, boost revenue, and future-proof its product with a flexible, data-driven solution.
< 0.5 secs average recommendation delivery speed
~20 personalized recommendations processed per second
7× higher user engagement compared to the manual system (A/B tested)


