May 18, 2023
14 min

Practical Data Warehousing: Successful Cases

However smooth a plan looks in theory, practice will certainly force adjustments, because every real case has characteristics that a general approach cannot anticipate. Let's see how the world's leading brands have adapted a well-known way of storing information, data warehousing, to their needs. If you think this is your case, then arrange a call.

Global Data Warehousing Market By Application

The Reason for Making Decisions

The need to make business decisions based on data analysis has long been beyond doubt. But before the data can be analyzed, it has to be collected, sorted, and prepared.

This is what data warehousing specialists do. To see what works best, it makes sense to look at how high-quality custom builds have been assembled from this construction set.

Data warehousing interacts with a huge amount of data

A data warehouse is a digital storage system that integrates and reconciles large amounts of data from different sources. It helps companies turn data into valuable information and make informed decisions based on it. Data warehousing combines current and historical data and acts as a single source of reliable information for the business.

After extract, transform, load (ETL) processing, data enters the warehouse from operational systems, such as an enterprise resource planning (ERP) system or a customer relationship management (CRM) system. Sources also include databases, partner operational systems, IoT devices, weather apps, and social media. Infrastructure can be on-premises or cloud-based, with the latter option predominating in recent times.

Data warehousing is necessary not only for storing information, but also for processing structured and unstructured data: video, photos, sensor readings. Some data warehousing options use built-in analytics and in-memory database technology (data is held in RAM rather than on a hard drive), which provides access to reliable data in real time.

After data is sorted, it is sent to data marts for further analysis by BI or data science.
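
The flow described above (sources, ETL, warehouse, data mart) can be sketched in a few lines of Python. This is a minimal illustration rather than a production design: the sample rows, the `sales` table, and the `mart_sales_by_region` mart are all hypothetical names, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from two source systems.
CRM_ROWS = [
    {"customer": " Alice ", "amount": "120.50", "region": "eu"},
    {"customer": "Bob", "amount": "80", "region": "US"},
]
ERP_ROWS = [
    {"customer": "alice", "amount": "30.25", "region": "EU"},
]


def extract():
    """Extract: gather raw rows from every source, tagging their origin."""
    for source, rows in (("crm", CRM_ROWS), ("erp", ERP_ROWS)):
        for row in rows:
            yield source, row


def transform(source, row):
    """Transform: trim and normalize text, convert the amount to a number."""
    return (
        source,
        row["customer"].strip().title(),
        row["region"].strip().upper(),
        float(row["amount"]),
    )


def load(conn, records):
    """Load: write cleaned records into the central warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(source TEXT, customer TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", records)
    conn.commit()


def build_region_mart(conn):
    """Data mart: a small, subject-specific aggregate for BI users."""
    conn.execute("DROP TABLE IF EXISTS mart_sales_by_region")
    conn.execute(
        "CREATE TABLE mart_sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )
    conn.commit()
    return conn.execute(
        "SELECT region, total FROM mart_sales_by_region ORDER BY region"
    ).fetchall()


def run_pipeline():
    """Run extract, transform, load, then build the mart."""
    conn = sqlite3.connect(":memory:")
    load(conn, [transform(src, row) for src, row in extract()])
    return build_region_mart(conn)


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping the mart as a separate derived table means analysts query a small, purpose-built aggregate while the detailed history stays in the warehouse table.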

Why consider data warehousing cases

Studying known data warehousing solutions is necessary, first of all, to avoid repeating the same mistakes. Starting from a working solution, you can improve your own performance. If you want to always be on the cutting edge of technology, book a call.

  • With data warehouses, executives can access data from different sources, so they do not have to decide blindly.
  • Data warehousing supports quick retrieval and analysis: you can query large amounts of data quickly without involving additional personnel.
  • Before data is loaded into the warehouse, the system creates data cleansing tasks and queues them for processing, converting the data into a consistent format for subsequent analyst reports.
  • The warehouse contains large amounts of historical data and allows you to study past trends and issues to predict events and improve the business structure.
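
As an illustration of the cleansing step mentioned in the list above, here is a minimal sketch that converts dates written in several styles into one consistent format before loading. The list of accepted formats and the `order_date` field are assumptions made for the example, not part of any specific product.

```python
from datetime import datetime

# Assumed input formats; a real list would be agreed with the data owners.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%b %d, %Y")


def normalize_date(raw):
    """Return the date as ISO 8601 text, or None if no known format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def cleanse(rows):
    """Split rows into cleaned records and rejects queued for manual review."""
    clean, rejected = [], []
    for row in rows:
        iso = normalize_date(row["order_date"])
        if iso is None:
            rejected.append(row)
        else:
            clean.append({**row, "order_date": iso})
    return clean, rejected
```

Rows that cannot be parsed are set aside rather than guessed at, so the reports built on the warehouse only ever see dates in a single format.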

Blindly copying other people's solutions does not work either. Your case is unique and probably requires a custom approach. At best, well-known storage solutions can be taken as a basis. You can do it yourself, or you can contact DATAFOREST specialists for professional services. We have positive experience and customer stories in building and operating data warehouses.

Data warehousing cases

Case 1: How the Amazon Service Does Data Warehousing

Amazon is one of the world's largest and most successful companies with a diversified business: cloud computing, digital content, and more. As a company that generates vast amounts of data (and sells data warehousing services itself), Amazon needs to manage and analyze its data effectively.

Two main businesses

Amazon's data warehousing needs are driven by the company's vast and diverse data sources, which require sophisticated tools and technologies to manage and analyze effectively.

1. One of the main drivers of Amazon's business is its e-commerce platform, which allows customers to purchase a wide range of products through its website and mobile apps. Amazon's data warehousing needs in this area are focused on collecting, storing, and analyzing data related to customer behavior, purchase history, and other metrics. This data is used to optimize Amazon's product recommendations engine, personalize the shopping experience for individual customers, and identify growth strategies.

2. Amazon's other primary business unit is Amazon Web Services (AWS), which offers managed cloud computing services to businesses and individuals. AWS generates significant amounts of data from its cloud infrastructure, including customer usage and performance data. To manage and analyze this data effectively, Amazon relies on data warehousing technologies like Amazon Redshift, which enables AWS to provide real-time analytics and insights to its customers.

Beyond these two core businesses, Amazon also has significant data warehousing needs in digital content (e.g., video, music, and books) and in advertising, where data analysis helps identify key demographics and target ads more effectively to specific audiences.

By investing in data warehousing and analytics capabilities as part of its digital transformation, Amazon can maintain its competitive edge and continue to grow and innovate in the years to come.

Obstacles on the way to the goal

Amazon faced several specific implementation details and challenges in its data warehousing efforts.

• The brand needed to integrate data from various sources into a centralized data warehouse. This required the development of custom data pipelines to collect and transform data into a standard format.

• Amazon's data warehousing needs are vast and constantly growing, requiring a scalable solution. The company built a distributed data warehouse architecture using technologies like Amazon Redshift, allowing petabyte-scale data storage and analysis.

• As a company that generates big data, Amazon needed to ensure that its data warehousing solution could provide real-time data analytics and insights. Achieving high performance requires optimizing data storage, indexing, and querying processes.

• Amazon stores sensitive customer data in its warehouse, so data security is a priority. To protect against security threats, the brand implements various security measures, including encryption, access controls, and threat detection.

• Building and maintaining a data warehousing solution can be expensive. To minimize costs, Amazon leverages cloud-based data warehousing solutions (Redshift), which provide a cost-effective, pay-as-you-go pricing model.

Amazon's data warehousing implementation required careful planning, significant investment in technology and infrastructure, and ongoing optimization and maintenance to ensure high performance and reliability.

Change for the better

When Amazon considered all the needs, found the right tools, and implemented a successful data warehouse, the company got the following main business outcomes:

• Improved data-driven decision-making

• Better customer enablement

• Cost-effective decisions

• Improved performance

• Competitive advantage

• Scalability

Amazon's data warehousing implementation has driven the company's growth and success. Not surprisingly, a data storage service provider must understand data storage: the cobbler's children need not go barefoot.

Case 2: Data Warehousing Adventure with UPS

United Parcel Service (UPS) is an American parcel delivery and supply chain management company founded in 1907, with annual revenue of 71 billion dollars and logistics services in more than 175 countries. In addition, the brand provides goods distribution, customs brokerage, postal, and consulting services. UPS processes approximately 300 million tracking requests daily. This was achieved thanks, among other things, to intelligent data warehousing.

One mile for $50 million

In 2013, UPS stated that it hosted the world's largest DB2 relational database in two United States data centers for global operations. Over time, global operations grew, as did the amount of semi-structured data. The goal was to use the different forms of stored data to help users make better business decisions.

One of the fundamental problems was route optimization. According to an interview with the UPS CTO, saving 1 mile a day per driver could save 1.5 million gallons of fuel per year, or $50 million in total savings.

However, the data was scattered: some in DB2, some in other repositories, some stored locally, and some in spreadsheets. UPS needed to solve the data infrastructure problem first and then optimize the routes.

Four letters "V."

A big data ecosystem has to handle the four "Vs" efficiently: volume, veracity, velocity, and variety. UPS experimented with Hadoop clusters and integrated its storage and computing systems into this ecosystem. It upgraded its data warehousing and computing power to handle petabytes of data, one of UPS's most significant technological achievements.

The following Hadoop ecosystem components were probably used:

• HDFS for storage

• MapReduce for distributed processing

• Kafka for streaming

• Sqoop (SQL-to-Hadoop) for ingestion

• Hive and Pig for structured queries on unstructured data

• A monitoring system for data nodes and name nodes

But this is only speculation: for confidentiality reasons, UPS has not disclosed the tools and technologies used in its big data ecosystem.

Constellation of Orion

The result was the four-year ORION (On-Road Integrated Optimization and Navigation) route optimization project, which cost about one billion dollars a year. ORION drew on the upgraded data storage and computing capacity, analyzing more than 300 million data points to optimize thousands of routes per minute based on real-time information. In addition to the economic benefits, the ORION project cut approximately 100 million shipping miles and reduced carbon emissions by 100,000 tons.

Case 3: 42 ERP Into One Data Warehouse

In general, the details of specific data warehousing implementations are kept fairly confidential, often because contracts include clauses on consent and the parties' legitimate interests. Some examples are openly available, but the vast majority sit in paid libraries. The subject is relevant enough that people can earn money from it. Therefore, "open" cases sometimes appear with the brand name withheld.

Brand X needs help

A world leader in industrial pumps, valves, actuators, controls, etc., needed help extracting data from disparate ERP systems. The company wanted to extract data from 42 ERP instances, standardize it into flat files, and collect all the information in one data warehouse. To complicate matters further, the ERP systems came from different vendors (Oracle, SAP, BAAN, Microsoft, PRMS).

The client also wanted a core set of metrics and a central dashboard to combine all the information from different locations worldwide. The project grew out of a surge in demand for corporate data. The company knew it needed a central repository for all data from its locations worldwide. Requests often came from the top down, and when an administrator needed access to the right data, extracting it was a logistical problem. So the project got started.

The foundation stone

The third-party development contractor drew up a roadmap under which ERP data was taken from 8 major databases and placed in a corporate data warehouse. This entailed integrating 5 Oracle ERP instances with 3 SAP ERP instances. Rapid Marts were also integrated with the Oracle ERP systems to speed up the project.

One of the main challenges was the lack of standardized fields and operational data definitions across the ERP systems. To solve this problem, the contractor developed a data service tool that accesses the back end of each database and presents the information in a suitable form. Since then, the customer has known which fields to use and how to set them each time a new ERP instance is encountered. These data definition patterns were the project's foundation stone and the starting point for everything that followed, and they completely changed how customer data is handled.
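
The idea of shared data definitions can be illustrated with a small mapping layer. The sketch below is not the contractor's tool: the source names, the vendor-side field names, and the target field names are all hypothetical and chosen only to show the pattern of renaming each system's fields to one agreed definition.

```python
# Hypothetical per-source field names mapped to one shared definition.
FIELD_MAP = {
    "oracle_erp": {
        "ORDER_NO": "order_id",
        "CUSTOMER_NAME": "customer",
        "ORDER_TOTAL": "amount",
    },
    "sap_erp": {
        "DocNumber": "order_id",
        "Customer": "customer",
        "NetValue": "amount",
    },
}


def standardize(source, record):
    """Rename a source record's fields to the shared warehouse definitions."""
    mapping = FIELD_MAP[source]
    unknown = set(record) - set(mapping)
    if unknown:
        # Surface unmapped fields instead of silently dropping them.
        raise KeyError(f"unmapped fields from {source}: {sorted(unknown)}")
    return {mapping[name]: value for name, value in record.items()}
```

With this pattern, onboarding a new ERP instance means adding one more entry to the mapping, while everything downstream keeps working with the same field names.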

All roads lead to data warehousing

The company now has one common and consistent way to obtain critical indicators. The long-term effect of the project is the ease of obtaining information. What was once a long and inconsistent process of gathering relevant information at an aggregate level is now streamlined: data is stored in one central repository with one team controlling it.

Data Warehousing: Different Cases — General Conclusions

Each data warehouse organization has unique methods and tools because business needs differ. In this respect, data warehousing can be compared with a mosaic or a children's construction set. You can make different figures from the same parts by arranging the elements differently. And if one part is lost or broken, you need to make a new one or find another and file it down to fit.

Generalities between different cases of data warehousing

There are several common themes and practices among successful data warehousing implementations, including:

• Successful data warehousing implementations start with a clear understanding of the business objectives and how the warehouse (or data lake) can support those objectives.

• The data modeling process is critical to the success of data warehousing.

• The data warehouse is only as good as the data it contains.

• Successful data warehousing requires efficient data integration processes that can handle large volumes of data and ensure consistency and accuracy.

• Data warehousing needs ongoing performance tuning to optimize query performance.

• A critical factor in data warehousing is a user-friendly interface that makes it easy for end users to access the data and perform complex queries and analyses.

• Continuous improvement is essential to ensure the data warehouse remains relevant and valuable to the business.

Competent data warehousing implementations combine technical expertise and a deep understanding of business details and user needs.

Your case is not mentioned anywhere

When organizing data warehousing, one would like to find a description of an identical case and simply follow the plan. But the probability of that is negligible: you will have to adapt to the specifics of the customer's business and take into account your own knowledge and capabilities, as well as the technical and financial conditions of the project. Then you take pieces of the puzzle or parts of the construction set and build your own data warehouse. The downside: you have to do the work. The upside: it will be your own data storage solution and your own implementation.

Data Warehouse-as-a-Service Market Size Global Report, 2022 - 2030

Data Warehousing Is Like a Trampoline

Changes in data warehousing, like any technological and methodological changes, are made to improve data collection, storage, and analysis. They take the customer to a new level in their business, and the contractor to a new level of their own. It is like a gymnast and a trampoline: separately, they are just an athlete and a piece of equipment, but together they produce a third quality, the possibility of a sharp rise.

If you are faced with the problem of organizing a new data warehousing system, or you are simply interested in what you read, let's exchange views with DATAFOREST.


What is the benefit of data warehousing for business?

A data warehouse is a centralized repository that contains integrated data from various sources and systems. Data warehousing provides several benefits for businesses: improved decision-making, increased operational efficiency, better customer insights, and competitive advantage.

What is the definition of a successful data warehousing implementation?

The specific definition of a successful data warehouse implementation will vary depending on the goals of the organization and the particular use case for data warehousing. Some common characteristics are: meeting business requirements, high data quality, scalability, user adoption, and positive ROI.

What are the general considerations for implementing data warehousing?

Implementing data warehousing involves some general considerations: business objectives, data sources, quality and modeling, technology selection, performance tuning, user adoption, ongoing maintenance, and support.

What are the most famous examples of the implementation of data warehousing?

There are many famous examples of the implementation of data warehousing across industries:

• Walmart has one of the largest data warehousing implementations in the world

• Amazon's data warehousing solution is known as Amazon Redshift

• Netflix uses a data warehouse to store and analyze data from its streaming platform

• Coca-Cola has a warehouse to consolidate data from business units and analyze it

• Bank of America uses data warehousing to analyze customer data and improve customer experience

What are the challenges while implementing data warehousing, and how to overcome them?

Based on the experiences of organizations that have implemented data warehousing, some common challenges and solutions are:

• Data quality. Ensuring the quality of the data being stored and analyzed requires establishing data quality standards and implementing validation and cleansing rules for each data type.

• Integrating data from disparate sources. It is vital to establish a clear data integration strategy that considers the different data sources, formats, and protocols involved.

• Performance. As the amount of data stored in a data warehouse grows, performance issues may arise. An organization should regularly monitor query performance and optimize the data warehouse to ensure that it remains efficient and effective.

• Security and privacy. Sensitive data stored in the data warehouse must be kept secure, which involves implementing appropriate measures such as access controls, encryption, and regular security audits.

• Change management. A data warehouse can bring significant changes to existing processes and workflows. This is addressed by establishing a transparent change management process that involves decision-makers and users at all levels.

What is an example of how successful data warehousing has affected a business?

An example of how successful data warehousing has affected Amazon is its recommendation engine. It suggests products to customers based on their browsing and purchasing history. By using artificial intelligence and machine learning algorithms to analyze customer data, Amazon has improved the accuracy of its recommendations, resulting in increased sales and customer satisfaction.

What role does data integration play in data warehousing?

Data integration is critical to data warehousing, enabling businesses to consolidate and standardize data from multiple sources, ensure data quality, and establish effective data governance practices.

How are data quality and governance tracked in data warehousing?

Data quality and governance are tracked in data warehousing through a combination of data profiling, monitoring, and management processes, together with data governance frameworks that define policies and procedures for managing data. This way, businesses can ensure that their data is accurate, consistent, and compliant with regulations, enabling effective decision-making and supporting the success of business applications.
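
To make data profiling more concrete, here is a minimal sketch that computes two simple quality metrics per column and checks them against a governance threshold. The metric names, the threshold, and the treatment of empty strings as missing values are assumptions made for the example.

```python
def profile(rows, columns):
    """Compute simple per-column quality metrics for a list of dict rows."""
    total = len(rows)
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing (an assumption).
        present = [v for v in values if v not in (None, "")]
        report[col] = {
            "null_rate": (total - len(present)) / total if total else 0.0,
            "distinct": len(set(present)),
        }
    return report


def violations(report, max_null_rate):
    """List columns whose null rate exceeds the governance threshold."""
    return sorted(
        col for col, metrics in report.items()
        if metrics["null_rate"] > max_null_rate
    )
```

Running such a profile on every load and storing the results over time is one simple way to monitor whether data quality is improving or degrading.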

Are there any ways to measure the benefits of data warehousing?

The benefits of data warehousing for a business can be measured through improvements in data quality, efficiency, decision-making, revenue and profitability, and customer satisfaction. By tracking these metrics, businesses can assess the effectiveness of their data warehousing initiatives and make informed decisions about future investments in data management, analytics, and cloud services.

How to avoid blunders when warehousing data?

By following best practices, businesses can avoid common mistakes, minimize the risk of blunders when warehousing data, and ensure their data warehousing initiatives are successful and produce data that is practical to analyze with business intelligence tools.
