Unifying the Data Warehouse, Data Lake, and Data Marketplace

There was a time when developing a data warehouse was sufficient to quench the thirst for data, reporting, and analytics of most business users. Not anymore. Organizations have discovered that data can be a valuable business asset. It has taken some time, but finally they realize they can do more with all the data that’s available than just produce simple reports. With the right data they can distinguish themselves from the competition, reduce costs by optimizing business processes, and create new business opportunities.

Data science, investigative analytics, self-service BI, embedded BI, and streaming analytics are just a few of the many new ways in which data can be used and exploited. To support all these new forms of data usage, organizations are currently developing new systems, such as data lakes, data marketplaces, and data streaming systems. Unfortunately, most of these new systems are developed as stand-alone systems with almost no relationship with the existing data warehouse system.

In other words, organizations are developing systems that all deliver data to business users. Developing all these data delivery systems independently has two severe drawbacks:

  • Potentially, these data delivery systems share the same data sources. For example, traditional business users, data scientists, and business users who access the data marketplace for ad-hoc data analysis may all be interested in sales data. If several independently developed data delivery systems share the same data sources, similar solutions will be developed to deliver the requested data to the business users. It will be like reinventing the wheel over and over again, which negatively influences productivity and maintenance. Many comparable solutions have to be developed that deal with integrating, aggregating, transforming, filtering, governing, cleansing, auditing, and securing the data. For example, if the zip codes belonging to customer addresses have to be cleansed before they can be used, each data delivery system needs a solution. Or, if two different systems contain customer address data, each data delivery system needs a solution to integrate them.
  • Potentially, these data delivery systems share the same users. For example, a specific user may want to combine results coming from a streaming analytics system with data coming from a data warehouse system to compare what’s currently happening with what’s “normal.” If these systems are developed independently of each other, it’s hard to guarantee that the two results are consistent.
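
The zip code example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (cleanse_zip, warehouse_zip_report, marketplace_extract), the sample customers, and the US-style five-digit rule are all hypothetical and do not come from any particular product. The point is that both delivery systems call one shared cleansing rule instead of each implementing its own.

```python
import re
from typing import Dict, List, Optional


def cleanse_zip(raw: Optional[str]) -> Optional[str]:
    """Shared rule (assumed): keep the first five digits, else mark as unusable."""
    digits = re.sub(r"\D", "", raw or "")
    return digits[:5] if len(digits) >= 5 else None


def warehouse_zip_report(customers: List[Dict[str, str]]) -> List[str]:
    """Hypothetical warehouse-style report: distinct, valid zip codes."""
    cleaned = {cleanse_zip(c.get("zip")) for c in customers}
    cleaned.discard(None)  # drop records whose zip code could not be cleansed
    return sorted(cleaned)


def marketplace_extract(customers: List[Dict[str, str]]) -> List[Dict[str, Optional[str]]]:
    """Hypothetical marketplace-style extract: same rows, cleansed zip codes."""
    return [{**c, "zip": cleanse_zip(c.get("zip"))} for c in customers]


# Made-up sample data for illustration only.
customers = [
    {"name": "Ann", "zip": "12345-6789"},
    {"name": "Bob", "zip": " 90210 "},
    {"name": "Cy", "zip": "n/a"},
]
```

Because both functions depend on the same cleanse_zip rule, a change to that rule reaches every delivery system at once, and a user who combines the report with the extract sees the same zip codes in both.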

It’s crucial that organizations somehow bring these data delivery systems together to create one all-encompassing architecture. This unified architecture is responsible for delivering any type of data in any form to any business user.

This unified data delivery platform is probably not an extension of the well-known data warehouse system. It’s an architecture in which the data warehouse system operates as a module that delivers data to an umbrella architecture that deploys other technologies and systems to deliver data, such as a streaming system and a data lake. This data delivery platform unifies the concepts of data warehouse, data lake, data marketplace, streaming data, and any other data delivery system.

The foundation of this new data delivery platform must be abstraction. It must be able to hide from business users how and where data is stored, how it is copied, which technologies are used, whether data is integrated on demand or in batch, and so on. In addition, it must be transparent enough for business users to determine how source data has been manipulated. A data delivery platform must be able to support a wide range of business users, ranging from users requiring governable and auditable reports, to users demanding a highly agile marketplace, to data scientists who analyze raw data.
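
As a rough sketch of what such abstraction could look like, the Python fragment below puts a thin layer between business users and the underlying systems. Everything in it is an assumption made for illustration: the class names LogicalDataset and DeliveryPlatform, the dataset names, and the in-memory stand-ins for a warehouse table and an event stream. Users ask for data by logical name, the storage and integration style stay hidden, and a lineage list keeps the manipulation of source data visible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


@dataclass
class LogicalDataset:
    name: str
    fetch: Callable[[], Iterable[Row]]  # hides where and how the data is stored
    lineage: List[str] = field(default_factory=list)  # how source data was manipulated


class DeliveryPlatform:
    """Hypothetical umbrella layer: one entry point for all delivery systems."""

    def __init__(self) -> None:
        self._datasets: Dict[str, LogicalDataset] = {}

    def register(self, dataset: LogicalDataset) -> None:
        self._datasets[dataset.name] = dataset

    def query(self, name: str) -> List[Row]:
        # The caller does not know whether this is a batch copy or an on-demand read.
        return list(self._datasets[name].fetch())

    def explain(self, name: str) -> List[str]:
        # Transparency: report the recorded transformation steps.
        return list(self._datasets[name].lineage)


# In-memory stand-ins (assumptions) for a warehouse table and a streaming source.
_warehouse_sales = [{"region": "EU", "revenue": 120}, {"region": "US", "revenue": 200}]


def _stream_recent_sales() -> Iterable[Row]:
    yield {"region": "EU", "revenue": 7}


platform = DeliveryPlatform()
platform.register(LogicalDataset(
    "sales", lambda: _warehouse_sales,
    ["loaded nightly from order system", "aggregated per region"]))
platform.register(LogicalDataset(
    "sales_live", _stream_recent_sales,
    ["read on demand from event stream"]))
```

A report builder would call platform.query("sales") and a data scientist platform.query("sales_live") in exactly the same way, while platform.explain(...) answers the question of how the data was prepared.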

For the coming years, architecting an integrated data delivery platform will be the challenge for many organizations. If they fail to do so, their multitude of data delivery systems can turn into a labyrinth that won’t allow them to get the most out of their data assets. Not everyone’s data thirst will be quenched.

Blog written by Rick Van der Lans, originally published here.


The Age of the Customer: The Key to the 360° View

The evolution of modern technology in the past 10 years is astounding: our mobile phones are becoming smarter by the second, our online behavior is revealing more about human nature, and our wearable technology knows more about our general well-being than our family doctor.

This adoption of technology into almost every facet of our life will surely define this epoch as the digital era. We have been talking about the Internet of Things (IoT) for many years now, these “things” being all sorts of connected devices. Gartner predicted that, by the end of this year, 5.5 million new devices will be connected to the Internet. And each device is creating data at an exponential rate (2.5 quintillion bytes every single day in fact!).

The Potential of IoT

This is an exciting time for the technology user: in the near future, we can expect autonomous driving to become part of our automotive experience (with Tesla’s autopilot system already taking a huge leap in this direction). Built-in connectivity as well as vehicle-to-vehicle communication will become standard features.

Tech giants like Samsung and LG have already created Wi-Fi-controlled smart appliances, with other home appliance companies likely to follow suit. Imagine being able to unlock your door with just a swipe on your smartphone, or your smart fridge suggesting a recipe for your dinner according to the ingredients you have in your kitchen – and now, imagine this possibility 10 years ago!

With That Potential Come Expectations…

The more we evolve in the digital era, the more we will expect as customers. There will come a point where it will not be enough that our wearable technology tells us how many calories we have burnt or how many steps we should take to stay on top of our target. We will expect our wearable technology to provide us with personal exercise plans that fit around our current schedule, our eating habits, and our preferred type of exercise (having already connected and synced up with our online agenda and our smart kitchen, of course).

From a business perspective, this is a challenging time. As Brian Hopkins, Principal Analyst at Forrester, explained, “It’s the age of the customer. The rising level of customer expectations is the driving force of business today”. In order to stay ahead of the game, companies need to have a 360-degree view of the data of each and every customer in their database.

In the past, many companies relied on their CRM systems and their data warehouses to give them insight into their customers. It has become clear in recent years, however, that these old methods of collecting and analyzing data are not sufficient to support the growing number of interactions occurring outside of these systems. For example, unstructured data sources such as social media, email interactions, and other apps are not accounted for in traditional data architectures, which results in a less than complete view of the customer.

Companies have therefore turned to current technologies like data virtualization specifically to help them build a more complete view of their customers. Data virtualization permits the aggregation of data from CRM systems, data warehouses, unstructured sources, smart analytics and any other source which may house supplementary information about your customer. All this data is then available to the data consumer in real-time from a single data layer.
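
To make the idea of a single data layer a little more concrete, here is a minimal Python sketch. It is not how any specific data virtualization product works. The source names, the sample records, and the functions make_lookup and customer_360 are all hypothetical, and simple dictionaries stand in for a CRM system, a data warehouse, and a social media feed. The data stays in its sources and is only combined at the moment a consumer asks for it.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Lookup = Callable[[str], List[Record]]

# Made-up stand-ins for three independent systems, keyed by customer id.
_crm = {"c-1": [{"name": "Ann", "segment": "gold"}]}
_warehouse = {"c-1": [{"order": "o-17", "total": "99.00"}]}
_social = {"c-1": [{"channel": "twitter", "text": "Love the new app"}]}


def make_lookup(store: Dict[str, List[Record]]) -> Lookup:
    """Wrap a source behind a uniform 'records for this customer' interface."""
    return lambda customer_id: list(store.get(customer_id, []))


SOURCES: Dict[str, Lookup] = {
    "crm": make_lookup(_crm),
    "warehouse": make_lookup(_warehouse),
    "social": make_lookup(_social),
}


def customer_360(customer_id: str, sources: Dict[str, Lookup] = SOURCES) -> Dict[str, List[Record]]:
    """Assemble the view at request time; nothing is copied in advance."""
    return {name: lookup(customer_id) for name, lookup in sources.items()}
```

Calling customer_360("c-1") returns one dictionary with a section per source. Adding another source, such as an IoT device feed, would mean registering one more lookup rather than building another copy of the data.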

As IoT continues to expand with the adoption of smart appliances and connected devices (with each device continuing to produce data at an exponential rate), companies that want to be smart should adopt data virtualization to create 360-degree views of their customers in order to keep them satisfied.

