A modern data integration strategy employs what’s known as “best-fit engineering,” whereby each part of the data management infrastructure uses the most appropriate technology for its role, with storage choices driven by business requirements and Service Level Agreements (SLAs). Unlike a data lake, this new architecture takes a distributed approach, aligning information storage selection with information use and leveraging multiple data technologies that are fit for specific purposes. A hybrid approach can also significantly reduce costs and time to delivery when changes or additions to the warehouse are required.
Coining of the Term
One term for this new architecture is logical data warehouse. Another is virtual data lake. In either case, the premise is that there is no single data repository. Instead, the logical data warehouse is an ecosystem of multiple fit-for-purpose repositories, technologies, and tools that interact synergistically to manage data storage and provide performant enterprise analytical capabilities.
The Logical Data Warehouse Approach
The original, and so far unfulfilled, analytical requirements of the traditional data warehouse were to be able to retrieve data using a single query language, get speedy query response, and have the ability to quickly assemble different data models or views of the data to meet specific needs. By combining data federation, physical data integration, and a common query language (SQL), the logical data warehouse approach achieves all three of these goals without the need to copy or move all of the data to a central location.
Physical data integration is a robust feature of the logical data warehouse that ensures fast query response by decoupling query performance from the source data stores and shifting it to the logical data warehouse’s own repository. In this manner, the effort-intensive physical transfer of data is minimized and simplified, effectively removing lengthy data movement delays from the critical path of data integration projects.
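One way to picture this decoupling is a router that answers a query from a locally materialized replica when one exists and falls back to live federation otherwise. The sketch below is illustrative only; the class and method names are hypothetical and do not represent Data Virtuality’s actual API.

```python
# Illustrative sketch: route a query to a materialized replica if one
# exists, otherwise federate it to the live source on demand.
# All names here are hypothetical -- this is not the Data Virtuality API.

class QueryRouter:
    def __init__(self):
        self.replicas = {}   # table name -> locally materialized rows
        self.sources = {}    # table name -> callable fetching live rows

    def materialize(self, table, rows):
        """Physically integrate: copy rows into the local repository."""
        self.replicas[table] = list(rows)

    def register_source(self, table, fetch):
        """Register a live source for federated access."""
        self.sources[table] = fetch

    def query(self, table):
        # Fast path: serve from the replica, decoupled from the source.
        if table in self.replicas:
            return self.replicas[table]
        # Fallback: federate the query to the live source.
        return list(self.sources[table]())


router = QueryRouter()
router.register_source("orders", lambda: [{"id": 1, "total": 40}])
router.materialize("customers", [{"id": 7, "name": "Acme"}])

print(router.query("customers"))  # served from the local replica
print(router.query("orders"))     # fetched live via federation
```

The point of the sketch is the routing decision, not the storage: queries against materialized tables never touch the source system, so source latency is removed from the analytical path.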
Gartner’s Analysis of this New Approach
In Understanding the Logical Data Warehouse: The Emerging Practice, Gartner weighed in on this approach, pointing out that it offers flexibility for companies that have different data requirements at different times. For example, many use cases require a central repository, such as a traditional data warehouse or analytic database, where data that is needed frequently, or with the greatest retrieval speed, can be stored and optimized for performance.
Increasingly, data analysts must be able to explore data freely with adequate query performance guaranteed. Frequent use cases along these lines are sentiment analysis and fraud detection. These use cases require a distributed technology, such as Hadoop, to store the massive amounts of data available through social media feeds and clickstream activity logs.
Additionally, they demand direct access to data sources via data federation. As Gartner rightly indicates, a logical layer is needed on top of these technologies to unify the architecture and to allow queries and processes to operate on all systems concurrently as needed.
As the first logical data warehouse, Data Virtuality provides this uniform layer over numerous data storage technologies, unifying these data stores and facilitating the use cases suggested above by Gartner. By routing queries among data stores behind the scenes as needed, the Data Virtuality technology offers significant benefits to business users. The business can use a single platform to handle a far wider variety of use cases than a traditional data warehouse could.
Also, new approaches to data integration are possible, enabling users to put business needs first and let the technology platform adapt as needed.
Explanation of the Approach
By decoupling the semantic unified data access layer, with which business users interact, from the actual data sources, changes in the original data sources can be isolated from analytical processes. In a profound departure from past data accessibility strategies, business users can interact with data comfortably and easily, focusing on their objective rather than the technological underpinnings.
By consolidating relational and non-relational data sources, including real-time data, Data Virtuality enables immediate analysis using SQL. Data Virtuality provides a central data cockpit, which allows all data sources, whether analytical or operational, to freely interchange data.
Integrated connectors allow data to be immediately processed using analysis, planning, or statistics tools, or written back to source systems as needed. In addition, the logical data warehouse automatically adjusts to changes in the IT landscape and user behavior, thereby offering the highest possible degree of flexibility and speed with little administrative overhead.
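The isolation the semantic layer provides can be illustrated with ordinary SQL views: analysts query a stable name while the underlying table changes shape behind it. Below is a minimal sketch using Python’s built-in sqlite3 module as a stand-in for the unified access layer; the table and column names are invented for illustration.

```python
import sqlite3

# Minimal sketch: a SQL view acts as the stable semantic layer.
# Table and column names are invented for illustration.
con = sqlite3.connect(":memory:")

# Suppose the source system restructured its table to this new schema.
con.execute("CREATE TABLE crm_accounts_v2 (acct_id INTEGER, acct_name TEXT)")
con.execute("INSERT INTO crm_accounts_v2 VALUES (1, 'Acme')")

# Business users query the view; only the view definition knows that
# the underlying table was renamed and its columns relabeled.
con.execute("""
    CREATE VIEW customers AS
    SELECT acct_id AS id, acct_name AS name FROM crm_accounts_v2
""")

rows = con.execute("SELECT id, name FROM customers").fetchall()
print(rows)  # the analyst's query is untouched by the source change
```

When the source schema changes again, only the view definition is updated; every report and query built on `customers` keeps working unchanged.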
Logical Data Warehouse Compared to Traditional ETL Solutions
In a logical data warehouse project, a few clicks can seamlessly connect all data-producing and data-processing systems, including ERP and CRM systems, web shops, social media applications, and just about any SQL and NoSQL data source, all in real time. With instant access to the data, users can begin experimenting with these connections and joins until they achieve the results they want.
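A cross-source join of this kind can be sketched as follows. The logical layer would express it as a single SQL join; the snippet below spells out the equivalent hash join over two in-memory stand-ins (a relational CRM source and a NoSQL clickstream source), with all names invented for illustration.

```python
# Illustrative sketch of a cross-source join: rows from a relational
# CRM source combined with documents from a NoSQL clickstream source.
# Both "sources" are in-memory stand-ins; names are invented.

crm_rows = [                      # relational source (e.g. a CRM table)
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
click_docs = [                    # NoSQL source (e.g. clickstream docs)
    {"customer_id": 1, "page": "/pricing"},
    {"customer_id": 1, "page": "/docs"},
    {"customer_id": 2, "page": "/signup"},
]

# Equivalent of: SELECT c.name, k.page FROM crm c JOIN clicks k
#                ON c.customer_id = k.customer_id
by_id = {row["customer_id"]: row["name"] for row in crm_rows}
joined = [
    {"name": by_id[doc["customer_id"]], "page": doc["page"]}
    for doc in click_docs
    if doc["customer_id"] in by_id
]
print(joined)
```

Because neither source is copied into a central store first, the join can be re-run and reshaped immediately as users experiment with different combinations.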
In stark contrast to traditional ETL solutions, the key difference with the logical data warehouse is that there is no need to move the data in order to analyze it. This significantly reduces development and database structuring time and costs. Flexible and responsive, the logical data warehouse represents a completely different data integration paradigm from the inflexible traditional data warehouse approach.