One of the most common concerns enterprises raise about data virtualization is performance. Because the technology queries sources over live, real-time connections, the concern is understandable.
To address it, the Data Virtuality Platform uses a three-level performance optimization mechanism, a step beyond the two-level optimization commonly found in other data virtualization tools.
The three levels are:
- Distributed query optimization: This leverages advanced techniques like pushdowns and optimized join algorithms.
- Caching: This functions both in-memory and on-disk.
- Self-learning recommended optimization (Materialization): This recommends materializing tables/views to improve performance.
Below, we take a closer look at each layer and how it improves performance.
Distributed query optimization
Every query entering the Data Virtuality engine is transformed by the distributed query optimizer to improve its performance. The primary steps are:
- Rewriting SQL: As a first step, queries are rewritten to simplify expressions and criteria, so the base SQL is as lean as possible before planning.
- Logical plan optimization: The rewritten query is then converted into a logical plan. The Data Virtuality Server applies optimization rules that examine the query’s structure and the size of the data involved, and it uses detailed cost information to improve its decisions, enabling techniques such as pushdowns (see the sketch after this list).
- Processing plan conversion: Finally, the logical plan is converted into an executable processing plan, in which each node represents a basic processing operation and drives the query’s execution across the distributed sources.
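To make these steps more tangible, here is a minimal sketch of what rewriting and pushdown can look like. All schema, table, and column names (crm.customers, erp.orders, and so on) are illustrative assumptions, not part of any actual catalog, and the exact rewrites an engine chooses depend on the query and on each source’s capabilities.

```sql
-- A federated query as submitted by a data consumer. The source schemas
-- crm and erp, and their tables, are illustrative names.
SELECT c.name, SUM(o.amount) AS total_amount
FROM crm.customers c
JOIN erp.orders o ON o.customer_id = c.id
WHERE o.order_date >= DATE '2023-01-01'
  AND (c.country = 'DE' OR c.country = 'DE')  -- redundant criterion
GROUP BY c.name;

-- Step 1, SQL rewriting: the redundant OR branch collapses into a single
-- predicate:  c.country = 'DE'.

-- Step 2, logical plan optimization with pushdowns: rather than pulling
-- both tables in full, the engine can delegate work to each source and
-- join only the reduced intermediate results, for example:
--   to the CRM source:  SELECT id, name FROM customers WHERE country = 'DE'
--   to the ERP source:  SELECT customer_id, SUM(amount) AS total_amount
--                       FROM orders WHERE order_date >= DATE '2023-01-01'
--                       GROUP BY customer_id
-- How much can be pushed down depends on what each source supports.
```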
Caching
Data virtualization can run into scalability limits with very large datasets or a high number of concurrent users, so Data Virtuality uses caching, both in-memory and on-disk, to improve query performance. Caching clearly boosts performance for small datasets, but its benefits are short-lived: it often falls short for larger datasets and offers only limited control over how data is loaded and stored.
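As a rough illustration only: in SQL-based virtualization engines, result caching is often controlled through query hints. The hint below is a hypothetical example of that pattern, not the platform’s documented syntax, so the actual caching options should be taken from the Data Virtuality documentation.

```sql
-- Hypothetical cache hint: keep this result set cached for 10 minutes
-- (600,000 ms), so repeated requests are answered from the in-memory or
-- on-disk cache instead of re-querying the underlying sources.
SELECT /*+ cache(ttl:600000) */
       region,
       SUM(revenue) AS revenue
FROM views.sales_summary  -- illustrative virtual view
GROUP BY region;
```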
Self-learning recommended optimization (Materialization)
The distinctive part of the Data Virtuality Platform’s optimization engine is data materialization with self-learning capabilities. The engine learns from the query behavior of data consumers and addresses performance issues by autonomously creating and managing, in user-defined analytical storage, physical copies of either:
- the external data sources or
- the internal virtual views
In addition, the self-learning optimization recommends indexes for the materialized tables. Once the data is physically stored in the analytical storage, slow-performing parts of a query are transparently redirected to this optimized data, so reports never have to be rewritten.
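Conceptually, the structures the platform creates resemble the SQL below. The schema dv_storage and the table and index names are purely illustrative assumptions; the platform generates and manages these objects automatically, so this DDL is a sketch of the idea rather than something a user would write.

```sql
-- Illustrative: a materialized copy of a slow virtual view, placed in
-- the user-defined analytical storage.
CREATE TABLE dv_storage.sales_summary_mat AS
SELECT * FROM views.sales_summary;

-- Illustrative recommended index, derived from observed query behavior
-- (for example, frequent filtering on region).
CREATE INDEX idx_sales_summary_region
    ON dv_storage.sales_summary_mat (region);

-- Queries against views.sales_summary are then transparently redirected
-- to dv_storage.sales_summary_mat; existing reports stay unchanged.
```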
To keep the data in the analytical storage up to date, materialization jobs run periodically. Incremental materializations, which pick up only new or changed data, are also available, reducing the volume of data that has to be materialized on each run.
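The general pattern behind an incremental refresh is a change marker, commonly a timestamp column, so that each run copies only what changed since the previous one. The sketch below uses illustrative names (updated_at, refresh_watermarks); the platform’s own incremental jobs are configured rather than hand-written like this.

```sql
-- Illustrative incremental load: copy only the rows added since the last
-- run, tracked by a watermark timestamp. Changed rows would additionally
-- require an upsert/MERGE, depending on the analytical storage's dialect.
INSERT INTO dv_storage.sales_summary_mat
SELECT s.*
FROM views.sales_summary s
WHERE s.updated_at > (SELECT w.last_refresh
                      FROM dv_storage.refresh_watermarks w
                      WHERE w.table_name = 'sales_summary_mat');

-- Advance the watermark so the next run starts from this point.
UPDATE dv_storage.refresh_watermarks
SET last_refresh = CURRENT_TIMESTAMP
WHERE table_name = 'sales_summary_mat';
```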
The advanced data virtualization experience
Data virtualization is a dynamic technology, and performance optimization is crucial for enterprises that want to leverage its full potential. The Data Virtuality Platform’s three-tiered approach to performance optimization provides a comprehensive solution that addresses the performance challenge from multiple angles. Whether you are dealing with large datasets, many concurrent users, complex query structures, or slow databases and networks, the platform’s materialization capabilities and optimization features are designed to maximize efficiency.
Experience the power and innovation of the Data Virtuality Platform firsthand – start your free trial today.