CData Software Acquires Data Virtuality to Modernize Data Virtualization for the Enterprise
Data Virtuality brings enterprise data virtualization capabilities to CData, delivering highly-performant access to live data at any scale.
After covering CopyOver Replication in the first blog post of our Short Guide on Data Replication Types series, we’ll now delve deeper into Incremental Replication and Upsert Replication. We decided to cover them together in one post because the mechanism behind them is the same.
Our platform uniquely combines data virtualization and data replication, giving data teams the flexibility to always choose the right method for their specific requirements. It provides high-performance real-time data integration through smarter query optimization, along with data historization and master data management through advanced ETL, in a single tool.
Incremental replication is a lightweight replication strategy with a low footprint at the data source: only a small volume of data needs to be transferred, and only a few inserts and updates need to be made in the destination.
Once a Full/Complete/CopyOver replication has been done, incremental replication is the more efficient way forward, as it copies only newly added and updated data from the source table to the analytical storage. Because incremental replication focuses only on what changes, it is less resource-intensive and faster than other replication types.
With the Data Virtuality Platform, the key to this replication type is knowing which data is new and which data has been updated.
Below are some cases where the incremental replication method is specifically recommended.
The tricky thing about incremental replication is that it requires a column with a reliable timestamp value that can be used to easily determine which rows are new and which are not. For example, this can be a Modified column showing exactly when a row was changed (newly inserted or updated), or a column providing an auto-incremented ID.
In theory, you can use any column to check for updates to a row, but we recommend making sure that its content reliably reflects changes and new entries. Otherwise, the rows retrieved from the source table can be incorrect (missing rows, missing updates, duplicate rows, and so on).
As incremental replications do not build up a record of each update, there are no earlier stages to fall back to. When the replication job runs, the current data of the target table is scanned and the maximum value of the designated column is determined; only source rows above that value are copied.
Here is how incremental replication works on our platform.
You can find more detailed examples here.
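To make the mechanism concrete, here is a minimal sketch of the incremental pattern described above, written with Python's built-in sqlite3 module. The table and column names (`orders`, `modified`) are illustrative assumptions, not the platform's actual schema: the job scans the target for the maximum watermark value, then copies only source rows above it.

```python
import sqlite3

# Hypothetical source and target databases (in-memory for the sketch).
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")

src.execute("CREATE TABLE orders (id INTEGER, amount REAL, modified INTEGER)")
dst.execute("CREATE TABLE orders (id INTEGER, amount REAL, modified INTEGER)")

src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 10.0, 100), (2, 20.0, 200)])

def replicate_incrementally():
    # 1. Scan the target table for the highest watermark seen so far.
    row = dst.execute("SELECT MAX(modified) FROM orders").fetchone()
    watermark = row[0] if row[0] is not None else -1
    # 2. Pull only rows newer than the watermark from the source.
    new_rows = src.execute(
        "SELECT id, amount, modified FROM orders WHERE modified > ?",
        (watermark,)).fetchall()
    # 3. Append them to the target.
    dst.executemany("INSERT INTO orders VALUES (?, ?, ?)", new_rows)
    dst.commit()
    return len(new_rows)

replicate_incrementally()   # first run copies both existing rows
src.execute("INSERT INTO orders VALUES (3, 30.0, 300)")
replicate_incrementally()   # second run copies only the new row
```

Note how the second run transfers a single row: this is exactly why incremental replication stays cheap at the source and in the destination.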
The purpose of Upsert Replication is to UPdate existing data and inSERT new data in the target table. The process is similar to the SQL MERGE command: you can select multiple columns to update and multiple columns to serve as the identity. New data is typically identified by an SQL expression referring to a modification timestamp.
Upsert replication is useful when you want to update and insert information at the same time. It differs from batch replication in that you can programmatically choose which rows to update. For example, say you have a table of products and want to change the price only for products that cost less than 150 dollars. You can do that with upsert replication; all you need is a condition in your update logic. If you were to use batch replication in this situation, you could either delete or update, but never do both at the same time.
The upsert replication in Data Virtuality is intended to provide an easy way to replicate a source table to a table stored in the destination database.
Unlike history update, there will never be more than one entry per distinct set of key-column values. Additional timestamp columns, like the ones automatically created by history update, are not added.
Here is how upsert replication works on our platform.
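As an illustration of the price example above, the following sketch uses SQLite's `INSERT ... ON CONFLICT DO UPDATE` (a stand-in for SQL MERGE) via Python's sqlite3 module. The `products` table and the 150-dollar rule come from the scenario in the text; everything else is a hypothetical setup, not the platform's actual implementation.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL)")
con.executemany("INSERT INTO products VALUES (?, ?)",
                [(1, 100.0), (2, 200.0)])

# Incoming batch: ids 1 and 2 already exist, id 3 is new.
incoming = [(1, 110.0), (2, 210.0), (3, 90.0)]

# Upsert with a condition: new rows are inserted, but existing rows are
# updated only when the current price is below 150 dollars.
con.executemany(
    """INSERT INTO products (id, price) VALUES (?, ?)
       ON CONFLICT(id) DO UPDATE SET price = excluded.price
       WHERE products.price < 150""",
    incoming)
con.commit()
```

After the upsert, product 1 (previously 100 dollars) has the new price, product 2 (200 dollars) is left untouched because it fails the condition, and product 3 is inserted — update and insert in a single pass, which batch replication cannot do.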
The Data Virtuality Platform is trusted by businesses around the world to help them harness the power of their data. Book a demo and test all the features of the Data Virtuality Platform in a session tailored to your use case.