Simplified Data Model Changes with the Data Virtuality Sync Tool

Previously on the Data Champion Blog ...

Welcome back to the second part of our mini-series “Data Virtuality Logical Data Warehouse 2.1 – What’s New?” where we introduce you to the key features of the latest version of our revolutionary data integration solution, the Logical Data Warehouse (LDW).

In the first post, we looked at the Data Virtuality Web client, a one-stop data shop and metadata catalog for business users. Missed the article? Catch up here. Need to refresh your memory? Here’s a quick recap:

The DV Web client enables all business users (depending on their permissions) to search metadata and query SQL from the web browser without using the DV Studio.

  • Inspect available data sources and their contents
  • Create simple or advanced SQL statements
  • Generate SQL queries directly from the data source
  • Download results as .cs-files to process further and/or import into e.g. Excel

Now, this time we put the focus on the technical users, meaning your IT, BI developers, and data engineers. Or, in other words, those who are in charge of the construction and administration of the business data models.

At Data Virtuality, we know how essential the work of the IT department is for the success of every data integration project. So, we wanted to do more to support them in their workflows. And with this, we proudly present to you the second new key feature of our Logical Data Warehouse 2.1, the Data Virtuality Synchronization Tool, or DV Sync Tool for short.

But let’s start from the beginning!

Imagine this Scenario:

Your c-level has tasked the BI Analysts of the company to prepare a new business report that shows the total sales by customer country per quarter. So, your BI developers go through the data model to check which reportings already exist. They find reportings for total sales, for total sales per quarter – but there’s no reporting for total sales per customer country yet. That means parts of the data model need to be edited, new views/dimensions need to be added, etc. In short, the production environment of your data integration system needs to be adjusted to accommodate the task requirements and deliver the specific reporting your BI Analysts needs.

And this is where the true challenge begins!

Never Change A Running System!

Many data models in data integration tools only have one instance or environment on which all productive reportings and corresponding automatizations run, and which is being accessed by all frontends. If you now want to change the existing data model structures to get your business report, you will have to change them directly in the production instance. While all your other reportings are running and without being able to test the changes before implementing them. In short, it’s a disaster waiting to happen!

Indeed, whether it hampers the performance of your reportings or outright cripples the entire process – having only one instance for staging and production can put your entire system at risk. And let’s not even speak about all the other business operations that depend on your data model running smoothly ...

Duplicate the System – Problem Solved?

Of course, this poses a serious dilemma: How can you edit your data model if it only has one instance?

Some ingenious users may come up with an apparent quick fix – they simply duplicate the views that need to be changed! By doing so, they get two environments, one for production, and a copy that acts as staging/development instance. Once all changes have been tested and implemented successfully, the dev instance is simply re-named and thus turned into the new production instance.

Sounds like a solid plan? Not quite!

This quick fix may work for smaller projects or companies whose data models aren’t that large. But it’s not feasible in an enterprise environment or for projects with lots of interdependent views and jobs. Because, at some point, it simply becomes impossible to keep track of all the duplicated views, each of which supplies several jobs. They, in turn, reference a couple of more views. And those are also not isolated ... and so on. You’d end up with a hodgepodge of job and view copies and no way to untangle this mess.

Also, larger companies and enterprises usually follow very strict review processes and approval loops which are by design incompatible with this kind of improvised solution.

So, it’s back to square one – or isn’t it?

Enter the Data Virtuality Sync Tool!

Don’t waste your time with risky experiments on the production instance or messy duplications of the entire data model! With its Sync Tool, our Logical Data Warehouse 2.1 establishes a clear distinction between production operations and development/staging. So you can edit your data model smoothly and hassle-free whenever necessary.

Like the DV Web client, the Sync Tool is an out-of-the-box feature in the Logical Data Warehouse 2.1. All you have to do is update to the latest version of the LDW (get in touch for more information), log in with your Data Virtuality credentials, and get started.

In the Logical Data Warehouse 2.1, you can install your data model on two (or more) instances. A production environment with all virtual schemas, data sources etc., and a development/staging instance which is an exact copy of the former. This allows for greater flexibility and freedom to experiment, as the staging instance acts as a stopping point before any changes touch production. So your IT can edit and safely test the data model without having to worry that their changes might interfere with the reporting and other running processes.

The two instances are still linked, of course. And it is at this intersection that the DV Sync Tool comes into the picture.

The Data Virtuality Sync Tool synchronize data model changes between multiple LDW instances (staging/development and production).

  • Work with multiple instances (procedures, jobs/schemas, views etc.)
  • No manual synchronization necessary: automatically transfer your data model changes to the production instance
  • Compare instances and identify changes

2 (Or More) Instances, 1 Data Model, 0 Risk

Through the link between the staging and production instances, the Sync Tool automatically monitors and records all changes that have been made on the development side. They’re then shown in a list with specific icons demarcating the different types of changes, such as a plus symbol for added views or a pencil symbol for edited virtual schemas etc. In this way, you never lose track of the changes to your data model again, no matter how many jobs, schemas, and optimizations you’ve been working on.

From the list, you can select all or individual changes that you want to implement on the production instance. When you’re ready, simply click “finish”, then “synchronize”, and the selected changes are loaded automatically from the development/staging instance to the production environment where they will go live immediately. If you call up the production instance you will be shown the most recent changes/differences between the two environments. Nothing breaks, nothing gets lost, everyone’s happy!

And did we mention that with the LDW 2.1 it’s also possible to have more than two instances? This is especially interesting for enterprises whose processes for changing/editing data models usually have several stages (e.g. development, QA testing, and approval) before any changes are pushed live.

And the synchronization also works the other way around, meaning you can transfer jobs and views that have been active on the production environment to the staging instance for adjustments.

DV Sync Tool, Your Reliable Partner in Every Data Situation

BI projects are always in motion and new reportings have to be added all the time. So your developers are permanently tinkering, adjusting, developing to make sure that the data model can accommodate every use case your team is getting tasked with. In order to work efficiently and flexible, they need the right tool in place that can support them every step of the way.

With the DV Sync Tool, the LDW 2.1 has introduced a valuable feature that enables you to optimize and edit as many jobs and schemata as you need on the staging instance, and have them automatically synchronized with and released on the production instance once you’re finished.

Make messy duplications and risky changes in the production instance a thing of the past, and take your data management to the next level.

Happy Updating!

Want to find out more about our Data Virtuality Logical Data Warehouse 2.1 and its key features? Then schedule a demo with our experts today!