Column-Level Data Lineage
Data lineage is a hot topic these days, driven mostly by increasing regulatory requirements and data quality initiatives. It plays a critical role in understanding the data itself: Where does the data come from? How was it modified, and by whom? There are various perspectives on and approaches to the problem. At Data Virtuality, we think that successful data lineage needs a comprehensive approach. The essential aspect is to make data lineage available together with the data.
Starting with LDW 2.3, you can see your data lineage with just one mouse click. Go to the number that you are interested in, right-click it, and select “show data lineage”.
You will immediately see all the information about the data flow:
- Where the data originally comes from.
- How it was modified, e.g. through a WHERE condition, GROUP BY clause, etc.
- Who the data owner was at each step.
With this metadata, you can investigate immediately whenever you see a red flag.
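To make the three pieces of lineage information concrete, here is a minimal Python sketch of how such a chain could be modeled. It is an illustration only and not the LDW implementation or API: the `LineageStep` class, the `describe` function, and all table names, conditions, and owners are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageStep:
    """One hop in a value's history (hypothetical structure, not an LDW type)."""
    source: str     # where the data came from at this step
    operation: str  # how it was modified, e.g. a WHERE condition or GROUP BY clause
    owner: str      # who owned the data at this step


def describe(steps):
    """Render a lineage chain as readable lines, oldest step first."""
    return [
        f"{i}. {s.source}: {s.operation} (owner: {s.owner})"
        for i, s in enumerate(steps, start=1)
    ]


# Made-up example chain for a single revenue figure.
revenue_lineage = [
    LineageStep("crm.orders", "WHERE status = 'paid'", "sales-ops"),
    LineageStep("views.paid_orders", "GROUP BY region, SUM(amount)", "finance"),
]

for line in describe(revenue_lineage):
    print(line)
```

Each step carries the same three fields as the list above, so a suspicious number can be traced back hop by hop to its origin and to the person responsible for each step.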
In today's fast-changing business world, information has become an actual production factor, and data-driven decision-making has become an indispensable tool for withstanding the growing competition across global industries and markets. Exploiting the power of BI/analytics and automating workflows is one way for companies to open new revenue streams while reducing costs by improving the efficiency of their daily processes.
And here lies the challenge. Nowadays, enterprise data is stored in different locations and comes in various, rapidly evolving forms such as:
- Relational and non-relational databases like MySQL, Amazon Redshift or MongoDB
- Flat files like XML, CSV or JSON
- Social media or website data like Facebook, Twitter or Google Analytics
- CRM/ERP data like SAP, Oracle or Microsoft Dynamics
- Cloud/Software-as-a-Service applications like NetSuite, Salesforce or Mailchimp
- Data lakes and Enterprise Data Warehouses
- Big Data
Businesses are faced with increasing volumes of data, accompanied by growing data variety and velocity. This ultimately leads to further challenges, such as achieving trustworthy data quality, time efficiency in data management, and self-service capabilities for data users. Overcoming these challenges efficiently and effectively has become crucial for the success of modern enterprises.