Use Cases for the Logical Data Warehouse

This blog post illustrates different use cases that become possible with a logical data warehouse.

A Modern Data Warehouse

The logical data warehouse is essential for organizations that wish to combine big data and data warehousing in the enterprise.

How Does the Logical Data Warehouse Work?

The logical data warehouse works by intelligently marrying two distinct technologies to create an entirely new way of integrating data. The first is data federation, which connects two or more disparate databases and makes them appear as if they were a single database. The second is analytical database management, which provides a semantic layer of business-friendly data element names and models while allowing flexible ingestion and modeling options.
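To make the federation half concrete, here is a minimal sketch using PostgreSQL’s postgres_fdw extension; the server, table, and column names are hypothetical, and other federation products work along the same lines.

    -- Enable the foreign data wrapper that ships with PostgreSQL.
    CREATE EXTENSION IF NOT EXISTS postgres_fdw;

    -- Register a remote database as a foreign server (hypothetical host).
    CREATE SERVER sales_db
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'sales.example.com', dbname 'sales', port '5432');

    CREATE USER MAPPING FOR CURRENT_USER
        SERVER sales_db
        OPTIONS (user 'reporting', password 'secret');

    -- Expose a remote table locally; it can now be queried like a local table.
    CREATE FOREIGN TABLE remote_orders (
        order_id    integer,
        customer_id integer,
        amount      numeric
    ) SERVER sales_db OPTIONS (schema_name 'public', table_name 'orders');

    -- The semantic half: a view with business-friendly names that joins
    -- the federated table with a local customers table.
    CREATE VIEW customer_revenue AS
    SELECT c.name        AS customer_name,
           SUM(o.amount) AS total_revenue
    FROM   remote_orders o
    JOIN   customers c ON c.id = o.customer_id
    GROUP  BY c.name;

Analysts can now query customer_revenue without knowing that half of the data lives in another database.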

The First Logical Data Warehouse – A Revolution!

A modern data integration strategy employs what’s known as “best-fit engineering,” whereby each part of the data management infrastructure uses the most appropriate technology to perform its role, including storing data as dictated by business requirements and service-level agreements (SLAs). Unlike a data lake, this architecture takes a distributed approach, aligning information storage selection with information use and leveraging multiple data technologies that are fit for specific purposes. A hybrid approach can also significantly reduce costs and time to delivery when changes or additions to the warehouse are required.

The Traditional Data Warehouse And The ETL Approach

This blog post provides more information about the traditional data warehouse and illustrates the advantages and disadvantages of this technology. It also takes a deeper look at the ETL process and explains why ETL-based data warehousing projects became infamous.
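As a preview, the following minimal, hypothetical ETL step shows the defining trait of the approach: data is cleansed and conformed before it reaches the warehouse table (schema on write). The staging and dimension tables are illustrative only.

    -- Extract: raw records have landed in a staging table via an external job.
    -- Transform and Load: cleanse the data on its way into the warehouse.
    INSERT INTO dim_customer (customer_key, full_name, country_code)
    SELECT s.source_id,
           TRIM(INITCAP(s.name)),   -- standardize casing and whitespace
           UPPER(s.country)         -- conform country codes
    FROM   stg_customers s
    WHERE  s.name IS NOT NULL;      -- reject incomplete rows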

Data Lakes and the Emergence of ELT – the Next Big Thing?

A new technology has arisen: the data lake. Data lakes are storage repositories that can hold vast amounts of raw data in its native format until it is needed. In many cases data lakes are Hadoop-based systems, and they represent the next stage in both power and flexibility. A compelling benefit of the approach is that there is no need to structure (transform) the data before loading it, as a ‘schema on write’ system would require. Instead, structure is assigned to the data at the time it is queried (‘schema on read’). However, while data lakes can hold large amounts of unstructured data cost-effectively, they fall short for interactive analysis when fast query response is required or when access to real-time data is needed.
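The contrast with ETL is easiest to see in code. Below is a minimal schema-on-read sketch in PostgreSQL (the table and JSON fields are hypothetical): the raw data is loaded untouched, and structure is imposed only at query time.

    -- Load first: raw events are stored as-is, with no upfront modeling.
    CREATE TABLE raw_events (payload json);

    -- Transform at query time (schema on read): pick out and type the
    -- fields only when the question is asked.
    SELECT payload->>'user_id'           AS user_id,
           (payload->>'ts')::timestamptz AS event_time,
           payload->'device'->>'os'      AS device_os
    FROM   raw_events
    WHERE  payload->>'event' = 'login';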

Self-Service Business Intelligence Tools – A New Approach

Because both the data warehouse and OLAP approaches fall short of businesses’ expectations for speedy and comprehensive analytical data access, a new approach surfaced: self-service business intelligence (SSBI) tools.

Multidimensional Databases – OLAP And ROLAP

Online Analytical Processing (OLAP) cubes are multi-dimensional sets of data that essentially serve as a staging area in which to analyze information. These special OLAP databases hold data not in tables but in cubes: a mechanism for storing and querying data in an organized, multi-dimensional structure specifically optimized for analysis.
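In the ROLAP flavor, the cube maps onto grouping operations in a relational engine. Here is a minimal sketch using standard SQL’s CUBE grouping (available in PostgreSQL 9.5 and later; the sales fact table is hypothetical):

    -- Aggregate a fact table along every combination of three dimensions,
    -- yielding the same subtotals a pre-computed cube would hold.
    SELECT region,
           product,
           quarter,
           SUM(revenue) AS total_revenue
    FROM   sales
    GROUP  BY CUBE (region, product, quarter);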

Data Federation – Finally Some Relief

While the majority of data analysts were busy exploring the progression from relational databases to cubes, analytic databases, and data lakes, another camp was looking into using data federation to integrate data for analysis.

The Challenge of Unlocking the Value of Big Data

Big data is here, and it’s transforming the very nature of commerce, enabling new insights and accelerating the pace at which businesses can generate them. While the concept of big data isn’t new, its potential is only now being realized as powerful tools to organize, manage, and analyze immense volumes of enterprise-generated and third-party data finally become available for mainstream use.

However, for many organizations, it’s not so easy to unlock the value in this data. While data volume (the amount of data) and velocity (the speed at which data is generated) are in part what make it so valuable, they also present significant challenges. Still more daunting is the broad variation in the types and sources of data (variety), including highly structured files, semi-structured text, and unstructured video and audio feeds.

Analytical Databases – an Enhancement of SSBI

As SSBI tools evolved, data scientists were still wrestling with the overall challenge of finding an analytical database as flexible for analytics as relational databases were for transactional data processing.

Data Virtualization Sustains Digital Business

The proliferation of disparate data sources distinguishes today’s data landscape. Easily accessible, well-structured data was once the norm. The “status quo” has been disrupted by the phenomenal growth in the variety and volume of multi-structured data originating from machine and IoT, external, application-oriented, cloud-based, and on-premises sources. Emerging in the wake of this digital disruption is a data-centricity shared by businesses ranging in scale from fast-growing SMBs to global ecommerce enterprises.

Using the JSON Datatype in PostgreSQL

An example using weather data provided by the API at OpenWeatherMap

Introduction

This blog post shows how to use PostgreSQL’s functionality for working with JSON objects. Originally introduced in version 9.2, the feature was greatly enhanced in version 9.3, and we will look at the operators that come with the json datatype. We assume an installed version of PostgreSQL 9.3 or higher. There is also a newer datatype, jsonb, which was introduced in version 9.4.
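As a first taste, the sketch below stores a simplified OpenWeatherMap-style response in a json column and extracts fields with the operators covered in this post; the table layout and document shape are illustrative.

    -- A table holding raw API responses in a json column.
    CREATE TABLE weather (
        id   serial PRIMARY KEY,
        data json NOT NULL
    );

    -- A simplified OpenWeatherMap-style document (temperature in Kelvin).
    INSERT INTO weather (data) VALUES
    ('{"name": "Berlin",
       "main": {"temp": 285.15, "humidity": 72},
       "weather": [{"main": "Clouds", "description": "scattered clouds"}]}');

    -- -> returns a json element, ->> returns text,
    -- #>> walks a path and returns text.
    SELECT data->>'name'                    AS city,
           (data->'main'->>'temp')::numeric AS temp_kelvin,
           data#>>'{weather,0,description}' AS description
    FROM   weather;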