How to connect Cloudera Impala to Pentaho

Discover how to connect Cloudera Impala to Pentaho and how to integrate Cloudera Impala with other data sources to build an organization-wide data model.

WITH PIPES OR PIPES PROFESSIONAL
How to connect Cloudera Impala to Pentaho with Pipes
WITH THE LOGICAL DATA WAREHOUSE
How to connect Cloudera Impala to Pentaho with Logical Data Warehouse


About Cloudera Impala

Cloudera Impala is an open-source massively parallel processing (MPP) SQL query engine for data stored in computer clusters running Apache Hadoop. In this way, Impala brings scalable parallel database technology to Hadoop, enabling users to issue low-latency SQL queries against data stored in HDFS and Apache HBase without requiring data movement or transformation.

About Pentaho

Pentaho provides Business Intelligence solutions in a community edition and an enterprise edition. With Pentaho, users can create reports and fact-based dashboards, run data mining, and run extract, transform, load (ETL) processes.

What is Data Virtuality?

Data Virtuality enables companies to build an agile BI stack in 1 day. It connects to Cloudera Impala, Pentaho and more than 200 other databases and cloud services. All connected data sources can be directly queried with SQL and data can be moved into any analytical database. Customers of the Data Virtuality Logical Data Warehouse are digital businesses with the highest flexibility needs.

IMMEDIATE DATA ACCESS

Connect more than 200 databases, cloud services and files (XML, CSV, etc.) in minutes. Query data directly with your favourite business intelligence and analysis tools.

CENTRAL DATA MODEL

Build your single source of data truth and set uniform definitions for your data. Connect your business intelligence tools to Data Virtuality and make sure everyone in your organisation works with the right data.

ALL IN SQL

With Data Virtuality you can query all data sources with SQL regardless of the file format. Whether the source is NoSQL, CSV or an XML file, any connected data source becomes queryable with SQL.
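
To make the idea of querying a file with SQL concrete, here is a minimal sketch in Python. It does not use Data Virtuality or Impala: it assumes an in-memory SQLite database as a stand-in SQL engine, and the sample data, the table name orders, and the helper functions are hypothetical names chosen for illustration only.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a connected CSV file.
CSV_TEXT = """customer,amount
alice,120
bob,80
alice,30
"""


def load_csv_as_table(conn, table, text):
    """Copy CSV rows into a SQL table so they can be queried like any other source."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)


def total_per_customer(text):
    """Return (customer, total amount) pairs computed with plain SQL."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, an assumption of this sketch
    load_csv_as_table(conn, "orders", text)
    query = (
        "SELECT customer, SUM(CAST(amount AS INTEGER)) "
        "FROM orders GROUP BY customer ORDER BY customer"
    )
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for customer, total in total_per_customer(CSV_TEXT):
        print(customer, total)
```

The point of the sketch is only that once a file is exposed as a table, the same SQL works regardless of where the rows came from; a data virtualization layer would typically avoid the copy step shown here.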

REAL-TIME REPORTING

Data Virtuality controls the exchange of data between all databases, cloud services, and analysis tools to help everyone in your organization get the right information, in real time if needed.