How to join MongoDB and Cloudera Impala

Discover how to join MongoDB with Cloudera Impala for integrated analysis.

WITH PIPES OR PIPES PROFESSIONAL
How to join MongoDB and Cloudera Impala with Pipes
WITH THE LOGICAL DATA WAREHOUSE
How to join MongoDB and Cloudera Impala with Logical Data Warehouse


About MongoDB

MongoDB is an open-source, cross-platform, document-oriented database program. Classified as a NoSQL database, it stores JSON-like documents with optional schemas. Its main features include ad-hoc queries, indexing, load balancing, and file storage.

About Cloudera Impala

Cloudera Impala is an open-source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop. In this way, Impala brings scalable parallel database technology to Hadoop, enabling users to issue low-latency SQL queries against data stored in HDFS and Apache HBase without requiring data movement or transformation.

What is Data Virtuality?

Data Virtuality enables companies to build an agile BI stack in 1 day. It connects to MongoDB, Cloudera Impala and more than 200 other databases and cloud services. All connected data sources can be directly queried with SQL and data can be moved into any analytical database. Customers of the Data Virtuality Logical Data Warehouse are digital businesses with the highest flexibility needs.

IMMEDIATE DATA ACCESS

Connect more than 200 databases, cloud services and files (XML, CSV, etc.) in minutes. Query data directly with your favourite business intelligence and analysis tools.

CENTRAL DATA MODEL

Build your single source of data truth and set uniform definitions for your data. Connect your business intelligence tools to Data Virtuality and make sure everyone in your organisation works with the right data.

ALL IN SQL

With Data Virtuality you can query all data sources with SQL, regardless of the file format. Whether NoSQL, CSV, or XML file, any connected data source becomes queryable with SQL.
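
To make the idea concrete, here is a minimal sketch of what joining document data with tabular data in SQL can look like. It is not Data Virtuality's implementation and does not use its API: the in-memory lists stand in for a MongoDB collection and an Impala table, SQLite stands in for the SQL layer, and every name (`customer_docs`, `order_rows`, `flatten`, `join_sources`) is hypothetical.

```python
import sqlite3

# Stand-in for documents fetched from a MongoDB collection (hypothetical data).
customer_docs = [
    {"_id": 1, "name": "Ada", "address": {"city": "Leipzig"}},
    {"_id": 2, "name": "Grace", "address": {"city": "Berlin"}},
    {"_id": 3, "name": "Linus"},  # documents need not share a schema
]

# Stand-in for rows returned by an Impala table (hypothetical data).
order_rows = [
    (101, 1, 25.0),
    (102, 1, 40.0),
    (103, 2, 15.5),
]


def flatten(doc):
    """Map one nested document to a flat (id, name, city) row."""
    city = doc.get("address", {}).get("city")  # None if the field is absent
    return (doc["_id"], doc["name"], city)


def join_sources(docs, rows):
    """Load both sources into in-memory SQL tables and join them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)", [flatten(d) for d in docs]
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    # Inner join: customers without orders are left out of the result.
    result = conn.execute(
        """
        SELECT c.name, c.city, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name, c.city
        ORDER BY c.id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in join_sources(customer_docs, order_rows):
        print(row)
```

The sketch copies both sources into one local database purely for illustration; a data virtualization layer would typically query the sources where they live instead.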

REAL-TIME REPORTING

Data Virtuality controls the exchange of data between all databases, cloud services, and analysis tools to help everyone in your organisation get the right information, in real time if needed.