How to query MongoDB with Apache Calcite using relational algebra - apache-calcite

Can anyone share sample code for querying MongoDB with Apache Calcite using relational algebra?
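No answer was posted here, so here is a minimal sketch of the usual approach: connect through Calcite's JDBC driver with a model file that registers the MongoDB adapter, build the relational-algebra tree with RelBuilder, and execute it with RelRunner. The model file name (mongo-model.json), the schema name (mongo), and the zips view are assumptions patterned on Calcite's MongoDB adapter examples; note that raw Mongo collections surface a single _MAP column, so the model normally defines views that cast fields out of it.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.apache.calcite.jdbc.CalciteConnection;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.sql.fun.SqlStdOperatorTable;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.RelBuilder;
import org.apache.calcite.tools.RelRunner;

public class CalciteMongoRelAlgebra {
  public static void main(String[] args) throws Exception {
    // mongo-model.json is a hypothetical model file that registers the
    // MongoDB adapter (MongoSchemaFactory) under a schema named "mongo".
    Connection connection =
        DriverManager.getConnection("jdbc:calcite:model=mongo-model.json");
    CalciteConnection calcite = connection.unwrap(CalciteConnection.class);

    FrameworkConfig config = Frameworks.newConfigBuilder()
        .defaultSchema(calcite.getRootSchema().getSubSchema("mongo"))
        .build();

    // Build the relational-algebra tree directly: scan -> filter -> project.
    // "zips" is assumed to be a view exposing city/state/pop columns.
    RelBuilder builder = RelBuilder.create(config);
    RelNode rel = builder
        .scan("zips")
        .filter(builder.call(SqlStdOperatorTable.EQUALS,
            builder.field("state"), builder.literal("CA")))
        .project(builder.field("city"), builder.field("pop"))
        .build();

    // Execute the RelNode without going through SQL text.
    // Older Calcite versions name this method runner.prepare(rel).
    RelRunner runner = connection.unwrap(RelRunner.class);
    try (PreparedStatement stmt = runner.prepareStatement(rel);
         ResultSet rs = stmt.executeQuery()) {
      while (rs.next()) {
        System.out.println(rs.getString(1) + " " + rs.getInt(2));
      }
    }
  }
}
```

The point of going through RelBuilder is that the filter and project are expressed as relational-algebra nodes rather than SQL text; Calcite's planner then pushes what it can down into a MongoDB aggregation pipeline.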

Related

Cloud Spanner C++ ORM

Does Spanner have an ORM for C++ like it does for Java? I cannot find any open-source C++ ORM. Do people write actual SQL queries to interact with Spanner while coding in C++?
It doesn't look like there is an ORM for C++.
When it comes to official ORM support, only Hibernate, Spring Data, and Django are covered.
So in this case, I think it would be best to create a feature request for Google to consider implementing more options.

Why is MongoDB being used in Django projects

As we know, we can build great applications with Django using a PostgreSQL database, and for additional scalability and features we sometimes use Redis.
I have noticed some people using MongoDB in their Django projects. My question is: in what cases, and for what specific purposes, do they use MongoDB in their Django projects?
Why can't those features be achieved with PostgreSQL or Redis?
MongoDB and Redis are non-relational (NoSQL) databases.
PostgreSQL is a relational (SQL) database.
Choosing the type of DB depends on your requirements.
SQL Databases
SQL databases are known as relational databases. They have a table-based data structure and require a strict, predefined schema.
E.g.: Oracle, MySQL, Microsoft SQL Server, and PostgreSQL
No-SQL Databases
NoSQL databases, or non-relational databases, can be document stores, graph databases, key-value stores, or wide-column stores. NoSQL databases don’t require any predefined schema, allowing you to work more freely with “unstructured data” (see the sketch after the examples below).
Examples:
Document: MongoDB and CouchDB
Key-value: Redis and DynamoDB
Wide-column: Cassandra and HBase
Graph: Neo4j and Amazon Neptune.
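To make the "no predefined schema" point above concrete, here is a minimal sketch using the official MongoDB Java driver (mongodb-driver-sync); the connection string, database, and collection names are assumptions. Two documents of different shapes coexist in one collection, which a relational table would reject without a schema change:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class SchemalessExample {
  public static void main(String[] args) {
    try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
      MongoCollection<Document> users =
          client.getDatabase("app").getCollection("users");

      // Two documents with different fields in the same collection:
      // no ALTER TABLE, no migration, no predefined schema.
      users.insertOne(new Document("name", "alice").append("age", 30));
      users.insertOne(new Document("name", "bob")
          .append("tags", java.util.List.of("admin", "beta")));
    }
  }
}
```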
In what cases, and for what specific purposes, do people use MongoDB in their Django projects?
There are many reasons for that.
I suggest you read this to learn why, and in what cases, people choose NoSQL DBs over relational DBs - No-SQL vs Relational DBs

Is Impala a columnar clustered database?

I am new to big data and the related tools/technologies. I was going through the documentation of Impala.
Is it true to say that Impala is a clustered columnar database?
And does Impala need a lot of memory to compute/transform the data?
Impala is not a database.
Impala is an MPP (Massively Parallel Processing) SQL query engine. It is a SQL interface on top of data stored in HDFS. You can build a table structure over Parquet files, which are columnar files that allow fast reads of data.
According to the Impala documentation:
Impala provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS, HBase, or the Amazon Simple Storage Service (S3). In addition to using the same unified storage platform, Impala also uses the same metadata, SQL syntax (Hive SQL), ODBC driver, and user interface (Impala query UI in Hue) as Apache Hive. This provides a familiar and unified platform for real-time or batch-oriented queries.
Impala uses the Hive Metastore to store the file structure and schema of each table. Impala lets you run SQL queries over your files, and it takes care of parallelizing the work across your cluster.
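Because Impala is an engine rather than a storage layer, you talk to it like any other SQL endpoint. A minimal sketch, assuming an unsecured cluster reachable over Impala's HiveServer2-compatible port (21050 by default) and a hypothetical sales table backed by Parquet files in HDFS:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ImpalaQuery {
  public static void main(String[] args) throws Exception {
    // Impala speaks the HiveServer2 protocol, so the Hive JDBC driver works.
    // ";auth=noSasl" assumes an unsecured cluster -- adjust for Kerberos/LDAP.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://impalad-host:21050/default;auth=noSasl";

    try (Connection conn = DriverManager.getConnection(url);
         Statement stmt = conn.createStatement();
         // "sales" is a hypothetical table; Impala plans and parallelizes
         // the scan and aggregation across the cluster's daemons.
         ResultSet rs = stmt.executeQuery(
             "SELECT region, SUM(amount) FROM sales GROUP BY region")) {
      while (rs.next()) {
        System.out.println(rs.getString(1) + " " + rs.getDouble(2));
      }
    }
  }
}
```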
About memory use, you are partially right. Impala is memory-bound for execution, whereas Hive uses disk-based processing in classical MapReduce or Tez execution. Newer versions of Impala allow disk spill, which helps with data that doesn't fit in memory.
Impala integrates with the Apache Hive metastore database, to share databases and tables between both components. The high level of integration with Hive, and compatibility with the HiveQL syntax, lets you use either Impala or Hive to create tables, issue queries, load data, and so on.
Impala is not a database.
Impala is not based on MapReduce algorithms. It implements a distributed architecture based on daemon processes that run on the same machines as the data and are responsible for all aspects of query execution.

Migrate DB2 LUW 9.7 databases and data to PostgreSQL

I am trying to migrate DB2 LUW 9.7 databases and data to PostgreSQL 9.3. Any suggestions on the best approach? Which is the best tool, or is there any open-source tool available, to perform this?
The db2look utility can reverse-engineer your DB2 tables into DDL statements that will serve as a good starting point for your PostgreSQL definitions. To unload the data from each table, use the EXPORT command, which dumps the results of any SQL SELECT to a delimited text file. Although the db2move utility can handle both of those tasks, it is not going to be of much help to you, because it extracts the table data into IBM's proprietary PC/IXF format.
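If you'd rather script the data movement than shuttle delimited files around, a plain JDBC copy loop is another option. This is a minimal sketch, assuming the DB2 (db2jcc) and PostgreSQL JDBC drivers are on the classpath; the connection URLs, credentials, and the orders table are placeholders, and you would run one such loop (or a generic one driven by ResultSetMetaData) per table:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class CopyTable {
  public static void main(String[] args) throws Exception {
    try (Connection db2 = DriverManager.getConnection(
             "jdbc:db2://db2-host:50000/MYDB", "db2user", "secret");
         Connection pg = DriverManager.getConnection(
             "jdbc:postgresql://pg-host:5432/mydb", "pguser", "secret")) {
      pg.setAutoCommit(false); // commit once per table, not per row
      try (Statement read = db2.createStatement();
           ResultSet rs = read.executeQuery(
               "SELECT id, name, amount FROM myschema.orders");
           PreparedStatement write = pg.prepareStatement(
               "INSERT INTO orders (id, name, amount) VALUES (?, ?, ?)")) {
        int n = 0;
        while (rs.next()) {
          write.setInt(1, rs.getInt(1));
          write.setString(2, rs.getString(2));
          write.setBigDecimal(3, rs.getBigDecimal(3));
          write.addBatch();
          if (++n % 1000 == 0) write.executeBatch(); // flush in batches
        }
        write.executeBatch();
        pg.commit();
      }
    }
  }
}
```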
If you're moving off of DB2 because of price, IBM provides a free-as-in-beer version called DB2 Express-C, which shares the same core database engine as paid editions of DB2. Express-C is a first-rate, industrial-strength DBMS that does not have the sort of severe limitations that other commercial vendors impose on their no-cost engines.

Django 1.6, creating cube data (OLAP) from PostgreSQL

I am not very familiar with OLAP reporting. Is there any Django app or Python package which converts RDBMS (PostgreSQL) data to cube data that can be queried using the Django ORM? I have searched and found solutions like http://cubes.databrewery.org/, Jasper, etc., but these seemed to be overkill for my use case.
Python Cubes is one of the most lightweight OLAP servers you can find, and it can map your existing relational database into an OLAP schema (without forcing you to transform your data).
In order to query your OLAP cubes, you can use a tool like http://jjmontesl.github.io/cubesviewer/ (disclaimer, I'm the main developer).
Jasper is not really an OLAP server. It's more of a report generation tool.