Calcite over MySql, tries to convert the id column to bigint - apache-calcite

I'm trying to use Calcite to query MySQL and Vertica in the same query:
MySQL table: CREATE TABLE tableA (id INT(11), name VARCHAR(5), PRIMARY KEY(id));
Vertica table: CREATE TABLE tableB (id INTEGER NOT NULL, name VARCHAR(20), PRIMARY KEY (id));
When running the following query using Calcite:
"SELECT a.\"name\", b.\"name\" " +
"FROM \"mysqlschema\".\"tableA\" as a " +
"INNER JOIN \"verticaschema\".\"tableB\" as b ON a.\"id\" = b.\"id\" " +
"WHERE a.\"id\" = 1 "))
I'm getting:
Exception in thread "main" java.sql.SQLException: Error while executing SQL "SELECT a."name", b."name" FROM "mysqlschema"."tableA" as a INNER JOIN "verticaschema"."tableB" as b ON a."id" = b."id" WHERE a."id" = 1 ": while executing SQL [SELECT `id`, `name`, CAST(`id` AS BIGINT) AS `id0`
FROM `mysqlschema`.`tableA`
WHERE `id` = 1]
at org.apache.calcite.avatica.Helper.createException(
at org.apache.calcite.avatica.Helper.createException(
at org.apache.calcite.avatica.AvaticaStatement.executeInternal(
at org.apache.calcite.avatica.AvaticaStatement.executeQuery(
at com.test.TestCalcite.main(
Caused by: java.lang.RuntimeException: while executing SQL [SELECT `id`, `name`, CAST(`id` AS BIGINT) AS `id0`
FROM `mysqlschema`.`tableA`
WHERE `id` = 1]
at org.apache.calcite.runtime.ResultSetEnumerable.enumerator(
at org.apache.calcite.linq4j.EnumerableDefaults$10$1.<init>(
at org.apache.calcite.linq4j.EnumerableDefaults$10.enumerator(
at Baz$6$1.<init>(Unknown Source)
at Baz$6.enumerator(Unknown Source)
at org.apache.calcite.linq4j.AbstractEnumerable.iterator(
at org.apache.calcite.avatica.MetaImpl.createCursor(
at org.apache.calcite.avatica.AvaticaResultSet.execute(
at org.apache.calcite.jdbc.CalciteResultSet.execute(
at org.apache.calcite.jdbc.CalciteResultSet.execute(
at org.apache.calcite.avatica.AvaticaConnection$1.execute(
at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(
at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(
at org.apache.calcite.avatica.AvaticaStatement.executeInternal(
... 2 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'BIGINT) AS `id0`
FROM `mysqlschema`.`tableA`
WHERE `id` = 1' at line 1
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
at java.lang.reflect.Constructor.newInstance(
at com.mysql.jdbc.Util.handleNewInstance(
at com.mysql.jdbc.Util.getInstance(
at com.mysql.jdbc.SQLError.createSQLException(
at com.mysql.jdbc.MysqlIO.checkErrorPacket(
at com.mysql.jdbc.MysqlIO.checkErrorPacket(
at com.mysql.jdbc.MysqlIO.sendCommand(
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(
at com.mysql.jdbc.ConnectionImpl.execSQL(
at com.mysql.jdbc.ConnectionImpl.execSQL(
at com.mysql.jdbc.StatementImpl.executeInternal(
at com.mysql.jdbc.StatementImpl.execute(
at org.apache.commons.dbcp.DelegatingStatement.execute(
at org.apache.commons.dbcp.DelegatingStatement.execute(
at org.apache.calcite.runtime.ResultSetEnumerable.enumerator(
Changing the type of the id column in the MySQL table solves it, but I can't do that because I'm dealing with existing tables that cannot be changed.
Any thoughts on how to work around this?

Solved by casting the id in B, so that Calcite no longer pushes a CAST(... AS BIGINT) down to MySQL:
ON a."id" = CAST(b."id" AS INT)


Why is a left join in Redshift not working?

We are facing a weird issue with Redshift and I am looking for help debugging it. The details are as follows:
I have 2 tables and I am trying to perform a left join as follows:
select count(*)
from abc.orders ot
left outer join e on ot.context_id = e.context_id
where ot.order_id = '222:102'
The query above returns a count of ~7000. It looks like it is performing a default join, as we have only 1 record in the [Orders] table with Order ID = '222:102'.
select count(*)
from abc.orders ot
left outer join e on ot.event_id = e.event_id
where ot.order_id = '222:102'
The query above correctly returns a count of 1. Note that I have only changed the column used to join the 2 tables. Event_ID in the [Events] table is an identity column, but I thought I should get similar results even if I join on another column such as Context_ID.
Further, I tried the following query under the impression that it would return all ~7000 records, since it uses the default join, but surprisingly it returned a count of only 1.
select count(*)
from abc.orders ot
join e on ot.event_id = e.event_id
where ot.order_id = '222:102'
Following are the Redshift database details:
Cutdown version of table metadata:
CREATE TABLE abc.orders (
order_id character varying(30) NOT NULL ENCODE raw,
context_id integer ENCODE raw,
event_id character varying(21) NOT NULL ENCODE zstd,
FOREIGN KEY (event_id) REFERENCES events_20191014(event_id)
SORTKEY ( context_id, order_id );
event_id character varying(21) NOT NULL ENCODE raw,
context_id integer ENCODE raw,
PRIMARY KEY (event_id)
SORTKEY ( context_id, event_id );
Database: Amazon Redshift cluster
I think I am missing something essential about joining the tables. Could you please point me in the right direction?
Thank you
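One plausible explanation, sketched below, is that context_id is not unique in the events table, so a single order row matches every event that shares its context_id, while event_id is the primary key and matches at most one row. The post does not confirm this. The sketch uses Python's built-in sqlite3 instead of Redshift, and the table contents (event ids ev0, ev1, ..., context id 7) are made up for illustration.

```python
import sqlite3


def join_counts(num_events_in_context=3):
    """Return (count joined on context_id, count joined on event_id)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (event_id TEXT PRIMARY KEY, context_id INTEGER);
        CREATE TABLE orders (order_id TEXT NOT NULL,
                             context_id INTEGER,
                             event_id TEXT NOT NULL);
    """)
    # Hypothetical data: several events share context_id 7,
    # but each event has its own unique event_id.
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("ev{}".format(i), 7) for i in range(num_events_in_context)],
    )
    # Exactly one order row, as in the question.
    conn.execute("INSERT INTO orders VALUES ('222:102', 7, 'ev0')")

    by_context = conn.execute("""
        SELECT count(*)
        FROM orders ot
        LEFT OUTER JOIN events e ON ot.context_id = e.context_id
        WHERE ot.order_id = '222:102'
    """).fetchone()[0]

    by_event = conn.execute("""
        SELECT count(*)
        FROM orders ot
        LEFT OUTER JOIN events e ON ot.event_id = e.event_id
        WHERE ot.order_id = '222:102'
    """).fetchone()[0]

    conn.close()
    return by_context, by_event


if __name__ == "__main__":
    print(join_counts(3))
```

If this is what is happening, the left join itself is behaving normally: a left join keeps every row of the left table at least once, but it does not limit how many right-hand rows each left row can match. Running select count(*) from the events table filtered on the order's context_id would be one way to check.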

How to handle error in ibm_db python package while calling stored procedure?

I'm trying to call a stored procedure using the following code:
conn = ibm_db.connect("database","username","password")
stmt = ibm_db.exec_immediate(conn, sql)
But this procedure does not return any rows; it only returns a return code. I need to handle errors and determine whether the procedure ran successfully or not. Could anyone help me with how to handle this?
For test purposes, I've created a table:
db2 "create table so(c1 int not null primary key)"
and my procedure will simply insert a row into this table - this will allow me to easily force an error with a duplicate key:
db2 "create or replace procedure so_proc(in insert_val int)
language sql
insert into so values(insert_val)"
db2 "call so_proc(1)"
Return Status = 0
db2 "call so_proc(1)"
SQL0803N One or more values in the INSERT statement, UPDATE statement, or
foreign key update caused by a DELETE statement are not valid because the
primary key, unique constraint or unique index identified by "1" constrains
table "DB2V115.SO" from having duplicate values for the index key.
now with Python:
conn = ibm_db.connect("DATABASE=SAMPLE;HOSTNAME=localhost;PORT=61115;UID=db2v115;PWD=xxxxx;","","")
stmt = ibm_db.exec_immediate(conn, "CALL SO_PROC(2)")
stmt = ibm_db.exec_immediate(conn, "CALL SO_PROC(2)")
Exception Traceback (most recent call last)
<ipython-input-8-c1f4b252e70a> in <module>
----> 1 stmt = ibm_db.exec_immediate(conn, "CALL SO_PROC(2)")
Exception: [IBM][CLI Driver][DB2/LINUXX8664] SQL0803N One or more values in the INSERT statement, UPDATE statement, or foreign key update caused by a DELETE statement are not valid because the primary key, unique constraint or unique index identified by "1" constrains table "DB2V115.SO" from having duplicate values for the index key. SQLSTATE=23505 SQLCODE=-803
So if a procedure hits an exception you will get it in Python; you just need to handle it with a try/except block:
try:
    stmt = ibm_db.exec_immediate(conn, "CALL SO_PROC(2)")
except Exception:
    print("Procedure failed with sqlstate {}".format(ibm_db.stmt_error()))
    print("Error {}".format(ibm_db.stmt_errormsg()))
Procedure failed with sqlstate 23505
Error [IBM][CLI Driver][DB2/LINUXX8664] SQL0803N One or more values in the INSERT statement, UPDATE statement, or foreign key update caused by a DELETE statement are not valid because the primary key, unique constraint or unique index identified by "1" constrains table "DB2V115.SO" from having duplicate values for the index key. SQLSTATE=23505 SQLCODE=-803
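If the code needs to branch on the kind of failure (for example, to treat a duplicate key differently from other errors), one option is to pull SQLSTATE and SQLCODE out of the message text shown above. This is a minimal sketch that assumes the message always ends with the "SQLSTATE=... SQLCODE=..." pattern seen in this output; parse_db2_error is a hypothetical helper name, not part of ibm_db.

```python
import re

# Patterns based on the message format shown in the output above,
# e.g. "... SQLSTATE=23505 SQLCODE=-803".
_SQLSTATE_RE = re.compile(r"SQLSTATE=(\w{5})")
_SQLCODE_RE = re.compile(r"SQLCODE=(-?\d+)")


def parse_db2_error(message):
    """Return (sqlstate, sqlcode) parsed from an error message.

    Either element is None when the corresponding token is absent.
    """
    state = _SQLSTATE_RE.search(message)
    code = _SQLCODE_RE.search(message)
    return (
        state.group(1) if state else None,
        int(code.group(1)) if code else None,
    )


if __name__ == "__main__":
    sample = ("[IBM][CLI Driver][DB2/LINUXX8664] SQL0803N One or more values "
              "... SQLSTATE=23505 SQLCODE=-803")
    print(parse_db2_error(sample))
```

Inside the except block this could be called as parse_db2_error(ibm_db.stmt_errormsg()), after which the code can compare the SQLSTATE against "23505" to detect the duplicate-key case.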
Or are you actually interested in the CALL return code/status? E.g.:
create or replace procedure so_proc_v2(in insert_val int)
language sql
begin
  if not exists (select 1 from so where c1 = insert_val)
  then
    insert into so values(insert_val);
    return 0;
  else
    return -1;
  end if;
end#
db2 "call so_proc_v2(10)"
Return Status = 0
db2 "call so_proc_v2(10)"
Return Status = -1
then this is a bit tricky. With CLI trace enabled (I have ibm_db installed in my local path so it fetched CLI package there too):
export LD_LIBRARY_PATH=$HOME/.local/lib/python3.7/site-packages/clidriver/lib/
$HOME/.local/lib/python3.7/site-packages/clidriver/bin/db2trc on -cli -f /tmp/cli.trc
$HOME/.local/lib/python3.7/site-packages/clidriver/bin/db2trc off
$HOME/.local/lib/python3.7/site-packages/clidriver/bin/db2trc fmt -cli /tmp/cli.trc /tmp/cli.fmt
the trace does show the return status:
SQLExecute( hStmt=1:8 )
---> Time elapsed - -7.762688E+006 seconds
( Row=1, iPar=1, fCType=SQL_C_LONG, rgbValue=10 )
( return=-1 )
but I don't see a way to fetch it anywhere in the python-ibmdb API (e.g. ibm_db.callproc doesn't have such an option). That means, unless I'm missing something, you would have to raise an issue on GitHub to extend the API.

sqlite3 & python: get list of primary and foreign keys

I am very new to SQL and intermediate at Python. Using sqlite3, how can I print() a list of primary and foreign keys (per table) in my database?
Using Python2.7, SQLite3, PyCharm.
sqlite3.version = 2.6.0
sqlite3.sqlite_version = 3.8.11
Also note: when I set up the database, I enabled FKs as such:
conn = sqlite3.connect(db_file)
conn.execute('pragma foreign_keys=ON')
I tried the following:
print(conn.execute("PRAGMA table_info"))
print(conn.execute("PRAGMA foreign_key_list"))
Which returned:
<sqlite3.Cursor object at 0x0000000002FCBDC0>
<sqlite3.Cursor object at 0x0000000002FCBDC0>
I also tried the following, which prints nothing (but I think this may be because it's a dummy database with tables and fields but no records):
rows = conn.execute('PRAGMA table_info')
for r in rows:
    print r
rows2 = conn.execute('PRAGMA foreign_key_list')
for r2 in rows2:
    print r2
Unknown or malformed PRAGMA statements are ignored.
The problem with your PRAGMAs is that the table name is missing. You have to get a list of all tables, and then execute those PRAGMAs for each one:
rows = db.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
tables = [row[0] for row in rows]
def sql_identifier(s):
    # Quote the table name so unusual names are still valid identifiers.
    return '"' + s.replace('"', '""') + '"'
for table in tables:
    print("table: " + table)
    rows = db.execute("PRAGMA table_info({})".format(sql_identifier(table)))
    print(rows.fetchall())
    rows = db.execute("PRAGMA foreign_key_list({})".format(sql_identifier(table)))
    print(rows.fetchall())
SELECT name FROM sqlite_master
WHERE type = 'table' AND
name NOT LIKE 'sqlite_%';
This SQL lists all tables in the database. For each table, run PRAGMA table_info(your_table_name); to get the primary key of the table.
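Putting the two answers together, here is a minimal self-contained sketch that collects the primary-key columns and foreign keys of every user table. The function name list_keys and the parent/child demo schema are made up for illustration. It relies on the documented column order of the two PRAGMAs: in table_info the column name is at index 1 and the pk position at index 5; in foreign_key_list the referenced table is at index 2, the local column at index 3 and the referenced column at index 4.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table name so it is safe to splice into a PRAGMA."""
    return '"' + name.replace('"', '""') + '"'


def list_keys(conn):
    """Map each user table to its primary-key columns and foreign keys."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name")]
    result = {}
    for table in tables:
        quoted = quote_identifier(table)
        info = conn.execute(
            "PRAGMA table_info({})".format(quoted)).fetchall()
        # Index 5 (pk) is 0 for non-key columns, otherwise the
        # 1-based position of the column within the primary key.
        pk_cols = [row[1] for row in sorted(info, key=lambda r: r[5])
                   if row[5] > 0]
        fks = conn.execute(
            "PRAGMA foreign_key_list({})".format(quoted)).fetchall()
        # (local column, referenced table, referenced column)
        fk_list = [(row[3], row[2], row[4]) for row in fks]
        result[table] = {"primary_key": pk_cols, "foreign_keys": fk_list}
    return result


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.executescript("""
        CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE child (
            id INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES parent(id)
        );
    """)
    for table, keys in sorted(list_keys(demo).items()):
        print(table, keys)
```

Both PRAGMAs read the schema, not the data, so this works on a database whose tables are still empty.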

How can I update Cassandra table with only primary key and static columns?

I am using Cassandra 3.9 and DataStax C++ driver 2.6. I have created a table that has only a primary key and static columns. I am able to insert data into the table, but I am not able to update the table and I don't know why. As an example, I created the table t that is defined here:
Cassandra Table with primary key and static column
Then I successfully inserted data into the table with the following CQL insert command:
"insert into t (k, s, i) VALUES('George', 'Hello', 2);"
Then, "select * from t;" results in the following:
k | i | s
George | 2 | Hello
However, if I then try to update the table using the following command:
"UPDATE t set s = "World" where k = "George";"
I get the following error:
SyntaxException: line 1:26 no viable alternative at input 'where' (UPDATE t set s = ["Worl]d" where...)
Does anyone know how to update a table with only static columns and a primary key (i.e. partition key + clustering key)?
Enclose strings in single quotes. In CQL, double quotes denote identifiers, not string literals.
Example:
UPDATE t set s = 'World' where k = 'George';
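When the statement text is assembled in application code, the quoting rule can be captured in a small helper. This is a minimal sketch in Python, not DataStax driver code: cql_string_literal and build_update are hypothetical names, and the sketch assumes the usual CQL convention that a single quote inside a string literal is escaped by doubling it.

```python
def cql_string_literal(value):
    """Wrap a Python string as a CQL string literal.

    CQL string literals use single quotes; an embedded single
    quote is escaped by doubling it.
    """
    return "'" + value.replace("'", "''") + "'"


def build_update(new_s, key):
    """Build the UPDATE from the question for table t (columns s and k)."""
    return "UPDATE t SET s = {} WHERE k = {};".format(
        cql_string_literal(new_s), cql_string_literal(key))


if __name__ == "__main__":
    print(build_update("World", "George"))
```

Where the driver supports bound parameters or prepared statements, binding the values is generally preferable to building the literal by hand.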

Doctrine join query to get all record satisfies count greater than 1

I tried it with a plain SQL query:
SELECT FROM `activity_shares`
INNER JOIN (SELECT `activity_id` FROM `activity_shares`
GROUP BY `activity_id`
HAVING COUNT(`activity_id`) > 1 ) dup ON activity_shares.activity_id = dup.activity_id
This gives me the record ids, say 10 and 11.
But when I tried to build the same query with the Doctrine query builder,
->add('from','MyBundleDataBundle:ActivityShare c')
->innerJoin('c.activity', 'ca')
// ->andWhere(' = c.activity')
Generated SQL is:
SELECT AS id0 FROM activity_shares a0_
INNER JOIN activities a1_ ON a0_.activity_id =
GROUP BY HAVING count( > 1
It returns only 1 record, which is 10. I want to get both. I can't figure out where I went wrong. Any ideas?
My table structure is:
| Id | activity | Share | etc...
| 1  | 1        | 1     |
| 2  | 1        | 2     |
Activity is a foreign key to the Activity table.
I want to get Ids 1 and 2.
Simplified SQL
First of all, let me simplify that query so it gives the same result:
SELECT id FROM `activity_shares`
HAVING COUNT(`activity_id`) > 1
Doctrine QueryBuilder
If you store the id of the activity in the table, as your SQL suggests:
You can use the simplified SQL to build a query:
$results =$this->getEntityManager()->createQueryBuilder('c')
->add('from','MyBundleDataBundle:ActivityShare c')
If you are using association tables (Doctrine logic)
Here you will have to use a join, but the count may be tricky.
Solution 1
Use the association table like an entity (as I see it, you only need the id).
Let's say the table name is activityshare_activity.
It will have two fields, activity_id and activityshare_id. If you find a way to add a new id column to that table and make it auto-increment and the primary key, the rest is easy:
The new entity would be called ActivityShareActivity:
$results =$this->getEntityManager()->createQueryBuilder('c')
->add('from','MyBundleDataBundle:ActivityShareActivity c')
The steps to add the new identification column and make it compatible with Doctrine (you only need to do this once):
Add the column (INT, NOT NULL); don't make it auto-increment yet.
Populate the column using a PHP loop such as a for loop.
Modify the column to be auto-increment.
The correction to your query
->from('MyBundleDataBundle:ActivityShare', 'c')
->innerJoin('c.activity', 'ca')
->groupBy('') //note: it's not
I posted this one last because I am not 100% sure of the output of having + count, but it should work just fine :)
Thanks for your answers. I finally managed to get an answer.
My Doctrine query is:
->add('from','MyBundleDataBundle:ActivityShare as')
->innerJoin('as.activity', 'a')
->add('from','ChowzterDataBundle:ActivityShare c')
->innerJoin('c.activity', 'ca');
$query->andWhere($query->expr()->in('', $subquery->getDql()))
$result = $query->getQuery();
And SQL looks like:
SELECT AS id0 FROM activity_shares a0_ INNER JOIN activities a1_ ON a0_.activity_id = WHERE IN (SELECT FROM activity_shares a3_ INNER JOIN activities a2_ ON a3_.activity_id = GROUP BY HAVING count( > 1
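The shape of the working query (select every row whose activity_id appears in a grouped subquery with count > 1) can be checked outside Doctrine. The following is a minimal sketch using Python's built-in sqlite3 and plain SQL instead of DQL. The column names activity_id and share follow the question's table; the third row (a share of a different activity) is an assumption added to show that non-duplicated rows are excluded.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory activity_shares table with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activity_shares ("
        "id INTEGER PRIMARY KEY, activity_id INTEGER, share INTEGER)")
    conn.executemany(
        "INSERT INTO activity_shares VALUES (?, ?, ?)",
        [(1, 1, 1),   # rows 1 and 2 share activity 1 (from the question)
         (2, 1, 2),
         (3, 2, 1)])  # hypothetical extra row with a unique activity
    return conn


def shares_with_duplicate_activity(conn):
    """Return ids of all shares whose activity_id occurs more than once."""
    rows = conn.execute("""
        SELECT id
        FROM activity_shares
        WHERE activity_id IN (
            SELECT activity_id
            FROM activity_shares
            GROUP BY activity_id
            HAVING COUNT(*) > 1
        )
        ORDER BY id
    """)
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(shares_with_duplicate_activity(make_demo_db()))
```

Grouping only inside the subquery is what keeps both rows: a GROUP BY on the outer query collapses each activity to a single row, which matches the one-record result described in the question.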