Apache Calcite - Can't seem to parse DDL statements - apache-calcite

Following is the configuration that I'm using to parse MySQL statements. I am able to parse and process DML statements fine but I can't seem to parse any DDL statements (CREATE TABLE, ALTER TABLE, DROP TABLE, etc.).
public static SqlNode parse(String query) {
SqlParser.Config sqlParserConfig = SqlParser.configBuilder()
.setConformance(SqlConformanceEnum.MYSQL_5)
.setLex(Lex.MYSQL)
.build();
SqlParser parser = SqlParser.create(query, sqlParserConfig);
try {
return parser.parseStmt();
} catch (SqlParseException e) {
lastErrorMessage = e.getMessage();
return null;
}
}
When I try to parse a CREATE TABLE statement, it gives me the following error message.
Encountered "ALTER TABLE" at line 1, column 1.
Was expecting one of:
"SET" ...
"RESET" ...
"ALTER" "SYSTEM" ...
"ALTER" "SESSION" ...
"WITH" ...
"+" ...
"-" ...
"NOT" ...
"EXISTS" ...
<UNSIGNED_INTEGER_LITERAL> ...
<DECIMAL_NUMERIC_LITERAL> ...
<APPROX_NUMERIC_LITERAL> ...
<BINARY_STRING_LITERAL> ...
<PREFIXED_STRING_LITERAL> ...
<QUOTED_STRING> ...
<UNICODE_STRING_LITERAL> ...
...
I can see classes like SqlAlter, SqlCreate,SqlDrop in the library. Is there a different way to parse DDLs?
I'm using 1.17.0 version.

I think you are just missing the appropriate parser factory. Try the following:
SqlParser.Config sqlParserConfig = SqlParser.configBuilder()
.setParserFactory(SqlDdlParserImpl.FACTORY)
.setConformance(SqlConformanceEnum.MYSQL_5)
.setLex(Lex.MYSQL)
.build();
SqlParser parser = SqlParser.create(query, sqlParserConfig);

Related

Create XML dataset with the same table name as initial data set in DBUnit?

I'm trying to create an initial DB state in DB Unit like this...
public function getDataSet() {
$primary = new \PHPUnit\DbUnit\DataSet\CompositeDataSet();
$fixturePaths = [
"test/Seeds/Upc/DB/UpcSelect.xml",
"test/Seeds/Generic/DB/ProductUpcSelect.xml"
];
foreach($fixturePaths as $fixturePath) {
$dataSet = $this->createXmlDataSet($fixturePath);
$primary->addDataSet($dataSet);
}
return $primary;
}
Then after my query I'm attempting to call this user-defined function...
protected function compareDatabase(String $seedPath, String $table) {
$expected = $this->createFlatXmlDataSet($seedPath)->getTable($table);
$result = $this->getConnection()->createQueryTable($table, "SELECT * FROM $table");
$this->assertTablesEqual($expected, $result);
}
The idea here is that I have an initial DB state, run my query, then compare the actual table state with the XML data set representing what I expect the table to look like. This process is described in PHPUnit's documentation for DBUnit but I keep having an exception thrown...
PHPUnit\DbUnit\InvalidArgumentException: There is already a table named upc with different table definition
Test example...
public function testDeleteByUpc() {
$mapper = new UpcMapper($this->getPdo());
$mapper->deleteByUpc("someUpcCode1");
$this->compareDatabase("test/Seeds/Upc/DB/UpcAfterDelete.xml", 'upc');
}
I seem to be following the docs...how is this supposed to be done?
This was actually unrelated to creating a second XML Dataset. This exception was thrown because the two fixtures I loaded in my getDataSet() method both had table definitions for upc.

Adding stored procedures to In-Memory DB using SqLite

I am using In-Memory database (using ServiceStack.OrmLite.Sqlite.Windows) for unit testing in servicestack based web api. I want to test the service endpoints which depends on stored Procedures through In-Memory database for which i have gone through the link Servicestack Ormlite SqlServerProviderTests, the unit test class that i am using for the test is as follows,
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using NUnit.Framework;
using ServiceStack.Text;
using ServiceStack.Configuration;
using ServiceStack.Data;
namespace ServiceStack.OrmLite.Tests
{
public class DummyTable
{
public int Id { get; set; }
public string Name { get; set; }
}
[TestFixture]
public class SqlServerProviderTests
{
private IDbConnection db;
protected readonly ServiceStackHost appHost;
public SqlServerProviderTests()
{
appHost = TestHelper.SetUp(appHost).Init();
db = appHost.Container.Resolve<IDbConnectionFactory>().OpenDbConnection("inventoryDb");
if (bool.Parse(System.Configuration.ConfigurationManager.AppSettings["IsMock"]))
TestHelper.CreateInMemoryDB(appHost);
}
[TestFixtureTearDown]
public void TearDown()
{
db.Dispose();
}
[Test]
public void Can_SqlColumn_StoredProc_returning_Column()
{
var sql = #"CREATE PROCEDURE dbo.DummyColumn
#Times integer
AS
BEGIN
SET NOCOUNT ON;
CREATE TABLE #Temp
(
Id integer NOT NULL,
);
declare #i int
set #i=1
WHILE #i < #Times
BEGIN
INSERT INTO #Temp (Id) VALUES (#i)
SET #i = #i + 1
END
SELECT * FROM #Temp;
DROP TABLE #Temp;
END;";
db.ExecuteSql("IF OBJECT_ID('DummyColumn') IS NOT NULL DROP PROC DummyColumn");
db.ExecuteSql(sql);
var expected = 0;
10.Times(i => expected += i);
var results = db.SqlColumn<int>("EXEC DummyColumn #Times", new { Times = 10 });
results.PrintDump();
Assert.That(results.Sum(), Is.EqualTo(expected));
results = db.SqlColumn<int>("EXEC DummyColumn 10");
Assert.That(results.Sum(), Is.EqualTo(expected));
results = db.SqlColumn<int>("EXEC DummyColumn #Times", new Dictionary<string, object> { { "Times", 10 } });
Assert.That(results.Sum(), Is.EqualTo(expected));
}
}
}
when i tried to execute this through Live-DB, it was working fine. but when i tried for In-Memory DB was getting Exceptions as follows,
System.Data.SQLite.SQLiteException : SQL logic error or missing database near "IF": syntax error
near the code line,
db.ExecuteSql("IF OBJECT_ID('DummyColumn') IS NOT NULL DROP PROC DummyColumn");
i commented the above line and executed the test case but still i am getting exception as follows,
System.Data.SQLite.SQLiteException : SQL logic error or missing database near "IF": syntax error
for the code line,
db.ExecuteSql(sql);
the In-Memory DB Created is as follows, and its working fine for remaining cases.
public static void CreateInMemoryDB(ServiceStackHost appHost)
{
using (var db = appHost.Container.Resolve<IDbConnectionFactory>().OpenDbConnection("ConnectionString"))
{
db.DropAndCreateTable<DummyData>();
TestDataReader<TableList>("Reservation.json", "InMemoryInput").Reservation.ForEach(x => db.Insert(x));
db.DropAndCreateTable<DummyTable>();
}
}
why we are facing this exception is there any other way to add and run stored Procedure in In-Memory DB with Sqlite??
The error is because you're trying to run SQL Server-specific queries with TSQL against an in memory version of Sqlite - i.e. a completely different, embeddable database. As the name suggests SqlServerProviderTests only works on SQL Server, I'm confused why you would try to run this against Sqlite?
SQLite doesn't support Stored Procedures, TSQL, etc so trying to execute SQL Server TSQL statements will always result in an error. The only thing you can do is fake it with a custom Exec Filter, where you can catch the exception and return whatever custom result you like, e.g:
public class MockStoredProcExecFilter : OrmLiteExecFilter
{
public override T Exec<T>(IDbConnection dbConn, Func<IDbCommand, T> filter)
{
try
{
return base.Exec(dbConn, filter);
}
catch (Exception ex)
{
if (dbConn.GetLastSql() == "exec sp_name #firstName, #age")
return (T)(object)new Person { FirstName = "Mocked" };
throw;
}
}
}
OrmLiteConfig.ExecFilter = new MockStoredProcExecFilter();

org.apache.calcite.sql.validate.SqlValidatorException

I'm using Apache Calcite to parse a simple SQL statement and return its relational tree. I obtain a database schema using a JDBC connection to a simple SQLite database. The schema is then added using FrameworkConfig. The parser configuration is then modified to handle identifier quoting and case (not sensitive). However the SQL validator is unable to find the quoted table identifier in the SQL statement. Somehow the parser ignore the configuration settings and converts the table to UPPER CASE. A SqlValidatorException is raised, stating the the table name is not found. I suspect, the configuration is not being updated correctly? I have already validated that the table name is correctly included in the schema's meta-data.
public class ParseSQL {
public static void main(String[] args) {
try {
// register the JDBC driver
String sDriverName = "org.sqlite.JDBC";
Class.forName(sDriverName);
JsonObjectBuilder builder = Json.createObjectBuilder();
builder.add("jdbcDriver", "org.sqlite.JDBC")
.add("jdbcUrl",
"jdbc:sqlite://calcite/students.db")
.add("jdbcUser", "root")
.add("jdbcPassword", "root");
Map<String, JsonValue> JsonObject = builder.build();
//argument for JdbcSchema.Factory().create(....)
Map<String, Object> operand = new HashMap<String, Object>();
//explicitly extract JsonString(s) and load into operand map
for(String key : JsonObject.keySet()) {
JsonString value = (JsonString) JsonObject.get(key);
operand.put(key, value.getString());
}
final SchemaPlus rootSchema = Frameworks.createRootSchema(true);
Schema schema = new JdbcSchema.Factory().create(rootSchema, "students", operand);
rootSchema.add("students", schema);
//build a FrameworkConfig using defaults where values aren't required
Frameworks.ConfigBuilder configBuilder = Frameworks.newConfigBuilder();
//set defaultSchema
configBuilder.defaultSchema(rootSchema);
//build configuration
FrameworkConfig frameworkdConfig = configBuilder.build();
//use SQL parser config builder to ignore case of quoted identifier
SqlParser.configBuilder(frameworkdConfig.getParserConfig()).setQuotedCasing(Casing.UNCHANGED).build();
//use SQL parser config builder to set SQL case sensitive = false
SqlParser.configBuilder(frameworkdConfig.getParserConfig()).setCaseSensitive(false).build();
//get planner
Planner planner = Frameworks.getPlanner(frameworkdConfig);
//parse SQL statement
SqlNode sql_node = planner.parse("SELECT * FROM \"Students\" WHERE age > 15.0");
System.out.println("\n" + sql_node.toString());
//validate SQL
SqlNode sql_validated = planner.validate(sql_node);
//get associated relational expression
RelRoot relationalExpression = planner.rel(sql_validated);
relationalExpression.toString();
} catch (SqlParseException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (RelConversionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ValidationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ClassNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} // end main
} // end class
***** ERROR MESSAGE ******
Jan 20, 2016 8:54:51 PM org.apache.calcite.sql.validate.SqlValidatorException
SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 'Students' not found
This is a case-sensitivity issue, similar to table not found with apache calcite. Because you enclosed the table name in quotes in your SQL statement, the validator is looking for a table called "Students", and the error message attests to this. If your table is called "Students", I am surprised that Calcite can't find it.
There is a problem with how you are using the SqlParser.ConfigBuilder. When you call build(), you are not using the SqlParser.Config object that it creates. If you passed that object to Frameworks.ConfigBuilder.parserConfig, I think you would get the behavior you want.

HBase Get/Scan in a Scalding job

I'm using Scalding with Spyglass to read from/write to HBase.
I'm doing a left outer join of table1 and table2 and write back to table1 after transforming a column.
Both table1 and table2 are declared as Spyglass HBaseSource.
This works fine. But, i need to access a different row in table1 using rowkey to compute transformed value.
I tried the following for HBase get:
val hTable = new HTable(conf, TABLE_NAME)
val result = hTable.get(new Get(rowKey.getBytes()))
I'm getting access to Configuration in Scalding job as mentioned in this link:
https://github.com/twitter/scalding/wiki/Frequently-asked-questions#how-do-i-access-the-jobconf
This works when i run the scalding job locally.
But, when i run it in cluster, conf is null when this code is executed in Reducer.
Is there a better way to do HBase get/scan in a Scalding/Cascading job for cases like this?
Ways to do this...
1) You can use a managed resource
class SomeJob(args: Args) extends Job(args) {
val someConfig = HBaseConfiguration.create().addResource(new Path(pathtoyourxmlfile))
lazy val hPool = new HTablePool(someConfig, 3)
def getConf = {
implicitly[Mode] match {
case Hdfs(_, conf) => conf
case _ => whateveryou are doing for a local conf...
}
}
... somePipe.someOperation.... {
val gets = key.map { key => new Get(key) }
managed(hPool.getTable("myTableName")) acquireAndGet { table =>
val results = table.get(gets)
...do something with these results
}
}
}
2) You can use some more specific cascading code, where you write a custom scheme and inside that you will override the source method and possibly some others depending on your needs. In there you can access the JobConf like this:
class MyScheme extends Scheme[JobConf, SomeRecordReader, SomeOutputCollector, ..] {
#transient var jobConf: Configuration = super.jobConfiguration
override def source(flowProcess: FlowProcess[JobConf], ...): Boolean = {
jobConf = flowProcess match {
case h: HadoopFlowProcess => h.getJobConf
case _ => jconf
}
... dosomething with the jobConf here
}
}

Same Instances header ( arff ) for all my database queries

I am using InstanceQuery , SQL queries, to construct my Instances. But my query results does not come in the same order always as it is normal in SQL.
Beacuse of this Instances constucted from different SQL has different headers. A simple example can be seen below. I suspect my results changes because of this behavior.
Header 1
#attribute duration numeric
#attribute protocol_type {tcp,udp}
#attribute service {http,domain_u}
#attribute flag {SF}
Header 2
#attribute duration numeric
#attribute protocol_type {tcp}
#attribute service {pm_dump,pop_2,pop_3}
#attribute flag {SF,S0,SH}
My question is : How can I give correct header information to Instance construction.
Is something like below workflow is possible?
get pre-prepared header information from arff file or another place.
give instance construction this header information
call sql function and get Instances (header + data)
I am using following sql function to get instances from database.
public static Instances getInstanceDataFromDatabase(String pSql
,String pInstanceRelationName){
try {
DatabaseUtils utils = new DatabaseUtils();
InstanceQuery query = new InstanceQuery();
query.setUsername(username);
query.setPassword(password);
query.setQuery(pSql);
Instances data = query.retrieveInstances();
data.setRelationName(pInstanceRelationName);
if (data.classIndex() == -1)
{
data.setClassIndex(data.numAttributes() - 1);
}
return data;
} catch (Exception e) {
throw new RuntimeException(e);
}
}
I tried various approaches to my problem. But it seems that weka internal API does not allow solution to this problem right now. I modified weka.core.Instances append command line code for my purposes. This code is also given in this answer
According to this, here is my solution. I created a SampleWithKnownHeader.arff file , which contains correct header values. I read this file with following code.
public static Instances getSampleInstances() {
Instances data = null;
try {
BufferedReader reader = new BufferedReader(new FileReader(
"datas\\SampleWithKnownHeader.arff"));
data = new Instances(reader);
reader.close();
// setting class attribute
data.setClassIndex(data.numAttributes() - 1);
}
catch (Exception e) {
throw new RuntimeException(e);
}
return data;
}
After that , I use following code to create instances. I had to use StringBuilder and string values of instance, then I save corresponding string to file.
public static void main(String[] args) {
Instances SampleInstance = MyUtilsForWeka.getSampleInstances();
DataSource source1 = new DataSource(SampleInstance);
Instances data2 = InstancesFromDatabase
.getInstanceDataFromDatabase(DatabaseQueries.WEKALIST_QUESTION1);
MyUtilsForWeka.saveInstancesToFile(data2, "fromDatabase.arff");
DataSource source2 = new DataSource(data2);
Instances structure1;
Instances structure2;
StringBuilder sb = new StringBuilder();
try {
structure1 = source1.getStructure();
sb.append(structure1);
structure2 = source2.getStructure();
while (source2.hasMoreElements(structure2)) {
String elementAsString = source2.nextElement(structure2)
.toString();
sb.append(elementAsString);
sb.append("\n");
}
} catch (Exception ex) {
throw new RuntimeException(ex);
}
MyUtilsForWeka.saveInstancesToFile(sb.toString(), "combined.arff");
}
My save instances to file code is as below.
public static void saveInstancesToFile(String contents,String filename) {
FileWriter fstream;
try {
fstream = new FileWriter(filename);
BufferedWriter out = new BufferedWriter(fstream);
out.write(contents);
out.close();
} catch (Exception ex) {
throw new RuntimeException(ex);
}
This solves my problem but I wonder if more elegant solution exists.
I solved a similar problem with the Add filter that allows adding attributes to Instances. You need to add a correct Attibute with proper list of values to both datasets (in my case - to test dataset only):
Load train and test data:
/* "train" contains labels and data */
/* "test" contains data only */
CSVLoader csvLoader = new CSVLoader();
csvLoader.setFile(new File(trainFile));
Instances training = csvLoader.getDataSet();
csvLoader.reset();
csvLoader.setFile(new File(predictFile));
Instances test = csvLoader.getDataSet();
Set a new attribute with Add filter:
Add add = new Add();
/* the name of the attribute must be the same as in "train"*/
add.setAttributeName(training.attribute(0).name());
/* getValues returns a String with comma-separated values of the attribute */
add.setNominalLabels(getValues(training.attribute(0)));
/* put the new attribute to the 1st position, the same as in "train"*/
add.setAttributeIndex("1");
add.setInputFormat(test);
/* result - a compatible with "train" dataset */
test = Filter.useFilter(test, add);
As a result, the headers of both "train" and "test" are the same (compatible for Weka machine learning)