Cannot Register Vora Tables in Spark - vora

When attempting to register all tables in Vora with
vc.sql("REGISTER ALL TABLES USING com.sap.spark.vora")
I receive the following error
"The current Vora version does not support parallel loading of partitioned tables. Please wait until the previous partitioned tables are loaded, then issue your query again."
Is there a way of clearing all previous requests? Is there a way to clear the Vora catalog outside of a SQL command?

This error can occur in Vora 1.2 due to a bug in the handling of partitioned tables. A workaround has now been documented on the Troubleshooting blog. The issue is planned to be fixed in the next Vora version.

Deleting the vora-discovery and vora-dlog directories removed all metadata, and we were able to recreate our tables.

Related

AWS DMS - Migrate - only schema

We have noticed that if a table is empty in SQL Server, the empty table is not migrated by DMS. It only starts to show up after a record is inserted.
Just checking, is there a way to get the schema only from DMS?
Thanks
You can use the AWS Schema Conversion Tool (SCT) for moving DB objects and schemas. It is a free tool from AWS and can be installed on an on-premises server or on EC2. It produces an assessment report before you actually migrate the DB schema and other DB objects, showing how many tables, stored procedures, functions, etc. can be migrated directly, along with possible solutions for the rest.

How to troubleshoot an unresponsive table in AWS Redshift?

I have recently faced a problem on a Redshift cluster where a table stopped responding.
My guess was that there was a lock on that table, but the query
select * from stl_tr_conflict order by xact_start_ts;
returns nothing, although judging by the AWS documentation the stl_tr_conflict table should have records of all transaction issues, including locks. Maybe records only live there while the lock is alive; I am not sure.
Searching the useractivitylog in S3 for the words "violation" and "ERROR" also gives no results, so I still can't figure out why one of the tables was not accessible.
I am new to database management, so I would appreciate any advice on how to troubleshoot this issue.
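One thing worth checking while the table is unresponsive is stv_locks, which shows locks currently held; since it is only populated while a lock exists, that would also explain stl_tr_conflict being empty after the fact (stl_tr_conflict records serialization conflicts, not plain lock waits). A sketch of the queries, where the PID passed to pg_terminate_backend is whichever lock_owner_pid the first query returns:

```sql
-- Locks currently held (only populated while the lock exists,
-- which may be why stl_tr_conflict shows nothing afterwards)
select table_id, lock_owner_pid, lock_status
from stv_locks;

-- Open transactions, including ones holding locks
select * from svv_transactions;

-- Last resort: terminate the session holding the lock
select pg_terminate_backend(<lock_owner_pid>);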

Storing error message to Redshift through datapipeline

I am trying to run a SQL activity on a Redshift cluster through Data Pipeline. After the SQL activity, a few log entries need to be written to a table in Redshift (such as the number of rows affected and the error message, if any).
Requirement:
If the SQL activity finishes successfully, the table should be written with the 'error' column as null;
if the SQL activity fails on any error, that particular error message needs to be written to the 'error' column of the Redshift table.
Can we achieve this through the pipeline? If yes, how?
Thanks,
Ravi.
Unfortunately, you cannot do this directly with SqlActivity in Data Pipeline. The workaround is to write a Java program (or any executable) that does what you want and schedule it via Data Pipeline using ShellCommandActivity.
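As a minimal sketch of that pattern (Python standing in for the scheduled executable, and sqlite3 standing in for the Redshift connection; against Redshift you would use a PostgreSQL-compatible driver and the cluster endpoint instead):

```python
# Sketch of the ShellCommandActivity pattern: run the activity SQL, then
# log the outcome (rows affected, error message or NULL) to a status table.
# sqlite3 is a stand-in here; the table/column names are illustrative.
import sqlite3

def run_and_log(conn, activity_sql):
    """Execute activity_sql; record rows affected and any error message."""
    conn.execute(
        "create table if not exists etl_log (rows_affected integer, error text)"
    )
    rows, error = 0, None
    try:
        cur = conn.execute(activity_sql)
        rows = cur.rowcount
        conn.commit()
    except sqlite3.Error as exc:
        conn.rollback()
        error = str(exc)  # goes into the 'error' column; stays NULL on success
    conn.execute(
        "insert into etl_log (rows_affected, error) values (?, ?)", (rows, error)
    )
    conn.commit()
    return rows, error

conn = sqlite3.connect(":memory:")
conn.execute("create table t (x integer)")
print(run_and_log(conn, "insert into t values (1)"))        # success: error is None
print(run_and_log(conn, "insert into missing values (1)"))  # failure: message logged
```

The same try/except-then-log structure carries over unchanged to a Java program; the key point is that the wrapper process, not SqlActivity itself, writes the status row.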

VORA Tables in Zeppelin and Spark shell

We have created a test table from the Spark shell as well as from Zeppelin. But when we run the show tables command, only a single table is visible in each environment: the table created via the Spark shell is not displayed by the show tables command in Zeppelin.
What is the difference between these two tables? Can anybody please explain?
The show tables command only shows the tables defined in the current session.
A table is created in the current session and also in a (persistent) catalog in ZooKeeper. You can show all tables that Vora saved in ZooKeeper via this command:
SHOW DATASOURCETABLES
USING com.sap.spark.vora
OPTIONS(zkurls "<zookeeper_server>:2181")
You can also register all or single tables in the current session via this command:
REGISTER ALL TABLES
USING com.sap.spark.vora
OPTIONS(zkurls "<zookeeper_server>:2181")
REGISTER TABLE <tablename>
USING com.sap.spark.vora
OPTIONS(zkurls "<zookeeper_server>:2181")
So if you want to access a table that you created in the Spark shell from Zeppelin, and vice versa, you need to register it first.
You can use the following commands if you need to clear the ZooKeeper catalog. Be aware that tables will then need to be recreated:
import com.sap.spark.vora.client._
ClusterUtils.clearZooKeeperCatalog("<zookeeper_server>:2181")
This (and more) information can be found in the Vora Installation and Developer Guide.

Error viewing API Manager Statistics using WSO2 DAS

I'm attempting to use AM 1.9.1 and store statistics in DAS 3.0.0. I'm using a MySQL database to house my WSO2AM_STATS_DB instance.
Data is being stored successfully in the database. I have records indicating that attempts were throttled out and requests were made successfully. Unfortunately, when I attempt to view any of the statistics in either the store or the publisher application, the logs show this error:
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'TempTable.apiPublisher' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
Can anyone provide some guidance on how to resolve this?
I was able to resolve this issue by removing ONLY_FULL_GROUP_BY from the MySQL sql_mode configuration.
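For reference, this is the commonly used idiom for dropping that mode at runtime without editing the option file (it affects new connections only, and a server restart reverts it unless my.cnf is updated as well):

```sql
-- Inspect the current setting
SELECT @@GLOBAL.sql_mode;

-- Remove ONLY_FULL_GROUP_BY while keeping the other modes
SET GLOBAL sql_mode = (SELECT REPLACE(
    REPLACE(@@GLOBAL.sql_mode, 'ONLY_FULL_GROUP_BY,', ''),
    'ONLY_FULL_GROUP_BY', ''));
```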