I have created a table in dashDB from a CSV file using the console load functionality.
How can I get the table DDL using only the dashDB console?
You can download the runtime client, which includes tools such as db2look, for free from https://www-01.ibm.com/marketing/iwm/iwm/web/preLogin.do?source=swg-idsclt, and then catalog your dashDB instance and database.
In the end, I viewed the table definition in the console, selected the HTML, and pasted it into a text editor.
db2look would have been a better option, but I couldn't get the command-line tools set up with SSL support.
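If you want to stay entirely in the dashDB console, one hedged alternative is to read the column definitions back from the catalog in the console's SQL editor and reassemble the CREATE TABLE by hand. This is only a sketch; SYSCAT.COLUMNS is the standard DB2 catalog view, and MYSCHEMA/MYTABLE are placeholders for your own schema and table names:
SELECT COLNAME, TYPENAME, LENGTH, SCALE, NULLS
FROM SYSCAT.COLUMNS
WHERE TABSCHEMA = 'MYSCHEMA'
  AND TABNAME = 'MYTABLE'
ORDER BY COLNO;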
For dashDB Local, try the command below to display the SQL DDL (CREATE TABLE) statements used to create the tables. I ran it from the Docker CLI client that I launched from the Kitematic window.
docker exec -it dashDB db_ddl_table -db bludb
Here is a snapshot of the output I am seeing:
bash-3.2$ docker exec -it dashDB db_ddl_table -db bludb
-- Timestamp: Fri Jun 24 20:46:39 UTC 2016
-- Database Name: bludb
-- DDL Statements for Table "IBMADT "."AUDITTRAIL"
CREATE TABLE "IBMADT "."AUDITTRAIL" (
"RECORDID" BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY (
START WITH +1
INCREMENT BY +1
MINVALUE +1
MAXVALUE +9223372036854775807
NO CYCLE
CACHE 20
NO ORDER ) ,
"ACTIVITYTIME" TIMESTAMP NOT NULL WITH DEFAULT CURRENT TIMESTAMP ,
"ACTIVITYTYPE" VARCHAR(30 OCTETS) NOT NULL ,
"ACTIVITYPARAMS" VARCHAR(255 OCTETS) ,
"USERID" VARCHAR(255 OCTETS) NOT NULL ,
"USERROLE" VARCHAR(20 OCTETS) NOT NULL ,
"REMOTEHOST" VARCHAR(255 OCTETS) ,
"SESSIONID" VARCHAR(255 OCTETS) ,
"RESPONSECODE" CHAR(5 OCTETS) )
IN "USERSPACE1"
ORGANIZE BY ROW#
-- DDL Statements for Primary Key on Table "IBMADT "."AUDITTRAIL"
ALTER TABLE "IBMADT "."AUDITTRAIL"
ADD PRIMARY KEY
("RECORDID")#
-- DDL Statements for Table "DB2GSE "."GSE_COORDINATE_SYSTEMS"
CREATE TABLE "DB2GSE "."GSE_COORDINATE_SYSTEMS" (
"COORDSYS_NAME" VARCHAR(128 OCTETS) NOT NULL ,
"DEFINITION" VARCHAR(2048 OCTETS) NOT NULL ,
"ORGANIZATION" VARCHAR(128 OCTETS) ,
"ORGANIZATION_COORDSYS_ID" INTEGER ,
"DESCRIPTION" VARCHAR(256 OCTETS) ,
"DEFINER" VARCHAR(128 OCTETS) NOT NULL WITH DEFAULT USER )
IN "USERSPACE1"
ORGANIZE BY ROW#
So I have two Hive queries: one creates the table, and the other reads Parquet data from another table and inserts the relevant columns into my new table. I would like this new Hive table to export its data to an S3 location in csv.gz format. My Hive queries running on EMR currently output 00000_0.gz files, and I have to rename them to csv.gz using a bash script. This is quite hacky, as I have to mount my S3 directory in my terminal, and it would be ideal if my queries could do this directly. Could someone please review my queries to see if there's any fault? Many thanks.
CREATE TABLE db.test (
app_id string,
app_account_id string,
sdk_ts BIGINT,
device_id string)
PARTITIONED BY (
load_date string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION "s3://test_unload/";
set hive.execution.engine=tez;
set hive.cli.print.header=true;
set hive.exec.compress.output=true;
set hive.merge.tezfiles=true;
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.merge.smallfiles.avgsize=1024000000;
set hive.merge.size.per.task=1024000000;
set hive.exec.dynamic.partition.mode=nonstrict;
insert into db.test
partition(load_date)
select
'' as app_id,
'288' as app_account_id,
from_unixtime(CAST(event_epoch as BIGINT), 'yyyy-MM-dd HH:mm:ss') as sdk_ts,
device_id,
'20221106' as load_date
FROM processed_events.test
where load_date = '20221106';
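As far as I can tell, the 00000_0.gz part-file names come from the execution engine rather than from the query itself, so the rename to .csv.gz still has to happen outside Hive (for example with S3DistCp or a script). That said, if db.test exists only for the export, a hedged alternative is to write the delimited, gzip-compressed output straight to the S3 prefix and skip the intermediate table; this is only a sketch based on the query above, with the same compression settings assumed to be in effect:
-- Writes ','-delimited, gzip-compressed text files directly under the S3 prefix.
-- The part files are still named by the engine (e.g. 000000_0.gz), not *.csv.gz.
set hive.exec.compress.output=true;
INSERT OVERWRITE DIRECTORY 's3://test_unload/load_date=20221106/'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT
  '' as app_id,
  '288' as app_account_id,
  from_unixtime(CAST(event_epoch as BIGINT), 'yyyy-MM-dd HH:mm:ss') as sdk_ts,
  device_id
FROM processed_events.test
WHERE load_date = '20221106';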
We are reading data from an Amazon Kinesis Data Streams (KDS) stream using Apache Flink. The incoming stream records contain a column named "timestamp". We also want to use the approximate arrival time value from the record metadata (which is automatically added by KDS). The metadata key for the approximate arrival time is also named "timestamp". This causes an error when trying to use both columns:
CREATE TABLE source_table
(
`timestamp` VARCHAR(100),
`approximate_arrival_time` TIMESTAMP(3) METADATA FROM 'timestamp'
)
WITH
(
'connector' = 'kinesis',
...
);
Trying to access data from the table:
SELECT * FROM source_table;
Results in this error:
org.apache.flink.table.api.ValidationException: Field names must be unique. Found duplicates: [timestamp]
at org.apache.flink.table.types.logical.RowType.validateFields(RowType.java:272)
at org.apache.flink.table.types.logical.RowType.<init>(RowType.java:157)
at org.apache.flink.table.planner.connectors.DynamicSourceUtils.createProducedType(DynamicSourceUtils.java:215)
at org.apache.flink.table.planner.connectors.DynamicSourceUtils.validateAndApplyMetadata(DynamicSourceUtils.java:443)
at org.apache.flink.table.planner.connectors.DynamicSourceUtils.prepareDynamicSource(DynamicSourceUtils.java:158)
at org.apache.flink.table.planner.connectors.DynamicSourceUtils.convertSourceToRel(DynamicSourceUtils.java:119)
at org.apache.flink.table.planner.plan.schema.CatalogSourceTable.toRel(CatalogSourceTable.java:85)
at org.apache.calcite.sql2rel.SqlToRelConverter.toRel(SqlToRelConverter.java:3585)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:2507)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2144)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2093)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2050)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:663)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:644)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3438)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:570)
at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:169)
at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:161)
at org.apache.flink.table.planner.operations.SqlToOperationConverter.toQueryOperation(SqlToOperationConverter.java:989)
at org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlQuery(SqlToOperationConverter.java:958)
at org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:283)
at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:704)
at org.apache.zeppelin.flink.sql.AbstractStreamSqlJob.run(AbstractStreamSqlJob.java:102)
at org.apache.zeppelin.flink.FlinkStreamSqlInterpreter.callInnerSelect(FlinkStreamSqlInterpreter.java:89)
at org.apache.zeppelin.flink.FlinkSqlInterrpeter.callSelect(FlinkSqlInterrpeter.java:503)
at org.apache.zeppelin.flink.FlinkSqlInterrpeter.callCommand(FlinkSqlInterrpeter.java:266)
at org.apache.zeppelin.flink.FlinkSqlInterrpeter.runSqlList(FlinkSqlInterrpeter.java:160)
at org.apache.zeppelin.flink.FlinkSqlInterrpeter.internalInterpret(FlinkSqlInterrpeter.java:112)
at org.apache.zeppelin.interpreter.AbstractInterpreter.interpret(AbstractInterpreter.java:47)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:110)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:852)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:744)
at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
at org.apache.zeppelin.scheduler.ParallelScheduler.lambda$runJobInScheduler$0(ParallelScheduler.java:46)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
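To illustrate where the collision comes from: the Kinesis connector also documents 'shard-id' and 'sequence-number' metadata keys, and as far as I can tell declaring those alongside the physical `timestamp` column validates fine, because only the 'timestamp' metadata key shares its name with a payload field. A sketch (connector options elided as above; the column aliases are placeholders):
CREATE TABLE source_table_no_collision
(
    `timestamp` VARCHAR(100),
    `shard_id` VARCHAR(128) METADATA FROM 'shard-id' VIRTUAL,
    `seq_no` VARCHAR(128) METADATA FROM 'sequence-number' VIRTUAL
)
WITH
(
    'connector' = 'kinesis',
    ...
);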
I am working with WSO2 IS 5.2.0.
For various reasons, I would like to change the data schema of the default embedded H2 database.
For example, the maximum length of the column "ACCESS_TOKEN" in table "IDN_OAUTH2_ACCESS_TOKEN" is 255 chars. I would like to change it to 8194.
I made the following change to the script file "/dbscripts/identity/h2.sql" (see the value "8194"):
CREATE TABLE IF NOT EXISTS IDN_OAUTH2_ACCESS_TOKEN (
TOKEN_ID VARCHAR (255),
ACCESS_TOKEN VARCHAR (8194),
REFRESH_TOKEN VARCHAR (255),
CONSUMER_KEY_ID INTEGER,
AUTHZ_USER VARCHAR (100),
TENANT_ID INTEGER,
USER_DOMAIN VARCHAR(50),
USER_TYPE VARCHAR (25),
GRANT_TYPE VARCHAR (50),
TIME_CREATED TIMESTAMP DEFAULT 0,
REFRESH_TOKEN_TIME_CREATED TIMESTAMP DEFAULT 0,
VALIDITY_PERIOD BIGINT,
REFRESH_TOKEN_VALIDITY_PERIOD BIGINT,
TOKEN_SCOPE_HASH VARCHAR (32),
TOKEN_STATE VARCHAR (25) DEFAULT 'ACTIVE',
TOKEN_STATE_ID VARCHAR (128) DEFAULT 'NONE',
SUBJECT_IDENTIFIER VARCHAR(255),
PRIMARY KEY (TOKEN_ID),
FOREIGN KEY (CONSUMER_KEY_ID) REFERENCES IDN_OAUTH_CONSUMER_APPS(ID) ON DELETE CASCADE,
CONSTRAINT CON_APP_KEY UNIQUE (CONSUMER_KEY_ID,AUTHZ_USER,TENANT_ID,USER_DOMAIN,USER_TYPE,TOKEN_SCOPE_HASH,
TOKEN_STATE,TOKEN_STATE_ID)
);
The problem is that I just cannot put this change into effect. I have tried everything (restart, reinstall), but the original setting (255) persists...
It seems that the database schema had already been generated with the IS server image, and the script uses "CREATE TABLE IF NOT EXISTS ...", so the existing table is left untouched.
Does anyone have any idea?
Thanks
Remove <IS_HOME>/repository/database/*. Then start server with -Dsetup.
./wso2server.sh -Dsetup
Yes, it is generated with the server distribution. What you can do is (back up and) remove the contents of repository/database, update the db scripts, and start the server with bin/wso2server.sh -Dsetup. This needs to be done only once; from then on you can start the server as usual.
One other possibility is to use the H2 console. If you already have data, it will be the better option.
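For the H2 console route, a minimal sketch of the statement that widens the column in place (assuming the default embedded identity tables and a backup taken first; H2 accepts a plain data type in ALTER COLUMN):
ALTER TABLE IDN_OAUTH2_ACCESS_TOKEN ALTER COLUMN ACCESS_TOKEN VARCHAR(8194);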
When I create a database dump from Amazon RDS and then I try to import it locally, the result is ERROR 1064 (42000) at line 54.
At line 54 there is the following statement:
CREATE TABLE account_emailconfirmation (
The command used for dump is:
mysqldump -u user -h host.rds.amazonaws.com -p --default-character-set=utf8 --result-file=sync.sql database_name
The command used for import is:
mysql --user=root -p mpl -vv < sync.sql
And here is the output (verbosity increased).
--------------
CREATE TABLE `account_emailconfirmation` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`created` datetime(6) NOT NULL,
`sent` datetime(6) DEFAULT NULL,
`key` varchar(64) NOT NULL,
`email_address_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `key` (`key`),
KEY `acc_email_address_id_5bcf9f503c32d4d8_fk_account_emailaddress_id` (`email_address_id`),
CONSTRAINT `acc_email_address_id_5bcf9f503c32d4d8_fk_account_emailaddress_id` FOREIGN KEY (`email_address_id`) REFERENCES `account_emailaddress` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
--------------
ERROR 1064 (42000) at line 54: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(6) NOT NULL,
`sent` datetime(6) DEFAULT NULL,
`key` varchar(64) NOT NULL,
' at line 3
Bye
The problem is with datetime(6). MySQL introduced storage of fractional seconds in v5.6.4. The syntax for declaring fractional seconds on datetime columns (this is the (6)) is not recognised by earlier MySQL versions.
The data was exported from MySQL v5.6.4 or later and was then imported into an earlier version. Since the error message starts at the (6), I believe this is the problem.
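If upgrading the local MySQL server to 5.6.4 or later is not an option, a hedged workaround is to edit the dump and drop the fractional-seconds precision (losing sub-second resolution). A sketch of the adjusted statement from the dump above:
CREATE TABLE `account_emailconfirmation` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  -- datetime without (6): fractional-seconds precision requires MySQL 5.6.4+
  `created` datetime NOT NULL,
  `sent` datetime DEFAULT NULL,
  `key` varchar(64) NOT NULL,
  `email_address_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key` (`key`),
  KEY `acc_email_address_id_5bcf9f503c32d4d8_fk_account_emailaddress_id` (`email_address_id`),
  CONSTRAINT `acc_email_address_id_5bcf9f503c32d4d8_fk_account_emailaddress_id` FOREIGN KEY (`email_address_id`) REFERENCES `account_emailaddress` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;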
'key' is a reserved word in MySQL, and shouldn't be used as a column name. It's possible that a different version on Amazon RDS allowed it, but your best bet is to change the column name to something different.
My use case: push data from a stream configured in the ESB to BAM and create a report using the "Gadget Generation Tool".
Publishing the stream from ESB to BAM after adding an agent to the proxy service worked fine.
From the stream I created a table using the Analytics->Add screen and the table seems to persist as I am able to do a select and see results from the same screen.
Now I am trying to generate a dashboard using the Gadget Generation Tool, but the table is not available; the JDBC connection works fine, yet the table is nowhere to be found:
Script for Analytic Table run from Analytics->Add screen
CREATE EXTERNAL TABLE IF NOT EXISTS CREDITTABLE(creditkey STRING, creditFlag STRING, version STRING)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES ( "cassandra.host" = "127.0.0.1" ,
cassandra.port" = "9163" , "cassandra.ks.name" = "EVENT_KS" ,
"cassandra.ks.username" = "admin" ,
"cassandra.ks.password" = "admin" ,
"cassandra.cf.name" = "firstStream" ,
"cassandra.columns.mapping" = ":key,payload_k1-constant, Version" );
Tried looking for table in following databases:
jdbc:h2:repository/database/WSO2CARBON_DB;AUTO_SERVER=TRUE
jdbc:h2:repository/database/metastore_db;AUTO_SERVER=TRUE
jdbc:h2:repository/database/samples/BAM_STATS_DB;AUTO_SERVER=TRUE
Have not done any custom db configurations.
Did you try jdbc:h2:repository/database/samples/WSO2CARBON_DB;AUTO_SERVER=TRUE? Also, what you have pasted is the Cassandra storage definition, which is probably used for reading the input, not for persisting the output. If you post the full Hive query, that would help to figure out the problem.
Why did I not see the table in the Gadget Generation tool?
The table I created using the Hive script is a Cassandra (distributed database) table, while the references I gave in the Gadget Generation tool when looking up the table pointed to the H2 RDBMS database.
Below are the references to the H2 RDBMS database which comes out of the box with WSO2:
jdbc:h2:repository/database/WSO2CARBON_DB;AUTO_SERVER=TRUE
jdbc:h2:repository/database/metastore_db;AUTO_SERVER=TRUE
jdbc:h2:repository/database/samples/BAM_STATS_DB;AUTO_SERVER=TRUE
Resolution: how to get the tables listed in the Gadget Generation tool?
To get the tables listed in the Gadget Generation tool, you have to use the Hive script to complete the following 3 steps:
1. Create a Hive table reference for the Cassandra data stream to which data is pushed from the ESB (in my case).
CREATE EXTERNAL TABLE IF NOT EXISTS CREDITTABLE(
payload_creditkey STRING, payload_creditFlag STRING, payload_version STRING) STORED BY
'org.apache.hadoop.hive.cassandra.CassandraStorageHandler' WITH SERDEPROPERTIES ( "cassandra.host" = "127.0.0.1" ,
"cassandra.port" = "9163" , "cassandra.ks.name" = "EVENT_KS" , "cassandra.ks.username" = "admin" , "cassandra.ks.password" = "admin" ,
"cassandra.cf.name" = "firstStream" , "cassandra.columns.mapping" = ":key,payload_k1-constant, Version" );
2. Using a Hive script, create an H2 RDBMS table reference into which the data from the Cassandra stream will be copied.
CREATE EXTERNAL TABLE IF NOT EXISTS CREDITTABLEh2summary(
creditFlg STRING,
verSion STRING
)
STORED BY
'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
'mapred.jdbc.driver.class' = 'org.h2.Driver' ,
'mapred.jdbc.url' = 'jdbc:h2:C:/wso2bam-2.2.0/repository/samples/database/BAM_STATS_DB' ,
'mapred.jdbc.username' = 'wso2carbon' ,
'mapred.jdbc.password' = 'wso2carbon' ,
'hive.jdbc.update.on.duplicate' = 'true' ,
'hive.jdbc.primary.key.fields' = 'creditFlg' ,
'hive.jdbc.table.create.query' = 'CREATE TABLE CREDITTABLE_newh2(creditFlg VARCHAR(100), version VARCHAR(100))' );
3. Write a Hive query that copies the data from Cassandra to H2 (RDBMS).
insert overwrite table CREDITTABLEh2summary select a.payload_creditFlag,a.payload_version from CREDITTABLE a;
On doing this I was able to see the table in the Gadget Generation tool; however, I also had to change the reference to the H2 database to an absolute path in the JDBC URL value that I passed.
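For example, the JDBC URL entered in the Gadget Generation tool then looks something like the absolute form below (the path mirrors the TBLPROPERTIES above and is an assumption; adjust it to your own installation directory):
jdbc:h2:C:/wso2bam-2.2.0/repository/samples/database/BAM_STATS_DB;AUTO_SERVER=TRUE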
Observation:
I was wondering whether the Gadget Generation tool can point directly to the Cassandra stream without having to copy the tables to an RDBMS database.