SAP HANA Vora Insert into table?

I am trying to do an insert into an existing table, but I receive a syntax error:
Statement:
vc.sql("insert into table HIST_TEMP values (0, 'AAA','2010-06-01', 30.5, 12.0)")
Error:
org.apache.spark.sql.SapParserException: Syntax error at or near line 1, column 36
insert into table HIST_TEMP values (0, 'AAA','2010-06-01', 30.5, 12.0)
at org.apache.spark.sql.SapSqlParser$.parse(SapSqlParser.scala:176)
Table:
vc.sql(s"""
CREATE TABLE HIST_TEMP(
INSTRUMENT_ID INT,
TRADING_SYMBOL VARCHAR(5),
TRADE_DATE DATE,
CLOSE_PRICE DOUBLE,
SPLIT_FACTOR DOUBLE)
USING com.sap.spark.vora
OPTIONS (tableName "HIST_TEMP",
hosts "$vHost",
zkurls "localhost:2181") """)

Vora currently only officially supports appending data to an existing table (using the APPEND statement). For details, see the SAP HANA Vora Developer Guide -> chapter 3.5, "Appending Data to Existing Tables".

The syntax for the insert should be
insert into <tablename> (col1, col2, col3...) values('val1', 'val2', 'val3'...);
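Applied to the HIST_TEMP table above, that would be something like the following (untested; per the note above, your Vora version may only officially support APPEND):
-- note: no TABLE keyword after INSERT INTO
insert into HIST_TEMP (INSTRUMENT_ID, TRADING_SYMBOL, TRADE_DATE, CLOSE_PRICE, SPLIT_FACTOR) values (0, 'AAA', '2010-06-01', 30.5, 12.0);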
Gopal

Related

Failed to parse SQL query - column invalid identifier

I am on Application Express 21.1.0.
I added a column to a db table, and tried to add that column to the Form region based on that table.
I got this error:
ORA-20999: Failed to parse SQL query! ORA-06550: line 4, column 15: ORA-00904: "NEEDED_EXAMS": invalid identifier
And I cannot find the column in any "Source > Column" attribute of any page item on that form.
I can query the new column in "SQL COMMANDS".
The new column's name is "NEEDED_EXAMS". It's a VARCHAR2(500).
Don't do it manually; use the built-in feature: right-click the region and select "Synchronize columns" from the menu, and it will do everything for you. It works for reports and forms.
Solved. I have many parsing schemas, and I was creating the tables through Object Browser in a different schema than my app's parsing schema.

How to RENAME struct/array nested columns using ALTER TABLE in BigQuery?

Suppose we have the following table in BigQuery:
CREATE TABLE sample_dataset.sample_table (
id INT
,struct_geo STRUCT
<
country STRING
,state STRING
,city STRING
>
,array_info ARRAY
<
STRUCT<
key STRING
,value STRING
>
>
);
I want to rename the columns inside the STRUCT and the ARRAY using an ALTER TABLE command. It's possible to follow the Google documentation available here for normal ("non-nested") columns:
ALTER TABLE sample_dataset.sample_table
RENAME COLUMN id TO str_id
But when I try to run the same command for nested columns, I get errors from BigQuery.
Running the command for a column inside a STRUCT gives me the following message:
ALTER TABLE sample_dataset.sample_table
RENAME COLUMN `struct_geo.country` TO `struct_geo.str_country`
Error: ALTER TABLE RENAME COLUMN not found: struct_geo.country.
The exact same message appears when I run the same statement, but targeting a column inside an ARRAY:
ALTER TABLE sample_dataset.sample_table
RENAME COLUMN `array_info.str_key` TO `array_info.str_key`
Error: ALTER TABLE RENAME COLUMN not found: array_info.str_key
I got stuck since the BigQuery documentation about nested columns (available here) lacks examples of ALTER TABLE statements and refers directly to the default documentation for non-nested columns.
I understand that I can rename the columns by simply creating a new table using a CREATE TABLE new_table AS SELECT ... and then passing the new column names as aliases, but this would run a query over the whole table, which I'd rather avoid since my original table weighs way over 10TB...
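For reference, the kind of rewrite I mean would look roughly like this (the new table name and the new field names are just illustrative):
-- Full-table rewrite: rebuild the structs with the new field names
CREATE TABLE sample_dataset.sample_table_renamed AS
SELECT
  id AS str_id,
  STRUCT(
    struct_geo.country AS str_country,
    struct_geo.state AS str_state,
    struct_geo.city AS str_city
  ) AS struct_geo,
  ARRAY(
    SELECT AS STRUCT info.key AS str_key, info.value AS str_value
    FROM UNNEST(array_info) AS info
  ) AS array_info
FROM sample_dataset.sample_table;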
Thanks in advance for any tips or solutions!

writing from a Spark DataFrame to BigQuery table gives BigQueryException: Provided Schema does not match

My PySpark job computes a DataFrame that I want to insert into a BigQuery table (from a Dataproc cluster).
On the BigQuery side, the partition field is REQUIRED.
On the DataFrame side, the inferred partition field is not REQUIRED, so I define a schema that marks this field as REQUIRED (non-nullable):
StructField("date_part",DateType(),False)
So I create a new DataFrame with the new schema, and when I print its schema I see, as expected:
date_part: date (nullable = false)
But my PySpark job ends like this:
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Provided Schema does not match Table xyz$20211115. Field date_part has changed mode from REQUIRED to NULLABLE
Is there something I missed?
Update: I am using the Spark 3.0 image and the spark-bigquery-latest_2.12.jar connector.
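A quick way to confirm the mode on the BigQuery side is to query the table's INFORMATION_SCHEMA (the dataset name below is a placeholder):
-- REQUIRED columns show is_nullable = 'NO'
SELECT column_name, data_type, is_nullable
FROM my_dataset.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'xyz';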

Snowflake table is not accepting null values in date field

I have one table in Snowflake into which I am performing a bulk load.
One of the columns in the table is a date, but the source table, which is on SQL Server, has null values in that date column.
The flow of data is as :
sql_server-->S3 buckets -->snowflake_table
I am able to run the Sqoop job in EMR, but not able to load the data into the Snowflake table, as it is not accepting null values in the date column.
The error is :
Date '' is not recognized File 'schema_name/table_name/file1', line 2, character 18 Row 2,
column "table_name"["column_name":5] If you would like to continue loading when an error is
encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option.
Can anyone help? Where am I going wrong?
Using the command below you can see the values in the staged file:
select t.$1, t.$2 from @mystage1 (file_format => myformat) t;
Based on the data, you can change your COPY command as below:
COPY INTO my_table(col1, col2, col3) from (select $1, $2, try_to_date($3) from @mystage1)
file_format=(type = csv FIELD_DELIMITER = '\u00EA' SKIP_HEADER = 1 NULL_IF = ('') ERROR_ON_COLUMN_COUNT_MISMATCH = false EMPTY_FIELD_AS_NULL = TRUE)
on_error='continue'
The error shows that the dates are not arriving as nulls. Rather, they're arriving as blank strings. You can address this a few different ways.
The cleanest way is to use the TRY_TO_DATE function on your COPY INTO statement for that column. This function will return database null when trying to convert a blank string into a date:
https://docs.snowflake.com/en/sql-reference/functions/try_to_date.html#try-to-date
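A quick way to see the behaviour in a worksheet (the second value is just an example date):
-- TRY_TO_DATE returns NULL instead of erroring when the value cannot be converted
select try_to_date('') as empty_string_becomes_null,
       try_to_date('2021-11-15') as valid_date;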

Rename Column Name in Athena AWS

I have tried several ways to rename some column names in an Athena table, after reading the following article:
https://docs.aws.amazon.com/athena/latest/ug/alter-table-replace-columns.html
But I have had no luck with it.
I tried
ALTER TABLE "users_data"."values_portions" REPLACE COLUMNS ('username/teradata' 'String', 'username_teradata' 'String')
Got error
no viable alternative at input 'alter table "users_data"."values_portions" replace' (service: amazonathena; status code: 400; error code: invalidrequestexception; request id: 23232ssdds.....; proxy: null)
You can refer to this document, which talks about renaming columns. The query that you are trying to run will replace all the columns in the existing table with the provided column list.
One strategy for renaming columns is to create a new table based on the same underlying data, but using the new column names. The example mentioned in the link creates a new table called orders_parquet_column_renamed from an orders_parquet table, changes the column name o_totalprice to o_total_price, and then runs a query in Athena.
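A minimal sketch of that approach for the table in the question (the bucket path, storage format, and the rest of the column list are placeholders to fill in):
-- Copies the data into a new table with the column renamed;
-- list the remaining columns of values_portions in the SELECT as well.
CREATE TABLE users_data.values_portions_renamed
WITH (format = 'PARQUET', external_location = 's3://your-bucket/values_portions_renamed/') AS
SELECT "username/teradata" AS username_teradata
FROM users_data.values_portions;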
Another way of changing the column name is by simply going to AWS Glue -> select the database -> select the table -> edit schema -> double-click the column name -> type in the new name -> save.