GROUP BY incompatible with sql_mode on Google Cloud SQL - google-cloud-platform

I ran an experiment creating multiple queries on Google Cloud SQL, as shown in the picture, but the result is
ERROR 1055 (42000): Expression #5 of SELECT list is not in GROUP BY
clause and contains nonaggregated column 'ipol.sales.name' which is
not functionally dependent on columns in GROUP BY clause; this is
incompatible with sql_mode=only_full_group_by

In the Cloud SQL instance, go to the Edit option and add the database flag sql_mode = TRADITIONAL.

You cannot SET GLOBAL unless you're logged in as the root (or equivalent) user.
Also, the GROUP BY clause is invalid; fix the statement before running it
... whatever Expression #5 may be (just count the expressions in the SELECT list).
This comes from sql_mode=only_full_group_by.
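For illustration, a minimal sketch of the two usual query-side fixes, assuming a sales table with a name column (as in the error message) and hypothetical customer_id/amount columns:
-- Fix 1: add the non-aggregated column to the GROUP BY clause
SELECT customer_id, name, SUM(amount)
FROM sales
GROUP BY customer_id, name;
-- Fix 2: wrap it in ANY_VALUE() so MySQL accepts an arbitrary value per group
SELECT customer_id, ANY_VALUE(name), SUM(amount)
FROM sales
GROUP BY customer_id;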

If you run this query in your database:
SELECT VERSION(), @@sql_mode;
you will get something like:
version(): 8.0.26-google
@@sql_mode: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
The ONLY_FULL_GROUP_BY value in @@sql_mode causes the problem.
Edit your Cloud SQL instance via the GCP console, add the sql_mode flag, select all of the values currently in @@sql_mode except ONLY_FULL_GROUP_BY, and save.
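If you only need to relax the mode for the current connection (for example, before the instance flag change takes effect), a session-level override is a common workaround; a sketch, assuming your user is allowed to change session variables:
-- Strip ONLY_FULL_GROUP_BY from this session's sql_mode only;
-- the instance-level setting is left untouched.
SET SESSION sql_mode = (SELECT REPLACE(@@sql_mode, 'ONLY_FULL_GROUP_BY', ''));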

Related

Apache Superset [v2.0.0]: Row Level Security - Not working in SQL Lab

How do we get ROW LEVEL SECURITY working in Superset [v2.0.0]?
I have tried many things, but I am now stuck on getting it to work within SQL Lab. See bug: https://github.com/apache/superset/issues/20774
If I set RLS_IN_SQLLAB = True in superset/config.py and run a query SELECT * FROM FLIGHTS, it tries to apply the WHERE clause "AIRLINE" = 'AS' but still fails because it prefixes it as public.flights.AIRLINE, which is not a valid column name.
SQL Lab in Apache Superset does not support row-level security. SQL Lab is meant for superadmins who have access to all data.

How to fetch the latest schema change in BigQuery and restore a deleted column within 7 days

Right now I fetch the columns and data types of BQ tables via the command below:
SELECT COLUMN_NAME, DATA_TYPE
FROM `Dataset`.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name="User"
But if I drop a column using the command ALTER TABLE User DROP COLUMN blabla,
the column blabla is not actually deleted for 7 days (the time travel window), per the official documentation.
After running that command, the column is still there in the schema as well as in the table Dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS.
It is just that I cannot insert data into the column or view it in the GCP console. This inconsistency really causes an issue
when I want to write a bash script to monitor schema changes and act on them.
I need more visibility into the table schema of BigQuery. The least I need is for
Dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS to store a flag column that indicates deleted or TTL: 7 days.
My questions are:
How can I fetch the correct schema in Spanner that reflects the recently deleted column?
If the column is not actually deleted, is there any way to easily restore it?
If you want to find the recently deleted column, you can try searching through Cloud Logging. I'm not sure what tools Spanner supports, but if you want to use Bash you can use gcloud to fetch the logs, though it will be difficult to parse the output and extract the information you want.
The command below fetches the logs for google.cloud.bigquery.v2.JobService.InsertJob, since an ALTER TABLE is treated as an InsertJob, and filters on the actual query text containing drop. The regex I used is not strict (for the sake of example); I suggest making it stricter.
gcloud logging read 'protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob" AND protoPayload.metadata.jobChange.job.jobConfig.queryConfig.query=~"Alter table .*drop.*"'
A sample snippet from the command above shows the column PADDING being dropped in the query.
If you have options other than Bash, I suggest you create a BQ sink for your logging; you can then run queries there to get this information. You can also use client libraries (Python, NodeJS, etc.) to either query the sink or query Cloud Logging directly.
As per this SO answer, you can use the time travel feature of BQ to query the deleted column. That answer also explains BQ's behavior of retaining the deleted column for 7 days and a workaround to delete the column instantly. See the actual query used to retrieve the deleted column, and the workaround for deleting a column, at the previously provided link.
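For illustration, a sketch of reading and restoring the pre-drop state with BigQuery time travel, using the User table and blabla column from the question (the interval and the restored table name are assumptions; pick a point in time before the drop):
-- Read the table as it existed before the column was dropped
-- (only possible within the ~7-day time travel window)
SELECT blabla
FROM `Dataset.User`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);

-- One way to restore: write the historical snapshot back to a table
CREATE OR REPLACE TABLE `Dataset.User_restored` AS
SELECT *
FROM `Dataset.User`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);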

GCP Cloud SQL denies permission for pre-aggregation

I am trying to use pre-aggregations over Cloud SQL on Google Cloud Platform, but the database denies access with the error Statement violates GTID consistency.
Any help is appreciated.
Cube.js performs pre-aggregation with CREATE TABLE ... SELECT, but you are using MySQL on Google Cloud SQL with --enforce-gtid-consistency (which has limitations).
Since only transactionally safe statements can be logged, CREATE TABLE ... SELECT (and some other statements) cannot be used, because that statement is actually logged as two separate events.
There are two ways to solve this issue:
1. Use pre-aggregations to an external database (recommended).
https://cube.dev/docs/pre-aggregations/#read-only-data-source-pre-aggregations
2. Use the undocumented flag loadPreAggregationWithoutMetaLock.
Attention: this flag is experimental and can be removed or changed in the future.
Take a look at the source code.
You can pass it directly in the driver constructor. This will produce two SQL statements to work around the limitation:
CREATE TABLE
INSERT INTO
Thanks
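For illustration, a sketch of the two-statement approach above (table and column names are hypothetical; exact behavior depends on your MySQL version):
-- Typically rejected under --enforce-gtid-consistency: CREATE TABLE ... SELECT
-- mixes DDL and DML in one statement logged as two events
CREATE TABLE pre_agg_orders AS
SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id;

-- Accepted: the same work split into separate, transactionally safe statements
CREATE TABLE pre_agg_orders (customer_id INT, total DECIMAL(18,2));
INSERT INTO pre_agg_orders
SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id;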

Auto Create statistics in Azure SQL DW

In Azure SQL Data Warehouse I used the T-SQL below to enable automatic statistics creation. The command ran successfully, but when I checked the database properties under the Options tab, Auto Create Statistics was still set to False.
ALTER DATABASE MyDB SET AUTO_CREATE_STATISTICS ON;
Please let me know if I'm missing something here. I also have db_owner access on the database.
I'm guessing that you are using SQL Server Management Studio.
I was able to reproduce the symptom by turning auto_create_statistics off and on.
The issue appears to be that the database metadata is cached in SSMS. Right-click the database name and select "Refresh" before selecting "Properties". Using this method I got the correct setting for auto_create_statistics showing up each time.
My tests were done using SSMS 17.7
(The need to refresh the database metadata can also occur when adding or removing tables, columns, etc.)
You can also query sys.databases and check the is_auto_create_stats_on column.
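For illustration, a quick check straight from the engine that bypasses any cached SSMS properties (the database name follows the question):
-- 1 means AUTO_CREATE_STATISTICS is on for the database
SELECT name, is_auto_create_stats_on
FROM sys.databases
WHERE name = 'MyDB';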

Use SQL expression as calculated field in Amazon QuickSight

I am trying to integrate a simple SQL expression in Amazon QuickSight, but every time I use the calculated field I get an error stating that the methods used are not valid.
Amazon QuickSight does not let me use aggregate functions:
ROUND((SUM(CASE WHEN dyn_boolean THEN 1 ELSE 0 END) * 100.0) / COUNT(session_unique), 2)
I know that I can change the CASE into an ifelse, but that does not solve the entire problem.
If you want the full power of SQL while preparing data, use the custom SQL option when creating the data set:
'New Data set' -> 'FROM EXISTING DATA SOURCES' -> 'Create Data Set' -> 'Edit and Preview Data' -> 'Switch to custom SQL tool'
You can write custom SQL of your choice.
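For illustration, a sketch of moving the aggregate from the question into the data set's custom SQL so QuickSight only sees a pre-computed column (the sessions table and session_date column are hypothetical; dyn_boolean and session_unique follow the question):
-- Pre-compute the percentage per day in SQL instead of a QuickSight calculated field
SELECT
  session_date,
  ROUND((SUM(CASE WHEN dyn_boolean THEN 1 ELSE 0 END) * 100.0)
        / COUNT(session_unique), 2) AS dyn_boolean_pct
FROM sessions
GROUP BY session_date;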