Use SQL expression as calculated field in Amazon QuickSight

I am trying to integrate a simple SQL expression in Amazon QuickSight, but every time I use the calculated field I get an error stating that the methods used are not valid.
Amazon QuickSight does not let me use aggregate functions:
ROUND((SUM(CASE WHEN dyn_boolean THEN 1 ELSE 0 END) * 100.0) / COUNT(session_unique), 2)
I know that I can change the CASE into an ifelse, but that does not solve the entire problem.
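For reference, a sketch of that rewrite in QuickSight's calculated-field syntax (round, sum, count, and ifelse are QuickSight functions; whether the aggregation is accepted depends on where the calculated field is defined):
round((sum(ifelse(dyn_boolean, 1, 0)) * 100.0) / count(session_unique), 2)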

If you want the full power of SQL while preparing data, use the custom SQL option when creating the data set:
'New Data set' -> 'FROM EXISTING DATA SOURCES' -> 'Create Data Set' -> 'Edit and Preview Data' -> 'Switch to custom SQL tool'
You can write custom SQL of your choice.
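For example, a minimal sketch of the question's expression as custom SQL (your_sessions_table is a placeholder for the actual source table):
SELECT
    ROUND((SUM(CASE WHEN dyn_boolean THEN 1 ELSE 0 END) * 100.0) / COUNT(session_unique), 2) AS boolean_pct
FROM your_sessions_table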

Related

How to fetch the latest schema change in BigQuery and restore deleted column within 7 days

Right now I fetch the columns and data types of BQ tables via the command below:
SELECT COLUMN_NAME, DATA_TYPE
FROM `Dataset`.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name="User"
But if I drop a column using the command ALTER TABLE User DROP COLUMN blabla,
the column blabla is not actually deleted within 7 days (TTL), according to the official documentation.
If I then run the query above, the column is still there in the schema, as well as in the table Dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS.
It is just that I cannot insert data into such a column or view it in the GCP console. This inconsistency really causes an issue.
I want to write a bash script to monitor schema changes and perform operations based on them,
so I need more visibility into the table schema of BigQuery. The least thing I need is:
a flag column in Dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS that indicates a column is deleted or within the 7-day TTL.
My questions are:
How can I fetch the correct schema in Spanner which reflects the recently deleted column?
If the column is not actually deleted, is there any way to easily restore it?
If you want to fetch the recently deleted column, you can try searching through Cloud Logging. I'm not sure what tools Spanner supports, but if you want to use Bash you can use gcloud to fetch logs, though it will be difficult to parse the output and get the information you want.
The command below fetches the logs for google.cloud.bigquery.v2.JobService.InsertJob (since an ALTER TABLE is considered an InsertJob) and filters them based on the actual query text where it says drop. The regex I used is not strict (for the sake of example); I suggest updating it to be stricter.
gcloud logging read 'protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob" AND protoPayload.metadata.jobChange.job.jobConfig.queryConfig.query=~"Alter table .*drop.*"'
In the sample output from the command above, the captured query showed that column PADDING was the one dropped.
If you have options other than Bash, I suggest creating a BigQuery sink for your logs; you can then run queries there to get this information. You can also use client libraries (Python, NodeJS, etc.) to either query the sink or query Cloud Logging directly.
As per this SO answer, you can use BigQuery's time travel feature to query the deleted column. That answer also explains BigQuery's behavior of retaining a deleted column for 7 days, and a workaround to delete the column instantly. See the linked answer for the actual query used to retrieve the deleted column and for the workaround.
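A minimal sketch of such a time-travel query, reusing the table and column names from the question (the one-hour offset is arbitrary):
SELECT blabla
FROM `Dataset.User`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)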

Define AWS database to use in Custom SQL?

I am creating a dataset in AWS QuickSight using custom SQL which I prepare/test in Athena. However, unless I qualify each join/table as "databasename".table, the QuickSight custom SQL fails. I have tried the below, but it failed. Is it possible to instruct the query to run against a specific DB at the beginning of the query?
USING AwsDataCatalog."databasename"
In the data preparation, on the custom SQL page, in the left pane, you should be able to choose the database name (schema).
If you do not set that, it will use Athena's default schema, so you have to fully qualify all table names.
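For example, a sketch of a fully qualified reference (my_table is a hypothetical table name):
SELECT * FROM AwsDataCatalog."databasename"."my_table"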

Fetch Schedule data from a BigQuery Table to another BigQuery Table (Scheduled queries)

I am really new to GCP and I am trying to run a query in GCP BigQuery to fetch all data from one BigQuery table and insert it all into another BigQuery table.
I am trying the following query, where Project1.DataSet1.Table1 is where I am reading the data from, and Project2.Dataset2.Table2 is the table where I am trying to insert all the data, with the same naming:
SELECT * FROM `Project1.DataSet1.Table1` LIMIT 1000
insert INTO `Project2.Dataset2.Table2`
But I am receiving a query error message.
Does anyone know how to solve this issue?
There may be a couple of comments:
The syntax should be different => insert into ... select ... and so on; see DML statements in standard SQL.
Such an approach to copying data might not be very optimal considering time and cost. It might be better to use bq cp -f ... commands (see BigQuery Copy — How to copy data efficiently between BigQuery environments and the bq command-line tool reference), if that is possible in your case.
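A sketch of such a copy, reusing the project/dataset/table names from the question (-f overwrites the destination table without prompting):
bq cp -f Project1:DataSet1.Table1 Project2:Dataset2.Table2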
The correct syntax of the query is as suggested by @al-dann. I will try to explain further with a sample query below:
Query:
insert into `Project2.Dataset2.Table2`
select * from `Project1.DataSet1.Table1`
This will insert all rows from the first table into the second table.

Snowflake & PowerBI "Native queries aren't supported by this value"

Quick Note, I have reviewed these threads and they do not fix my issue:
(Outdated info; see documentation below) Access Snowflake query results through PowerBI
(I would expect this to fix my issue, but it does not) How to write a Snowflake SELECT statement query in Advance Editor from powerBi
When attempting to query Snowflake with a native query, I get the error "Native queries aren't supported by this value".
I have verified that the credentials / tables / databases / schemas are correct by connecting directly to one table at a time, but simple and complex queries alike return this message.
I know this is a new feature (June 2021) and I have read the documentation here: https://learn.microsoft.com/en-us/power-query/connectors/snowflake#connect-using-advanced-options
EDIT:
I have tried the following query formats:
SELECT * FROM "MyDatabase".PUBLIC.ITEMSTABLE
SELECT * FROM "MyDatabase"."PUBLIC".ITEMSTABLE
SELECT * FROM "MyDatabase"."PUBLIC"."ITEMSTABLE"
I believe that this may be due to MyDatabase being case-sensitive and Power BI stripping the quotes around it in the query.
In Snowflake, this query succeeds, while the same query in PowerBI fails:
SELECT * FROM "MyDatabase".PUBLIC.ITEMSTABLE
Issue opened with Microsoft here:
https://community.powerbi.com/t5/Issues/Unable-to-query-case-sensitive-Snowflake-tables/idc-p/2030983
Any help is appreciated.
Most likely the query provided in the message box is terminated with a semicolon; it should be removed from the source query.
The actual query sent to Snowflake is wrapped in an outer query, so any input that makes the full wrapped query invalid will error out.
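As an illustration (the exact wrapper text is an assumption based on the behavior described), a source query ending in a semicolon becomes invalid once wrapped:
-- what you type in the native query box:
SELECT * FROM "MyDatabase".PUBLIC.ITEMSTABLE;
-- roughly what is sent to Snowflake (invalid because of the inner semicolon):
SELECT * FROM (SELECT * FROM "MyDatabase".PUBLIC.ITEMSTABLE;)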
I had a similar issue with a native query written using Dataverse as the data source. The Power BI refresh was successful in Power BI Desktop, but the refresh was failing on the Power BI service. It was fixed for me when I appended the [EnableFolding=false] option to the native query call. Hope this will help someone.
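A sketch of where that option goes in M for the Snowflake case (the connection details are placeholders; Value.NativeQuery accepts an options record as its fourth argument):
let
    // placeholder account host and warehouse
    Source = Snowflake.Databases("myaccount.snowflakecomputing.com", "MY_WAREHOUSE"),
    Database = Source{[Name = "MyDatabase"]}[Data],
    // EnableFolding = false stops Power Query from folding further steps on top of the native query
    Result = Value.NativeQuery(Database, "SELECT * FROM ""MyDatabase"".PUBLIC.ITEMSTABLE", null, [EnableFolding = false])
in
    Result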
It has been confirmed by a Microsoft ticket that my issue was that I had a case-sensitive database name. The solution from MS was to... not have a case-sensitive DB name.

Programmatically change dataset SQL statement in Power BI

Is it possible to change the SQL statement of a dataset via an API call?
My scenario: I have data in multiple tables in SQL Server. I have created a SQL query with joins to fetch the required data. I created a SQL Server dataset by providing that query in the SQL Statement section and published it to the Power BI workspace. Now, I want to modify that SQL statement programmatically.
I want to import this same .pbix file to create different datasets. The idea is to use the import dataset API to import this dummy dataset and then programmatically change the DB source and the SQL statement, to customize it for my different report needs.
Any pointer or help is much appreciated.
For the server name and database name, you can simply use parameters. Click the button to the left of the field to do this. You can make some changes in the query using parameters too, but this isn't very flexible. It can be done by defining a text parameter and using it in the M statement associated with the dataset's Source step, as sketched below. For more information you may see this article:
https://www.red-gate.com/simple-talk/sql/bi/power-bi-introduction-working-with-parameters-in-power-bi-desktop-part-4/
Then you can use the REST API to modify parameter values and refresh your datasets. You will need the Update Parameters In Group and Refresh Dataset In Group API calls.
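A sketch of such a parameterized Source step in M (ServerName, DatabaseName, and SqlStatement are hypothetical text parameters defined in the report):
let
    // all three identifiers below are report parameters, not literals
    Source = Sql.Database(ServerName, DatabaseName, [Query = SqlStatement])
in
    Source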
As of this writing, this is not supported by the Power BI REST API.
Possible workaround: Given you're using SQL Server, I'd suggest you create a VIEW in SQL Server with the statement you defined in your Power BI report, and change your report to point to that view instead.
Then, to modify the SQL statement, you just have to ALTER the view in the database.
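A minimal sketch (the view name, tables, and columns are hypothetical):
-- initial view wrapping the query from the report
CREATE VIEW dbo.ReportData AS
SELECT t1.col_a, t2.col_b
FROM dbo.Table1 AS t1
JOIN dbo.Table2 AS t2 ON t2.id = t1.id;
-- later: change what the report reads without touching the .pbix
ALTER VIEW dbo.ReportData AS
SELECT t1.col_a, t2.col_b, t2.col_c
FROM dbo.Table1 AS t1
JOIN dbo.Table2 AS t2 ON t2.id = t1.id;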