I'm unable to run queries in SQL Lab. As soon as I run a query, the Results section shows {e} and the query history says it failed. Attaching a screenshot.
I resolved this one by looking through the browser dev tools for the exact exception.
It has to do with signal alarms. Since Superset is not supported on Windows, you will need to swallow the exceptions raised in the __enter__ and __exit__ methods of the timeout class in core.py. The file should be located under something like C:\python\venv\Lib\site-packages\superset\utils.
Comment out the try block in each of the two methods below and add pass:
def __enter__(self):
def __exit__(self, type, value, traceback):
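For reference, here is a rough sketch of what the patched context manager can end up looking like. The exact bodies differ between Superset versions, and the real handle_timeout raises a Superset-specific exception (simplified here); the only point is that the SIGALRM calls, which fail on Windows, are swallowed:

import signal

class timeout:
    def __init__(self, seconds=1, error_message="Timeout"):
        self.seconds = seconds
        self.error_message = error_message

    def handle_timeout(self, signum, frame):
        # Simplified; Superset raises its own timeout exception here
        raise Exception(self.error_message)

    def __enter__(self):
        try:
            # signal.SIGALRM does not exist on Windows, so this raises
            signal.signal(signal.SIGALRM, self.handle_timeout)
            signal.alarm(self.seconds)
        except Exception:
            pass  # workaround: run without a query timeout on Windows

    def __exit__(self, type, value, traceback):
        try:
            signal.alarm(0)
        except Exception:
            pass  # workaround: nothing to cancel on Windows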
I've been asked to implement a way to load data into my datasets once a month. As Power BI Service doesn't have this option, I had to find a solution using Power Query, and below I describe my solution step by step.
If it helps you in some way, please let me know by posting a comment below. If you have a better and/or more elegant solution, I'm glad to hear from you.
So, as my first solution didn't work, here I'll post the definitive solution that we (me and my colleagues) found.
I have to say that this solution is not so simple, as it uses a Linux server, Gitlab and Jenkins, so it requires a relatively complex environment, and I'll not describe how to build it.
At the end, I'll suggest a simpler solution.
THE ENVIRONMENT
At my company we use Jenkins to schedule jobs, Gitlab to store source code, and a Linux server to execute small tasks using shell scripts. For this problem I used all three services, plus the Power BI API.
JENKINS
I use Jenkins to schedule a job that runs monthly. This job was created using the following configs:
Parameters: I created 2 parameters (workspace_id and dataset_id) so I can test the script in any environment (Power BI workspace) by just changing the value of those parameters;
Schedule Job: this job is scheduled to run on day 1 of every month at 02:00 a.m. As Jenkins uses the same syntax as cron (I think it is just an intermediary between you and cron), the value of this field is 0 2 1 * *.
Build: since we have a remote Linux server to execute the scripts, I used an "Execute shell script on remote host using ssh" build step. I don't know why, but on Jenkins you cannot execute the curl command directly in the job (it just didn't work), so I had to split the solution between Jenkins and the Linux server. Under "SSH site" you have to select the credentials (previously created by my team), and under "Command" are the commands below:
#Navigate to the script shell directory
cd "script-shell-script/"
# pulls the last version of the script. If you aren't using Gitlab,
# remove this command
git pull
# every time git pulls a new file version, it has read access.
# This command allows the execution of the file
chmod +x powerbi_refresh_dataset.sh
# make a call to the file passing as parameter the workspace id and dataset id
./powerbi_refresh_dataset.sh $ID_WORKSPACE $ID_DATASET
SHELL SCRIPT
As you may already imagine, the core of the solution is the content of powerbi_refresh_dataset.sh. But before going there, you must understand how the Power BI API works, and you have to configure your Power BI environment so that API calls work. So please make sure you already have your service principal properly configured by following this tutorial: https://learn.microsoft.com/en-us/power-bi/developer/embedded/embed-service-principal
Once you have your object_id, client_id and client_secret, you can create your shell script file. Below is the code of my .sh file.
# load OBJECT_ID, CLIENT_ID and CLIENT_SECRET as environment variables
source credential_file.sh
# This command retrieves a new token from Microsoft Credentials Manager
token_msg=$(curl -X POST "https://login.windows.net/$OBJECT_ID/oauth2/token" \
-H 'Content-Type: application/x-www-form-urlencoded' \
-H 'Accept: application/json' \
-d 'grant_type=client_credentials&resource=https://analysis.windows.net/powerbi/api&client_id='$CLIENT_ID'&client_secret='$CLIENT_SECRET
)
# Extract the token from the response message
token=$(echo "$token_msg" | jq -r '.access_token')
# Ask Power BI to refresh dataset
refresh_msg=$(curl -X POST 'https://api.powerbi.com/v1.0/myorg/groups/'$1'/datasets/'$2'/refreshes' \
-H 'Authorization: Bearer '$token \
-H 'Content-Type: application/json' \
-d '{"notifyOption": "NoNotification"}')
And here goes some explanation. The first command is source credential_file.sh, which loads 3 variables (OBJECT_ID, CLIENT_ID and CLIENT_SECRET). The intention here is to separate confidential info from the script, so I can store the main script file in version control (Git) without disclosing any sensitive information. So, besides the powerbi_refresh_dataset.sh file, you must have credential_file.sh in the same directory with the following content:
OBJECT_ID=OBJECT_ID_VALUE
CLIENT_ID=CLIENT_ID_VALUE
CLIENT_SECRET=CLIENT_SECRET_VALUE
It's important to say that if you are using Git or any other version control, only the powerbi_refresh_dataset.sh file goes into version control; the credential_file.sh file must remain only on your Linux server. I suggest you save its content in a password store application like KeePass, as the CLIENT_SECRET cannot be retrieved later.
FINAL CONSIDERATIONS
So, the above is the most relevant info about my solution. As you can see, I'm intentionally omitting how to build the environment and make the pieces talk to each other (Jenkins with Linux, Jenkins with Git, and so on).
If all you have is a Linux or Windows host, I suggest the following:
Linux Host
In this simpler environment, just create powerbi_refresh_dataset.sh and credential_file.sh, place them in any directory, and create a cron task to call powerbi_refresh_dataset.sh as often as you wish; an example crontab entry is shown below.
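For example, a crontab entry mirroring the monthly 02:00 schedule used in Jenkins could look like this (the path, workspace id and dataset id are placeholders):
0 2 1 * * /home/user/script-shell-script/powerbi_refresh_dataset.sh WORKSPACE_ID DATASET_ID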
Windows Host
On Windows you can do almost the same as on Linux, but you'll have to replace the shell script with equivalent PowerShell commands (google it) and use Task Scheduler to regularly execute your PowerShell file.
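If you'd rather not write PowerShell, here is a rough, untested Python sketch of the same two API calls made by the shell script above (it assumes the third-party requests package; the credential values are placeholders, just like in credential_file.sh). It could be scheduled with Task Scheduler on Windows or cron on Linux:

import sys

import requests

# Same values that credential_file.sh loads (placeholders here)
OBJECT_ID = "OBJECT_ID_VALUE"
CLIENT_ID = "CLIENT_ID_VALUE"
CLIENT_SECRET = "CLIENT_SECRET_VALUE"

# Workspace id and dataset id passed on the command line
workspace_id, dataset_id = sys.argv[1], sys.argv[2]

# Retrieve a token from Microsoft (same call as the first curl above)
token_resp = requests.post(
    "https://login.windows.net/%s/oauth2/token" % OBJECT_ID,
    data={
        "grant_type": "client_credentials",
        "resource": "https://analysis.windows.net/powerbi/api",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    },
)
token = token_resp.json()["access_token"]

# Ask Power BI to refresh the dataset (same call as the second curl above)
requests.post(
    "https://api.powerbi.com/v1.0/myorg/groups/%s/datasets/%s/refreshes"
    % (workspace_id, dataset_id),
    headers={"Authorization": "Bearer " + token},
    json={"notifyOption": "NoNotification"},
)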
Well, I think this will help you. I know it's not a complete answer, as it only works if you have a similar environment, but I hope the final tips help you.
Best regards
The Solution
First let me summarize the solution. I put a conditional execution at the end of each query that checks whether today is the day when new data must be uploaded. If yes, it returns the step to be executed; if not, it raises an error.
There are many ways to implement that, and I'll go from the simplest form to the most complex one.
Simplest version: checking directly in the query whether it's the day to load new data
This is the simplest way to implement the solution, but, depending on your dataset, it may not be the smartest one.
Let's say you have this foo query:
let
step1 = ...,
...,
...,
step10 = SomeFunction(Somevariable, someparameter)
in
step10
Now let's pretend you want that query to load new data only on the 1st day of the month. To do that, you just insert a conditional statement in the in clause.
let
step1 = ...,
...,
...,
step10 = SomeFunction(Somevariable, someparameter)
in
if Date.Day(DateTime.LocalNow()) = 1 then step10 else error "Today is not the day to load data"
In this example I just replaced the step10 returned by the query with this piece of code: if Date.Day(DateTime.LocalNow()) = 1 then step10 else error "Today is not the day to load data". By doing that, step10 will be the result of this query only if the query is executed on the 1st day of the month; otherwise, it will return an error.
And here some explanation is worthwhile. Power Query is not a scripting language that runs in the same order in which it's declared. So the fact that the conditional statement was placed at the end of the query doesn't mean that all the code above it will be executed before the error is raised. As Power Query only executes what's necessary, the if... statement will probably be the first one to be evaluated. For more info about how Power Query works behind the scenes, I strongly recommend this reading: https://bengribaudo.com/blog/2018/02/28/4391/power-query-m-primer-part5-paradigm
Using a function
Now let's move forward. Let's say that your dataset has not only one but many queries, and all of them need to be executed only once a month. In this case, a smart way to do that is by using what all other programming languages have for reusing blocks of code: create a function!
For this, create a new Blank Query and paste this code into its body:
(step) =>
let
result = if Date.Day(DateTime.LocalNow()) = 1 then step else error "Today is not the day to load data"
in
result
Now, in each query you'll call this function, passing the last step as a parameter. The function will check which day it is and return the same step passed as a parameter if it's the day to load the data. Otherwise, it will return the error.
Below is the code of our query using our function, called check_if_upload:
let
step1 = ...,
...,
...,
step10 = SomeFunction(Somevariable, someparameter),
step11 = check_if_upload(step10)
in
step11
Using parameters
One final tip. As your query raises an error if today is not the upload day, it means you can only test your ETL once a month, right? The error also keeps you from applying your Power Query changes, which means that if you can't apply the modifications you can't publish the new Power Query version (with this implementation) to Power BI Service.
Well, you could change the day value inside the function every time, but that is, let's say, a little clumsy.
A more elegant way to change this value is by using parameters. So, let's do it. Create a parameter (I'll call it Upload Day) of type number. Now, all you have to do is use this parameter in your function. It will look like this:
(step) =>
let
result = if Date.Day(DateTime.LocalNow()) = #"Upload Day" then step else error "Today is not the day to load data"
in
result
That's it. Now you can change the upload day directly in Power BI Service by just changing this parameter on the dataset (click the dataset name and go to Settings >> Parameters).
Hope you nailed it and that it's helpful for you.
Best regards.
I am attempting to clone a database. I was previously able to clone it in the console, but now I want to create a small script to automate this, and it fails with the following error message:
(gcloud.sql.instances.clone) [ERROR_RDBMS] unable to update the following flags: cloudsql.enable_password_validation
If I attempt to clone it in the console, I get the same error shown above.
I looked up the documentation and enable_password_validation does not seem to be in the list of supported flags, which would explain why it can't update it.
If I run gcloud sql instances describe my-instance, I don't see the flag in question.
But running on the source instance:
SELECT * FROM pg_settings
yields this row in particular:
name: cloudsql.enable_password_validation
setting: off
unit: NULL
category: Customized Options
short_desc: Sets whether to enable Cloud SQL password validation.
extra_desc: NULL
context: superuser
vartype: bool
source: configuration file
min_val: NULL
max_val: NULL
enumvals: NULL
boot_val: on
reset_val: off
sourcefile: /pgsql/data/postgresql.auto.conf
sourceline: 3
pending_restart: False
Any advice on how to solve this?
There is currently an ongoing issue with password validation in Cloud SQL Postgres instances. The issue involves the exact flag that is giving you problems, cloudsql.enable_password_validation:
Diagnosis: Affected postgres instances from a recent release have the following flag set and are unable to remove or disable this flag: cloudsql.enable_password_validation=on. This flag does not appear in Cloud Console, and attempting to disable flag via gcloud returns error where the flag is not recognized or supported. Password validation occurs on every new client connection but is limited to 50 QPS, and thus higher rates will return errors.
When did this issue start occurring? Have you also attempted to clone the database since then? I ask because the issue has received several updates. If you continue experiencing problems, you could open a support case with GCP, as the status page recommends.
EDIT (2/24/2022)
I wanted to update this answer. The issue seems to be resolved as shown in the status page of Google Cloud:
The issue with Cloud SQL has been resolved for all affected instances as of Tuesday, 2022-02-22 14:30 US/Pacific. We thank you for your patience while we worked on resolving the issue.
If you still see this error, you can update the question confirming that it was not resolved as part of the outage resolution.
I have to check, using a SQL query, whether a workflow completed within its scheduled time or not, and also send an email with the workflow status, like 'completed within time' or 'not completed within time'. So please help me out.
You can do it using either option 1 or option 2.
Option 1: You need access to the repository metadata database.
Create a post-session shell script. You can pass the workflow name and a benchmark value to the shell script.
Get the workflow run time from the repository metadata.
SQL you can use -
SELECT WORKFLOW_NAME,(END_TIME-START_TIME)*24*60*60 diff_seconds
FROM
REP_WFLOW_RUN
WHERE WORKFLOW_NAME='myWorkflow'
You can then compare the above value with the benchmark value. The shell script can send a mail depending on the outcome; a rough sketch is shown below.
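For illustration, here is a minimal sketch of that comparison and notification step, written in Python here rather than shell (the SMTP host, addresses and argument handling are placeholder assumptions; the runtime in seconds would come from the SQL above):

import smtplib
import sys
from email.mime.text import MIMEText

# Arguments: workflow name, runtime in seconds (from REP_WFLOW_RUN), benchmark in seconds
workflow = sys.argv[1]
diff_seconds = int(sys.argv[2])
benchmark = int(sys.argv[3])

status = "completed within time" if diff_seconds <= benchmark else "not completed within time"

# Build and send the status mail (placeholder addresses and SMTP host)
msg = MIMEText("Workflow %s ran for %d seconds (benchmark %d seconds)." % (workflow, diff_seconds, benchmark))
msg["Subject"] = "Workflow %s: %s" % (workflow, status)
msg["From"] = "etl-monitor@example.com"
msg["To"] = "team@example.com"

with smtplib.SMTP("localhost") as smtp:
    smtp.send_message(msg)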
Option 2: You need to create another workflow to check this workflow.
If you do not have access to the metadata, follow the steps above except for the metadata SQL.
Use pmcmd GetWorkflowDetails to check status, start and end time for a workflow.
pmcmd GetWorkflowDetails -sv service -d domain -f folder myWorkflow
You can then grep the start and end times from the output and compare them with your benchmark values. The catch is the output format, so you need a little bit of scripting here.
A python 2.7 script, executed by Task Scheduler running on Windows Server 2012 (64 bit), ends without raising an exception at the point where it is opening an existing XLS file. Creating the Dispatch works fine (try/except not shown):
xlApp = win32com.client.DispatchEx('Excel.Application')
but right after that:
try:
    log_message("Opening Excel Workbook object for the attachment using password '%s'" % email_found['PASSWORD'])
    workbook = xlApp.Workbooks.Open(attachment, False, False, None, email_found['PASSWORD'])
    log_message("Workbook opened, produced object with type '%s'" % type(workbook).__name__)
except Exception, e:
    log_message("Exception opening workbook")
    message = "Exception raised : %s" % str(e)
    log_message(message, 'ERROR')
    xlApp.Quit()
    return 22
log_message() writes to a log file and optionally sends an email. The first message appears, and that's the end of the log file. Excel shows up as running in Task Manager, and the task shows as running in Task Scheduler.
The very same script, when run in a command shell by the same user, completes successfully. The very same script, when run in Task Scheduler on a Win7 box, completes successfully.
Other information:
The user whose account runs the task is a local administrator. I have tried two such users, and the same thing happens for both.
There is only the one instance of Excel (2010) installed on the box.
The file whose full path is in 'attachment' definitely exists, and can be opened interactively by Excel.
The string stored at email_found['PASSWORD'] contains the correct password for the XLS file.
I found some posts that mention the necessity of having one or the other (or both) of these directories:
C:\Windows\SysWOW64\config\systemprofile\Desktop
C:\Windows\System32\config\systemprofile\Desktop
Both are present for me (they're empty).
I'm running this script on Windows, rather than using xlrd on our preferred Linux platform, because xlrd does not support password protected XLS files. The XLS file is sent to us every week day by a client.
Any suggestions are most welcome, and thanks in advance.
I had a similar problem. But when I configured the task in Task Scheduler, I changed the setting from "Run whether user is logged on or not" to "Run only when user is logged on", and it worked! My script went to the OneDrive folder and read and extracted the data from the selected Excel file.
The problem:
My C++ application connects to a MySQL server, reads the first/header line of each db export.txt, builds a create table statement to prepare for the import, and executes it against the database (no problem with that; the table appears just as intended) -- but when I try to execute the LOAD DATA LOCAL INFILE to import the data into the newly created table, I get the error "The used command is not allowed with this MySQL version". But this works on the CLI! When I execute the command on the CLI using mysql -u <user> -p<password> -e "LOAD DATA LOCAL INFILE 'myfile.txt' INTO TABLE mytable FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';" it works flawlessly.
The Situation:
My company gets a large quantity of database exports (160 files / 10 GB of .txt files that are '|' delimited) from our vendors on a monthly basis that have to replace the old vendor lists. I am working on a smallish C++ app to deal with it on my work desktop. The application is meant to set up the required tables, import the data, then execute a series of intermediate queries against multiple tables to assemble information into a series of final tables, which is then itself exported and uploaded to the production environment for use in the company's e-commerce website.
My Setup:
Ubuntu 12.04
MySQL Server v. 5.5.29 + MySQL Command Line client
Linux GNU C++ Compiler
libmysqlcppconn is installed and I have the required mysqlconn library linked in.
I have already overcome/tried the following issues/combinations:
1.) I have already discovered (the hard way) that LOAD DATA [LOCAL] INFILE statements must be enabled in the config -- I have the "local-infile" option set in the configuration files for both client and server (fixed by updating /etc/mysql/my.cnf with "local-infile" entries for the client and server; a sketch is shown after this list). NOTE: I could have used --local-infile=1 when restarting the mysql server, but this is my local dev environment, so I just wanted it turned on permanently.
2.) LOAD DATA LOCAL INFILE seems to fail to perform the import (from the CLI) if the target import file does not have execute permissions enabled (fixed with chmod +x target_file.txt)
3.) I am using the mysql root account in my application code (because it's my localhost, not production, and this particular program will never run on a production server).
4.) I have tried executing my compiled binary program using the sudo command (no change, same error "The used command is not allowed with this MySQL version")
5.) I have tried changing the ownership of the binary file from my normal login to root (no change, same error "The used command is not allowed with this MySQL version")
6.) I know the libcppmysqlconn is working because I am able to connect and perform the CREATE TABLE call without a problem, and I can do other queries and execute statements
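For reference (point 1 above), the relevant part of my /etc/mysql/my.cnf looks roughly like this; exact section names can vary between setups:

[mysqld]
local-infile=1

[mysql]
local-infile=1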
What am I missing? Any suggestions? Thanks in advance :)
After much diligent trial and error working with the /etc/mysql/my.cnf file (I know this is a permissions issue because it works on the command line, but not from the connector), and after much googling and finding some back-alley tech support posts, I've come to the conclusion that the MySQL C++ connector does not (for whatever reason) expose the ability for developers to enable the local-infile=1 option from the C++ connector.
Apparently some people have been able to hack/fork the MySQL C++ connector to expose the functionality, but no one posted their source code -- they only said it worked. Apparently there is a workaround in the MySQL C API: after you initialize the connection, you would use this:
mysql_options( &mysql, MYSQL_OPT_LOCAL_INFILE, 1 );
which apparently allows the LOAD DATA LOCAL INFILE statements to work with the MySQL C API.
Here are some reference articles that led me to this conclusion:
1.) How can I get the native C API connection structure from MySQL Connector/C++?
2.) Mysql 5.5 LOAD DATA INFILE Permissions
3.) http://osdir.com/ml/db.mysql.c++/2004-04/msg00097.html
Essentially, if you want the ability to use the LOAD DATA LOCAL INFILE functionality from a programmatic connector API, you have to use the MySQL C API or hack/fork the existing MySQL C++ API to expose the connection structure. Or just stick to executing the LOAD DATA LOCAL INFILE from the command line :(