Moving views to another project in BigQuery - google-cloud-platform

I would like to be able to move (copy and replace) views from one project to another.
Suppose we have a Dev GCP project, along with Integration and Production GCP projects. I would like to be able to move individual views or specific datasets only (not tables) from Dev to Int, then from Int to Prod.
I know I can use the following Google Cloud Shell command for copying tables:
bq cp ProjectNumberDev.DatasetDev.TableDev ProjectNumberInt.DatasetInt.TableInt
However, this command only works with tables, not views. Is there a way to do this with views, or is a table insert/POST API script the only way?

Per documentation:
Currently, there is no supported method for copying a view from one dataset to another. You must recreate the view in the target dataset.
You can copy the SQL query from the old view:
Issue the bq show command.
The --format flag can be used to control the output. If you are getting information about a view in a project other than your default project, add the project ID to the dataset in the following format: [PROJECT_ID]:[DATASET]. To write the view properties to a file, add > [PATH_TO_FILE] to the command.
bq show --format=prettyjson [PROJECT_ID]:[DATASET].[VIEW] > [PATH_TO_FILE]
In the meantime, if you can script all your views during development, you can use the CREATE VIEW statement in all environments.
See the Data Definition Language documentation for more.
You can apply the same approach to table creation, etc.
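If you prefer to script that re-creation instead of doing it by hand, here is a minimal sketch using the google-cloud-bigquery Python client (the project, dataset and view names below are placeholders): it reads the view's SQL from the source project and recreates the view in the target project. Note that the copied SQL still references whatever tables the original view pointed at, so if each environment has its own datasets you will also need to rewrite those references.

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder identifiers -- replace with your own projects, datasets and views.
source_id = "ProjectDev.DatasetDev.ViewDev"
target_id = "ProjectInt.DatasetInt.ViewInt"

source_view = client.get_table(source_id)        # a view comes back as a Table object
if source_view.table_type != "VIEW":
    raise ValueError(f"{source_id} is not a view")

target_view = bigquery.Table(target_id)
target_view.view_query = source_view.view_query  # copy the view's SQL definition
client.create_table(target_view)                 # recreate the view in the target dataset
print(f"Recreated view {target_id}")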

Moving views from one project to another is now possible, at least in the Cloud Console.
Currently, you can copy a view only by using the Cloud Console.
The steps are described in the GCP documentation.

Related

How to transfer a BigQuery view to a Google Cloud Storage bucket as a CSV file

I need to export the content of a BigQuery view to a CSV file in GCS, with an Airflow DAG. To export the content of a BQ table, I can use BigQueryToCloudStorageOperator. But in my case I need to use an existing view, and BigQueryToCloudStorageOperator fails with this error, which I see when checking the logs for the failed DAG:
BigQuery job failed: my_view is not allowed for this operation because it is currently a VIEW
So, what options do I have here? I can't use a regular table, so maybe there is another operator that works with view data stored in BQ instead of a table? Or maybe the same operator would work with some additional options (although I don't see anything useful in the Apache documentation for BigQueryToCloudStorageOperator)?
I think the BigQuery client doesn't give you the possibility to export a view to a GCS file.
It's not perfect, but I propose two solutions.
First solution (more native, with existing operators; see the sketch after these steps):
Create a staging table to export to GCS
At the beginning of your DAG, create a task that truncates this staging table
Add a task with a SELECT on your view and an INSERT into your staging table (insert/select)
Use the bigquery_to_gcs operator from your staging table
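Here is a rough sketch of that first approach as an Airflow DAG. It assumes the operator names from the current apache-airflow-providers-google package and Airflow 2.4+ (for the schedule argument); the project, dataset, staging table and bucket names are placeholders.

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
from airflow.providers.google.cloud.transfers.bigquery_to_gcs import BigQueryToGCSOperator

# Placeholder identifiers -- adjust to your environment.
PROJECT = "my-project"
VIEW = f"{PROJECT}.my_dataset.my_view"
STAGING = f"{PROJECT}.my_dataset.my_view_staging"

with DAG("export_view_to_gcs", start_date=datetime(2023, 1, 1), schedule=None, catchup=False) as dag:
    # 1. Empty the staging table so it only ever holds the latest snapshot of the view.
    truncate_staging = BigQueryInsertJobOperator(
        task_id="truncate_staging",
        configuration={"query": {"query": f"TRUNCATE TABLE `{STAGING}`", "useLegacySql": False}},
    )
    # 2. Materialize the view into the staging table (insert/select).
    load_staging = BigQueryInsertJobOperator(
        task_id="load_staging",
        configuration={"query": {"query": f"INSERT INTO `{STAGING}` SELECT * FROM `{VIEW}`", "useLegacySql": False}},
    )
    # 3. Export the staging table (a regular table, so the export is allowed) to GCS as CSV.
    export_to_gcs = BigQueryToGCSOperator(
        task_id="export_to_gcs",
        source_project_dataset_table=STAGING,
        destination_cloud_storage_uris=["gs://my-bucket/my_view.csv"],
        export_format="CSV",
    )
    truncate_staging >> load_staging >> export_to_gcs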
Second solution (less native, with the Python clients and a PythonOperator; sketched below):
Use a PythonOperator
In this operator, use the BigQuery Python client to load the data from your view as a dict, and the Cloud Storage Python client to generate a file in GCS from that dict
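And a rough sketch of that second approach, i.e. the callable you would hand to a PythonOperator (the view, bucket and file names are placeholders; the CSV is built in memory, so this only suits views of modest size):

import csv
import io

from google.cloud import bigquery, storage

def export_view_to_gcs():
    """Callable for a PythonOperator: query the view and write a CSV to GCS."""
    bq_client = bigquery.Client()
    rows = bq_client.query("SELECT * FROM `my-project.my_dataset.my_view`").result()

    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([field.name for field in rows.schema])  # header row from the view schema
    for row in rows:
        writer.writerow(list(row.values()))

    # Upload the generated CSV to the bucket.
    storage.Client().bucket("my-bucket").blob("my_view.csv").upload_from_string(
        buffer.getvalue(), content_type="text/csv"
    )

In the DAG you would then wrap it as PythonOperator(task_id="export_view_to_gcs", python_callable=export_view_to_gcs).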
I have a preference for the first solution, even if it forces me to create a staging table.
I ended up with a kind of combined solution; part of it is what Mazlum Tosun suggested in his answer. In my DAG I added an extra first step, a DataLakeKubernetesPodOperator, which runs a Python file. That Python file calls SQL files containing simple queries (put in the await asyncio.wait(...) block and executed with bq_execute()): first truncate an existing table (to prepare it for new data), then copy (insert) data from the view into the truncated table (as Mazlum Tosun suggested).
After that step, the rest is the same as before: I use BigQueryToCloudStorageOperator to copy data from the regular table (which now contains the data from the view) to a Google Cloud Storage bucket, and now it works fine.

Attach multiple query tabs in BigQuery to the same BQ Session

I cannot find a way to do this in the UI: I'd like to have distinct query tabs in the BigQuery UI attached to the same session (i.e., so they share the same ##session_id and _SESSION variables). For example, I'd like to create a session-scoped temporary table in one tab, then be able to refer to that temp table from a separate query tab.
As far as I can tell, when I put a query tab in Session Mode, it always creates a new session, which is precisely what I don't want :-\
Is this doable in BQ's UI?
There is a 3rd-party IDE for BigQuery that supports such a feature (namely, joining Tab(s) to an existing session).
This is Goliath - part of the Potens.io suite, available in the Marketplace.
Let's see how it works there:
Step 1 - create a Tab with a new session and run some query to actually initiate the session.
Step 2 - create new Tab(s) and join them to the existing session (either using the session_id or simply the respective Tab name).
So now both Tabs (Tab 2 and Tab 3) share the same session with all the expected perks.
You can add as many Tabs to that session as you want, to comfortably organize your workspace.
And, as you can see, Tabs that belong to the same session are colored in a user-defined color, so it is easy to navigate between them.
Note: another tool in this suite is Magnus - Workflow Automator. It supports all of BigQuery, Cloud Storage and most Google APIs, as well as multiple simple utility-type tasks like BigQuery Task, Export to Storage Task, Loop Task and many more, along with advanced scheduling, triggering, etc. It supports GitHub as source control as well.
Disclosure: I am a GDE for Google Cloud, the creator of those tools, and lead of the Potens team.
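For reference, and strictly outside the UI: the same idea of attaching further queries to an existing session can be scripted with the google-cloud-bigquery Python client (a reasonably recent version that supports sessions), which also shows what "joining a session" means in practice. A minimal sketch, using a throwaway temp table as the example:

from google.cloud import bigquery

client = bigquery.Client()

# Start a session with the first query (e.g. create a session-scoped temp table).
start_job = client.query(
    "CREATE TEMP TABLE my_temp AS SELECT 1 AS x",
    job_config=bigquery.QueryJobConfig(create_session=True),
)
start_job.result()
session_id = start_job.session_info.session_id

# Attach a second, separate query to the same session and reuse the temp table.
follow_up = client.query(
    "SELECT * FROM my_temp",
    job_config=bigquery.QueryJobConfig(
        connection_properties=[bigquery.ConnectionProperty("session_id", session_id)]
    ),
)
for row in follow_up.result():
    print(row.x)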

Can we use a button to change the data source in Power BI?

I am new to Power BI. I am using a SQL connection for data load in Power BI.
I created the report in the Dev environment, but I want to use the same report in all environments (dev/test/uat/prod).
Question: is it possible to switch the connection via a button click in the dashboard?
You'll have to use a parameter to select the connection and store the report in template format - *.pbit. Then you can easily create different versions of the report from the template by specifying the corresponding parameter setting.
The only way to use a slicer for changing the environment would be to load the data from all different environments into the model first - which is clearly not recommended.
Power BI offers Deployment Pipelines for this purpose. This tool allows you to create 3 workspaces for the dev, test and production stages. You can then deploy from one stage to another by clicking a button in Power BI Service or by using the REST API. In the pipeline you can define rules for the dataset and parameters, which can be used to automatically change the data source when deploying to the next stage, i.e. to change the data source from your dev database to the test database, or from the test database to the production one.
You can also implement similar functionality using the API. See for example this answer.
It's a tricky question.
Try the above answers first; if those don't work, try this approach.
I don't think they have implemented a built-in solution for that yet.
In my experience, I had to create 3 dashboards and gateways for the dev, test and prod environments.
If your dev, test and prod database column names are the same, you can just copy-paste your dashboard and rename it accordingly.
Then go to change the data source, add the new test environment host, and change the schema to the test environment.
If you get a few errors, you have to resolve them: check the column names and the host, and finally sync your data.
You can use the same approach for the prod environment.
Once you publish, you can point to the gateway for the dev, test or prod environment.
Note: establish the gateway on your server.

Move an entire dataset from one Google project to another Google project without data

As part of code deployment to production, we need to copy all tables from a BigQuery dataset to the production environment. However, both the UI option and the bq command-line option move the data too. How do I move all the BigQuery tables at once from the non-prod to the prod environment without the data?
Kindly suggest.
Posting my comment as an answer:
I don't know of any way to achieve what you want directly, but there is a possible workaround:
You first need to create the dataset in the destination project and then run CREATE TABLE new_project.dataset.xx AS SELECT * FROM old_project.dataset.xx WHERE 1=0.
You also need to make sure to specify the partition field. This works well for datasets with just a few tables; for larger datasets you can script this operation in Python or whatever else you use (see the sketch below).
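As a sketch of what such a script could look like with the google-cloud-bigquery Python client (the project and dataset names are placeholders): instead of issuing the DDL above per table, it copies each table's schema, partitioning and clustering spec through the API, which gives the same structure-without-data result and carries the partition field over automatically.

from google.cloud import bigquery

# Placeholder project/dataset names -- adjust to your environment.
SRC = "old_project.dataset"
DST = "new_project.dataset"

client = bigquery.Client()

for item in client.list_tables(SRC):                  # iterate the tables in the source dataset
    src_table = client.get_table(item.reference)      # full definition: schema, partitioning, clustering
    if src_table.table_type != "TABLE":               # skip views, materialized views, etc.
        continue
    dst_table = bigquery.Table(f"{DST}.{src_table.table_id}", schema=src_table.schema)
    dst_table.time_partitioning = src_table.time_partitioning    # keep the partition field
    dst_table.range_partitioning = src_table.range_partitioning
    dst_table.clustering_fields = src_table.clustering_fields
    client.create_table(dst_table, exists_ok=True)                # empty table, no data copied
    print(f"Created {DST}.{src_table.table_id}")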

Creating a BigQuery dataset from a log sink in GCP

When running
gcloud logging sinks list
it seems I have several sinks for my project
▶ gcloud logging sinks list
NAME DESTINATION FILTER
myapp1 bigquery.googleapis.com/projects/myproject/datasets/myapp1 resource.type="k8s_container" resource.labels.cluster_name="mygkecluster" resource.labels.container_name="myapp1"
myapp2 bigquery.googleapis.com/projects/myproject/datasets/myapp2 resource.type="k8s_container" resource.labels.cluster_name="mygkecluster" resource.labels.container_name="myapp2"
myapp3 bigquery.googleapis.com/projects/myproject/datasets/myapp3 resource.type="k8s_container" resource.labels.cluster_name="mygkecluster" resource.labels.container_name="myapp3"
However, when I navigate in my BigQuery console, I don't see the corresponding datasets.
Is there a way to import these sinks as datasets so that I can run queries against them?
This guide on creating BigQuery datasets does not list how to do so from a log sink (unless I am missing something)
Also any idea why the above datasets are not displayed when using the bq ls command?
Firstly, be sure to be in the correct project. If not, you can add a dataset from an external project by clicking on the PIN button (and you need to have enough permissions for this).
Secondly, the Cloud Logging sink to BigQuery doesn't create the dataset, only the tables. So, if you have created the sinks without the dataset, your sinks aren't running (or are running in error). Here are more details:
BigQuery: Select or create the particular dataset to receive the exported logs. You also have the option to use partitioned tables.
In general, what you expect this feature to do is right: using BigQuery as a log sink allows you to query the logs with BQ. As for the problem you're facing, I believe it has to do with using the web console vs. gcloud.
When using BigQuery as a log sink, there are two ways to specify a dataset:
point to an existing dataset
create a new dataset
When creating a new sink via the web console, there's an option to have Cloud Logging create a new dataset for you as well. However, when using gcloud logging sinks create, it does not automatically create a dataset for you; it only creates the log sink. It seems that it also does not validate whether the specified dataset exists.
To resolve this, you can either use the web console for the task or create the datasets on your own. There's nothing special about creating a BQ dataset to be a log sink destination compared to creating a BQ dataset for any other purpose. Create a BQ dataset, then create a log sink pointing to that dataset, and you're good to go.
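A minimal sketch of that order of operations with the Python clients (the dataset, sink name and filter below are placeholders based on the question; bq mk plus gcloud logging sinks create would work just as well):

from google.cloud import bigquery
from google.cloud import logging as cloud_logging

PROJECT = "myproject"

# 1. Create the destination dataset first -- the sink will not create it for you.
bq_client = bigquery.Client(project=PROJECT)
dataset = bq_client.create_dataset(f"{PROJECT}.myapp1", exists_ok=True)

# 2. Then create the log sink pointing at that dataset.
log_client = cloud_logging.Client(project=PROJECT)
sink = log_client.sink(
    "myapp1",
    filter_='resource.type="k8s_container" resource.labels.container_name="myapp1"',
    destination=f"bigquery.googleapis.com/projects/{PROJECT}/datasets/{dataset.dataset_id}",
)
if not sink.exists():
    sink.create()

# 3. Grant the sink's writer identity the BigQuery Data Editor role on the dataset
#    so log entries can actually be written (the console does this for you; here you must).
print(sink.writer_identity)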
Conceptually, the different products on GCP (BigQuery, Cloud Logging) run independently; a log sink in Cloud Logging is simply an object that pairs up a filter and a destination, but it does not own or manage the destination resource (e.g. a BQ dataset). It's just that the web console provides some extra integration to make things easier.