In Google Cloud Platform, you can add labels to several resources, and also to the query jobs you execute. I went with the second option. A typical command looks like this:
bq query --label=my_label:{parameter} --label=my_label2:{parameter2} --format=json --use_legacy_sql=false '{query}'
But, by mistake, the first time I ran it like this:
bq query --label=my_label{parameter} --label=my_label2:{parameter2} --format=json --use_legacy_sql=false '{query}'
which created several jobs (I ran this command regularly) carrying a label named my_labelFoo with an empty value, instead of a label named my_label with a value of Foo. We detected this when, in the Billing UI, we noticed several labels offered as filter options, all of them being:
my_labelFoo
my_labelBar
my_labelBaz
my_labelJohn
my_labelGeorge
my_labelRingo
my_labelPaul
...
What I then tried to do was delete the metadata of those wrongly labeled jobs. So I ran this query in BigQuery (with the appropriate permissions):
SELECT job_id, query, labels FROM `my-project`.`region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT WHERE ARRAY_LENGTH(labels) > 0 AND EXISTS(SELECT * FROM UNNEST(labels) l WHERE l.key = 'my_labelRingo')
For each job_id retrieved this way, I tried invoking:
from google.cloud.bigquery import Client
Client().delete_job_metadata(job_id, location="us")
What I can say for sure is that the job entries were removed (there were only a few), but...
...when I go back to the Billing UI, I still see my_labelRingo as a selectable label there. I don't want that label to exist anymore.
So, my question is:
How do I delete the wrong labels from the Billing UI?
Is there, perhaps, a time I have to wait for my_labelRingo to cease to exist?
The situation you are experiencing with the labels in the Billing console is specific to Cloud Billing Support, and you will need to engage them directly using this link so they can fully investigate why it is happening.
The solutions shared below are alternative ways to delete labels in the GCP BigQuery console.
You can delete a table or view label in the following ways:
Using the console
Using SQL DDL statements
Using the bq command-line tool's bq update command
Calling the tables.patch API method
Because views are treated like table resources, tables.patch is used to modify both views and tables.
Using the client libraries
But you would need the following permissions:
bigquery.tables.get
bigquery.tables.update
For example, to delete a label through the console, follow these steps:
In the console, select the dataset you want to edit.
Click the Details tab, then click the pencil icon to the right of Labels.
In the Edit labels dialog:
For each label you want to delete, click delete (X).
Click Update to save the changes.
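And if you prefer the client libraries, here is a minimal sketch with the Python client (the table name is hypothetical); in this client, a label is deleted by setting its value to None and patching only the labels field:
from google.cloud import bigquery

client = bigquery.Client()
table = client.get_table("my-project.my_dataset.my_table")  # hypothetical table

# Setting a label's value to None marks it for deletion.
labels = dict(table.labels)
labels["my_labelRingo"] = None
table.labels = labels

client.update_table(table, ["labels"])  # only the labels field is patched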
You can also see more ways to delete labels.
Related
I added resource labels to a few VMs to be able to pull a more granular billing breakdown by label. However, when I go to the billing report, I don't see any option to filter by label. Is this a permissions issue, or am I missing something?
If I embed "label=" in the URL, the label option shows up, but it still doesn't retrieve the matching key-value pair.
Based on my analysis, your issue can be due to the following reasons:
The official documentation says:
When filtering your billing breakdown by label keys, you are not able to select labels applied to a project. You can select other user-created labels that you set up and applied to Google Cloud services.
This might be why you are unable to filter by the label.
Google does not recommend creating large numbers of unique labels, such as for timestamps or individual values for every API call. Refer to these common use cases for labels, and refer to this link for label requirements.
You need the resourcemanager.projects.get permission, and also resourcemanager.projects.update, to add or modify a label.
Refer to this link to create the label.
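For reference, here is a minimal sketch of setting a VM label programmatically with the google-cloud-compute Python client (project, zone, instance, and label names are hypothetical); the current label fingerprint must be sent back along with the new labels:
from google.cloud import compute_v1

client = compute_v1.InstancesClient()
instance = client.get(project="my-project", zone="us-central1-a", instance="my-vm")

# Merge in the new label; the fingerprint guards against concurrent edits.
labels = dict(instance.labels)
labels["team"] = "analytics"
request = compute_v1.InstancesSetLabelsRequest(
    label_fingerprint=instance.label_fingerprint,
    labels=labels,
)
client.set_labels(
    project="my-project",
    zone="us-central1-a",
    instance="my-vm",
    instances_set_labels_request_resource=request,
)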
I cannot find a way to do this in the UI: I'd like to have distinct query tabs in the BigQuery's UI attached to the same session (i.e. so they share the same ##session_id and _SESSION variables). For example, I'd like to create a temporary table (session-scoped) in one tab, then in a separate query tab be able to refer to that temp table.
As far as I can tell, when I put a query tab in Session Mode, it always creates a new session, which is precisely what I don't want :-\
Is this doable in BQ's UI?
There is a third-party IDE for BigQuery that supports such a feature (namely, joining tabs to an existing session).
This is Goliath - part of the Potens.io Suite, available on the Marketplace.
Let's see how it works there:
Step 1 - create a tab with a new session and run some query to actually initiate the session.
Step 2 - create new tab(s) and join them to the existing session (either using the session_id or simply the respective tab name).
So now both tabs (Tab 2 and Tab 3) share the same session, with all the expected perks.
You can add as many tabs to that session as you want, to comfortably organize your workspace.
And, as you can see, tabs that belong to the same session are colored in a user-defined color, so it is easy to navigate between them.
Note: another tool in this suite is Magnus - Workflow Automator. It supports all of BigQuery, Cloud Storage, and most Google APIs, as well as multiple simple utility-type tasks (BigQuery Task, Export to Storage Task, Loop Task, and many more), along with advanced scheduling, triggering, etc. It supports GitHub for source control as well.
Disclosure: I am a GDE for Google Cloud, the creator of those tools, and a leader on the Potens team.
Right now I fetch the columns and data types of BQ tables via the query below:
SELECT COLUMN_NAME, DATA_TYPE
FROM `Dataset`.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name="User"
But if I drop a column using the command ALTER TABLE User DROP COLUMN blabla,
the column blabla is not actually deleted for 7 days (TTL), according to the official documentation.
After running that command, the column is still there in the schema, as well as in Dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS.
It is just that I cannot insert data into that column or view it in the GCP console. This inconsistency really causes an issue
when I want to write a bash script that monitors schema changes and performs some operation based on them.
I need more visibility into BigQuery table schemas. The least I need is for:
Dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS to store a flag column that indicates deleted or TTL:7days.
My questions are:
How can I fetch the correct schema in Spanner that reflects the recently deleted column?
If the column is not actually deleted, is there any way to easily restore it?
If you want to find the recently deleted column, you can try searching through Cloud Logging. I'm not sure what tools Spanner supports, but if you want to use Bash you can use gcloud to fetch the logs, though it will be difficult to parse the output and extract the information you want.
The command below fetches the logs for google.cloud.bigquery.v2.JobService.InsertJob (since an ALTER TABLE is considered an InsertJob) and filters them on the actual query text containing drop. The regex I used is not strict (for the sake of example); I suggest making it stricter.
gcloud logging read 'protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob" AND protoPayload.metadata.jobChange.job.jobConfig.queryConfig.query=~"Alter table .*drop.*"'
Sample snippet from the command above (column PADDING is deleted based on the query):
If you have options other than Bash, I suggest creating a BigQuery sink for your logs so you can run queries there and extract this information. You can also use client libraries like Python, Node.js, etc. to either query the sink or query Cloud Logging directly, as sketched below.
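For instance, here is a minimal sketch with the google-cloud-logging Python client, reusing the same filter (not a definitive implementation; adjust the regex and payload fields to match your audit logs):
from google.cloud import logging

client = logging.Client()
log_filter = (
    'protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob" '
    'AND protoPayload.metadata.jobChange.job.jobConfig.queryConfig.query=~"Alter table .*drop.*"'
)
# Each matching entry is an audit-log record whose payload carries the
# offending query under metadata.jobChange.job.jobConfig.queryConfig.query.
for entry in client.list_entries(filter_=log_filter):
    print(entry.timestamp, entry.payload)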
As per this SO answer, you can use BigQuery's time travel feature to query the deleted column. That answer also explains BigQuery's behavior of retaining the deleted column for 7 days, and a workaround to delete the column instantly. See the actual query used to retrieve the deleted column, and the workaround for deleting a column, at the previously provided link.
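As a sketch of that time-travel approach with the Python client (project, dataset, table, and column names are hypothetical):
from google.cloud import bigquery

client = bigquery.Client()

# Query the table as it looked before the DROP COLUMN, via time travel.
sql = """
SELECT blabla
FROM `my-project.Dataset.User`
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
"""
for row in client.query(sql).result():
    print(row)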
I would like to be able to move (copy and replace) views from one project to another.
Suppose we have a Dev GCP project, along with Integration and Production GCP projects. I would like to be able to move individual views or specific datasets only (not tables) from Dev to Int, then from Int to Prod.
I know I can use the following Google Cloud Shell command to move tables:
bq cp ProjectNumberDev.DatasetDev.TableDev ProjectNumberInt.DatasetInt.TableInt
However, this command only works with tables, not views. Is there a way to do this with views, or is a Table Insert / POST API script the only way?
Per documentation:
Currently, there is no supported method for copying a view from one dataset to another. You must recreate the view in the target dataset.
You can copy the SQL query from the old view:
Issue the bq show command.
The --format flag can be used to control the output. If you are getting information about a view in a project other than your default project, add the project ID to the dataset in the following format: [PROJECT_ID]:[DATASET]. To write the view properties to a file, add > [PATH_TO_FILE] to the command.
bq show --format=prettyjson [PROJECT_ID]:[DATASET].[VIEW] > [PATH_TO_FILE]
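If you'd rather script the recreation end to end, here is a minimal sketch with the Python client library (project, dataset, and view names are hypothetical):
from google.cloud import bigquery

client = bigquery.Client()

# Read the view definition from the source (Dev) project.
source = client.get_table("dev-project.dataset_dev.my_view")

# Recreate it in the target (Int) project with the same SQL.
target = bigquery.Table("int-project.dataset_int.my_view")
target.view_query = source.view_query

# Delete any existing copy first so this behaves like "copy and replace".
client.delete_table(target, not_found_ok=True)
client.create_table(target)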
Meanwhile, if you can script all your views during development, then you can use the CREATE VIEW statement in all environments.
See more under Data Definition Language.
You can apply the same approach to table creation, etc.
Moving views from one project to another is now possible in the Cloud Console, at least.
Currently, you can copy a view only by using the Cloud Console.
Here is the instruction from the GCP documentation:
I've been looking around and have not been able to find anywhere in the AWS console where I can query the tables I have created in DynamoDB.
Is it possible to run quick queries against any of the tables I have in DynamoDB from within AWS itself, or will I actually have to build a quick app that lets me run the queries?
I would have thought there would be some basic tool provided that lets me run queries against the tables. If there is, it's well hidden...
Any help would be great!
DynamoDB console -> Tables -> click the table you want to query -> select the Items tab
You will then see an interface to Scan or Query the table. Change the first drop-down from "Scan" to "Query" based on what you want to do, and use the second drop-down to select the table index you want to query.
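If you outgrow the console, a minimal boto3 sketch (table, key, and value names are hypothetical) runs the same Query and Scan operations programmatically:
import boto3
from boto3.dynamodb.conditions import Key

# Hypothetical table and key names; substitute your own.
table = boto3.resource("dynamodb").Table("Users")

# Query: an efficient lookup by partition key.
response = table.query(KeyConditionExpression=Key("user_id").eq("42"))
for item in response["Items"]:
    print(item)

# Scan: reads the whole table; fine for quick checks, costly at scale.
print(table.scan(Limit=10)["Items"])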