GCP Data Catalog - BigQuery gcloud Command - google-cloud-platform

How do I switch between the enumerated values of a field in a Data Catalog tag attached to a BigQuery dataset using a gcloud command?
I am able to change the value manually. Say the enumerated values are Happy and Sad: I can swap between Sad and Happy manually, but I want to do it with a gcloud command.

The basic syntax is:
gcloud beta data-catalog entries update [ENTRY_ID] --location=[LOCATION] --update-mask=[FIELD_NAME] --type=[TAG_TYPE] [FIELD_NAME]=[NEW_VALUE]
Where:
ENTRY_ID is the ID of the Data Catalog entry the tag is attached to
LOCATION is the location of the Data Catalog instance that contains the entry
FIELD_NAME is the name of the field you want to update
TAG_TYPE is the type of the tag you want to update
NEW_VALUE is the new value you want to set for the field
For example, to update a field named emotion from "sad" to "happy", you would use the following command:
gcloud beta data-catalog entries update [ENTRY_ID] --location=[LOCATION] --update-mask=emotion --type=[TAG_TYPE] emotion=happy

Related

GCP, is there a way to find which Asset-types can be labelled and which cannot?

I need to find out which resources (asset types) in an entire GCP organization can be labelled.
In short, I do not want resources that don't have a Label column in their schema. Is there a way to find the columns of every asset type, or any other way to extract only the resources that have a Label column/attribute?
gcloud asset search-all-resources --scope=organizations/Org-ID \
--filter=-labels:* --format='csv(name, assetType, labels)' --sort-by=name > notLabels.csv
I use this command to get the resources, but it also returns the resources that can't be labelled.
You can find the list of services that support labels in GCP in this documentation.
And you can filter it with the following format below as an example:
gcloud asset search-all-resources --filter labels.env:*
The command above lists the resources that have env as a label key, with any value.
gcloud asset search-all-resources --filter=-labels.*
The second sample command lists the resources that have no labels at all; the - in front of the labels filter negates it.
You can find more information on using filter searches using labels here.
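As a sketch, you could combine that with the scope and output columns from your original command. This assumes labels:* (the un-negated form of the filter you used) matches resources that carry at least one label:
gcloud asset search-all-resources --scope=organizations/Org-ID \
--filter='labels:*' --format='csv(name, assetType, labels)' --sort-by=name > labeled.csv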

AWS Athena/Glue: Add partition to the data catalog using AWS CLI

I want to manually add a partition to the data catalog.
I can do it using the Athena editor, but I want to do it with a shell script that I can parameterize and schedule.
Here's an example of my SQL command to add a partition:
ALTER TABLE tab ADD PARTITION (year='2020', month='10', day='28', hour='23') LOCATION 's3://bucket/data/2020/10/28/23/'
I want to do the same thing with a shell command.
I think I can use the Glue API: create-partition
Doc: https://docs.aws.amazon.com/cli/latest/reference/glue/create-partition.html
I'm trying, but there is something wrong with the format of the parameter I pass.
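For reference, a minimal sketch of what the create-partition call might look like. The database name and the StorageDescriptor formats/serde below are illustrative (a plain text table is assumed) and should mirror your table's actual definition:
aws glue create-partition \
    --database-name my_database \
    --table-name tab \
    --partition-input '{
        "Values": ["2020", "10", "28", "23"],
        "StorageDescriptor": {
            "Location": "s3://bucket/data/2020/10/28/23/",
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
            }
        }
    }'
Note that the Values list must be in the same order as the table's partition keys (year, month, day, hour here).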

gcloud command output formatting to use results in another gcloud command

I'm trying to automate the deletion of SSL certificates that end with a certain text pattern on GCP projects.
For this I use the command:
gcloud compute ssl-certificates list --filter="name~'819$'" --format="(name)"
Its output displays exactly this format:
NAME
certname1-1602160819
certname2-1602160819
certname3-1602160819
...and so on
The thing is, if I want to feed the results of this command into another gcloud command that deletes each certificate, the first value I get is NAME, which is the column header and obviously not a certificate.
Here is my script:
#!/bin/bash
for oldcert in $( gcloud compute ssl-certificates list --filter="name~'819$'" --format="(NAME)")
do
gcloud compute ssl-certificates delete $oldcert
done
Do you know how I could remove the NAME header from the output, so I could pass each result directly to another command?
Thanks for your advice.
@Hitobat thanks very much for your comment.
I used the csv[no-heading] option, even though the tail -n +2 option also does the job.
The following commands did the job great:
#!/bin/bash
for oldcert in $( gcloud compute ssl-certificates list --filter="name~'819$'" --format="csv[no-heading](name)")
do
gcloud compute ssl-certificates delete $oldcert --quiet
done
The right format to use is --format="value(name)".
According to the docs:
value
CSV with no heading and <TAB> separator instead of <COMMA>. Used to retrieve individual resource values.
So it's equivalent to the --format="csv[no-heading](name)" that you used, but "more correct" (and a little more legible).
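Putting that together, the same loop with the value format would look like this (same filter and --quiet flag as above):
#!/bin/bash
for oldcert in $( gcloud compute ssl-certificates list --filter="name~'819$'" --format="value(name)")
do
gcloud compute ssl-certificates delete "$oldcert" --quiet
done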

How to export and import google cloud monitoring dashboards between projects using script or API?

I have exported the dashboards using gcloud alpha monitoring dashboards list --format=json, but gcloud monitoring dashboards create with that file is not working. Basically, I want to export the dashboards from one project and import them into another project.
The output of the list subcommand probably (didn't test this) contains too many dashboards for the create command.
Also, you should remove two fields (name and etag). There is no need to export as JSON; YAML also works and is easier to edit anyway.
I did the following:
Run gcloud monitoring dashboards list and find the dashboard I was looking for
Note its name property and take the ID from the last part of the name property (a large decimal number or GUID)
Dump the dashboard with gcloud monitoring dashboards describe $DASHBOARD_ID > dashboard-$DASHBOARD_ID.yaml
Edit the file to remove the etag and name fields (the name is usually located at the end of the file)
gcloud monitoring dashboards create --config-from-file dashboard-$DASHBOARD_ID.yaml
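Put together as a small script, the steps above might look like this. DASHBOARD_ID and TARGET_PROJECT are placeholders to fill in, and removing the name and etag fields is still a manual edit in between:
#!/bin/bash
# Export the dashboard definition (describe outputs YAML by default).
gcloud monitoring dashboards describe "$DASHBOARD_ID" > "dashboard-$DASHBOARD_ID.yaml"
# Edit dashboard-$DASHBOARD_ID.yaml to delete the name and etag fields, then recreate it in the target project:
gcloud monitoring dashboards create --config-from-file="dashboard-$DASHBOARD_ID.yaml" --project="$TARGET_PROJECT"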

Cloud DataFlow SQL from BigQuery UI cannot read Cloud Storage filesets: "Table not found: datacatalog.entry"

I'm trying to create a Dataflow job using the beta Cloud Dataflow SQL within the Google BigQuery UI.
My data source is a Cloud Storage fileset (that is, a set of files in Cloud Storage defined through Data Catalog).
Following GCP documentation, I was able to define my fileset, assign it a schema and visualize it in the Resources tab of Big Query UI.
But then I cannot launch any Dataflow job in the Query Editor, because I get the following error message in the query validator: Table not found: datacatalog.entry.location.entry_group.fileset_name...
Is it an issue of some APIs not being authorized?
Thanks for your help!
You may be using the wrong location in the full path. When you create a Data Catalog fileset, check the location you provided, e.g. using the sales regions example from the docs:
gcloud data-catalog entries create us_state_salesregions \
--location=us-central1 \
--entry-group=dataflow_sql_dataset \
--type=FILESET \
--gcs-file-patterns=gs://us_state_salesregions_{my_project}/*.csv \
--schema-from-file=schema_file.json \
--description="US State Sales regions..."
When you are building your DataFlow SQL query:
SELECT tr.*, sr.sales_region
FROM pubsub.topic.`project-id`.transactions as tr
INNER JOIN
datacatalog.entry.`project-id`.`us-central1`.dataflow_sql_dataset.us_state_salesregions AS sr
ON tr.state = sr.state_code
Check the full path; it should look like the example above:
datacatalog.entry, then your project-id, then your location (us-central1 in this example), then your entry group ID (dataflow_sql_dataset in this example), and finally your entry ID (us_state_salesregions in this example).
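Written as a generic template (each segment replaced with your own values, mirroring the query above):
datacatalog.entry.`PROJECT_ID`.`REGION`.ENTRY_GROUP_ID.ENTRY_ID
The region segment must match the --location you used when creating the fileset.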
Let me know if this works for you.