runtime_version versus runtime-version in cloudml-samples/flowers/sample.sh - google-cloud-ml

In Google's sample code at cloudml-samples/flowers/sample.sh, lines 52 to 64 include the argument "runtime_version":
# Training on CloudML is quick after preprocessing. If you ran the above
# commands asynchronously, make sure they have completed before calling this one.
gcloud ml-engine jobs submit training "$JOB_ID" \
--stream-logs \
--module-name trainer.task \
--package-path trainer \
--staging-bucket "$BUCKET" \
--region us-central1 \
--runtime_version=1.0 \
-- \
--output_path "${GCS_PATH}/training" \
--eval_data_paths "${GCS_PATH}/preproc/eval*" \
--train_data_paths "${GCS_PATH}/preproc/train*"
Shouldn't "runtime_version" be replaced with "runtime-version" to avoid an error?

Yes. I've submitted a PR to fix it. (In the future, never hesitate to do so yourself.)
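As a rough illustration of the fix, the check below scans the gcloud-side arguments (everything before the bare "--") and prints any flag name that contains an underscore. The helper name find_underscore_flags is hypothetical and not part of the sample repository. Arguments after "--" go to the trainer, where underscores such as --output_path are expected, so they are ignored.

```shell
# Hypothetical helper: print each gcloud-side flag whose name contains "_".
find_underscore_flags() {
  local arg name
  for arg in "$@"; do
    # A bare "--" ends the gcloud flags; the rest belongs to the trainer.
    [ "$arg" = "--" ] && break
    case "$arg" in
      --*)
        name="${arg%%=*}"   # drop any "=value" suffix
        case "$name" in
          *_*) printf '%s\n' "$name" ;;
        esac
        ;;
    esac
  done
}

find_underscore_flags \
  --stream-logs --region us-central1 --runtime_version=1.0 \
  -- --output_path "gs://example-bucket/training"
```

Run against the flags from the question, this prints --runtime_version; after the rename to --runtime-version it prints nothing.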

Related

What is the difference between Nami wallet signing and payment.skey signing?

I built a transaction in Node.js using cardano-cli:
cardano-cli transaction build \
--alonzo-era \
--testnet-magic 1 \
--tx-in dbf7f56f844cc4b85daccb62bedf4eeff0a84cb060f0f79b206c7f087b3f0ba1#0 \
--tx-in dbf7f56f844cc4b85daccb62bedf4eeff0a84cb060f0f79b206c7f087b3f0ba1#1 \
--tx-in 61b88efd41ccbb0e71c48aca2cbe63728078ec7fb20ec9c27acfe33d0647248d#0 \
--tx-in-script-file /cardano/plutus/direct-sale.plutus \
--tx-in-datum-file /cardano/temp/testnet/datums/list.json \
--tx-in-redeemer-file /cardano/temp/testnet/redeemers/buy.json \
--required-signer-hash 2be4a303e36f628e2a06d977e16f77ce2b9046b8c56576bb5286d1be \
--tx-in-collateral dbf7f56f844cc4b85daccb62bedf4eeff0a84cb060f0f79b206c7f087b3f0ba1#1 \
... some txout ...
--change-address addr_test1qq47fgcrudhk9r32qmvh0ct0wl8zhyzxhrzk2a4m22rdr0sqcga2xfzv6crryyt0sfphksfr947jjddy3t4u0qwfmmfq2h0pj8 \
--protocol-params-file /cardano/testnet/protocol-parameters.json \
--mint "2 20edea925974af2102c63adddbb6a6e789f8d3a16500b15bd1e1c32b.4143544956495459" \
--mint-script-file /cardano/plutus/activity-minter.plutus \
--mint-redeemer-file /cardano/redeemers/mint.json \
--invalid-before 19059345 \
--invalid-hereafter 19059495 \
--out-file ./tx.raw
After running this command, I got the cborHex and used it in the frontend.
When I sign using the Nami wallet, I get an error:
"transaction submit error ShelleyTxValidationError ShelleyBasedEraBabbage (ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (WrappedShelleyEraFailure (InvalidWitnessesUTXOW [VKey (VerKeyEd25519DSIGN \"cf949f966b426f25db11b6062edc31312001e3cd0ced4c6c7db3da7b5ac9766b\")])))])"
But when I sign with payment.skey in Node.js, it works.
I discussed this with Alexd1985 on the Cardano forum:
https://forum.cardano.org/t/how-to-resolve-shelleytxvalidationerror-shelleybasederababbage-applytxerror-utxowfailure-fromalonzoutxowfail-wrappedshelleyerafailure-invalidwitnessesutxow-error/113555/4
What is the solution to this problem?

Cloud Armor WAF - How to forward a rate-based ban to reCAPTCHA?

I successfully got a rate-based limit working in Cloud Armor, and reCAPTCHA works for me too. Is there a way for a Cloud Armor rate-based rule to redirect users to reCAPTCHA once they exceed some number of requests?
Rate-based limit:
gcloud beta compute security-policies rules create 100 \
--security-policy=$CA_POLICY \
--expression="true" \
--action=rate-based-ban \
--rate-limit-threshold-count=50 \
--rate-limit-threshold-interval-sec=120 \
--ban-duration-sec=300 \
--conform-action=allow \
--exceed-action=deny-404 \
--enforce-on-key=IP
reCAPTCHA redirect:
gcloud compute security-policies rules create 101 \
--security-policy $CA_POLICY \
--expression "request.path.matches(\"/index.php\")" \
--action redirect \
--redirect-type google-recaptcha

How to get the latest version of an image from Artifact Registry

Is there a gcloud command that returns the fully qualified name of the latest image in Artifact Registry?
Try:
PROJECT=
REGION=
REPO=
IMAGE=
gcloud artifacts docker images list \
${REGION}-docker.pkg.dev/${PROJECT}/${REPO} \
--filter="package=${REGION}-docker.pkg.dev/${PROJECT}/${REPO}/${IMAGE}" \
--sort-by="~UPDATE_TIME" \
--limit=1 \
--format='value(format("{0}@{1}",package,version))'
This command:
Filters the list for a specific image
Sorts the results descending (~) by UPDATE_TIME [1]
Only takes 1 value, i.e. the most recent
Outputs the results as {package}@{version}
[1] Curiously, --sort-by uses the output (!) field name, not the underlying type name (surfaced by e.g. --format=json or --format=yaml).
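The command above can be wrapped in a small function so the repository path is built once. This is a sketch only: the function name latest_image_ref and the example values are hypothetical, and it joins package and digest with "@", the separator Docker uses for digest references.

```shell
# Hypothetical wrapper: print the most recently updated image as package@digest.
# Arguments: REGION PROJECT REPO IMAGE
latest_image_ref() {
  local region="$1" project="$2" repo="$3" image="$4"
  local base="${region}-docker.pkg.dev/${project}/${repo}"

  # Single quotes around --format keep the inner double quotes intact for gcloud.
  gcloud artifacts docker images list "${base}" \
    --filter="package=${base}/${image}" \
    --sort-by="~UPDATE_TIME" \
    --limit=1 \
    --format='value(format("{0}@{1}",package,version))'
}

# Example call (placeholder values, requires gcloud and credentials):
# latest_image_ref us-central1 my-project my-repo my-image
```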
Many thanks for the previous answer. I use it to remove the "latest" tag from my last pushed artifact, then add the tag back when I push another. Leaving this here in case anyone is interested.
Docs: https://cloud.google.com/artifact-registry/docs/docker/manage-images#tag
Remove the tag:
gcloud artifacts docker tags delete \
$(gcloud artifacts docker images list \
${REGION}-docker.pkg.dev/${PROJECT}/${REPO}/${IMAGE}/ \
--filter="package=${REGION}-docker.pkg.dev/${PROJECT}/${REPO}/${IMAGE}" \
--sort-by="~UPDATE_TIME" --limit=1 \
--format='value(format("{0}",package))'):latest
Add the tag:
gcloud artifacts docker tags add \
$(gcloud artifacts docker images list \
${REGION}-docker.pkg.dev/${PROJECT}/${REPO}/${IMAGE}/ \
--filter="package=${REGION}-docker.pkg.dev/${PROJECT}/${REPO}/${IMAGE}" \
--sort-by="~UPDATE_TIME" --limit=1 \
--format='value(format("{0}@{1}",package,version))') \
$(gcloud artifacts docker images list \
${REGION}-docker.pkg.dev/${PROJECT}/${REPO}/${IMAGE}/ \
--filter="package=${REGION}-docker.pkg.dev/${PROJECT}/${REPO}/${IMAGE}" \
--sort-by="~UPDATE_TIME" --limit=1 \
--format='value(format("{0}",package))'):latest
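The add step runs the same lookup twice. A possible simplification, sketched below under the assumption that the package path is already known from the variables, is to look up only the digest once and reuse it. The function name tag_newest_as_latest is hypothetical.

```shell
# Hypothetical helper: add the "latest" tag to the most recently updated image.
# Arguments: REGION PROJECT REPO IMAGE
tag_newest_as_latest() {
  local region="$1" project="$2" repo="$3" image="$4"
  local package="${region}-docker.pkg.dev/${project}/${repo}/${image}"
  local digest

  # One lookup: the digest (sha256:...) of the newest version of this image.
  digest="$(gcloud artifacts docker images list "${package}" \
    --filter="package=${package}" \
    --sort-by="~UPDATE_TIME" --limit=1 \
    --format='value(version)')" || return 1

  if [ -z "${digest}" ]; then
    echo "no image found for ${package}" >&2
    return 1
  fi

  # Reuse the digest to build the source reference and the tag to add.
  gcloud artifacts docker tags add "${package}@${digest}" "${package}:latest"
}

# Example call (placeholder values, requires gcloud and credentials):
# tag_newest_as_latest us-central1 my-project my-repo my-image
```

Keeping the lookup in one place also avoids the two subcommands returning different images if a push lands between them.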

Unable to create a gcloud alert policy from the command line with multiple conditions

I am trying to create a single alert policy for the Cloud SQL instance_state metric through gcloud, with multiple conditions.
If the instance is in the "RUNNABLE" OR "FAILED" state for more than 5 minutes, an alert should be triggered. I was able to create this policy in the console.
Now I am trying to do the same from the command line with this gcloud command:
gcloud alpha monitoring policies create \
--display-name='Test Database State Alert ('$PROJECTID')' \
--condition-display-name='Instance is not running for 5 minutes'\
--notification-channels="x23234dfdfffffff" \
--aggregation='{"alignmentPeriod": "60s","perSeriesAligner": "ALIGN_COUNT_TRUE"}' \
--condition-filter='metric.type="cloudsql.googleapis.com/database/instance_state" AND resource.type="cloudsql_database" AND (metric.labels.state = "RUNNABLE")'
OR 'metric.type="cloudsql.googleapis.com/database/instance_state" AND resource.type="cloudsql_database" AND (metric.labels.state = "FAILED")' \
--duration='300s' \
--if='> 0.0' \
--trigger-count=1 \
--combiner='OR' \
--documentation='The rule "${condition.display_name}" has generated this alert for the "${metric.display_name}".' \
--project="$PROJECTID" \
--enabled
I am getting the error below in the OR part of the condition:
ERROR: (gcloud.alpha.monitoring.policies.create) unrecognized arguments:
OR
metric.type="cloudsql.googleapis.com/database/instance_state" AND resource.type="cloudsql_database" AND (metric.labels.state = "FAILED")
Even if I wrap the condition in ( ), it still fails, and the || operator fails too.
Can anyone please tell me the correct gcloud command for this? I also want the structure of the alert policy to match the one created in the Cloud console.
Thanks
I was able to use gcloud alpha monitoring policies conditions create to append additional conditions.
gcloud alpha monitoring policies create \
--notification-channels=projects/qwiklabs-gcp-04-d822dd6cd419/notificationChannels/2510735656842641871 \
--aggregation='{"alignmentPeriod": "60s","perSeriesAligner": "ALIGN_MEAN"}' \
--condition-display-name='CPU Utilization >0.95 for 1m' \
--condition-filter='metric.type="compute.googleapis.com/instance/cpu/utilization" resource.type="gce_instance"' \
--duration='1m' \
--if='> 0.95' \
--display-name='alert on spikes or consistently high cpu' \
--combiner='OR'
gcloud alpha monitoring policies list --format='value(name,displayName)'
gcloud alpha monitoring policies conditions create \
projects/qwiklabs-gcp-04-d822dd6cd419/alertPolicies/1712202834227136574 \
--aggregation='{"alignmentPeriod": "60s","perSeriesAligner": "ALIGN_MEAN"}' \
--condition-display-name='CPU Utilization >0.80 for 10m' \
--condition-filter='metric.type="compute.googleapis.com/instance/cpu/utilization" resource.type="gce_instance"' \
--duration='10m' \
--if='> 0.80'
Duplicate --condition-filter clauses did not work for me. YMMV.
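Applied to the Cloud SQL case from the question, the two-step approach might look like the sketch below. It assumes that the create command emits the new policy resource, so that --format='value(name)' yields its name and the separate list step can be skipped; if that does not hold, look the name up with the list command instead. The function names and shortened display names are hypothetical.

```shell
# Build the filter for one Cloud SQL instance state, as used in the question.
state_filter() {
  printf 'metric.type="cloudsql.googleapis.com/database/instance_state" AND resource.type="cloudsql_database" AND metric.labels.state = "%s"' "$1"
}

# Hypothetical helper. Argument: PROJECT_ID
create_state_policy() {
  local project="$1" policy

  # Step 1: create the policy with its first condition and capture its name.
  policy="$(gcloud alpha monitoring policies create \
    --project="${project}" \
    --display-name='Test Database State Alert' \
    --combiner='OR' \
    --aggregation='{"alignmentPeriod": "60s","perSeriesAligner": "ALIGN_COUNT_TRUE"}' \
    --condition-display-name='RUNNABLE for 5 minutes' \
    --condition-filter="$(state_filter RUNNABLE)" \
    --duration='300s' \
    --if='> 0.0' \
    --format='value(name)')" || return 1

  # Step 2: append the second condition to the same policy.
  gcloud alpha monitoring policies conditions create "${policy}" \
    --aggregation='{"alignmentPeriod": "60s","perSeriesAligner": "ALIGN_COUNT_TRUE"}' \
    --condition-display-name='FAILED for 5 minutes' \
    --condition-filter="$(state_filter FAILED)" \
    --duration='300s' \
    --if='> 0.0'
}

# Example call (requires gcloud and credentials):
# create_state_policy "$PROJECTID"
```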
From the docs for gcloud alpha monitoring policies create, it appears that you can specify repeated (!) occurrences of:
[--aggregation=AGGREGATION --condition-display-name=CONDITION_DISPLAY_NAME --condition-filter=CONDITION_FILTER --duration=DURATION --if=IF_VALUE --trigger-count=TRIGGER_COUNT | --trigger-percent=TRIGGER_PERCENT]
So I think you need to duplicate your --condition-filter with the --combiner="OR", i.e.
gcloud alpha monitoring policies create \
--display-name='Test Database State Alert ('$PROJECTID')' \
--notification-channels="x23234dfdfffffff" \
--aggregation='{"alignmentPeriod": "60s","perSeriesAligner": "ALIGN_COUNT_TRUE"}' \
--condition-display-name='RUNNABLE' \
--condition-filter='metric.type="cloudsql.googleapis.com/database/instance_state" AND resource.type="cloudsql_database" AND (metric.labels.state = "RUNNABLE")' \
--duration='300s' \
--if='> 0.0' \
--trigger-count=1 \
--aggregation='{"alignmentPeriod": "60s","perSeriesAligner": "ALIGN_COUNT_TRUE"}' \
--condition-display-name='FAILED' \
--condition-filter='metric.type="cloudsql.googleapis.com/database/instance_state" AND resource.type="cloudsql_database" AND (metric.labels.state = "FAILED")' \
--duration='300s' \
--if='> 0.0' \
--trigger-count=1 \
--combiner='OR' \
--documentation='The rule "${condition.display_name}" has generated this alert for the "${metric.display_name}".' \
--project="$PROJECTID" \
--enabled
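Another route, not taken in either answer here, is to describe the whole policy in a file and pass it to gcloud in one call. The sketch below assumes the --policy-from-file flag of gcloud alpha monitoring policies create and the field names of the Monitoring API AlertPolicy resource; check both against your gcloud version. The function names are hypothetical, and notification channels and documentation are left out for brevity.

```shell
# Hypothetical helper: write a two-condition policy definition to the given path.
write_policy_file() {
  cat > "$1" <<'EOF'
{
  "displayName": "Test Database State Alert",
  "combiner": "OR",
  "enabled": true,
  "conditions": [
    {
      "displayName": "RUNNABLE for 5 minutes",
      "conditionThreshold": {
        "filter": "metric.type=\"cloudsql.googleapis.com/database/instance_state\" AND resource.type=\"cloudsql_database\" AND metric.labels.state = \"RUNNABLE\"",
        "aggregations": [{"alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_COUNT_TRUE"}],
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0,
        "duration": "300s",
        "trigger": {"count": 1}
      }
    },
    {
      "displayName": "FAILED for 5 minutes",
      "conditionThreshold": {
        "filter": "metric.type=\"cloudsql.googleapis.com/database/instance_state\" AND resource.type=\"cloudsql_database\" AND metric.labels.state = \"FAILED\"",
        "aggregations": [{"alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_COUNT_TRUE"}],
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0,
        "duration": "300s",
        "trigger": {"count": 1}
      }
    }
  ]
}
EOF
}

# Hypothetical helper. Arguments: PROJECT_ID POLICY_FILE
create_policy_from_file() {
  gcloud alpha monitoring policies create \
    --project="$1" \
    --policy-from-file="$2"
}

# Example calls (the second requires gcloud and credentials):
# write_policy_file policy.json
# create_policy_from_file "$PROJECTID" policy.json
```

A file keeps both conditions side by side, which is closer to how the console presents a multi-condition policy than a long list of repeated flags.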

Use `capture_tpu_profile` in AI Platform

We are trying to capture TPU profiling data while running our training task on AI Platform, following this tutorial. All the information we need, such as the TPU name, comes from our model's output.
config.yaml:
trainingInput:
  scaleTier: BASIC_TPU
  runtimeVersion: '1.15' # also tried '2.1'
Job submission command:
export DATE=$(date '+%Y%m%d_%H%M%S') && \
gcloud ai-platform jobs submit training "imaterialist_image_classification_model_${DATE}" \
--region=us-central1 \
--staging-bucket="gs://${BUCKET}" \
--module-name='efficientnet.main' \
--config=config.yaml \
--package-path="${PWD}/efficientnet" \
-- \
--data_dir="gs://${BUCKET}/tfrecords/" \
--train_batch_size=8 \
--train_steps=5 \
--model_dir="gs://${BUCKET}/algorithms_training/imaterialist_image_classification_model/${DATE}" \
--model_name='efficientnet-b4' \
--skip_host_call=true \
--gcp_project=${GCP_PROJECT_ID} \
--mode=train
When we tried to run capture_tpu_profile with the TPU name that our model reported from the master:
capture_tpu_profile --gcp_project="${GCP_PROJECT_ID}" --logdir="gs://${BUCKET}/algorithms_training/imaterialist_image_classification_model/20200318_005446" --tpu_zone='us-central1-b' --tpu='<tpu_IP_address>'
we got this error:
File "/home/kovtuh/.local/lib/python3.7/site-packages/tensorflow_core/python/distribute/cluster_resolver/tpu_cluster_resolver.py", line 480, in _fetch_cloud_tpu_metadata
"constructor. Exception: %s" % (self._tpu, e))
ValueError: Could not lookup TPU metadata from name 'b'<tpu_IP_address>''. Please doublecheck the tpu argument in the TPUClusterResolver constructor. Exception: <HttpError 404 when requesting https://tpu.googleapis.com/v1/projects/<GCP_PROJECT_ID>/locations/us-central1-b/nodes/<tpu_IP_address>?alt=json returned "Resource 'projects/<GCP_PROJECT_ID>/locations/us-central1-b/nodes/<tpu_IP_address>' was not found". Details: "[{'@type': 'type.googleapis.com/google.rpc.ResourceInfo', 'resourceName': 'projects/<GCP_PROJECT_ID>/locations/us-central1-b/nodes/<tpu_IP_address>'}]">
It seems the TPU device isn't attached to our project when it is provisioned by AI Platform. Which project is it attached to, and can we get access to such TPUs to capture their profiles?