Google Artifacts vulnerability report - On demand image scanning - google-cloud-platform

I have a scenario where I am using GCP on-demand image scanning via gcloud, like:
gcloud artifacts docker images scan europe-west2-docker.pkg.dev/ORG_NAME/myrepo/python_script@sha256:4e3dd2d724ded3cc434ec9fdc33bdfdab1c0d579430b64c9baf4ecb901115b05 --remote --location=europe
And it's working fine, and I can retrieve the vulnerability scan report using:
gcloud artifacts docker images list-vulnerabilities projects/ORG_NAME/locations/europe/scans/661e3c2a-c27c-6617-9088-80c7d40e14b8
Now the problem is that this report doesn't appear on the Artifact Registry dashboard the way it does for automatic scanning.
(Screenshots: an auto-scanned image and its vulnerability scan report as shown on the dashboard.)
Any idea how I can make this on-demand scan report show up on the Artifact Registry dashboard for that particular image?
I have looked at storing artifact metadata (https://cloud.google.com/container-analysis/docs/metadata-storage), but I can't find a way to store the data so that it appears on the dashboard.

The Artifact Registry dashboard only shows data for containers that were scanned automatically. gcloud is the preferred way to retrieve vulnerability data from On-Demand Scanning.
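If the goal is simply to consume the report from a script or CI pipeline, the two gcloud commands can be chained. This is only a sketch: the --format keys follow the On-Demand Scanning documentation and may need adjusting for your gcloud version, and IMAGE_URI is a placeholder for the image reference used in the question.

#!/bin/bash
# Placeholder for the same image reference as in the question.
IMAGE_URI="europe-west2-docker.pkg.dev/ORG_NAME/myrepo/python_script@sha256:<digest>"

# Run the on-demand scan and capture the scan resource name it returns.
SCAN=$(gcloud artifacts docker images scan "$IMAGE_URI" \
  --remote --location=europe --format="value(response.scan)")

# Fetch the vulnerability occurrences for that scan and summarise by severity.
gcloud artifacts docker images list-vulnerabilities "$SCAN" \
  --format="value(vulnerability.effectiveSeverity)" | sort | uniq -c

The dashboard still won't show the result, but the summary can be published wherever your pipeline reports to.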

Related

Get container image label without pulling the image from GCR

I am trying to create a dataset for at least 250 container images built with Docker and pushed to a single GCP project on Google Container Registry (GCR). The registry is highly active, so image versions change quite frequently, hence the automation.
All of these images have a certain label added at push time by the CI system. I want to include those labels in the dataset. I tried reading the label and its value after pulling each image, but pulling 250+ images and then inspecting them takes too many resources for this automation and may not even be feasible.
So in short, I just want to know if there's any gcloud API (REST or CLI) which can fetch the label metadata without pulling the image first?
I tried looking in the docs, but couldn't find anything. I tried the following command, which only gives the SHA256 digest and the repository details, but not the labels:
gcloud container images describe gcr.io/[PROJECT-ID]/[IMAGE]
# Output
image_summary:
  digest: sha256:[SHA_DIGEST_HERE]
  fully_qualified_digest: gcr.io/[PROJECT-ID]/[IMAGE]@sha256:[SHA_DIGEST_HERE]
  registry: gcr.io
  repository: [PROJECT-ID]/[IMAGE]
Update:
I tried the curl command with an access token, which gave me the manifest with its layers instead:
$> curl https://gcr.io:443/v2/[PROJECT-ID]/[IMAGE]/manifests/latest -H "Authorization: Bearer {token}"
// output
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": [size],
    "digest": "sha256:[SHA_256_DIGEST]"
  },
  "layers": [
    // different layers here
  ]
}
I'm not sure how I can actually extract the manifest contents and look into them.
I want something like what this question is asking, but for GCR instead of dockerhub.
At the moment, only Artifact Registry has the option of labeling repositories to identify and group related repositories.
If you want to use labels, I would suggest migrating from Google Container Registry to Artifact Registry so that you can use repository labels.
Another option is to file this as a feature request. Be advised that a feature request doesn't come with a specific ETA, but you can keep track of its progress by following the thread once the ticket has been created.
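As a side note, the labels the question is after live in the image's config blob rather than in the manifest itself, so the curl approach from the question can be taken one step further without pulling any layers. A minimal sketch, assuming jq is installed, a Docker schema 2 manifest, and the same gcr.io placeholders as in the question:

#!/bin/bash
TOKEN=$(gcloud auth print-access-token)
NAME="[PROJECT-ID]/[IMAGE]"   # same placeholders as above

# 1. Fetch the manifest and extract the digest of the config blob.
CONFIG_DIGEST=$(curl -s "https://gcr.io/v2/${NAME}/manifests/latest" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
  | jq -r '.config.digest')

# 2. Fetch the config blob (a few KB, no layers) and read the labels from it.
curl -sL "https://gcr.io/v2/${NAME}/blobs/${CONFIG_DIGEST}" \
  -H "Authorization: Bearer ${TOKEN}" \
  | jq '.config.Labels'

Only the manifest and the small config JSON are downloaded, so this scales to a few hundred images far more cheaply than pulling them.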

GCP Vertex AI Training: Auto-packaged Custom Training Job Yields Huge Docker Image

I am trying to run a Custom Training Job in Google Cloud Platform's Vertex AI Training service.
The job is based on a tutorial from Google that fine-tunes a pre-trained BERT model (from HuggingFace).
When I use the gcloud CLI tool to auto-package my training code into a Docker image and deploy it to the Vertex AI Training service like so:
$BASE_GPU_IMAGE="us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-7:latest"
$BUCKET_NAME = "my-bucket"
gcloud ai custom-jobs create `
--region=us-central1 `
--display-name=fine_tune_bert `
--args="--job_dir=$BUCKET_NAME,--num-epochs=2,--model-name=finetuned-bert-classifier" `
--worker-pool-spec="machine-type=n1-standard-4,replica-count=1,accelerator-type=NVIDIA_TESLA_V100,executor-image-uri=$BASE_GPU_IMAGE,local-package-path=.,python-module=trainer.task"
... I end up with a Docker image that is roughly 18 GB (!) and takes a very long time to upload to the GCP registry.
Granted, the base image is around 6.5 GB, but where do the additional >10 GB come from, and is there a way for me to avoid this "image bloat"?
Please note that my job loads the training data using the datasets Python package at run time and, AFAIK, does not include it in the auto-packaged Docker image.
The image size shown in the UI is the virtual size of the image. It is the compressed total image size that will be downloaded over the network. Once the image is pulled, it will be extracted and the resulting size will be bigger. In this case, the PyTorch image's virtual size is 6.8 GB while the actual size is 17.9 GB.
Also, when a docker push command is executed, the progress bars show the uncompressed size. The actual amount of data that’s pushed will be compressed before sending, so the uploaded size will not be reflected by the progress bar.
To cut down the size of the Docker image, custom containers can be used. With a custom container, only the necessary components are installed, which results in a smaller image. More information on custom containers is available in the Vertex AI documentation.
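As a rough illustration of that suggestion (the base image, file layout, and registry path below are assumptions, not something taken from the tutorial), a custom container could look something like this:

# Dockerfile (sketch): start from a slimmer runtime image and add only what the trainer needs.
FROM pytorch/pytorch:1.7.1-cuda11.0-cudnn8-runtime
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt   # e.g. transformers, datasets
COPY trainer/ ./trainer/
ENTRYPOINT ["python", "-m", "trainer.task"]

The image is then built, pushed, and referenced with container-image-uri instead of executor-image-uri/local-package-path:

docker build -t us-central1-docker.pkg.dev/MY_PROJECT/my-repo/bert-trainer:latest .
docker push us-central1-docker.pkg.dev/MY_PROJECT/my-repo/bert-trainer:latest
gcloud ai custom-jobs create \
  --region=us-central1 \
  --display-name=fine_tune_bert \
  --worker-pool-spec="machine-type=n1-standard-4,replica-count=1,accelerator-type=NVIDIA_TESLA_V100,container-image-uri=us-central1-docker.pkg.dev/MY_PROJECT/my-repo/bert-trainer:latest"

The --args flag from the original command can be passed unchanged.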

Weird SHA1-like tags on gcr images

We're noticing weird SHA1-like tags on our GCR images. The nature of these tags is that:
they are the same size as SHA1, i.e. exactly 40 hexadecimal characters
we didn't create them
any image that is tagged by us does not have this weird SHA1-like tag
What are these tagged images and can they be deleted?
The "weird SHA1-like tags" in your Container Registry images is created automatically by Cloud Build. Check this related StackOverflow answer.
(Screenshots: an example image created when deploying an application on App Engine, and its details.)
Yes, there is an option to remove or delete unused images from Container Registry.
Just a note: the container images delete command deletes the specified image from the registry, and all associated tags are also deleted.
For example, in order to list all untagged images in a project, first filter digests that lack tags using this command:
gcloud container images list-tags [HOSTNAME]/[PROJECT-ID]/[IMAGE] --filter='-tags:*' --format="get(digest)" --limit=$BIG_NUMBER
Another option is gcloud container images list-tags [HOSTNAME]/[PROJECT-ID]/[IMAGE] --format=json --limit=unlimited, which gives you an easily consumable JSON blob of information about the images in a repository (such as digests with their associated tags).
Then you can iterate over the digests and delete each one with this command:
gcloud container images delete [HOSTNAME]/[PROJECT-ID]/[IMAGE]@DIGEST --quiet
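Putting the two commands together, a cleanup loop could look like this (a sketch; the placeholders are the same as above, and the digests returned by get(digest) already include the sha256: prefix):

IMAGE="[HOSTNAME]/[PROJECT-ID]/[IMAGE]"
# Delete every untagged digest in the repository.
for digest in $(gcloud container images list-tags "$IMAGE" \
    --filter='-tags:*' --format='get(digest)' --limit=999999); do
  gcloud container images delete "${IMAGE}@${digest}" --quiet
done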

How to fix with custom image from slurm-gcp?

I deployed slurm-gcp using Terraform from the GitHub repository and it came up successfully. Source:
Slurm on Google Cloud Platform
But I want to change the image used by the nodes to a custom image.
I am trying to edit /slurm/scripts/config.yaml.
Among the contents of the file:
image: projects/schedmd-slurm-public/global/images/family/schedmd-slurm-20-11-7-hpc-centos-7
This is the part I want to edit. How do I point it to my custom image?
First you need to create your own image. Create a new VM from the image you want to modify, make the appropriate changes, and stop the VM. Then create a custom image from that VM's disk. The image path in your config.yaml file can then look like this:
image: projects/my-project-name/global/images/your-image-name
You can get exact path to your custom image by running:
wb@cloudshell:~ (wb)$ gcloud compute images describe your-image-name | grep selfLink
selfLink: https://www.googleapis.com/compute/v1/projects/wb/global/images/your-image-name
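The image-creation steps described above can also be done from the command line; a sketch, where template-vm, your-image-name, and the zone are placeholders and the VM's boot disk is assumed to keep the VM's default name:

# Stop the VM whose boot disk contains your changes.
gcloud compute instances stop template-vm --zone=us-central1-a
# Create a custom image from that boot disk.
gcloud compute images create your-image-name \
    --source-disk=template-vm \
    --source-disk-zone=us-central1-a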

Google Cloud AI Platform: Image Data Labeling Service Error - Image URI not valid

Error in Google Cloud Data labeling Service:
I am trying to create a dataset of images in Google's Data labeling service.
Using a single image to test it out.
Created a Google storage bucket named: my-bucket
Uploaded an image to my-bucket - image file name: testcat.png
Created and uploaded a csv file (UTF-8) with URI path of image stored inside it.
image URI path as stored in csv file: gs://my-bucket//testcat.png
Named the csv file : testimage.csv
Uploaded the csv file in the gs bucket - my-bucket.
i.e. testimage.csv, and testcat.png are in the same google storage bucket (my-bucket).
When I try to create the dataset in the Google Cloud console, GCP gives me the following error message:
Failed to import dataset gs://my-bucket/testcat.png is not a valid youtube uri nor a readable file path.
I've checked multiple times and the URI for this image in Google Cloud Storage is exactly the same as what I've used. I've tried at least 10-15 times, but the error persists.
Has anyone faced and successfully resolved this issue?
Your help is greatly appreciated.
Thanks!
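A quick sanity check before importing is to test that every URI listed in the CSV resolves to a readable object; here is a sketch using gsutil (assumed to be installed) and the bucket and file names from the question. Note that gs://my-bucket//testcat.png with a double slash names a different object than gs://my-bucket/testcat.png.

# Read the CSV from the bucket and stat every URI in its first column.
gsutil cat gs://my-bucket/testimage.csv | while IFS=, read -r uri _; do
  gsutil stat "$uri" > /dev/null || echo "Not readable: $uri"
done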
As you can see in the AI Platform Data Labeling Service documentation, there is a service update due to the coronavirus (COVID-19) health emergency stating that data labeling services are limited or unavailable until further notice:
You can't start new data labeling tasks through the Cloud Console, Google Cloud SDK, or the API
You can request data labeling tasks only through email at cloudml-data-customer@google.com
New data labeling tasks can't contain personally identifiable information