Weird SHA1-like tags on gcr images - google-container-registry

We're noticing weird SHA1-like tags on our GCR images. The nature of these tags is that:
- they are the same length as a SHA1 hash, i.e. exactly 40 hexadecimal characters
- we didn't create them
- no image that we tagged ourselves carries this weird SHA1-like tag
What are these tagged images, and can they be deleted?

The "weird SHA1-like tags" in your Container Registry images is created automatically by Cloud Build. Check this related StackOverflow answer.
As an example, here's an image that was created when I deployed an application on App Engine; upon navigating to it, here are the details.
Yes, there's an option to remove or delete unused images from the Container Registry.
Just a note: the container images delete command deletes the specified image from the registry, and all associated tags are also deleted.
For example, in order to list all untagged images in a project, first filter digests that lack tags using this command:
gcloud container images list-tags [HOSTNAME]/[PROJECT-ID]/[IMAGE] --filter='-tags:*' --format="get(digest)" --limit=$BIG_NUMBER
Another option is gcloud container images list-tags [HOSTNAME]/[PROJECT-ID]/[IMAGE] --format=json --limit=unlimited, which will give you an easily consumable JSON blob of information about the images in a repository (such as digests with their associated tags).
Then you can iterate over the digests and delete each one with this command:
gcloud container images delete [HOSTNAME]/[PROJECT-ID]/[IMAGE]@DIGEST --quiet
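Putting the two commands together, here is a minimal cleanup sketch, assuming bash, that the [HOSTNAME]/[PROJECT-ID]/[IMAGE] placeholders are filled in, and that you have reviewed the list of untagged digests before deleting anything:
# Hypothetical repository path; replace with your own.
IMAGE="[HOSTNAME]/[PROJECT-ID]/[IMAGE]"
# List the digests of every untagged image in the repository ...
for digest in $(gcloud container images list-tags "$IMAGE" --filter='-tags:*' --format='get(digest)' --limit=999999); do
  # ... and delete each manifest by digest; --quiet skips the confirmation prompt.
  gcloud container images delete "$IMAGE@$digest" --quiet
done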

Related

Get container image label without pulling the image from GCR

I am trying to create a dataset for at least 250 container images built with Docker and pushed to a single GCP project on Google Container Registry (GCR). The registry is highly active, so image versions change quite frequently, hence the automation.
All of these images get a certain label added at push time by the CI system. I want to include those labels in the dataset. I tried accessing the label and its value after pulling the image; however, pulling 250+ images and then inspecting them takes too many resources for this automation and may not even be feasible.
So in short, I just want to know if there's any gcloud API (REST or CLI) which can fetch the label metadata without pulling the image first?
I tried looking in the docs but couldn't find anything. I tried the following command, which only gives the SHA256 digest and the repository details, but not the labels:
gcloud container images describe gcr.io/[PROJECT-ID]/[IMAGE]
# Output
image_summary:
  digest: sha256:[SHA_DIGEST_HERE]
  fully_qualified_digest: gcr.io/[PROJECT-ID]/[IMAGE]@sha256:[SHA_DIGEST_HERE]
  registry: gcr.io
  repository: [PROJECT-ID]/[IMAGE]
Update:
I tried the curl command with the access token, which gave me the manifest with the different layers instead:
$> curl https://gcr.io:443/v2/[PROJECT-ID]/[IMAGE]/manifests/latest -H "Authorization: Bearer {token}"
// output
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": [size],
    "digest": "sha256:[SHA_256_DIGEST]"
  },
  "layers": [
    // different layers here
  ]
}
I'm not sure how I can actually extract the manifest itself and look into it.
I want something like what this question is asking, but for GCR instead of dockerhub.
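For reference, the labels live in the image config blob that config.digest in the manifest above points to, so one possible approach (a sketch only, reusing the {token} placeholder from the curl command above and assuming jq is available; the blob endpoint may redirect to a storage URL, hence -L) is:
# Fetch the manifest and extract the digest of the config blob, which holds the labels.
CONFIG_DIGEST=$(curl -s "https://gcr.io/v2/[PROJECT-ID]/[IMAGE]/manifests/latest" \
  -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
  -H "Authorization: Bearer {token}" | jq -r '.config.digest')
# Fetch the config blob and print its labels, without pulling any layers.
curl -sL "https://gcr.io/v2/[PROJECT-ID]/[IMAGE]/blobs/$CONFIG_DIGEST" \
  -H "Authorization: Bearer {token}" | jq '.config.Labels'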
At the moment, only Artifact Registry has the option to label repositories in order to identify and group related repositories.
If you want to use labels, I would suggest migrating from Google Container Registry to Artifact Registry so you can label repositories.
Another option is to file this as a feature request. Please be advised that this doesn't come with a specific ETA; however, you can still keep track of the progress by following the thread once the ticket has been created.
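For illustration, labelling an Artifact Registry repository can be done from the CLI; a minimal sketch, assuming a repository named my-repo in us-central1 (both hypothetical) and that your gcloud version exposes the standard label flags on this command:
# Attach a label to an existing Artifact Registry repository.
gcloud artifacts repositories update my-repo --location=us-central1 --update-labels=team=backend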

Is this another bug in the gcloud CLI? Cannot remove a tag with a digest reference

This is related to my previous question.
Cannot deploy Cloud Functions with Cloud Build saying "GOOGLE_MANIFEST_DANGLING_TAG: Manifest is still referenced by tag: latest"
I've read that there is an issue in the CLI,
https://github.com/GoogleCloudPlatform/docker-credential-gcr/issues/73
which comes up when attempting to delete the manifest before removing the tags first.
So I'm trying to untag the cache image.
But if I do
gcloud container images untag PATH/cashe@sha256:<digest>:latest
# PATH@DIGEST has for sure been copied from the console
an error message appears saying
ERROR: (gcloud.container.images.untag) digest must be of the form "sha256:<digest>".
It looks to me like it's trying to read the string next to the last colon as the digest and as the tag name at the same time.
By the way, this worked:
gcloud container images untag PATH/cashe:latest --quiet
though there is a WARNING: Successfully resolved tag to sha256, but it is recommended to use sha256 directly.
Tags are a way to 'label' specific images (manifests) in a repository. A given tag may only be applied to one image in a repository at a time.
gcloud container images untag requires *.gcr.io/PROJECT_ID/IMAGE_PATH:TAG
You should not include the image's digest (i.e. the SHA256 of the manifest). Although including the digest does uniquely identify the image that's tagged, it's redundant; the repository already maps tags (e.g. latest) to image digests.
You should use:
gcloud container images untag ${PATH}/cashe:latest
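To tie this back to the dangling-tag error from the original question, the usual order is to remove the tag first and then delete the manifest by digest. A minimal sketch, assuming a gcr.io path and a placeholder digest:
# Remove the tag so the manifest is no longer referenced by it.
gcloud container images untag gcr.io/PROJECT_ID/cashe:latest --quiet
# Now the manifest can be deleted by digest without a GOOGLE_MANIFEST_DANGLING_TAG error.
gcloud container images delete gcr.io/PROJECT_ID/cashe@sha256:<digest> --quiet
Alternatively, gcloud container images delete accepts --force-delete-tags, which removes the tags and the manifest in one step.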

Where do the Google GCR images get stored in Google Storage?

Unknowingly I have deleted the below buckets from my project
artifacts.<PROJECT-ID>.appspot.com
us.artifacts.<PROJECT-ID>.appspot.com
This has deleted all the images from GCR. Let me know whether the above buckets are where the GCR images are stored, or if it is something else.
Also, when I created a new image and pushed it to GCR, all the deleted images showed up again in the GCR console. But whenever I try to pull any old image it throws an "unknown blob" error.
Yes, these buckets are where the Docker container artifacts are built and stored (artifacts being the build step results that add up to an image).
They are then referenced by the Google Container Registry (i.e. gcr.io), but they will still be located in your bucket.
Since you removed the bucket and its contents, you will be missing the old build steps of your built images; that's why you get the error pulling image configuration: unknown blob error message.
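If you want to see the objects that back a registry, you can list the bucket with gsutil; GCR typically keeps the blobs under a containers/images/ prefix (a sketch, assuming the default bucket names from the question):
# Objects backing gcr.io images for this project
gsutil ls gs://artifacts.<PROJECT-ID>.appspot.com/containers/images/
# Objects backing us.gcr.io images use the region-prefixed bucket
gsutil ls gs://us.artifacts.<PROJECT-ID>.appspot.com/containers/images/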
For example, I uploaded a new image following this documentation, and I removed the artifacts.<PROJECT-ID>.appspot.com bucket afterwards. Then I re-uploaded it using a tag (I used quickstart-image:tag1), and when pulling it this way:
docker pull gcr.io/wave16-joan/quickstart-image:latest
I got the error pulling image configuration: unknown blob error message, because it's missing the steps I already had in my previous build.
However, doing this:
docker pull gcr.io/wave16-joan/quickstart-image:tag1
allowed me to pull the image, but the image wasn't usable.
Regarding your second question, I believe the reason you still see references in the Container Registry to the images you removed is that GCR is still keeping the references to the steps from building those images; however, since they are deleted, the images cannot be pulled.

AWS CodeBuild - Environment based off of image from docker hub

Quick question, and this may be a dumb one. I am attempting to use AWS CodeBuild with an image I've published to Docker Hub. I selected the option to use a custom image, and the option to look for the image in another location (an external image repo).
I can't seem to figure out how to reference my image in the appropriate format to use in the other location field.
Any help would be greatly appreciated.
In the "Other location" text box you can enter the image name from DockerHub. For example, simply give "openjdk" or "openjdk:latest" to use https://hub.docker.com/r/library/openjdk/ as the Docker image for your build. Don't put the "docker pull " prefix for your image name is all.
Note that CodeBuild only supports public Docker images from DockerHub today. Private registries are not supported.
Let's say that you published your image on hub.docker.com, your repo name is gjackson/myrepo, and you want to grab the image tagged latest; then you should populate the other location field with docker.io/gjackson/myrepo:latest.
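The same image reference also works when configuring a project from the CLI rather than the console; a minimal sketch, assuming an existing project named my-build-project (hypothetical) and the public repo above:
# Point an existing CodeBuild project at a public Docker Hub image.
aws codebuild update-project \
  --name my-build-project \
  --environment "type=LINUX_CONTAINER,computeType=BUILD_GENERAL1_SMALL,image=docker.io/gjackson/myrepo:latest"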

Docker force overwrite last tag and pushing on AWS ECR

I'm pushing my images to AWS ECR via docker push ... command. The image is tagged with a specific version.
When I actually push two different images with the same tag, this results in two images in the AWS ECR registry, one of which becomes untagged.
0.0.1       sha256:572219f8764b21e5a045bcc1c5eab399e2bc2370a37b23cb1ca9298ea39e233a  138.33 MB
<untagged>  sha256:60d161db0b9cb1345cf7c3e6119b8eba7114bc2dfc44c0b3ed02454803f6ef76  138.21 MB
The problem this is causing is that if I continue to push more images with the same tag, the total size of the repository keeps increasing.
What I would like is to "overwrite" the existing tag when pushing an image, which means that two different sha256 digests with the same tag would result in a single image in the registry (and of course multiple images when the tag version changes).
Is it possible to do so? I would like to avoid an "untagged" pruning technique if possible. For now, my publish script deletes the previous image with the same tag if it exists, but I feel this should be handled by AWS ECR or docker push directly.
Unfortunately this is not possible. Here is what you can do:
Use 2 different tags for the images that you want to overwrite. I would recommend a tag with the version and another tag with a known prefix plus something guaranteed unique, e.g. 1.1.1 and SNAPSHOT-hash. The next time you push an image with the same version, the tag 1.1.1 will be removed from the old image and added to the new image. However, the SNAPSHOT-* tags will remain on all images.
Configure a lifecycle policy where images whose tags start with SNAPSHOT- expire once the image count exceeds x, as in the sketch below. This way, old images will automatically expire.
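A minimal sketch of such a lifecycle policy; the description, the count of 10 and the repository name my-repo are placeholders:
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Expire SNAPSHOT- images once more than 10 exist",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["SNAPSHOT-"],
        "countType": "imageCountMoreThan",
        "countNumber": 10
      },
      "action": { "type": "expire" }
    }
  ]
}
It can then be applied with aws ecr put-lifecycle-policy --repository-name my-repo --lifecycle-policy-text file://policy.json.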