Copy docker image from one AWS ECR repo to another


We want to copy a Docker image from a non-prod ECR account to a prod ECR account. Is this possible without pulling, retagging, and pushing it again?

No, you have to run these commands:
docker login OLD_REPO
docker pull OLD_REPO/IMAGE:TAG
docker tag OLD_REPO/IMAGE:TAG NEW_REPO/IMAGE:TAG
docker login NEW_REPO
docker push NEW_REPO/IMAGE:TAG
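
As a rough sketch, the five commands above can be generated for any image with a small Python helper. The function name copy_commands is hypothetical, and the helper only builds the command strings; it does not run Docker or log in anywhere.

```python
from typing import List


def copy_commands(old_repo: str, new_repo: str, image: str, tag: str) -> List[str]:
    """Return the docker commands that copy image:tag from old_repo to new_repo."""
    src = f"{old_repo}/{image}:{tag}"
    dst = f"{new_repo}/{image}:{tag}"
    return [
        f"docker login {old_repo}",   # authenticate against the source registry
        f"docker pull {src}",         # fetch the image locally
        f"docker tag {src} {dst}",    # give it a name in the target registry
        f"docker login {new_repo}",   # authenticate against the target registry
        f"docker push {dst}",         # upload it to the target registry
    ]


if __name__ == "__main__":
    # Placeholder names taken from the commands above.
    for cmd in copy_commands("OLD_REPO", "NEW_REPO", "IMAGE", "TAG"):
        print(cmd)
```

The strings could be passed to subprocess.run one by one; for ECR the plain docker login would be replaced by the aws ecr get-login-password pipe used later in this thread.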

I have written this program in Python to migrate all the images (or a specific image) from a repository to another region, or to another account in a different region:
https://gist.github.com/fabidick22/6a1962697357360f0d73e01950ae962b

Answer: No, you must pull, tag, and push.
I wrote a bash script for this today. You can specify the number of tagged images that will be copied.
https://gist.github.com/virtualbeck/a635ef6701991f2087384eab7edbb18b

A slight improvement on (and maybe a couple of bug fixes to) this answer: https://stackoverflow.com/a/69905254/65706
#!/usr/bin/env bash
set -e
################################# UPDATE THESE #################################
LAST_N_TAGS=10
SRC_AWS_REGION="us-east-1"
TGT_AWS_REGION="eu-central-1"
SRC_AWS_PROFILE="your_source_aws_profile"
TGT_AWS_PROFILE="your_target_aws_profile"
SRC_BASE_PATH="386151140899.dkr.ecr.$SRC_AWS_REGION.amazonaws.com"
TGT_BASE_PATH="036149202915.dkr.ecr.$TGT_AWS_REGION.amazonaws.com"
#################################################################################
URI=($(aws ecr describe-repositories --profile $SRC_AWS_PROFILE --query 'repositories[].repositoryUri' --output text --region $SRC_AWS_REGION))
NAME=($(aws ecr describe-repositories --profile $SRC_AWS_PROFILE --query 'repositories[].repositoryName' --output text --region $SRC_AWS_REGION))
echo "Start repo copy: $(date)"
# source account login
aws --profile $SRC_AWS_PROFILE --region $SRC_AWS_REGION ecr get-login-password | docker login --username AWS --password-stdin $SRC_BASE_PATH
# destination account login
aws --profile $TGT_AWS_PROFILE --region $TGT_AWS_REGION ecr get-login-password | docker login --username AWS --password-stdin $TGT_BASE_PATH
for i in "${!URI[@]}"; do
  echo "====> Grabbing latest $LAST_N_TAGS from ${NAME[$i]} repo"
  # create the ECR repo in the destination account if it does not exist there yet
  # (the existence check must run against the target profile, not the source)
  aws ecr --profile $TGT_AWS_PROFILE --region $TGT_AWS_REGION describe-repositories --repository-names ${NAME[$i]} || aws ecr --profile $TGT_AWS_PROFILE --region $TGT_AWS_REGION create-repository --repository-name ${NAME[$i]}
  # list the tags from the source account, oldest first, and keep the last N
  for tag in $(aws ecr describe-images --profile $SRC_AWS_PROFILE --region $SRC_AWS_REGION --repository-name ${NAME[$i]} \
    --query 'sort_by(imageDetails,& imagePushedAt)[*]' \
    --filter tagStatus=TAGGED --output text \
    | grep IMAGETAGS | awk '{print $2}' | tail -n "$LAST_N_TAGS"); do
    # if [[ ${NAME[$i]} == "repo-name/frontend-nba" ]]; then
    #   continue
    # fi
    # # 386517340899.dkr.ecr.us-east-1.amazonaws.com/spectralha-api/white-ref-detector
    # if [[ ${NAME[$i]} == "386351741199.dkr.ecr.us-east-1.amazonaws.com/repo-name/white-ref-detector" ]]; then
    #   continue
    # fi
    echo "START ::: pulling image ${URI[$i]}:$tag"
    AWS_REGION=$SRC_AWS_REGION AWS_PROFILE=$SRC_AWS_PROFILE docker pull ${URI[$i]}:$tag
    AWS_REGION=$SRC_AWS_REGION AWS_PROFILE=$SRC_AWS_PROFILE docker tag ${URI[$i]}:$tag $TGT_BASE_PATH/${NAME[$i]}:$tag
    echo "STOP ::: pulling image ${URI[$i]}:$tag"
    echo "START ::: pushing image $TGT_BASE_PATH/${NAME[$i]}:$tag"
    # status=$(AWS_REGION=$TGT_AWS_REGION AWS_PROFILE=$TGT_AWS_PROFILE docker push $TGT_BASE_PATH/${NAME[$i]}:$tag)
    # echo $status
    AWS_REGION=$TGT_AWS_REGION AWS_PROFILE=$TGT_AWS_PROFILE docker push $TGT_BASE_PATH/${NAME[$i]}:$tag
    echo "STOP ::: pushing image $TGT_BASE_PATH/${NAME[$i]}:$tag"
    sleep 2
    echo ""
  done
  # docker image prune -a -f # clean up ALL the images on the system
done
echo "Finish repo copy: $(date)"
echo "Don't forget to purge your local docker images!"
# Uncomment to delete all
#docker rmi $(for i in "${!NAME[@]}"; do docker images | grep ${NAME[$i]} | tr -s ' ' | cut -d ' ' -f 3 | uniq; done) -f

Related

I'm getting "command not found" error after running these commands in GitLab. I used these commands 2 days ago and they worked fine

I was using these commands for my deploy-job the other day and it worked fine. This is a new pipeline for a new project and now these commands aren't working. I'm getting errors in my pipeline after every command saying "command not found". Here's my gitlab-ci file for reference
variables:
  DOCKER_REGISTRY: 775362094965.dkr.ecr.us-west-2.amazonaws.com
  AWS_DEFAULT_REGION: us-west-2
  APP_NAME: flask-app
  DOCKER_HOST: tcp://docker:2375
stages:
  - build
  - deploy
build-job:
  stage: build
  image:
    name: amazon/aws-cli
    entrypoint: [""]
  services:
    - docker:dind
  before_script:
    - amazon-linux-extras install docker
    - aws --version
    - docker --version
  script:
    - docker build -t $DOCKER_REGISTRY/$APP_NAME:latest .
    - aws ecr get-login-password | docker login --username AWS --password-stdin $DOCKER_REGISTRY
    - docker push $DOCKER_REGISTRY/$APP_NAME:latest
deploy-job:
  stage: deploy
  script:
    - echo `aws ecs describe-task-definition --task-definition $CI_AWS_ECS_TASK_DEFINITION --region us-west-2` > input.json
    - echo $(cat input.json | jq '.taskDefinition.containerDefinitions[].image="'$REPOSITORY_URI':'$IMAGE_TAG'"') > input.json
    - echo $(cat input.json | jq '.taskDefinition') > input.json
    - echo $(cat input.json | jq 'del(.taskDefinitionArn)' | jq 'del(.revision)' | jq 'del(.status)' | jq 'del(.requiresAttributes)' | jq 'del(.compatibilities)' | jq 'del(.registeredAt)' | jq 'del(.registeredBy)') > input.json
    - aws ecs register-task-definition --cli-input-json file://input.json --region us-west-2
    - revision=$(aws ecs describe-task-definition --task-definition $CI_AWS_ECS_TASK_DEFINITION --region us-west-2 | egrep "revision" | tr "/" " " | awk '{print $2}' | sed 's/"$//' | cut -d "," -f 1)
    - aws ecs update-service --cluster $CI_AWS_ECS_CLUSTER --service $CI_AWS_ECS_SERVICE --task-definition $CI_AWS_ECS_TASK_DEFINITION:$revision --region us-west-2
My build-job works fine, I'm just getting "command not found" with my deploy-job.

You need to specify an image outside of the build job or in the deploy job. Right now, you're only specifying an image inside your build-job.

Why the environment variable is not working with Docker run command used in Jenkins pipeline

Below is my Deployment stage pipeline code.
stage('Deploy') {
if (continueBuild) {
println("Start Deployment");
//Deploy step for liberty-web
if ("${repo_name}" == 'enterprise-content-management/liberty-web') {
if ("${deploy_env}" == "DEV") {
def REACT_APP_CONFIGS = sh(script: "aws ssm get-parameter --region us-east-1 --name \"/liberty/config/liberty-web_dev/app.config\" | jq -r '.Parameter.Value'", returnStdout: true).trim().replaceAll('\n', '').replaceAll('\"', '\\\\"');
def APP_SPECIFIC_CONFIG = sh(script: "aws ssm get-parameter --region us-east-1 --name \"/liberty/config/liberty-web_dev/app.appSpecificConfig\" | jq -r '.Parameter.Value'", returnStdout: true).trim().replaceAll('\n', '').replaceAll('\"', '\\\\"');
print REACT_APP_CONFIGS
print APP_SPECIFIC_CONFIG
def CLOUDFRONT_DISTRIBUTION_ID = sh(script: "aws ssm get-parameter --region us-east-1 --name \"/liberty/config/liberty-web_dev/cloudfront.distribution.id\" | jq -r '.Parameter.Value'", returnStdout: true).trim()
print CLOUDFRONT_DISTRIBUTION_ID
def DEPLOYMENT_BUCKET = sh(script: "aws ssm get-parameter --region us-east-1 --name \"/liberty/config/liberty-web_dev/s3.bucket.name\" | jq -r '.Parameter.Value'", returnStdout: true).trim()
print DEPLOYMENT_BUCKET
writeFile file: 'build-web-dev.sh', text: "#!/usr/bin/env bash \n docker run --rm --env REACT_APP_CONFIGS=\"${REACT_APP_CONFIGS}\" --env APP_SPECIFIC_CONFIG=\"${APP_SPECIFIC_CONFIG}\" --name liberty-web -v /data/jenkins/workspace/liberty-web-deployment:/Project -w /Project node:12-alpine npm run build"
sh 'cat build-web-dev.sh'
sh 'bash build-web-dev.sh'
sh "aws cloudfront create-invalidation --distribution-id ${CLOUDFRONT_DISTRIBUTION_ID} --paths \"/*\" && aws s3 sync build/ s3://${DEPLOYMENT_BUCKET}"
}
}
}
}
This is a Node app. When I try to access the two env variables mentioned above (REACT_APP_CONFIGS, APP_SPECIFIC_CONFIG), only REACT_APP_CONFIGS works. The parameter values are stored in SSM in AWS. I tried putting the same value in both variables, but the result is the same. For example:
In my node app
console.log(process.env.REACT_APP_CONFIGS) -> gives correct value
console.log(process.env.APP_SPECIFIC_CONFIG) -> undefined
What is the reason for this behaviour?

Fetch a particular tagged latest image from AWS ECR repo

Is it possible to fetch the latest image from ECR whose Docker tag starts with develop, such as developXXX?
I am able to see the latest image in a repo with this:
aws ecr describe-images --repository-name reponame --output text --region eu-west-1 --query 'sort_by(imageDetails,& imagePushedAt)[*].imageTags[*]' | tr '\t' '\n' | tail -1

Match the 'develop' keyword against all fetched image tags and return the latest one with tail -1:
aws ecr describe-images --repository-name reponame --output text --region eu-west-1 --query 'sort_by(imageDetails,& imagePushedAt)[*].imageTags[*]' | grep -w "develop" | tail -1
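You can change the grep -w "develop" part to fit your condition. Note that -w matches develop only as a whole word, so a tag such as developXXX needs a pattern without -w, for example grep "develop".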
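
The same selection can be sketched in Python, which avoids parsing the text output. This is a minimal sketch that assumes image_details is the imageDetails list returned by describe-images (for example from boto3's describe_images, where imagePushedAt is a datetime); the function name latest_tag_with_prefix and the sample data are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional


def latest_tag_with_prefix(image_details: List[Dict], prefix: str) -> Optional[str]:
    """Return a tag starting with `prefix` from the most recently pushed matching image."""
    best_tag = None
    best_pushed = None
    for detail in image_details:
        # Untagged images may have no "imageTags" entry, hence .get().
        matching = [t for t in detail.get("imageTags", []) if t.startswith(prefix)]
        if not matching:
            continue
        pushed = detail["imagePushedAt"]
        if best_pushed is None or pushed > best_pushed:
            best_pushed = pushed
            # If one image carries several matching tags, pick one deterministically.
            best_tag = sorted(matching)[-1]
    return best_tag


if __name__ == "__main__":
    # Hypothetical sample data shaped like the describe-images response.
    sample = [
        {"imageTags": ["develop-1"], "imagePushedAt": datetime(2023, 1, 1)},
        {"imageTags": ["develop-2", "latest"], "imagePushedAt": datetime(2023, 2, 1)},
        {"imageTags": ["master-9"], "imagePushedAt": datetime(2023, 3, 1)},
    ]
    print(latest_tag_with_prefix(sample, "develop"))
```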

How to insert variable to sh aws command in a jenkins pipeline groovy script?

How can I pass the variables into the below sh command? This is as part of the jenkins pipeline groovy script I am using:
It works when I use the values directly, as below, i.e. with "test" and "us-west-2":
sh '''
$(aws --profile test ecr get-login --no-include-email --region us-west-2)
'''
Not working when I try to parameterise with profile and region as below:
sh '''
$(aws --profile "${params.profile}", ecr get-login --no-include-email --region "${params.region}")
'''
ERROR:
[Pipeline] sh
/opt/slave_home/workspace/test1@tmp/durable-de1d1e87/script.sh: 2: /opt/slave_home/workspace/test1@tmp/durable-de1d1e87/script.sh: Bad substitution
+

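It works with the following syntax. Groovy does not interpolate inside a triple-single-quoted string, so the shell itself receives ${params.profile}, which is not a valid shell variable name (hence "Bad substitution"). Jenkins also exports build parameters as environment variables, so the shell can expand ${profile} and ${region}: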
sh '''
(aws --profile "${profile}" ecr get-login --no-include-email --region "${region}")
'''

How to delete untagged images from AWS ECR Container Registry

When pushing images to Amazon ECR, if the tag already exists within the repo, the old image remains in the registry but goes into an untagged state.
So if I docker push image/haha:1.0.0 a second time (provided that something has changed), the first image becomes untagged in AWS ECR.
Is there a way to safely clean up untagged images from all the registries?

You can delete all images in a single request, without loops:
IMAGES_TO_DELETE=$( aws ecr list-images --region $ECR_REGION --repository-name $ECR_REPO --filter "tagStatus=UNTAGGED" --query 'imageIds[*]' --output json )
aws ecr batch-delete-image --region $ECR_REGION --repository-name $ECR_REPO --image-ids "$IMAGES_TO_DELETE" || true
First it gets a list of images that are untagged, in json format:
[ {"imageDigest": "sha256:..."}, {"imageDigest": "sha256:..."}, ... ]
Then it sends that list to batch-delete-image.
The last || true is required to avoid an error code when there are no untagged images.

Now that ECR supports lifecycle policies (https://docs.aws.amazon.com/AmazonECR/latest/userguide/LifecyclePolicies.html), you can use them to delete untagged images automatically.
Setting up a lifecycle policy preview using the console
1. Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.
2. From the navigation bar, choose the region that contains the repository on which to perform a lifecycle policy preview.
3. In the navigation pane, choose Repositories and select a repository.
4. On the All repositories: repository_name page, choose Dry-Run Lifecycle Rules, Add.
5. Enter the following details for your lifecycle policy rule:
   a. For Rule Priority, type a number for the rule priority.
   b. For Rule Description, type a description for the lifecycle policy rule.
   c. For Image Status, choose either Tagged or Untagged.
   d. If you specified Tagged for Image Status, then for Tag Prefix List, you can optionally specify a list of image tags on which to take action with your lifecycle policy. If you specified Untagged, this field must be empty.
   e. For Match criteria, choose values for Count Type, Count Number, and Count Unit (if applicable).
6. Choose Save.
7. Create additional lifecycle policy rules by repeating steps 5–7.
8. To run the lifecycle policy preview, choose Save and preview results.
9. Under Preview Image Results, review the impact of your lifecycle policy preview.
10. If you are satisfied with the preview results, choose Apply as lifecycle policy to create a lifecycle policy with the specified rules.
From here:
https://docs.aws.amazon.com/AmazonECR/latest/userguide/lpp_creation.html
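
The fields entered in the console map onto the JSON form of a lifecycle policy rule. The following is a minimal sketch of that mapping; the helper name lifecycle_rule is hypothetical, and the only check it performs is the one stated above (an untagged rule must not carry a tag prefix list).

```python
import json
from typing import Dict, List, Optional


def lifecycle_rule(priority: int, description: str, tag_status: str,
                   count_type: str, count_number: int,
                   count_unit: Optional[str] = None,
                   tag_prefixes: Optional[List[str]] = None) -> Dict:
    """Build one lifecycle policy rule from the values entered in the console form."""
    if tag_status == "untagged" and tag_prefixes:
        raise ValueError("tag prefix list must be empty for untagged images")
    selection = {
        "tagStatus": tag_status,
        "countType": count_type,
        "countNumber": count_number,
    }
    if count_unit is not None:
        selection["countUnit"] = count_unit       # only applicable to some count types
    if tag_prefixes:
        selection["tagPrefixList"] = tag_prefixes
    return {
        "rulePriority": priority,
        "description": description,
        "selection": selection,
        "action": {"type": "expire"},
    }


if __name__ == "__main__":
    rule = lifecycle_rule(10, "Only keep untagged images for 7 days",
                          "untagged", "sinceImagePushed", 7, count_unit="days")
    # The policy document is a list of such rules under the "rules" key.
    print(json.dumps({"rules": [rule]}, indent=2))
```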

I actually forged a one line solution using aws cli
aws ecr describe-repositories --output text | awk '{print $5}' | egrep -v '^$' | while read line; do repo=$(echo $line | sed -e "s/arn:aws:ecr.*\///g") ; aws ecr list-images --repository-name $repo --filter tagStatus=UNTAGGED --query 'imageIds[*]' --output text | while read imageId; do aws ecr batch-delete-image --repository-name $repo --image-ids imageDigest=$imageId; done; done
What it's doing is:
get all repositories
for each repository give me all images with tagStatus=UNTAGGED
for each image+repo issue a batch-delete-image
If you have jq, you can use this version, which is more robust because it does not rely on the changing text format, and more efficient because it batch-deletes once per repository:
aws ecr describe-repositories \
| jq --raw-output .repositories[].repositoryName \
| while read repo; do
imageIds=$(aws ecr list-images --repository-name $repo --filter tagStatus=UNTAGGED --query 'imageIds[*]' --output json | jq -r '[.[].imageDigest] | map("imageDigest="+.) | join (" ")');
if [[ "$imageIds" == "" ]]; then continue; fi
aws ecr batch-delete-image --repository-name $repo --image-ids $imageIds;
done
This has been broken up into more lines for readability, so better put it into a function in your .bashrc, but you could of course stuff it into a single line:
aws ecr describe-repositories | jq --raw-output .repositories[].repositoryName | while read repo; do imageIds=$(aws ecr list-images --repository-name $repo --filter tagStatus=UNTAGGED --query 'imageIds[*]' --output json | jq -r '[.[].imageDigest] | map("imageDigest="+.) | join (" ")'); if [[ "$imageIds" == "" ]]; then continue; fi; aws ecr batch-delete-image --repository-name $repo --image-ids $imageIds; done

Setting a lifecycle policy is definitely the best way of managing this. That being said, if you do have a bunch of images that you want to delete, keep in mind that batch-delete-image accepts at most 100 images per call. So if the number of untagged images is greater than 100, you need to delete them 100 at a time and repeat until none are left:
IMAGES_TO_DELETE=$( aws ecr list-images --repository-name $ECR_REPO --filter "tagStatus=UNTAGGED" --query 'imageIds[0:100]' --output json )
echo $IMAGES_TO_DELETE | jq length # Gets the number of results
aws ecr batch-delete-image --repository-name $ECR_REPO --image-ids "$IMAGES_TO_DELETE" --profile qa || true
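
Repeating the call by hand can be avoided by splitting the list into batches of 100. The sketch below assumes a boto3 ECR client is passed in as ecr_client; the function names chunks and delete_untagged are hypothetical, while list_images (via its paginator) and batch_delete_image are the boto3 calls corresponding to the CLI commands above.

```python
from typing import Dict, Iterator, List


def chunks(items: List[Dict], size: int = 100) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_untagged(ecr_client, repository: str) -> int:
    """Delete all untagged images in `repository`; return how many deletions were requested."""
    paginator = ecr_client.get_paginator("list_images")
    image_ids: List[Dict] = []
    for page in paginator.paginate(repositoryName=repository,
                                   filter={"tagStatus": "UNTAGGED"}):
        image_ids.extend(page["imageIds"])
    # batch_delete_image accepts at most 100 image IDs per call.
    # With no untagged images the loop body never runs, so no "|| true" is needed.
    for batch in chunks(image_ids):
        ecr_client.batch_delete_image(repositoryName=repository, imageIds=batch)
    return len(image_ids)
```

A call would look like delete_untagged(boto3.client("ecr"), "my-repo"), where "my-repo" is a placeholder repository name and boto3 has to be installed and configured separately.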

If you want to remove untagged images from a repository, you can create a JSON lifecycle policy and then use Python to apply it to the repo.
In my case, I am applying the policy to all the repositories in ECR. I have created a "lifecyclepolicy.json" file in my current directory that contains the lifecycle policy.
Here is my Python code:
import os
import json
import boto3

def ecr_lifecycle(lifecycle_policy):
    ecr_client = boto3.client('ecr')
    # Collect the names of all repositories (paginated)
    repositories = []
    describe_repo_paginator = ecr_client.get_paginator('describe_repositories')
    for response_list_repopaginator in describe_repo_paginator.paginate():
        for repo in response_list_repopaginator['repositories']:
            repositories.append(repo['repositoryName'])
    # Apply the same policy to every repository and collect the responses
    responses = []
    for repository in repositories:
        responses.append(ecr_client.put_lifecycle_policy(
            repositoryName=repository,
            lifecyclePolicyText=json.dumps(lifecycle_policy)))
    return responses

if __name__ == '__main__':
    path = os.path.dirname(os.path.abspath(__file__))
    with open(os.path.join(path, 'lifecyclepolicy.json')) as json_file:
        data = json.load(json_file)
    ecr_lifecycle(data)
Here is the JSON file:
{
  "rules": [
    {
      "rulePriority": 10,
      "description": "Only keep untagged images for 7 days",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 7
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}

Based on @Ken J's answer, here is a Python script that removes the untagged images from ALL your ECR repositories:
#!/usr/bin/python3
import subprocess
import json

# Based on: https://stackoverflow.com/questions/40949342/how-to-delete-untagged-images-from-aws-ecr-container-registry
region = "us-east-1"
debug = False

def _runCommand(command):
    if debug:
        print(" ".join(command))
    p = subprocess.Popen(command, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() drains both pipes, so a full stderr buffer cannot block the read
    out, err = p.communicate()
    return [out.decode("utf-8"), err.decode("utf-8")]

# Parenthesize the concatenation so that split() applies to the whole command string
command = ("aws ecr describe-repositories --region " + region + " --output json").split(" ")
data = _runCommand(command)[0]
for i in json.loads(data)["repositories"]:
    name = i["repositoryName"]
    print(name)
    command = ["aws", "ecr", "list-images", "--region", region, "--repository-name", name, "--filter", "tagStatus=UNTAGGED", "--query", 'imageIds[*]', "--output", "json"]
    data = _runCommand(command)[0]
    # Nothing to delete in this repository (or the listing failed): skip it
    if not data.strip() or not json.loads(data):
        continue
    command = ["aws", "ecr", "batch-delete-image", "--region", region, "--repository-name", name, "--image-ids", data]
    data = _runCommand(command)[0]
    print(data)

First step:
untaggedImages=$(aws ecr list-images --repository-name <your_repo_name> --filter "tagStatus=UNTAGGED" --query 'to_string(imageIds[*])' --output json)
Second step:
aws ecr batch-delete-image --repository-name <your_repo_name> --image-ids "$untaggedImages" || true
The to_string function is required because the returned JSON is otherwise an object rather than a string.
