I am trying to access an S3 object's user-defined metadata inside CodeBuild and set it as an environment variable.
As per the docs, the S3 source action only outputs ETag and VersionId, so I am assuming that user-defined metadata is not exported to CodePipeline by default when S3 is the source.
I am thinking of using an AWS CLI command and then setting the result as an environment variable for CodeBuild. Is there a better way?
aws s3api head-object --bucket bucket-name --profile profile --key xxxx.zip
You are right: the only way to get the object metadata is a head-object CLI call. You can use the buildspec below in your CodeBuild stage to get the object metadata for a pipeline with an S3 source action.
version: 0.2
phases:
  build:
    commands:
      - BUCKET_PATH=$(echo $CODEBUILD_SOURCE_VERSION | cut -d ':' -f 6)
      - BUCKET=$(echo $BUCKET_PATH | cut -d '/' -f 1)
      - KEY=$(echo $BUCKET_PATH | cut -d '/' -f 2,3,4)
      - aws s3api head-object --bucket $BUCKET --key $KEY --query Metadata
Please note that updating metadata on the S3 source object will also trigger the pipeline with the S3 source action.
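If you then need a specific metadata key as a shell variable for a later build command, a minimal sketch of an extra buildspec command would be the following (cms-id is just a hypothetical user-defined metadata key):
CMS_ID=$(aws s3api head-object --bucket $BUCKET --key $KEY --query 'Metadata."cms-id"' --output text)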
Is there an easy way, using the AWS CLI, to delete all size-0 objects under a prefix?
For example, if our S3 prefix looks like this:
$ aws s3 ls --recursive s3://bucket/prefix
2022-04-20 10:39:51 0 empty_file
2022-04-20 10:39:52 21 top_level_file
2022-04-14 15:01:34 0 folder_a/nested_empty_file
2022-04-23 03:35:02 42 folder_a/dont_delete_me
I would like an AWS CLI command-line invocation to delete just empty_file and folder_a/nested_empty_file.
I know this could be done via boto or any number of other S3 API implementations, but it feels like I should be able to do this in a one-liner from the command line, given how simple it is.
Using the aws s3api subcommand and jq for JSON wrangling, we can do the following:
aws s3api delete-objects --bucket bucket --delete "$(aws s3api list-objects-v2 --bucket bucket --prefix prefix --query 'Contents[?Size==`0`]' | jq -c -r '{ "Objects": [.[] | {"Key": .Key}] }')"
aws s3api does not yet support reading from stdin (see the GitHub PR: https://github.com/aws/aws-cli/pull/3209), so we need to pass the list of objects via sub-shell expansion, which is unfortunately a little awkward but still meets my requirement of a one-liner.
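If the sub-shell expansion gets unwieldy, the same approach can be split into two steps by writing the payload to a file and passing it with the CLI's file:// syntax. This is just a sketch of the same idea (note that delete-objects accepts at most 1000 keys per call):
aws s3api list-objects-v2 --bucket bucket --prefix prefix --query 'Contents[?Size==`0`]' | jq -c '{ "Objects": [.[] | {"Key": .Key}] }' > empty-objects.json
aws s3api delete-objects --bucket bucket --delete file://empty-objects.json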
I have two AWS accounts, let's say A and B.
Account A uses CodeBuild to build and upload artifacts to an S3 bucket owned by B. Account B has set an ACL on the bucket to give write permissions to A.
The artifact file is successfully uploaded to the S3 bucket. However, account B doesn't have any permissions on the file, since it is owned by A.
Account A can change the ownership by running
aws s3api put-object-acl --bucket bucket-name --key key-name --acl bucket-owner-full-control
But this has to be run manually after every build from account A. How can I grant permissions to account B as part of the CodeBuild procedure? Or how can account B get around this ownership restriction?
CodeBuild starts automatically via webhooks, and my buildspec is this:
version: 0.2
env:
phases:
  install:
    runtime-versions:
      java: openjdk8
    commands:
      - echo Entered the install phase...
  build:
    commands:
      - echo Entered the build phase...
  post_build:
    commands:
      - echo Entered the post_build phase...
artifacts:
  files:
    - 'myFile.txt'
CodeBuild does not natively support writing artifacts to a different account, as it does not set the proper ACL on the cross-account object. This is the reason the following limitation is called out in the CodePipeline documentation:
Cross-account actions are not supported for the following action types:
Jenkins build actions
CodeBuild build or test actions
https://docs.aws.amazon.com/codepipeline/latest/userguide/pipelines-create-cross-account.html
One workaround is to set the ACL on the artifacts yourself in CodeBuild:
version: 0.2
phases:
  post_build:
    commands:
      - aws s3api list-objects --bucket testingbucket --prefix CFNtest/OutputArti >> $CODEBUILD_SRC_DIR/objects.json
      - |
        for i in $(jq -r '.Contents[]|.Key' $CODEBUILD_SRC_DIR/objects.json); do
          echo $i
          aws s3api put-object-acl --bucket testingbucket --key $i --acl bucket-owner-full-control
        done
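The temporary JSON file can also be avoided by letting the CLI do the filtering with --query. A sketch of the same loop, reusing the bucket and prefix from above and assuming keys without spaces:
for key in $(aws s3api list-objects --bucket testingbucket --prefix CFNtest/OutputArti --query 'Contents[].Key' --output text); do
  aws s3api put-object-acl --bucket testingbucket --key "$key" --acl bucket-owner-full-control
done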
I did it using aws cli commands from the build phase.
version: 0.2
phases:
  build:
    commands:
      - mvn install...
      - aws s3 cp my-file s3://bucketName --acl bucket-owner-full-control
I am using the build phase, since post_build will be executed even if the build was not successful.
edit: updated answer with a sample.
I am using an AWS CLI command on my Windows machine to get the latest file from an S3 bucket:
aws s3 ls s3://Bucket-name --recursive | sort |tail -n 1
It lists all the files sorted by date as far as this part of the command:
aws s3 ls s3://Bucket-name --recursive | sort
But running the full command throws an error:
'Tail is not recognized as an internal or external command'.
Is there an alternative to tail, or to the full command?
The AWS CLI permits JMESPath expressions in the --query parameter.
This command shows the most recently-updated object:
aws s3api list-objects --bucket my-bucket --query 'sort_by(Contents, &LastModified)[-1].Key' --output text
It's basically saying:
Sort by LastModified
Obtain the last [-1] entry
Show the Key (filename)
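To then act on that object, the result can be captured in a variable and passed to aws s3 cp. A sketch in a bash-style shell (since the question is on Windows, note that the variable syntax in cmd or PowerShell differs):
LATEST=$(aws s3api list-objects --bucket my-bucket --query 'sort_by(Contents, &LastModified)[-1].Key' --output text)
aws s3 cp "s3://my-bucket/$LATEST" .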
We have ~400,000 files in a private S3 bucket that are inbound/outbound call recordings. The files follow a naming pattern that lets me search for numbers, both inbound and outbound. Note that these recordings are in the Glacier storage class.
Using the AWS CLI, I can search through this bucket and grep out the files I need. What I'd like to do now is initiate an S3 restore job with expedited retrieval (so a ~1-5 minute recovery time), and then maybe 30 minutes later run a command to download the files.
My efforts so far:
aws s3 ls s3://exetel-logs/ --recursive | grep .*042222222.* | cut -c 32-
This retrieves the keys of about 200 files. I am unsure how to proceed next, as aws s3 cp won't work for objects still in the Glacier storage class.
Cheers,
The AWS CLI has two separate commands for S3: s3 and s3api. s3 is a high-level abstraction with limited features, so for restoring files you'll have to use one of the commands available with s3api. restore-object takes a restore request that specifies how long the restored copy should remain available and, for Glacier, the retrieval tier:
aws s3api restore-object --bucket exetel-logs --key your-key --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Expedited"}}'
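To kick off restores for every key matched by the grep pipeline from the question, a loop along these lines should work. This is only a sketch: it reuses the question's bucket, pattern, and cut offset, keeps the restored copies for 7 days, and assumes keys without spaces:
aws s3 ls s3://exetel-logs/ --recursive | grep 042222222 | cut -c 32- | while read -r key; do
  aws s3api restore-object --bucket exetel-logs --key "$key" --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Expedited"}}'
done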
If you afterwards want to copy the files, but want to make sure you only copy files that were restored from Glacier, you can use the following snippet:
for key in $(aws s3api list-objects-v2 --bucket exetel-logs --query "Contents[?StorageClass=='GLACIER'].[Key]" --output text); do
  if [ $(aws s3api head-object --bucket exetel-logs --key ${key} --query "contains(Restore, 'ongoing-request=\"false\"')") == true ]; then
    echo ${key}
  fi
done
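Once a key is printed here, the restored copy can be downloaded normally; for example, the echo ${key} line could be swapped for something like this (a sketch, assuming a local ./restored/ directory):
aws s3 cp "s3://exetel-logs/${key}" ./restored/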
Have you considered using a high-level language wrapper around the AWS API instead of the CLI? It will make these kinds of tasks easier to integrate into your workflows. I prefer the Python SDK (Boto 3). Here is example code for how to download all files from an S3 bucket.
I'm trying to copy a local file named test.txt to my S3 bucket and add metadata to the file.
But it always prints this error:
argument --metadata-directive: Invalid choice, valid choices are: COPY | REPLACE
Is it possible to do this with the cp command? As I understand the docs, it should be possible.
AWS CLI CP DOCS
These are the commands I've tried:
aws s3 cp test.txt to s3://a-bucket/test.txt --metadata x-amz-meta-cms-id:34533452
aws s3 cp test.txt to s3://a-bucket/test.txt --metadata-directive COPY --metadata x-amz-meta-cms-id:34533452
aws s3 cp test.txt to s3://a-bucket/test.txt --metadata-directive COPY --metadata '{"x-amz-meta-cms-id":"34533452"}'
aws s3 cp test.txt to s3://a-bucket/test.txt --metadata '{"x-amz-meta-cms-id":"34533452"}'
aws --version:
aws-cli/1.9.7 Python/2.7.10 Darwin/16.1.0 botocore/1.3.7
OS: macOS Sierra version 10.12.1
Edit
Worth mentioning is that uploading a file without the --metadata flag works fine.
Hmm, I've checked the help for my version of the CLI with aws s3 cp help.
Turns out it does not list --metadata as an option, unlike the docs at the link above.
If you're running an older version of the AWS CLI, use aws s3api put-object instead.
How to upload a file to a bucket and add metadata:
aws s3api put-object --bucket a-bucket --key test.txt --body test.txt --metadata '{"x-amz-meta-cms-id":"34533452"}'
Docs: AWS S3API DOCS
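To confirm the metadata was stored, you can read it back with head-object (a quick sketch using the same bucket and key):
aws s3api head-object --bucket a-bucket --key test.txt --query Metadata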
Indeed, support for the --metadata option was added in 1.9.10:
aws s3: Added support for custom metadata in cp, mv, and sync.
So upgrade your AWS CLI to that version (or, even better, to the latest). The metadata value needs to be a map, so:
aws s3 cp test.txt s3://a-bucket/test.txt --metadata '{"x-amz-meta-cms-id":"34533452"}'
Install s3cmd tools (free) and invoke like so:
s3cmd modify --add-header x-amz-meta-foo:bar s3://<bucket>/<object>
With the x-amz-meta-foo:bar header, you will get foo as the key and bar as the value of that key.
There are special flags to set Content-Type and Content-Encoding:
aws s3 cp test.gz s3://a-bucket/test.gz --content-type application/octet-stream --content-encoding gzip
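If you want to double-check those headers after the upload, head-object shows them (a sketch against the same bucket and key):
aws s3api head-object --bucket a-bucket --key test.gz --query '[ContentType, ContentEncoding]'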
There is a catch with the metadata-directive "COPY" option: S3 copies the metadata from the source object and ignores any --metadata you pass alongside it, so the following does not add the new metadata:
aws s3api copy-object --bucket testkartik --copy-source testkartik/costs.csv --key costs.csv --metadata-directive "COPY" --metadata "SomeKey=SomeValue"
Below are three steps to work around it on the command line with jq.
1. Install jq to handle the JSON metadata on the command line.
2. Read the existing metadata and merge in the new entries:
aws s3api head-object --bucket <bucket> --key <key> | jq '.Metadata' | jq --compact-output '. +{"new":"metadata", "another" : "metadata"}'
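For example, if the object already carried a (hypothetical) cms-id entry, the expression above would emit a compact map like this, ready to be passed to --metadata:
{"cms-id":"34533452","new":"metadata","another":"metadata"}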
3. Copy the object over itself with the merged metadata, using the "REPLACE" directive:
aws s3api copy-object --bucket <bucket-name> --copy-source <bucket/key> --key <key> --metadata-directive "REPLACE" --metadata $(READ-THE-EXISTING-From-Step-2)
The complete command in one go:
aws s3api copy-object --bucket <bucket-name> --copy-source <bucket/key> --key <key> --metadata-directive "REPLACE" --metadata $(aws s3api head-object --bucket <bucket> --key <key> | jq '.Metadata' | jq --compact-output '. +{"new":"metadata", "another" : "metadata"}')