AWS S3 bulk encryption on all S3 buckets - amazon-web-services

I am trying to bulk-update all S3 buckets with default encryption. To do that I generate a JSON file using the command below:
aws s3api list-buckets --query "Buckets[].Name" >> s3.json
The result was the names of all my S3 buckets.
How do I pass that JSON file into the command so I can enable default encryption?
I also tried the following:
aws s3api list-buckets --query 'Buckets[*].[Name]' --output text | xargs -I {} bash -c 'aws s3api put-bucket-encryption --bucket {} --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}''
But I am getting the error below:
Error parsing parameter '--server-side-encryption-configuration': Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
JSON received: {Rule
aws s3api put-bucket-encryption --bucket bucketnames --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
I have also tried the following, but it does not work:
aws s3api put-bucket-encryption \
--bucket value \
--server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}' \
--cli-input-json file://s3bucket.json
Please let me know how to update my command to enable default encryption.

Below is a code snippet to solve your problem:
#!/bin/bash
# Check if each bucket has SSE enabled, then encrypt it using SSE AES256.
# List all buckets and store the names in an array.
arr=($(aws s3api list-buckets --query "Buckets[].Name" --output text))
# Check the status before encryption:
for i in "${arr[@]}"
do
  echo "Check if SSE is enabled for bucket -> ${i}"
  aws s3api get-bucket-encryption --bucket "${i}" | jq -r '.ServerSideEncryptionConfiguration.Rules[0].ApplyServerSideEncryptionByDefault.SSEAlgorithm'
done
# Encrypt all buckets in your account:
for i in "${arr[@]}"
do
  echo "Encrypting bucket with SSE AES256 for -> ${i}"
  aws s3api put-bucket-encryption --bucket "${i}" --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
done
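Note: for buckets that do not have default encryption configured yet, the get-bucket-encryption call in the first loop returns a ServerSideEncryptionConfigurationNotFoundError; that just means default encryption is not enabled on that bucket yet, and the second loop will take care of it.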

aws s3api list-buckets --query "Buckets[].Name" \
| jq .[] \
| xargs -I '{}' aws s3api put-bucket-encryption --bucket {} --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
This worked for me.
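Note that jq .[] prints the bucket names with surrounding double quotes; xargs happens to strip them, but using jq -r '.[]' to emit raw strings avoids relying on that behaviour.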

If you wanted to do it in Python it would be something like this (not tested!):
import boto3

s3_client = boto3.client('s3')
response = s3_client.list_buckets()
for bucket in response['Buckets']:
    s3_client.put_bucket_encryption(
        Bucket=bucket['Name'],
        ServerSideEncryptionConfiguration={
            'Rules': [
                {
                    'ApplyServerSideEncryptionByDefault': {
                        'SSEAlgorithm': 'AES256'
                    }
                },
            ]
        }
    )

Related

AWS CLI - SSL check on every bucket

I'm just getting started with learning the AWS CLI. I was wondering, is there a way of checking pre-existing buckets to see if they have SSL enabled?
Many thanks
buckets=$(aws s3api list-buckets | jq -r '.Buckets[].Name')
for bucket in $buckets
do
  #echo "$bucket"
  # Non-zero output means the bucket policy contains an aws:SecureTransport=false condition.
  if aws s3api get-bucket-policy --bucket "$bucket" --query Policy --output text &> /dev/null; then
    aws s3api get-bucket-policy --bucket "$bucket" --query Policy --output text | jq -r 'select(.Statement[].Condition.Bool."aws:SecureTransport"=="false")' | wc | awk '{print $1}'
  fi
done
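If you prefer Python, below is a minimal boto3 sketch of the same idea (my own untested take, not part of the original answer): it treats a bucket as enforcing SSL when any policy statement carries an aws:SecureTransport = false condition.
import json
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

for bucket in s3.list_buckets()['Buckets']:
    name = bucket['Name']
    try:
        policy = json.loads(s3.get_bucket_policy(Bucket=name)['Policy'])
    except ClientError:
        # No bucket policy at all, so nothing can enforce SSL-only access.
        print(f"{name}: no bucket policy")
        continue
    statements = policy.get('Statement', [])
    if not isinstance(statements, list):
        statements = [statements]
    enforced = any(
        s.get('Condition', {}).get('Bool', {}).get('aws:SecureTransport') == 'false'
        for s in statements
    )
    print(f"{name}: {'SSL enforced' if enforced else 'SSL not enforced'}")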

Retrieving files from s3 with content-type

Is there any proper way to retrieve files from S3 with their Content-Type using Python or the AWS CLI?
I've searched and made some queries, shown below, but the first one does not seem to work as intended.
aws s3 ls --summarize --human-readable --recursive s3://<Bucket Name> | egrep '*.jpg*'
And the following query seems to work, but it also returns 404 errors.
for KEY in $(aws s3api list-objects --bucket <Bucket Name> --query "Contents[].[Key]" --output text); do
  aws s3api head-object --bucket <Bucket Name> --key $KEY --query "[\`$KEY\`,ContentType]" --output text | awk '$2 == "image/jpeg" { print $1 }'
done
One of the reasons is that the variable is not expanding in the query parameter:
--query "[\`$KEY\`,ContentType]"
You can look here for more details:
How to expand variable in aws-cli --query parameter
So you can try the following; I just tested it and it seems to work.
#!/bin/bash
ContentType="application/octet-stream"
BUCKET=mybucket
MAX_ITEMS=100
OBJECT_LIST="$(aws s3api list-objects --bucket $BUCKET --query 'Contents[].[Key]' --max-items=$MAX_ITEMS --output text | tr '\n' ' ' )"
for KEY in ${OBJECT_LIST}
do
  aws s3api head-object --bucket $BUCKET --key $KEY --query "[\``echo $KEY`\`,ContentType]" --output text | grep "$ContentType"
done
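For completeness, a rough boto3 equivalent might look like the sketch below (my own untested version; the bucket name and target content type are placeholders). It lists the keys and keeps only those whose ContentType matches, issuing one head_object call per key just like the CLI loop.
import boto3

s3 = boto3.client('s3')
bucket = 'mybucket'       # placeholder bucket name
wanted = 'image/jpeg'     # content type to filter on

paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get('Contents', []):
        key = obj['Key']
        # One HEAD request per object, so this is slow on large buckets.
        head = s3.head_object(Bucket=bucket, Key=key)
        if head.get('ContentType') == wanted:
            print(key)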

How to compare versions of an Amazon S3 object?

Versioning of Amazon S3 buckets is nice, but I don't see any easy way to compare versions of a file - either through the console or through any other app I found.
S3Browser seems to have the best versioning support, but no comparison.
Is there a way to compare versions of a file on S3 without downloading both versions and comparing them manually?
--
EDIT:
I just started thinking that some basic automation should not be too hard; see the snippet below. The question remains, though: is there any tool that supports this properly? This script may be fine for me, but not for non-dev users.
#!/bin/bash
# s3-compare-last-versions.sh
if [[ $# -ne 2 ]]; then
    echo "Usage: `basename $0` <bucketName> <fileKey>"
    exit 1
fi
bucketName=$1
fileKey=$2
latestVersionId=$(aws s3api list-object-versions --bucket $bucketName --prefix $fileKey --max-items 2 | json Versions[0].VersionId)
previousVersionId=$(aws s3api list-object-versions --bucket $bucketName --prefix $fileKey --max-items 2 | json Versions[1].VersionId)
aws s3api get-object --bucket $bucketName --key $fileKey --version-id $latestVersionId $latestVersionId".js"
aws s3api get-object --bucket $bucketName --key $fileKey --version-id $previousVersionId $previousVersionId".js"
diff $latestVersionId".js" $previousVersionId".js"
I wrote a bash script to download the last two versions of an object and compare them using colordiff. I stumbled across this question after writing it, and thought I could share it here if anyone wanted to use it.
#!/bin/bash
#This script needs awscli, jq and colordiff. Please install them for your environment
#This script also needs the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION environment variables.
#Please set them using the export command as follows, or via an .envrc file:
#export AWS_ACCESS_KEY_ID=<Your AWS Access Key ID>
#export AWS_SECRET_ACCESS_KEY=<Your AWS Secret Access Key>
#export AWS_DEFAULT_REGION=<Your AWS Default Region>
set -e
if [ -z "$1" ] || [ -z "$2" ]; then
    echo "Usage:"
    echo "version_compare.sh *bucket_name* *file_name*"
    echo
    echo "Example"
    echo "version_compare.sh bucket_name folder/filename.extension"
    echo
    exit 1
fi
aws_bucket=$1
file_key=$2
echo Getting the last 2 versions of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api list-object-versions --bucket ${aws_bucket} --prefix ${file_key} --max-items 2
EOF
echo
versions=$(aws s3api list-object-versions --bucket ${aws_bucket} --prefix ${file_key} --max-items 2)
version_1=$( jq -r '.["Versions"][0]["VersionId"]' <<< "${versions}" )
version_2=$( jq -r '.["Versions"][1]["VersionId"]' <<< "${versions}" )
mkdir -p state_comparison_files
echo Getting the latest version ${version_1} of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_1} state_comparison_files/${version_1}
EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_1} state_comparison_files/${version_1} > /dev/null
echo
echo Getting older version ${version_2} of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_2} state_comparison_files/${version_2}
EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_2} state_comparison_files/${version_2} > /dev/null
echo
echo Comparing the different versions.
echo If no differences are found, nothing will be shown
colordiff --unified state_comparison_files/${version_2} state_comparison_files/${version_1}
Here's the link to it:
https://gist.github.com/mohamednajiullah/3edc88d314291be40f2dd3cf13ea0d7f
Note: it's pretty much the same as the script the question asker created, except that it uses jq for JSON parsing and colordiff for showing the differences in colour, like git diff does.
I'm creating an Electron.js-based desktop app to do exactly this. It's currently in development, but it can be used. I welcome contributions:
https://github.com/mohamednajiullah/s3_object_version_comparator
You can't view file contents at all via S3, so you definitely can't compare the contents of files via S3. You would have to download the different versions and then use a tool like diff to compare them.
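As a rough illustration of that download-and-diff approach, the untested boto3 sketch below (the bucket and key are placeholders) pulls the two most recent versions and diffs them in memory, assuming the object is a reasonably small text file:
import difflib
import boto3

s3 = boto3.client('s3')
bucket, key = 'my-bucket', 'folder/file.json'   # placeholders

# The two most recent versions of the object.
versions = s3.list_object_versions(Bucket=bucket, Prefix=key, MaxKeys=2)['Versions'][:2]
latest, previous = versions[0]['VersionId'], versions[1]['VersionId']

def fetch(version_id):
    # Read one version of the object into a list of lines.
    body = s3.get_object(Bucket=bucket, Key=key, VersionId=version_id)['Body']
    return body.read().decode('utf-8').splitlines()

for line in difflib.unified_diff(fetch(previous), fetch(latest),
                                 fromfile=previous, tofile=latest, lineterm=''):
    print(line)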
You can use MegaSparkDiff, an open-source tool that compares multiple types of data sources, including S3:
https://github.com/FINRAOS/MegaSparkDiff
The pair below will return inLeftButNotInRight and inRightButNotInLeft as DataFrames, which you can save as files or examine via code.
SparkFactory.initializeSparkContext();
AppleTable leftAppleTable = SparkFactory.parallelizeTextSource("S3://file1","table1");
AppleTable rightAppleTable = SparkFactory.parallelizeTextSource("S3://file2","table2");
Pair<Dataset<Row>, Dataset<Row>> resultPair = SparkCompare.compareAppleTables(leftAppleTable, rightAppleTable);
resultPair.getLeft().show(100);
SparkFactory.stopSparkContext();

How can I read the metadata for every item in an S3 bucket?

I can set Cache-Control metadata on every item in an S3 bucket using the following command (from this answer):
aws s3 cp s3://mybucket s3://mybucket --recursive --metadata-directive REPLACE \
--cache-control max-age=86400
Is there a way to read the Cache-Control metadata for every item in a bucket?
This bash one-liner should work (but it is very slow, since it sends a separate request for each object):
IFS=$'\n'; for object in `aws s3 ls s3://my-bucket-name --recursive | tr -s ' ' | cut -d' ' -f4-`; do echo $object `aws s3api head-object --bucket my-bucket-name --key $object --query CacheControl` ; done
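A boto3 version of the same loop could look roughly like this (my own sketch; note that CacheControl is simply absent from the head_object response when it was never set):
import boto3

s3 = boto3.client('s3')
bucket = 'my-bucket-name'   # placeholder

paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get('Contents', []):
        key = obj['Key']
        # Still one HEAD request per object, so still slow on large buckets.
        head = s3.head_object(Bucket=bucket, Key=key)
        print(key, head.get('CacheControl', '<not set>'))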

Does aws-cli confirm checksums when uploading files to S3, or do I need to manage that myself?

If I'm uploading data to S3 using the aws-cli (i.e. using aws s3 cp), does aws-cli do any work to confirm that the resulting file in S3 matches the original file, or do I somehow need to manage that myself?
Based on this answer and the Java API documentation for putObject(), it looks like it's possible to verify the MD5 checksum after upload. However, I can't find a definitive answer on whether aws-cli actually does that.
It matters to me because I'm intending to upload GPG-encrypted files from a backup process, and I'd like some confidence that what's been stored in S3 actually matches the original.
According to the FAQ from the aws-cli GitHub repository, checksums are verified in most cases during upload and download.
Key points for uploads:
The AWS CLI calculates the Content-MD5 header for both standard and multipart uploads.
If the checksum that S3 calculates does not match the Content-MD5 provided, S3 will not store the object and will instead return an error message back to the AWS CLI.
The AWS CLI will retry this error up to 5 times before giving up and exiting with a nonzero exit code.
The AWS support page How do I ensure data integrity of objects uploaded to or downloaded from Amazon S3? describes how to achieve this.
Firstly determine the base64 encoded md5sum of the file you wish to upload:
$ md5_sum_base64="$( openssl md5 -binary my-file | base64 )"
Then use the s3api to upload the file:
$ aws s3api put-object --bucket my-bucket --key my-file-name --body my-file-path --content-md5 "$md5_sum_base64"
Note the use of the --content-md5 flag, the help for this flag states:
--content-md5 (string) The base64-encoded 128-bit MD5 digest of the part data.
This does not say much about why to use this flag, but we can find this information in the API documentation for put object:
To ensure that data is not corrupted traversing the network, use the Content-MD5 header. When you use this header, Amazon S3 checks the object against the provided MD5 value and, if they do not match, returns an error. Additionally, you can calculate the MD5 while putting an object to Amazon S3 and compare the returned ETag to the calculated MD5 value.
Using this flag causes S3 to verify server-side that the file hash matches the specified value. If the hashes match, S3 will return the ETag:
{
"ETag": "\"599393a2c526c680119d84155d90f1e5\""
}
The ETag value will usually be the hexadecimal md5sum (see this question for some scenarios where this may not be the case).
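If it helps, both encodings of the same digest can be produced with a few lines of Python (a small illustrative sketch; "my-file" is a placeholder): the hex form is what you compare against the ETag, the base64 form is what --content-md5 expects.
import base64
import hashlib

with open('my-file', 'rb') as f:   # placeholder file name
    md5 = hashlib.md5(f.read())

print('hex (compare to the ETag):  ', md5.hexdigest())
print('base64 (for --content-md5): ', base64.b64encode(md5.digest()).decode())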
If the hash does not match the one you specified, you get an error:
A client error (InvalidDigest) occurred when calling the PutObject operation: The Content-MD5 you specified was invalid.
In addition to this you can also add the file md5sum to the file metadata as an additional check:
$ aws s3api put-object --bucket my-bucket --key my-file-name --body my-file-path --content-md5 "$md5_sum_base64" --metadata md5chksum="$md5_sum_base64"
After upload you can issue the head-object command to check the values.
$ aws s3api head-object --bucket my-bucket --key my-file-name
{
"AcceptRanges": "bytes",
"ContentType": "binary/octet-stream",
"LastModified": "Thu, 31 Mar 2016 16:37:18 GMT",
"ContentLength": 605,
"ETag": "\"599393a2c526c680119d84155d90f1e5\"",
"Metadata": {
"md5chksum": "WZOTosUmxoARnYQVXZDx5Q=="
}
}
Here is a bash script that uses the Content-MD5 header, adds the metadata, and then verifies that the values returned by S3 match the local hashes:
#!/bin/bash
set -euf -o pipefail
# assumes you have aws cli, jq installed
# change these if required
tmp_dir="$HOME/tmp"
s3_dir="foo"
s3_bucket="stack-overflow-example"
aws_region="ap-southeast-2"
aws_profile="my-profile"
test_dir="$tmp_dir/s3-md5sum-test"
file_name="MailHog_linux_amd64"
test_file_url="https://github.com/mailhog/MailHog/releases/download/v1.0.0/MailHog_linux_amd64"
s3_key="$s3_dir/$file_name"
return_dir="$( pwd )"
cd "$tmp_dir" || exit
mkdir "$test_dir"
cd "$test_dir" || exit
wget "$test_file_url"
md5_sum_hex="$( md5sum $file_name | awk '{ print $1 }' )"
md5_sum_base64="$( openssl md5 -binary $file_name | base64 )"
echo "$file_name hex = $md5_sum_hex"
echo "$file_name base64 = $md5_sum_base64"
echo "Uploading $file_name to s3://$s3_bucket/$s3_dir/$file_name"
aws \
--profile "$aws_profile" \
--region "$aws_region" \
s3api put-object \
--bucket "$s3_bucket" \
--key "$s3_key" \
--body "$file_name" \
--metadata md5chksum="$md5_sum_base64" \
--content-md5 "$md5_sum_base64"
echo "Verifying sums match"
s3_md5_sum_hex=$( aws --profile "$aws_profile" --region "$aws_region" s3api head-object --bucket "$s3_bucket" --key "$s3_key" | jq -r '.ETag' | sed 's/"//g' )
s3_md5_sum_base64=$( aws --profile "$aws_profile" --region "$aws_region" s3api head-object --bucket "$s3_bucket" --key "$s3_key" | jq -r '.Metadata.md5chksum' )
if [ "$md5_sum_hex" == "$s3_md5_sum_hex" ] && [ "$md5_sum_base64" == "$s3_md5_sum_base64" ]; then
    echo "checksums match"
else
    echo "something is wrong, checksums do not match:"
    cat <<EOM | column -t -s ' '
$file_name file hex: $md5_sum_hex s3 hex: $s3_md5_sum_hex
$file_name file base64: $md5_sum_base64 s3 base64: $s3_md5_sum_base64
EOM
fi
echo "Cleaning up"
cd "$return_dir"
rm -rf "$test_dir"
aws \
--profile "$aws_profile" \
--region "$aws_region" \
s3api delete-object \
--bucket "$s3_bucket" \
--key "$s3_key"
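If you would rather do the same check from Python, a condensed and untested boto3 sketch of the upload-with-Content-MD5-then-verify flow might look like this (the bucket, key and file path are placeholders):
import base64
import hashlib
import boto3

s3 = boto3.client('s3')
bucket, key, path = 'my-bucket', 'my-file-name', 'my-file-path'   # placeholders

with open(path, 'rb') as f:
    data = f.read()
md5 = hashlib.md5(data)
md5_b64 = base64.b64encode(md5.digest()).decode()

# S3 rejects the PUT with InvalidDigest if the body does not match ContentMD5.
s3.put_object(Bucket=bucket, Key=key, Body=data,
              ContentMD5=md5_b64, Metadata={'md5chksum': md5_b64})

# Verify both the ETag (hex md5 for a single-part PUT) and the stored metadata.
head = s3.head_object(Bucket=bucket, Key=key)
assert head['ETag'].strip('"') == md5.hexdigest(), 'ETag does not match local md5'
assert head['Metadata']['md5chksum'] == md5_b64, 'metadata checksum does not match'
print('checksums match')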