I'm just getting started with the AWS CLI and was wondering: is there a way to check existing buckets and see whether they have SSL enabled?
Many Thanks
buckets=$(aws s3api list-buckets | jq -r '.Buckets[].Name')
for bucket in $buckets
do
  # get-bucket-policy fails for buckets without a policy, so those are skipped
  if policy=$(aws s3api get-bucket-policy --bucket "$bucket" --query Policy --output text 2> /dev/null); then
    # Count the statements whose condition tests aws:SecureTransport against "false"
    count=$(echo "$policy" | jq '[.Statement[] | select(.Condition.Bool."aws:SecureTransport" == "false")] | length')
    echo "$bucket: $count"
  fi
done
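For anyone who prefers Python, here is a minimal sketch of the same check on the parsed policy document. It assumes the policy text came from `aws s3api get-bucket-policy --query Policy --output text`; the function names are made up for this example, and it only looks at `Deny` statements that use the `Bool` condition operator, so treat a `False` result as "not detected" rather than proof.

```python
import json


def as_list(value):
    """Policy fields may hold a single item or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def denies_insecure_transport(policy_text):
    """Return True if a Deny statement tests aws:SecureTransport against false."""
    policy = json.loads(policy_text)
    for statement in as_list(policy.get("Statement")):
        if statement.get("Effect") != "Deny":
            continue
        bool_condition = statement.get("Condition", {}).get("Bool", {})
        values = as_list(bool_condition.get("aws:SecureTransport"))
        # The value may be the string "false" or a JSON boolean
        if any(str(v).lower() == "false" for v in values):
            return True
    return False
```

Feeding it the policy of each bucket from the loop above gives a yes/no answer per bucket instead of a count.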
This is the command I used to restore the files:
aws s3api list-object-versions \
--bucket Bucket-Name \
--prefix "folders-to-restore" \
--output json \
--query 'DeleteMarkers[?IsLatest==`true`] | [?LastModified > `2022-10-20`] | [?LastModified < `2022-10-22`].[Key, VersionId]' \
| jq -r '.[] | "--key '\''" + .[0] + "'\'' --version-id " + .[1]' \
| xargs -L1 aws s3api delete-object --bucket Bucket-Name
The following error appears when file or folder names contain special characters:
unmatched single quote; by default quotes are special to xargs unless you use the -0 option
I replaced -L1 with -0 in the xargs call, but that did not work either. I have tried every variation I could think of.
I need help resolving this issue.
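One way to sidestep the xargs quoting rules altogether is to build the argument lists in Python instead of a shell string. The sketch below is only an illustration under some assumptions: it expects the JSON printed by `aws s3api list-object-versions --output json` (already parsed into a dict), it compares the `LastModified` strings as text, and both function names are hypothetical.

```python
def select_delete_markers(listing, after, before):
    """Return (key, version_id) pairs for the latest delete markers
    whose LastModified string sorts strictly between after and before."""
    pairs = []
    for marker in listing.get("DeleteMarkers") or []:
        if not marker.get("IsLatest"):
            continue
        # ISO 8601 timestamps sort correctly as plain strings
        if after < marker["LastModified"] < before:
            pairs.append((marker["Key"], marker["VersionId"]))
    return pairs


def build_delete_commands(bucket, pairs):
    """Build one argument list per delete marker; no shell quoting involved."""
    return [
        ["aws", "s3api", "delete-object", "--bucket", bucket,
         "--key", key, "--version-id", version_id]
        for key, version_id in pairs
    ]
```

Each list can then be handed to `subprocess.run(command, check=True)`. Because no shell parses the arguments, keys containing quotes or spaces are passed through unchanged.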
Is there a proper way to retrieve files from S3 by Content-Type using Python or the AWS CLI?
I've searched and tried the commands below, but the first one does not work as intended.
aws s3 ls --summarize --human-readable --recursive s3://<Bucket Name> | egrep '*.jpg*'
The following command seems to work, but it also returns 404 errors.
for KEY in $(aws s3api list-objects --bucket <Bucket Name> --query "Contents[].[Key]" --output text)
do
  aws s3api head-object --bucket <Bucket Name> --key $KEY --query "[\`$KEY\`,ContentType]" --output text | awk '$2 == "image/jpeg" { print $1 }'
done
One reason is that the variable is not being expanded in the --query parameter:
--query "[\`$KEY\`,ContentType]"
See "How to expand variable in aws-cli --query parameter" for more details.
You can try the script below; I just tested it and it seems to work.
#!/bin/bash
ContentType="application/octet-stream"
BUCKET=mybucket
MAX_ITEMS=100
OBJECT_LIST="$(aws s3api list-objects --bucket $BUCKET --query 'Contents[].[Key]' --max-items=$MAX_ITEMS --output text | tr '\n' ' ' )";
for KEY in ${OBJECT_LIST}
do
aws s3api head-object --bucket $BUCKET --key $KEY --query "[\``echo $KEY`\`,ContentType]" --output text | grep "$ContentType"
done
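Since the question also mentions Python, here is a small sketch of the same filter. To keep it runnable without credentials, the lookup is passed in as a function; `keys_with_content_type` and its parameters are names invented for this example, and the boto3 wiring in the trailing comment is an assumption about how you would connect it.

```python
def keys_with_content_type(keys, head, wanted):
    """Return the keys whose metadata reports the wanted ContentType.

    head is any callable that takes a key and returns a dict shaped
    like a head-object response (it must contain "ContentType").
    """
    matches = []
    for key in keys:
        metadata = head(key)
        if metadata.get("ContentType") == wanted:
            matches.append(key)
    return matches


# Possible wiring with boto3 (not executed here):
#   import boto3
#   s3 = boto3.client("s3")
#   head = lambda key: s3.head_object(Bucket="mybucket", Key=key)
#   keys_with_content_type(keys, head, "image/jpeg")
```

Keys are passed as Python strings, so names with spaces do not get split the way they can be in a shell for loop.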
I am looking for a list of S3 buckets that have not been used in the last 90 days, and also a list of empty buckets.
To get these, I tried writing the code below:
#!/bin/sh
for bucketlist in $(aws s3api list-buckets --query "Buckets[].Name" --output text);
do
listobjects=$(\
aws s3api list-objects --bucket $bucketlist \
--query 'Contents[?contains(LastModified, `2020-08-06`)]')
done
This code prints the following output (I have included the results for only one bucket for reference):
{
"Contents": [
{
"Key": "test2/image.png",
"LastModified": "2020-08-06T17:19:10.000Z",
"ETag": "\"xxxxxx\"",
"Size": 179008,
"StorageClass": "STANDARD"
}
]
}
Expectations:
With the code above, I want to print only the buckets whose objects have not been modified or used in the last 90 days.
I am also looking for a list of empty buckets.
I am not good at programming. Can anyone guide me on this?
Thank you in advance for your support.
I made this small shell script to find empty buckets in my account:
#!/bin/zsh
for b in $(aws s3 ls | cut -d" " -f3)
do
echo -n $b
if [[ "$(aws s3api list-objects-v2 --bucket $b --max-items 1)" == "" ]]
then
echo " BUCKET EMPTY"
else
echo ""
fi
done
I list the objects using list-objects-v2 with --max-items 1. If there are no items, the result is empty and I print "BUCKET EMPTY" next to the bucket name.
Note 1: You must have permission to list the objects.
Note 2: I'm not sure how it will work for versioned buckets with deleted objects (such a bucket appears to be empty but actually contains older versions of the deleted objects).
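To illustrate the concern in Note 2, here is a rough Python sketch that classifies a bucket from the output of `aws s3api list-object-versions`, which reports `Versions` and `DeleteMarkers` separately. It assumes you pass in the complete, already parsed listing (no pagination handling), and the function name and the three labels are my own.

```python
def versioned_bucket_state(versions_listing):
    """Classify a bucket from a parsed list-object-versions response."""
    versions = versions_listing.get("Versions") or []
    markers = versions_listing.get("DeleteMarkers") or []
    if not versions and not markers:
        return "empty"
    # A version flagged IsLatest is a current, visible object
    if any(version.get("IsLatest") for version in versions):
        return "has current objects"
    return "only noncurrent versions or delete markers"
```

A bucket in the third state would look empty to list-objects-v2 while still holding data.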
Here's a script I wrote today. It doesn't change anything, but it does give you the command lines to make the changes.
#!/bin/bash
profile="default"
olddate="2020-01-01"
smallbucketsize=10
emptybucketlist=()
oldbucketlist=()
smallbucketlist=()
#for bucketlist in $(aws s3api list-buckets --profile $profile | jq --raw-output '.Buckets[6,7,8,9].Name'); # test this script on just a few buckets
for bucketlist in $(aws s3api list-buckets --profile $profile | jq --raw-output '.Buckets[].Name');
do
echo "* $bucketlist"
if [[ ! "$bucketlist" == *"shmr-logs" ]]; then
listobjects=$(\
aws s3api list-objects --bucket $bucketlist \
--query 'Contents[*].Key' \
--profile $profile)
#echo "==$listobjects=="
if [[ "$listobjects" == "null" ]]; then
echo "$bucketlist is empty"
emptybucketlist+=("$bucketlist")
else
# get size
aws s3 ls --summarize --human-readable --recursive --profile $profile s3://$bucketlist | tail -n1
# get number of files
filecount=$(echo $listobjects | jq length )
echo "contains $filecount files"
if [[ $filecount -lt $smallbucketsize ]]; then
smallbucketlist+=("$bucketlist")
fi
# get number of files older than $olddate
listoldobjects=$(\
aws s3api list-objects --bucket $bucketlist \
--query "Contents[?LastModified<=\`$olddate\`]" \
--profile $profile)
oldfilecount=$(echo $listoldobjects | jq length )
echo "contains $oldfilecount old files"
# check if all files are old
if [[ $filecount -eq $oldfilecount ]]; then
echo "all the files are old"
oldbucketlist+=("$bucketlist")
fi
fi
fi
done
echo -e "\n\n"
echo "check the contents of these buckets which only contain old files"
for oldbuckets in "${oldbucketlist[@]}";
do
echo "$oldbuckets"
done
echo -e "\n\n"
echo "check the contents of these buckets which don't have many files"
for smallbuckets in "${smallbucketlist[@]}";
do
echo "aws s3api list-objects --bucket $smallbuckets --query 'Contents[*].Key' --profile $profile"
done
echo -e "\n\n"
echo "consider deleting these empty buckets"
for emptybuckets in "${emptybucketlist[@]}";
do
echo "aws s3api delete-bucket --profile $profile --bucket $emptybuckets"
done
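For the "not modified in the last 90 days" part of the question, the date arithmetic is easier in Python than in shell. This is a sketch under a few assumptions: `objects` is the `Contents` list from a list-objects response parsed from CLI JSON, "unused" only means "nothing modified recently" (the listing does not record reads), and the function names and labels are invented here.

```python
from datetime import datetime, timedelta, timezone


def parse_s3_time(text):
    """Parse timestamps such as 2020-08-06T17:19:10.000Z into aware datetimes."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def classify_bucket(objects, now, max_age_days=90):
    """Return "empty", "unused" or "active" for one bucket's object list."""
    if not objects:
        return "empty"
    cutoff = now - timedelta(days=max_age_days)
    # The bucket counts as unused only if even its newest object is older than the cutoff
    newest = max(parse_s3_time(obj["LastModified"]) for obj in objects)
    return "unused" if newest < cutoff else "active"
```

Calling it with `now=datetime.now(timezone.utc)` for each bucket gives both lists the question asks for in one pass.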
I am trying to bulk update all S3 buckets with default encryption. To do that, I generated a JSON file using the command below:
aws s3api list-buckets --query "Buckets[].Name" >> s3.json
The result was the names of all my S3 buckets.
How do I pass that JSON file into the command so I can enable default encryption?
I also tried the following:
aws s3api list-buckets --query 'Buckets[*].[Name]' --output text | xargs -I {} bash -c 'aws s3api put-bucket-encryption --bucket {} --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}''
But I am getting the error below:
Error parsing parameter '--server-side-encryption-configuration': Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
JSON received: {Rule
aws s3api put-bucket-encryption --bucket bucketnames --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
I have also tried the following, but it does not work:
aws s3api put-bucket-encryption \
--bucket value \
--server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}' \
--cli-input-json file://s3bucket.json
Please let me know how to update my command to enable default encryption.
Below is the code snippet to solve your problem:
#!/bin/bash
# Check whether each bucket has SSE enabled, then encrypt using SSE AES256.
# List all buckets and store the names in an array.
arr=($(aws s3api list-buckets --query "Buckets[].Name" --output text))
# Check the status before encryption:
for i in "${arr[@]}"
do
  echo "Check if SSE is enabled for bucket -> ${i}"
  aws s3api get-bucket-encryption --bucket "${i}" | jq -r '.ServerSideEncryptionConfiguration.Rules[0].ApplyServerSideEncryptionByDefault.SSEAlgorithm'
done
# Encrypt all buckets in your account:
for i in "${arr[@]}"
do
  echo "Encrypting bucket with SSE AES256 for -> ${i}"
  aws s3api put-bucket-encryption --bucket "${i}" --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
done
aws s3api list-buckets --query "Buckets[].Name" \
| jq -r '.[]' \
| xargs -I '{}' aws s3api put-bucket-encryption --bucket {} --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
Worked for me
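The "Invalid JSON" error in the question appears to come from nesting single quotes inside the `bash -c '...'` string, so the JSON loses its quoting before the CLI sees it. A sketch of one way around that, assuming the s3.json file from the question holds a single JSON array of bucket names: let Python serialise the configuration and build argument lists. The constant and function names are hypothetical.

```python
import json

# Same default-encryption rule used in the commands above
SSE_AES256 = {
    "Rules": [
        {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
    ]
}


def encryption_commands(bucket_names_json):
    """Build one put-bucket-encryption argument list per bucket name."""
    names = json.loads(bucket_names_json)
    config = json.dumps(SSE_AES256)  # serialised once, no manual quoting
    return [
        ["aws", "s3api", "put-bucket-encryption", "--bucket", name,
         "--server-side-encryption-configuration", config]
        for name in names
    ]
```

You could read the file with `open("s3.json").read()`, pass the text in, and run each list with `subprocess.run(command, check=True)`, so no shell ever re-parses the JSON.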
If you wanted to do it in Python it would be something like this (not tested!):
import boto3

s3_client = boto3.client('s3')
response = s3_client.list_buckets()
# Each entry in response['Buckets'] is a dict; the bucket name is under 'Name'
for bucket in response['Buckets']:
    s3_client.put_bucket_encryption(
        Bucket=bucket['Name'],
        ServerSideEncryptionConfiguration={
            'Rules': [
                {
                    'ApplyServerSideEncryptionByDefault': {
                        'SSEAlgorithm': 'AES256'
                    }
                },
            ]
        }
    )
Versioning of Amazon S3 buckets is nice, but I don't see any easy way to compare versions of a file - either through the console or through any other app I found.
S3Browser seems to have the best versioning support, but no comparison.
Is there a way to compare versions of a file on S3 without downloading both versions and comparing them manually?
--
EDIT:
I just started thinking that some basic automation should not be too hard, see snippet below. Question remains though: is there any tool that supports this properly? This script may be fine for me, but not for non-dev users.
#!/bin/bash
# s3-compare-last-versions.sh
if [[ $# -ne 2 ]]; then
echo "Usage: `basename $0` <bucketName> <fileKey> "
exit 1
fi
bucketName=$1
fileKey=$2
# List the two most recent versions once and read both IDs from the same response
versions=$(aws s3api list-object-versions --bucket "$bucketName" --prefix "$fileKey" --max-items 2)
latestVersionId=$(echo "$versions" | json 'Versions[0].VersionId')
previousVersionId=$(echo "$versions" | json 'Versions[1].VersionId')
aws s3api get-object --bucket "$bucketName" --key "$fileKey" --version-id "$latestVersionId" "$latestVersionId.js"
aws s3api get-object --bucket "$bucketName" --key "$fileKey" --version-id "$previousVersionId" "$previousVersionId.js"
diff "$latestVersionId.js" "$previousVersionId.js"
I wrote a bash script that downloads the last two versions of an object and compares them using colordiff. I stumbled across this question after writing it and thought I would share it here in case anyone wants to use it.
#!/bin/bash
#This script needs awscli, jq and colordiff. Please install them for your environment
#This script also needs the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION.
#Please set them using the export command as follows or set them using envrc
#export AWS_ACCESS_KEY_ID=<Your AWS Access Key ID>
#export AWS_SECRET_ACCESS_KEY=<Your AWS Secret Access Key>
#export AWS_DEFAULT_REGION=<Your AWS Default Region>
set -e
if [ -z "$1" ] || [ -z "$2" ]; then
echo "Usage:"
echo "version_compare.sh *bucket_name* *file_name*"
echo
echo "Example"
echo "version_compare.sh bucket_name folder/filename.extension"
echo
exit 1;
fi
aws_bucket=$1
file_key=$2
echo Getting the last 2 versions of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api list-object-versions --bucket ${aws_bucket} --prefix ${file_key} --max-items 2
EOF
echo
versions=$(aws s3api list-object-versions --bucket ${aws_bucket} --prefix ${file_key} --max-items 2)
version_1=$( jq -r '.["Versions"][0]["VersionId"]' <<< "${versions}" )
version_2=$( jq -r '.["Versions"][1]["VersionId"]' <<< "${versions}" )
mkdir -p state_comparison_files
echo Getting the latest version ${version_1} of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_1} state_comparison_files/${version_1}
EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_1} state_comparison_files/${version_1} > /dev/null
echo
echo Getting older version ${version_2} of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_2} state_comparison_files/${version_2}
EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_2} state_comparison_files/${version_2} > /dev/null
echo
echo Comparing the different versions.
echo If no differences are found, nothing will be shown
colordiff --unified state_comparison_files/${version_2} state_comparison_files/${version_1}
Here's the link to it
https://gist.github.com/mohamednajiullah/3edc88d314291be40f2dd3cf13ea0d7f
Note: It's pretty much the same as the script the question asker himself created except that it uses jq for json parsing and colordiff for showing the difference with different colors like in git diff.
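If you would rather not depend on colordiff, the comparison step itself can be sketched with Python's standard difflib. This assumes both versions have already been downloaded and read into strings (for example from the files the scripts above write); `diff_versions` and its label parameters are names chosen for this example.

```python
import difflib


def diff_versions(older_text, newer_text,
                  older_label="previous", newer_label="latest"):
    """Return unified-diff lines between two versions of a text object."""
    return list(difflib.unified_diff(
        older_text.splitlines(),
        newer_text.splitlines(),
        fromfile=older_label,
        tofile=newer_label,
        lineterm="",  # lines were split without newlines, so add none back
    ))
```

Printing the result with `print("\n".join(lines))` gives output in the same shape as `diff --unified`, and an empty list means the two versions are identical.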
I'm creating an Electron.js-based desktop app to do exactly this. It's still in development, but it is already usable. I welcome contributions.
https://github.com/mohamednajiullah/s3_object_version_comparator
You can't view file contents at all via S3, so you definitely can't compare the contents of files via S3. You would have to download the different versions and then use a tool like diff to compare them.
You can use MegaSparkDiff, an open-source tool that compares multiple types of data sources, including S3:
https://github.com/FINRAOS/MegaSparkDiff
The pair in the code below contains inLeftButNotInRight and inRightButNotInLeft as DataFrames, which you can save as files or examine in code.
SparkFactory.initializeSparkContext();
AppleTable leftAppleTable = SparkFactory.parallelizeTextSource("S3://file1","table1");
AppleTable rightAppleTable = SparkFactory.parallelizeTextSource("S3://file2","table2");
Pair<Dataset<Row>, Dataset<Row>> resultPair = SparkCompare.compareAppleTables(leftAppleTable, rightAppleTable);
resultPair.getLeft().show(100);
SparkFactory.stopSparkContext();
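To make the idea of the two result sets concrete, here is a tiny Python sketch of a left/right difference over lines of text. It is not how MegaSparkDiff is implemented; it just assumes that duplicate lines count separately, and the function name is invented for illustration.

```python
from collections import Counter


def compare_lines(left_lines, right_lines):
    """Return (in_left_not_right, in_right_not_left) as sorted lists."""
    left, right = Counter(left_lines), Counter(right_lines)
    # Counter subtraction keeps only positive counts, so duplicates are respected
    in_left_not_right = sorted((left - right).elements())
    in_right_not_left = sorted((right - left).elements())
    return in_left_not_right, in_right_not_left
```

Both lists being empty means the two inputs contain the same lines, ignoring order.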