Listing S3 bucket objects with specific storage class - amazon-web-services

It's very time consuming to get objects from Glacier so I decided to use S3 IA storage class instead.
I need to list all the objects in my bucket that have the Glacier storage class (I configured it via a lifecycle policy) and change them to S3 IA.
Is there any script or a tool for that?

You can do that using list-objects.
list-objects returns the StorageClass of each object; in your case you want to filter for objects where it is GLACIER:
aws s3api list-objects --bucket %bucket_name% --query 'Contents[?StorageClass==`GLACIER`]'
Next, reduce the output to just the Key of each matching object:
aws s3api list-objects --bucket %bucket_name% --query 'Contents[?StorageClass==`GLACIER`][Key]' --output text
Then copy each object onto itself, setting the new storage class. Note that an object in the GLACIER class has to be restored (for example with aws s3api restore-object) before it can be copied:
aws s3api list-objects --bucket %bucket_name% --query 'Contents[?StorageClass==`GLACIER`][Key]' --output text | xargs -I {} aws s3 cp s3://%bucket_name%/{} s3://%bucket_name%/{} --storage-class STANDARD_IA
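The xargs pipeline can trip over keys that contain quotes or backslashes, because xargs gives those characters special meaning. A minimal sketch of the same list-then-copy step as a shell loop follows. The function names are hypothetical (they are not part of the AWS CLI), and the sketch assumes the CLI is already configured and that any GLACIER objects have been restored beforehand.

```shell
# Sketch only: function names are illustrative, not part of the AWS CLI.

# Print one key per line for every object in the given storage class.
list_keys_in_class() {
  local bucket="$1" class="$2"
  aws s3api list-objects-v2 --bucket "$bucket" \
    --query "Contents[?StorageClass=='${class}'].[Key]" --output text
}

# Copy each matching object onto itself with a new storage class.
change_storage_class() {
  local bucket="$1" from="$2" to="$3" key
  list_keys_in_class "$bucket" "$from" | while IFS= read -r key; do
    # Skip blank lines so an empty listing does not trigger a copy.
    [ -n "$key" ] || continue
    aws s3 cp "s3://${bucket}/${key}" "s3://${bucket}/${key}" --storage-class "$to"
  done
}
```

It would be called as, for example, change_storage_class my-bucket GLACIER STANDARD_IA, where my-bucket is a placeholder.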

If you need to run this from PowerShell on Windows, this is what I had to do:
aws s3api list-objects --bucket Your_Bucket --query 'Contents[?StorageClass==`STANDARD`][Key]' --output text | foreach { aws s3 cp s3://Your_Bucket/$_ s3://Your_Bucket/$_ --storage-class REDUCED_REDUNDANCY }

Related

How to get the storage class of an object stored in aws bucket

I need to check the storage class of an object inside an S3 bucket.
Is there a way to get the storage class of an S3 object using the AWS CLI v2?
You could use list-objects-v2.
For example:
aws s3api list-objects-v2 --bucket <bucket-name> --prefix <object_key> --query "Contents[*].StorageClass" --output text
Output:
GLACIER
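Since --prefix matches every key that begins with the given string, the command above can print more than one value. For a single known key, head-object is an alternative. The sketch below assumes a hypothetical helper name, object_storage_class, and relies on the fact that S3 leaves the storage class out of the head-object response for STANDARD objects, which the text output shows as None.

```shell
# Sketch only: object_storage_class is an illustrative helper name.
object_storage_class() {
  local bucket="$1" key="$2" class
  class=$(aws s3api head-object --bucket "$bucket" --key "$key" \
    --query 'StorageClass' --output text) || return 1
  # head-object omits StorageClass for STANDARD objects,
  # which --output text renders as the literal string None.
  if [ "$class" = "None" ]; then
    class="STANDARD"
  fi
  printf '%s\n' "$class"
}
```

For example: object_storage_class my-bucket path/to/file, where both arguments are placeholders.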

How to get size of all files in an S3 bucket with versioning?

I know this command can provide the size of all files in a bucket:
aws s3 ls mybucket --recursive --summarize --human-readable
But this does not account for versioning.
If I run this command:
aws s3 ls s3://mybucket/myfile --human-readable
It will show something like "100 MiB" but it may have 10 versions of this file which will be more like "1 GiB" total.
The closest I have is getting the sizes of every version of a given file:
aws s3api list-object-versions --bucket mybucket --prefix "myfile" --query 'Versions[?StorageClass==`STANDARD`].Size' > /tmp/s3_myfile_version_sizes
Then take the sum of all version sizes.
But I would have to rerun this command for every file in a bucket.
Is there an easier way to do this?
You can run list-object-versions on the bucket as a whole:
aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].Size'
Use jq to sum it up:
aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].Size' | jq add
Or, if you need a human readable output:
aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].Size' | jq add | numfmt --to=iec-i --suffix=B
You can also add a prefix if you want to know the size of a given "folder", and print the number of versions alongside the total:
aws s3api list-object-versions --bucket my-bucket --prefix my-folder --query 'Versions[*].Size' | jq 'length, add'
Or you can use jq filtering to write more complex filters, for example, including only non-current objects:
aws s3api list-object-versions --bucket my-bucket --prefix my-folder | jq '[.Versions[]|select(.IsLatest == false)|.Size] | length,add'
If jq is not available, using the --output text option unfortunately results in tab-separated values, so here's a hack to force it to separate lines and then add up the total:
aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].[Size,Size]' --output text | awk '{s+=$1} END {printf "%.0f", s}'
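The summing step can also be written so that it does not depend on how the text output is laid out. A minimal sketch, assuming a hypothetical helper called sum_sizes: it splits on any whitespace (tabs or newlines), so the [Size,Size] trick is not needed, and it ignores non-numeric tokens such as the None that text output prints for an empty result.

```shell
# Sketch only: sum_sizes is an illustrative helper name.
# Reads whitespace-separated byte counts on stdin and prints
# "<count> <total>" on one line.
sum_sizes() {
  awk '{
         for (i = 1; i <= NF; i++) {
           # Count only plain non-negative integers.
           if ($i ~ /^[0-9]+$/) { n++; s += $i }
         }
       }
       END { printf "%d %.0f\n", n, s }'
}
```

For example: aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].Size' --output text | sum_sizes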
If you have a large number of objects, it might be better to use data provided by the Amazon S3 Storage Inventory:
Amazon S3 inventory provides a comma-separated values (CSV) flat-file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or a shared prefix (that is, objects that have names that begin with a common string).
Alternatively, use CloudWatch: its bucket size metric (BucketSizeBytes) counts all versions of the objects, current and noncurrent.

aws s3 ls filter storage class(STANDARD)

How can I list only the files that are in the STANDARD storage class?
I want to exclude glacier class.
Currently here is my command:
aws s3 ls s3://Videos/Action/ --human-readable --summarize
The aws s3 ls command doesn't display the Storage Class, but you can do it with this command:
aws s3api list-objects-v2 --bucket Videos --prefix Action --query "Contents[?StorageClass=='STANDARD'].Key" --output text
The output is tab-separated, so you may have to massage the output to get it in your desired format, eg:
aws s3api list-objects-v2 --bucket Videos --prefix Action --query "Contents[?StorageClass=='STANDARD'].Key" --output text | sed 's/\t/\n/g'
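To get something closer to what aws s3 ls --summarize prints, the query can return the size next to each key and a small awk step can add the totals. A sketch under assumptions: summarize_listing is a hypothetical helper name, and it expects one "<Size><TAB><Key>" line per object, which is what a query such as "Contents[?StorageClass=='STANDARD'].[Size, Key]" produces with --output text.

```shell
# Sketch only: summarize_listing is an illustrative helper name.
# Reads "<Size><TAB><Key>" lines, echoes them, then prints totals.
summarize_listing() {
  awk -F '\t' '
    NF >= 2 { print; n++; s += $1 }
    END { printf "Total Objects: %d\nTotal Size: %.0f bytes\n", n, s }'
}
```

For example: aws s3api list-objects-v2 --bucket Videos --prefix Action --query "Contents[?StorageClass=='STANDARD'].[Size, Key]" --output text | summarize_listing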
To learn how to filter output with the --query option, see:
How to Filter the Output with the --query Option
JMESPath Tutorial

AWS CLI move all files with condition

I must move into another bucket only files changed in the year 2015. How can I write this condition?
aws s3 mv <condition??> s3://bucket1 s3://bucket2 --recursive
I don't think you can do that directly with the aws s3 commands.
What you can do, though, is take a two-step approach:
First, get the list of files that have been modified after a given date (add an upper bound such as LastModified < `2016-01-01` to the filter to keep only 2015):
aws s3api list-objects --bucket bucket1 --query 'Contents[?LastModified > `2015-01-01`].[Key]' --output text
Then move the items in that list.
I have not tried this and I am not a shell expert, but something along these lines:
aws s3api list-objects --bucket bucket1 --query 'Contents[?LastModified > `2015-01-01`].[Key]' --output text | xargs -I {} aws s3 mv s3://bucket1/{} s3://bucket2/{}
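Since the question asks for files changed in 2015 only, the filter needs both a lower and an upper bound. Here is a rough sketch of the two steps with both bounds in place. The helper names year_query and move_year are hypothetical, and the sketch assumes the CLI's JMESPath implementation compares the LastModified timestamp strings lexicographically, which the date filter above also relies on.

```shell
# Sketch only: year_query and move_year are illustrative helper names.

# Build a JMESPath query that keeps keys last modified within one year.
year_query() {
  local year="$1"
  printf "Contents[?LastModified >= '%s-01-01' && LastModified < '%s-01-01'].[Key]\n" \
    "$year" "$((year + 1))"
}

# Move every object modified in the given year from one bucket to another.
move_year() {
  local src="$1" dst="$2" year="$3" key
  aws s3api list-objects-v2 --bucket "$src" \
    --query "$(year_query "$year")" --output text |
  while IFS= read -r key; do
    # Skip blank lines so an empty listing does not trigger a move.
    [ -n "$key" ] || continue
    aws s3 mv "s3://${src}/${key}" "s3://${dst}/${key}"
  done
}
```

For example: move_year bucket1 bucket2 2015. It is probably worth printing the list first and checking it before letting the move run.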

Fast way to get AWS S3 key count in bucket

Does anyone know a fast way to get the number of keys in an S3 bucket?
I usually do s3cmd ls s3://bucket/ | wc -l but my bucket contains a huge number of keys which makes this operation impossible to finish.
Try this to count the objects in the bucket using aws s3api:
aws s3api list-objects-v2 --bucket $bucketNameToUse --query '[length(Contents[?LastModified].{Key: Key})]' --output text
Done.
This is a super old question, but I figured I'd add my $.02.
aws s3api list-objects-v2 --bucket $YOUR_BUCKET --no-cli-pager --query "Contents[].Key" --output text | wc -w
This will give you a count that's pretty close (wc -w counts words, so a key containing spaces is counted more than once), and it works better for me than Vipin's answer.
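If keys with spaces matter, one option is to print one key per line and count lines instead of words. A minimal sketch, with count_keys as a hypothetical helper name. One known limitation: for an empty bucket the text output may print the literal None, which this sketch would count as one key.

```shell
# Sketch only: count_keys is an illustrative helper name.
count_keys() {
  local bucket="$1"
  # .[Key] puts each key on its own line in text output;
  # grep -c . then counts the non-empty lines.
  aws s3api list-objects-v2 --bucket "$bucket" --no-cli-pager \
    --query 'Contents[].[Key]' --output text | grep -c .
}
```

For example: count_keys "$YOUR_BUCKET". This still lists every key, so it is no faster than the commands above on a very large bucket.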