We want to find the upload/creation date of the oldest object present in an AWS S3 bucket.
Could you please suggest how we can get it?
You can use the AWS Command-Line Interface (CLI) to list objects sorted by a field:
aws s3api list-objects --bucket MY-BUCKET --query 'sort_by(Contents, &LastModified)[0].[Key,LastModified]' --output text
This gives an output like:
foo.txt 2021-08-17T21:53:46+00:00
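If you instead want the newest object, the same query with [-1] (the last element after the sort) should return it:
aws s3api list-objects --bucket MY-BUCKET --query 'sort_by(Contents, &LastModified)[-1].[Key,LastModified]' --output text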
See also: How to list recent files in AWS S3 bucket with AWS CLI or Python
A related command:
aws s3api list-objects-v2 --bucket BUCKET_NAME --query 'Contents[?LastModified>=`YYYY-MM-DD`].Key'
In the above you have to enter the date manually, but I want it to take today's date and subtract one day (or more, depending on the demand).
What would the syntax look like if you want files from one day back?
Is it possible to get today's date and use it with some [-1] offset with either aws s3api or aws s3 ls?
No, you can't do this within that single AWS CLI command.
You have to build the date variable before running the AWS CLI command, and use double quotes around the --query string so the variable actually expands.
DATE=$(date +%Y-%m-%d)
aws s3api list-objects-v2 --bucket BUCKET_NAME --query "Contents[?LastModified>=\`$DATE\`].Key"
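For files from one day back, you can have the shell compute yesterday's date (a sketch; assumes GNU date, as on most Linux systems; on macOS/BSD use date -v-1d +%Y-%m-%d instead):
# Yesterday's date in YYYY-MM-DD form (GNU date)
DATE=$(date -d '1 day ago' +%Y-%m-%d)
# List keys modified on or after that date
aws s3api list-objects-v2 --bucket BUCKET_NAME --query "Contents[?LastModified>=\`$DATE\`].Key" --output text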
I currently use the AWS CLI to retrieve a whole text file from S3 using the following command:
aws --profile=cloudian --endpoint-url=https://s3-abc.abcstore.abc.net s3 cp s3://abc-store/STORE1/abc2/ABC/test_08.txt test.txt
How can I just get the file size?
If you just want the size, use the head-object operation.
aws --profile=cloudian --endpoint-url=https://s3-abc.abcstore.abc.net s3api head-object --bucket abc-store --key STORE1/abc2/ABC/test_08.txt --query 'ContentLength'
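ContentLength is the object's size in bytes. If you want it in a shell variable for further use, something like this should work (a minimal sketch, reusing the same profile, endpoint, bucket and key):
# Store the size (in bytes) reported by head-object
SIZE=$(aws --profile=cloudian --endpoint-url=https://s3-abc.abcstore.abc.net s3api head-object --bucket abc-store --key STORE1/abc2/ABC/test_08.txt --query 'ContentLength' --output text)
echo "test_08.txt is ${SIZE} bytes"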
I want to copy the latest CSV file (which has the date appended to its name) from an AWS S3 bucket to a local drive.
I have basic code that will download a file, but it downloads all the files in the bucket. I only want the file uploaded that day, i.e. the latest file.
Download latest object by modified date
If you only wish to grab the file that was last stored on Amazon S3, you could use:
aws s3 cp s3://my-bucket/`aws s3api list-objects-v2 --bucket my-bucket --query 'sort_by(Contents, &LastModified)[-1].Key' --output text` .
This command does the following:
The inner aws s3api list-objects-v2 command lists the bucket contents, sorts them by LastModified and takes the last element ([-1]), returning the Key (filename) of the most recently modified object
The outer aws s3 cp command downloads that object to the local directory
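Since the files you want are CSVs, you could also restrict the inner query to .csv keys before sorting. A sketch (my-bucket is a placeholder; $( ) command substitution is used instead of backticks so the quotes inside the query survive):
aws s3 cp s3://my-bucket/$(aws s3api list-objects-v2 --bucket my-bucket --query "sort_by(Contents[?ends_with(Key, '.csv')], &LastModified)[-1].Key" --output text) .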
Download latest object based on filename
If your filenames are like:
some_file_20190130.csv
some_file_20190131.csv
some_file_20190201.csv
then you can list by prefix and copy the last one:
aws s3 cp s3://my-bucket/`aws s3api list-objects-v2 --bucket my-bucket --prefix some_file_ --query 'sort_by(Contents, &Key)[-1].Key' --output text` .
This command does the following:
The inner aws s3api list-objects-v2 command lists the bucket, only shows files with the given prefix of some_file_, sorts them by Key and takes the last element ([-1]), returning the Key (filename) of the object that sorts last, which is the latest date in the filename
The outer aws s3 cp command downloads that object to the local directory
Our bucket structure goes from MyBucket -> CustomerGUID(folder) -> [actual files]
I'm having a hell of a time trying to use the AWS CLI (on Windows) --query option to locate a file across all of the customer folders. Can someone look at my --query and see what I'm doing wrong here? Or tell me the proper way to search for a specific file name?
This is an example of how I'm able to list ALL the files in the bucket by LastModified date.
I need to limit the output based on filename, and that is where I'm getting stuck. When I look at the individual files in S3, I can see each file has a "Key". Is the Key the 'name' of the file?
aws s3 ls s3://mybucket --recursive --output text --query "Contents[?contains(LastModified) > '2018-12-8']"
The aws s3 ls command only returns a text list of objects.
If you wish to use --query, then use: aws s3api list-objects
See: list-objects — AWS CLI Command Reference
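And yes, the Key is the full name of the object, including any folder path. So to find a specific file name across all of the customer folders, something like this should work (a sketch; mybucket and myfile.csv are placeholders):
aws s3api list-objects-v2 --bucket mybucket --query "Contents[?contains(Key, 'myfile.csv')].[Key, LastModified]" --output text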
We have ~400,000 files in a private S3 bucket that are inbound/outbound call recordings. The files follow a naming pattern that lets me search for numbers, both inbound and outbound. Note these calls are in the Glacier storage class.
Using the AWS CLI, I can search through this bucket and grep out the files I need. What I'd like to do now is initiate an S3 restore job with expedited retrieval (so ~1-5 minute recovery time), and then maybe 30 minutes later run a command to download the files.
My efforts so far:
aws s3 ls s3://exetel-logs/ --recursive | grep .*042222222.* | cut -c 32-
This retrieves the keys of about 200 files. I am unsure how to proceed next, as aws s3 cp won't work for objects in the Glacier storage class.
Cheers,
The AWS CLI has two separate sets of commands for S3: s3 and s3api. s3 is a high-level abstraction with limited features, so for restoring files, you'll have to use one of the commands available with s3api:
aws s3api restore-object --bucket exetel-logs --key your-key --restore-request '{"Days":1,"GlacierJobParameters":{"Tier":"Expedited"}}'
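To kick off expedited restores for every key your grep finds, a loop along these lines should work (a sketch, reusing the 042222222 pattern and cut offset from your question; Days controls how long the restored copy stays available):
aws s3 ls s3://exetel-logs/ --recursive | grep .*042222222.* | cut -c 32- | while read -r key; do
  # Request an expedited restore for each matching key
  aws s3api restore-object --bucket exetel-logs --key "${key}" --restore-request '{"Days":1,"GlacierJobParameters":{"Tier":"Expedited"}}'
done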
If you afterwards want to copy the files, but want to ensure that you only copy files which have already been restored from Glacier, you can use the following code snippet:
for key in $(aws s3api list-objects-v2 --bucket exetel-logs --query "Contents[?StorageClass=='GLACIER'].[Key]" --output text); do
if [ $(aws s3api head-object --bucket exetel-logs --key ${key} --query "contains(Restore, 'ongoing-request=\"false\"')") == true ]; then
echo ${key}
fi
done
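Once a key prints as restored, swapping the echo for something like aws s3 cp s3://exetel-logs/${key} . would download it, since a regular copy works again after the restore has completed.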
Have you considered using a high-level language SDK for the AWS APIs instead of the AWS CLI? It will make these kinds of tasks easier to integrate into your workflows. I prefer the Python implementation (Boto 3). Here is example code for how to download all files from an S3 bucket.