I'm trying to list the 'folders' in an S3 bucket under a given prefix.
aws --profile my-profile s3api list-objects-v2 --bucket my-bucket --prefix releases/com/example/app/ --delimiter / --query 'CommonPrefixes[*].Prefix'
There are dozens of folders under the prefix, each containing many files, so they do exist and I should be able to list them.
There is no CommonPrefixes returned by this query, so I get null as an output. What am I doing wrong?
I'm running aws-cli/2.0.0 Python/3.7.5 Windows/10 botocore/2.0.0dev4 in a Git Bash terminal on Windows.
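One likely cause, offered as an assumption because it depends on the shell: Git Bash (MSYS) rewrites arguments that look like POSIX paths into Windows paths before handing them to native programs, so --delimiter / can reach the CLI as a Windows path instead of a single slash. A delimiter that never occurs in the keys yields no CommonPrefixes. A minimal sketch that switches the conversion off for this one call with the MSYS_NO_PATHCONV variable (list_folders is a hypothetical helper name, not part of the AWS CLI):

```shell
# Hypothetical helper: list the "folders" directly under a prefix.
# $1 = bucket, $2 = prefix (should end with "/").
# MSYS_NO_PATHCONV=1 asks Git Bash not to rewrite "/" into a Windows path;
# other shells simply ignore the variable.
list_folders() {
    MSYS_NO_PATHCONV=1 aws s3api list-objects-v2 \
        --bucket "$1" \
        --prefix "$2" \
        --delimiter / \
        --query 'CommonPrefixes[*].Prefix' \
        --output text
}

# Usage (bucket and prefix taken from the question):
# list_folders my-bucket releases/com/example/app/
```

Add --profile my-profile to the call, or export AWS_PROFILE, if you need a named profile. Running the original command from cmd.exe or PowerShell is another way to check whether the shell is the culprit.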
I have a cron job set that moves the files from an EC2 instance to S3
aws s3 mv --recursive localdir s3://bucket-name/ --exclude "*" --include "localdir/*"
After that, I run aws s3 sync s3://bucket-name/data1/ E:\Datafolder from a .bat file scheduled with Windows Task Scheduler.
The issue is that the s3 sync command copies every file under the data1/ prefix.
So let's say I have the following files:
Day1: file1 is synced to local.
Day2: file1 and file2 are synced to local because file1 is removed from the local machine's folder.
I don't want them to occupy space on local machine. On Day 2, I just want file2 to be copied over.
Can this be accomplished with AWS CLI commands, or do I need to write a Lambda function?
I followed the answer from Get last modified object from S3 using AWS CLI
but on Windows, the | and awk commands are not working as expected.
To obtain the name of the object that has the most recent Last Modified date, you can use:
aws s3api list-objects-v2 --bucket BUCKET-NAME --query 'sort_by(Contents, &LastModified)[-1].Key' --output text
Therefore (using shell syntax), you could use:
object=`aws s3api list-objects-v2 --bucket BUCKET-NAME --prefix data1/ --query 'sort_by(Contents, &LastModified)[-1].Key' --output text`
aws s3 cp "s3://BUCKET-NAME/$object" 'E:\Datafolder'
You might need to tweak it to get it working on Windows.
Basically, it gets the bucket listing, sorts by LastModified, then grabs the name of the last object in the list.
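A slightly more defensive version of the two shell lines above can be sketched as a function. copy_latest is a hypothetical name, and the check for the literal text None assumes the CLI's text output prints None when the query result is null:

```shell
# Hypothetical helper: copy the most recently modified object under a prefix.
# $1 = bucket, $2 = prefix, $3 = local destination directory
copy_latest() {
    # Ask S3 for the key with the newest LastModified timestamp.
    object=$(aws s3api list-objects-v2 --bucket "$1" --prefix "$2" \
        --query 'sort_by(Contents, &LastModified)[-1].Key' --output text) || return 1

    # Nothing to copy if the listing was empty or the query returned null.
    if [ -z "$object" ] || [ "$object" = "None" ]; then
        echo "no objects found under s3://$1/$2" >&2
        return 1
    fi

    aws s3 cp "s3://$1/$object" "$3"
}

# Usage: copy_latest BUCKET-NAME data1/ 'E:\Datafolder'
```

Quoting the key and the destination keeps keys containing spaces, and the backslash in the Windows path, intact.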
Here is the answer modified to work in a Windows .bat file (run by cmd.exe):
for /f "delims=" %%i in ('aws s3api list-objects-v2 --bucket BUCKET-NAME --prefix data1/ --query "sort_by(Contents, &LastModified)[-1].Key" --output text') do set object=%%i
aws s3 cp s3://BUCKET-NAME/%object% E:\Datafolder
I am using an AWS CLI command on my Windows machine to get the latest file from an S3 bucket.
aws s3 ls s3://Bucket-name --recursive | sort |tail -n 1
Up to this point it works and lists all the files sorted by date:
aws s3 ls s3://Bucket-name --recursive | sort
But the full command throws an error:
'tail' is not recognized as an internal or external command.
Is there an alternative to tail, or to the full command?
The AWS CLI permits JMESPath expressions in the --query parameter.
This command shows the most recently updated object (in cmd.exe, wrap the --query expression in double quotes instead of single quotes):
aws s3api list-objects --bucket my-bucket --query 'sort_by(Contents, &LastModified)[-1].Key' --output text
It's basically saying:
Sort by LastModified
Obtain the last [-1] entry
Show the Key (filename)
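If you prefer to keep the aws s3 ls ... | sort pipeline from the question, for example in Git Bash where tail is available, the same three steps can be sketched on the plain-text listing. This assumes each listing line has the form "date time size key"; latest_key_from_ls is a hypothetical helper name:

```shell
# Hypothetical helper: read `aws s3 ls --recursive` output on stdin and
# print the key of the most recently modified object.
latest_key_from_ls() {
    # 1. sort: lines start with "YYYY-MM-DD HH:MM:SS", so text order is date order
    # 2. tail: keep the last (newest) line
    # 3. sed:  strip the date, time and size columns, leaving only the key
    sort | tail -n 1 | sed -E 's/^[^ ]+ +[^ ]+ +[0-9]+ //'
}

# Usage: aws s3 ls s3://Bucket-name --recursive | latest_key_from_ls
```

The sed expression only removes the first three columns, so keys that contain spaces are kept whole.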
I want to copy the Latest CSV file which has the date appended from an AWS S3 bucket to a local drive.
I have basic code that downloads the file, but it downloads every file in the bucket. I only want the latest file, the one uploaded that day.
Download latest object by modified date
If you only wish to grab the file that was last stored on Amazon S3, you could use:
aws s3 cp s3://my-bucket/`aws s3api list-objects-v2 --bucket my-bucket --query 'sort_by(Contents, &LastModified)[-1].Key' --output text` .
This command does the following:
The inner aws s3api list-objects-v2 command lists the bucket, sorts the objects by LastModified (oldest first), then returns the Key (filename) of the last entry, which is the most recently modified object
The outer aws s3 cp command downloads that object to the local directory
Download latest object based on filename
If your filenames are like:
some_file_20190130.csv
some_file_20190131.csv
some_file_20190201.csv
then you can list by prefix and copy the last one:
aws s3 cp s3://my-bucket/`aws s3api list-objects-v2 --bucket my-bucket --prefix some_file_ --query 'sort_by(Contents, &Key)[-1].Key' --output text` .
This command does the following:
The inner aws s3api list-objects-v2 command lists the bucket, keeps only objects whose key starts with the prefix some_file_, sorts them by Key in ascending order, then returns the Key (filename) of the last entry in that order
The outer aws s3 cp command downloads that object to the local directory
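Since the question asks for the file uploaded that day, another option is to build the key from the date rather than listing the bucket at all. A minimal sketch, assuming the keys follow the some_file_YYYYMMDD.csv pattern shown above; dated_key is a hypothetical helper name:

```shell
# Hypothetical helper: build the key for a given date (default: today).
# $1 = filename prefix, $2 = optional date in YYYYMMDD form
dated_key() {
    printf '%s%s.csv\n' "$1" "${2:-$(date +%Y%m%d)}"
}

# Usage:
# aws s3 cp "s3://my-bucket/$(dated_key some_file_)" .
# aws s3 cp "s3://my-bucket/$(dated_key some_file_ 20190201)" .
```

This avoids the listing call, but the copy fails if no file with that day's date has been uploaded yet.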
I want to get a list of all files in an S3 bucket that match a particular naming pattern.
For example, if I have files like:
aaaa2018-05-01
aaaa2018-05-23
aaaa2018-06-30
aaaa2018-06-21
I need to get a list of all files for the 5th month. The output should look like:
aaaa2018-05-01
aaaa2018-05-23
I executed the following command and the result was empty:
aws s3api list-objects --bucket bucketname --query "Contents[?contains(Key, 'aaaa2018-05-*')]" > s3list05.txt
When I check s3list05.txt, it is empty. I also tried the command below:
aws s3 ls s3:bucketname --recursive | grep aaaa2018-05* > s3list05.txt
This command lists all the objects instead of only the matching ones.
Kindly let me know the exact command to get the desired output.
You are almost there. Try this:
aws s3 ls s3://bucketname --recursive | grep aaaa2018-05
or
aws s3 ls bucketname --recursive | grep aaaa2018-05
The contains() function matches a literal substring, so it doesn't need (or understand) a wildcard:
aws s3api list-objects --bucket bucketname --query "Contents[?contains(Key, 'aaaa2018-05')].[Key]" --output text
This provides a list of Keys.
--output text removes the JSON formatting.
Using [Key] instead of just Key puts each key on its own line rather than all on one tab-separated line.
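As for why the original grep listed everything: in a regular expression, 5* means "zero or more 5s", so aaaa2018-05* matches any line containing aaaa2018-0, which includes the June files. Left unquoted, the * may also be expanded by the shell first. A small self-contained demonstration using the four names from the question (sample_keys is a hypothetical stand-in for the bucket listing):

```shell
# Hypothetical stand-in for the bucket listing.
sample_keys() {
    printf '%s\n' aaaa2018-05-01 aaaa2018-05-23 aaaa2018-06-30 aaaa2018-06-21
}

# "05*" means "0" followed by zero or more "5", so June matches as well.
sample_keys | grep -c 'aaaa2018-05*'    # prints 4

# A plain substring is enough to select only May.
sample_keys | grep 'aaaa2018-05'
```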
How can I list only the files in the STANDARD storage class?
I want to exclude the Glacier class.
Currently here is my command:
aws s3 ls s3://Videos/Action/ --human-readable --summarize
The aws s3 ls command doesn't display the Storage Class, but you can do it with this command:
aws s3api list-objects-v2 --bucket Videos --prefix Action --query "Contents[?StorageClass=='STANDARD'].Key" --output text
The output is tab-separated, so you may have to massage the output to get it in your desired format, eg:
aws s3api list-objects-v2 --bucket Videos --prefix Action --query "Contents[?StorageClass=='STANDARD'].Key" --output text | sed 's/\t/\n/g'
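The \t escape in a sed pattern may not be understood by every sed implementation, so here is a hedged alternative using tr, which should behave the same on any POSIX system. one_per_line is a hypothetical helper name:

```shell
# Hypothetical helper: turn tab-separated text on stdin into one item per line.
one_per_line() {
    tr '\t' '\n'
}

# Usage:
# aws s3api list-objects-v2 --bucket Videos --prefix Action \
#     --query "Contents[?StorageClass=='STANDARD'].Key" --output text | one_per_line
```

Another option is to change the query to end in .[Key] so that the text output already comes out one key per line.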
To learn how to filter output with the --query option, see:
How to Filter the Output with the --query Option
JMESPath Tutorial