Logrotate Postrotate aws s3 Wildcards - amazon-web-services

I am trying to rotate a bunch of log files and upload them to S3 with the postrotate command.
However, it appears that the postrotate script is not expanding the * glob wildcard:
My logrotate configuration:
/var/log/application/*.log {
missingok
dateext
size 500M
notifempty
copytruncate
compress
rotate 1512
postrotate
/usr/bin/aws s3 mv /var/log/application/*.gz s3://mygreatbucket/
endscript
}
The error I see when running logrotate with that configuration:
The user-provided path /var/log/application/*.gz does not exist.
This message comes from the aws s3 CLI command, and I can reproduce it by running the command manually with the path quoted:
/usr/bin/aws s3 mv '/var/log/application/*.gz' s3://mygreatbucket
(note the single quotes).
What can I do so that the glob wildcard is expanded during the postrotate step?

The AWS CLI documentation states that the CLI tool does not directly support glob wildcards. Instead, you should use the --include and --exclude parameters.
I ended up using:
/usr/bin/aws s3 mv /var/log/application/ s3://mybucket --exclude '*' --include '*.gz' --recursive
The --recursive flag is important, otherwise it won't work.
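As background to the error in the question, here is a minimal sketch of how sh and bash (with default options) hand a glob to a command. The helper name count_args and the temporary files are made up for the illustration; the point is that a quoted pattern, or a pattern that matches nothing at the moment the command runs, reaches the program as one literal string, which is what the aws CLI then reports as a path that does not exist.

```shell
# Hypothetical helper: report how many arguments a command actually receives.
count_args() { echo "$#"; }

workdir=$(mktemp -d)
touch "$workdir/a.gz" "$workdir/b.gz"

unquoted=$(count_args "$workdir"/*.gz)     # the shell expands the glob: 2 arguments
quoted=$(count_args "$workdir/*.gz")       # quoted: 1 literal argument, never expanded
unmatched=$(count_args "$workdir"/*.zip)   # no match: the pattern is passed through literally

echo "unquoted=$unquoted quoted=$quoted unmatched=$unmatched"
rm -rf "$workdir"
```

When the glob does expand to several files, aws s3 mv receives several source paths, which it does not accept either, so the --exclude/--include form is the safer choice in both cases.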

Related

How to copy multiple files matching name pattern to AWS S3 bucket using AWS CLI?

I would like to copy files matching a file name pattern from my machine to an AWS S3 bucket using AWS CLI. Using the standard unix file name wildcards does not work:
$ aws s3 cp *.csv s3://wesam-data/
Unknown options: file1.csv,file2.csv,file3.csv,s3://wesam-data/
I followed this SO answer addressing a similar problem, which advises using the --exclude and --include filters as shown below, without success:
$ aws s3 cp . s3://wesam-data/ --exclude "*" --include "*.csv"
Solution
$ aws s3 cp . s3://wesam-data/ --exclude "*" --include "*.csv" --recursive
Explanation
It turns out that I have to use the --recursive flag with the --include & --exclude flags since this is a multi-file operation.
The following commands are single file/object operations if no --recursive flag is provided.
cp
mv
rm
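One way to avoid forgetting --recursive is to wrap the pattern copy in a small function. The following is only a sketch: the function name s3_cp_matching and the S3_EXECUTE switch are hypothetical, and by default it prints the command instead of running it, so it can be tried without touching a bucket.

```shell
# Hypothetical helper: build an `aws s3 cp` call that copies only the files
# matching PATTERN and always includes --recursive.
s3_cp_matching() {
  src=$1
  dest=$2
  pattern=$3
  set -- aws s3 cp "$src" "$dest" --recursive --exclude '*' --include "$pattern"
  if [ "${S3_EXECUTE:-0}" = 1 ]; then
    "$@"                  # really run the AWS CLI
  else
    printf '%s\n' "$*"    # default: only print the command (quoting is not shown)
  fi
}

s3_cp_matching . s3://wesam-data/ '*.csv'
```

Setting S3_EXECUTE=1 in the environment makes the same call run the CLI with each pattern passed as a single, unexpanded argument.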

Deleting S3 files using AWS data pipeline

I want to delete all S3 keys starting with some prefix using AWS data Pipeline.
I am using AWS Shell Activity for this.
These are the arguments:
"scriptUri": "https://s3.amazonaws.com/my_s3_bucket/hive/removeExitingS3.sh",
"scriptArgument": "s3://my_s3_bucket/output/2017-03-19",
I want to delete all S3 keys starting with 2017-03-19 in the output folder. What should the command be?
I have tried this command in .sh file
sudo yum -y upgrade aws-cli
aws s3 rm $1 --recursive
This is not working.
Sample files are
s3://my_s3_bucket/output/2017-03-19/1.txt
s3://my_s3_bucket/output/2017-03-19/2.txt
s3://my_s3_bucket/output/2017-03-19_3.txt
EDIT:
The date (2017-03-19) is dynamic; it is the output of #{format(#scheduledStartTime,"YYYY-MM-dd")}. So effectively:
"scriptArgument": "s3://my_s3_bucket/output/#{format(#scheduledStartTime,"YYYY-MM-dd")}"
Try
aws s3 rm $1 --recursive --exclude "*" --include "2017-03-19*" --include "2017-03-19/*"
with
"scriptArgument": "s3://my_s3_bucket/output/"
EDIT:
As the date is a dynamic param, pass it as the second scriptArgument to the Shell command activity,
aws s3 rm $1 --recursive --exclude "*" --include "$2*" --include "$2/*"
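To see which of the sample keys these filters would select, the matching can be approximated locally. This is a rough sketch, not the CLI's own code: it assumes the CLI's * behaves like * in a shell case pattern (matching any characters, including /). The helper would_delete and the 2017-03-20 key are made up for the illustration; keys are given relative to s3://my_s3_bucket/output/.

```shell
# Hypothetical helper: approximates --exclude "*" --include "$date*"
# for a key given relative to the source prefix.
would_delete() {
  key=$1
  date=$2
  verdict=keep                     # --exclude "*" drops everything first
  case $key in
    "$date"*) verdict=delete ;;    # --include "$date*" re-adds keys with the date prefix
  esac
  echo "$verdict"
}

for key in 2017-03-19/1.txt 2017-03-19_3.txt 2017-03-20/1.txt; do
  echo "$key: $(would_delete "$key" 2017-03-19)"
done
```

Under that assumption the single pattern "$2*" already covers the keys below 2017-03-19/, so the additional --include "$2/*" is redundant but harmless.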

Glob pattern with amazon s3

I want to move files from one S3 bucket to another. I only want to move files whose names start with "part". I can do it using Java, but is it possible with the AWS CLI? Can we use a glob pattern in the CLI?
My object names are like:
part0000
part0001
Yes, this is possible through the aws CLI, using the --include and --exclude options.
As an example, you can use the aws s3 sync command to sync your part files:
aws s3 sync --exclude '*' --include 'part*' s3://my-amazing-bucket/ s3://my-other-bucket/
You can also use the cp command, with the --recursive flag:
aws s3 cp --recursive --exclude '*' --include 'part*' s3://my-amazing-bucket/ s3://my-other-bucket/
Explanation:
aws: The aws CLI command
s3: The aws service to interface with
sync: The operation to perform on the service
--exclude <value>: A UNIX-style wildcard pattern for files to skip, unless a later --include matches them
--include <value>: A UNIX-style wildcard pattern for files to act upon
As noted in the documentation, you can also specify --include and --exclude multiple times.

Selective file download in AWS CLI

I have files in an S3 bucket. I am trying to download files based on a date, such as 8 August, 9 August, etc.
I used the following code, but it still downloads the entire bucket:
aws s3 cp s3://bucketname/ folder/file \
--profile pname \
--exclude \"*\" \
--recursive \
--include \"" + "2015-08-09" + "*\"
I am not sure how to achieve this. How can I download only the files for a given date?
This command will copy all files starting with 2015-08-15:
aws s3 cp s3://BUCKET/ folder --exclude "*" --include "2015-08-15*" --recursive
If your goal is to synchronize a set of files without copying them twice, use the sync command:
aws s3 sync s3://BUCKET/ folder
That will copy all files that have been added or modified since the previous sync.
The sync counterpart of the above cp command, which additionally skips files that are already up to date in the destination, is:
aws s3 sync s3://BUCKET/ folder --exclude "*" --include "2015-08-15*"
References:
AWS CLI s3 sync command documentation
AWS CLI s3 cp command documentation
Bash Command to copy all files for specific date or month to current folder
aws s3 ls s3://bucketname/ | grep '2021-02' | awk '{print $4}' | xargs -I {} aws s3 cp s3://bucketname/{} folder
The command does the following:
Lists all the files in the bucket
Keeps only the lines containing 2021-02, i.e. the files from February 2021
Extracts just the file name from each line
Runs aws s3 cp on each of those files
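The filtering steps can be tried offline on a canned listing. The sketch below assumes the usual aws s3 ls line layout (date, time, size, key); the function name keys_for_month and the sample listing are invented for the illustration. It matches the month against the date column only, whereas a plain grep would also pick up a key that merely contains 2021-02 in its name.

```shell
# Hypothetical helper: read `aws s3 ls` style lines on stdin and print the
# keys whose last-modified date starts with the given YYYY-MM.
keys_for_month() {
  awk -v month="$1" 'index($1, month) == 1 { print $4 }'
}

sample_listing='2021-01-30 09:00:00        120 january.csv
2021-02-03 10:15:22       1234 report-a.csv
2021-02-17 23:59:59         88 report-b.csv
2021-03-01 00:00:01         10 notes-2021-02.txt'

printf '%s\n' "$sample_listing" | keys_for_month 2021-02
```

Because awk splits on whitespace, keys containing spaces would be truncated at the first space; taking the key from a fixed column offset avoids that.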
If your bucket is large, upwards of 10 to 20 GB (as it was in my own use case), you can achieve the same goal by running sync in multiple terminal windows.
All the terminal sessions can use the same token, in case you need to generate a token for a prod environment.
$ aws s3 sync s3://bucket-name/sub-name/another-name folder-name-in-pwd/ \
    --exclude "*" --include "name_date1*" --profile UR_AC_SomeName
and in another terminal window (same pwd):
$ aws s3 sync s3://bucket-name/sub-name/another-name folder-name-in-pwd/ \
    --exclude "*" --include "name_date2*" --profile UR_AC_SomeName
and another two for "name_date3*" and "name_date4*"
You can also specify multiple excludes in the same sync command, as in:
$ aws s3 sync s3://bucket-name/sub-name/another-name my-local-path/ \
    --exclude="*.log/*" --exclude=img --exclude=".error" --exclude=tmp \
    --exclude="*.cache"
This Bash script copies all files from one bucket to another by modified date using the AWS CLI. First, list the keys modified in the chosen month:
aws s3 ls <BCKT_NAME> --recursive | sort | grep "2020-08-" | cut -b 32- > a.txt
Then, inside the Bash file:
while IFS= read -r line; do
  # Quote the URLs so that keys containing spaces stay in one argument
  aws s3 cp "s3://<SRC_BCKT>/${line}" "s3://<DEST_BCKT>/${line}" --sse AES256
done < a.txt
The AWS CLI is really slow at this. I waited hours and nothing really happened, so I looked for alternatives.
https://github.com/peak/s5cmd worked great.
It supports globs, for example:
s5cmd -numworkers 30 cp 's3://logs-bucket/2022-03-30-19-*' .
It is also very fast, so you can work with buckets that hold S3 access logs without much fuss.

uploading all files of a certain extension type

I'm trying to upload all files of type .flv to an S3 bucket using the AWS CLI from a Windows server 2008 command line.
I do this:
aws s3 sync . s3://MyBucket --exclude '*.png'
And it begins uploading .png files instead.
I'm trying to follow the documentation and it gives an example that reads:
Local directory contains 3 files:
MyFile1.txt
MyFile2.rtf
MyFile88.txt
aws s3 sync . s3://MyBucket/MyFolder --exclude '*.txt'
upload: MyFile2.rtf to s3://MyBucket/MyFolder/MyFile2.rtf
So what am I doing wrong?
Use:
aws s3 sync . s3://MyBucket/ --exclude "*" --include "*.flv"
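This excludes all files, then re-includes the .flv files. The order of the parameters is important, because filters that appear later in the command take precedence over earlier ones. Note the double quotes: the Windows command prompt does not treat single quotes as quoting characters, so a pattern such as '*.png' reaches the CLI with the quotes still attached.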
You can also use:
aws s3 cp . s3://MyBucket/ --recursive --exclude "*" --include "*.flv"
The difference is that sync will not re-copy a file that already exists in the destination.
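The effect of filter order can be illustrated with a small local simulation. This is a sketch under stated assumptions, not the CLI's implementation: apply_filters is a made-up helper that uses shell case patterns as a stand-in for the CLI's wildcard matching and lets each later matching filter override the earlier ones.

```shell
# Hypothetical helper: decide whether FILE survives a list of filters.
# usage: apply_filters FILE [--exclude PATTERN | --include PATTERN]...
apply_filters() {
  file=$1
  shift
  result=included                  # with no filters, every file is included
  while [ "$#" -ge 2 ]; do
    # $2 is left unquoted inside the case pattern so it acts as a wildcard
    case $file in
      $2)
        if [ "$1" = --include ]; then
          result=included
        else
          result=excluded
        fi
        ;;
    esac
    shift 2
  done
  echo "$result"
}

apply_filters video.flv --exclude '*' --include '*.flv'   # included
apply_filters image.png --exclude '*' --include '*.flv'   # excluded
apply_filters video.flv --include '*.flv' --exclude '*'   # excluded: the later --exclude wins
```

The last call shows why swapping the two options uploads nothing: the trailing --exclude "*" overrides the earlier --include.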