AWS S3 `--exclude` being ignored

I can't make --exclude work with AWS S3. None of the three versions of the command below works. No matter how I exclude the directories, they are still being uploaded.
root#taurus [/]# aws s3 sync / s3://server.taurus --exclude "disk3/*" --exclude "backup/*"
root#taurus [/]# aws s3 sync / s3://server.taurus --exclude 'disk3/*' --exclude 'backup/'
root#taurus [/]# aws s3 sync / s3://server.taurus --exclude 'disk3/' --exclude 'backup/'
Please see my AWS CLI version below.
root#taurus [/]# aws --version
aws-cli/1.10.14 Python/2.6.6 Linux/2.6.32-531.29.2.lve1.3.11.1.el6.x86_64.debug botocore/1.4.5
root#taurus [/]#
What could be wrong?

From the AWS Command-Line Interface (CLI) documentation for the sync command:
--include (string) Don't exclude files or objects in the command that match the specified pattern. See Use of Exclude and Include Filters for details.
--exclude (string) Exclude all files or objects from the command that matches the specified pattern.
So (strange as it may seem), you must specify objects to --include AND objects to --exclude. Using --include "*" is acceptable.
Specifying --exclude on its own will not match any files.
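Applied to the commands in the question, that would look something like this (a sketch reusing the bucket name from the question; --dryrun previews what would be uploaded without transferring anything):
aws s3 sync / s3://server.taurus --include "*" --exclude "disk3/*" --exclude "backup/*" --dryrun
Since later filters take precedence, the excludes must come after --include "*".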

Related

Exclude macOS hidden files from AWS S3 sync

I'm syncing the entire contents of an external hard drive, used with macOS, to an S3 bucket. I'd like to exclude all macOS hidden files.
I've tried:
aws s3 sync --dryrun --exclude "^\." --exclude "\/\." ./ s3://bucketname
However, the result when I run that is exactly the same as just:
aws s3 sync --dryrun . s3://bucketname
So, I must be doing something wrong.
Any suggestions?
Thanks.
aws s3 sync --dryrun . s3://bucketname --exclude ".*" --exclude "*/.*"
Adding two exclusion arguments excludes hidden files both in the current directory and in any subfolders.
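As an illustration, given a hypothetical local layout like this, the first pattern catches top-level hidden files and the second catches nested ones:
./.DS_Store          excluded by ".*"
./photos/.DS_Store   excluded by "*/.*"
./photos/img1.jpg    uploaded
./notes.txt          uploaded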
This seems to work:
aws s3 sync --dryrun . s3://bucketname --exclude ".*"
However, I don't think it will exclude such files in sub-directories.
Try this:
aws s3 sync --dryrun . s3://bucketname --exclude '*/.*'
This should exclude hidden files in subfolders, though not those at the top level.

AWS CLI: search for a file in an S3 bucket and copy it to a different folder

I am trying to copy only certain files from the Folder_Test1 folder to the Folder_Test2 folder in the same S3 bucket.
Folder_Test1:
T1_abc_june21.csv
T1_abc_june25.csv
T2_abc_june29.csv
T1_abc_def_june21.csv
T2_abc_def_june25.csv
T3_abc_def_june29.csv
T3_xyz_june29.csv
I need to copy only the files whose names contain abc while excluding those containing abc_def:
I tried:
aws s3 cp s3://$bucket/Folder_Test1/ s3://$bucket/Folder_Test2/ --exclude "*abc_def*" --include "*abc*"
but it is not working.
From s3 — AWS CLI 1.18.123 Command Reference:
Any number of these parameters can be passed to a command. You can do this by providing an --exclude or --include argument multiple times, e.g. --include "*.txt" --include "*.png". When there are multiple filters, the rule is the filters that appear later in the command take precedence over filters that appear earlier in the command.
Therefore, the problem is that your command is excluding *abc_def* but is then including *abc*, which adds the *abc_def* files again.
You should be able to fix it by swapping the order (note that cp also needs --recursive for multi-file operations):
aws s3 cp s3://$bucket/Folder_Test1/ s3://$bucket/Folder_Test2/ --recursive --include "*abc*" --exclude "*abc_def*"
If it is copying other files that you do not want (e.g. xyz), then add an exclude:
aws s3 cp s3://$bucket/Folder_Test1/ s3://$bucket/Folder_Test2/ --recursive --exclude "*" --include "*abc*" --exclude "*abc_def*"
This will apply these rules in order:
Exclude everything
Include *abc*
Exclude *abc_def*
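To double-check the filter order before copying anything, a --dryrun pass should list only the intended files (a sketch against the file names from the question):
aws s3 cp s3://$bucket/Folder_Test1/ s3://$bucket/Folder_Test2/ --recursive --exclude "*" --include "*abc*" --exclude "*abc_def*" --dryrun
For the listing above, this should print copy operations for T1_abc_june21.csv, T1_abc_june25.csv and T2_abc_june29.csv only.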

How to copy multiple files matching name pattern to AWS S3 bucket using AWS CLI?

I would like to copy files matching a file name pattern from my machine to an AWS S3 bucket using AWS CLI. Using the standard unix file name wildcards does not work:
$ aws s3 cp *.csv s3://wesam-data/
Unknown options: file1.csv,file2.csv,file3.csv,s3://wesam-data/
I followed an SO answer addressing a similar problem that advises using the --exclude and --include filters, as shown below, without success.
$ aws s3 cp . s3://wesam-data/ --exclude "*" --include "*.csv"
Solution
$ aws s3 cp . s3://wesam-data/ --exclude "*" --include "*.csv" --recursive
Explanation
It turns out that I have to use the --recursive flag with the --include & --exclude flags since this is a multi-file operation.
The following commands operate on a single file/object when no --recursive flag is provided (see the example after this list):
cp
mv
rm
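For example, the same filter pattern should work with mv once --recursive is added (a hypothetical sketch; --dryrun previews the operation without moving anything):
$ aws s3 mv . s3://wesam-data/ --exclude "*" --include "*.csv" --recursive --dryrun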

Glob pattern with Amazon S3

I want to move files from one S3 bucket to another S3 bucket, but only the files whose names start with "part". I can do it using Java, but is it possible with the Amazon CLI? Can we use a glob pattern in the CLI?
My object names are like:
part0000
part0001
Yes, this is possible through the aws CLI, using the --include and --exclude options.
As an example, you can use the aws s3 sync command to sync your part files:
aws s3 sync --exclude '*' --include 'part*' s3://my-amazing-bucket/ s3://my-other-bucket/
You can also use the cp command, with the --recursive flag:
aws s3 cp --recursive --exclude '*' --include 'part*' s3://my-amazing-bucket/ s3://my-other-bucket/
Explanation:
aws: The aws CLI command
s3: The aws service to interface with
sync: The operation to perform
--exclude <value>: The UNIX-style wildcard pattern to exclude, unless overridden by a later include filter
--include <value>: The UNIX-style wildcard to act upon.
As noted in the documentation, you can also specify --include and --exclude multiple times.
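For example, two includes can pick up several prefixes in one pass (a sketch; part0* and part1* are just illustrative patterns):
aws s3 sync --exclude '*' --include 'part0*' --include 'part1*' s3://my-amazing-bucket/ s3://my-other-bucket/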

Use AWS CLI to Copy from S3 to EC2

I have zipped files in an S3 bucket that I need to bring back to my EC2 instance. In the past, I moved the documents to S3 with the following command:
aws s3 cp /my/ec2/path/ s3://my/s3/path/ --exclude '*' --include '2014-01*' --recursive
To move files from January 2014 back to EC2, I have tried the following command:
aws s3 cp s3://my/s3/path/ //my/ec2/path/ --exclude '*' --include '2014-01*' --recursive
My understanding is that this command excludes all files but then includes all files with the prefix '2014-01'. I have confirmed that this is how the files I want start. I have also tried using only one forward slash before the EC2 path and including fewer files.
I have followed these two links from Amazon:
http://docs.aws.amazon.com/cli/latest/reference/s3/index.html
http://docs.aws.amazon.com/cli/latest/userguide/using-s3-commands.html
Figured it out. The key was to define the filepath in --include, i.e. --include '*2014-01*'. Correct command:
aws s3 cp s3://my/s3/path //my/ec2/path/ --exclude '*' --include '*2014-01*' --recursive
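As a sanity check, appending --dryrun first will list which January 2014 files would be downloaded without actually transferring them:
aws s3 cp s3://my/s3/path //my/ec2/path/ --exclude '*' --include '*2014-01*' --recursive --dryrun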