Exclude macOS hidden files from AWS S3 sync - amazon-web-services

I'm syncing the entire contents of an external hard drive, used with macOS, to an S3 bucket. I'd like to exclude all macOS hidden files.
I've tried:
aws s3 sync --dryrun --exclude "^\." --exclude "\/\." ./ s3://bucketname
However, the result when I run that is exactly the same as just:
aws s3 sync --dryrun . s3://bucketname
So, I must be doing something wrong.
Any suggestions?
Thanks.

aws s3 sync --dryrun . s3://bucketname --exclude ".*" --exclude "*/.*"
Adding two exclusion arguments will hide both the specified files in the current directory as well as any in subfolders.

This seems to work:
aws s3 sync --dryrun . s3://bucketname --exclude ".*"
However, I don't think it will exclude such files in sub-directories.

Try this:
aws s3 sync --dryrun --exclude '*/.*'
This should remove any hidden files, including in subfolders.

aws s3 sync --recursive --dryrun --exclude '/.'

Related

Use the aws client to copy s3 files from a single directory only (non recursively)

Consider an aws bucket/key structure along these lines
myBucket/dir1/file1
myBucket/dir1/file2
myBucket/dir1/dir2/dir2file1
myBucket/dir1/dir2/dir2file2
When using:
aws s3 cp --recursive s3://myBucket/dir1/ .
Then we will copy down dir2file[1,2] along with file[1,2]. How to only copy the latter files and not files under subdirectories ?
Responding to a comment: . I am not interested in putting a --exclude for every subdirectory so this is not a duplicate of excluding directories from aws cp
As far as I understood, you want to make sure that the files present in current directories are copied but anything in child directories should not be copied. I think you can use something like that.
aws s3 cp s3://myBucket/dir1/ . --recursive --exclude "*/*"
Here we are excluding files which will have a path separator after "dir1".
You can exclude paths using the --exclude option, e.g.
aws s3 cp s3://myBucket/dir1/ . --recursive --exclude "dir1/dir2/*"
More options and examples can be found by using the aws cli help
aws s3 cp help
There is no way you can control the recursion depth while copying files using aws s3 cp. Neither it is supported in aws s3 ls.
So, if you do not wish to use --exclude or --include options, I suggest you:
Use aws s3 ls command without --recursive option to list files directly under a directory, extract only the file names from the output and save the names to a file. Refer this post
Then write a simple script to read the file names and for each execute aws s3 cp
Alternatively, you may use:
aws s3 cp s3://spaces/dir1/ . --recursive --exclude "*/*"

S3 cli includes not working

Alright I'm very confused by aws cli
I have an S3 bucket:
s3://my-bucket
directory/
file1
file2
backup-logs-1234
backup-logs-5678
I've verified that the files are in the s3 bucket, and I can see them with aws s3 ls s3://my-bucket
I'm trying to delete all the backup logs in the folder (8000 of them). I've tried every combination of includes/excludes I can think of
1) For some reason aws s3 rm "s3://my-bucket/" --include "*backup-logs*" --dryrun tries to delete s3://my-bucket/directory/
2) aws s3 rm "s3://my-bucket/" --exclude "*" --include "*backup-logs*" --dryrun doesn't see any files to delete
3) I've also tried different substrings of "backup" (eg. b, ba, back)
4) I've also tried adding recursive (even though I don't want it to be) and it finds all the files in directory/ that match the pattern, but none of the top level ones
I'm sure I'm doing something stupid. Thanks in advance for the help
aws s3 rm s3://my-bucket/ --recursive --exclude "*" --include "*backup-logs*" should work.
When you want to delete multiple objects within your bucket
--recursive (boolean) Command is performed on all files or objects
under the specified directory or prefix.
You can read on http://docs.aws.amazon.com/cli/latest/reference/s3/index.html#use-of-exclude-and-include-filters about include/exclude use

uploading all files of a certain extension type

I'm trying to upload all files of type .flv to an S3 bucket using the AWS CLI from a Windows server 2008 command line.
I do this:
aws s3 sync . s3://MyBucket --exclude '*.png'
And it begins uploading .png files instead.
I'm trying to follow the documentation and it gives an example that reads:
Local directory contains 3 files:
MyFile1.txt
MyFile2.rtf
MyFile88.txt
'''
aws s3 sync . s3://MyBucket/MyFolder --exclude '*.txt'
upload: MyFile2.rtf to s3://MyBucket/MyFolder/MyFile2.rtf
So what am I doing wrong?
Use:
aws s3 sync . s3://MyBucket/ --exclude "*" --include "*.flv"
It excludes all files, then includes .flv files. The order of parameters is important.
You can also use:
aws s3 cp . s3://MyBucket/ --recursive --exclude "*" --include "*.flv"
The difference is that sync will not re-copy a file that already exists in the destination.

Use AWS CLI to Copy from S3 to EC2

I have zipped files in an S3 bucket that I need to bring back to my EC2 instance. In the past, I moved the documents to S3 with the following command:
aws s3 cp /my/ec2/path/ s3://my/s3/path/ --exclude '*' --include '2014-01*’ —-recursive
To move files from January 2014 back to EC2, I have tried the following command:
aws s3 cp s3://my/s3/path/ //my/ec2/path/ --exclude '*' --include '2014-01*' --recursive
My understanding is that this command excludes all files but then includes all files with the prefix '2014-01'. I have confirmed that this is how the files I want start. I have also tried only one forward slash before mainstorage and including fewer files.
I have followed these two links from Amazon:
http://docs.aws.amazon.com/cli/latest/reference/s3/index.html
http://docs.aws.amazon.com/cli/latest/userguide/using-s3-commands.html
Figured it out. The key was to define the filepath in --include , i.e. --include '2014-1'. Correct command:
aws s3 cp s3://my/s3/path //my/ec2/path/ --exclude '*' --include '*2014-01*' --recursive

how to include and copy files that are in current directory to s3 (and not recursively)

I have some files that I want to copy to s3.
Rather than doing one call per file, I want to include them all in one single call (to be as efficient as possible).
However, I only seem to get it to work if I add the --recursive flag, which makes it look in all children directories (all files I want are in the current directory only)
so this is the command I have now, that works
aws s3 cp --dryrun . mybucket --recursive --exclude * --include *.jpg
but ideally I would like to remove the --recursive to stop it traversing,
e.g. something like this (which does not work)
aws s3 cp --dryrun . mybucket --exclude * --include *.jpg
(I have simplified the example, in my script I have several different include patterns)
AWS CLI's S3 wildcard support is a bit primitive, but you could use multiple --exclude options to accomplish this. Note: the order of includes and excludes is important.
aws s3 cp --dryrun . s3://mybucket --recursive --exclude "*" --include "*.jpg" --exclude "*/*"
Try the command:
aws s3 cp --dryrun . s3://mybucket --recursive --exclude "*/"
Hope it help.
I tried the suggested answers and could not get aws to skip nested folders. Saw some weird outputs about calculating size, and 0 size objects, despite using the exclude flag.
I eventually gave up on the --recursive flag and used bash to perform a single s3 upload for each file matched. Remove --dryrun once you're ready to roll!
for i in *.{jpg,jpeg}; do aws --dryrun s3 cp ${i} s3://your-bucket/your-folder/${i}; done
I would suggest to go for a utility called s4cmd which provides us unix like file system operations and it also allows us to include the wild cards
https://github.com/bloomreach/s4cmd