AWS sync with --delete option set to ignore a folder - amazon-web-services

I am using aws s3 sync on an S3 bucket; it has content at the root and in a specific folder - let's call it files/.
I am using the --delete option because I want files that no longer exist in the source to also be removed from the destination, but only in the root folder. The files/ folder I want to keep intact.
Would that be possible with any of the command's options?

I think you can combine two sync commands to get the desired result:
aws s3 sync <from> <to> --delete --include "*" --exclude "files/*"
aws s3 sync <from> <to> --exclude "*" --include "files/*"
The first one should sync all files with the delete flag except the ones in files/ and the second one should sync only files in the files/ directory. Please be aware that the order of the filter parameters (--include / --exclude) plays a role, see Use of Exclude and Include Filters for an example.
Hope this helps!
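A quick way to sanity-check this before letting --delete remove anything is to add --dryrun to both passes (the bucket and local path below are placeholders, not from the question):
aws s3 sync s3://my-bucket /local/dir --delete --include "*" --exclude "files/*" --dryrun
aws s3 sync s3://my-bucket /local/dir --exclude "*" --include "files/*" --dryrun
Once the dry-run output lists only the operations you expect, run the same commands again without --dryrun.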

Related

How to push all zip files on my specific folder to my s3 bucket folder?

I have a problem where I can't push all of my zip files to my S3 bucket. When I run the bat file, the cmd window loads for a second and then closes automatically, and when I refresh my S3 bucket folder there is no copy of the zip files.
My Script:
aws s3 cp s3://my_bucket/07-08-2020/*.zip C:\first_folder\second_folder\update_folder --recursive
The issue is with the *.zip. In order to copy files with a specific extension, use the following syntax:
aws s3 cp [LOCAL_PATH] [S3_PATH] --recursive --exclude "*" --include "*.zip"
From the docs:
Note that, by default, all files are included. This means that
providing only an --include filter will not change what files are
transferred. --include will only re-include files that have been
excluded from an --exclude filter. If you only want to upload files
with a particular extension, you need to first exclude all files, then
re-include the files with the particular extension.
More info can be found here.
@AmitBaranes is right. I checked on a Windows box. You could also simplify your command by using sync instead of cp.
So the command using sync could be:
aws s3 sync "C:\first_folder\second_folder\update_folder" s3://my_bucket/07-08-2020/ --exclude "*" --include "*.zip"

How to loop through an S3 bucket to copy certain list of folders from S3 bucket to local server

I have over 2000 folders residing in an S3 bucket, and I do not want to copy all of them to my local server.
Is there a way, or a script, to loop through and copy only 200 folders out of the 2000+ folders in that particular bucket? For example:
I need to copy 200-400 folders out of the 2000+ in the S3 bucket. Is there a regex group capture or a script to automate copying a certain list of folders?
input.....
faob/
halb/
mcgb/
mgvb/
nxhb/
ouqb/
pdyb/
qwdb/
output...
ouqb/
pdyb/
qwdb/
aws s3 cp s3://s3-bucket/* /tmp/
Yes, you can use multiple --include parameters to specify multiple input locations.
aws s3 cp s3://bucket-name /local/folder --recursive --exclude "*" --include "faob/*" --include "halb/*" --include "mcgb/*"
But you can't have multiple destination folders.
Hope this helps!
This seems to work:
aws s3 cp --recursive s3://my-bucket /tmp/ --exclude "*" --include "*b/*"
For information about using wildcards in aws s3 cp, see: Use of Exclude and Include Filters
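If the folders you need don't share a convenient wildcard pattern, a small shell loop over an explicit list also works. This is only a sketch, assuming a hypothetical bucket name and a file folders.txt containing one folder name per line (e.g. ouqb/, pdyb/, qwdb/):
#!/bin/bash
# Copy each listed prefix from the bucket down to /tmp, one aws s3 cp call per folder.
while read -r folder; do
  aws s3 cp "s3://my-bucket/${folder}" "/tmp/${folder}" --recursive
done < folders.txt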

Use the aws client to copy s3 files from a single directory only (non recursively)

Consider an aws bucket/key structure along these lines
myBucket/dir1/file1
myBucket/dir1/file2
myBucket/dir1/dir2/dir2file1
myBucket/dir1/dir2/dir2file2
When using:
aws s3 cp --recursive s3://myBucket/dir1/ .
Then we will copy down dir2file[1,2] along with file[1,2]. How can I copy only the latter files and not the files under subdirectories?
Responding to a comment: I am not interested in putting an --exclude for every subdirectory, so this is not a duplicate of excluding directories from aws cp.
As far as I understand, you want to make sure that the files present in the current directory are copied, but anything in child directories should not be. I think you can use something like this:
aws s3 cp s3://myBucket/dir1/ . --recursive --exclude "*/*"
Here we are excluding files which will have a path separator after "dir1".
You can exclude paths using the --exclude option, e.g.
aws s3 cp s3://myBucket/dir1/ . --recursive --exclude "dir1/dir2/*"
More options and examples can be found by using the AWS CLI help:
aws s3 cp help
There is no way to control the recursion depth while copying files using aws s3 cp, nor is it supported in aws s3 ls.
So, if you do not wish to use the --exclude or --include options, I suggest you:
Use the aws s3 ls command without the --recursive option to list files directly under a directory, extract only the file names from the output and save the names to a file. Refer to this post.
Then write a simple script to read the file names and run aws s3 cp for each one (see the sketch below).
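A minimal sketch of that approach, assuming the bucket and prefix from the question and file names without spaces:
# List only the objects directly under dir1/ (no --recursive) and keep just the file names.
aws s3 ls s3://myBucket/dir1/ | awk '{print $4}' | grep -v '^$' > files.txt
# Then copy each listed file individually.
while read -r name; do
  aws s3 cp "s3://myBucket/dir1/${name}" .
done < files.txt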
Alternatively, you may use:
aws s3 cp s3://spaces/dir1/ . --recursive --exclude "*/*"

awscli s3 sync wildcards

I'm trying to sync all files in a directory that start with "model.ckpt" to an S3 bucket path, by trying this:
aws s3 sync ./model.ckpt* $S3_CKPT_PATH
But I'm getting the error:
Unknown options: ./model.ckpt-0.meta,<my S3_CKPT_PATH path>
However, aws s3 sync . $S3_CKPT_PATH works, but gives me a lot of additional files I don't want.
Anybody know how I can do this?
When using aws s3 sync, all files in a folder are included.
If you wish to specify wildcards, you will need to Use Exclude and Include Filters.
For example:
aws s3 sync mydir s3://bucket/folder/ --exclude "*" --include "model.ckpt*"
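The same filter pattern works in the other direction too; for instance, to pull only those checkpoint files back down (the bucket path and local directory are just placeholders):
aws s3 sync s3://bucket/folder/ mydir --exclude "*" --include "model.ckpt*"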

Amazon S3: Use aws s3 cp without downloading existing files?

Is there a way, using the AWS CLI, to download files with --recursive and --exclude + --include without overwriting files I have already downloaded? It just rewrites files even if they haven't changed, and won't resume downloads after a crash.
I think you are looking for the sync command. It assumes the --recursive flag by default:
Syncs directories and S3 prefixes. Recursively copies new and updated
files from the source directory to the destination. Only creates
folders in the destination if they contain one or more files.
Something like this will work:
aws s3 sync s3://bucket/path/to/folder/ . --exclude '*' --include 'filesToMatch*.txt'
As hjpotter92 said, --recursive is implied, unlike with cp.
And you can always include the --dryrun flag to verify what will run before actually executing it.
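For example, appending --dryrun to the command above (the paths and filter are just the placeholders from that answer) prints the copy operations that would happen without performing them:
aws s3 sync s3://bucket/path/to/folder/ . --exclude '*' --include 'filesToMatch*.txt' --dryrun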