AWS s3 mv command deletes the source directory

The objective is to move all the files from the S3 "dir2" directory to the EMR directory "mydir".
I am using the command:
aws s3 mv s3:///dir1/dir2/ /mnt/mydir/ --recursive
The command runs, but the dir2 directory is deleted from S3.
The files within dir2 are moved to mydir on EMR as expected, though.
How can I move only the files out of the S3 source directory without removing the source directory itself?

When dealing with multiple objects, you want to use sync not cp or mv:
aws s3 sync s3:///dir1/dir2/ /mnt/mydir/
There are ways to load data into EMR directly from S3, so you may want to look into those.
Update: I have confirmed that:
aws s3 mv s3://bucket/f1/f2 . --recursive
Will move all of the files inside f2/ while leaving f2 in the bucket.

Directories or folders do not actually exist in S3. What you are calling a directory is simply a common prefix shared by object names (keys) in S3. Once no objects with that prefix remain, the "directory" no longer exists in S3.
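To make the prefix idea concrete, here is a minimal Python sketch that models a bucket as a flat set of key strings. It does not talk to S3; the bucket contents and the helper names `prefix_exists` and `remove_prefix` are hypothetical and chosen only for illustration.

```python
def prefix_exists(keys, prefix):
    """Return True if at least one object key starts with the given prefix."""
    return any(key.startswith(prefix) for key in keys)


def remove_prefix(keys, prefix):
    """Model a recursive move out of the bucket.

    Returns (remaining_keys, moved_keys).
    """
    moved = {key for key in keys if key.startswith(prefix)}
    return keys - moved, moved


# Hypothetical bucket contents: a flat set of keys, no real folders.
bucket = {"dir1/dir2/a.csv", "dir1/dir2/b.csv", "dir1/other.txt"}

remaining, moved = remove_prefix(bucket, "dir1/dir2/")

print(prefix_exists(bucket, "dir1/dir2/"))     # the "directory" is visible before the move
print(prefix_exists(remaining, "dir1/dir2/"))  # no key carries the prefix afterwards
print(prefix_exists(remaining, "dir1/"))       # dir1/ survives because other.txt remains
```

In this model nothing is ever "deleted" as a folder: dir1/dir2/ simply stops showing up once the last key under it has been moved.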

Related

AWS CLI to download file with its entire folder structure from S3 to local and/or one S3 to another S3

I want to download a file from an S3 bucket to my local machine along with its entire folder structure. For example,
s3://test-s3-dev/apps/test-prd/test/data/sets/frs/bblr/type/level=low/type=data/bd=2022-08-25/region=a/entity=c/ss=tt/dev=mtp/datasetV=1/File123.txt
Above is the S3 path that I need to download locally with its entire folder structure.
However, cp --recursive and sync both download only File123.txt into the current local folder, without its folder structure.
**Please advise how to get the file from S3 together with its entire folder structure, in order to:
download it to the local system, and/or
copy it from one S3 connection to another S3 connection.**
aws --endpoint-url http://abc.xyz.pqr:9020 s3 cp --recursive s3://test-s3-dev/apps/test-prd/test/data/sets/frs/bblr/type/level=low/type=data/bd=2022-08-25/region=a/entity=c/ss=tt/dev=mtp/datasetV=1/File123.txt ./
OR
aws --endpoint-url http://abc.xyz.pqr:9020 s3 cp --recursive s3://test-s3-dev/apps/test-prd/test/data/sets/frs/bblr/type/level=low/type=data/bd=2022-08-25/region=a/entity=c/ss=tt/dev=mtp/datasetV=1/ ./
OR
aws --endpoint-url http://abc.xyz.pqr:9020 s3 sync s3://test-s3-dev/apps/test-prd/test/data/sets/frs/bblr/type/level=low/type=data/bd=2022-08-25/region=a/entity=c/ss=tt/dev=mtp/datasetV=1/ ./
All three commands above download the file directly into the current local folder without reproducing its directory structure from S3.
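A possible approach, offered as an assumption rather than a confirmed fix: the CLI places downloaded objects relative to the source prefix given on the command line, so anything copied from .../datasetV=1/ lands directly in ./. One workaround is to carry the key path into the destination yourself. The Python sketch below only computes that destination and prints a candidate command; it does not contact S3. The helper names `split_s3_uri` and `mirrored_local_path`, the `downloads` root, and the shortened key are all hypothetical, and the sketch assumes the CLI creates missing local folders when it downloads a single object.

```python
from pathlib import PurePosixPath


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(scheme):].partition("/")
    return bucket, key


def mirrored_local_path(uri, local_root="."):
    """Return local_root joined with the full object key, keeping every folder level."""
    _bucket, key = split_s3_uri(uri)
    return str(PurePosixPath(local_root) / key)


# Shortened, hypothetical key; the long key from the question is handled the same way.
uri = "s3://test-s3-dev/apps/test-prd/region=a/File123.txt"
target = mirrored_local_path(uri, "downloads")

print(target)
print(f"aws s3 cp {uri} {target}")
```

The same idea would apply to an S3-to-S3 copy: append the full key to the destination bucket URI instead of to a local root. The --endpoint-url option from the question is left out of the printed command and would need to be added back.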

Issue with aws cli with s3 sync not copying files from sub folder

I have an S3 bucket that's like this:
bucket_name/v1.0.0/file1.js
bucket_name/v1.0.0/file2.js
bucket_name/file3.js
bucket_name/file4.js
I'm trying to copy the files from the v1.0.0 subdirectory to the root of the bucket, and delete the old files. I ran:
aws s3 sync s3://bucket_name/v1.0.0 s3://bucket_name/ --delete --exclude 'v*'
I added --exclude 'v*' so that the version subdirectories would not be deleted, but for some reason this just deletes file3.js and file4.js from the bucket root and does not sync any files from the v1.0.0 subdirectory as I was expecting.
I was expecting to end with:
bucket_name/v1.0.0/file1.js
bucket_name/v1.0.0/file2.js
bucket_name/file1.js
bucket_name/file2.js

You can copy the files from the v1.0.0 directory to the root directory with:
aws s3 sync s3://bucket_name/v1.0.0/ s3://bucket_name/
The AWS CLI sync command will never delete the original files. The --delete option tells it to delete any files in the destination that are not present in the source. For example, let's say you run the sync once per day and somebody has deleted a file from the source during the day. In this situation, the file will also be deleted from the destination by the sync process.
If you wish to move the objects, then you can use:
aws s3 mv s3://bucket_name/v1.0.0/ s3://bucket_name/ --recursive
This will, effectively, copy the objects and then delete the original objects.
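The difference between copying and moving a prefix can be sketched as a key rewrite on a flat set of keys. This is a simplified Python model, not the CLI's implementation; the function names `rewrite_keys`, `copy_prefix` and `move_prefix` are hypothetical.

```python
def rewrite_keys(keys, src_prefix, dst_prefix=""):
    """Map each key under src_prefix to the key it would get under dst_prefix."""
    return {
        key: dst_prefix + key[len(src_prefix):]
        for key in keys
        if key.startswith(src_prefix)
    }


def copy_prefix(keys, src_prefix, dst_prefix=""):
    """Model of sync (without --delete): add the rewritten keys, keep the originals."""
    return keys | set(rewrite_keys(keys, src_prefix, dst_prefix).values())


def move_prefix(keys, src_prefix, dst_prefix=""):
    """Model of mv --recursive: add the rewritten keys, then drop the originals."""
    mapping = rewrite_keys(keys, src_prefix, dst_prefix)
    return (keys - set(mapping)) | set(mapping.values())


bucket = {"v1.0.0/file1.js", "v1.0.0/file2.js", "file3.js", "file4.js"}

copied = copy_prefix(bucket, "v1.0.0/")
moved = move_prefix(bucket, "v1.0.0/")

print(sorted(copied))  # originals under v1.0.0/ are still present
print(sorted(moved))   # nothing is left under v1.0.0/
```

Neither model touches file3.js or file4.js; in the question those disappeared only because --delete was added to the sync.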
I had a similar problem and later discovered that the folder name in S3 had a trailing space, which was preventing the files from being copied.
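A quick way to spot this kind of problem is to scan a key listing for path segments with leading or trailing whitespace. The sketch below is plain Python over a hypothetical list of keys (however you obtain the listing); the helper name `stray_whitespace_segments` is made up for this example.

```python
def stray_whitespace_segments(key):
    """Return the path segments of key that start or end with whitespace."""
    return [segment for segment in key.split("/") if segment != segment.strip()]


# Hypothetical listing: the first two keys sit under "v1.0.0 " (note the space).
keys = ["v1.0.0 /file1.js", "v1.0.0 /file2.js", "v1.0.0/file3.js"]

for key in keys:
    bad = stray_whitespace_segments(key)
    if bad:
        # repr() makes the otherwise invisible space show up in the output
        print(repr(key), "->", bad)
```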

Trying to copy one file with Amazon S3 CLI

I made a folder with 3 .jpg files in it to test. This folder is called c:\Work\jpg.
I am trying to upload it to a bucket with this command:
aws s3 cp . s3://{bucket}/Test
I get the following every time:
[Errno 2] No such file or directory: "C:\Work\jpg\".
Obviously, it correctly translated the current folder "." into the correct folder, but then it says it doesn't exist?!?
Any help out there to simply copy 3 files?

Are you confusing aws s3 sync with aws s3 cp? For cp, you need to specify the source file. The destination can be the current directory.
aws s3 cp test.txt s3://mybucket/test2.txt

Ensure that your path is written correctly.
Remember to add the --recursive option, because the source is a folder:
aws s3 cp ./ s3://{bucket}/Test --recursive

AWS CLI cp doesn't copy the files second time

I'm trying to copy/move/sync files from a local directory to S3 using the AWS Command-Line Interface (CLI).
I was able to upload the files to the S3 bucket the first time, but when I run the same command a second time, nothing is uploaded. The command doesn't throw any error.
Here is the command I ran to move the files:
aws s3 mv --recursive my-directory s3://my-files/
For instance, I had files file1.pdf, file2.pdf and file3.pdf.
If I delete file2.pdf from the S3 bucket and try to copy the file again using cp, sync, or mv, it is not uploaded back to the S3 bucket.
AWS CLI Version: aws-cli/1.15.10 Python/2.6.6 Linux/2.6.32-642.6.2.el6.x86_64 botocore/1.10.10
Any thoughts?

Initially I ran aws s3 mv --recursive my-directory s3://my-files/, which transfers the files and then deletes them from the local directory. Only the files were deleted; the folders still existed. Because no files remained in those folders, the subsequent cp and sync commands had nothing to upload.
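The situation described above can be reproduced locally without S3. This Python sketch builds a throwaway directory tree, removes the only file (standing in for what mv does to the local copy after uploading), and shows that the folders remain while the file count drops to zero. The folder and file names are hypothetical.

```python
import os
import tempfile


def count_files(root):
    """Count files anywhere under root; empty folders contribute nothing."""
    return sum(len(filenames) for _dirpath, _dirnames, filenames in os.walk(root))


with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "reports", "2018")
    os.makedirs(nested)
    pdf = os.path.join(nested, "file1.pdf")
    with open(pdf, "w") as handle:
        handle.write("placeholder")

    before = count_files(root)
    os.remove(pdf)  # stands in for mv deleting the local file after upload
    after = count_files(root)
    folders_left = os.path.isdir(nested)

print(before, after, folders_left)
```

A check like `count_files("my-directory")` before re-running cp or sync would show whether there is anything left to upload.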

AWS S3 sync --delete, removed new files in local

aws s3 sync --delete removed some new files.
For example:
There is a file in the bucket: s3://my-bucket/images/1.jpg
Then I uploaded a new file to the server: 2.jpg
There are now 2 files on the server: 1.jpg and 2.jpg
The sync cron job then runs:
aws s3 sync s3://my-bucket/ ./ --delete
aws s3 sync ./ s3://my-bucket/ --delete
Why do we add --delete? We want files that are deleted in S3 to also be deleted on the server after the sync.
We upload files to the server and remove files in S3.
Is there any way to fix it?

By default, the aws s3 sync command does not delete files. It simply copies new or modified files to the destination.
Using the --delete option deletes files that exist in the destination but not in the source.
So, if your source contains: 1.jpg and 2.jpg and the destination contains 1.jpg, 2.jpg and 3.jpg, then using the --delete option will delete 3.jpg from the destination.
I see that you are running the sync command in both directions. Your first command (that syncs S3 to a local directory) will delete any local files that are not in S3.
If your goal is to copy all local files to S3 and all S3 files to the local directory, without deleting any files, then do not use the --delete option.
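The effect of running sync in both directions can be sketched with plain sets of file names. This is a simplified Python model that compares names only (the real command also compares size and modification time); the `sync` function here is a hypothetical stand-in, not the CLI's code.

```python
def sync(source, destination, delete=False):
    """Model of a one-way sync on file names only.

    Returns the destination contents after the sync.
    """
    result = destination | source   # copy anything the destination is missing
    if delete:
        result = result & source    # drop names that the source does not have
    return result


s3 = {"1.jpg"}
server = {"1.jpg", "2.jpg"}  # 2.jpg was just uploaded to the server

# Order used by the cron job: S3 -> server first, then server -> S3.
server_after = sync(s3, server, delete=True)
s3_after = sync(server_after, s3, delete=True)
print(sorted(server_after), sorted(s3_after))  # 2.jpg is lost on both sides

# Same two steps without --delete: both sides end up with both files.
server_safe = sync(s3, server)
s3_safe = sync(server_safe, s3)
print(sorted(server_safe), sorted(s3_safe))
```

In the model, the new file is removed by the first step, before the second step has a chance to upload it. On the real CLI, adding --dryrun to a sync command prints the operations it would perform without executing them, which is a cautious way to check a --delete run first.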
Use the --delete option to remove files or objects from the target that are not present in the source.

With the newer s3cmd, you can use the --delete-removed option to remove extra files in the destination that are not present in the source.

Create a minimal, empty folder path to sync from, then run sync with --delete against the desired path.