How to send standard output (like echo) to S3 file - amazon-web-services

I'm trying to send the output of an 'echo' to an S3 file. Similar to how we can do something like echo 'Hello World' > file.txt, I'm doing
aws s3 cp s3://dirname/dirsubfolder/file.txt > echo 'Hello World'. However, I get the error Key "file.txt" does not exist. I know the file doesn't exist; I want to write the output of echo as that file - is there a way to do this?

Support for streaming via standard input and output was added in 2014, but I could not find it in the CLI help docs.
echo "Hello World" | aws s3 cp - s3://example-bucket/hello.txt

I don't think you can pipe command output into the aws s3 cp command-line tool, any more than you can pipe text like that into the standard cp command. Also, the command you are trying:
aws s3 cp s3://dirname/dirsubfolder/file.txt > echo 'Hello World'
is actually redirecting the output of the aws s3 cp command into a local file named echo (with 'Hello World' passed as an extra argument to aws), which is roughly the opposite of what you say you are trying to do.
You're going to need to script this in a couple steps like:
echo 'Hello World' > /tmp/file.txt
aws s3 cp /tmp/file.txt s3://dirname/dirsubfolder/file.txt
rm /tmp/file.txt
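If your CLI version is too old for the - source shown in the other answer, the same temp-file approach can be made a touch safer with mktemp so you never clobber an existing /tmp/file.txt (a minimal sketch using the same bucket path):
#!/bin/bash
tmpfile=$(mktemp)                  # unique temporary file
echo 'Hello World' > "$tmpfile"
aws s3 cp "$tmpfile" s3://dirname/dirsubfolder/file.txt
rm -f "$tmpfile"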

Related

how to download files from aws s3 using a list

I have a list of files in a bucket in AWS S3, but when I execute the aws s3 cp command it gives me an "Unknown options" error.
My list:
s3://<bucket>/cms/imagepool/5f84dc7234bf5.jpg
s3://<bucket>/cms/imagepool/5f84daa19b7df.jpg
s3://<bucket>/cms/imagepool/5f84dcb12f9c5.jpg
s3://<bucket>/cms/imagepool/5f84dcbf25d4e.jpg
My bash script is below:
#!/bin/bash
while read line
do
aws s3 cp "${line}" ./
done <../links.txt
This is the error I get:
Unknown options: s3:///cms/imagepool/5f84daa19b7df.jpg
Does anybody know how to solve this issue?
It turns out the solution below worked (I had to include the --no-cli-auto-prompt flag):
#!/bin/bash
while read line
do
aws s3 cp --no-cli-auto-prompt "${line}" ./
done <../links.txt
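If the flag alone does not help someone hitting the same error, another common culprit is a list file saved with Windows line endings; a hedged variant of the same loop that strips a trailing carriage return and skips blank lines:
#!/bin/bash
while IFS= read -r line
do
    line="${line%$'\r'}"           # drop the trailing carriage return from CRLF files
    [ -z "$line" ] && continue     # skip empty lines
    aws s3 cp --no-cli-auto-prompt "$line" ./
done <../links.txt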

Amazon S3 filenames: Replace double spaces from all files

I have a bucket on Amazon S3 with thousands of files that contain double spaces in their names.
How can I replace all the double spaces with one space?
like: folder1/folder2/file  name.pdf to folder1/folder2/file name.pdf
Option 1: Use a spreadsheet
One 'cheat method' I sometimes use is to create a spreadsheet and then generate commands:
Extract a list of all files with double-spaces:
aws s3api list-objects --bucket bucket-name --query 'Contents[].[Key]' --output text | grep '\ \ ' >file_list.csv
Open the file in Excel
Write a formula in Column B that creates an aws s3 mv command:
="aws s3 mv 's3://bucket-name/"&A1&"' 's3://bucket-name/"&SUBSTITUTE(A1," "," ")&"'"
Test it by copying the output and running it in a terminal
If it works, Copy Down to the other rows, copy and paste all the commands into a shell script, then run the shell script
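For the example key above, the formula in Column B generates commands along these lines (bucket name is a placeholder):
aws s3 mv 's3://bucket-name/folder1/folder2/file  name.pdf' 's3://bucket-name/folder1/folder2/file name.pdf'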
Option 2: Write a script
Or, you could write a script in your favourite language (e.g. Python) that will do the following (a rough bash sketch of the same steps follows the list):
List the bucket
Loop through each object
If the object Key has double-spaces:
Copy the object to a new Key
Delete the original object
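A rough bash sketch of those steps using the CLI (bucket name is a placeholder; aws s3 mv does the copy-then-delete for you):
#!/bin/bash
bucket='bucket-name'
# list every key, keep only those containing a double space, then rename each one
aws s3api list-objects --bucket "$bucket" --query 'Contents[].[Key]' --output text |
grep '  ' |
while IFS= read -r key; do
    newkey=$(echo "$key" | tr -s ' ')   # squeeze runs of spaces down to a single space
    aws s3 mv "s3://$bucket/$key" "s3://$bucket/$newkey"
done
Note that tr -s collapses any run of spaces, not just doubles, which is usually what you want here.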
Building on the idea from @john-rotenstein, I built a bash command that does it in one line:
aws s3 ls --recursive s3://bucket-name | cut -c32- | grep "\/.*  .*" | (IFS='' ; while read -r line ; do aws s3 mv "s3://bucket-name/$line" "s3://bucket-name/$(echo "$line" | xargs)"; done)
get the list of object paths in the bucket
cut the output so that only the key remains
find all keys that contain double spaces
move each object to the new key with single spaces

Aws s3 file name is not printing using batch file in echo function

I am using a Windows batch script to print the file name returned by aws s3 ls for the bucket, using this code:
@echo off
echo
set files =aws s3 ls "s3://my-test-files/01_Jan_2021.zip"
echo.%files%
pause
But when I run the command aws s3 ls "s3://my-test-files/01_Jan_2021.zip" directly in cmd it gives me the file name; when I use it in the Windows batch script and echo the file name, nothing is printed.
@Mic: your solution gives me that data, but I want to extract only the highlighted portion, i.e. "01_Jan_2021.zip".
I think the problem is the space before the = in the set files = line:
@echo off
set "files=aws s3 ls "s3://my-test-files/01_Jan_2021.zip""
%files%
pause
Is this what you're seeking?
#Set "files=aws s3 ls "s3://my-test-files/01_Jan_2021.zip""
#For /F "Tokens=3,* Delims= " %%G In ('%files%') Do #Echo=%%H
#Pause
@echo off
set files=aws s3 ls "s3://my-test-files/01_Jan_2021.zip"
cls
echo %files%
pause

Powershell Pipe stdout from s3 cp command to gzip

Trying to use Powershell on Windows 10 to download a small .gz file from an s3 bucket using the aws s3 cp command.
I am piping the output of the s3 cp to gzip -d to decompress. My aim is to basically copy, unzip and display contents without saving the .gz file locally.
From reading the official Amazon documentation for the s3 cp command, the following is mentioned:
https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html
Downloading an S3 object as a local file stream
WARNING: PowerShell may alter the encoding of or add a CRLF to piped or redirected output.
Here is the command I'm executing from powershell:
PS C:\> aws s3 cp s3://my-bucket/test.txt.gz - | gzip -d
Which returns the following error: gzip: stdin: not in gzip format
The command works fine when run from Windows Command Prompt but I just can't seem to get it working with Powershell.
From a Windows Command prompt, it works fine:
C:\Windows\system32>aws s3 cp s3://my-bucket/test.txt.gz - | gzip -d
With some sample test data output as follows:
first_name last_name
---------- ----------
Ellerey Place
Cherie Glantz
Isaak Grazier

copy data from s3 to local with prefix

I am trying to copy data from s3 to local with prefix using aws-cli.
But I am getting an error with different wildcard patterns.
aws s3 cp s3://my-bucket-name/RAW_TIMESTAMP_0506* . --profile prod
error:
no matches found: s3://my-bucket-name/RAW_TIMESTAMP_0506*
aws s3 cp s3://my-bucket/ <local directory path> --recursive --exclude "*" --include "<prefix>*"
This will copy only the files with the given prefix.
The above answers do not work properly... for example, I have many thousands of files in a directory organised by date, and I wish to retrieve only the files that are needed, so I tried the correct version per the documentation:
aws s3 cp s3://mybucket/sub /my/local/ --recursive --exclude "*" --include "20170906*.png"
and it did not download the prefixed files, but began to download everything
so then I tried the sample above:
aws s3 cp s3://mybucket/sub/ . /my/local --recursive --include "20170906*"
and it also downloaded everything... It seems that this is an ongoing issue with aws cli, and they have no intention to fix it... Here are some workarounds that I found while Googling, but they are less than ideal.
https://github.com/aws/aws-cli/issues/1454
Updated: Added --recursive and --exclude
The aws s3 cp command will not accept a wildcard as part of the filename (key). Instead, you must use the --include and --exclude parameters to define filenames.
From: Use of Exclude and Include Filters
Currently, there is no support for the use of UNIX style wildcards in a command's path arguments. However, most commands have --exclude "<value>" and --include "<value>" parameters that can achieve the desired result. These parameters perform pattern matching to either exclude or include a particular file or object. The following pattern symbols are supported.
So, you would use something like:
aws s3 cp --recursive s3://my-bucket-name/ . --exclude "*" --include "RAW_TIMESTAMP_0506*"
If you don't like silent consoles, you can pipe aws s3 ls through awk and back to aws s3 cp.
Example
# url must be the entire prefix that includes folders.
# Ex.: url='s3://my-bucket-name/folderA/folderB',
# not url='s3://my-bucket-name'
url='s3://my-bucket-name/folderA/folderB'
prefix='RAW_TIMESTAMP_0506'
aws s3 ls "$url/$prefix" | awk '{system("aws s3 cp '"$url"'/"$4 " .")}'
Explanation
The ls part is pretty simple. I'm using variables to simplify and shorten the command. Always wrap shell variables in double quotes to prevent disaster.
awk '{print $4}' would extract only the filenames from the ls output (NOT the S3 Key! This is why url must be the entire prefix that includes folders.)
awk '{system("echo " $4)}' would do the same thing, but it accomplishes this by calling another command. Note: I did NOT use a subshell $(...), because that would run the entire ls | awk part before starting cp. That would be slow, and it wouldn't print anything for a looong time.
awk '{system("echo aws s3 cp "$4 " .")}' would print commands that are very close to the ones we want. Pay attention to the spacing. If you try to run this, you'll notice something isn't quite right. This would produce commands like aws s3 cp RAW_TIMESTAMP_05060402_whatever.log .
awk '{system("echo aws s3 cp '$url'/"$4 " .")}' is what we're looking for. This adds the path to the filename. Look closely at the quotes. Remember we wrapped the awk parameter in single quotes, so we have to close and reopen the quotes if we want to use a shell variable in that parameter.
awk '{system("aws s3 cp '"$url"'/"$4 " .")}' is the final version. We just remove echo to actually execute the commands created by awk. Of course, I've also surrounded the $url variable with double quotes, because it's good practice.