AWS s3api put-object: unknown options (checksum-crc32)

So I want to upload a file and have AWS perform a specified CRC32 check (let's say the CRC is ABCD1234) after the upload, but I keep getting this error:
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:
aws help
aws <command> help
aws <command> <subcommand> help
Unknown options: --checksum-crc32, ABCD1234
The command I use goes as follows (brackets [] denote variables):
aws s3api put-object --bucket [BUCKET_NAME] --checksum-crc32 "ABCD1234" \
    --key [NAME_OF_FILE] --body [DESTINATION_PATH] --profile [PROFILE_NAME]
Uploads without the --checksum-crc32 work just fine.
Version: aws-cli/2.4.4
Any guesses why I get this error?
Thanks in advance!

The documentation says that the CRC needs to be base64-encoded, not hexadecimal:
--checksum-crc32 (string)
This header can be used as a data integrity check to verify that the data received is the same data that was originally sent. This header specifies the base64-encoded, 32-bit CRC32 checksum of the object. For more information, see Checking object integrity in the Amazon S3 User Guide.
So your ABCD1234 would need to be either q80SNA== or NBLNqw==, depending on whether they expect the 32 bits to be rendered in big-endian or little-endian order, respectively. I didn't see anything in the documentation that says which it is.
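If you want to reproduce that conversion yourself, here is a minimal shell sketch. It assumes the crc32 utility (shipped with Perl's Archive::Zip on many distros) plus xxd and base64 are available; the filename is a placeholder.
# print the file's CRC32 as hex, e.g. abcd1234
crc32 myfile.bin
# render the 32 bits in big-endian order, then base64-encode
printf 'ABCD1234' | xxd -r -p | base64    # -> q80SNA==
# the little-endian rendering of the same value
printf '3412CDAB' | xxd -r -p | base64    # -> NBLNqw==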

The CRC32 doesn't match S3's calculation. Make sure you're encoding it properly.
You don't need to specify the checksum on the CLI; you can have the client calculate it by removing --checksum-crc32 and replacing it with --checksum-algorithm "crc32" (see the sketch below).
If your goal is data integrity, consider a cryptographically secure algorithm like SHA256, which can also be calculated automatically by the CLI.
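A hedged sketch of both variants, assuming a CLI version recent enough to know these flags (an "Unknown options" error generally means the installed version does not); bucket, key, and file names are placeholders:
aws s3api put-object --bucket [BUCKET_NAME] --key [NAME_OF_FILE] \
    --body [FILE_PATH] --checksum-algorithm CRC32
aws s3api put-object --bucket [BUCKET_NAME] --key [NAME_OF_FILE] \
    --body [FILE_PATH] --checksum-algorithm SHA256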

Related

AWS CLI - Put output into a readable format

So I have run the following command in my CLI and it returned values; however, they are unreadable. How would I format this into a table with a command?
# (the loop header was lost here; assume $buckets holds the bucket names)
for i in $buckets; do
echo "Check if SSE is enabled for bucket -> ${i}"
aws s3api get-bucket-encryption --bucket "${i}" | jq -r '.ServerSideEncryptionConfiguration.Rules[0].ApplyServerSideEncryptionByDefault.SSEAlgorithm'
done
Would I need to change the command above?
You can specify an --output parameter when using the AWS CLI, or configure a default format using the aws configure command.
From Setting the AWS CLI output format - AWS Command Line Interface:
The AWS CLI supports the following output formats:
json – The output is formatted as a JSON string.
yaml – The output is formatted as a YAML string.
yaml-stream – The output is streamed and formatted as a YAML string. Streaming allows for faster handling of large data types.
text – The output is formatted as multiple lines of tab-separated string values. This can be useful to pass the output to a text processor, like grep, sed, or awk.
table – The output is formatted as a table using the characters +|- to form the cell borders. It typically presents the information in a "human-friendly" format that is much easier to read than the others, but not as programmatically useful.
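For the loop above, the quickest change is to drop the jq pipe (jq needs JSON input) and ask the CLI for a table directly, or set table as the default; the bucket name is a placeholder:
aws s3api get-bucket-encryption --bucket my-bucket --output table
aws configure set output table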

AWS CLI S3API find newest folder in path

I've got a very large bucket (hundreds of thousands of objects). I've got a path (let's say s3://myBucket/path1/path2). /path2 gets uploads that are themselves folders. So a sample might look like:
s3://myBucket/path1/path2/v6.1.0
s3://myBucket/path1/path2/v6.1.1
s3://myBucket/path1/path2/v6.1.102
s3://myBucket/path1/path2/v6.1.2
s3://myBucket/path1/path2/v6.1.25
s3://myBucket/path1/path2/v6.1.99
S3 doesn't take version-number sorting into account (which makes sense), but alphabetically the last item in the list is not the last uploaded. In that example, .../v6.1.102 is the newest.
Here's what I've got so far:
aws s3api list-objects \
    --bucket myBucket \
    --query "sort_by(Contents[?contains(Key, \`path1/path2\`)], &LastModified)" \
    --max-items 20000
One problem here is that --max-items seems to start alphabetically from all the files in the bucket, recursively. 20000 does get to my files, but it's a pretty slow process to go through that many objects.
So my questions are twofold:
1 - This still searches the whole bucket, but I just want to narrow it down to path2/. Can I do this?
2 - This lists just objects; is it possible to pull up just a path list instead?
Basically the end goal is I just want a command to return the newest folder name like 'v6.1.102' from the example above.
To answer #1, you could add the --prefix path1/path2 to limit what you're querying in the bucket.
In terms of sorting by last modified, I can only think of using an SDK such as boto3 to combine list_objects_v2 and head_object, fetch the last-modified timestamps, and sort programmatically.
Update
Alternatively, you could reverse-sort by LastModified in JMESPath and return the first item to get the most recent object, then extract the directory from its key.
aws s3api list-objects-v2 \
--bucket myBucket \
--prefix path1/path2 \
--query 'reverse(sort_by(Contents,&LastModified))[0]'
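To get from that object back to just the folder name (the stated end goal), you can narrow the query to the key and cut out the third path segment. A hedged sketch, assuming every key looks like path1/path2/<version>/...:
aws s3api list-objects-v2 \
    --bucket myBucket \
    --prefix path1/path2/ \
    --query 'reverse(sort_by(Contents, &LastModified))[0].Key' \
    --output text | cut -d/ -f3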
If you want general purpose querying e.g. "lowest version", "highest version", "all v6.x versions" then consider maintaining a separate database with the version numbers.
If you only need to know the highest version number and you need that to be retrieved quickly (quicker than a list object call) then you could maintain that version number independently. For example, you could use a Lambda function that responds to objects being uploaded to path1/path2 where the Lambda function is responsible for storing the highest version number that it has seen into a file at s3://mybucket/version.max.
Prefix works with list_objects using the boto3 client, but using the boto3 resource might give you some issues. Pagination via a paginator is a great concept and works nicely. To find the latest changes (additions of objects): sort_by(Contents, &LastModified)[-1]

How to supply a key on the command line that's not Base 64 encoded

Regarding the AWS S3 tool "sync" and a "customer-provided encryption key", it says here,
--sse-c-key (string) The customer-provided encryption key to use to server-side encrypt the object in S3. If you provide this value, --sse-c must be specified as well. The key provided should not be base64 encoded.
How does one supply a key on the command line that is not base64 encoded?
If the key is not base64 encoded, then surely some of the key's bytes would not be expressible as characters?
At first glance, this seems like a HUGE oversight in the AWS CLI. However, buried deep in the CLI documentation is a blurb on how to provide binary data on the command line.
https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-parameters-file.html
(updated link per @Chris's comment)
This did in fact work for me...
aws s3 cp --sse-c AES256 --sse-c-key fileb://key.bin large_file s3://mybucket/
The fileb:// prefix is the answer.
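For completeness, a hedged sketch of producing such a raw (non-base64) key file in the first place, assuming openssl is available:
# generate a random 256-bit key as raw bytes; do not base64-encode it
openssl rand -out key.bin 32
aws s3 cp --sse-c AES256 --sse-c-key fileb://key.bin large_file s3://mybucket/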

getting malformed policy error from cloudfront

I am trying to create signed URLs for CloudFront. I followed the docs from Amazon and was able to configure CloudFront and S3 using the console. But the problem is when I create the signed URL (I generated the policy and signature using the Linux commands) and prepare the below URL:
http://1q2w3e4r5t6y7u.cloudfront.net/4/myimage.jpg?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9kbHIyamJoZGdobTE4LmNsb3VkZnJvbnQubmV0LzQvM2IwYWNiMjYtYTUyOC00MTYwLWE1Y2YtNDEzZWI3NGRkNjcxLmpwZyIsIkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTQwODczOTQwMH0sfX1dfQ0K&Signature=jOv/hpQSO7ChSYQ3w9k2EVh7MUrBxQ2dqbjQNPuEFcWgCKcBT6BufQoMnGWmVLHnIvFr8/ErQC2Q6iAxTyxHoHN7K9FMB2QmLbqaenKaRh8RIcufTmOlsbWXxMpQTwFOquQX7JE/2i4m6OGZBi4Chwse9fQwzHdQ4A6FPr/r8l0rDHLBXF58z8mq3tqJIqiE3joxJoy2K5dY4tzIXWCGZ25L941O8dkpSrmDbmQii8iGiJUGE0bFICpndlEbDVDUkHZsMSPXYt8fjJ2YTIbL58QtaVLMJeXY0kuDq4IUZ8ryp7BZ1Cqj5RKnkToIO4Qe5fNbfl9g-6nydcUbr6q72g__&Key-Pair-Id=xxxxxxxxxxxxxxxxxxxx
But I keep on getting a "Malformed url" error. Please help!
Well, it does look malformed... the signature has several / characters, and it shouldn't.
The docs indicate that this pipeline can be used to build the signature:
cat policy | openssl sha1 -sign private-key.pem | openssl base64 | tr '+=/' '-_~'
If you do that, there shouldn't be any / left in your signature -- they would all have been converted to the ~ character.
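Putting it together, a hedged sketch of assembling the whole URL. It assumes the policy sits in a file named policy (as in the pipeline above); the domain and key-pair ID are placeholders, and the policy gets the same tr treatment as the signature:
policy_b64=$(openssl base64 -A < policy | tr '+=/' '-_~')
signature=$(openssl sha1 -sign private-key.pem < policy | openssl base64 -A | tr '+=/' '-_~')
echo "http://1q2w3e4r5t6y7u.cloudfront.net/4/myimage.jpg?Policy=${policy_b64}&Signature=${signature}&Key-Pair-Id=KEY_PAIR_ID"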

Cannot delete Amazon S3 key that contains bad character

I just began to use S3 recently. I accidentally made a key that contains a bad character, and now I can't list the contents of that folder, nor delete that bad key. (I've since added checks to make sure I don't do this again).
I was originally using an old "S3" Python module from 2008. Now I've switched to boto 2.0, and I still cannot delete it. I did quite a bit of research online, and it seems the problem is that I have an invalid XML character in the key; that's a problem at the lowest level, and no API has helped so far.
I finally contacted Amazon, and they said to use "s3-curl.pl" from http://aws.amazon.com/code/128. I downloaded it, and here's my key:
<Key>info/[01</Key>
I think I was doing a quick bash for loop over some files at the time, and I have "lscolors" set up, and so this happened.
I tried
./s3curl.pl --id <myID> --key <myKEY> -- -X DELETE https://mybucket.s3.amazonaws.com/info/[01
(and also tried putting the URL in single/double quotes, and also tried to escape the '[').
Without quotes on the URL, it hangs. With quotes, I get "curl: (3) [globbing] error: bad range specification after pos 50". I edited the s3-curl.pl to do curl --globoff and still get this error.
I would appreciate any help.
This solved the issue for me: just delete the main folder:
aws s3 rm "s3://BUCKET_NAME/folder/folder" --recursive
You can use the s3cmd tool from here. You first need to run
s3cmd fixbucket <bucket name that contains bad file>.
You can then delete the file using
s3cmd del <bucket>/<file>
In my case there was a carriage return at the end of the key (however that happened...). I was able to fix it with the AWS CLI like this:
aws s3 rm "s3://my_bucket/Icon"$'\r'
I also had versioning enabled, so I also needed to do this for all the versions (version IDs are visible in the UI when enabling the version view):
aws s3api delete-object --bucket my_bucket --key "Icon"$'\r' --version-id <version_id>
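A hedged sketch of pulling those version IDs from the CLI instead of the UI, using the same placeholder bucket and key:
aws s3api list-object-versions --bucket my_bucket --prefix "Icon" \
    --query 'Versions[].{Key: Key, VersionId: VersionId}'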
I was in this situation recently. To list the items you can use:
aws s3api list-objects-v2 --bucket my_bucket --encoding-type url
The bad keys will come back URL-encoded, like:
"Key": "%01%C3%B4%C2%B3%C3%8Bu%C2%A5%27%40yr%3E%60%0EQ%14%C3%A5.gif"
Spaces became + and I had to change those to %20, and * wasn't encoded, so I had to replace it with %2A, before I was able to delete them.
To actually delete them, I wasn't able to use the AWS CLI because it would URL-encode the already URL-encoded key, resulting in a 404, so to get around that I manually hit the REST API with the DELETE verb (sketched below).
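A hedged sketch of that manual DELETE, assuming curl >= 7.75 (which can SigV4-sign requests itself); the region, bucket, and already-encoded key are placeholders:
curl --globoff --aws-sigv4 "aws:amz:us-east-1:s3" \
    --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
    --request DELETE "https://my_bucket.s3.us-east-1.amazonaws.com/URL_ENCODED_KEY"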
I recently encountered this case: I had a carriage return at the end of a path in my bucket. The following command solved the matter (note that aws s3 rm expects an s3:// URI):
aws s3 rm "s3://bucket_name/path"$'\r' --recursive