Why is AccessKeyId included in an S3 pre-signed URL? Is it really necessary? The pre-signed URL already includes the Signature field, so why does it still require the AccessKeyId as well? Wouldn't the Signature alone be sufficient?
The signature is used to prove two things:
that the signer authorizes this specific request, and
that the signer was in possession of the secret key associated with the specified access-key-id.
Importantly... the signature does not actually contain any meaningful information. It's either right or it's wrong.
It's an HMAC-based hash of public (the request being made) and private (the secret key) information. The service doesn't "decode" it or interpret it or learn anything from it.
Instead, the service -- using the access-key-id -- looks up the associated secret key,¹ takes the request, and internally generates the signature you should have generated for the same request... then it checks to see whether that's what you actually generated.² If not, the error is SignatureDoesNotMatch. The error is not more specific because the signature for any given request at any moment in time has only one possible value. Any other signature is simply the wrong signature.
But the access-key-id must be specified so the service knows who's making the request. The signature does not contain any reversible/decodable/decryptable information.
¹ looks up the associated secret key is probably an oversimplification when using Signature Version 4 because there are layers of (date, region, service, signing) keys derived from the IAM user's secret key... and the structure and nesting implies that individual services have access only to the relevant values they need.
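As a rough illustration of that layering, here is a minimal sketch (Python, standard library only) of the Signature Version 4 signing-key derivation chain described in the AWS documentation; the function and variable names are illustrative, not part of any SDK:

import hmac, hashlib

def hmac_sha256(key, msg):
    # each step keys an HMAC-SHA256 with the previous step's output
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()

def derive_signing_key(secret_key, date_stamp, region, service):
    # date_stamp e.g. '20240115', region e.g. 'us-east-1', service e.g. 's3'
    k_date = hmac_sha256(('AWS4' + secret_key).encode('utf-8'), date_stamp)
    k_region = hmac_sha256(k_date, region)
    k_service = hmac_sha256(k_region, service)
    k_signing = hmac_sha256(k_service, 'aws4_request')
    return k_signing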
² you generated is an important phrase, since there is some potential for misunderstanding of the source of pre-signed URLs. These are generated entirely in your code, with no interaction with the service. S3 is unaware of the existence of any pre-signed URLs until they are actually used. This has implications that can sometimes be useful; for example, it is entirely possible to generate a pre-signed URL for an object that does not yet exist, and create the object later. Also, disabling or deleting the aws-access-key-id that was used to generate a pre-signed URL immediately invalidates all URLs that key ever generated.
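To make the "generated entirely in your code" point concrete, here is a minimal boto3 sketch (the bucket and key are placeholders); no request is sent to S3, and the key does not need to exist yet:

import boto3

s3 = boto3.client('s3')

# Computed locally from the credentials, bucket/key, and expiration --
# S3 is never contacted while this URL is being generated.
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'example-bucket', 'Key': 'path/to/object'},
    ExpiresIn=3600,  # seconds
)
print(url)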
Related
I know based on the AWS docs here
https://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlUploadObject.html
that it's possible to generate a URL which can be used to
upload a specific object to your bucket
and that
You can use the presigned URL multiple times, up to the expiration date and time.
Is it also possible to generate a URL (perhaps a base S3 presigned URL) which would allow multiple different unique documents to be uploaded using a single URL?
For example, let's imagine a client application would like to upload multiple unique/distinct documents to S3 using some type of presigned URL. I don't necessarily want to force them to get a batch of presigned URLs, since that would require much more on the part of the client (they would have to request a batch of presigned URLs rather than a single URL).
Here is the flow for a single document upload.
What is the simplest known solution for allowing a client to use some type of presigned url to upload multiple documents?
Is it also possible to generate a URL (perhaps a base S3 presigned URL) which would allow multiple different unique documents to be uploaded using a single URL?
A presigned URL is limited to a single object key. You can't, for example, presign a key of foo and then use it to upload foo/bar (because that's a different key).
That means that, if you want to provide the client with a single pre-signed URL, the client code will have to combine the files itself. For example, you require the client to upload a ZIP file, then trigger a Lambda that unpacks the files in that ZIP.
Another approach is to use the AWS SDK from the client, and use the Assume Role operation to generate temporary access credentials that are restricted to uploading files with a specified prefix using an inline session policy.
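A rough sketch of that approach with boto3 (the role ARN, bucket, and prefix below are placeholders); the inline session policy can only further restrict what the assumed role already allows:

import json
import boto3

sts = boto3.client('sts')

session_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::example-bucket/uploads/client-123/*"
    }]
}

# Temporary credentials limited to the prefix above.
creds = sts.assume_role(
    RoleArn='arn:aws:iam::123456789012:role/client-upload-role',
    RoleSessionName='client-upload',
    Policy=json.dumps(session_policy),
)['Credentials']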
A third approach is to hide the URL requests. You don't say what your client application does, but assuming that you let the user select some number of files, you could simply loop over those files and retrieve a URL for each one without ever letting your user know that's happening.
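A minimal sketch of that idea (the /presign endpoint and its response format are hypothetical -- they stand in for whatever your application exposes to hand back one pre-signed PUT URL per file):

import requests

files = ['a.pdf', 'b.pdf', 'c.pdf']

for name in files:
    # hypothetical application endpoint returning a pre-signed PUT URL
    presigned = requests.get('https://api.example.com/presign',
                             params={'filename': name}).json()['url']
    with open(name, 'rb') as f:
        requests.put(presigned, data=f)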
It is possible to upload multiple files with a single pre-signed URL and a properly configured starts-with policy. Please refer to the following AWS documentation: Browser-Based Uploads Using POST
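For example, boto3's generate_presigned_post accepts a starts-with condition on the key (the bucket name and prefix below are placeholders), and the resulting policy can be reused for any key under that prefix until it expires:

import boto3

s3 = boto3.client('s3')

post = s3.generate_presigned_post(
    Bucket='example-bucket',
    Key='uploads/${filename}',
    Conditions=[['starts-with', '$key', 'uploads/']],
    ExpiresIn=3600,
)
# post['url'] and post['fields'] are then used in a browser-based POST upload.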
I am trying to send a request to AWS, signed using the AWS Signature v4 Implementation for Web Browsers.
My request looks like:
GET /test?id=ID-12
I get the 403 error with message like:
The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.
The Canonical String for this request should have been
'GET
/test
accept:*/*
...
So, as you see, there is no param here, which I think is the issue. But what I don't understand is how AWS can make a suggestion about what the canonical string should have been. I mean, the signature is represented by a hash, no? Or am I missing something? Thanks in advance!
Simplifying the process a bit, AWS uses HMAC to generate the signature.
One of the key principles is that HMAC does not encrypt the message. The message must be sent alongside the HMAC hash. The receiving side will calculate the HMAC again and verify the results.
AWS explicitly talks a bit about this in the Signature Documentation:
When an AWS service receives the request, it performs the same steps that you did to calculate the signature you sent in your request. AWS then compares its calculated signature to the one you sent with the request. If the signatures match, the request is processed. If the signatures don't match, the request is denied.
To answer your explicit question: the string they're showing you that they used to generate the "Canonical String" is derived from the HTTP request itself -- the HTTP method ("GET") on the first line, the request path on the second line, and so on.
In other words, they're expecting the caller to understand what the request will look like, generate the Canonical String themselves beforehand, run their signature algorithm on it and the shared secret of the Access Key's Secret, and include the resulting hash in the request. Then on their side they take elements from the HTTP request, run this process again, and verify the result is correct.
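A much-simplified sketch of that verify-by-recomputation idea (this deliberately leaves out the real SigV4 canonical-request and key-derivation steps):

import hmac, hashlib

def sign(secret_key, canonical_string):
    return hmac.new(secret_key, canonical_string.encode('utf-8'),
                    hashlib.sha256).hexdigest()

def verify(secret_key, canonical_string_rebuilt_from_request, signature_from_request):
    # the server rebuilds the canonical string from the request it received,
    # signs it itself, and compares -- it never "decodes" the signature
    expected = sign(secret_key, canonical_string_rebuilt_from_request)
    return hmac.compare_digest(expected, signature_from_request)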
For your failure, if you post how you're generating the presigned URL, we might be able to diagnose where the failure is.
Whenever an object is read/downloaded, I want to trigger a Google Cloud Storage event that is handled by a Google Cloud Function, which calls an auth endpoint to check whether the user has access to the requested file; if so, the user receives the file.
Use case: When a user requests a stored file, I want to first check whether the user is allowed to download that specific file by calling an auth endpoint to authenticate and authorize the user's access to the file. Only if he/she is allowed will the file be downloaded.
I've only found 4 event types supported by Google Cloud Storage Triggers: finalize, delete, archive and metadataUpdate.
https://cloud.google.com/functions/docs/calling/storage#event_types
Answering your question: "Is there a way to trigger a Google Cloud Storage download object event?"
Basically you are asking how to "download" an object. But under your use case it would be: how to let a user download a specific object.
A solution for that would be a Signed URL, which lets you provide the user with a URL to download the object for a limited time. If you redirect the user directly to that URL, the download starts immediately.
You can use a signed URL, which is essentially a URL attached to a specific file that permits access to private objects stored on GCS. It is a means of keeping objects secure while still granting temporary access to a specific object. Note that there is no concept of a signed URL for a directory of files.
It is created via a hash calculation based on the object path, the expiry time, and a shared Secret Access Key belonging to an account that has permission to access the GCS object. As such, each signed URL is unique.
Essentially, anyone who has this signed URL can perform whatever operations the URL allows. The only way to secure this further in case of a breach would be to rotate or change the signing keys.
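A minimal sketch with the google-cloud-storage client library (the bucket and object names are placeholders; the credentials in use must be able to sign):

from datetime import timedelta
from google.cloud import storage

client = storage.Client()
blob = client.bucket('example-bucket').blob('reports/report.pdf')

# V4 signed URL, valid for 15 minutes, allowing only GET
url = blob.generate_signed_url(
    version='v4',
    expiration=timedelta(minutes=15),
    method='GET',
)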
If I set up my app to generate pre-signed URLs for access to S3 media (so that I can set the files to be private, unless accessed via a logged in user) then would I be right in saying that, if someone has access to the URL (within the expiry time) they can see the file, despite it being "private"?
So if someone was to send the URL to someone else, then it's not really private any more.
I guess there's no other way but this just seems odd to me.
Yes, you are correct that a signed URL can be "shared" because it is valid until it expires (or until the credentials that signed it expire or are otherwise invalidated, whichever comes first).
One common solution is for your application to generate signed URLs as the page is being rendered, using very short expiration times.
Another is for the link to the secured content to actually be a link back to the application, which verifies the user's authority to access the object, and then returns an HTTP redirect to a freshly-generated signed URL with a short expiration time (e.g. 5 seconds).
HTTP/1.1 302 Found
Location: https://example-bucket.s3.amazonaws.com/...?X-Amz-...
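A rough sketch of that redirect pattern using Flask and boto3 (the bucket name and the user_can_access() check are placeholders for your own application logic):

import boto3
from flask import Flask, abort, redirect

app = Flask(__name__)
s3 = boto3.client('s3')

@app.route('/files/<path:key>')
def download(key):
    # user_can_access() is a stand-in for your own authorization check
    if not user_can_access(key):
        abort(403)
    url = s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': 'example-bucket', 'Key': key},
        ExpiresIn=5,  # short-lived, as described above
    )
    return redirect(url, code=302)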
Signed URLs cannot be tampered with using currently feasible compute capabilities, so it is impractical to the point of impossibility for a signed URL to be modified by a malicious user.
Note also that a signed URL (for either S3 or CloudFront) only needs to be not-yet-expired when the download starts. The time required for the download to actually finish can be arbitrarily long, and the download will not be interrupted.
There is no ready-made service for the following option, but using a combination of CloudFront Lambda#Edge triggers and DynamoDB, it is possible to create a genuinely single-use URL, which consists of a randomly generated "token" stored in the Dynamo table and associated with the target object. When the URL is accessed, you use a DynamoDB conditional update in the Lambda trigger to update the (e.g.) "view_count" value from 0 to 1. If the token isn't in the table or the view count isn't 0, the conditional update fails, so access is denied; otherwise CloudFront allows the request to proceed -- exactly once. CloudFront accesses the S3 content using an Origin Access Identity, which all happens behind the scenes, so nothing related to the actual authentication of the request between CloudFront and S3 is accessible to the user. (For cryptographic-quality random token generation, you can also use KMS's GenerateRandom API action.)
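The heart of that approach is the conditional update; a rough sketch of just that piece with boto3 (the table and attribute names are made up for illustration):

import boto3
from botocore.exceptions import ClientError

table = boto3.resource('dynamodb').Table('single-use-tokens')

def consume_token(token):
    # Succeeds exactly once: the condition fails for unknown tokens
    # and for tokens whose view_count has already been incremented.
    try:
        table.update_item(
            Key={'token': token},
            UpdateExpression='SET view_count = view_count + :one',
            ConditionExpression='attribute_exists(#t) AND view_count = :zero',
            ExpressionAttributeNames={'#t': 'token'},
            ExpressionAttributeValues={':one': 1, ':zero': 0},
        )
        return True
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            return False
        raise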
There are a number of alternative approaches, including other uses of Lambda#Edge triggers to do things like inspect a request for an application-provided cookie and then querying the application server to authenticate the user.
CloudFront also supports signed cookies that it parses and interprets, itself, but these provide wildcard-based access to all your assets matching a specific URL and path (e.g. /images/*) and there is nothing to prevent a user from sharing their cookies, so these are probably not useful for your use case.
CloudFront signed URLs do support the option of allowing access only if the signed URL is used from a specific source (client) IP address, but this has potential problems in that there is no assurance that a 1:1 correlation exists between users and IP addresses. Many users can be behind the same address (particularly in corporate network environments), or a single user's address can change at any moment.
The complexity of the possible implementations varies wildly, and what you need depends in part on how secure you need for your content to be. In many cases, more extreme solutions accomplish little more than discouraging honest users, because the user can still download the resource and share it via other means.
That would still be a separate user requesting content. For a separate user, the certificate would no longer be valid.
Source: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-signed-urls.html
I am writing a Chalice deployment, and experiencing behavior that I cannot explain.
My root endpoint accepts PUT requests, and verifies some basic authorization credentials.
from chalice import Chalice
from base64 import b64decode
app = Chalice(app_name='test-basic-auth-issue')
@app.route('/', methods=['PUT'])
def index():
    auth = app.current_request.headers['Authorization'].split()
    # the header value is "Basic <base64>", so decode the second token
    username, password = b64decode(auth[1]).decode().split(':')
    if username == 'test-user' and password == 'test-password':
        return {username: password}
    else:
        raise Exception('Unauthorized')
Using curl to interface with this API:
curl https://test-user:test-password@<API-URL>/dev/ --upload-file test.txt
I get the following response:
{"message":"Authorization header requires 'Credential' parameter. Authorization header requires 'Signature' parameter. Authorization header requires 'SignedHeaders' parameter. Authorization header requires existence of either a 'X-Amz-Date' or a 'Date' header. Authorization=Basic dGVzdC11c2VyOnRlc3QtcGFzc3dvcmQ="}
However - when including any parameters in the URL:
curl https://test-user:test-password@<API-URL>/dev/?ANYTHING --upload-file test.txt
I get the expected response of:
{"test-user": "test-password"}
I'm not sure why specifying parameters is affecting the Authorization.
I have no inside information, so I have no way of knowing whether the following is actually correct, but this seems like a reasonable explanation for the behavior you're seeing.
AWS APIs generally support two alternatives for supplying your credentials: query string parameters, or the Authorization: header.
To the layer in their stack that checks the Authorization: header, your value seems wrong, so they throw an error, since your supplied credentials are not in the correct format...
...unless it sees a query string in the URI... in which case, it could be choosing to allow the request to proceed, on the assumption that the authorization might be done at that layer.
So the request is handed off to a different layer, which is responsible for query string handling. It doesn't find any credentials in the query string, but it is also aware of no credentials having been found while headers were being processed, previously, so the request is processed as an anonymous request if those are allowed.
So, you're slipping through a hole: by adding a query string, any query string, you prevent the Authorization: header from throwing the error.
It's not a security vulnerability, in my estimation, but rather a case where something in the URI changes how headers are interpreted -- specifically, whether a malformed (for its purposes) authorization header will trigger an exception or be allowed to pass.
I think there's a reasonable case for calling this behavior "broken," but at the same time I suspect it may be out of the hands of the API Gateway developers, who are working behind an unnamed front-end component that is common to multiple AWS services. API Gateway is a bit of an exception, in the AWS ecosystem, in that the customer can define how headers are manipulated... so this may very well be simply a platform limitation.
I disagree -- in part, and on a technicality -- with @LorenzodeLara's assertion that API Gateway is not compliant with RFC-7235. There is no requirement that a server respond with WWW-Authenticate: -- from the RFC:
Upon receipt of a request for a protected resource that omits credentials, contains invalid credentials (e.g., a bad password) or partial credentials (e.g., when the authentication scheme requires more than one round trip), an origin server SHOULD send a 401 (Unauthorized) response that contains a WWW-Authenticate header field with at least one (possibly new) challenge applicable to the requested resource.
The words SHOULD and RECOMMENDED in RFCs indicate desired behavior for which there may be valid exceptions. They are not mandatory requirements.
It is, on the other hand, fully accurate that API Gateway does not at all support authenticating requests from its perspective with HTTP basic auth... but you are only trying to get it to pass those credentials through to the code running inside, which does appear to work, given that your user agent submits credentials without a challenge... assuming you prevent the front-end system from thwarting you, with the addition of a query string.
That's my analysis, subject to correction by someone with access to more authoritative information. In that light, using basic auth may not be the best plan, since it appears to work somewhat by accident.
I hate to come to that conclusion. Basic auth gets a bad rap, which is not wholly merited when combined with HTTPS -- unlike request signing, it typically "just works," right out of the box, even for a user who doesn't understand the difference between GET and POST, much less how to generate a hex-encoded HMAC digest.