Query AWS SNS Endpoints by User Data

Simple question, but I suspect it doesn't have a simple or easy answer. Still, worth asking.
We're building push notifications on AWS: our web server runs on EC2 and sends messages to an SQS queue, which is processed by a Lambda function and finally handed to SNS for delivery to the iOS/Android apps.
The question I have is this: is there a way to query SNS endpoints based on the custom user data you can provide at creation? The only way I see to do this so far is to list all the endpoints in a given platform application and then search that list for the user data I'm looking for; a more direct approach would be far better.
Why I want to do this is simple: if I could attach a user identifier to these device endpoints and query based on it, I could avoid having to save the ARN to our DynamoDB database at all. It would save a lot of implementation time and complexity.
Let me know what you guys think, even if what you think is that this idea is impractical and stupid, or if searching through all of them is the best way to go about this!
Cheers!

There's no way to attach a "where" clause to ListTopics (or to ListEndpointsByPlatformApplication). I see two possibilities:
Create a new SNS topic per user that has some identifiable id in it. For example, the ARN would be something like "arn:aws:sns:us-east-1:123456789:known-prefix-user-id". The obvious downside is that you have the potential for a boatload of SNS topics.
Use a service designed for this type of usage like PubNub. Disclaimer - I don't work for PubNub or own stock but have successfully used it in multiple projects. You'll be able to target one or many users this way.

According to the AWS documentation, if you try to create a new platform endpoint with the same user data, you should get back an exception that includes the ARN of the existing platform endpoint.
It's definitely not ideal, but it is a roundabout way of querying the user-data endpoint attribute via an exception.
// Query CustomUserData by exception
CreatePlatformEndpointRequest cpeReq = new CreatePlatformEndpointRequest()
        .withPlatformApplicationArn(applicationArn)
        .withToken("dummyToken")
        .withCustomUserData("username");
CreatePlatformEndpointResult cpeRes = client.createPlatformEndpoint(cpeReq);
You should get an exception containing the ARN if an endpoint with the same CustomUserData already exists.
Then you just use that ARN and away you go.
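For anyone on the AWS SDK for Go (v1), a minimal sketch of the same trick; the application ARN is a placeholder, and the exact wording of the error message that carries the existing ARN is an assumption, so the regex may need adjusting:

package main

import (
	"fmt"
	"regexp"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sns"
)

func main() {
	svc := sns.New(session.Must(session.NewSession()))

	// If an endpoint already exists with conflicting attributes, SNS
	// rejects the call and the error message contains the existing ARN.
	_, err := svc.CreatePlatformEndpoint(&sns.CreatePlatformEndpointInput{
		PlatformApplicationArn: aws.String("arn:aws:sns:us-east-1:123456789:app/APNS/my-app"), // placeholder
		Token:                  aws.String("dummyToken"),
		CustomUserData:         aws.String("username"),
	})
	if err != nil {
		// Assumed message shape: "... Endpoint arn:aws:sns:... already exists ..."
		if arn := regexp.MustCompile(`arn:aws:sns:\S+`).FindString(err.Error()); arn != "" {
			fmt.Println("existing endpoint:", arn)
		}
	}
}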

Related

AWS IOT - RegisterThing fails -- InvalidCertificateOwnershipToken

This is a real challenge. I have been successful in everything up to this point in Fleet provisioning on an embedded device. I have subscribed and published to topics and received new certificates and keys. But, when I take the certificateOwnershipToken that has been given to me and I try to trigger a DeviceRegistration, I get:
{"statusCode":400,"errorCode":"InvalidCertificateOwnershipToken","errorMessage":"Certificate ownership token cannot be empty."}
My token is 466 characters long and I send it with 2 other items in this string:
{"certificateOwnershipToken":"eyF1ZXJzaW9uIjoiMjAxOTEwMjMiLCJjaXBoZXIiOiJBaURqMUdYMjBiVTUwQTFsTUV4eEJaM3ZXREU1dXZSSURoTy80cGpLS1o1VkVHdlZHQm81THNaS1VydG0zcTdoZGtVR0l1cmJZS0dLVkx2dTZNL2ViT2pkVFdIeDEwU3o3aFZPeExERkxWVlJ4OUIvL2RzcXRIeVp1WVo2RXZoU1k0L0txQ0doZ1lyRklwZGlLK05pUlNHMXlLQXJUSGJXSkNlVUxHcHRPWHJtdHJaNWJMUyt1MHFUcjNJVnlVLzNpcGZVVm1PanpmL3NCYzdSNkNyVGJPZ05Nc2xmOXdHSVRWM0tPUjd1aFFSbnZySnY0S1ZtU2hYc2REODI4K1crRE1xYnRYZGUxSXlJU29XZTVTSHh6NVh2aFF3OGl3V09FSzBwbG15Zi82dUgyeERoNjB1WS9lMD0ifQ==","parameters":{"SerialNumber":"82B910","CertificateId":"175b43a3d605f22d30821c4a920a6231978e5d846d3f2e7a15d2375d2fd5098c"}}
My template looks right, my policy looks correct, and the role attached to my template seems to cover my needs. I just don't know how AWS is failing without more information.
Does anyone have ideas on how to proceed?
I found my problem. In the C/C++ AWS IoT SDK there is a data structure in which you must specify the payload string and a few other things. One of those data elements is the length of the payload, and I had forgotten to set that length before sending my payload to the $aws/provisioning-templates//provision/json topic. Once I set that length, the submission worked, the template was acted upon, and the thing was created.

Send S3 document to Textract using Go

I'm trying to use Go to send objects in an S3 bucket to Textract and collect the response.
I'm using the AWS Go SDK and am able to connect to my S3 bucket and list all the objects it contains. So far so good. I now need to be able to send one of those objects (a .pdf file) to Textract and collect the response(s).
The AWS Go SDK content for interacting with Textract seems to be quite extensive, but I cannot find a good example of how to do this.
I would be very grateful for a sample or advice on how to do this.
To start a job, you invoke StartDocumentTextDetection, using a DocumentLocation to specify the file, and you specify an SNS topic where Textract will publish a notification when it has finished processing your job.
You now have two possibilities:
Subscribe to the SNS topic and, when you receive a message, retrieve the result.
Create a Lambda function triggered by the SNS topic, which retrieves the result.
The second option is IMO better because it uses less computation time (nothing runs until the job has finished).
To retrieve the job's results, you use GetDocumentTextDetection.
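A minimal sketch of that start/notify/retrieve flow with the AWS SDK for Go (v1); the bucket, key, role ARN, and topic ARN are placeholders, and in a real setup the GetDocumentTextDetection call would live in the SNS subscriber or Lambda rather than directly after the start call:

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/textract"
)

func main() {
	sess := session.Must(session.NewSession())
	svc := textract.New(sess)

	// Start an asynchronous text-detection job on a PDF stored in S3.
	start, err := svc.StartDocumentTextDetection(&textract.StartDocumentTextDetectionInput{
		DocumentLocation: &textract.DocumentLocation{
			S3Object: &textract.S3Object{
				Bucket: aws.String("my-bucket"),    // placeholder
				Name:   aws.String("document.pdf"), // placeholder
			},
		},
		// Textract publishes a message here when the job finishes.
		NotificationChannel: &textract.NotificationChannel{
			RoleArn:     aws.String("arn:aws:iam::123456789012:role/textract-sns"), // placeholder
			SNSTopicArn: aws.String("arn:aws:sns:us-east-1:123456789012:textract"), // placeholder
		},
	})
	if err != nil {
		panic(err)
	}

	// After the SNS notification arrives, fetch the result by job id.
	out, err := svc.GetDocumentTextDetection(&textract.GetDocumentTextDetectionInput{
		JobId: start.JobId,
	})
	if err != nil {
		panic(err)
	}
	for _, block := range out.Blocks {
		if aws.StringValue(block.BlockType) == "LINE" {
			fmt.Println(aws.StringValue(block.Text))
		}
	}
}

Note that GetDocumentTextDetection paginates through NextToken for large documents.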
If anyone else reaches this site searching for an answer:
I understood the documentation as saying I could just call the StartDocumentAnalysis function through the Textract SDK, but what was missing is that you need to create a new Session first and make the calls through that session:
https://docs.aws.amazon.com/sdk-for-go/api/service/textract/#New

Should I store failed login attempts in AWS Cognito or DynamoDB?

I have a requirement to build a basic "3 failed login attempts and your account gets locked" functionality. The project uses AWS Cognito for authentication, and the Cognito PreAuth and PostAuth Lambda triggers look like they will help here.
So the basic flow is to increment a counter in the PreAuth Lambda, check it and block login there, and reset the counter in the PostAuth Lambda (so successful logins don't end up locking the user out). Essentially it boils down to:
PreAuth Lambda:
    if failed-login-count > LIMIT:
        block login
    else:
        increment failed-login-count
PostAuth Lambda:
    reset failed-login-count to zero
Now at the moment I am using a dedicated DynamoDB table to store the failed-login-count for a given user. This seems to work fine for now.
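For illustration, a sketch of what that PreAuth increment could look like in Go; the failed_logins table name and username key are hypothetical, and the atomic ADD avoids lost updates from concurrent attempts:

package main

import (
	"fmt"
	"strconv"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
)

// incrementFailedLogins atomically bumps the per-user counter and returns
// the new value, which the PreAuth Lambda can compare against its limit.
func incrementFailedLogins(svc *dynamodb.DynamoDB, username string) (int64, error) {
	out, err := svc.UpdateItem(&dynamodb.UpdateItemInput{
		TableName: aws.String("failed_logins"), // hypothetical table
		Key: map[string]*dynamodb.AttributeValue{
			"username": {S: aws.String(username)},
		},
		// ADD is atomic, so concurrent login attempts cannot lose an update.
		UpdateExpression: aws.String("ADD attempts :one"),
		ExpressionAttributeValues: map[string]*dynamodb.AttributeValue{
			":one": {N: aws.String("1")},
		},
		ReturnValues: aws.String("UPDATED_NEW"),
	})
	if err != nil {
		return 0, err
	}
	return strconv.ParseInt(aws.StringValue(out.Attributes["attempts"].N), 10, 64)
}

func main() {
	svc := dynamodb.New(session.Must(session.NewSession()))
	if n, err := incrementFailedLogins(svc, "alice"); err == nil && n > 3 {
		fmt.Println("block this login") // i.e. fail the PreAuth trigger
	}
}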
Then I figured it'd be neater to use a custom attribute in Cognito (using CognitoIdentityServiceProvider.adminUpdateUserAttributes) so I could throw away the DynamoDB table.
However, reading https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-dg.pdf, the section titled "Configuring User Pool Attributes" states:
Attributes are pieces of information that help you identify individual users, such as name, email, and phone number. Not all information about your users should be stored in attributes. For example, user data that changes frequently, such as usage statistics or game scores, should be kept in a separate data store, such as Amazon Cognito Sync or Amazon DynamoDB.
Given that the counter will change on every single login attempt, the docs would seem to indicate I shouldn't do this...
But can anyone tell me why? Or if there would be some negative consequence of doing so?
As far as I can see, Cognito billing is purely based on storage (i.e. the number of users) and not on operations, whereas DynamoDB charges for reads, writes, and storage.
Could it simply be AWS not wanting people to abuse Cognito as a storage mechanism? Or am I being daft?
We are dealing with a similar problem, and the main reason we decided to store extra attributes in a DB is that Cognito has quotas on all of its actions, and AdminUpdateUserAttributes is limited to 25 requests per second.
More information here:
https://docs.aws.amazon.com/cognito/latest/developerguide/limits.html
So if you have a pool with 100k users or more, it can create a bottleneck if you want to update a Cognito user's records on every login, etc.
Cognito UserAttributes are meant to store information about the users. This information can then be read from the client using the AWS Cognito SDK, or just by decoding the idToken on the client-side. Every custom attribute you add will be visible on the client-side.
Custom attributes have other downsides:
You only have 25 values to set
They cannot be removed or changed once added to the user pool.
I have personally used custom attributes and the interface to manipulate them is not excellent. But that is just a personal thought.
If you want to store this information without depending on DynamoDB, you can use Amazon Cognito Sync. Besides the service itself, it offers a client with great features that you can incorporate into your app.
AWS DynamoDB appears to be your best option; it is commonly used for exactly this kind of use case. Some of the benefits of using it:
You can store a separate record for each login attempt, with as much info as you want, such as IP address, location, user agent, etc. You can also add a datetime so the pre-auth Lambda can query by time range, for example failed attempts within the last 30 minutes.
You don't need to clean the table up yourself, because you can set a TTL on each DynamoDB record so it is deleted automatically after a specified time (see the sketch after this list).
You can also archive expired items to S3.
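A sketch of writing one such record per failed attempt in Go; the login_attempts table, its key schema, and the ttl attribute name are assumptions (TTL must also be enabled on that attribute in the table settings):

package main

import (
	"strconv"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
)

func main() {
	svc := dynamodb.New(session.Must(session.NewSession()))
	now := time.Now()

	// One item per failed attempt; DynamoDB removes it automatically once
	// the epoch-seconds value in the "ttl" attribute has passed.
	_, err := svc.PutItem(&dynamodb.PutItemInput{
		TableName: aws.String("login_attempts"), // hypothetical table
		Item: map[string]*dynamodb.AttributeValue{
			"username":  {S: aws.String("alice")},
			"attemptAt": {N: aws.String(strconv.FormatInt(now.Unix(), 10))},
			"ip":        {S: aws.String("203.0.113.10")}, // example context
			"ttl":       {N: aws.String(strconv.FormatInt(now.Add(30*time.Minute).Unix(), 10))},
		},
	})
	if err != nil {
		panic(err)
	}
}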

What kind of data should/can each SQS message contain?

Suppose I have a task of updating a user via a third-party API call. Is it okay to put the actual user data inside the message (if it fits)? Or should I only provide an ID in the message so the worker can retrieve the updated record from my local database?
You need to check what level of compliance is required for your infrastructure to decide what kind of data you can put in the queue.
If there are no compliance restrictions, you are free to put any kind of data in your own infrastructure on AWS.
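To make the ID-only option concrete, a minimal sketch in Go; the queue URL and payload shape are made up for illustration, and the worker re-reads the fresh record from your database by ID (which also sidesteps staleness and the 256 KB SQS message size limit):

package main

import (
	"encoding/json"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sqs"
)

// updateUserTask carries only a reference; the worker looks the user up.
type updateUserTask struct {
	UserID string `json:"user_id"`
}

func main() {
	svc := sqs.New(session.Must(session.NewSession()))
	body, _ := json.Marshal(updateUserTask{UserID: "42"})

	_, err := svc.SendMessage(&sqs.SendMessageInput{
		QueueUrl:    aws.String("https://sqs.us-west-2.amazonaws.com/123456789012/users"), // placeholder
		MessageBody: aws.String(string(body)),
	})
	if err != nil {
		panic(err)
	}
}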

When using AWS SQS, is there any reason to prefer using GetQueueUrl to building a queue url from the region, account id, and name?

I have an application that uses a single SQS queue.
For the sake of flexibility I would like to configure the application using the queue name, SQS region, and AWS account id (as well as the normal AWS credentials and so forth), rather than giving a full queue url.
Does it make any sense to use GetQueueUrl to retrieve a URL for the queue when I can just build it with something like the following (in Ruby)?
region = ENV['SQS_REGION'] # 'us-west-2'
account_id = ENV['SQS_AWS_ACCOUNT_ID'] # '773083218405'
queue_name = ENV['SQS_QUEUE_NAME'] # 'test3'
queue_url = "https://sqs.#{region}.amazonaws.com/#{account_id}/#{queue_name}"
# => https://sqs.us-west-2.amazonaws.com/773083218405/test3
Possible reasons that it might not:
Amazon might change their url format.
Others???
I don't think you have any guarantee that the URL will have such a form. The official documentation presents the GetQueueUrl call as the supported method for obtaining queue URLs. So while constructing the URL as above may be a very good guess, it may also fail at any time, because Amazon can change the URL scheme (e.g. for new queues).
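For reference, the lookup itself is a single call; a sketch with the AWS SDK for Go (v1), reusing the question's example queue name and account id:

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sqs"
)

func main() {
	svc := sqs.New(session.Must(session.NewSession()))

	// QueueOwnerAWSAccountId is only required when the queue belongs to
	// another account; for your own queues the name alone is enough.
	out, err := svc.GetQueueUrl(&sqs.GetQueueUrlInput{
		QueueName:              aws.String("test3"),
		QueueOwnerAWSAccountId: aws.String("773083218405"),
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(aws.StringValue(out.QueueUrl))
}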
If Amazon changes the queue URL format in a breaking way, the change will not be immediate: it will be deprecated slowly and will take effect when you move up a version (i.e. when you upgrade your SDK).
While the documentation doesn't guarantee the format, Amazon knows that changing it would be a massively breaking change for thousands of customers.
Furthermore, lots of customers use hard-coded queue URLs which they get from the console, so those customers would not get an updated queue URL format either.
In the end, you will be safe either way. If you have LOTS of queues, you may be better off formatting the URLs yourself; with a small number of queues, it shouldn't make much difference either way.
I believe that for safety purposes the best way to get the URL is through the sqs.queues.named method. What you can do is memoize the queues by name to avoid multiple calls, something like this:
# https://github.com/phstc/shoryuken/blob/master/lib/shoryuken/client.rb
class Client
  @@queues = {}

  class << self
    def queues(queue)
      @@queues[queue.to_s] ||= sqs.queues.named(queue)
    end
  end
end