Architecture for AWS configuration application - amazon-web-services

I am creating an application which should only store some configuration. I am using AWS AppConfig as the configuration store.
I want to be able to update this configuration data through code. So when an event happens, I want to call SQS to create a message which holds the new configuration data to be appended. The SQS should call a lambda. The Lambda should get the latest configuration from AppConfig, append the new configurations, then deploy to AppConfig.
As a result, I want AppConfig to have the old configurations, and the new ones appended.
Is there a simple way to achieve this using only AWS services?

I've not tried any of this or used AppConfig directly but it shouldn't be difficult for you to piece information together from the web.
Create SQS Queue to hold the updates.
Create Lambda to read from the SQS Queue.
Write code for the Lambda which receives the message from the Queue, pulls the AppConfig and updates it with the new values. Use one of the many AWS SDKs for your preferred language.
One thing you should be aware of is that multiple Lambda invocations can run at the same time. Assume your AppConfig looks like this:
{
  "version": 1
}
then two updates get pushed to the SQS Queue at the same time:
{
  "update1": "abc"
}
and
{
  "update1": "xyz"
}
They could be processed at the same time, and a race condition may occur where both save but one overwrites the other.
I don't see the benefit of the SQS Queue here, and I don't understand the full use case or the reason for this setup, so there may be a better way to achieve what you're trying to achieve.
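A minimal Python sketch of the read, merge and save step described above, assuming the configuration is a flat JSON object stored as an AppConfig hosted configuration. The function names and the IDs passed in are hypothetical. The `appconfig` argument is assumed to behave like `boto3.client("appconfig")`. Passing `LatestVersionNumber` is one way to reduce the race: the save is rejected if another writer has already created a newer version, so the caller can re-read and retry.

```python
import json


def merge_updates(current: dict, sqs_event: dict) -> dict:
    """Return a copy of `current` with each SQS message body merged in, in order."""
    merged = dict(current)
    for record in sqs_event.get("Records", []):
        # Each SQS record delivered to Lambda carries the message text in "body".
        merged.update(json.loads(record["body"]))
    return merged


def append_to_appconfig(appconfig, app_id, profile_id, version, sqs_event):
    """Read one hosted version, merge the queued updates, save a new version.

    `version` is the version number the caller believes is the latest.
    """
    response = appconfig.get_hosted_configuration_version(
        ApplicationId=app_id,
        ConfigurationProfileId=profile_id,
        VersionNumber=version,
    )
    current = json.loads(response["Content"].read())
    merged = merge_updates(current, sqs_event)
    # LatestVersionNumber acts as a locking token: the request fails if a
    # newer version already exists, instead of silently overwriting it.
    created = appconfig.create_hosted_configuration_version(
        ApplicationId=app_id,
        ConfigurationProfileId=profile_id,
        Content=json.dumps(merged).encode("utf-8"),
        ContentType="application/json",
        LatestVersionNumber=version,
    )
    return created["VersionNumber"]
```

The new version still has to be deployed (for example with the SDK's `start_deployment` call) before clients see it. Note that when two messages set the same key, as in the `update1` example, the later one wins.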

Related

Configure multiple delete event in S3/Lambda

I am trying to build a Lambda function that gets triggered on S3 delete events. If multiple items are deleted at once, I want to use an S3 batch job. What I can't figure out or find in the documentation is what an event like that would look like. I'd assume it would just have multiple similar items in Records and I could iterate through, get all the keys, and then batch delete, but I can't confirm that. I've searched the documentation, and I built a test Lambda that would just log the event, but that came through as multiple distinct events. I'm stumped as to how to do what I'm trying here.
The S3 event you need to subscribe to is s3:ObjectRemoved:Delete, which according to the documentation is used to track an object or a batch of objects being removed:
By using the ObjectRemoved event types, you can enable notification when an object or a batch of objects is removed from a bucket.
You can expect an event structured as detailed here.
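As a rough illustration, a handler can loop over `Records` so that it works whether the deletions arrive as one event or as several separate ones (as observed in the question). This is a sketch in Python; the function names are hypothetical, and it assumes the standard S3 notification layout with `eventName`, `s3.bucket.name` and `s3.object.key`.

```python
from urllib.parse import unquote_plus


def removed_keys(event: dict) -> list:
    """Collect (bucket, key) pairs for every ObjectRemoved record in an S3 event."""
    removed = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectRemoved:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the notification, e.g. spaces as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        removed.append((bucket, key))
    return removed


def handler(event, context):
    # Log each deleted object; real processing would go here.
    for bucket, key in removed_keys(event):
        print(f"deleted: s3://{bucket}/{key}")
```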
However, since you said in the comment that you just wanted to "copy the objects pre-delete to another bucket", you may want to explore S3 bucket versioning.
Enabling versioning preserves deleted objects in a "deleted" state, leaving room for future restores, as per the delete workflow here.

Can I create temporary users through Amazon Cognito?

Does Amazon Cognito support temporary users? For my use case, I want to be able to give access to external users, but limited to a time period (e.g. 7 days)
Currently, my solution is something like:
Create User in User Group
Schedule cron job to run in x days
Job will disable/remove User from User Group
This all seems to be quite manual and I was hoping Cognito provides something similar automatically.
Unfortunately, Cognito has no built-in functionality to automate this workflow, so you would need to devise your own solution.
I would suggest the below approach to handling this:
Create a Lambda function that post-processes a user sign-up. This Lambda function would create a CloudWatch Event with a schedule for 7 days in the future. Using the SDK, you would create the event and assign another Lambda function as its target. When you specify the target in the put_targets function, use the Input parameter to pass in your own JSON, which should contain a metadata item related to the user.
You would then create a post confirmation Lambda trigger which would trigger the Lambda you created in the above step. This would allow you to schedule an event every time a user signs up.
Finally, create the target Lambda for the CloudWatch event. It reads the input passed in from the trigger and can use the AWS SDK to perform any Cognito operation you might want, such as deleting the user.
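The scheduling step above could be sketched as follows in Python. The rule naming scheme, the `username` payload field and the function names are hypothetical. The `events` argument is assumed to behave like `boto3.client("events")`. A cron expression that pins the year matches a single minute, which gives a one-off trigger.

```python
import json
from datetime import datetime, timedelta


def one_off_cron(when: datetime) -> str:
    """Build a cron expression (UTC) that matches exactly one minute."""
    # Field order: minutes hours day-of-month month day-of-week year
    return f"cron({when.minute} {when.hour} {when.day} {when.month} ? {when.year})"


def schedule_removal(events, username, target_lambda_arn, signed_up_at, days=7):
    """Create a rule firing `days` after sign-up, passing the username as Input."""
    rule_name = f"expire-user-{username}"  # hypothetical naming scheme
    events.put_rule(
        Name=rule_name,
        ScheduleExpression=one_off_cron(signed_up_at + timedelta(days=days)),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{
            "Id": "expire-user",
            "Arn": target_lambda_arn,
            # Custom JSON handed to the target Lambda as its event.
            "Input": json.dumps({"username": username}),
        }],
    )
    return rule_name
```

The target Lambda also needs a resource-based permission allowing the events service to invoke it, and usernames containing characters that are not valid in rule names would need to be mapped to a safe identifier first. The rule itself remains after it fires, so the target Lambda may want to delete it.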
The benefit of using these services rather than a cron job is that processing happens only when it is required. With a cron job and many users in this temporary group, a one-time script would need to loop through every user and check whether each one is ready to be removed (and some runs might not remove any users at all).
My solution for this is the following: Instead of creating a post confirmation lambda trigger you can also create a pre authentication lambda trigger. This trigger will check for the user attribute "valid_until" which contains a unix timestamp. The pre authentication lambda trigger will only let the user in if the value of the "valid_until" attribute is in the future. Main benefit of this solution is that you don't need any cron-jobs.
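A minimal sketch of such a pre authentication trigger in Python. It assumes the attribute is a Cognito custom attribute and therefore shows up as `custom:valid_until` in `event["request"]["userAttributes"]`. The helper name and the choice to reject users without the attribute are assumptions, not part of the answer above.

```python
import time


def is_still_valid(user_attributes: dict, now: float) -> bool:
    """True when the custom:valid_until unix timestamp lies in the future."""
    valid_until = user_attributes.get("custom:valid_until")
    # Assumption: users without the attribute are treated as expired.
    return valid_until is not None and float(valid_until) > now


def handler(event, context):
    """Pre authentication trigger: reject sign-in once access has expired."""
    if not is_still_valid(event["request"]["userAttributes"], time.time()):
        # Raising from the trigger makes Cognito deny the authentication attempt.
        raise Exception("Access period has expired")
    # Returning the event unchanged lets the sign-in proceed.
    return event
```

If permanent users share the same user pool, the missing-attribute case would need to allow the sign-in instead.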

Architectural advice for AWS firehose or similar when collecting a lot of events in real-time

I would like some advice about handling many application events on AWS. My application sends a lot of different events, in real time, about everything a user does. To collect those events I'm using AWS Firehose (Kinesis): I have a few data streams to which I push different events. Some events contain data that I want to extract and store in other databases (DynamoDB) or in other S3 files before the event is stored on S3/Redshift; for that case I'm using a Lambda which is assigned to a specific stream.
My problem is that the business keeps adding new events which they need to collect or act on, and for every new event or "group" of events I need to create a separate data stream + S3/RS/ES + Lambda for extracting data. Also, events on S3 are stored in one format, and it is not possible to group those events, e.g. by the application's userId or even by the name of the event, in the stream filename. The ideal S3 layout for those events would look like events/{user_id}/{date}/{event-name}{timestamp}.json.
Maybe I'm wrong to use Firehose, or I'm thinking about Firehose the wrong way for my case; maybe there are other, better services on AWS which can give me more control. Maybe simple SQS + Lambdas as a listener on S3 is a better solution in this case?
Thanks for any advice.
EDIT 12th Nov 2020
This was supposed to be a comment for #Lina, but it was too long for a comment, so I updated my question with the solution I picked.
I resolved my issue as I "felt", so it may not be a good approach to repeat, but: I've written a Node.js routing application which I connected to Firehose, and I wrote a few microservices to which my routing app sends data from Firehose. So now I have one Firehose stream that takes 10 different event types. When an event arrives, my routing application decides which microservice should be run with which data, based on the event type (the raw Firehose event is still stored on S3 automatically). This gives me the flexibility I need: I can extract specific data from the event and do what I need with it by running any other microservice in the system, and I still have the raw event in S3 in case I need to go back through the history of events.
Some of the events are not passed to any service; they are just stored as raw S3 files, e.g. application logs. I can do many things with those files on the S3 PUT/CREATE event.
I hope that it will help someone with a similar problem.
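The routing idea can be illustrated with a small dispatcher. This is a Python sketch rather than the Node.js application described above, and it assumes each event carries a `type` field; the class and method names are hypothetical.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventRouter:
    """Map event types to the services (callables) that should receive them."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[Handler]] = {}

    def register(self, event_type: str, handler: Handler) -> None:
        self._routes.setdefault(event_type, []).append(handler)

    def dispatch(self, event: dict) -> int:
        """Send the event to every handler for its type; return how many ran."""
        handlers = self._routes.get(event.get("type"), [])
        for handler in handlers:
            handler(event)
        # Zero means nothing is registered: the event only exists as the raw copy.
        return len(handlers)
```

Adding a new event type then means registering one more handler rather than creating another stream.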

Send S3 document to Textract using Go

I'm trying to use Go to send objects in an S3 bucket to Textract and collect the response.
I'm using the AWS Go SDK package and am able to connect to my S3 bucket and list all the objects contained within. So far so good. I now need to be able to send one of those objects (a .pdf file) to Textract and collect the response(s).
The AWS Go SDK content for interacting with Textract seems to be quite extensive, but I cannot find a good example of how to do this.
I would be very grateful for a sample or advice on how to do this.
To start a job, you invoke StartDocumentTextDetection, using a DocumentLocation to specify the file, and you specify an SNS topic where Textract will publish a notification when it has finished processing your job.
You now have two options:
Subscribe to the SNS topic, and when you receive a message retrieve the result
Create a lambda function triggered by the SNS topic, which retrieves the result.
The second option is IMO better because it uses less computation time (nothing runs until the job has finished).
To retrieve the result of the job, you use GetDocumentTextDetection.
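The call sequence can be sketched as below. It is written in Python with a boto3-style client for brevity, not in Go; the Go SDK exposes the same two operations under the names used above. The function names and ARNs are hypothetical, and `textract` is assumed to behave like `boto3.client("textract")`.

```python
def start_text_detection(textract, bucket, key, sns_topic_arn, role_arn):
    """Start an asynchronous text detection job for one S3 object; return its JobId."""
    response = textract.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        # Textract publishes the completion notification to this topic.
        NotificationChannel={"SNSTopicArn": sns_topic_arn, "RoleArn": role_arn},
    )
    return response["JobId"]


def collect_lines(textract, job_id):
    """Fetch every result page of a finished job and return the detected lines."""
    lines, token = [], None
    while True:
        kwargs = {"JobId": job_id}
        if token:
            kwargs["NextToken"] = token
        page = textract.get_document_text_detection(**kwargs)
        lines += [b["Text"] for b in page.get("Blocks", []) if b["BlockType"] == "LINE"]
        token = page.get("NextToken")
        if not token:
            return lines
```

`collect_lines` is meant to be called only after the SNS notification reports the job as finished.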
If anyone else reaches this site searching for an answer:
I had understood the documentation to mean that I could just call the StartDocumentAnalysis function through the Textract SDK, but what was missing is that you need to create a new Session first and make the calls based on that session:
https://docs.aws.amazon.com/sdk-for-go/api/service/textract/#New

AWS S3 folder put event notification

I've written a function in Python that uploads a folder and its content to S3. Now I would like S3 to generate an event (so I can send it to a Lambda function). S3 can generate events only at file level. In fact, folders on S3 are just a visualization layer: S3 has no internal representation for folders, and keys with the same root are simply grouped together. That said, so far I've come up with three approaches that revolve around the idea of a 'poison pill'.
Send a special file at the end of the folder upload process, the creation of which sends an event to lambda that can open the file to read custom directives to act on. Seems that this approach is quite flexible, however it poses serious concerns security-wise (I know that ACLs are in place for this reason but I'm not quite sure if it's enough), and generates some overhead while downloading/uploading/deleting the file from/to local memory.
Map an event to the target lambdas and fire it directly. The difference in approaches is simply that in this case I'm not really creating a file on S3, I'm just making S3 believe so. I would use CloudWatch to fire custom S3-object-created events with the name of the folder for lambda to pick up. This approach feels a little more hacky than the other two, plus when I did my research on the matter it seemed like it shouldn't be possible to generate "mock" events on AWS (i.e. Trigger S3 create event). To my understanding however, the function put_events should do the trick.
Using SQS would allow me to put the folder name into an SQS message that can later be consumed by Lambda. This has some advantages over the other two approaches, since SQS now has a FIFO variant that allows for exactly-once processing, reprocessing of failures (via a dead-letter queue), etc. However, this adds a non-trivial amount of complexity compared to the other approaches.
At this point I'm trying to opt for the most 'correct' approach, and in order to do so I'm trying to weigh the pros and cons to make an informed decision, which led me to some questions:
Is there another way to proceed that I'm missing, one that does not involve client notification? (All the aforementioned approaches rely on the client sending the notification in one way or another, which is not very "cloudy".)
Is there a substantial difference between approaches 2 and 3, considering that both rely on sending the information in and out of a stream (CloudWatch and SQS respectively)?
Have you considered using the prefix option of the S3 bucket event? I tested it and it worked fine. In my S3 bucket I created two folders, test1 and test2. On the S3 event I added the prefix test1; with that in place, the Lambda is triggered every time a put/copy operation happens under test1.
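For reference, a prefix filter like the one described can also be set through the SDK. The Python sketch below only builds the notification configuration; the function name and ARN are hypothetical, and the result is meant to be passed to the S3 client's `put_bucket_notification_configuration` call as its `NotificationConfiguration` argument.

```python
def prefix_notification(lambda_arn: str, prefix: str) -> dict:
    """Build an S3 notification configuration limited to keys under `prefix`."""
    return {
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": lambda_arn,
            # Fires for puts, copies and completed multipart uploads.
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}},
        }]
    }
```

Note that this still fires once per uploaded object, not once per folder.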
I think your question nets down to "how can I trigger a Lambda function after I have uploaded a folder full of files to S3?"
Unless you have some information a priori server-side that you can use to determine when the folder upload has completed, the client is going to have to tell you.
Options I would consider:
change your client to publish a message to SNS or to SQS upon the completion of uploading to S3. That message can then trigger your Lambda function.
after the last file has been uploaded to folder images/dogs/, upload a zero-sized object whose key is the same as the folder (images/dogs/). This is a 'sentinel file'. Use an S3 event trigger with suffix of / to detect the upload of that 'folder' object and trigger your Lambda.
I prefer the 1st option. It achieves the end goal without resulting in extraneous S3 objects. With SNS you can also configure multiple downstream processes in response to the ‘finished upload’ message (a fan out) if needed.
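A rough sketch of the first option in Python: upload every file, then publish one 'finished upload' message. The function name and the message fields are hypothetical. `s3` and `sns` are assumed to behave like `boto3.client("s3")` and `boto3.client("sns")`.

```python
import json
import os


def upload_folder_and_notify(s3, sns, local_dir, bucket, prefix, topic_arn):
    """Upload all files under `local_dir`, then publish a single completion message."""
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # Build the object key from the path relative to the folder root.
            relative = os.path.relpath(path, local_dir).replace(os.sep, "/")
            key = f"{prefix}{relative}"
            s3.upload_file(path, bucket, key)
            uploaded.append(key)
    # Only reached once every upload call has returned without raising.
    sns.publish(
        TopicArn=topic_arn,
        Message=json.dumps({"bucket": bucket, "prefix": prefix, "count": len(uploaded)}),
    )
    return uploaded
```

A Lambda subscribed to the topic then receives the bucket and prefix and can process the whole folder in one invocation.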