How to subscribe to changes in DynamoDB

I don't know how to subscribe to changes in a DynamoDB database. Let me give an example: User A sends a message (which is saved in the database) to User B, and the message automatically appears in User B's app.
I know this is possible with the recently released AWS AppSync, but I couldn't integrate it with Ionic (which I am using). However, there must be an alternative, since AWS AppSync was only released at the end of 2017/beginning of 2018.
I've also seen something called Streams in DynamoDB, but I'm not sure if that's what I need.

DynamoDB Streams is designed specifically for capturing/subscribing to table activity. You can set up a Lambda Function with your notification logic to process the stream and send notifications accordingly.
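For example, a stream-triggered Lambda (sketched in Go here) could read the new records and fan out the notifications; the attribute names and the delivery mechanism below are placeholders, not something prescribed by DynamoDB Streams itself:

```go
package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
)

// handler receives a batch of records from the DynamoDB stream.
func handler(ctx context.Context, e events.DynamoDBEvent) error {
	for _, record := range e.Records {
		if record.EventName != "INSERT" {
			continue // only react to newly written messages
		}
		// "recipientId" and "body" are hypothetical attribute names.
		recipient := record.Change.NewImage["recipientId"].String()
		body := record.Change.NewImage["body"].String()
		fmt.Printf("notify %s: %s\n", recipient, body)
		// ...push the notification here (SNS, push service, WebSocket, etc.)
	}
	return nil
}

func main() {
	lambda.Start(handler)
}
```

The stream has to be enabled on the table (with NEW_IMAGE or NEW_AND_OLD_IMAGES) and mapped to the function as an event source.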

Related

AWS Step Function manual approval process

I am working on a requirement where the data entered in a form needs to be validated manually; once validated, an approval mail will be sent out and then the data will be stored in the database. I plan to use an AWS Step Function with a task token for this.
https://aws.amazon.com/blogs/compute/implementing-serverless-manual-approval-steps-in-aws-step-functions-and-amazon-api-gateway/
I plan to use a design similar to the one in the link above. However, is there a way to avoid using API Gateway for sending the task token back to the Step Function to resume processing? Has anybody worked on a similar requirement, and how was the functionality achieved? Thank you.
A Step Function can be started by an AWS Lambda function as well.
Once the form is validated and stored in the database, you can trigger a Lambda function based on the database events (e.g. if DynamoDB is used, based on DynamoDB Streams), and that Lambda can start the Step Function.
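A minimal sketch of that idea in Go (the state machine ARN environment variable and the "id" attribute are assumptions for illustration):

```go
package main

import (
	"context"
	"encoding/json"
	"os"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sfn"
)

var client = sfn.New(session.Must(session.NewSession()))

// handler starts the state machine for every validated record that lands in the table.
func handler(ctx context.Context, e events.DynamoDBEvent) error {
	for _, record := range e.Records {
		if record.EventName != "INSERT" {
			continue
		}
		// Pass the new item's key as the execution input ("id" is a hypothetical attribute).
		input, _ := json.Marshal(map[string]string{
			"id": record.Change.NewImage["id"].String(),
		})
		_, err := client.StartExecution(&sfn.StartExecutionInput{
			StateMachineArn: aws.String(os.Getenv("STATE_MACHINE_ARN")),
			Input:           aws.String(string(input)),
		})
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	lambda.Start(handler)
}
```

If you keep the task-token pattern from the blog post instead, the same kind of Lambda can call SendTaskSuccess with the stored token rather than StartExecution, which also avoids API Gateway.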

Architectural advice for AWS firehose or similar when collecting a lot of events in real-time

I would like to ask for some advice about handling many application events on AWS. My application sends a lot of different events about everything a user does, in real time. For collecting those events I'm using AWS Firehose (Kinesis): I have a few delivery streams into which I push the different events. Some events contain data which I want to extract and store in other databases (DynamoDB) or in other S3 files before they are stored on S3/Redshift; for that case I'm using a Lambda assigned to the specific stream.
My problem is that the business keeps adding new events which they need to collect or process, and for every new event or "group" of events I need to create a separate delivery stream + S3/Redshift/Elasticsearch destination + Lambda for extracting the data. Also, the events on S3 are stored in one format, and it is not possible to group those events, e.g. by the userId from the application, or even by the name of the event in the stream's file name. An ideal S3 layout for those events would look like events/{user_id}/{date}/{event-name}{timestamp}.json.
Maybe I'm using Firehose wrongly, or thinking about Firehose the wrong way for my case; maybe there are other, better services on AWS which could give me more control. Maybe a simple SQS + Lambdas as a listener on S3 would be a better solution in this case?
Thanks for any advice.
EDIT 12th Nov 2020
This was supposed to be a comment for #Lina, but it was too long for a comment, so I updated my question with the solution I picked.
I resolved my issue by feel, so it may not be a pattern worth repeating, but: I've written a Node.js routing application which I connected to Firehose, and I wrote a few microservices to which my routing app forwards the data coming from Firehose. So now I have one Firehose pipe and I'm taking in 10 different event types. When an event comes in, my routing application decides which microservice should be run, and with what data, based on the event type (the raw Firehose event is still stored on S3 automatically). This gives me the flexibility I needed, as I can extract specific data from the event and do with it whatever I need by calling any other microservice in the system, while still having the raw event in S3 in case I ever need to replay the history of events.
Some of the events are not passed to any service and are just stored as raw S3 files, e.g. application logs; I can do many things with those files on the S3 PUT/CREATE event.
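A stripped-down sketch of the routing idea (shown in Go here rather than the original Node.js; the event type names and handlers are made up for illustration):

```go
package main

import "encoding/json"

// Event is the minimal shape the router cares about; "eventType" is a
// hypothetical field name, the rest of the payload stays opaque.
type Event struct {
	EventType string          `json:"eventType"`
	Payload   json.RawMessage `json:"payload"`
}

// handlers maps an event type to the microservice call that should process it.
var handlers = map[string]func(json.RawMessage) error{
	"user.message.sent":    forwardToNotificationService,
	"user.profile.updated": forwardToProfileService,
}

// route picks the right microservice for a raw Firehose record and
// silently skips types that should only be archived to S3.
func route(raw []byte) error {
	var e Event
	if err := json.Unmarshal(raw, &e); err != nil {
		return err
	}
	if h, ok := handlers[e.EventType]; ok {
		return h(e.Payload)
	}
	return nil // e.g. application logs: keep only the raw S3 copy
}

func forwardToNotificationService(p json.RawMessage) error { return nil } // call the service here
func forwardToProfileService(p json.RawMessage) error      { return nil } // call the service here

func main() {
	_ = route([]byte(`{"eventType":"user.message.sent","payload":{"to":"user-b"}}`))
}
```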
I hope this helps someone with a similar problem.

Send S3 document to Textract using Go

I'm trying to use Go to send objects in an S3 bucket to Textract and collect the response.
I'm using the AWS Go SDK package and am able to connect to my S3 bucket and list all the objects contained within. So far so good. I now need to be able to send one of those objects (a .pdf file) to Textract and collect the response(s).
The AWS Go SDK content for interacting with Textract seems to be quite extensive, but I cannot find a good example of how to do this.
I would be very grateful for a sample or advice on how to do this.
To start a job, you invoke StartDocumentTextDetection, using a DocumentLocation to specify the file, and you specify an SNS topic where Textract will publish a notification when it has finished processing your job.
You now have two possibilities:
Subscribe to the SNS topic, and when you receive a message, retrieve the result.
Create a Lambda function triggered by the SNS topic, which retrieves the result.
The second option is IMO better because it uses less computation time (nothing runs until the job has finished).
To retrieve the result, you use GetDocumentTextDetection.
If anyone else reaches this site searching for an answer:
I understood the documentation as if I could just call the StartDocumentAnalysis function through the Textract SDK, but what was missing is the fact that you need to create a new Session first and make the calls through a client built on that session:
https://docs.aws.amazon.com/sdk-for-go/api/service/textract/#New
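A rough end-to-end sketch of that flow with the v1 Go SDK (bucket, key, topic and role ARNs are placeholders):

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/textract"
)

func main() {
	// Create the session first, then build the Textract client from it.
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("eu-west-1")}))
	client := textract.New(sess)

	out, err := client.StartDocumentTextDetection(&textract.StartDocumentTextDetectionInput{
		DocumentLocation: &textract.DocumentLocation{
			S3Object: &textract.S3Object{
				Bucket: aws.String("my-bucket"),
				Name:   aws.String("documents/sample.pdf"),
			},
		},
		NotificationChannel: &textract.NotificationChannel{
			SNSTopicArn: aws.String("arn:aws:sns:eu-west-1:123456789012:textract-done"),
			RoleArn:     aws.String("arn:aws:iam::123456789012:role/TextractPublishRole"),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	// In a real setup this call belongs in the SNS subscriber / Lambda once the
	// job-complete notification arrives; called immediately, the status may
	// still be IN_PROGRESS and Blocks will be empty.
	res, err := client.GetDocumentTextDetection(&textract.GetDocumentTextDetectionInput{
		JobId: out.JobId,
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("status:", aws.StringValue(res.JobStatus), "blocks:", len(res.Blocks))
}
```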

AWS Appsync subscription is not updating my component

I've cloned a repo from here and am trying to explore AWS AppSync's subscriptions. My understanding is that if there are real-time updates to the server data, the client should expect to see some sort of notification or update, so what I did was:
Running the app on a simulator.
Opening the DynamoDB console and adding the records manually.
I was expecting some notification to arrive in my app, but there isn't one; only if I refresh the app does it show the updated records. Am I understanding subscriptions wrongly?
Subscriptions are not triggered from your DynamoDB table, but from your mutations (defined in your GraphQL schema). Try to add records via the mutation your subscription listens on. You can run a mutation from the AppSync console under "Queries".
If your client is set up correctly, it should update accordingly.
Hope this helps :)
A subscription can only be triggered by a mutation. When you add a record directly to your DB, no mutation is called, hence no subscription is triggered. That doesn't really serve the purpose for external DB updates, but there is a workaround available.
Scenario 1: If you are making the change to the DB directly via some DB client, you need to call the mutation endpoint explicitly (from the AWS console, Postman etc.). This will trigger the subscription. I am guessing the direct DB change is done for testing.
Scenario 2: The direct DB change is done by some other external process and not via an AppSync mutation. You need to call a mutation mapped to a None data source from your processor. This dummy mutation will trigger the subscription.
Here's [a link](https://aws.amazon.com/premiumsupport/knowledge-center/appsync-notify-subscribers-real-time/) explaining how to create a mutation mapped to a None data source.
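For scenario 2, the external process can simply POST the dummy mutation to the AppSync GraphQL endpoint; a rough sketch in Go follows (the endpoint, API key, and the notifyNewMessage mutation are placeholders, and API-key auth is assumed):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Endpoint, API key and the mutation/field names are placeholders; the
	// mutation is assumed to be mapped to a None data source so it only
	// re-broadcasts its payload to subscribers.
	endpoint := "https://example123.appsync-api.eu-west-1.amazonaws.com/graphql"
	apiKey := "da2-xxxxxxxxxxxx"

	query := `mutation Notify($id: ID!, $body: String!) {
	  notifyNewMessage(id: $id, body: $body) { id body }
	}`
	payload, _ := json.Marshal(map[string]interface{}{
		"query":     query,
		"variables": map[string]string{"id": "42", "body": "hello from the external process"},
	})

	req, _ := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(payload))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("x-api-key", apiKey) // assumes the API is using API-key auth

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```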

What kind of data should/can each SQS message contain?

Suppose I have a task of updating a user via a third-party API call. Is it okay to put the actual user data inside the message (if it fits)? Or should I only provide an ID in the message so the worker can retrieve the updated record from my local database?
You need to check what level of compliance is required for your infrastructure, to see what kind of data you want to put in the queue.
If there aren't any compliance restrictions, you are free to put any kind of data in your own infrastructure on AWS.
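As an illustration of the ID-only approach mentioned in the question (queue URL and field names below are placeholders): the worker re-reads the record before calling the third-party API, which avoids acting on stale data and keeps the full user record out of the queue.

```go
package main

import (
	"encoding/json"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sqs"
)

func main() {
	sess := session.Must(session.NewSession())
	client := sqs.New(sess)

	// Only the identifier travels through the queue; the worker loads the
	// current record from the database before calling the third-party API.
	body, _ := json.Marshal(map[string]string{"userId": "42", "action": "sync-profile"})

	_, err := client.SendMessage(&sqs.SendMessageInput{
		QueueUrl:    aws.String("https://sqs.eu-west-1.amazonaws.com/123456789012/user-sync"),
		MessageBody: aws.String(string(body)),
	})
	if err != nil {
		log.Fatal(err)
	}
}
```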