Is there any equivalent feature to BPMN UserTask available in AWS Step functions? - amazon-web-services

We have an old Camunda-based Spring Boot application which we currently deploy to Kubernetes running on an AWS EC2 instance. This application acts as the backend for an Angular-based UI application.
Now we need to develop a new, similar application which also needs to interact with the UI.
Our process flow will contain some UserTasks (BPMN) which wait until a human user performs a manual interaction via the Angular UI.
We are evaluating whether AWS Step Functions could be used instead of Camunda.
I googled but was unable to find a concrete answer.
Does AWS Step Functions have any feature similar to BPMN/Camunda's UserTask?

Short answer: No.
====================================
Long answer:
After a whole day of study, I decided to continue with Camunda BPM for the reasons below.
AWS Step Functions does not have an equivalent of the BPMN UserTask.
Step Functions supports minimal human intervention by sending emails/messages using AWS SQS (Simple Queue Service) and AWS SNS (Simple Notification Service).
Refer to this link for a full example. This manual interaction is based on a 'task token', so it is limited to a basic conversational style.
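To make the 'task token' style concrete: the state machine pauses on a task state that uses the `.waitForTaskToken` integration and only resumes when something calls back with that token. A minimal sketch of the callback side using boto3 (the function names and payload are hypothetical):

```python
# Minimal sketch of the Step Functions "task token" callback pattern (boto3).
# Assumes a state machine task state using the ".waitForTaskToken" integration
# that has delivered the token to your backend; names/payloads are hypothetical.
import json
import boto3

sfn = boto3.client("stepfunctions")

def complete_user_task(task_token: str, form_data: dict) -> None:
    """Called by the backend once the human has acted in the Angular UI."""
    sfn.send_task_success(
        taskToken=task_token,
        output=json.dumps({"approved": True, "form": form_data}),
    )

def reject_user_task(task_token: str, reason: str) -> None:
    """Resume the workflow down a failure path instead."""
    sfn.send_task_failure(
        taskToken=task_token,
        error="UserRejected",
        cause=reason,
    )
```

This works, but persisting the token, routing it to the right user in the UI, and tracking open tasks is plumbing you have to build yourself, which is exactly what Camunda's UserTask gives you out of the box.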
Step Functions does NOT come with built-in database and data-management support; the developer has to take care of designing the database schema, creating tables and their relationships, etc.
On the other hand, Camunda takes care of creating tables and their relationships, and of saving and fetching data.
No GUI modeler is available in Step Functions; instead you have to define the workflow in a JSON-based language (Amazon States Language). This becomes very difficult once your workflow gets complex.
Drawing a workflow in Camunda is just drag-and-drop using its modeler.

Related

GCP Best way to manage multiple cloud function flow

I'm studying GCP and reading about the different ways to communicate between and manage cloud functions, and I end up wondering when to use each of the services that GCP offers.
So I have been reading about GCP Composer, GCP Workflows, and Cloud Pub/Sub, and I don't see clearly when to use each one, or when to just use simple HTTP calls.
I understand that it depends a lot on the application you are building, but as an example: if I'm building a payment gateway and some functions should fire after the payment is verified (sending emails, running unrelated business logic, adding the purchase to a sales platform), which one should I use to manage this flow, and in which cases would the others be better? Should I use events to create an async flow with Pub/Sub, use more complex solutions like Composer and Workflows, or just simple HTTP calls?
As always, it depends!! Even in your use case, it depends! OK, after a payment you want to send an email, run some business logic, add the order to your databases, ...
But can all these actions be done in parallel, or do you need to execute them in a certain order and stop the process if a step fails?
In the first case, you can use Cloud Pub/Sub with one message published (payment OK) and then a fan-out to several functions in parallel. Otherwise, you can use Workflows to test the response of a function and then decide whether or not to call the following functions. With Composer you can perform many more checks and actions.
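As a rough illustration of the fan-out case, the payment service publishes a single event and every subscribed function reacts to it independently. A minimal sketch (project, topic, and attribute names are made up):

```python
# Hypothetical sketch: publish one "payment verified" event to Pub/Sub.
# Each Cloud Function subscribed to the topic (email, sales platform, ...)
# then runs in parallel. Project and topic names are placeholders.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "payment-verified")

def publish_payment_verified(order_id: str, amount_cents: int) -> str:
    future = publisher.publish(
        topic_path,
        data=b"payment-verified",        # message body
        order_id=order_id,               # attributes the subscribers can filter on
        amount_cents=str(amount_cents),  # attributes must be strings
    )
    return future.result()               # blocks until Pub/Sub returns the message ID
```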
You could also imagine sending another email 24 hours later to thank the customer for their order, and use Cloud Tasks to delay an action.
You talked about Cloud Functions, but you also have other options for hosting code on GCP: App Engine and Cloud Run. A Cloud Function is, most of the time, single-purpose. Sending an email is perfect for a function.
Now, if you have a "set of functions" to browse your stock, view an item's details, review the price, and book an item (validating an order "books" the order content in your warehouse), the "functions" are all single-purpose but related to the same domain: warehouse management. So you can create a web server that exposes different paths to manage the warehouse (a microservice for the warehouse, if you prefer) and host it on Cloud Run or App Engine.
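A small sketch of that grouping, with all the warehouse paths behind one Flask app that could run on Cloud Run or App Engine (the routes and data are invented for illustration):

```python
# Hypothetical warehouse microservice: several single-purpose "functions"
# grouped behind one web server, deployable to Cloud Run or App Engine.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/stock")
def browse_stock():
    return jsonify([{"sku": "ABC-1", "quantity": 12}])

@app.route("/items/<sku>")
def item_details(sku):
    return jsonify({"sku": sku, "price_cents": 1999})

@app.route("/bookings", methods=["POST"])
def book_item():
    order = request.get_json()
    # ... reserve order["sku"] in the warehouse here ...
    return jsonify({"booked": True, "sku": order["sku"]}), 201

if __name__ == "__main__":
    # Cloud Run expects the container to listen on $PORT (8080 by default).
    app.run(host="0.0.0.0", port=8080)
```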
Each product has its strengths and weaknesses. You will also see this when you learn about storage on GCP. Most of the time you can achieve things with several products, but if you don't use the right one, it will be slower or cost much more.

How to create a combined response from multiple microservices (Cloud Run containers) in a single API endpoint using Google Cloud Endpoints (gateway)?

I am familiar with the Firebase platform, but I am relatively new to the Google Cloud Platform as a whole.
I am working on a project built with a microservices structure, and I have many questions for which I cannot find an answer, or rather, for which I cannot find any example.
Unfortunately, all the examples I am able to find are way too simple to extrapolate a viable answer to my issues.
I adopted the new Cloud Run offering, and I decided to play with the fully managed version (not Kubernetes). I built a few microservices (each built using Express for Node or Flask for Python, depending on what the service does). Each microservice exposes its own endpoint and has its own API to call its methods, and I use a service account to allow the application to perform the internal calls.
I now want to expose the application externally (specifically to my client built with Vue.js), and I was trying to leverage another Google product to create and expose an API: Cloud Endpoints.
My question (specifically about the Cloud Run setup) is how it is possible, and what I need to do, to create an API endpoint that communicates with the client app but internally calls multiple services and combines their responses into one.
Just to be clear, let's take an example:
Cloud run service 1 -> crud user api
Cloud run service 2 -> crud product api
Cloud Endpoints externally visible API -> get the user from service 1, then get products from service 2, and return the combined response: all green products for user Jane Doe.
How can I aggregate the responses directly in the endpoint gateway, check for failures, and if everything goes smoothly send the aggregated response to the client?
Do I need to build the aggregation endpoint in something else, like a Cloud Function for example, or can I do it directly in the Google Endpoints gateway?
Note that for Cloud Run, the Google Endpoints gateway is itself another Cloud Run container.
Thanks for any help, I'm pretty much running out of options here.
As per my understanding, the API gateway should just work as a proxy, presenting all microservices as a single endpoint. For this scenario I think you have the following two approaches:
1: Implement a new microservice (or extend one of the existing ones) which will do the invocations and the aggregation of responses.
2: The client (e.g. the UI) can invoke the services and do the aggregation on its side as well.
I feel it is not a good idea to do it at the API gateway.
In my opinion, from an architectural point of view, the best option for you is to create a new microservice which takes the responses from the other two and aggregates them.
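A rough sketch of such an aggregator, itself deployable as yet another Cloud Run service; the backend URLs, routes, and the ID-token-based service-to-service auth are assumptions based on the service-account setup described in the question:

```python
# Hypothetical aggregation microservice for Cloud Run: calls the user and
# product services, combines their responses, and fails fast if either call
# fails. All URLs and routes are placeholders.
import requests
from flask import Flask, abort, jsonify
from google.auth.transport.requests import Request
from google.oauth2 import id_token

USER_SVC = "https://users-abc123-uc.a.run.app"       # placeholder URL
PRODUCT_SVC = "https://products-abc123-uc.a.run.app"  # placeholder URL

app = Flask(__name__)

def call(service_url: str, path: str) -> dict:
    # Cloud Run service-to-service auth: send an ID token for the target audience.
    token = id_token.fetch_id_token(Request(), service_url)
    resp = requests.get(
        service_url + path,
        headers={"Authorization": f"Bearer {token}"},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()

@app.route("/users/<user_id>/green-products")
def green_products_for_user(user_id):
    try:
        user = call(USER_SVC, f"/users/{user_id}")
        products = call(PRODUCT_SVC, f"/products?color=green&owner={user_id}")
    except requests.RequestException:
        abort(502, "One of the backend services failed")
    return jsonify({"user": user, "products": products})
```

The gateway then only has to route the combined path (e.g. /users/*/green-products) to this one service, while the aggregation logic stays in ordinary application code.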
I understand that you want to aggregate the responses in an API gateway and you are not able to find code examples for it. Here I was able to find a guide on what you want to implement. The full code implementation can be found in this repository.
Keep in mind, though, that this implementation approach is not a best practice.
It is OK only if the two services being combined are independent, meaning there is no functional/business relation between them, so no concurrency or inconsistency problems will occur while aggregating.

Planning an architecture in GCP

I want to plan an architecture based on the GCP cloud platform. Below are the subject areas I have to cover. Can someone please help me find the proper services to perform each operation?
Data ingestion (Batch, Real-time, Scheduler)
Data profiling
AI/ML based data processing
Analytical data processing
Elastic search
User interface
Batch and Real-time publish
Security
Logging/Audit
Monitoring
Code repository
If I am missing something that I have to take care of, please add that too.
GCP offers many products with functionality that can partially overlap. Which product to use depends on the specific use case, and you can find an overview about it here.
That being said, an overall summary of the services you asked about would be:
1. Data ingestion (Batch, Real-time, Scheduler)
That will depend on where your data comes from, but the most common options are Dataflow (both for batch and streaming) and Pub/Sub for streaming messages.
2. Data profiling
Dataprep (which actually runs on top of Dataflow) can be used for data profiling; here is an overview of how you can do it.
3. AI/ML based data processing
For this, you have several options depending on your needs. For developers with limited machine learning expertise there is AutoML, which allows you to quickly train and deploy models. For more experienced data scientists there is ML Engine, which allows training and prediction with custom models built with frameworks like TensorFlow or scikit-learn.
Additionally, there are some pre-trained models for things like video analysis, computer vision, speech to text, speech synthesis, natural language processing or translation.
Plus, it’s even possible to perform some ML tasks directly in GCP’s data warehouse, BigQuery, using SQL.
4. Analytical data processing
Depending on your needs, you can use Dataproc, which is a managed Hadoop and Spark service, or Dataflow for stream and batch data processing.
BigQuery is also designed with analytical operations in mind.
5. Elastic search
There is no managed Elasticsearch service provided directly by GCP, but you can find several options on the Marketplace, like an API service or a Kubernetes app for Google Kubernetes Engine.
6. User interface
If you are referring to a user interface for your own use, GCP’s console is what you’d be using. If you are referring to a UI for end-users, I’d suggest using App Engine.
If you are referring to a UI for data exploration, there is Datalab, which is essentially a managed notebook service, and Data Studio, where you can build plots of your data in real time.
7. Batch and Real-time publish
The publishing service in GCP, for both synchronous and asynchronous messages is Pub/Sub.
8. Security
Most security concerns in GCP are addressed here; it is a wide topic by itself and would probably need a separate question.
9. Logging/Audit
GCP uses Stackdriver for logging of most of its products, and provides many ways to process and analyze those logs.
10. Monitoring
Stackdriver also has monitoring features.
11. Code repository
For this there is Cloud Source Repositories, which integrates with GCP’s automated build system and can also be easily synced with a GitHub repository.
12. Analytical data warehouse
You did not ask for this one, but I think it's an important part of a data analysis stack.
In the case of GCP, this would be BigQuery.
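As a quick illustration of using BigQuery from application code, a minimal query sketch (the project, dataset, and table names are made up):

```python
# Hypothetical example: run an analytical aggregation in BigQuery from Python.
from google.cloud import bigquery

client = bigquery.Client()  # picks up credentials/project from the environment

query = """
    SELECT country, COUNT(*) AS orders
    FROM `my-project.sales.orders`
    GROUP BY country
    ORDER BY orders DESC
    LIMIT 10
"""

for row in client.query(query).result():  # runs the job and waits for the rows
    print(row.country, row.orders)
```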

Best option to schedule payments: azure scheduler, WebJob or Azure Functions or a Worker Role?

I've hosted my website on Azure and now I want to schedule payments on a monthly basis. I am using Authorize.net for payments, but I cannot use their recurring billing feature as it gives very little control. I have to perform checks in the database, make payments, and update records. What should I use: Azure Scheduler, an Azure WebJob, Azure Functions, or a Worker Role?
Definitely not a Worker Role. They are very heavyweight and generally not worth the effort for a single, simple job like this.
WebJobs might be a good solution. They can run in the context of your web app, so you can use them at no additional cost. But you'll need to do some development here: you have to create an app that calls Authorize.net.
If you only need to fire a single HTTP request, then using Azure Scheduler to schedule this HTTP action might be a good choice. You can configure the request itself (headers, payload) and it has error handling as well. But you might have to store sensitive information in the Azure portal, in the configuration of the scheduled job.
So I'd say forget about the Worker Role, then weigh simplicity against flexibility and development effort. That being said, I would probably try it with the Scheduler first, and then move on to a WebJob if I encounter something that is not feasible with the Scheduler.
Edit:
Azure Functions can also be a good option; I'd say it's sort of a middle ground between the WebJob and the simple scheduled option. It is part of the App Service feature set, so it can run in the same App Service plan as the web app, at no extra cost. But here you have to code the HTTP request to Authorize.net yourself as well. Azure Functions is a lot more lightweight compared to WebJobs: you do not have to create an exe (or PowerShell script or whatever), you can just write the HTTP request in a script editor inside the Azure portal. But you still have to do it yourself. It is a bit more flexible than the simple scheduled option though, which is something to consider when it comes to error handling.
So this is a good middle ground, but I think it's still a fair amount of work given how simple the task is (firing a single HTTP request).
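To give a sense of the size of that work, a timer-triggered function is only a few lines. This sketch uses the Python flavor of Azure Functions with the classic function.json model; the monthly CRON schedule lives in function.json, and the Authorize.net payload and setting names are placeholders:

```python
# Hypothetical timer-triggered Azure Function (Python, function.json model).
# The schedule (e.g. "0 0 0 1 * *" for monthly) is configured in function.json;
# the request payload and app-setting names below are placeholders.
import logging
import os

import azure.functions as func
import requests

def main(mytimer: func.TimerRequest) -> None:
    payload = {
        # ... build the charge request from your own database checks ...
        "merchantAuthentication": {
            "name": os.environ["AUTHNET_LOGIN_ID"],  # placeholder app settings
            "transactionKey": os.environ["AUTHNET_TRANSACTION_KEY"],
        },
    }
    resp = requests.post(
        "https://api.authorize.net/xml/v1/request.api",  # Authorize.net production endpoint
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
    logging.info("Monthly payment run finished with status %s", resp.status_code)
```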
To get it working quickly, Logic Apps is a good choice. With Logic Apps, you can trigger it with a timer based on a schedule you define, and use the out-of-the-box SQL/DocumentDB connectors (depending on your exact scenario) to connect to your database. Although there's currently no Authorize.net connector available, you should be able to use the generic HTTP action to talk to its RESTful APIs. Most likely, you'll be able to get this working very quickly. I'd also recommend submitting a suggestion on aka.ms/logicapps-wish so we can track the request for an Authorize.net connector, which, when available, will make this even easier.

What is the "proper" way to use DynamoDB for an iOS app?

I've just started messing around with AWS DynamoDB in my iOS app and I have a few questions.
Currently, I have my app communicating directly with my DynamoDB database. I've been reading around lately, and people are saying this isn't the proper way to go about getting data from my database.
By this I mean I just have a function in my code that queries my DynamoDB database and returns the result.
The way I do it works, but is there a better way I should be going about this?
Amazon DynamoDB itself is a highly scalable service, and standing up another server in front of it means that server also has to be scaled in line with the RCU/WCU configured for your tables, which we can and should avoid.
If your mobile application doesn't need a backend server and you can perform all the business functions from the mobile device, then you should probably think about:
Using the AWS DynamoDB SDK for iOS to write the client application that runs on the mobile device.
Using the AWS Token Vending Machine to authenticate your mobile users and grant them credentials for running operations on DynamoDB tables.
Controlling access (i.e. which operations should be allowed on which tables, etc.) using IAM policies.
HTH.
From what you say, I guess that you are talking about a way to distribute data to many clients (iOS apps).
There are a few integration patterns (a very good book on this: Enterprise Integration Patterns), one of which is called Shared Database. It is essentially about using a common database that multiple clients share. The main drawback of that pattern (in your case) is that every client makes assumptions about what the database schema looks like. That can potentially bring you headaches supporting the schema in the future if your business logic changes.
The more advanced approach would be to send events on every change in your data instead of writing changes to the database directly from the client apps. This way you can add additional processing to the events before the data they carry is written to the database. For example, you may want to change the event format in a new version of your app but still support legacy users, so you add a translation procedure that transforms both types of events into the format that fits the database schema. It's basically a question of whether to work with diffs vs. snapshots.
You should be aware of the added complexity of working with events; it can be overkill if your app is simple and changes to the schema are unlikely.
Also consider that you can do data preprocessing using DynamoDB Streams, which gives you some of the advantages of events while keeping the implementation simple.
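A rough sketch of that last idea: a Lambda function attached to the table's stream sees every change the iOS clients make, so validation or schema translation can live server-side instead of in the apps (the table and attribute names are invented):

```python
# Hypothetical Lambda handler attached to a DynamoDB Stream: every write made
# by the iOS clients arrives here as a stream record, where it can be
# validated, translated to a newer schema, or fanned out. Names are made up.
import boto3

dynamodb = boto3.resource("dynamodb")
derived_table = dynamodb.Table("UserActivitySummary")  # hypothetical derived table

def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        new_image = record["dynamodb"]["NewImage"]  # attributes in DynamoDB JSON format
        user_id = new_image["userId"]["S"]
        # Example of server-side preprocessing: keep a per-user counter up to date.
        derived_table.update_item(
            Key={"userId": user_id},
            UpdateExpression="ADD eventCount :one",
            ExpressionAttributeValues={":one": 1},
        )
```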