What is the invocation flow of an action sequence in OpenWhisk?

I am a little confused about the invocation flow of an action sequence. I read the code, and it shows that each sequence has a main action, which invokes each action in that sequence. For each invocation, the main action issues a POST to the API host. Does this mean the whole flow (controller -> Kafka -> dispatcher -> invoker -> container) is repeated again and again?

Update:
Quite recently (per commit ca15c68d348a2a02cf9da54475e96b43d48a3dac) sequences got a huge overhaul. The "root" action mentioned below is no longer needed, and the invocation of all the actions is now orchestrated internally by the controller itself.
Because this change is quite recent (as of 21 November 2016), it might not yet be deployed to all production environments.
What you described is basically right. The "root" action serves as an orchestrator for the "leaf" actions: it invokes the leaf actions one by one through the usual API, thus repeating that flow over and over again.

Conceptually, that is how one can implement a sequence directly. In this commit (https://github.com/openwhisk/openwhisk/commit/ca15c68d348a2a02cf9da54475e96b43d48a3dac), the sequence "main" is internalized into the controller, which bypasses repeated authentication and entitlement checks. Requests are still posted internally to Kafka, since that keeps them subject to load balancing.
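For illustration, here is a minimal sketch (not OpenWhisk source) of what such a "root" orchestrator does conceptually: invoke each leaf action through the public REST API, feeding the output of one into the next. The host, credentials, and action names are placeholders, not real deployment values.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class SequenceOrchestrator {
    public static void main(String[] args) throws Exception {
        String apihost = "https://openwhisk.example.com";  // placeholder
        String auth = Base64.getEncoder()
                .encodeToString("user:key".getBytes());    // placeholder credentials
        String[] actions = {"split", "transform", "join"}; // hypothetical leaf actions
        String payload = "{}";

        HttpClient client = HttpClient.newHttpClient();
        for (String action : actions) {
            // Every leaf invocation goes through the normal public API, so each hop
            // re-enters the controller -> Kafka -> invoker -> container pipeline.
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(apihost + "/api/v1/namespaces/_/actions/"
                            + action + "?blocking=true&result=true"))
                    .header("Authorization", "Basic " + auth)
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(payload))
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            payload = response.body(); // result of one action is the input of the next
        }
        System.out.println(payload);
    }
}
```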

Related

Writing Cucumber scenarios for reactive code

I am very new to BDD and having a bit of trouble outlining the scenarios for some code that I wrote. The code queries a Couchbase bucket for abandoned orders and then calls a cancel-order API to cancel those orders. For each call to the cancel-order API, it calls another service to generate an access token.
The entire code has been written in RxJava. In case of errors, I have fallback observables in place (for example, should anything go awry while querying Couchbase, it falls back to an empty observable). I have similar fallbacks in other places in the code as well.
I want to write Cucumber scenarios for my code, but I can't figure out how to go about it. For example, should I assume that the service has a valid access token and an orderId to cancel? (Querying Couchbase returns a bunch of orderIds that need to be passed on to the cancel-order API along with an access token.)
Ideally, I should be testing the following:
Querying Couchbase fails; in this case, I should get an empty observable.
The call to the access token API fails; in this case as well, I should get an empty observable.
The call to the cancel-order API fails; again, I should get an empty observable.
The call to the cancel-order API returns a response code other than 200; in this case, assert on the response code.
The happy case.
Now suppose I want to test the first case, that is, the Couchbase query fails. What will be the background in this case? How should I simulate a query failure?
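One way to approach that first case is to simulate the failure at the observable level rather than against a real Couchbase instance. A minimal sketch, assuming RxJava 1 and Cucumber-JVM; the step texts, the OrderRepository-style wiring, and the fallback placement are illustrative, not taken from the original code:

```java
import cucumber.api.java.en.Given;
import cucumber.api.java.en.Then;
import cucumber.api.java.en.When;
import rx.Observable;
import rx.observers.TestSubscriber;

public class AbandonedOrderSteps {
    private Observable<String> abandonedOrders;
    private final TestSubscriber<String> subscriber = new TestSubscriber<>();

    @Given("^the Couchbase query fails$")
    public void couchbaseQueryFails() {
        // Simulate the failure directly: no real Couchbase needed. The
        // onErrorResumeNext fallback is the behavior under test.
        abandonedOrders = Observable.<String>error(new RuntimeException("query failed"))
                .onErrorResumeNext(Observable.empty());
    }

    @When("^the abandoned-order job runs$")
    public void jobRuns() {
        abandonedOrders.subscribe(subscriber);
    }

    @Then("^no orders are cancelled$")
    public void noOrdersCancelled() {
        subscriber.assertNoValues();  // fallback produced an empty stream
        subscriber.assertCompleted(); // and it completed without an error
    }
}
```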

Event Sourcing: concurrently creating conflicting events

I am trying to implement an Event Sourcing system using Kafka and have run into the following issue. During a new user sign-up I want to check if the username the user provided is already taken. However, consider the case where 2 users are trying to sign-up at the same time providing the same username.
In my understanding of how ES works, the controller that processes the sign-up request will check if the request is valid, it will then send a new event (e.g. NewUser) to Kafka, and finally that event will be picked up by another controller which will persist it in a materialized view (e.g. a Postgres DB). The problem is that the validation of the request is done against the materialized view, but the actual persistence happens later. Because the two requests are being processed in parallel (by different service instances), both might pass the validation, resulting in two NewUser messages. However, when the second controller tries to persist those two NewUser messages in the database, saving the second event will fail because of the violation of the uniqueness constraint on the username.
Any ideas on how to address this?
Thanks.
UPDATE:
In particular, I would like to verify whether the following are accepted approaches to the problem:
use the username as the userId (restrictive)
send an event to a topic partitioned by username and, when validation is done, send an event to another topic
Initial validation against the materialized view won't be enough in most scenarios where you have constraints: there can always be relevant events that haven't been materialized yet. There are two main concurrency-control approaches to ensure that correct results are generated:
1. Pessimistic approach:
If you want to validate constraints before you publish an event, you need to lock the relevant resources (entity, aggregate, or data set). While the lock is held, your services must not be able to publish events on these resources. After this point, to get the current state of your data:
You can wait until all events published before locking are materialized.
You can read current state from the database and apply events on it in a separate process.
2. Optimistic approach:
In this approach, you perform your validations after publishing events. To achieve this, you need to implement a feedback mechanism. The process which consumes events and performs validations should be able to publish validation results. You can perform the validations in-memory when possible. Otherwise, you can rely on your materialized data store.
Martin Kleppmann talks about a two-step solution to exactly this problem here and in his book. In this solution, there are two topics: "claims" and "registrations". First you publish a claim to take the username, then you try to write it to the database, and finally you publish the result to the registrations topic. At a conceptual level, it follows the same steps as the second approach you mentioned. In the validation step, it avoids implementing validation logic and keeping secondary indexes in memory by relying on the database.
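For illustration, a minimal sketch of the claim-publishing step using the Kafka Java client. The topic names, keys, and payloads are illustrative; the point is that keying by username sends all claims for the same name to one partition, where a single consumer can decide the winner in order.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UsernameClaims {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key = username, so two competing claims for "alice" are serialized
            // into the same partition; the consumer that processes them decides
            // who wins and publishes the outcome to the "registrations" topic.
            producer.send(new ProducerRecord<>("username-claims", "alice",
                    "{\"userId\":\"u-1\",\"username\":\"alice\"}"));
            producer.send(new ProducerRecord<>("username-claims", "alice",
                    "{\"userId\":\"u-2\",\"username\":\"alice\"}"));
        }
    }
}
```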
During a new user sign-up I want to check if the username the user provided is already taken.
You may want to review Greg Young's essay on Set Validation.
In my understanding of how ES works the controller that processes the sign-up request will check if the request is valid, it will then send a new event (e.g. NewUser) to Kafka, and finally that event will be picked up by another controller which will persist it in a materialized view (e.g. Postgres DB).
That's a little bit different from the usual arrangement. (You may also want to review Greg's talk on polyglot data.)
Suppose we begin with two writers; that's fine, but if there is going to be a single point of truth, then you are going to need synchronization somewhere.
The usual arrangement is to use a form of optimistic concurrency: when processing a request, you keep a copy of your original state, then you do your calculation, and finally you send the book of record a `replace(originalState, newState)`.
So at this point, we have two writes racing toward the book of record
replace(red,green)
replace(red,blue)
At the book of record, the writes are processed in series.
[...,replace(red,blue)...,replace(red,green)]
So when the book of record processes replace(red,blue), it performs a check that yes, the state is currently red, and swaps in blue. Later, when the book of record tries to process replace(red,green), the book of record performs the check, which fails because the state is no longer red.
So one of the writes has succeeded, and the other has failed; the latter can propagate the failure outwards, or retry, and so on; precisely what happens depends on the specific mechanics in question. A retry should mean, of course, reloading the "original state", at which point the model would discover that some previous edit has already claimed the username.
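A minimal in-memory sketch of that check, using AtomicReference as a stand-in for whatever conditional write ("expected version") the book of record offers:

```java
import java.util.concurrent.atomic.AtomicReference;

public class BookOfRecord {
    public static void main(String[] args) {
        AtomicReference<String> state = new AtomicReference<>("red");

        // Two racing writers both read "red" as their original state.
        // (compareAndSet uses reference equality; string literals are interned,
        // so the comparison works for this demo.)
        boolean first = state.compareAndSet("red", "blue");   // succeeds
        boolean second = state.compareAndSet("red", "green"); // fails: state is now "blue"

        System.out.println(first + " " + second + " -> " + state.get());
        // prints: true false -> blue
        // The losing writer must reload the state and retry, or report a conflict.
    }
}
```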
Any ideas on how to address this?
A single writer per stream makes the rest of the problem pretty simple, by eliminating the ambiguity introduced by having multiple in-memory copies of the model.
Multiple writers using a synchronous write to the durable store is probably the most common design. It requires an event store that understands the idea of writing to a specific location in a stream -- aka "expected version".
You can perform an asynchronous write, and then start doing other work until you get an acknowledgement that the write succeeded (or failed, or timed out, and so on).
There's no magic -- if you want uniqueness (or any other sort of invariant enforcement, for that matter), then everybody needs to agree on a single authority, and anybody else who wants to propose a change won't know if it has been accepted without getting word back from the authority, and needs to be prepared for a rejected proposal.
(Note: this shouldn't be a surprise -- if you were using a traditional design with current state stored in a RDBMS, then your authority would be a user table in the database, with a uniqueness constraint on the username column, and the race would be between the two insert statements trying to finish their transaction first....)
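For illustration, a sketch of that traditional arrangement in JDBC. The table and column names are illustrative; exactly which exception type surfaces is driver-dependent, so the portable check is SQLState class "23" (integrity constraint violation).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class SignUp {
    // Returns true if this writer won the race for the username.
    public static boolean claimUsername(String jdbcUrl, String username)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement stmt = conn.prepareStatement(
                     "INSERT INTO users (username) VALUES (?)")) {
            stmt.setString(1, username);
            stmt.executeUpdate();
            return true;
        } catch (SQLException e) {
            // The loser of the race lands here when the unique index on
            // username rejects its insert.
            if (e.getSQLState() != null && e.getSQLState().startsWith("23")) {
                return false;
            }
            throw e;
        }
    }
}
```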

WSO2 ESB: Store state between sequence invocations

I was wondering about the proper way to store state between sequence invocations in WSO2 ESB. In other words, if I have a scheduled task that invokes sequence S, at the end of iteration 0 I want to store some String variable (let's call it ID), and then I want to read this ID at the start (or in the middle) of iteration 1, and so on.
To be more precise, I want to get a list of new SMS messages from an existing service, Twilio to be exact. However, Twilio only lets me get messages for selected days; there's no way for me to say "give me only new messages" (since I last checked, or newer than a certain message ID). Therefore, I'd like to create a scheduled task that will query Twilio and pass only new messages via a REST call to my service. To do this, my sequence needs to query Twilio, go through the returned list of messages, and discard messages that were already reported in the previous invocation. This means I need to store some state between task/sequence invocations: at the end of the sequence I need to store the ID of the newest message in the current batch. This ID can then be used in the subsequent invocation to determine which messages were already reported.
I could use the DBLookup and DBReport mediators, but that seems like overkill (using a database to store a single string) and not very performance-friendly. On the other hand, as far as I can see, Class mediators are instantiated as singletons, so I could create a custom Class mediator that would manage this state and filter the list of messages to be sent to my service. I am quite sure that this will work, but I was wondering whether this is the way to go, or whether there is a more elegant solution that I missed.
We can think of 3 options here.
Using DBLookup/Report as you've suggested
Using the Carbon registry to store the values (this again uses DBs in the back end)
Using a Custom mediator to hold the state and read/write it from/to properties
Out of these three, the third one will obviously deliver the best performance, since everything stays in memory. It's also quite simple to implement; some time back I did something similar and wrote a blog post about it here.
On the other hand, the first two options can preserve the state even when the server crashes, if that's a concern for your use case.
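As a minimal sketch of the third option, assuming the Synapse mediator API (the property names here are illustrative, not standard):

```java
import org.apache.synapse.MessageContext;
import org.apache.synapse.mediators.AbstractMediator;

public class LastSeenMediator extends AbstractMediator {
    // WSO2 instantiates one mediator instance per sequence definition, so this
    // field survives across invocations (but not across server restarts).
    private volatile String lastSeenId;

    @Override
    public boolean mediate(MessageContext context) {
        // Expose the previous run's value to the sequence as a property...
        context.setProperty("LAST_SEEN_ID", lastSeenId);
        // ...and remember the newest ID that the sequence computed this run.
        Object newest = context.getProperty("NEWEST_MESSAGE_ID"); // set earlier in the sequence
        if (newest != null) {
            lastSeenId = newest.toString();
        }
        return true; // continue the mediation flow
    }
}
```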
Since ESB 4.9.0 you can persist and read properties from the registry using the Property mediator.
https://docs.wso2.com/display/ESB490/Property+Mediator

REST API - Update of single resource changes multiple others

I'm looking for a way to deal with the following problem:
Imagine you modify a resource, and that subsequently causes updates of other resources.
E.g. you issue a PUT to, say, /api/orders/1234, which by definition changes the state of all other Orders of the given user. There may be UI clients that display the table of Orders, and they should know that not only the single item in the table was updated, but potentially others as well.
Now, is there any standard way to inform clients about such a situation?
So far I can only think of sending back the 205 Reset Content HTTP status code to inform the client that it should refresh its state, as more than a single thing was changed.
There are multiple solutions.
You can define specific resources as non-cacheable, so the client does not cache them at all (no-store).
You can give them a max-age of 0, so the client will always have to revalidate those resources. In this case you might have to implement ETags and conditional GETs, but it will be easier on the server than option 1.
You can use a push mechanism such as WebSockets.
If you really want to "notify" potentially multiple clients of a change, then it sounds like you would need option 3.
However, correctly configured caching is normally good enough. For example, you could mark not-yet-executed orders as non-cacheable (max-age=0), but as soon as an order is executed, you might mark it as cacheable indefinitely, since it cannot change anymore.
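As a sketch of that caching strategy in a plain servlet (the path handling and the isExecuted/etagFor helpers are hypothetical stubs):

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class OrderServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        boolean executed = isExecuted(req.getPathInfo()); // hypothetical lookup
        if (executed) {
            // Executed orders can never change again: cache aggressively.
            resp.setHeader("Cache-Control", "max-age=31536000, immutable");
        } else {
            // Open orders may be changed by updates to sibling resources:
            // make the client revalidate on every use.
            resp.setHeader("Cache-Control", "max-age=0, must-revalidate");
            resp.setHeader("ETag", etagFor(req.getPathInfo())); // hypothetical helper
        }
        resp.setContentType("application/json");
        resp.getWriter().write("{\"id\":\"1234\"}");
    }

    private boolean isExecuted(String path) { return false; } // stub
    private String etagFor(String path) { return "\"v1\""; }  // stub
}
```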

TMG SF_NOTIFY_POLICY_CHECK_COMPLETED Event

According to http://msdn.microsoft.com/en-us/library/ff823993%28v=VS.85%29.aspx, during this event the web filter can request the GUID of the matching rule. I am assuming that is done by performing a GetServerVariable call with the type SELECTED_RULE_GUID, since I could find no other readily identifiable means of doing so.
My problem comes from the fact that I want to see whether the rule is allowing or blocking the request. If it's being blocked, then my filter doesn't have to take any action, but if it's being allowed, I need to do some work. SF_NOTIFY_POLICY_CHECK_COMPLETED seems to be the best event to watch, since it occurs late enough that authentication and various ms_auth traffic has been handled, but just before the request either gets routed or fetched from cache.
I had thought that perhaps I needed to use COM and the IFPC interfaces (following along with the example code for registering Web Filters with TMG) to get details on the rule. However, going down via FPC -> FPCArray -> FPCArrayPolicy -> FPCPolicyRules, the only element-returning function takes either an index or a name, which is problematic given that I only have a GUID.
The FPCPolicyRule object (singular) doesn't seem to have any field related to the GUID either, which rules out simply iterating over the collection for it.
So my question boils down to, from the SF_NOTIFY_POLICY_CHECK_COMPLETED event, how would a web filter determine if the request has been allowed or denied?
After more investigation and testing: the GUID is accessible via the PersistentName property of the FPCPolicyRule object. Since the FPCPolicyRules->Item member only works with either a Name or an Index, I had to iterate through its items, comparing each PersistentName against the GUID.
Apologies if this was obvious, took me a good day to work out :)