I have a long-running POST request which updates the session with the requested result.
Now, when such concurrent POST requests are made from the same session, the updates made by one request are not visible to the others.
The effect is that the session updates made in some of the concurrent requests are eventually lost.
How is such a scenario normally handled?
So the scenario is, in short, this:
long requests
from the same user
potentially concurrent
The first clarification you should make is whether the concurrency is inherent to the problem domain, or produced by user error (e.g. clicking some control twice).
In the latter case, both double submissions and double request processing can be prevented by lock-like mechanisms.
If concurrency is supposed to happen, you need to define the semantics of handling these concurrent requests.
Maybe the last writer wins, or maybe the requests' operations are commutative and can be applied together.
Once you determine these semantics, translating them to code should be easy. That said, it sounds like either Clojure's reference types (atom/agent/ref) or a database's facilities would be a better fit than a session object.
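For the lock-like case, here is a minimal Java-flavoured sketch (the attribute name and helper class are made up; in Clojure the equivalent merge could be a swap! on an atom): each request merges its result into a session-scoped map under a per-session lock, so concurrent requests stop overwriting each other's updates wholesale.

```java
import javax.servlet.http.HttpSession;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: merge each request's result into a session-scoped map
// instead of overwriting the whole attribute, so concurrent requests from the
// same session don't silently clobber each other's updates.
public final class SessionResults {

    private static final String KEY = "results";   // assumed attribute name
    // One lock object per session id; never cleaned up in this sketch.
    private static final ConcurrentHashMap<String, Object> LOCKS = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    public static void putResult(HttpSession session, String requestId, Object result) {
        Object lock = LOCKS.computeIfAbsent(session.getId(), id -> new Object());
        synchronized (lock) {  // serialize the read-modify-write per session
            Map<String, Object> results = (Map<String, Object>) session.getAttribute(KEY);
            if (results == null) {
                results = new HashMap<>();
            }
            results.put(requestId, result);      // merge, don't replace
            session.setAttribute(KEY, results);  // re-set so replicated sessions see the change
        }
    }
}
```

A real implementation would also evict the lock entries when the session is destroyed.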
Related
My current application owns multiple «activatable» objects*. My intent is to "run" all those objects in the same io_context and to add the necessary protection in order to toggle from a single thread to multiple threads (to make it scalable).
If these objects were completely independent from each other, the number of threads running the associated io_context could grow smoothly. But since those objects need to cooperate, the application crashes when run multithreaded, despite the strand in each object.
Let's say we have objects of type A and type B, all of them served by the same io_context. Each of those types runs asynchronous operations (timers and sockets, whose handlers are wrapped in bind_executor(strand, handler)), and can build a cache based on information received via sockets and on operations posted to them. Objects of type A need to get information cached by multiple instances of B in order to perform their own work.
Would it be possible to access this information by using strands (without adding explicit mutex protection) and if yes how ?
If not, what strategy could be adopted to achieve the scalability?
I already tried playing with futures but that strategy leads unsurprisingly to deadlocks.
Thanks
(*) Maybe I'm wrong in the terminology: the objects get a reference to an io_context and own their own strand, so I think they are activatable, because they don't really own a running thread.
You're mixing somewhat vague words: "activatable", "strandify", "inter-cooperating". They're all close to meaningful concepts, yet narrowly avoid binding to any precise meaning.
Deconstructing
Let's simplify using more precise concepts.
Let's say we have objects of type A and type B, all of them served by the same io_context
I think it's more fruitful to say "types A and B have associated executors". When you make sure all operations on A and B operate from that executor and you make sure that executor serializes access, then you basically get the Active Object pattern.
[can build a cache based on information received via sockets] and posted operations to them
That's interesting. I take that to mean you don't directly call members of the class, unless they defer the actual execution to the strand. This, again, would be the Active Object.
However, your symptoms suggest that not all operations are "posted to them". Which implies they run on arbitrary threads, leading to your problem.
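Not Asio, but the same Active Object idea can be sketched in Java, which may make the "everything is posted" requirement concrete: a single-threaded executor plays the role of the strand, and no caller ever touches the object's state directly (all names are illustrative).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative Active Object: the single-threaded executor plays the role of a strand.
// All reads and writes of 'cache' happen on that executor, never on the caller's thread.
class CachingActiveObject {
    private final ExecutorService strand = Executors.newSingleThreadExecutor();
    private final Map<String, String> cache = new HashMap<>(); // only touched on 'strand'

    void update(String key, String value) {
        strand.execute(() -> cache.put(key, value));            // "post" the mutation
    }

    CompletableFuture<String> lookup(String key) {
        return CompletableFuture.supplyAsync(() -> cache.get(key), strand); // read on the strand too
    }
}
```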
Would it be possible to access this information by using strands (without adding explicit mutex protection) and if yes how ?
The key to your problems is here: data dependencies. It's also likely going to limit the usefulness of scaling, unless of course the generation of the information retrieved from other threads is a computationally expensive operation.
However, the phrase "to get information cached from multiple instances of B" suggests that, in fact, the data is instantaneous, and you'll just be paying synchronization costs for accessing it across threads.
Questions
Q. Would it be possible to access this information by using strands (without adding explicit mutex protection) and if yes how ?
Technically, yes: by making sure all operations go through the strand, so that the objects become true active objects.
However, there's an important caveat: strands aren't zero-cost. Only in certain contexts can they be optimized away (e.g. in immediate continuations, or when the execution context has no concurrency).
In all other contexts, they end up synchronizing at a cost similar to mutexes. The purpose of a strand is not to remove lock contention; rather, it allows one to declaratively specify the synchronization requirements for tasks, so that the same code can be correctly synchronized regardless of the method of async completion (callbacks, futures, coroutines, awaitables, etc.) or the chosen execution context(s).
Example: I recently uncovered a vivid illustration of the cost of strand synchronization even in a simple context (where serial execution was already implicitly guaranteed) here:
sehe mar 15, 23:08 Oh cool. The strands were unnecessary. I add them for safety until I know it's safe to go without. In this case the async call chains form logical strands (there are no timers or full duplex sockets going on, so it's all linear). That... improves the situation :)
Now it's 3.5gbps even with the 1024 byte server buffer
The throughput increased ~7x from just removing the strand.
Q. If not, what strategy could be adopted to achieve the scalability?
I suspect you really want caches that contain shared_futures, so that the first retrieval puts the future for the result in the cache, and subsequent retrievals get the already existing shared future immediately.
If you make sure your cache lookup data structure is thread-safe, likely with a reader/writer lock (shared_mutex), you will be free to access it with minimal overhead from any actor, instead of having to go through the individual strands of each producer.
Keep in mind that awaiting futures is a blocking operation, so if you do that from tasks posted on the execution context, you may easily run out of threads. In such cases it may be better to provide an async_get in terms of boost::asio::async_result or boost::asio::async_completion, so you can wait in a non-blocking fashion.
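A hedged, Java-flavoured sketch of that idea (the C++ analogue would be a map of shared_futures guarded by a shared_mutex): the cache stores futures, the lookup structure is thread-safe, and consumers attach continuations instead of blocking.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache of futures: the first caller kicks off the (possibly expensive) production,
// later callers immediately get the same future, and nobody blocks a pool thread.
class FutureCache<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, CompletableFuture<V>> producer;

    FutureCache(Function<K, CompletableFuture<V>> producer) {
        this.producer = producer;   // should return quickly, e.g. just post work to the owner's strand
    }

    CompletableFuture<V> get(K key) {
        // computeIfAbsent is atomic, so the producer runs at most once per key
        return cache.computeIfAbsent(key, producer);
    }
}
```

Callers would then chain `futureCache.get(key).thenAccept(...)` rather than joining, so no pool thread blocks while waiting for a producer.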
I'm studying event sourcing and command/query segregation and I have a few doubts that I hope someone with more experience will easily answer:
A) should a command handler work with more than one aggregate? (a.k.a. should they coordinate things between several aggregates?)
B) If my command handler generates more than one event to store, how do you guys push all those events atomically to the event store? (How can I guarantee no other command handler will "interleave" events in between?)
C) In many articles I read, people suggest using optimistic locking to write the new events generated, but in my use case I will have around 100 requests/second. This makes me think that a lot of requests will just fail at huge rates (a lot of ConcurrencyExceptions). How do you guys deal with this?
D) How to deal with the fact that the command handler can crash after storing the events in the event store but before publishing them to the event bus? (how to eventually push those "confirmed" events back to the event bus?)
E) How do you guys deal with the eventual consistency in the projections? Do you just live with it? Or do people lock things there too in some cases (waiting for an update, for example)?
I made a sequence diagram to better illustrate all those questions.
(And sorry for the bad English.)
If my command handler generates more than one event to store, how do you guys push all those events atomically to the event store?
Most reasonable event store implementations will allow you to batch multiple events into the same transaction.
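As a sketch, with a hypothetical event store client (the interface below is made up, not any particular product's API), a command handler appends all events from one command as a single batch at an expected stream version:

```java
import java.util.List;

// Hypothetical event store client; real stores expose something equivalent:
// append a batch of events to a stream at an expected version, atomically.
interface EventStore {
    void append(String streamId, long expectedVersion, List<Object> events)
            throws ConcurrencyException;
}

class ConcurrencyException extends Exception { }

class ShipOrderHandler {
    private final EventStore store;
    ShipOrderHandler(EventStore store) { this.store = store; }

    void handle(String orderId, long expectedVersion, List<Object> newEvents)
            throws ConcurrencyException {
        // All events produced by this command land in one atomic append:
        // either they all become visible, or none do, and nothing interleaves.
        store.append("order-" + orderId, expectedVersion, newEvents);
    }
}
```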
In many articles I read people suggest using optimistic locking to write the new events generated, but in my use case I will have around 100 requests / second.
If you have lots of parallel threads trying to maintain a complex invariant, something has gone badly wrong.
For "events" that aren't expected to establish or maintain any invariant, then you are just writing things to the end of a stream. In other words, you are probably not trying to write an event into a specific position in the stream. So you can probably use batching to reduce the number of conflicting writes, and a simple retry mechanism. In effect, you are using the same sort of "fan-in" patterns that appear when you have concurrent writers inserting into a queue.
For the cases where you are establishing/maintaining an invariant, you don't normally have many concurrent writers. Instead, specific writers have authority to write events (think "sharding"); the concurrency controls there are primarily to avoid making a mess in abnormal conditions.
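A minimal sketch of the retry mechanism mentioned above, reusing the hypothetical EventStore from the earlier snippet: on a version conflict, re-read the stream head, re-run the decision logic, and try the append again.

```java
import java.util.List;
import java.util.function.LongFunction;

// Sketch of optimistic-concurrency retry around the hypothetical EventStore above.
class RetryingAppender {
    private final EventStore store;
    RetryingAppender(EventStore store) { this.store = store; }

    void appendWithRetry(String streamId, LongFunction<List<Object>> decide,
                         long initialVersion, int maxAttempts) throws ConcurrencyException {
        long expectedVersion = initialVersion;
        for (int attempt = 1; ; attempt++) {
            try {
                store.append(streamId, expectedVersion, decide.apply(expectedVersion));
                return;                                            // success
            } catch (ConcurrencyException e) {
                if (attempt == maxAttempts) throw e;               // give up eventually
                expectedVersion = reloadCurrentVersion(streamId);  // re-read, then re-decide
            }
        }
    }

    private long reloadCurrentVersion(String streamId) {
        return 0; // placeholder: a real implementation re-reads the stream head
    }
}
```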
How to deal with the fact that the command handler can crash after storing the events in the event store but before publishing them to the event bus?
Use pull, rather than push, as the primary subscription mechanism. Make sure that subscribers can handle duplicate messages safely (aka "idempotent"). Don't use a message subscription that can re-order events when you need events strictly ordered.
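A sketch of what a pull-based, idempotent subscriber can look like (the feed interface and checkpoint handling are made up for illustration): it reads from its last known position, in order, and a duplicate delivery produces no second side effect.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pull-based feed: events are read in stream order from a position.
interface EventFeed {
    List<StoredEvent> readFrom(long position, int batchSize);
}

record StoredEvent(long position, String eventId, Object payload) { }

class Projector {
    private final EventFeed feed;
    private long lastProcessed;                       // persisted checkpoint in a real system
    private final Set<String> seen = new HashSet<>(); // dedupe ids; bounded/persistent in practice

    Projector(EventFeed feed, long checkpoint) {
        this.feed = feed;
        this.lastProcessed = checkpoint;
    }

    void poll() {
        for (StoredEvent e : feed.readFrom(lastProcessed + 1, 100)) {
            if (seen.add(e.eventId())) {              // duplicate-safe: apply at most once
                apply(e.payload());
            }
            lastProcessed = e.position();             // advance checkpoint after handling
        }
    }

    private void apply(Object payload) { /* update the read model */ }
}
```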
How do you guys deal with the eventual consistency in the projections? Do you just live with it?
Pretty much. Views and reports carry metadata to let you know at what fixed point in "time" the report was accurate.
Unless you lock out all writers while a report is being consumed, there's the potential for any data to be out of date, regardless of whether you are using events or some other data model, and regardless of whether you are using a single data model or several.
It's all part of the tradeoff; we accept that there will be a larger window between report time and current time in exchange for lower response latency, an "immutable" event history, etc.
should a command handler work with more than one aggregate?
Probably not, which isn't the same thing as "never".
The usual framing goes something like this: an aggregate isn't a domain modeling pattern, like an entity; it's a lifecycle pattern, used to make sure that all of the changes we make at one time are consistent.
In the case where you find that you want a command handler to modify multiple domain entities at the same time, and those entities belong to different aggregates, then have you really chosen the correct aggregate boundaries?
What you can do sometimes is have a single command handler that manages multiple transactions, updating a different aggregate in each. But it might be easier, in the long run, to have two different command handlers that each receive a copy of the command and decide what to do, independently.
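For instance, a hedged sketch of that second option (all names invented): two handlers each receive a copy of the same command and update their own aggregate independently.

```java
// Sketch: two independent handlers, each receiving a copy of the same command
// and updating its own aggregate in its own transaction.
record PlaceOrder(String orderId) { }

class ReserveStockHandler {
    void handle(PlaceOrder cmd) { /* load the Inventory aggregate, append its events */ }
}

class BillCustomerHandler {
    void handle(PlaceOrder cmd) { /* load the Billing aggregate, append its events */ }
}
```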
I'm going through this tutorial: https://doc.akka.io/docs/akka/current/typed/guide/tutorial_3.html and don't quite understand when at-most-once message semantics is preferable, since although we get performance gains, we lose the resiliency of messages. It looks like the justification for this trade-off is explained here:
We only want to report success once the order has been actually fully processed and persisted. The only entity that can report success is the application itself, since only it has any understanding of the domain guarantees required. No generalized framework can figure out the specifics of a particular domain and what is considered a success in that domain.
In this particular example, we only want to signal success after a successful database write, where the database acknowledged that the order is now safely stored. For these reasons Akka lifts the responsibilities of guarantees to the application itself, i.e. you have to implement them yourself with the tools that Akka provides. This gives you full control of the guarantees that you want to provide. Now, let’s consider the message ordering that Akka provides to make it easy to reason about application logic.
But I don't quite understand what it means. Any help in understanding this, or other considerations for this decision, is appreciated.
I read this thread, RPC semantics what exactly is the purpose, which seemed to offer a clear definition of the use cases of at-most-once semantics, with payment submission as the example of something you wouldn't want to duplicate. But from the quoted paragraph above, it sounds like the messages would be sent out into the ether with no regard for an ack confirming success or failure of delivery. I'm wondering whether both descriptions of at-most-once semantics are correct for their respective domains, and how to get the behavior described in the other Stack Overflow thread, with an acknowledgement from Akka.
All that anything which doesn't know about the domain can offer with at-least-once or exactly-once delivery is a guarantee that the message has been delivered (a guarantee that the message has been processed is also possible and practical in at least some, but not all, scenarios). This is fine if it's what you want, but conflating it with something higher level (like "order has been durably recorded") is virtually certain to lead to essentially impossible-to-debug bugs down the road.
At-least-once is quite easy to accomplish in Akka: have messages include a field containing an ActorRef to which to send an ack (or other response), and have the sender resend unacked messages (because it's eminently possible for the ack itself to get dropped, these retries are what leads to at-least-once). The ask pattern (included with Akka) provides this at a high level. In Akka Typed it is done by specifying an adapter function, so that when actor A asks actor B, B can send a message in its own protocol and A gets a message in its own protocol (avoiding a chicken-and-egg problem); if no response is received in a specified timeframe, the adapter causes a failure message to be sent to actor A, and at-least-once semantics would then dictate that A eventually retry the message. The critical thing to remember is that it's actor B (or its designee: e.g. if B farms the work out to a worker actor, that worker can send the acknowledgement to A) that decides whether and when to respond, not Akka.
If doing at-least-once, it's very useful to design the messaging protocol around idempotence: a retry of a successful message doesn't result in a side effect beyond an ack. Idempotence plus at-least-once has been referred to as "effectively-once", and it's a lot easier to implement and lighter-weight than exactly-once.
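A plain-Java sketch of the idempotent-receiver half of "effectively-once" (this is not Akka API code; in an actor the ack would be a tell to the replyTo reference): each message carries an id, the receiver remembers what it has processed, and duplicates still get acked so the sender stops retrying.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Sketch of an idempotent receiver: duplicates caused by at-least-once retries
// produce an ack but no second side effect.
class IdempotentReceiver {
    private final Set<String> processed = new HashSet<>(); // persisted in a real system
    private final Consumer<String> sideEffect;
    private final Consumer<String> ack;                     // in Akka: replyTo.tell(new Ack(id))

    IdempotentReceiver(Consumer<String> sideEffect, Consumer<String> ack) {
        this.sideEffect = sideEffect;
        this.ack = ack;
    }

    void onMessage(String messageId, String payload) {
        if (processed.add(messageId)) {   // first time we see this id
            sideEffect.accept(payload);
        }
        ack.accept(messageId);            // always ack, even for duplicates
    }
}
```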
Akka's docs on interaction patterns describe various messaging patterns in Akka, with a discussion of their advantages and disadvantages. Fairly recently, a fairly heavyweight implementation of reliable delivery was added, especially for use with Akka Cluster and Akka Persistence: in the maximum-reliability mode (using Akka Persistence), each message sent this way is persisted to a datastore (e.g. local disk, or Cassandra, or...), so the latency of a message send is severely increased.
I am developing an app right now which creates and stores a connection to a local XMPP server in the Application scope. The connection methods are stored in a cfc that makes sure the Application.XMPPConnection is connected and authorized each time it is used, and makes use of the connection to send live events to users. As far as I can tell, this is working fine. BUT it hasn't been tested under any kind of stress.
My question is: Will this set up cause problems later on? I only ask because I can't find evidence of other people using Application variables in this way. If I weren't using railo I would be using CF's event gateway instead to accomplish the same task.
Size itself isn't a problem. If you were to initialize one object per request, you'd burn a lot more memory. The problem is access.
If you have a large number of requests competing for the same object, you need to measure the access time for that object vs. instantiation. Keep in mind that, for data objects, more than one thread can read them. My understanding, though, is that when an object's function is called, it locks that object to other threads until the function returns.
Also, if the object maintains state, you need to consider what to do when multiple threads are getting/setting that data. Will you end up with race conditions?
You might consider handling this object in the session scope, so that it is only instantiated per user (who, likely, will only make one or two simultaneous requests).
Of course you can use the application scope for storing these components if they are used by all users in different parts of the application.
Now, possible issues are:
size of the component(s)
time needed for initialization if these are set during application start
race conditions between setting/getting the state of these components
For the first, there are ways to calculate the size of a component in memory; lately there have been lots of posts on this topic, so it should be easy to find some. If you don't have some large structure or query saved inside, I guess you're OK here.
Second, again, if you are not filling this CFC with some large query from the DB or doing some slow parsing, you're OK here too.
Third, pay attention to possible situations where multiple users are changing the state of these components. If so, use cflock around each write of a component's state.
I'm looking for an answer that describes a "continuation" mechanism in a web server vs. a programming language.
My understanding is that using continuations, it is trivial to have a "digits of pi" producer communicate with a "digits of pi" consumer, without explicit threading.
I've heard very good things about Jetty continuations. I am curious what others think.
I may have already found my answer, but I'm asking the question here anyway - for the record.
how do they compare to the continuations found in programming languages?
They have nothing in common apart from the name. It's merely a mechanism for freeing the current thread by giving the servlet an API for storing and restoring its state, but it's all rather manually managed, as opposed to real continuations, where the state is automatically inferred from the current context.
The prototypical example of a case where this makes sense is layered (composed) web services, where one service needs to make many requests to other services, and the current thread is freed while those requests are in flight. Upon completion of the requests (which can be handled asynchronously on other threads), the servlet's resume method is called, which then assembles the response from the results of the requests.
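For comparison, roughly what the standard Servlet 3.0 flavour of this looks like (a minimal sketch with error handling omitted; the back-end client is hypothetical): the container thread is released immediately, and the response is completed later from whichever thread finishes the back-end calls.

```java
import java.io.IOException;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal Servlet 3.0 async sketch: startAsync() frees the container thread;
// the response is completed later from whatever thread finishes the back-end work.
@WebServlet(urlPatterns = "/compose", asyncSupported = true)
public class ComposingServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync();
        callBackendServicesAsync(result -> {        // hypothetical non-blocking client
            try {
                ctx.getResponse().getWriter().write(result);
            } catch (IOException ignored) { }
            ctx.complete();                          // hand the response back to the container
        });
    }

    private void callBackendServicesAsync(java.util.function.Consumer<String> callback) {
        // placeholder: e.g. an async HTTP client invoking other services
        callback.accept("assembled response");
    }
}
```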
According to this page:
continuations will be replaced by standard Servlet-3.0 suspendable requests once the specification is finalized. Early releases of Jetty-7 are now available that implement the proposed standard suspend/resume API
I have not used Jetty yet, but it seems that with continuations the server is not required to keep a thread for each client. Normally, when the server is "holding off" (I guess blocking) on sending a response to a client that continuously polls it with AJAX, it would need a thread per client, which would be a scalability problem.