Play Framework 2.4 Sequential run of multiple Promises - akka

I have got a Play 2.4 (Java-based) application with some background Akka tasks implemented as functions returning Promise.
Task1 downloads bank statements via bank Rest API.
Task2 processes the statements and pairs them with customers.
Task3 does some other processing.
Task2 cannot run before Task1 finishes its work. Task3 cannot run before Task2. I was trying to run them through sequence of Promise.map() like this:
protected F.Promise run() throws WebServiceException {
    return bankAPI.downloadBankStatements().map(
        result -> bankProc.processBankStatements().map(
            _result -> accounting.checkCustomersBalance()));
}
I was under the impression that the first map would wait until Task1 is done, then call Task2, and so on. But when I look at the application (the tasks write debug info to the log), I can see that the tasks are running in parallel.
I also tried Promise.flatMap() and Promise.sequence(), with no luck: the tasks always run in parallel.
I know that Play is non-blocking by nature, but in this situation I really need to do things in the right order.
Is there a general practice for running multiple Promises in a chosen order?

You're nesting the second call to map, which means what's happening here is
processBankStatements
checkCustomersBalance
downloadBankStatements
Instead, you need to chain them:
protected F.Promise run() throws WebServiceException {
    return bankAPI.downloadBankStatements()
        .map(statements -> bankProc.processBankStatements())
        .map(processedStatements -> accounting.checkCustomersBalance());
}
I notice you're not using result or _result (which I've renamed for clarity) - is that intentional?

All right, I found a solution. The correct answer is:
If you chain multiple Promises the way I do, that is, the function you pass to map() itself returns another Promise (which is mapped in turn, and so on), follow these rules:
If the mapping function returns a plain value (not a future), use map().
If the mapping function returns another future, use flatMap().
The correct code snippet for my case is then:
return bankAPI.downloadBankStatements().flatMap(result -> {
    return bankProc.processBankStatements().flatMap(_result -> {
        return accounting.checkCustomersBalance().map(__result -> {
            return null;
        });
    });
});
This solution was suggested to me a long time ago, but it did not work at first. The problem was a hidden Promise.map() inside downloadBankStatements(), which broke the chain of flatMaps.

Related

Implementing a custom async task type and await

I am developing a C++ app in which I need to receive messages from an MQ and then parse them according to their type. For a particular reason, I want to make this process (receiving a single message, then processing it) asynchronous. Since I want to keep things as simple as possible, so that the next developer has no problem continuing the code, I have written a very small class to implement the asynchrony.
I first start a new thread and pass a function to it:
task = new thread([&] {
    result = fn();
    isCompleted = true;
});
task->detach();
and in order to await the task I do the following:
while (!isCompleted && !(*cancelationToken))
{
    Sleep(5);
}
state = 1; // marking the task as completed
So far there is no problem and I have not faced any bug or error but I am not sure if this is "a good way to do this" and my question is focused on determining this.
Read about std::future and std::async.
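If your task runs on another core or processor, isCompleted is written by one thread and read by another with no synchronization. That is a data race: each core may be working with its own cached copy, so the waiting thread is not guaranteed to see the update promptly and you may wait longer than needed. At the very least the flag should be a std::atomic<bool>.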
If you have to wait for something it is better to use a semaphore.
As said in comments, using standard methods is better anyway.
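As a rough illustration of the std::async / std::future suggestion, here is a minimal sketch (C++17) of the same "start a task, then wait with a cancellation flag" pattern. The names receive_and_parse, await_result and run_once are hypothetical stand-ins, not part of the original class, and the message handling is faked with a constant string.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <optional>
#include <string>

// Hypothetical stand-in for "receive one message and parse it".
std::string receive_and_parse() { return "parsed:hello"; }

// Waits for the task in short timed steps, checking the cancellation flag
// between steps. Returns std::nullopt if cancelled before the result is ready.
std::optional<std::string> await_result(std::future<std::string>& task,
                                        const std::atomic<bool>& cancelled) {
    using namespace std::chrono_literals;
    while (task.wait_for(5ms) != std::future_status::ready) {
        if (cancelled.load()) return std::nullopt;
    }
    return task.get();  // also rethrows any exception thrown by the task
}

std::optional<std::string> run_once(const std::atomic<bool>& cancelled) {
    // std::launch::async forces the task onto its own thread.
    auto task = std::async(std::launch::async, receive_and_parse);
    // Caveat: a future obtained from std::async blocks in its destructor
    // until the task finishes, so "cancelling" only stops the waiting here;
    // it does not abort the task itself.
    return await_result(task, cancelled);
}
```

The future takes care of publishing the result safely between threads, so no hand-written isCompleted flag is needed; only the cancellation flag remains, and it is a std::atomic<bool>.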

Dispatch queue clear explanation

I know there are already a lot of posts about dispatch queues, async tasks, etc., but I can't extract a useful explanation from them because the extra code is too distracting. Is there someone who can give me clear instructions on how to make Task B start after Task A has finished?
I need some data from Task A in order to run Task B successfully and I know that I have to do something with DispatchQueue.async, but I don't know how exactly.
The typical process would be to dispatch asynchronously with async to some serial queue. So, let's say you want some queue for processing images, doing task A and then task B, and then do some UI updates when task B is done, you might do:
let queue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".images")
queue.async {
    // do task A
}
queue.async {
    // do task B
}
queue.async {
    // do whatever else is needed after B here
    DispatchQueue.main.async {
        // update model objects and UI here
    }
}
This is a pattern that avoids blocking the main queue, but lets you make sure that you do A and B serially.
Please note that if either task A or task B is itself asynchronous, the above won't work. (Nor would trying to use sync, if the underlying task was asynchronous.) Other patterns would apply in these cases. But your example is too generic, and there are simply too many other possible patterns for us to enumerate them all. If you tell us specifically what tasks A and B are doing, we could offer more constructive counsel.
Also note that I'd explicitly advise against dispatching synchronously (with sync). Using sync has a certain intuitive appeal, but it is rarely the right approach. Blocking the calling thread (which is what sync does) largely defeats the purpose of having dispatch queue in the first place. The (largely) only reason one should use sync is if you're trying to have thread-safe access to some shared resource. But most of the time, you use dispatch queues explicitly for the purpose of getting some time consuming task off the current thread. So, dispatch A and B async to serial queue, and if you wanted to do something else, C, afterwards, then you'd dispatch that async to the same queue, too.
For a description see Concurrency Programming Guide: Dispatch Queues. The examples are in Objective-C, but all the concepts are the same. You can also go to WWDC videos and search for "GCD", and you'll get a number of great videos that walk through Grand Central Dispatch (the broader term for dispatch queue technologies).
How about something like this?
import Dispatch
let queue = DispatchQueue(label: "My dispatch queue") //TODO: Give better label
let result1 = queue.sync { // "Task A"
    return "result 1"
}
let result2 = queue.sync { // "Task B", which uses result from Task A
    return result1.uppercased()
}
print(result2)

How to lock a long async call in a WebApi action?

I have this scenario where I have a WebApi and an endpoint that when triggered does a lot of work (around 2-5min). It is a POST endpoint with side effects and I would like to limit the execution so that if 2 requests are sent to this endpoint (should not happen, but better safe than sorry), one of them will have to wait in order to avoid race conditions.
I first tried to use a simple static lock inside the controller like this:
lock (_lockObj)
{
    var results = await _service.LongRunningWithSideEffects();
    return Ok(results);
}
This is, of course, not possible because of the await inside the lock statement.
Another solution I considered was to use a SemaphoreSlim implementation like this:
await semaphore.WaitAsync();
try
{
    var results = await _service.LongRunningWithSideEffects();
    return Ok(results);
}
finally
{
    semaphore.Release();
}
However, according to MSDN:
The SemaphoreSlim class represents a lightweight, fast semaphore that can be used for waiting within a single process when wait times are expected to be very short.
Since in this scenario the wait times may even reach 5 minutes, what should I use for concurrency control?
EDIT (in response to plog17):
I do understand that passing this task onto a service might be the optimal way, however, I do not necessarily want to queue something in the background that still runs after the request is done.
The request involves other requests and integrations that take some time, but I would still like the user to wait for this request to finish and get a response regardless.
This request is expected to be only fired once a day at a specific time by a cron job. However, there is also an option to fire it manually by a developer (mostly in case something goes wrong with the job) and I would like to ensure the API doesn't run into concurrency issues if the developer e.g. double-sends the request accidentally etc.
If only one request of that sort can be processed at a given time, why not implement a queue?
With such a design, there is no need to lock or wait while processing the long-running request.
The flow could be:
Client POSTs to /RessourcesToProcess and should receive 202-Accepted quickly
HttpController simply queues the task for processing (and returns the 202-Accepted)
Another service (a Windows service?) dequeues the next task to process
The service processes the task
The service updates the resource status
During this process, the client should easily be able to get the status of previously made requests:
If task not found: 404-NotFound. Resource not found for id 123
If task processing: 200-OK. 123 is processing.
If task done: 200-OK. Process response.
Your controller could look like:
// Sketch assuming ASP.NET Web API 2 (ApiController, System.Net.HttpStatusCode).
public class TaskController : ApiController
{
    //constructor and private members

    [HttpPost, Route("")]
    public IHttpActionResult QueueTask(RequestBody body)
    {
        messageQueue.Add(body);
        return StatusCode(HttpStatusCode.Accepted); // 202: queued, not yet processed
    }

    [HttpGet, Route("{taskId}")]
    public IHttpActionResult GetTask(string taskId)
    {
        YourThing thing = tasksRepository.Get(taskId);
        if (thing == null)
        {
            return Content(HttpStatusCode.NotFound, "thing does not exist");
        }
        if (thing.IsProcessing)
        {
            return Ok("thing is processing");
        }
        if (!thing.IsProcessed) // IsProcessed is a hypothetical "done" flag
        {
            return Ok("thing is not processing yet");
        }
        //here we assume thing had been processed
        return Ok(thing.ResponseContent);
    }
}
This design suggests that you do not handle long-running processes inside your WebApi. Indeed, doing so may not be the best design choice. If you still want to, you may want to read:
Long running task in WebAPI
https://blogs.msdn.microsoft.com/webdev/2014/06/04/queuebackgroundworkitem-to-reliably-schedule-and-run-background-processes-in-asp-net/

How does Luigi's require work?

I am using Spotify's Luigi tool to handle dependencies between several jobs.
def require(self):
    yield task1()
    info = retrieve_info()
    yield task2(info=info)
In my example, I'd like to require task1, then retrieve some information that depends on the execution of task1 in order to pass it as an argument to task2. However, my function retrieve_info won't work because task1 has not run yet.
My question is: since I am using yield, shouldn't task1 be processed before the call to retrieve_info is made? Or is Luigi iterating over the whole require function first and only then launching the processing of the different tasks?
If this last assumption is right, how can I use the result of one required task as an input to a second required task?

Perform function at certain clock time

I would like the user to input a time e.g. 1400h - which will then cause a function to run at 1400h.
How can I do this?
Context: I have a client-server program that runs on a single computer, and I need several nodes to send messages simultaneously (sending is the function mentioned above).
edit: I do not want to use a sleep() function, ideally, as the issue is that the clients will be started at different times and it is much neater as a solution to call something that causes the function to execute at 1400h.
You can use std::this_thread::sleep_until, e.g.
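#include <chrono>
#include <thread>

int main()
{
    auto fire_time = /**/;  // the std::chrono time_point to fire at
    std::thread thread([&]
    {
        // Block this worker thread until the requested time, then run the function.
        std::this_thread::sleep_until(fire_time);
        fire();
    });
    thread.join();
}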
You can refactor that into a helper function, which is probably what you are looking for:
template<class Func, class Clock, class Duration>
void run_at(Func&& func, const std::chrono::time_point<Clock, Duration>& sleep_time)
{
    // Capture the time point and the callable by value: the thread is
    // detached and may outlive the caller's arguments.
    std::thread([sleep_time, func = std::forward<Func>(func)]() mutable
    {
        std::this_thread::sleep_until(sleep_time);
        func();
    })
    .detach();
}
If the program is running the entire time, use a function such as sleep to wait the amount of time between now and 1400h. You might need to do this in a separate thread to allow the program to do other things, or replace the sleep with an event loop timeout (if the program is event-loop-based).
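To make "the amount of time between now and 1400h" concrete, here is one possible sketch of turning a user-supplied hour and minute into a time point that can be handed to std::this_thread::sleep_until. The helper name next_occurrence is hypothetical, and it assumes the user means local time and wants the next occurrence (today, or tomorrow if that time has already passed).

```cpp
#include <chrono>
#include <ctime>

// Returns the next local-time occurrence of hour:minute strictly after `now`.
// Note: std::localtime uses shared static storage, so call this from one
// thread at a time.
std::chrono::system_clock::time_point next_occurrence(
    int hour, int minute,
    std::chrono::system_clock::time_point now = std::chrono::system_clock::now()) {
    std::time_t now_t = std::chrono::system_clock::to_time_t(now);
    std::tm local = *std::localtime(&now_t);  // copy out of the shared buffer
    local.tm_hour = hour;
    local.tm_min = minute;
    local.tm_sec = 0;
    local.tm_isdst = -1;  // let mktime work out daylight saving
    std::time_t target = std::mktime(&local);
    if (target <= now_t) {
        local.tm_mday += 1;   // mktime normalises an out-of-range day
        local.tm_isdst = -1;
        target = std::mktime(&local);
    }
    return std::chrono::system_clock::from_time_t(target);
}
```

For example, std::this_thread::sleep_until(next_occurrence(14, 0)) would block the calling thread until the next 14:00 local time. Because every client computes the same absolute time point, it does not matter that the clients were started at different moments.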
If the program must exit, then you must use a system facility, such as at on Unix, to arrange for the program to be restarted and the code to be executed at the specified time.
I believe you need some kind of task manager; that is the basic model. Spawning a sleeping thread per task is the wrong way to do this job. A single manager knows when the next task is due. How to run a task is another question: you can create a new thread per task if you want them to be interactive, or you can serialize them and run them from within the manager thread.
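The single-manager idea could be sketched roughly as follows: one worker thread owns a priority queue ordered by due time, sleeps on a condition variable until the earliest task is due, and runs tasks serially on its own thread. The class name TaskManager and its interface are hypothetical, and this is a minimal illustration rather than a production scheduler (for instance, tasks still pending at destruction are simply dropped).

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class TaskManager {
public:
    using Clock = std::chrono::system_clock;

    TaskManager() : worker_([this] { loop(); }) {}

    ~TaskManager() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wakeup_.notify_one();
        worker_.join();  // tasks that are still pending are dropped
    }

    // Schedules `task` to run on the manager thread at (or just after) `when`.
    void run_at(Clock::time_point when, std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(Entry{when, std::move(task)});
        }
        wakeup_.notify_one();  // the new task may be due earlier than the current wait
    }

private:
    struct Entry {
        Clock::time_point when;
        std::function<void()> task;
        bool operator>(const Entry& other) const { return when > other.when; }
    };

    void loop() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!stopping_) {
            if (pending_.empty()) {
                wakeup_.wait(lock);  // nothing scheduled: sleep until notified
            } else if (Clock::now() < pending_.top().when) {
                auto next = pending_.top().when;  // copy: the queue may change while waiting
                wakeup_.wait_until(lock, next);
            } else {
                auto task = pending_.top().task;
                pending_.pop();
                lock.unlock();  // do not hold the lock while running user code
                task();
                lock.lock();
            }
        }
    }

    std::mutex mutex_;
    std::condition_variable wakeup_;
    // Min-heap on due time: the earliest task is always at the top.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pending_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

Running tasks on the manager thread keeps them serialized; if a task may take long, the manager could instead hand it to a separate thread at the point where task() is called.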