Is there a way to execute a particular flow after every other flow without plugging it in explicitly? - akka

I have multiple flows (to process messages received from a queue) to execute, and after every flow I need to check whether the previous flow produced an error. If it did, I filter out the message being processed; otherwise I continue to the next flow.
Currently, I have to plug this error handler flow in explicitly after every other flow. Is there any way to configure the error flow so that it runs after every other flow automatically? Or is there a better way to do this?
Example:
flow 1 -> Validate message, if error, mark message as error
error flow -> check if message is marked error, if yes filter, otherwise continue.
flow 2 -> persist message to db, mark in case of error.
error flow -> check if message is marked error, if yes filter, otherwise continue
flow 3 -> and so on.
Or is there a way to wrap each pair, (flow 1 -> error flow), (flow 2 -> error flow)?

I am not sure this is exactly what you asked for, but I have a sort of solution. We can first create all the flows, for instance:
val flows = Seq (
Flow.fromFunction[Int, Int](x => { println(s"flow1: Received $x"); x * 2 }),
Flow.fromFunction[Int, Int](x => { println(s"flow2: Received $x"); x + 1}),
Flow.fromFunction[Int, Int](x => { println(s"flow3: Received $x"); x * x})
)
Then, we need to append the error handling to each of the existing flows. So let's define it, and add it to each of the elements:
val errorHandling = Flow[Int].filter(_ % 2 == 0)
val errorsHandledFlows = flows.map(flow => flow.via(errorHandling))
Now, we need a helper function, that will connect all of our new flows:
def connectFlows(errorsHandledFlows: Seq[Flow[Int, Int, _]]): Flow[Int, Int, _] = {
errorsHandledFlows match {
case Seq() => Flow[Int] // Only reached for an empty input; kept so the match is exhaustive
case Seq(singleFlow) => singleFlow
case head +: tail => head.via(connectFlows(tail)) // +: matches any Seq, whereas :: matches only List
}
}
And now, we need to execute all together, for example:
Source(1 to 4).via(connectFlows(errorsHandledFlows)).to(Sink.foreach(println)).run()
Will provide the output:
flow1: Received 1
flow2: Received 2
flow1: Received 2
flow2: Received 4
flow1: Received 3
flow2: Received 6
flow1: Received 4
flow2: Received 8
As you can tell, the error handling keeps only the even numbers and filters out the odd ones. The first flow therefore receives all numbers from 1 to 4. The second flow receives 2, 4, 6, 8 (the first flow multiplied the values by 2), and the last one does not receive any element, because the second flow makes all of the values odd.
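The wrap-then-chain idea above does not depend on Akka. As a rough, language-neutral sketch, here it is with plain Python generators standing in for flows. The names stage, error_handling, with_error_handling and connect are made up for this illustration and are not part of any streaming library.

```python
from functools import reduce

def stage(name, fn, log):
    """Build a lazy stage that logs each element it receives, then transforms it."""
    def run(items):
        for x in items:
            log.append(f"{name}: Received {x}")
            yield fn(x)
    return run

def error_handling(items):
    # Stand-in for the error flow: keep even values, drop odd ones.
    return (x for x in items if x % 2 == 0)

def with_error_handling(s):
    # Wrap one stage so the error check always runs right after it.
    return lambda items: error_handling(s(items))

def connect(stages):
    # Chain the wrapped stages left to right; an empty list gives a pass-through.
    return reduce(lambda acc, s: (lambda items: s(acc(items))), stages, iter)

log = []
stages = [
    stage("flow1", lambda x: x * 2, log),
    stage("flow2", lambda x: x + 1, log),
    stage("flow3", lambda x: x * x, log),
]
pipeline = connect([with_error_handling(s) for s in stages])
result = list(pipeline(range(1, 5)))
```

Here result ends up empty and log holds the same eight "Received" lines as the output above, since the third stage never gets an element.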

You can also use Merge
val g = RunnableGraph.fromGraph(GraphDSL.create() { implicit builder =>
import GraphDSL.Implicits._
val merge = builder.add(Merge[Int](3))
val flow1 = ...
val flow2 = ...
val flow3 = ...
flow1 ~> merge
flow2 ~> merge
flow3 ~> merge
ClosedShape
})
Not sure if it meets your need, just showing the alternative.

Sending parallel requests Erlang

I am implementing a Twitter-like application in Erlang. I have both a distributed and a non-distributed implementation. I am running a benchmark, but I cannot find a way to send parallel requests to each user process in the distributed implementation. I am using lists:foreach to send "get tweets" to a list of client processes. My understanding is that lists:foreach visits the elements of the list one at a time, which gives sequential behavior and ultimately makes my distributed implementation take the same execution time as the non-distributed one. Is it possible to send the "get tweets" requests to different client processes all at once? This seems like a rather specific case, and it has been difficult to find a solution inside and outside StackOverflow.
test_get_tweets_Bench() ->
{ServerPid, UserInfos} = initializeForBench_server(),
run_benchmark("timeline",
fun () ->
lists:foreach(fun (_) ->
UserChoice = pick_random(UserInfos),
server:get_tweets(element(2, UserChoice), element(1, UserChoice), 1)
end,
lists:seq(1, 10000))
end,
30).
pick_random(List) ->
lists:nth(rand:uniform(length(List)), List).
UserInfos is a list of the following form: [{userId,client_process},...]
After trying rpc:pmap instead of the lists:foreach, my benchmark has become approximately 3 times slower. The changes are as follows:
test_get_tweets_Bench2() ->
{ServerPid, UserInfos} = initializeForBench_server(),
run_benchmark("get_tweets 2",
fun () ->
rpc:pmap({?MODULE,do_apply},
[fun (_) ->
UserChoice = pick_random(UserInfos),
server:get_tweets(element(2, UserChoice), element(1, UserChoice), 1)
end],
lists:seq(1, 10000))
end,
30).
pick_random(List) ->
lists:nth(rand:uniform(length(List)), List).
do_apply(X,F)->
F(X).
I thought rpc:pmap would make my benchmark faster as it would send the get_tweet requests in parallel.
Below is my server module which is the API between my benchmark and my Twitter-like application. The API sends the requests from my benchmark to my Twitter-like application.
%% This module provides the protocol that is used to interact with an
%% implementation of a microblogging service.
%%
%% The interface is designed to be synchronous: it waits for the reply of the
%% system.
%%
%% This module defines the public API that is supposed to be used for
%% experiments. The semantics of the API here should remain unchanged.
-module(server).
-export([register_user/1,
subscribe/3,
get_timeline/3,
get_tweets/3,
tweet/3]).
%%
%% Server API
%%
% Register a new user. Returns its id and a pid that should be used for
% subsequent requests by this client.
-spec register_user(pid()) -> {integer(), pid()}.
register_user(ServerPid) ->
ServerPid ! {self(), register_user},
receive
{ResponsePid, registered_user, UserId} -> {UserId, ResponsePid}
end.
% Subscribe/follow another user.
-spec subscribe(pid(), integer(), integer()) -> ok.
subscribe(ServerPid, UserId, UserIdToSubscribeTo) ->
ServerPid ! {self(), subscribe, UserId, UserIdToSubscribeTo},
receive
{_ResponsePid, subscribed, UserId, UserIdToSubscribeTo} -> ok
end.
% Request a page of the timeline of a particular user.
% Request results can be 'paginated' to reduce the amount of data to be sent in
% a single response. This is up to the server.
-spec get_timeline(pid(), integer(), integer()) -> [{tweet, integer(), erlang:timestamp(), string()}].
get_timeline(ServerPid, UserId, Page) ->
ServerPid ! {self(), get_timeline, UserId, Page},
receive
{_ResponsePid, timeline, UserId, Page, Timeline} ->
Timeline
end.
% Request a page of tweets of a particular user.
% Request results can be 'paginated' to reduce the amount of data to be sent in
% a single response. This is up to the server.
-spec get_tweets(pid(), integer(), integer()) -> [{tweet, integer(), erlang:timestamp(), string()}].
get_tweets(ServerPid, UserId, Page) ->
ServerPid ! {self(), get_tweets, UserId, Page},
receive
{_ResponsePid, tweets, UserId, Page, Tweets} ->
Tweets
end.
% Submit a tweet for a user.
% (Authorization/security are not regarded in any way.)
-spec tweet(pid(), integer(), string()) -> erlang:timestamp().
tweet(ServerPid, UserId, Tweet) ->
ServerPid ! {self(), tweet, UserId, Tweet},
receive
{_ResponsePid, tweet_accepted, UserId, Timestamp} ->
Timestamp
end.
In Erlang, a message is sent from a process A to a process B. There is no built-in feature like a broadcast or a selective broadcast. In your application I see 3 steps:
1. the initial process sends a request to get the tweets from the users,
2. each user process prepares the answer and sends it back to the requester,
3. the initial process collects the answers.
Sending the requests to the user processes and collecting the tweets (steps 1 and 3) cannot use parallelism within a single process. Of course you can use multiple processes to send the requests and collect the answers, up to 1 per user, but I guess that is not the subject of your question.
What is feasible is to make sure that the 3 steps are not done in sequence for each user process, but overlap across user processes. I guess that the function server:get_tweets is responsible for sending the request and collecting the answer. If I am correct (I cannot know, since you don't provide the code and you ignore the returned values), you can gain parallelism by splitting this function in two: the first part sends the requests, the second collects the answers. Here is an example; I have not tried or even compiled it, so treat it with care :o)
-define(TIMEOUT, 5000). % assumed value in milliseconds, tune it for your benchmark
test_get_tweets_Bench() ->
{_ServerPid, UserInfos} = initializeForBench_server(),
run_benchmark("timeline",
fun () ->
% send all the requests first, without waiting for any answer
% server:request_tweets/2 is hypothetical: it only sends the request and returns a reference
List = lists:map(fun (_) ->
{UserId,Pid} = pick_random(UserInfos),
Ref = server:request_tweets(Pid,UserId),
{Ref,UserId}
end,
lists:seq(1, 10000)),
% then collect the answers
collect(List,[])
end,
30).
collect([],Result) -> {ok,Result};
collect(List,ResultSoFar) ->
receive
{Ref,UserId,Tweets} ->
{ok,NewList} = remove_pending_request(Ref,UserId,List),
collect(NewList,[{UserId,Tweets}|ResultSoFar])
after ?TIMEOUT ->
{error,timeout,List,ResultSoFar}
end.
remove_pending_request(Ref,UserId,List) ->
{value,{Ref,UserId},NewList} = lists:keytake(Ref,1,List),
{ok,NewList}.
pick_random(List) ->
lists:nth(rand:uniform(length(List)), List).
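The split between "send everything" and "collect everything" can be sketched outside Erlang too. The following Python version uses threads and queues as stand-ins for user processes and mailboxes. The names user_process and scatter_gather, the reply format and the 1 second timeout are assumptions made for this illustration only.

```python
import queue
import threading

def user_process(user_id, inbox):
    # Stand-in for a user process: answer each request on the reply queue it carries.
    while True:
        msg = inbox.get()
        if msg is None:  # shutdown signal
            break
        ref, reply_to = msg
        reply_to.put((ref, user_id, [f"tweet from user {user_id}"]))

def scatter_gather(inboxes, requests, timeout=1.0):
    replies = queue.Queue()
    pending = {}
    # Step 1: send every request without waiting for any answer.
    for ref, user_id in enumerate(requests):
        inboxes[user_id].put((ref, replies))
        pending[ref] = user_id
    # Step 3: collect answers in whatever order they arrive.
    results = []
    while pending:
        ref, user_id, tweets = replies.get(timeout=timeout)  # raises queue.Empty on timeout
        del pending[ref]
        results.append((user_id, tweets))
    return results

inboxes = {uid: queue.Queue() for uid in range(3)}
workers = [threading.Thread(target=user_process, args=(uid, box))
           for uid, box in inboxes.items()]
for w in workers:
    w.start()
results = scatter_gather(inboxes, [0, 1, 2, 1, 0])
for box in inboxes.values():
    box.put(None)
for w in workers:
    w.join()
```

The reference attached to each request plays the same role as Ref in the Erlang code: it lets the collector tick off pending requests regardless of arrival order.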
This is my other attempt at implementing a parallel benchmark which does not achieve any speed up.
get_tweets(Sender, UserId, Node) ->
server:get_tweets(Node, UserId, 0),
Sender ! done_get_tweets.
test_get_tweets3() ->
{_ServerId, UserInfos} = initializeForBench_server(),
run_benchmark("parallel get_tweet",
fun () ->
lists:foreach(
fun (_) ->
{UserId,Pid} = pick_random(UserInfos),
spawn(?MODULE, get_tweets, [self(), UserId, Pid])
end,
lists:seq(1, ?NUMBER_OF_REQUESTS)),
lists:foreach(fun (_) -> receive done_get_tweets -> ok end end, lists:seq(1, ?NUMBER_OF_REQUESTS))
end,
?RUNS).

how can multiple processes use one common list concurrently in Erlang?

I understand that Erlang is all about concurrency and that we use spawn/spawn_link to create a process. What I don't understand is how all processes can use one common list of users concurrently, say an orddict/dict storage.
What I am trying to do is:
1. A spawned user process subscribes/listens to registered process A
2. Registered process A stores {Pid, Userid} of all online users
3. When some user sends a message, the user's process asks process A whether the recipient is online or not.
Sending a message in Erlang is asynchronous, but is it still asynchronous when one user is being sent messages by multiple users?
You can make process A a gen_server process and keep any data structure storing online users as the process state. Storing a new user or deleting one could be done with gen_server:cast/2, and checking to see if a user is online could be done with gen_server:call/2. Alternatively, you could have a gen_server create a publicly-readable ets table to allow any process to read it to check for online users, but storing and deleting would still require casts to the gen_server. You could even make the table publicly readable and writable so that any process could store, delete, or check users. But keep in mind that an ets table is by default destroyed when the process that creates it dies, so if you need it to stay around even if the gen_server that created it dies, you must arrange for it to be inherited by some other process, or give it to a supervisor.
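To make the "one owner process, everyone else sends it messages" idea concrete, here is a small Python sketch in which a single thread owns the user table and other code reaches it only through a mailbox. It imitates the cast/call split described above but is not the gen_server API. The Registry class and its operation names are invented for this example.

```python
import queue
import threading

class Registry:
    """Owns the online-users table; every access goes through its mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        users = {}  # state touched only by this thread
        while True:
            op, args, reply = self._mailbox.get()
            if op == "store":
                users[args[0]] = args[1]
            elif op == "delete":
                users.pop(args[0], None)
            elif op == "is_online":
                reply.put(args[0] in users)
            elif op == "stop":
                reply.put(True)
                return

    def cast(self, op, *args):
        # Fire and forget, like an asynchronous cast.
        self._mailbox.put((op, args, None))

    def call(self, op, *args):
        # Send the request, then block until the owner replies.
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((op, args, reply))
        return reply.get(timeout=1.0)

registry = Registry()
registry.cast("store", "joe", "pid-1")
registry.cast("store", "bob", "pid-2")
registry.cast("delete", "bob")
joe_online = registry.call("is_online", "joe")
bob_online = registry.call("is_online", "bob")
registry.call("stop")
```

Because the mailbox is processed one message at a time, no lock is needed around the table, which is also why access through the owner is serialized rather than truly concurrent.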
A serious solution should use the OTP behaviors (gen_server, supervisor...) as suggested by Steve.
Anyway, I wrote a small example module that implements both a server and clients. It can be started on one node using the command erl -sname test, for example (or on several nodes using erl -sname node1, erl -sname node2...).
It also includes an example of a shell session that illustrates most of the cases; I hope it helps you follow the exchanges, synchronous or asynchronous, between processes.
NOTE: access to the user list is not concurrent; it cannot be when the list is owned by a server process, as it is in this example. That is why Steve proposes using an ETS table to store the information and get real concurrent access. I have tried to write the example with interfaces that should allow a quick refactoring to ETS instead of a tuple list.
-module(example).
-export([server/0,server_stop/1,server_register_name/2,server_get_address/2, server_quit/2, % server process and its interfaces
client/1,quit/1,register_name/2,get_address/2,send_message/3,print_messages/1, % client process and its interfaces
trace/0]). % to call the tracer for a nice message view
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Client interface
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
client(Node) ->
% connect the current node to the servernode given in parameter
% it will fail if the connection cannot be established
true = net_kernel:connect_node(Node),
% spawn a client process
spawn(fun () -> client([],unregistered,{server,Node}) end).
register_name(ClientPid,Name) ->
% use a helper to facilitate the trace of everything
send_trace(ClientPid,{register_name,self(),Name}),
% wait for an answer, it is then a synchronous call
receive
% no work needed, simply return any value
M -> M
after 1000 ->
% this introduce a timeout, if no answer is received after 1 second, consider it has failed
no_answer_from_client
end.
get_address(ClientPid,UserName) ->
send_trace(ClientPid,{get_address,self(),UserName}),
% wait for an answer, it is then a synchronous call
receive
% in this case, if the answer is tagged with ok, extract the Value (will be a Pid)
{ok,Value} -> Value;
M -> M
after 1000 ->
no_answer_from_client
end.
send_message(ClientPid,To,Message) ->
% simply send the message, it is asynchronous
send_trace(ClientPid,{send_message,To,Message}).
print_messages(ClientPid) ->
send_trace(ClientPid,print_messages).
quit(ClientPid) ->
send_trace(ClientPid,quit).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% client local functions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
client(Messages,Name,Server) ->
receive
{register_name,From,UserName} when Name == unregistered ->
% if not yet registered, send the request to the server and
% forward the answer to the requester
Answer = server_register_name(Server,UserName),
send_trace(From,Answer),
NName = case Answer of
registered -> UserName;
_ -> Name
end,
client(Messages,NName,Server);
{register_name,From,_} ->
% if already registered reject the request
send_trace(From,{already_registered_as,Name}),
client(Messages,Name,Server);
{get_address,From,UserName} when Name =/= unregistered ->
Answer = server_get_address(Server,UserName),
send_trace(From,Answer),
client(Messages,Name,Server);
{send_message,To,Message} ->
% directly send the message to the user, the server is not concerned
send_trace(To,{new_message,{erlang:date(),erlang:time(),Name,Message}}),
client(Messages,Name,Server);
print_messages ->
% print all messages and empty the queue
do_print_messages(Messages),
client([],Name,Server);
quit ->
server_quit(Server,Name);
{new_message,M} ->
% prepend the new message
client([M|Messages],Name,Server);
_ ->
client(Messages,Name,Server)
end.
do_print_messages(Messages) ->
lists:foreach(fun({D,T,W,M}) -> io:format("from ~p, at ~p on ~p, received ~p~n",[W,T,D,M]) end,Messages).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Server interface
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
server() ->
true = register(server,spawn(fun () -> server([]) end)),
node().
server_stop(Server) ->
send_trace(Server,stop).
server_register_name(Server,User) ->
send_trace(Server,{register_name,self(),User}),
receive
M -> M
after 900 ->
no_answer_from_server
end.
server_get_address(Server,User) ->
send_trace(Server,{get_address,self(),User}),
receive
M -> M
after 900 ->
no_answer_from_server
end.
server_quit(Server,Name) ->
send_trace(Server,{quit,Name}).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% server local functions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
server(Users) ->
receive
stop ->
ok;
{register_name,From,User} ->
case lists:keyfind(User,1,Users) of
false ->
send_trace(From,registered),
server([{User,From}|Users]);
_ ->
send_trace(From,{already_exist,User}),
server(Users)
end;
{get_address,From,User} ->
case lists:keyfind(User,1,Users) of
false ->
send_trace(From,{does_not_exist,User}),
server(Users);
{User,Pid} ->
send_trace(From,{ok,Pid}),
server(Users)
end;
{quit,Name} ->
server(lists:keydelete(Name,1,Users))
end.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% global
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
trace() ->
% start a collector, a viewer and trace the "trace_me" ...
et_viewer:start([{trace_global, true}, {trace_pattern, {et,max}},{max_actors,20}]).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% helpers
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
send_trace(To,Message) ->
% all messages will be traced by "et"
et:trace_me(50,self(),To,Message,[]),
To ! Message.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% shell commands
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% c(example).
% example:trace().
% N = node().
% C1 = example:client(N).
% example:register_name(pid(0,5555,0),"fails").
% example:register_name(C1,"fails_again").
% example:server().
% example:register_name(C1,"Joe").
% C2 = example:client(N).
% example:register_name(C2,"Bob").
% example:print_messages(C1).
% C2 = example:get_address(C1,"Bob").
% example:send_message(C1,C2,"Hi Bob!").
% example:send_message(C1,C2,"Hi Bob! are you there?").
% example:print_messages(C2).
% example:send_message(C2,C1,"Hi Joe! Got your message.").
% example:print_messages(C2).
% example:print_messages(C1).
% example:quit(C1).
% example:get_address(C2,"Joe").
% example:server_stop({server,N}).
% example:get_address(C2,"Joe").
% example:get_address(C1,"Bob").
Here is an extract of the event viewer (screenshot not reproduced here).

Call multiple webservices from play 2

I am a Play 2.0 Scala beginner and have to call several web services to generate an HTML page.
After reading the Play WS API page and a very interesting article from Sadek Drobi, I am still unsure what's the best way to accomplish this.
The article shows some code snippets which I don't fully understand as a Play beginner.
Figure 2 on page 4:
val response: Either[Response,Response] =
WS.url("http://someservice.com/post/123/comments").focusOnOk
val responseOrUndesired: Either[Result,Response] = response.left.map {
case Status(4,0,4) => NotFound
case Status(4,0,3) => NotAuthorized
case _ => InternalServerError
}
val comments: Either[Result,List[Comment]] =
responseOrUndesired.right.map(r => r.json.as[List[Comment]])
// in the controller
comment.fold(identity, cs => Ok(html.showComments(cs)))
What does the last line with the fold do? Should comment be comments? Shouldn't I wrap the last statement in an Async block?
Figure 4 shows how to combine several IO calls with a single for-expression:
for {
profile <- profilePromise
events <- attachedEventsPromise
articles <- topArticlesPromise
} yield Json.obj(
"profile" -> profile,
"events" -> events,
"articles" -> articles )
}
// in the controller
def showInfo(...) = Action { rq =>
Async {
actorInfo(...).map(info => Ok(info))
}
}
How can I use this snippet? (I am a bit confused by the extra-} after the for-expression.)
Should I write something like this?
var actorInfo = for { // Model
profile <- profilePromise
events <- attachedEventsPromise
articles <- topArticlesPromise
} yield Json.obj(
"profile" -> profile,
"events" -> events,
"articles" -> articles )
def showInfo = Action { rq => // Controller
Async {
actorInfo.map(info => Ok(info))
}
}
What's the best way to combine the snippets from figures 2 and 4 (error handling + composition of non-blocking IO calls)? For example, I want to produce a 404 status code if any of the called web services returns a 404.
Maybe someone knows a complete example of calling webservices in the play framework (cannot find an example in the play Sample applications or anywhere else).
I have to say that the article is wrong in the example you show in Figure 2. The method focusOnOk does not exist in Play 2.0. I assume the author of the article used a pre-release version of Play 2 then.
Regarding comment, yes it should be comments. The fold in the statement is operating on an Either. It takes 2 functions as parameters. The first is a function to apply if it is a left value. The second is a function to apply if it is a right value. A more detailed explanation can be found here: http://daily-scala.blogspot.com/2009/11/either.html
So what the line does is this: if it has a left value (which means I got an undesired response), apply the built-in identity function, which just gives you back the value. If it has a right value (which means I got an OK response), make a new result that shows the comments somehow.
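For readers who find fold easier to see in running code, here is a minimal Python imitation of an Either with a fold method. Left, Right, identity and render are defined here purely for illustration, and the strings stand in for Play's Result values.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Left:
    value: Any

    def fold(self, on_left, on_right):
        # A Left ignores on_right and applies on_left to its value.
        return on_left(self.value)

@dataclass(frozen=True)
class Right:
    value: Any

    def fold(self, on_left, on_right):
        # A Right ignores on_left and applies on_right to its value.
        return on_right(self.value)

def identity(x):
    return x

def render(comments_or_error):
    # Left already holds a finished error result; Right holds comments to display.
    return comments_or_error.fold(identity, lambda cs: "200 OK: " + ", ".join(cs))

ok_page = render(Right(["first", "second"]))
error_page = render(Left("404 Not Found"))
```

Both branches end up with the same kind of value (a result to send back), which is what lets the controller treat the two cases uniformly.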
Regarding Async, it's not actually asynchronous. focusOnOk is a blocking function (a remnant from the old Java days of Play 1.x). But remember, that's not valid Play 2 code.
As for Figure 4, the trailing } is there because the snippet is a partial alternative to what's in Figure 3: instead of the numerous promise flatMaps, you can use a for comprehension. Also, I think it should be userInfo(...).map instead of actorInfo(...).map.
The Play documentation you linked to actually already shows you a full example.
def feedTitle(feedUrl: String) = Action {
Async {
WS.url(feedUrl).get().map { response =>
Ok("Feed title: " + (response.json \ "title").as[String])
}
}
}
will get whatever is at feedUrl, and you map it to do something with the response which has a status field you can check to see if it was a 404 or something else.
To that end, the Figure 3 and 4 of your linked article should give you a starting point. So you'd have something like,
def getInfo(...) : Promise[String] = {
val profilePromise = WS.url(...).get()
val attachedEventsPromise = WS.url(...).get()
val topArticlesPromise = WS.url(...).get()
for {
profile <- profilePromise
events <- attachedEventsPromise
articles <- topArticlesPromise
} yield {
// or return whatever you want
// remember to change String to something else in the return type
profile.name
}
}
def showInfo(...) = Action { rq =>
Async {
getInfo(...).map { info =>
// convert your info to a Result
Ok(info)
}
}
}
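The question also asked how to return a 404 when any one of the combined calls fails with a 404. As a hedged sketch of that shape, written with Python's asyncio rather than Play's promises: fetch is a stub that fakes a web service call, and NotFound, get_info and the (status, body) tuples are assumptions made for this example.

```python
import asyncio

class NotFound(Exception):
    pass

async def fetch(name, status, body):
    # Hypothetical stand-in for a non-blocking web service call.
    await asyncio.sleep(0)
    if status == 404:
        raise NotFound(name)
    return body

async def get_info(statuses):
    try:
        # Start the three calls together and wait for all of them.
        profile, events, articles = await asyncio.gather(
            fetch("profile", statuses[0], {"name": "Joe"}),
            fetch("events", statuses[1], ["launch"]),
            fetch("articles", statuses[2], ["intro"]),
        )
    except NotFound as missing:
        # Any single 404 turns the whole combined result into a 404.
        return 404, f"{missing} not found"
    return 200, {"profile": profile, "events": events, "articles": articles}

ok_result = asyncio.run(get_info([200, 200, 200]))
not_found_result = asyncio.run(get_info([200, 404, 200]))
```

In Play terms, the except branch corresponds to mapping a failed response to NotFound before building the final Result.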

Getting responses from erlang processes

I have an erlang project that makes a lot of concurrent SOAP requests to my application. Currently, it's limited by how many nodes are available, but I would like to adjust it so that each node can send more than one message at a time.
I've figured that problem out, but I don't know how to get a response back from the process running the SOAP request.
This is my function that I'm attempting to use to do multiple threads:
batch(Url, Message, BatchSize) ->
inets:start(),
Threads = for(1, BatchSize, fun() -> spawn(fun() -> attack_thread() end) end),
lists:map(fun(Pid) -> Pid ! {Url, Message, self()} end, Threads).
This function gets called by the person who initiated the stress tester, it is called on every node in our network. It's called continually until all the requested number of SOAP requests have been sent and timed.
This is the attack_thread that is sent the message by the batch method:
attack_thread() ->
receive
{Url, Message, FromPID} ->
{TimeTaken, {ok, {{_, 200, _}, _, _}}} = timer:tc(httpc, request, [post, {Url, [{"connection", "close"}, {"charset", "utf-8"}], "text/xml", Message}, [], []]),
TimeTaken/1000/1000
end.
As you can see, I want it to return the number of seconds the SOAP request took. However, erlang's message passing (Pid ! Message) doesn't return anything useful.
How can I get a result back?
Each of your attack_thread() processes can simply drop a message in the mailbox of the process running the batch/3 function (FromPID is the pid bound in your receive clause):
FromPID ! {time_taken, self(), TimeTaken / 1000 / 1000}
but then you need to collect the results:
batch(Url, Message, BatchSize) ->
inets:start(),
Pids = [spawn_link(fun attack_thread/0) || _ <- lists:seq(1, BatchSize)],
[Pid ! {Url, Message, self()} || Pid <- Pids],
collect(Pids).
collect([]) -> [];
collect(Pids) ->
receive
{time_taken, P, Time} ->
[Time | collect(Pids -- [P])]
end.
Some other comments: you probably want spawn_link/1 here. If something dies along the way, you want the whole thing to die. Also, be sure to tune inets httpc a bit so it is more effective. You might also want to look at basho_bench or tsung.
Finally, you can use a closure directly rather than pass the url and message:
attack_thread(Url, Message, From) -> ...
So your spawn is:
Self = self(),
Pids = [spawn_link(fun() -> attack_thread(Url, Message, Self) end) || _ <- ...]
It avoids passing in the message in the beginning.

Guarantee order of messages posted to mailbox processor

I have a mailbox processor which receives a fixed number of messages:
let consumeThreeMessages = MailboxProcessor.Start(fun inbox ->
    async {
        let! msg1 = inbox.Receive()
        printfn "msg1: %s" msg1
        let! msg2 = inbox.Receive()
        printfn "msg2: %s" msg2
        let! msg3 = inbox.Receive()
        printfn "msg3: %s" msg3
    }
)
consumeThreeMessages.Post("First message")
consumeThreeMessages.Post("Second message")
consumeThreeMessages.Post("Third message")
These messages should be handled in exactly the order sent. During my testing, it prints out exactly what it should:
msg1: First message
msg2: Second message
msg3: Third message
However, since message posting is asynchronous, it sounds like posting 3 messages rapidly could result in items being processed in any order. For example, I do not want to receive messages out of order and get something like this:
msg1: Second message // <-- oh noes!
msg2: First message
msg3: Third message
Are messages guaranteed to be received and processed in the order sent? Or is it possible for messages to be received or processed out of order?
The code in your consumeThreeMessages function will always execute in order, because of the way F#'s async workflows work.
The following code:
async {
    let! msg1 = inbox.Receive()
    printfn "msg1: %s" msg1
    let! msg2 = inbox.Receive()
    printfn "msg2: %s" msg2
}
Roughly translates to:
async.Bind(
    inbox.Receive(),
    (fun msg1 ->
        printfn "msg1: %s" msg1
        async.Bind(
            inbox.Receive(),
            (fun msg2 ->
                printfn "msg2: %s" msg2
                async.Zero())
        )
    )
)
When you look at the desugared form, it is clear that the code executes in serial. The 'async' part comes into play in the implementation of async.Bind, which will start the computation asynchronously and 'wake up' when it completes to finish the execution. This way you can take advantage of asynchronous hardware operations, and not waste time on OS threads waiting for IO operations.
That doesn't mean that you can't run into concurrency issues when using F#'s async workflows however. Imagine that you did the following:
let total = ref 0
let doTaskAsync() =
    async {
        for i = 0 to 1000 do
            incr total
    } |> Async.Start
// Start the task twice
doTaskAsync()
doTaskAsync()
The above code will have two asynchronous workflows modifying the same state at the same time.
So, to answer your question in brief: within the body of a single async block things will always execute in order. (That is, the next line after a let! or do! doesn't execute until the async operation completes.) However, if you share state between two async tasks, then all bets are off. In that case you will need to consider locking or using the concurrent collections (System.Collections.Concurrent) that come with .NET 4.0.
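As a rough illustration of the locking option, here is the shared-counter scenario again, sketched in Python with threads in place of F# async workflows. The lock guards the read-modify-write on the shared total; the names do_task and lock are specific to this sketch.

```python
import threading

total = 0
lock = threading.Lock()

def do_task():
    global total
    for _ in range(1001):  # mirrors "for i = 0 to 1000"
        with lock:  # only one task may update the shared state at a time
            total += 1

# Start the task twice, as in the F# example.
threads = [threading.Thread(target=do_task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place the final total is always 2002; without it, the two tasks could interleave their updates and lose increments.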