OCaml - Parmap executing Lwt threads hangs on the execution - ocaml

This is a follow up to this question:
How to synchronously execute an Lwt thread
I am trying to run the following piece of code:
open Lwt
open Cohttp_lwt_unix

let server_content2 x =
  "in server content x" |> print_endline;
  Client.get (Uri.of_string ("http://localhost:8080/" ^ x)) >>= fun (_, body) ->
  Cohttp_lwt.Body.to_string body >|= fun sc -> sc
;;

let reyolo () =
  List.init 10 (fun i -> server_content2 (string_of_int i)) ;;

let par () =
  let yolo = reyolo () in
  "in par" |> print_endline;
  Parmap.pariter
    ~ncores:4
    (fun p ->
      "before run" |> print_endline;
      "content:" ^ (Lwt_main.run p) |> print_endline;
      "after run" |> print_endline)
    (Parmap.L yolo) ;;

par ()
I expected this to perform 10 remote connections.
What I get instead is that, in the par function, Lwt_main.run seems to get stuck before making the actual remote call.
I doubt it is of any significance, but the server that is supposed to respond is written in Python and looks like this:
import subprocess
from bottle import run, post, request, response, get, route

@route('/<path>', method='GET')
def process(path):
    print(path)
    return "yolo"

run(host='localhost', port=8080, debug=True)

The issue is that the calls to server_content2, which start the requests, occur in the parent process, while the code then tries to finish them in the child processes spawned by Parmap. Lwt breaks here: it cannot, in general, keep track of I/O across a fork.
If you store either thunks or arguments in the list yolo, and delay the calls to server_content2 so that they happen in the child processes, the requests should work. To do that, make sure the calls happen inside the callback passed to Parmap.pariter.
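The fix itself needs Parmap and Cohttp, but the underlying thunk idea can be sketched in plain Python (fetch here is a made-up stand-in for server_content2, not part of the original code): building the list performs no work, and each request starts only when its thunk is forced, which in the Parmap case happens inside the child process, after the fork.

```python
# Sketch of deferring work with thunks; `fetch` is a hypothetical
# stand-in for server_content2.
calls = 0

def fetch(x):
    """Pretend remote call: counts how many times it actually runs."""
    global calls
    calls += 1
    return "response-" + x

# Like reyolo, but storing thunks instead of already-started requests.
thunks = [lambda i=i: fetch(str(i)) for i in range(3)]
assert calls == 0  # building the list performed no calls

# Forcing the thunks (what the pariter callback would do, per child):
results = [t() for t in thunks]
assert calls == 3
assert results == ["response-0", "response-1", "response-2"]
```

In the OCaml code this corresponds to passing the plain arguments, e.g. Parmap.L (List.init 10 string_of_int), and calling Lwt_main.run (server_content2 x) inside the pariter callback.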


Why my interface for learnyousomeerlang trade_fsm is failing?

I'm slowly learning Erlang using the learnyousomeerlang site and I'm currently at the "Rage Against The Finite-State Machines" chapter, which builds and describes how trade_fsm.erl works. As part of my learning process I've decided to write an interface for this system, where you can control both trading sides by typing console commands. I think I've done a decent job of writing it; however, for some reason I cannot understand, whenever I try to start trading the clients crash. Here's how it goes:
5> z3:init("a", "b").
true
6> z3:display_pids().
First player pid: {<0.64.0>}
Second player pid: {<0.65.0>}.
done
7> z3:p1_propose_trade().
{a}: asking user <0.65.0> for a trade
{b}: <0.64.0> asked for a trade negotiation
done
8> z3:display_pids().
done
9>
And here's my code:
-module(z3).
-compile(export_all).

-record(state, {player1,
                player2,
                p1items=[],
                p2items=[],
                p1state,
                p2state,
                p1name="Carl",
                p2name="FutureJim"}).

init(FirstName, SecondName) ->
    {ok, Pid1} = trade_fsm:start_link(FirstName),
    {ok, Pid2} = trade_fsm:start_link(SecondName),
    S = #state{p1name=FirstName, p2name=SecondName,
               player1=Pid1, player2=Pid2,
               p1state=idle, p2state=idle},
    register(?MODULE, spawn(?MODULE, loop, [S])).

display_pids() ->
    ?MODULE ! display_pids,
    done.

p1_propose_trade() ->
    ?MODULE ! {wanna_trade, p1},
    done.

p2_accept_trade() ->
    ?MODULE ! {accept_trade, p2},
    done.

loop(S=#state{}) ->
    receive
        display_pids ->
            io:format("First player pid: {~p}~nSecond player pid: {~p}.~n",
                      [S#state.player1, S#state.player2]),
            loop(S);
        {wanna_trade, Player} ->
            case Player of
                p1 ->
                    trade_fsm:trade(S#state.player1, S#state.player2);
                p2 ->
                    trade_fsm:trade(S#state.player2, S#state.player1);
                _ ->
                    io:format("[Debug:] Invalid player.~n")
            end,
            loop(S);
        {accept_trade, Player} ->
            case Player of
                p1 ->
                    trade_fsm:accept_trade(S#state.player1);
                p2 ->
                    trade_fsm:accept_trade(S#state.player2);
                _ ->
                    io:format("[Debug:] Invalid player.~n")
            end,
            loop(S);
        _ ->
            io:format("[Debug:] Received invalid command.~n"),
            loop(S)
    end.
Can anyone tell me why this code fails and how it should be implemented?
When you call z3:p1_propose_trade(), it sends the message {wanna_trade, p1} to the registered process z3.
The message is handled in the loop function, which calls trade_fsm:trade(S#state.player1, S#state.player2), i.e. gen_fsm:sync_send_event(S#state.player1, {negotiate, S#state.player2}, 30000). This is a synchronous call that waits for a reply from the FSM and times out after 30 seconds if no answer arrives.
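The blocking behaviour of a synchronous send like gen_fsm:sync_send_event/3 can be sketched in plain Python with queues (sync_send and the mailbox objects here are made-up illustrations, not Erlang/OTP APIs): the caller posts the request together with a reply address, then blocks until a reply arrives or the timeout fires.

```python
# Toy model of a synchronous send with timeout; everything here is
# hypothetical, meant only to illustrate the gen_fsm call semantics.
import queue

def sync_send(mailbox, request, timeout):
    reply_box = queue.Queue()
    mailbox.put((request, reply_box))  # deliver request + reply address
    try:
        return reply_box.get(timeout=timeout)  # caller blocks here
    except queue.Empty:
        raise TimeoutError("no reply within timeout")

# An FSM that never replies (never calls gen_fsm:reply/2) leaves the
# caller blocked until the timeout expires:
mb = queue.Queue()
timed_out = False
try:
    sync_send(mb, "negotiate", timeout=0.01)
except TimeoutError:
    timed_out = True
assert timed_out
```

This is exactly the situation the loop process ends up in: it is stuck inside the synchronous call, so it cannot service any further messages in the meantime.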
In the idle state, you have caught the message in this clause:
idle({negotiate, OtherPid}, From, S=#state{}) ->
    ask_negotiate(OtherPid, self()),
    notice(S, "asking user ~p for a trade", [OtherPid]),
    Ref = monitor(process, OtherPid),
    {next_state, idle_wait, S#state{other=OtherPid, monitor=Ref, from=From}};
No reply is returned to the caller. The last line should have been something like
{reply, Reply, idle_wait, S#state{other=OtherPid, monitor=Ref, from=From}};
or there should be an explicit call to gen_fsm:reply/2.
I didn't dig too much in the code, but if you change it to:
idle({negotiate, OtherPid}, From, S=#state{}) ->
    Reply = ask_negotiate(OtherPid, self()),
    notice(S, "asking user ~p for a trade", [OtherPid]),
    Ref = monitor(process, OtherPid),
    {reply, Reply, idle_wait, S#state{other=OtherPid, monitor=Ref, from=From}};
it doesn't stop and seems to work properly.
Maybe someone who knows the behaviour of gen_fsm thoroughly can explain what is going on behind the scenes (why is nothing printed when the timeout expires, and why is the shell ready for a new command when it should be waiting for an answer?):
If you call the function trade(OwnPid, OtherPid) manually, you will see that it doesn't return until the 30-second timeout is reached, and then you get an error message.
When it is called via z3:p1_propose_trade(), no error message is shown after 30 seconds, but the registered process z3 dies.
[EDIT]
I have checked how the code should work and, in fact, it doesn't seem necessary to modify the fsm code. The reply should come from process 2, when the second user accepts the negotiation. So you can't run the test this way (loop is waiting for an answer, so it cannot send the accept_trade). Here is a session that works:
{ok,P1} = trade_fsm:start("a1").
{ok,P2} = trade_fsm:start("a2").
T = fun() -> io:format("~p~n",[trade_fsm:trade(P1,P2)]) end.
A = fun() -> io:format("~p~n",[trade_fsm:accept_trade(P2)]) end.
spawn(T). % use another process to avoid locking the shell
A().
You can change the "wanna_trade" interface to avoid the blocking issue:
{wanna_trade, Player} ->
    case Player of
        p1 ->
            spawn(fun() -> trade_fsm:trade(S#state.player1, S#state.player2) end);
        p2 ->
            spawn(fun() -> trade_fsm:trade(S#state.player2, S#state.player1) end);
        _ ->
            io:format("[Debug:] Invalid player.~n")
    end,
    loop(S);

How can I test a response whose body is streamed from within an ecto transaction?

I am attempting to write unit tests for an endpoint that streams its response in chunks. I can verify that the contents are fully and correctly streamed when accessed through my browser. But when I access the endpoint through my test suite, the response body is empty.
Example controller action
def stream_csv(conn, _params) do
  conn =
    conn
    |> put_resp_content_type("text/csv")
    |> put_resp_header("content-disposition", "data.csv")
    |> send_chunked(200)

  Repo.transaction(fn ->
    {:ok, conn} = chunk(conn, "csv data part")
  end)

  conn
end
Example unit test
test "stream endpoint" do
  body =
    build_conn()
    |> Phoenix.ConnTest.get("/stream_endpoint")
    |> Phoenix.ConnTest.response(200)

  assert body =~ "csv data part"
end
This will lead to an assertion failure, where body is an empty binary "".
I feel like there should be a way to wait on all the chunks before making assertions, or that I'm probably overlooking something obvious.
EDIT
The example I initially wrote works as expected. What seems to complicate things is when chunk is called from within a callback provided to Repo.transaction. I've updated the question and examples to better reflect the problem.
I had this problem, tried several ways to get it to work, and ended up with the following solution using mocks.
What is going on is that when we call Plug.Conn.chunk(conn, chunk), Plug calls conn.adapter.chunk, and from there the chunk is handed off to the server (e.g. Cowboy) for further handling; the conn is not aware of the chunk any more.
To solve this, I moved the chunking into a separate function with minimal side effects that is easy to mock:
defmodule MyApp.ControllerUtils do
  use MyAppWeb, :controller

  @callback chunk_to_conn(map(), String.t()) :: map()
  def chunk_to_conn(conn, current_chunk) do
    conn |> chunk(current_chunk)
  end
end
And in the response handler
def stream_csv(conn, _params) do
  conn =
    conn
    |> put_resp_content_type("text/csv")
    |> put_resp_header("content-disposition", "data.csv")
    |> send_chunked(200)

  Repo.transaction(fn ->
    {:ok, conn} = MyApp.ControllerUtils.chunk_to_conn(conn, "csv data part")
  end)

  conn
end
Now in your test you mock the chunking function to hand you each chunk, and use something like an Agent to store and join the chunks, or just assert on them as they arrive.
import Mox

defp chunked_response_to_state(chunk, pid) do
  current_chunk = Agent.get(pid, &Map.get(&1, :chunk_key, ""))
  Agent.update(pid, &Map.put(&1, :chunk_key, current_chunk <> chunk))
end

setup do
  {:ok, agent_pid} = Agent.start_link(fn -> %{} end)

  MyApp.ControllerUtilsMock
  |> stub(:chunk_to_conn, fn _, chunk -> chunk |> chunked_response_to_state(agent_pid) end)

  {:ok, %{agent_pid: agent_pid}}
end

test "my test", state do
  build_conn() |> get(some_path)
  whole_chunks = Agent.get(state.agent_pid, &Map.get(&1, :chunk_key))
end

MirageOS - Http-fetch example

I'm trying to modify the MirageOS http-fetch example (https://github.com/mirage/mirage-skeleton) that can be found inside mirage-skeleton, but I'm having trouble understanding why I can't move some of the functions executed in the config.ml file to my unikernel.ml file. The original config.ml follows (I'll copy just the interesting part):
[...]
let client =
  foreign "Unikernel.Client" @@ console @-> resolver @-> conduit @-> job

let () =
  add_to_ocamlfind_libraries ["mirage-http"];
  add_to_opam_packages ["mirage-http"];
  let sv4 = stack default_console in
  let res_dns = resolver_dns sv4 in
  let conduit = conduit_direct sv4 in
  let job = [ client $ default_console $ res_dns $ conduit ] in
  register "http-fetch" job
What I'm trying to do is move these two lines :
let res_dns = resolver_dns sv4 in
let conduit = conduit_direct sv4 in
into my unikernel.ml start function. Basically I want to pass just the stack to my module and let it create a DNS resolver and a conduit itself. My start function follows:
let start c s =
  C.log_s c (sprintf "Resolving in 1s using DNS server %s" ns) >>= fun () ->
  OS.Time.sleep 1.0 >>= fun () ->
  let res_dns = resolver_dns s in
  let conduit = conduit_direct s in
  http_fetch c res_dns conduit >>= fun data ->
  Lwt.return (dump_to_db data)
Right now I'm getting this error where the parameters are passed to http_fetch:
Error: This expression has type Mirage.resolver Mirage.impl
but an expression was expected of type Resolver_lwt.t
What I'm asking here is mostly a conceptual question, because I'm clearly missing something. I'm not an expert in OCaml/MirageOS, but this type mismatch is hard to understand, considering that I'm just calling the same function from a different file.
config.ml is not ordinary application code: it is used to generate main.ml, so functions like resolver_dns build configuration-time descriptions (Mirage.impl values), not the runtime values your start function receives. You can copy the generated code from main.ml if you want.
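The distinction can be sketched in plain Python with made-up names: config.ml manipulates descriptions of devices, and only the generated main.ml turns them into the runtime values (like Resolver_lwt.t) that start receives. Passing a description where a value is expected is exactly the reported type error.

```python
# Hypothetical model: Impl is a build-time description (like
# Mirage.impl), not the device itself. All names here are illustrative.
class Impl:
    def __init__(self, build):
        self.build = build  # how to construct the runtime value later

def resolver_dns(stack):
    # In config.ml this returns a description, not a resolver.
    return Impl(lambda: "resolver over " + stack)

desc = resolver_dns("tcp-stack")  # what config.ml works with
value = desc.build()              # what the generated main.ml produces
assert value == "resolver over tcp-stack"
```

So calling resolver_dns inside unikernel.ml hands http_fetch a description (Mirage.resolver Mirage.impl) when it wants the built value (Resolver_lwt.t); the "same function" lives at a different stage, not just in a different file.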

Getting responses from erlang processes

I have an Erlang project that makes a lot of concurrent SOAP requests to my application. Currently it's limited by how many nodes are available, but I would like to adjust it so that each node can send more than one message at a time.
I've figured that problem out, but I don't know how to get a response back from the process running the SOAP request.
This is my function that I'm attempting to use to do multiple threads:
batch(Url, Message, BatchSize) ->
    inets:start(),
    Threads = for(1, BatchSize, fun() -> spawn(fun() -> attack_thread() end) end),
    lists:map(fun(Pid) -> Pid ! {Url, Message, self()} end, Threads).
This function is called by whoever initiated the stress test; it runs on every node in our network and is called continually until the requested number of SOAP requests have been sent and timed.
This is the attack_thread/0 function that receives the message from batch:
attack_thread() ->
    receive
        {Url, Message, FromPID} ->
            {TimeTaken, {ok, {{_, 200, _}, _, _}}} =
                timer:tc(httpc, request, [post, {Url, [{"connection", "close"}, {"charset", "utf-8"}], "text/xml", Message}, [], []]),
            TimeTaken / 1000 / 1000
    end.
As you can see, I want it to return the number of seconds the SOAP request took. However, Erlang's message passing (Pid ! Message) doesn't return anything useful.
How can I get a result back?
Each of your attack_thread() processes can simply drop a message in the mailbox of the process running the batch/3 function:
FromPid ! {time_taken, self(), TimeTaken / 1000 / 1000}.
but then you need to collect the results:
batch(Url, Message, BatchSize) ->
    inets:start(),
    Pids = [spawn_link(fun attack_thread/0) || _ <- lists:seq(1, BatchSize)],
    [Pid ! {Url, Message, self()} || Pid <- Pids],
    collect(Pids).
collect([]) -> [];
collect(Pids) ->
    receive
        {time_taken, P, Time} ->
            [Time | collect(Pids -- [P])]
    end.
Some other comments: you probably want spawn_link/1 here; if something dies along the way, you want the whole thing to die. Also, be sure to tune inets httpc a bit so it is more effective. You might also want to look at basho_bench or tsung.
Finally, you can use a closure directly rather than pass the url and message:
attack_thread(Url, Message, From) -> ...
So your spawn is:
Self = self(),
Pids = [spawn_link(fun() -> attack_thread(Url, Message, Self) end) || _ <- ...]
It avoids passing in the message in the beginning.

Guarantee order of messages posted to mailbox processor

I have a mailbox processor which receives a fixed number of messages:
let consumeThreeMessages = MailboxProcessor.Start(fun inbox ->
    async {
        let! msg1 = inbox.Receive()
        printfn "msg1: %s" msg1

        let! msg2 = inbox.Receive()
        printfn "msg2: %s" msg2

        let! msg3 = inbox.Receive()
        printfn "msg3: %s" msg3
    }
)

consumeThreeMessages.Post("First message")
consumeThreeMessages.Post("Second message")
consumeThreeMessages.Post("Third message")
These messages should be handled in exactly the order sent. During my testing, it prints out exactly what it should:
First message
Second message
Third message
However, since message posting is asynchronous, it sounds like posting 3 messages rapidly could result in items being processed in any order. For example, I do not want to receive messages out of order and get something like this:
Second message // <-- oh noes!
First message
Third message
Are messages guaranteed to be received and processed in the order sent? Or is it possible for messages to be received or processed out of order?
The code in your consumeThreeMessages function will always execute in order, because of the way F#'s async workflows work.
The following code:
async {
    let! msg1 = inbox.Receive()
    printfn "msg1: %s" msg1
    let! msg2 = inbox.Receive()
    printfn "msg2: %s" msg2
}
Roughly translates to:
async.Bind(
    inbox.Receive(),
    (fun msg1 ->
        printfn "msg1: %s" msg1
        async.Bind(
            inbox.Receive(),
            (fun msg2 -> printfn "msg2: %s" msg2)
        )
    )
)
When you look at the desugared form, it is clear that the code executes serially. The 'async' part comes into play in the implementation of async.Bind, which starts the computation asynchronously and 'wakes up' when it completes to finish the execution. This way you can take advantage of asynchronous hardware operations, and not waste OS threads waiting on IO operations.
That doesn't mean that you can't run into concurrency issues when using F#'s async workflows however. Imagine that you did the following:
let total = ref 0

let doTaskAsync() =
    async {
        for i = 0 to 1000 do
            incr total
    } |> Async.Start
// Start the task twice
doTaskAsync()
doTaskAsync()
The above code will have two asynchronous workflows modifying the same state at the same time.
So, to answer your question briefly: within the body of a single async block, things always execute in order (that is, the line after a let! or do! doesn't execute until the async operation completes). However, if you share state between two async tasks, all bets are off. In that case you will need to consider locking or using the concurrent data structures that come with CLR 4.0.
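The serial-execution argument can be sketched with a toy bind in plain Python (the thunk-based "deferred" values and all names here are illustrative, not F#'s Async API): each continuation runs only after the previous step's result exists, so the steps are ordered even though nothing runs until the whole chain is started.

```python
# Toy monadic bind over thunks, mirroring the desugared let!/let! chain.
log = []

def step(name, value):
    # A deferred computation: records its name only when forced.
    return lambda: (log.append(name), value)[1]

def bind(m, f):
    # Force m first, then build and force the next step from its result.
    return lambda: f(m())()

def ret(x):
    return lambda: x

# The two-message example from the answer, as nested binds:
program = bind(step("msg1", 1), lambda a:
          bind(step("msg2", 2), lambda b:
          ret(a + b)))

assert log == []        # nothing has executed yet
result = program()      # starting the chain runs the steps in order
assert result == 3
assert log == ["msg1", "msg2"]
```

The ordering comes from bind itself, not from any scheduler: msg2's step cannot run until msg1's result has been produced and passed to its continuation.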