How to deal with multiple IO points in Scalaz and spray - monads

A REST call uses spray.io. It validates the request using various functions, including one in the middle that queries a data store. If all is OK, it then writes to said data store. In this situation there are up to 4 IO actions: HTTP request, data read, data write and HTTP response. If I decided to use the IO monad, how would I structure the IO given that some actions may or may not be required (i.e. the read and write may not need to happen if some validation fails), interspersed with non-IO functions? Presumably I could first ignore the spray stuff and think of the evaluation as starting after spray has done its thing. But how do I compose the IO functions with non-IO functions? Do I have to lift the non-IO stuff into an IO monad?

The short answer is Futures. In order to be non-blocking your IO operations should return Futures that you map to other Futures. In Spray you can complete the request with a Future.

Related

What should I do if boost::beast write_some doesn't write everything?

I am sending data on a boost::beast websocket.
I would like to send the data synchronously, so I am trying to decide if I should use write or write_some.
From this SO answer (which is about asio rather than beast specifically, but I assume(!) the same rules apply?) I understand that write will block until the entire message is confirmed sent, whereas write_some may return early, returning the number of bytes sent, which may be fewer than the number requested.
In my particular use case I am using a single thread, and the write is done from within this thread's context (i.e. from inside a callback issued after entering io_context.run()).
Since I don't want to block the caller for some indeterminate amount of time, I want to avoid using write if there is a more elegant solution.
So if I then turn to write_some, I am uncertain what I should do if the number of bytes written is less than the number of bytes I requested be sent.
How I would normally handle this with standard TCP sockets is to use non-blocking mode: when I get back EWOULDBLOCK, enqueue the remaining data and carry on, and only complete the write when the socket becomes writable again (much akin to an asio async_write). Since non-blocking mode is not supported in beast, I'm wondering what the analogous approach is.
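(For reference, that pattern looks roughly like this on a plain non-blocking BSD socket; this is my own sketch, not beast code:)

    // My own sketch of the classic pattern on a raw non-blocking TCP socket
    // (not beast): send what we can now, queue the rest, finish when writable.
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <cerrno>
    #include <deque>
    #include <vector>

    std::deque<std::vector<char>> pending; // bytes awaiting socket writability

    void try_send(int fd, const char* data, size_t len) {
        ssize_t n = ::send(fd, data, len, 0);
        if (n < 0) {
            if (errno != EWOULDBLOCK && errno != EAGAIN)
                return;                       // real error: handle/close
            n = 0;                            // nothing sent; queue it all
        }
        if (static_cast<size_t>(n) < len)     // partial (or no) send:
            pending.emplace_back(data + n, data + len); // resume when writable
    }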
Presumably I need to perform some additional write operation to ensure the rest of the bytes are sent in due course?
The beast docs say:
Callers are responsible for synchronizing operations on the socket using an implicit or explicit strand, as per the Asio documentation. The websocket stream asynchronous interface supports one of each of the following operations to be active at the same time:
async_read or async_read_some
async_write or async_write_some
async_ping or async_pong
async_close
Is it ok to start an async write of the remaining bytes, so long as I ensure that a new synchronous write/write_some isn't started before the outstanding async write has completed?
If I cannot start an async write to complete the send of the remaining bytes, how is one supposed to handle a synchronous write_some which doesn't completely send all bytes?
As to why I don't just use async_write always, I have additional slow processing to do after the attempt to write, such as logging etc. Since I am using a single thread, and the call to async_write happens within that thread, the write will only occur after I return control to the event loop.
So what I'd like to do is attempt to write synchronously (which will work in 90% of the cases) so the data is sent, and then perform my slow tasks which would otherwise delay the write. In the 10% of cases where a sync write doesn't complete immediately, then an alternative async_write operation should be employed - but only in the fallback situation.
Possibly related: I see that write_some has a flag fin, which should be set to true if this is the last part of the message.
I am only ever attempting to write complete messages, so should I always use true for this?

What's the most efficient way to async send data while async receiving with 0MQ?

I've got a ROUTER/DEALER setup where both ends need to be able to receive and send data asynchronously, as soon as it's available. The model is pretty much 0MQ's async C++ server: http://zguide.zeromq.org/cpp:asyncsrv
Both the client and the server workers poll; when there's data available, they call a callback. While this happens, from another thread (!) I'm putting data in a std::deque. In each poll-forever thread, I check the deque (under lock), and if there are items there, I send them out to the specified DEALER id (the id is placed in the queue).
But I can't help thinking that this is not idiomatic 0MQ. The mutex is possibly a design problem. Plus, memory consumption can probably get quite high if enough time passes between polls (and data accumulates in the deque).
The only alternative I can think of is having another DEALER thread connect to an inproc each time I want to send out data, and just have it send it and exit. However, this implies a connect per item of data sent + construction and destruction of a socket, and it's probably not ideal.
Is there an idiomatic 0MQ way to do this, and if so, what is it?
I don't fully understand your design, but I do understand your concern about using locks.
In most cases you can redesign your code to remove the use of locks using zeromq PAIR sockets and inproc.
Do you really need a std::deque? If not, you could just use a zeromq queue: it is just a queue that you can read/write from different threads using sockets.
If you really need the deque, then encapsulate it into its own thread (a class would be nice) and make its API (push, etc.) accessible via inproc sockets.
So, as I said, I may be on the wrong track, but in 99% of the cases I have come across, you can remove the locks completely with some ZMQ_PAIR/inproc if you need signalling.
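A minimal sketch of that idea, assuming a shared context; the inproc endpoint name and framing are illustrative, not from your design (and with several producer threads you would use PUSH/PULL instead of PAIR):

    // Sketch: replace the mutex-protected deque with a PAIR/inproc pipe.
    #include <zmq.h>
    #include <cstring>

    void producer(void* ctx) {                    // the "other" thread
        void* tx = zmq_socket(ctx, ZMQ_PAIR);
        zmq_connect(tx, "inproc://outbox");
        const char* msg = "payload";
        zmq_send(tx, msg, std::strlen(msg), 0);   // no lock needed
        zmq_close(tx);
    }

    void poll_loop(void* ctx) {                   // the existing poll thread
        void* rx = zmq_socket(ctx, ZMQ_PAIR);
        zmq_bind(rx, "inproc://outbox");          // bind before producers connect
        char buf[256];
        int n = zmq_recv(rx, buf, sizeof buf, 0); // or add rx to your zmq_poll set
        // ... forward buf[0..n) out of the DEALER socket ...
        (void)n;
        zmq_close(rx);
    }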
A 0MQ socket's internal queue has a limited buffer size (the high-water mark), and it can be controlled. Once it fills, messages are dropped or sends block, depending on the socket type. For that reason you may consider using the conflate option, which leaves only the most recent message in the queue.
For a single server communicating among many threads on a single machine, I suggest a publish/subscribe model with the conflate option: you receive the newest data as soon as you read, you won't have to worry about memory, and it removes the blocking-queue problem.
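A sketch of setting the option (the endpoint is illustrative; note ZMQ_CONFLATE must be set before connecting):

    // Sketch: SUB socket that only ever holds the most recent message.
    #include <zmq.h>

    void* make_conflating_sub(void* ctx) {
        void* sub = zmq_socket(ctx, ZMQ_SUB);
        int conflate = 1;                          // keep only the newest message
        zmq_setsockopt(sub, ZMQ_CONFLATE, &conflate, sizeof conflate);
        zmq_setsockopt(sub, ZMQ_SUBSCRIBE, "", 0); // subscribe to everything
        zmq_connect(sub, "inproc://updates");
        return sub;
    }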
As for your implementation, you are quite right that it is not the best design, but it is quite unavoidable. I suggest checking the question "Access std::deque from 3 threads"; while it answers your problem, it may not be the best approach.

Boost Beast WebSockets with several indeterminate writes per read

Using boost/beast websockets in C++
I've read up on the issues with beast websockets not supporting non-blocking reads, the fact that there's no way to check if data is available, and the fact that doing reads and writes in separate threads is probably not thread-safe.
The issue I have, then, is figuring out the correct approach to this problem:
The IBM Watson speech-to-text WebSockets API allows you to send chunks of audio data as they become available (or in pieces from an existing file.) However, you do not get text replies for each chunk.
Instead, you keep sending it audio data until it recognizes a pause or an end of utterance, and then it finally sends back some results.
In other words, you may have to do several writes before a read will return anything, and there's no way to predict how many writes you will have to do.
Without a non-blocking read function, and without putting the blocking read in a separate thread, how do I keep sending data and then only retrieving results when they're available?
Don't confuse the lack of thread safety with a lack of full-duplex capability. You can call async_read and then follow it with a call to async_write. This will result in two "pending" asynchronous operations. The write operation will complete shortly afterwards, and the read operation will remain pending until a message is received.
Asio's asynchronous model is "reactive." That means that your completion handler gets called when something happens. You don't "check to see if data is available." Beast doesn't reinvent the wheel here, it adopts the asynchronous model of Asio. If you understand how to write asynchronous network programs with Asio, this knowledge will transfer over to Beast.
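For the Watson use case, that means keeping one async_read pending for the eventual result while async_write sends each audio chunk. A rough sketch, with illustrative member and function names (not Watson-specific code):

    // Sketch: one pending async_read plus one async_write at a time,
    // which beast's websocket stream explicitly supports.
    #include <boost/beast/core.hpp>
    #include <boost/beast/websocket.hpp>
    #include <boost/asio/io_context.hpp>
    #include <boost/asio/ip/tcp.hpp>
    #include <memory>
    #include <string>

    namespace beast = boost::beast;
    namespace net = boost::asio;

    struct session {
        beast::websocket::stream<net::ip::tcp::socket> ws_;
        beast::flat_buffer buffer_;

        explicit session(net::io_context& ioc) : ws_(ioc) {}

        void start_reading() {                     // pending until a reply arrives
            ws_.async_read(buffer_,
                [this](beast::error_code ec, std::size_t) {
                    if (ec) return;
                    // ... handle the recognition result in buffer_ ...
                    buffer_.consume(buffer_.size());
                    start_reading();               // re-arm for the next result
                });
        }

        void send_chunk(std::shared_ptr<std::string const> chunk) {
            ws_.async_write(net::buffer(*chunk),   // only one write may be active
                [chunk](beast::error_code ec, std::size_t) { /* queue next chunk */ });
        }
    };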

Boost::Beast Non Blocking Read for Websockets?

We have an app that is entirely synchronous, and always will be, because it is basically a command line interpreter that sends low-level commands to our hardware, and you can't have two commands going to the hardware at the same time. I will only ever have one client socket for this configuration, operating in a synchronous manner: one command to the server, it talks to the hardware, and sends the value back to the client. But as far as I can see, async_read is currently the only way to do non-blocking reads.
What is the best way to get a non-blocking read/write via Beast? For example, with TCP and serial on Windows you have ways to peek into the buffer to see if data is ready to be accessed, and if there is, you can issue your read command knowing it won't block because the data is there. Not sure if I am just missing this functionality in Beast, although I will say having such functionality, if possible, would be nice.
Anyway, based on this I have a question.
First, can I take the coroutine example and, instead of using yield, create and pass it a read_handler function?
I've taken the coroutine example, and built the functions into my class, and used the exact same read_handler from this thread answer.
How to pass read handler to async_read for Beast websocket?
It compiles, as he says, but a breakpoint set there never triggers when data is received.
I don't really need the full async functionality like the async example, pushing it into different threads; in fact that makes my life more difficult, because the rest of the app is not async. And because we allow input from various sources (keyboard/TCP/serial/file), we can't block waiting for data.
What is the best way to get a non-blocking read/write via Beast?
Because of the way the websocket stream is implemented, it is not possible to support non-blocking socket modes.
can I take the Coroutine example and instead of using yield, to create and pass it a read_handler function?
If you want to use completion handlers, I would suggest that instead of starting with the coroutine example you start with one of the asynchronous examples, since these are already written to use completion handlers.
Coroutines have blocking semantics, while completion handlers do not. If you try to use the coroutine example and replace the yield expression with a completion handler, the call to the initiating function will not block the way it does when using coroutines. And you should not use spawn. You said that the coroutine example is much easier, probably this is because it resembles synchronous code. If you want that ease of writing and understanding, then you have to use coroutines. Code using completion handlers will exhibit the "inversion of control" typically associated with callbacks. This is inherent to how they work and not something you can change by just starting with code that uses coroutines and changing the completion token.
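To illustrate that inversion of control, here is a minimal handler-based read loop (ws and buffer are assumed to be set up elsewhere); the initiating call returns immediately, and the "next line of the program" lives inside the callback:

    // Sketch: the initiating call returns at once; the handler fires later,
    // from inside io_context::run(), when a message arrives.
    #include <boost/beast/core.hpp>
    #include <boost/beast/websocket.hpp>
    #include <boost/asio/ip/tcp.hpp>

    namespace beast = boost::beast;
    using tcp = boost::asio::ip::tcp;

    void start_read(beast::websocket::stream<tcp::socket>& ws,
                    beast::flat_buffer& buffer) {
        ws.async_read(buffer,
            [&ws, &buffer](beast::error_code ec, std::size_t) {
                if (ec) return;
                // ... handle the message in buffer ...
                buffer.consume(buffer.size());
                start_read(ws, buffer);   // loop by re-initiating the read
            });
    }   // returns immediately; contrast with async_read(..., yield),
        // which suspends the coroutine until the message arrives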

Refactor Decision: Message Queue Library, synchronize callers or create dedicated read/write threads in library

I'm refactoring a project that I did not design. It is written in C/C++ for linux. The project has a major client component that looks like this:
Client -> Output Queuing Library (OQL) -> Controller
Client
Messy semi-complex code, poorly designed (hodgepodge of OOP approximations using singletons/namespaces, just weird in many places - but it works)
Custom protocol implementation (not my protocol, cannot modify)
Shared Library
Multi-threaded
Multiple threads call the OQL API, i.e. multiple threads produce output
Accepts commands from controller via API
Produces massive unsequenced output which is affected, but not necessarily directly (and definitely not 1:1), by the controller input
Output Queuing Library (OQL)
Simple, clean code, not really designed for its current workload (it was never meant to queue; it was originally just writing to stdout, and then a message queue was shoe-horned in)
Shared Library
Single-threaded
Exposes API which accepts many types of data from the client and builds textual representations of this data
Inserts data into a sys V message queue
Controller
Executable
Single-threaded
Elegant, fault tolerant C++ which makes extensive use of boost
Written from scratch by me, the only part of the project I've been allowed to completely "fix" so to speak
Interacts with client library via API to initiate connection to server
Saves data produced by Client and read from OQL into database in another layer
So the problem essentially boils down to this: the controller is single-threaded and calls many API functions in the client library. Scenarios resulting from the Controller calling the Client API:
Normal (98%+)
Controller calls client API function
Client API function does magic internally
API function returns true
Client receives data as a result of the magic in step 2, in another thread of execution, and calls the OQL put function from that secondary thread
OQL writes data to the message queue; the queue either blocks or does not block, but neither matters, since the controller's main thread of execution is running and processing data.
Success!
Problem Scenario
Controller calls client API function
Client API function immediately produces a result and, BEFORE returning, calls the OQL put function from the main thread of execution in the Controller
OQL writes data to the message queue and one of the following happens:
The message queue is not full and does not block; everything returns, the controller processes the new data in the message queue, and life moves on happily
Problem: the message queue IS full and DOES block
Now, as I'm sure you can see, in the problem scenario (which is rare) the main thread of execution is blocking on a full message queue, while no data is being processed off the other end of the queue, since the controller is single-threaded...
So yes it's a bit of a mess, and no I'm not happy with the design but I've gotta figure out the best way to solve this without rewriting all of these interactions.
I'm trying to decide between:
Digging into the client, synchronizing all of the threads to a single I/O thread that interacts with OQL
I have basically zero faith that someone after me will not come in and break this down the road, introduce massive bugs and not be able to understand why
Introducing a writer thread into OQL
Takes a simple class and complicates it significantly
Introduces funny problems
Doesn't the queue need a queue at that point? Since data has to get transported to the writer thread
Really just typing up this question was probably the best thing I could have done, since my thoughts are much more organized now...but does anyone have any input to offer here? Is there some design pattern I'm not seeing which would not require massive refactoring, and are my concerns overblown on either of these? I'm not even sure if it's possible for anyone to provide meaningful advice here without knowing all angles of the problem, but I appreciate any insight you guys can offer.
Change the client to return an error when the queue is full, so the controller can make an intelligent decision about how to continue.
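If OQL's put uses a System V queue as described, the non-blocking variant might look like this sketch (struct and function names are illustrative):

    // Sketch: IPC_NOWAIT makes msgsnd fail with EAGAIN instead of blocking,
    // so a full queue becomes an error the Controller can act on.
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>
    #include <cerrno>

    struct msgwrap { long mtype; char mtext[512]; };

    bool oql_put_nonblocking(int qid, msgwrap& m, size_t len) {
        if (msgsnd(qid, &m, len, IPC_NOWAIT) == 0)
            return true;                 // enqueued
        if (errno == EAGAIN)
            return false;                // queue full: caller decides what to do
        return false;                    // other error: log/handle as appropriate
    }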
You could change the Controller to use a second thread to do the reading from the message queue (and just post the data to a much larger buffer, internal to the Controller, to be read by the main Controller thread).
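A sketch of that option, with illustrative names; the reader thread blocks on msgrcv, so the main Controller thread never does:

    // Sketch: a dedicated drain thread posts queue data into a larger
    // in-process buffer read by the main Controller thread.
    #include <sys/types.h>
    #include <sys/msg.h>
    #include <deque>
    #include <mutex>
    #include <string>

    std::mutex mtx;
    std::deque<std::string> inbox;  // consumed by the main Controller thread

    void drain_queue(int qid) {
        struct { long mtype; char mtext[512]; } m;
        for (;;) {
            ssize_t n = msgrcv(qid, &m, sizeof m.mtext, 0, 0); // blocks here
            if (n < 0) break;                                  // queue removed
            std::lock_guard<std::mutex> lk(mtx);
            inbox.emplace_back(m.mtext, static_cast<std::size_t>(n));
        }
    }
    // started once at Controller startup: std::thread(drain_queue, qid).detach();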