How to implement HTTP streaming using libcurl - libcurl

I am trying to use CURL to implement Microsoft's EWS Streaming Notifications, i.e. HTTP streaming where the request is sent once and the server responds with a "Transfer-Encoding: chunked" header. The server will send multiple keepalive or notification chunks before the final packet. The chunks are terminated with CRLF.
If I create a standard CURL client then curl_easy_perform will not return until the final chunk is received, whereas I need curl_easy_perform to return upon receipt of each chunk, whereupon the application will process the received chunk and call curl_easy_perform again to wait for the next chunk.
I realize that I could process the chunk in the CURLOPT_WRITEFUNCTION callback, but the architecture of the application doesn't allow for that (this is a GSOAP plugin).
Any suggestions other than switching to CURLOPT_CONNECT_ONLY and handling the write and all subsequent reads with curl_easy_send and curl_easy_recv? That seems a shame, as I would have to duplicate CURL's formatting and parsing.
Alan

curl_easy_perform is completely synchronous and will return only once the entire transfer is done. There's really no way around that with this API (you already mentioned CURLOPT_CONNECT_ONLY, and I wouldn't recommend that either).
If you want control back in the same thread before the entire transfer is done, which your question suggests, you probably rather want to use the multi interface.
Using that interface, curl_multi_perform will only do as much as it can right now without blocking and return control back to your function. It does however put the responsibility over to your code to wait for socket activity and call libcurl again when there is.
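Not from the original thread, but a rough sketch of that multi-interface flow might look like this; the URL and callback name are placeholders, and the write callback still receives each decoded chunk while control returns to the application between calls:

    #include <curl/curl.h>

    // Each decoded chunk of the chunked response is delivered here by libcurl.
    static size_t on_chunk(char* data, size_t size, size_t nmemb, void* userdata) {
        (void)data; (void)userdata;              // process the chunk here
        return size * nmemb;                     // tell libcurl it was consumed
    }

    int main() {
        curl_global_init(CURL_GLOBAL_DEFAULT);

        CURL* easy = curl_easy_init();
        curl_easy_setopt(easy, CURLOPT_URL, "https://example.com/ews/streaming"); // placeholder URL
        curl_easy_setopt(easy, CURLOPT_WRITEFUNCTION, on_chunk);

        CURLM* multi = curl_multi_init();
        curl_multi_add_handle(multi, easy);

        int still_running = 1;
        while (still_running) {
            curl_multi_perform(multi, &still_running);         // does what it can, then returns
            // Control is back here between chunks: the application can do other work,
            // then wait for socket activity before calling libcurl again.
            int numfds = 0;
            curl_multi_wait(multi, nullptr, 0, 1000, &numfds); // wait up to 1s for activity
        }

        curl_multi_remove_handle(multi, easy);
        curl_easy_cleanup(easy);
        curl_multi_cleanup(multi);
        curl_global_cleanup();
        return 0;
    }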
(Sorry, but I don't know what restrictions a "GSOAP plugin" has and you didn't state them here, so maybe this is all crap)

Related

GRPC C++ Async Server: how to differentiate between WritesDone and a broken connection

When developing an async C++ GRPC server, how can I differentiate between the client being done with writing and the connection being broken?
I am streaming data from the client to the server, and once the client is done it will call WritesDone to let the server know it should finish storing the file. If I have a sync server I can differentiate between the client calling WritesDone and the connection being broken by calling context->IsCancelled(), but in async mode you cannot call IsCancelled until you get the tag specified in AsyncNotifyWhenDone.
In both cases (WritesDone and call done) the Read tag gets returned with ok set to false. However, the AsyncNotifyWhenDone tag, which would allow me to differentiate, arrives after the Read tag.
I will know after I try to call Finish (it will also return false), but I need to know before I call Finish, as my final processing might fail and I can't return the error anymore if I have already called Finish.
There's no way to distinguish until the AsyncNotifyWhenDone tag returns. It may come after the Read, in which case you may need to buffer it up. In the sync API you can check IsCancelled() at any time (and you can also do that in the callback API, which should be available for general use soon).
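Not from the original answer, but a rough sketch of that buffering idea, assuming a hypothetical client-streaming RPC "rpc Upload(stream Chunk) returns (UploadResult)" generated into demo::FileStore; StoreChunk, FinalizeFile and Abort are placeholder helpers for the application's file handling:

    #include <grpcpp/grpcpp.h>
    #include "demo.grpc.pb.h"   // hypothetical generated header for the Upload RPC

    struct UploadCall {
        grpc::ServerContext ctx;
        grpc::ServerAsyncReader<demo::UploadResult, demo::Chunk> reader{&ctx};
        demo::Chunk chunk;
        // Distinct addresses used as completion-queue tags.
        int start_tag = 0, read_tag = 0, done_tag = 0, finish_tag = 0;
        bool read_failed = false;  // Read returned ok == false, cause still unknown
        bool done_fired  = false;  // AsyncNotifyWhenDone delivered, IsCancelled() now valid

        void Start(demo::FileStore::AsyncService* service, grpc::ServerCompletionQueue* cq) {
            ctx.AsyncNotifyWhenDone(&done_tag);              // must be called before RequestUpload
            service->RequestUpload(&ctx, &reader, cq, cq, &start_tag);
        }

        void Decide() {                                      // called once both facts are known
            if (ctx.IsCancelled()) {
                Abort();                                     // connection broke, nothing to send back
            } else {
                reader.Finish(FinalizeFile(), grpc::Status::OK, &finish_tag); // client called WritesDone
            }
        }

        demo::UploadResult FinalizeFile();                   // placeholder: final processing
        void Abort();                                        // placeholder: discard partial file
        void StoreChunk(const demo::Chunk& c);               // placeholder: store one chunk
    };

    void Loop(grpc::ServerCompletionQueue* cq, UploadCall* call) {
        void* tag = nullptr;
        bool ok = false;
        while (cq->Next(&tag, &ok)) {
            if (tag == &call->start_tag) {
                call->reader.Read(&call->chunk, &call->read_tag);  // RPC started, begin reading
            } else if (tag == &call->done_tag) {
                call->done_fired = true;
                if (call->read_failed) call->Decide();             // we were waiting for this tag
            } else if (tag == &call->read_tag) {
                if (ok) {
                    call->StoreChunk(call->chunk);
                    call->reader.Read(&call->chunk, &call->read_tag);
                } else {
                    call->read_failed = true;                      // WritesDone or broken: unknown yet
                    if (call->done_fired) call->Decide();          // done tag already arrived
                    // otherwise buffer this state until the done tag shows up
                }
            }
        }
    }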

Boost asio - synchronous write / read - how to do?

First, I want to say that I'm new to Boost asio, and I have seen a lot of examples but there remain things I don't understand.
I want to create a server that will accept two clients (it will use two sockets). The first client will send messages to the server and the server will send these messages to the other client (yes, it is useless to use a server, but that's not the point here; I want to understand how all this works). This will happen until one of the clients closes.
So, I created a server, the server waits for the clients, and then it must wait for the first client to send some message. And this is my question: what must I do after that?
I thought I need to read the first socket and then write to the second, and so on, but how do I know if the first client has written to the socket? Likewise, how do I know if the second client has read from the second socket?
I don't need code, I just want to know the good way to do that.
Thanks a lot for reading!
When you perform async_read you specify a callback which is going to be called whenever any data is read into the buffer (you should also provide the buffer; check async_read's documentation). Correspondingly, you should provide a callback for async_write to know when your data has actually been sent. So, from the server's perspective, for the client which 'writes' you should do async_read, and for the second client which 'reads' you should do async_write. With the proposed dataflow client1->server->client2 it is hard to recognize which client the server should read from and which one to write to. It's up to you: you can choose the first connected client as the writer and the second as the reader, for example.
You might want to start with asio iostreams. It's a high-level iostream-like abstraction above asynchronous sockets.
P.S.: also, don't forget to run io_service.run() loop somewhere. Because all the asio callbacks are executed within that loop.
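A minimal sketch of the relay idea described above, assuming the two clients have already been accepted (client1 writes, client2 reads) and using the older io_service API mentioned in the P.S.; error handling is left out:

    #include <array>
    #include <utility>
    #include <boost/asio.hpp>

    using boost::asio::ip::tcp;

    class Relay {
    public:
        Relay(tcp::socket from, tcp::socket to)
            : from_(std::move(from)), to_(std::move(to)) {}

        void start() { do_read(); }

    private:
        void do_read() {
            // The callback fires once some data from client1 has been read into buf_.
            from_.async_read_some(boost::asio::buffer(buf_),
                [this](boost::system::error_code ec, std::size_t n) {
                    if (!ec) do_write(n);           // forward what arrived to client2
                });                                 // on error (e.g. client closed) stop relaying
        }

        void do_write(std::size_t n) {
            // async_write keeps writing until all n bytes have been sent, then calls back.
            boost::asio::async_write(to_, boost::asio::buffer(buf_, n),
                [this](boost::system::error_code ec, std::size_t) {
                    if (!ec) do_read();             // wait for the next message from client1
                });
        }

        tcp::socket from_, to_;
        std::array<char, 4096> buf_;
    };

    // After accepting both clients:
    //   Relay relay(std::move(socket1), std::move(socket2));
    //   relay.start();
    //   io_service.run();   // all the callbacks above execute inside this loop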

How to send and receive data up to SO_SNDTIMEO and SO_RCVTIMEO without corrupting connection?

I am currently planning how to develop a man-in-the-middle network application for a TCP server that would transfer data between the server and a client. It would behave as a regular client to the server and as a server to the remote client, without modifying any data. It will optionally be used to detect and measure how long the server or client is unable to receive data that is ready to be received while the connection is inactive.
I am planning to use blocking send and recv functions. Before any data transfer I would call setsockopt to set SO_SNDTIMEO and SO_RCVTIMEO to about 10-20 milliseconds, assuming this will force the blocking send and recv functions to return early so that data from another active connection can be routed. Running a thread per connection looks too expensive. I would not use async sockets here because I cannot find a guarantee that they will complete within a fraction of a second, especially when a large amount of data is being sent or received. High data delays do not look good. I would use very small buffers here, but calling a function for each received byte looks like overkill.
My next assumption would be that it is safe to call send or recv again later if the previous call was terminated by a timeout and less data was transferred than requested.
But I am confused by contradictory information available on MSDN.
send function
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740149%28v=vs.85%29.aspx
If no error occurs, send returns the total number of bytes sent, which can be less than the number requested to be sent in the len parameter.
SOL_SOCKET Socket Options
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740532%28v=vs.85%29.aspx
SO_SNDTIMEO - The timeout, in milliseconds, for blocking send calls. The default for this option is zero, which indicates that a send operation will not time out. If a blocking send call times out, the connection is in an indeterminate state and should be closed.
Are my assumptions correct, and can I use these functions like this? Maybe there is a more effective way to do this?
Thanks for your answers.
While you MIGHT implement something along the lines of the ideas you have given in your question, there are preferable alternatives on all major systems.
Namely:
kqueue on FreeBSD and family, and on Mac OS X.
epoll on Linux and related operating systems.
I/O completion ports on Windows.
Using those technologies allows you to process traffic on multiple sockets without timeout logic and polling, in an efficient, reactive manner. They can all be considered successors of the ancient select() function in the socket API.
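As an illustration only (the Linux epoll flavor, with error handling and the actual forwarding logic omitted), the readiness-based loop looks roughly like this:

    #include <sys/epoll.h>
    #include <unistd.h>
    #include <vector>

    void event_loop(const std::vector<int>& sockets) {
        int ep = epoll_create1(0);
        for (int fd : sockets) {
            epoll_event ev{};
            ev.events = EPOLLIN;                       // notify when this socket has data to read
            ev.data.fd = fd;
            epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);
        }
        epoll_event ready[64];
        for (;;) {
            int n = epoll_wait(ep, ready, 64, -1);     // blocks until at least one socket is readable
            for (int i = 0; i < n; ++i) {
                int fd = ready[i].data.fd;
                (void)fd;                              // recv() from fd and forward to its peer here
            }
        }
        close(ep);                                     // never reached in this sketch
    }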
As for the quoted documentation for send() in your question, it is not really confusing or contradictory. Useful network protocols implement a mechanism to create "backpressure" for situations where a sender tries to send more data than a receiver (and/or the transport channel) can accommodate. So, an application can only hand more data to send() if the network stack has buffer space ready for it.
If, for example, an application tries to send 3 KB worth of data and the TCP/IP stack has only room for 800 bytes, send() might succeed and return that it used 800 of the 3 KB offered bytes.
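A minimal sketch of coping with such a short return value (POSIX-style signatures; on Winsock the return type is int and errors come via WSAGetLastError()): keep offering the unsent tail until the stack has taken everything, or an error/timeout occurs.

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <cstddef>

    // Returns true once all len bytes have been handed to the kernel.
    bool send_all(int sock, const char* buf, std::size_t len) {
        std::size_t done = 0;
        while (done < len) {
            ssize_t n = send(sock, buf + done, len - done, 0);
            if (n <= 0)
                return false;                        // error, or e.g. SO_SNDTIMEO expired
            done += static_cast<std::size_t>(n);     // send() may take fewer bytes than offered
        }
        return true;
    }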
The basic approach to forwarding the data on a connection is: Do not read from the incoming socket until you know you can send that data to the outgoing socket. If you read greedily (and buffer on application layer), you deprive the communication channel of its backpressure mechanism.
So basically, the "send capability" should drive the receive actions.
As for using timeouts for this "middle man", there are 2 major scenarios:
You know the sending behavior of the sender application, i.e. whether it has any intent to send data within your chosen receive timeout at any given time. Some applications only send sporadically, and any chosen value for a receive timeout could be wrong. Even if the application is supposed to send at a specific interval, your timeouts will cause trouble once someone debugs the sending application.
You want the "middle man" to work for unknown applications (which, of course, must not use encryption if the middle man is to have a chance). There you cannot pick any "adequate" timeout value because you know nothing about the sending behavior of the involved application(s).
As a previous poster has suggested, I strongly urge you to reconsider the design of your server so that it employs an asynchronous I/O strategy. This may very well require that you spend significant time learning about each operating systems' preferred approach. It will be time well-spent.
For anything other than a toy application, using blocking I/O in the manner that you suggest will not perform well. Even with short timeouts, it sounds to me as though you won't be able to service new connections until you have completed the work for the current connection. You may also find (with short timeouts) that you're burning more CPU time spinning waiting for work to do than actually doing work.
A previous poster wisely suggested taking a look at Windows I/O completion ports. Take a look at this article I wrote in 2007 for Dr. Dobbs. It's not perfect, but I try to do a decent job of explaining how you can design a simple server that uses a small thread pool to handle potentially large numbers of connections:
Windows I/O Completion Ports
http://www.drdobbs.com/cpp/multithreaded-asynchronous-io-io-comple/201202921
If you're on Linux/FreeBSD/MacOSX, take a look at libevent:
Libevent
http://libevent.org/
Finally, a good, practical book on writing TCP/IP servers and clients is "Practical TCP/IP Sockets in C" by Michael Donahoo and Kenneth Calvert. You could also check out the W. Richard Stevens texts (which cover the topic thoroughly for UNIX).
In summary, I think you should take some time to learn more about asynchronous socket I/O and the established, best-of-breed approaches for developing servers.
Feel free to private message me if you have questions down the road.

Socket data race [duplicate]

This question already has answers here: Are parallel calls to send/recv on the same socket valid?
Sockets are generally capable of two-way communication, so the same socket can be used to send and recv.
If I wanted to send some data (on another thread) while the socket is being read, what would the kernel do? This applies to both sides.
Consider this example: the server is sending you a file, and say it will take a long time (a slow uplink or a very big file). The user gets bored and decides to SIGINT you. You catch it and tell the server to stop sending the file (with some kind of message).
Will you be able to send, to tell the server to stop sending, even though you're reading from it? And of course, the same applies to the server side as well.
Hopefully I've been clear enough.
If I wanted to send some data (on another thread) while the socket is getting read, what would the kernel do?
Nothing special... sockets aren't like garden hoses... there's just some metadata added to each packet that's sent between the machines, so the reading and writing happen independently. (One exception: if one side calls recv() on a socket that has unsent data in the local buffers due to the Nagle algorithm, which bunches data up into sensibly sized packets, the stack may time out immediately and send whatever it can, but any tuning of that is an implementation latency-tuning detail and doesn't change the big picture or the way the client and server call the TCP API.)
Consider this example: the server is sending you a file and say it will take a lot (low uplink or a very big file). The user gets bored and decides to SIGINT you. You catch it and tell the server to stop sending the file (with some kind of message). Will you be able to send to tell the server to stop sending even though you're reading from it? And of course, that's applied to the server-side as well.
The kernel accepts a limited amount of data to be sent, and a limited amount of data received, after which it forces the sending side to wait until some has been consumed before sending more. So, if you've sent data to a server, then get a local SIGINT and send a "oh, cancel that" in the same way, the server must read all the already-sent data before it can see the "oh, cancel that". If instead of sending it "in the same way" you turn on the Out Of Band (OOB) flag while sending the cancel message, then the server can (if it's written to do so) detect that there's OOB data and read it before it's completed reading/processing the other data. It will still need to read and discard whatever in-band data you've already sent, but the flow control / buffering mentioned above means that should be a manageable amount - far less than your file size might be. Throughout all this, whatever you want to recv or the server sends is independent and unaffected by the large client->server send, any OOB data etc..
There's a discussion and example code from GNU at http://www.gnu.org/software/libc/manual/html_node/Out_002dof_002dBand-Data.html
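Not from the original answer, but for concreteness, a tiny sketch of the OOB idea (TCP urgent data is a single byte; the names here are placeholders):

    #include <sys/socket.h>

    // Client side: mark the one-byte "cancel" as urgent/out-of-band.
    void send_cancel(int sock) {
        const char mark = 'C';
        send(sock, &mark, 1, MSG_OOB);      // sent ahead of the normal in-band stream
    }

    // Server side: fetch the urgent byte once SIGURG (or select()'s exceptfds) reports it.
    char read_cancel(int sock) {
        char mark = 0;
        recv(sock, &mark, 1, MSG_OOB);      // reads the out-of-band byte
        return mark;
    }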
Thread 1 can safely write to the socket (with send) whilst thread 2 reads from the socket (with recv). What you need to be careful of is that the threads are synchronised at the point where you close() the socket, otherwise the file descriptor may be reused elsewhere, and the other thread (if not synchronized) could read from a file descriptor now used for something else. One way to achieve this would be for your reading thread to shut down the socket, which should cause the other end to drop the connection and thus make an in-progress send fail with an error.

C++ how to accept server-push data?

My situation: I would like to create a hobby project to improve my C++, involving real-time/low-latency programming.
I have decided I will write a small Java program which will send lots of random stock prices to a client, where the client will be written in C++ and accept all the prices.
I do not want the C++ client to have to poll/have a while loop which continuously checks for data even if there is none.
What options do I have for this? If it's easier to accomplish having a C++ server then that is not a problem.
I presume for starters I will have to use the boost ASIO package for networking?
I will be doing this on Windows 7.
Why not just have the Java server accept connections and then wait for some duration of time, e.g. 10 seconds? Within that time, if data becomes available, send it and close the connection.
Then the C++ client can have a thread which opens a connection whenever the previous one has completed.
That should give quite low latency without creating connections very often when there is no new data.
This is basically the Comet web programming model, which is used for many applications.
Think about how a web server receives data. When a URL is accessed, the data is pushed to the server. The server need not poll the client (or indeed know anything about the client other than that it's a service pushing bytes towards it).
You could use a Java servlet to accept the data over HTTP and write the code in this fashion. Similarly, boost::asio has a server example that should get you started. Under the hood, you could enable persistent HTTP so that the connections aren't opened / closed frequently. This'll make the coding model much simpler.
I do not want the C++ client to have to poll/have a while loop which continuously checks for data
Someone HAS to.
It doesn't need to be you, though. I've never used boost ASIO, but it might provide callback registration. If so, just register a callback function of yours with boost; boost will do the waiting and call you back when it gets some data.
The other option is of course to use synchronous functions, like (not a real function) Socket.read(), which blocks the thread until there is data in the socket or it is closed. But in this case you're dedicating a thread of your own.
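For the callback option, here is a minimal sketch with Boost.Asio; the host, port and buffer handling are assumptions, not part of the original answer:

    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <boost/asio.hpp>

    using boost::asio::ip::tcp;

    class PriceClient {
    public:
        PriceClient(boost::asio::io_service& io, const std::string& host, const std::string& port)
            : socket_(io) {
            tcp::resolver resolver(io);
            tcp::resolver::query query(host, port);
            boost::asio::connect(socket_, resolver.resolve(query));
            read_more();                               // register the first callback
        }

    private:
        void read_more() {
            socket_.async_read_some(boost::asio::buffer(buf_),
                [this](boost::system::error_code ec, std::size_t n) {
                    if (ec) return;                    // server closed or error: stop
                    std::cout.write(buf_, n);          // handle the pushed bytes here
                    read_more();                       // re-arm for the next push
                });
        }

        tcp::socket socket_;
        char buf_[1024];
    };

    int main() {
        boost::asio::io_service io;
        PriceClient client(io, "localhost", "9000");   // hypothetical Java server endpoint
        io.run();                                      // the waiting happens inside run()
    }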
--edit--
About the communication itself: just pick any IPC mechanism (sockets/pipes/files/...); someone already described one, I think. Once you send the data, the data itself is "encoded" and "decoded" by you, so you can create your own protocol. E.g. "%%<STOCK_NAME>=<STOCK_PRICE>##", where "%%", "=" and "##" are markers (marking the start, middle and end) that you add on the sender side and remove on the receiver side to get the stock name and price.
You can develop the protocol further based on your needs. For example, you could also send buy/sell recommendations or text alert messages with major stock exchange news. As long as your client and server understand how the data is "encoded", you're good.
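For instance, decoding the "%%<STOCK_NAME>=<STOCK_PRICE>##" framing above could look like this; the marker choice comes from the example, everything else is an assumption:

    #include <optional>
    #include <string>
    #include <utility>

    // Returns {name, price} if the frame matches %%<STOCK_NAME>=<STOCK_PRICE>##.
    std::optional<std::pair<std::string, double>> parse_quote(const std::string& frame) {
        if (frame.size() < 6 || frame.compare(0, 2, "%%") != 0 ||
            frame.compare(frame.size() - 2, 2, "##") != 0)
            return std::nullopt;                              // markers missing
        const std::string body = frame.substr(2, frame.size() - 4);
        const std::size_t eq = body.find('=');
        if (eq == std::string::npos || eq + 1 == body.size())
            return std::nullopt;                              // no name/price separator
        // std::stod will throw if the price part is not numeric; fine for a sketch.
        return std::make_pair(body.substr(0, eq), std::stod(body.substr(eq + 1)));
    }

    // parse_quote("%%ACME=12.34##") -> {"ACME", 12.34}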
Finally, if you want to secure the communication (and say you're not using a secure layer such as SSL), then you can encrypt the data. But that's a different chapter. :)
HTH