I have the following problem to solve. I want to make a number of requests to a number of "remote" servers (actually, a server farm we control). The connection is very simple. Send a line, and then read lines back. Because of the number of requests and the number of servers, I use pthreads, one for each request.
The naive approach, using blocking sockets, does not work; very occasionally, I'll have a thread stuck in 'connect'. I cannot use SIGALRM because I am using pthreads. I tried converting the code to O_NONBLOCK but this vastly complicated the code to read single lines.
What are my options? I'm looking for the simplest solution that allows the following pseudocode:
// Inside a pthread
try {
req = connect(host, port);
req.writeln("request command");
while (line = req.readline()) {
// Process line
}
} catch TimeoutError {
// Bitch and complain
}
My code is in C++ and I'm using Boost. A quick look at Boost ASIO shows me that it probably isn't the correct approach, but I could be wrong. ACE is far, far too heavy-weight to solve this problem.
Have you looked at libevent?
http://www.monkey.org/~provos/libevent/
It's a totally different paradigm, but the performance is excellent.
memcached is built on top of libevent.
I saw the comments, and I think you can use boost::asio with boost::asio::deadline_timer.
A code fragment:
void restart_timer()
{
timer_.cancel();
timer_.expires_from_now(boost::posix_time::seconds(5));
timer_.async_wait(boost::bind(&MyClass::handleTimeout,
shared_from_this(), boost::asio::placeholders::error));
}
Here handleTimeout is the callback, timer_ is a boost::asio::deadline_timer,
and MyClass is similar to
class Y: public enable_shared_from_this<Y>
{
public:
shared_ptr<Y> f()
{
return shared_from_this();
}
};
You can call restart_timer before connect or read/write.
See the Boost documentation for more information about shared_from_this().
You mentioned this happens 'very occasionally'. Your 'connect' side should have the fault tolerance and error handling you are looking for but you should also consider the stability of your servers, DNS, network connections, etc.
The underlying protocols are very sturdy and work very well, so if you are experiencing these kinds of problems that often, it might be worth checking.
You may also be able to close the socket from another thread. That should cause the connect to fail.
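To make the original pseudocode concrete without Boost: a common approach is to switch the socket to non-blocking mode only for connect(), wait for completion with poll(), and then switch back to blocking mode so the line-reading code stays simple. The sketch below assumes a POSIX system; connect_with_timeout and set_io_timeout are hypothetical helper names, not from any library.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Close fd but keep the errno value that describes the original failure.
static int fail_and_close(int fd, int err) {
    close(fd);
    errno = err;
    return -1;
}

// Hypothetical helper: connect a TCP socket, giving up after timeout_ms.
// Returns a *blocking* socket fd on success, or -1 with errno set
// (ETIMEDOUT when the timeout expires).
int connect_with_timeout(const sockaddr* addr, socklen_t len, int timeout_ms) {
    assert(addr != nullptr);
    int fd = socket(addr->sa_family, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    // Non-blocking mode is used only while connecting.
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return fail_and_close(fd, errno);

    if (connect(fd, addr, len) < 0) {
        if (errno != EINPROGRESS) return fail_and_close(fd, errno);

        // Connection in progress: wait until the socket becomes writable.
        // Note: a retry after EINTR restarts the full timeout in this sketch.
        pollfd pfd{};
        pfd.fd = fd;
        pfd.events = POLLOUT;
        int n;
        do {
            n = poll(&pfd, 1, timeout_ms);
        } while (n < 0 && errno == EINTR);
        if (n == 0) return fail_and_close(fd, ETIMEDOUT);
        if (n < 0) return fail_and_close(fd, errno);

        // Writable does not mean connected; SO_ERROR holds the real result.
        int err = 0;
        socklen_t elen = sizeof(err);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &elen) < 0) err = errno;
        if (err != 0) return fail_and_close(fd, err);
    }

    // Restore blocking mode for the simple readline-style code.
    if (fcntl(fd, F_SETFL, flags) < 0) return fail_and_close(fd, errno);
    return fd;
}

// Hypothetical helper: make blocking reads and writes fail with
// EAGAIN/EWOULDBLOCK after `seconds` instead of hanging forever.
int set_io_timeout(int fd, int seconds) {
    timeval tv{};
    tv.tv_sec = seconds;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) return -1;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}
```

Each pthread can then call connect_with_timeout(), followed by set_io_timeout() on the returned descriptor, and treat a -1 return or a timed-out read as the TimeoutError case.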
Related
I'm implementing an auctioning system in C++ with Boost.Asio. There is a single centralized auctioneer (the server) and some connecting bidders (the clients). I am implementing this in an asynchronous fashion, and I have implemented the basic communication between the bidder and auctioneer (register, ping, get client list). The skeletal code for the auctioneer looks as follows:
class talkToBidder : public boost::enable_shared_from_this<talkToBidder>
{
// Code for sending and receiving messages, which works fine
};
void on_round_end()
{
// Choose the best bid and message the winner
if (!itemList.empty())
timer_reset();
}
void timer_reset()
{
// Send the item information to the bidders
// When the round ends, call on_round_end()
auction_timer.expires_from_now(boost::posix_time::millisec(ROUND_TIME));
auction_timer.async_wait(boost::bind(on_round_end));
}
void handle_accept(...)
{
// Create new bidder...
acceptor.async_accept(bidder->sock(),boost::bind(handle_accept,bidder,_1));
}
int main()
{
// Create new bidder and handle accepting it
talkToBidder::ptr bidder = talkToBidder::new_();
acceptor.async_accept(bidder->sock(),boost::bind(handle_accept,bidder,_1));
service.run();
}
My issue is, I need to wait for at least one bidder to connect before I can start the auction, so I cannot simply call timer_reset() before I use service.run(). What is the Boost.Asio way to go about doing this?
In asynchronous protocol design, it helps to draw Message Sequence Diagrams. Do include your timers.
The code now becomes trivial. You start your timer when the message arrives that should start your timer. Yes, this is shifting the problem a bit forwards. The real point here is that it's not a Boost Asio coding problem. In your case, that particular message appears to be the login of the first bidder, implemented as a TCP connect (SYN/ACK) which maps to handle_accept in your code.
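As a rough illustration of "start the timer when the triggering message arrives": the sketch below uses only the standard library, and AuctionStarter is a hypothetical helper, not part of Boost.Asio. It assumes the callback passed in would be timer_reset() from the question, and that on_bidder_accepted() would be called from handle_accept() after a successful accept.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: runs the start callback exactly once, when the
// first bidder has been accepted. Later bidders only bump the counter.
class AuctionStarter {
public:
    explicit AuctionStarter(std::function<void()> start_auction)
        : start_auction_(std::move(start_auction)) {
        assert(start_auction_ && "start callback must be callable");
    }

    // Intended to be called from handle_accept() on success.
    void on_bidder_accepted() {
        ++bidders_;
        if (!started_) {
            started_ = true;
            start_auction_();  // in the question's code: timer_reset()
        }
    }

    int bidders() const { return bidders_; }
    bool started() const { return started_; }

private:
    std::function<void()> start_auction_;
    int bidders_ = 0;
    bool started_ = false;
};
```

Because the first call happens inside a completion handler, service.run() is already running when the timer is armed, so nothing needs to happen before run().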
On the Boost website, there is a good example of timeouts for async operations. However, in that example, the socket is closed to cancel operations. There is also socket::cancel(), but both the documentation and a compiler warning describe it as problematic in terms of portability.
Among the stack of Boost.Asio timeout questions on SO, there are several kinds of answers. The first is probably introducing a custom event loop, i.e., looping over io_service::run_one() and cancelling the event loop on deadline. I am using io_service::run() in a worker thread, so that's not the kind of solution I would like to employ, if possible, as I do not want to change my code base.
A second option is directly changing the options of native socket. However, I would like to stick to Boost.Asio if possible and avoid any sort of platform-specific code as much as possible.
The example in the documentation is for an old version of Boost.Asio, but it's working properly, other than being forced to close the socket to cancel the operations. Using the documentation example, I have the following
void check_deadline(const boost::system::error_code &ec)
{
if(!running) {
return;
}
if(timer.expires_at() <= boost::asio::deadline_timer::traits_type::now()) {
// cancel all operations
boost::system::error_code errorcode;
boost::asio::ip::tcp::endpoint endpoint = socket.remote_endpoint();
socket.close(errorcode);
if(errorcode) {
SLOGERROR(mutex, errorcode.message(), "check_deadline()");
}
else {
SLOG(mutex, "timed out", "check_deadline()");
// connect again
Connect(endpoint);
if(errorcode) {
SLOGERROR(mutex, errorcode.message(), "check_deadline()");
}
}
// set timer to infinity, so that it won't expire
// until a proper deadline is set
timer.expires_at(boost::posix_time::pos_infin);
}
// keep waiting
timer.async_wait(std::bind(&TCPClient::check_deadline, this, std::placeholders::_1));
}
This is the only callback function registered with async_wait. The first solution I could come up with is reconnecting after closing the socket. Now my question is, is there a better way? By a better way, I mean cancelling the operations based on a timer without actually disrupting the connection (i.e., without closing the socket).
I'm just starting to use Boost, so maybe I'm messing something up.
I'm trying to set up an HTTP server with Boost (ASIO). I've taken the code from the docs: http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/examples/cpp03_examples.html (HTTP Server, the first one)
The only difference from the example is that I run the server from my own method "run" and start io_service in a background thread, as in the docs: http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/reference/io_service.html
boost::asio::io_service::work work(io_service_);
(I'm also stopping io_service from my run method.)
When I start this modified server everything seems to be OK, and the run method works fine. But when I try to get a doc from the server, the request hangs and control flow never reaches the "request_handle" method.
Am I missing something?
UPD. Here is my code of run method:
void NetstreamServer::run()
{
LOG4CPLUS_DEBUG(logger, "NetstreamServer is running");
boost::asio::io_service::work work(io_service_);
try
{
while (true)
{
if (condition)
{
io_service_.stop();
break;
}
}
}
catch (std::exception const& e)
{
LOG4CPLUS_ERROR(logger, "NetstreamServer" << " caught exception: " << e.what());
}
}
You should call io_service_.run(); otherwise nothing will dispatch the completion handlers of the Asio objects serviced by io_service_.
Without the code you changed, everyone here can only guess. Unfortunately, you also did not mention the compiler and the OS you are using. Even though Boost claims to be platform independent, you should always include this information, as in reality platforms differ even with Boost.
Let me take a guess. Are you using Microsoft Windows? How do you prevent the "main" function from exiting? You moved the blocking "run" function out of it into another thread, so the main function has no wait point anymore. Let me guess again: you used something like "getchar", so that you can exit your server just by hitting the return key. If so, the problem is the getchar, which unfortunately blocks all I/O of the asio socket implementation, but only on Windows-based systems.
I would not need to guess if you included the information mentioned above in your post, in particular all(!) changes you made to the code sample.
I'm using Boost.Asio for network operations, they have to (and actually, can, there's no complex data structures or anything) remain pretty low level since I can't afford the luxury of serialization overhead (and the libs I found that did offer well enough performance seemed to be badly suited for my case).
The problem is with an async write I'm doing from the client (in Qt, but that should probably be irrelevant here). The callback specified in the async_write never gets called, and I'm at a complete loss as to why. The code is:
void SpikingMatrixClient::addMatrix() {
std::cout << "entered add matrix" << std::endl;
int action = protocol::Actions::AddMatrix;
int matrixSize = this->ui->editNetworkSize->text().toInt();
std::ostream out(&buf);
out.write(reinterpret_cast<const char*>(&action), sizeof(action));
out.write(reinterpret_cast<const char*>(&matrixSize), sizeof(matrixSize));
boost::asio::async_write(*connection.socket(), buf.data(),
boost::bind(&SpikingMatrixClient::onAddMatrix, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred));
}
which calls the first write. The callback is
void SpikingMatrixClient::onAddMatrix(const boost::system::error_code& error, size_t bytes_transferred) {
std::cout << "entered onAddMatrix" << std::endl;
if (!error) {
buf.consume(bytes_transferred);
requestMatrixList();
} else {
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
}
The callback never gets called, even though the server receives all the data. Can anyone think of any reason why it might be doing that?
P.S. There was a wrapper for that connection, and yes there will probably be one again. Ditched it a day or two ago because I couldn't find the problem with this callback.
As suggested, posting a solution I found to be the most suitable (at least for now).
The client application is being written in Qt, and I need the IO to be async. For the most part, the client receives calculation data from the server application and has to render various graphical representations of it.
Now, there's some key aspects to consider:
1. The GUI has to be responsive, it should not be blocked by the IO.
2. The client can be connected / disconnected.
3. The traffic is pretty intense, data gets sent / refreshed to the client every few secs and it has to remain responsive (as per item 1).
As per the Boost.Asio documentation,
Multiple threads may call io_service::run() to set up a pool of
threads from which completion handlers may be invoked.
Note that all threads that have joined an io_service's pool are considered equivalent, and the io_service may distribute work across them in an arbitrary fashion.
Note that io_service.run() blocks until the io_service runs out of work.
With this in mind, the clear solution is to run io_service.run() from another thread. The relevant code snippets are
void SpikingMatrixClient::connect() {
Ui::ConnectDialog ui;
QDialog *dialog = new QDialog;
ui.setupUi(dialog);
if (dialog->exec()) {
QString host = ui.lineEditHost->text();
QString port = ui.lineEditPort->text();
connection = TcpConnection::create(io);
boost::system::error_code error = connection->connect(host, port);
if (!error) {
io = boost::shared_ptr<boost::asio::io_service>(new boost::asio::io_service);
work = boost::shared_ptr<boost::asio::io_service::work>(new boost::asio::io_service::work(*io));
io_threads.create_thread(boost::bind(&SpikingMatrixClient::runIo, this, io));
}
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
}
for connecting & starting IO, where:
work is a private boost::shared_ptr to the boost::asio::io_service::work object it was passed,
io is a private boost::shared_ptr to a boost::asio::io_service,
connection is a boost::shared_ptr to my connection wrapper class, and the connect() call uses a resolver etc. to connect the socket (there are plenty of examples of that around),
and io_threads is a private boost::thread_group.
Surely it could be shortened with some typedefs if needed.
TcpConnection is my own connection wrapper implementation, which sort of lacks functionality for now, and I suppose I could move the whole thread thing into it when it gets reinstated. This snippet should be enough to get the idea anyway...
The disconnecting part goes like this:
void SpikingMatrixClient::disconnect() {
work.reset();
io_threads.join_all();
boost::system::error_code error = connection->disconnect();
if (!error) {
connection.reset();
}
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
the work object is destroyed, so that the io_service can run out of work eventually,
the threads are joined, meaning that all work gets finished before disconnecting, thus data shouldn't get corrupted,
the disconnect() calls shutdown() and close() on the socket behind the scenes, and if there's no error, destroys the connection pointer.
Note, that there's no error handling in case of an error while disconnecting in this snippet, but it could very well be done, either by checking the error code (which seems more C-like), or throwing from the disconnect() if the error code within it represents an error after trying to disconnect.
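To illustrate why resetting the work object lets the IO thread finish, here is a toy stand-in written with the standard library only. MiniService is hypothetical and is not Boost.Asio; it only mimics the rule that run() returns once the handler queue is empty and no work guard is outstanding.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Toy model of io_service + io_service::work (an assumption for
// illustration, not the real implementation).
class MiniService {
public:
    // Queue a handler to be executed by a thread inside run().
    void post(std::function<void()> handler) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(handler));
        }
        cv_.notify_one();
    }

    // Plays the role of constructing an io_service::work object.
    void add_work() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++work_;
    }

    // Plays the role of destroying it (work.reset() in the answer).
    void remove_work() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(work_ > 0);
            --work_;
        }
        cv_.notify_all();
    }

    // Blocks until the queue is empty AND no work guard is outstanding.
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return !queue_.empty() || work_ == 0; });
            if (queue_.empty()) return;  // no guard left and nothing queued
            std::function<void()> handler = std::move(queue_.front());
            queue_.pop_front();
            lock.unlock();  // never hold the lock while running user code
            handler();
            lock.lock();
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> queue_;
    int work_ = 0;
};
```

In this model, calling remove_work() and then joining the thread that called run() corresponds to work.reset() followed by io_threads.join_all(): handlers that were already queued still run before run() returns.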
I encountered a similar problem (callbacks not fired) but the circumstances are different from this question (io_service had jobs but still would not fire the handlers ). I will post this anyway and maybe it will help someone.
In my program, I set up an async_connect() then followed by io_service.run(), which blocks as expected.
async_connect() goes to on_connect_handler() as expected, which in turn fires async_write().
on_write_complete_handler() does not fire, even though the other end of the connection has received all the data and has even sent back a response.
I discovered that it was caused by my placing program logic in on_connect_handler(). Specifically, after the connection was established and after I called async_write(), I entered an infinite loop to perform arbitrary logic, so on_connect_handler() never returned. I assume this prevents the io_service from executing other handlers, even if their conditions are met, because it is stuck here. (I had many misconceptions, and thought that io_service would automagically spawn threads for each async_x() call.)
Hope that helps.
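The effect described above can be reproduced with a toy single-threaded loop. MiniLoop below is hypothetical and uses only the standard library (it is not Boost.Asio), but it follows the same rule: with one thread in run(), a handler posted from inside another handler cannot start until the first one returns.

```cpp
#include <cassert>  // for assert-based checks such as the one in the note below
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy single-threaded handler queue (an assumption for illustration).
class MiniLoop {
public:
    void post(std::function<void()> handler) {
        queue_.push_back(std::move(handler));
    }

    // Runs queued handlers one at a time, in order, on the calling thread.
    void run() {
        while (!queue_.empty()) {
            std::function<void()> handler = std::move(queue_.front());
            queue_.pop_front();
            handler();  // nothing else runs until this call returns
        }
    }

private:
    std::deque<std::function<void()>> queue_;
};

// Returns the order in which the two handlers made progress.
std::vector<std::string> demo_handler_order() {
    MiniLoop loop;
    std::vector<std::string> log;
    loop.post([&] {
        log.push_back("connect: begin");
        // Stands in for async_write(): its completion handler is only queued.
        loop.post([&] { log.push_back("write complete"); });
        log.push_back("connect: end");
    });
    loop.run();
    return log;
}
```

For example, assert(demo_handler_order().back() == "write complete") holds: the write handler runs only after the connect handler has returned. If the connect handler looped forever, the write handler would never run, which matches the behaviour described above.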
I am writing an application in Qt to be deployed on Symbian S60 platform. Unfortunately, it needs to have Bluetooth functionality - nothing really advanced, just simple RFCOMM client socket and device discovery. To be exact, the application is expected to work on two platforms - Windows PC and aforementioned S60.
Of course, since Qt lacks Bluetooth support, it has to be coded in native API - Winsock2 on Windows and Symbian C++ on S60 - I'm coding a simple abstraction layer. And I have some problems with the discovery part on Symbian.
The discovery call in the abstraction layer should work synchronously - it blocks until the end of the discovery and returns all the devices as a QList. I don't have the exact code right now, but I had something like that:
RHostResolver resolver;
TInquirySockAddr addr;
// OMITTED: resolver and addr initialization
TRequestStatus err;
TNameEntry entry;
resolver.GetByAddress(addr, entry, err);
while (true) {
User::WaitForRequest(err);
if (err == KErrHostResNoMoreResults) {
break;
} else if (err != KErrNone) {
// OMITTED: error handling routine, not very important right now
}
// OMITTED: entry processing, adding to result QList
resolver.Next(entry, err);
}
resolver.Close();
Yes, I know that User::WaitForRequest is evil, that coding Symbian-like, I should use active objects, and so on. But it's just not what I need. I need a simple, synchronous way of doing device discovery.
And the code above does work. There's one quirk, however - I'd like to have a timeout during the discovery. That is, I want the discovery to take no more than, say, 15 seconds - parametrized in a function call. I tried to do something like this:
RTimer timer;
TRequestStatus timerStatus;
timer.CreateLocal();
RHostResolver resolver;
TInquirySockAddr addr;
// OMITTED: resolver and addr initialization
TRequestStatus err;
TNameEntry entry;
timer.After(timerStatus, timeout*1000000);
resolver.GetByAddress(addr, entry, err);
while (true) {
User::WaitForRequest(err, timerStatus);
if (timerStatus != KRequestPending) { // timeout
resolver.Cancel();
User::WaitForRequest(err);
break;
}
if (err == KErrHostResNoMoreResults) {
timer.Cancel();
User::WaitForRequest(timerStatus);
break;
} else if (err != KErrNone) {
// OMITTED: error handling routine, not very important right now
}
// OMITTED: entry processing, adding to result QList
resolver.Next(entry, err);
}
timer.Close();
resolver.Close();
And this code kinda works. Even more, the way it works is functionally correct - the timeout works, the devices discovered so far are returned, and if the discovery ends earlier, then it exits without waiting for the timer. The problem is that it leaves a stray thread in the program. That means, when I exit my app, its process is still loaded in the background, doing nothing. And I'm not the type of programmer who would be satisfied with a "fix" like making the "exit" button kill the process instead of exiting gracefully. Leaving a stray thread seems too serious a resource leak.
Is there any way to solve this? I don't mind rewriting everything from scratch, even using totally different APIs (as long as we're talking about native Symbian APIs), I just want it to work. I've read a bit about active objects, but it doesn't seem like what I need, since I just need this to work synchronously... In the case of bigger changes, I would appreciate more detailed explanations, since I'm new to Symbian C++, and I don't really need to master it - this little Bluetooth module is probably everything I'll need to write in it in foreseeable future.
Thanks in advance for any help! :)
The code you have looks OK to me. You've avoided the usual pitfall of not consuming all the requests that you've issued. Assuming that you also cancel the timer and do a User::WaitForRequest(timerStatus) inside your error handling condition, it should work.
I'm guessing that what you're worrying about is that there's no way for your main thread to request that this thread exit. You can do this roughly as follows:
Pass a pointer to a TRequestStatus into the thread when it is created by your main thread. Call this exitStatus.
When you do the User::WaitForRequest, also wait on exitStatus.
The main thread will do a bluetoothThread.RequestComplete(exitStatus, KErrCancel) when it wants the subthread to exit, where bluetoothThread is the RThread object that the main thread created.
In the subthread, when exitStatus is signalled, exit the loop to terminate the thread. You need to make sure you cancel and consume the timer and Bluetooth requests.
The main thread should do a bluetoothThread.Logon and wait on that request for the Bluetooth thread to exit.
There will likely be some more subtleties to deal correctly with all the error cases and so on.
I hope I'm not barking up the wrong tree altogether here...
The question is already answered, but... if you used active objects, I'd suggest a nested active scheduler (class CActiveSchedulerWait). You could then pass it to your active objects (CPeriodic for the timer and some other CActive for Bluetooth), and one of them would stop this nested scheduler in its RunL() method. What's more, with this approach your call becomes synchronous for the caller, and your thread is closed gracefully after performing the call.
If you're interested in the solution, search for examples of CActiveSchedulerWait, or just ask me and I'll give you a code sample.