What is the purpose of boost mpi request's m_handler - c++

I am trying to test whether an MPI request has completed, but I have hit a problem I cannot figure out. If I use the test_all method as below, the request is reported as not done.
string msg;
boost::mpi::request req = world->irecv(some_rank, 0, msg);
vector<boost::mpi::request> waitingRequests;
waitingRequests.push_back(req);
if(boost::mpi::test_all(waitingRequests.begin(), waitingRequests.end()))
    cout << "test_all done" << endl;
When I try this code instead, I see that the request is done:
string msg;
boost::mpi::request req = world->irecv(some_rank, 0, msg);
if(req.test())
    cout << "test done" << endl;
So, I looked at the code in test_all function and realized that it returns false because of the condition "first->m_handler" (line 5 in the below code).
template<typename ForwardIterator> bool test_all(ForwardIterator first, ForwardIterator last) {
  std::vector<MPI_Request> requests;
  for (; first != last; ++first) {
    // If we have a non-trivial request, then no requests can be completed.
    if (first->m_handler || first->m_requests[1] != MPI_REQUEST_NULL)
      return false;
    requests.push_back(first->m_requests[0]);
  }
  int flag = 0;
  int n = requests.size();
  BOOST_MPI_CHECK_RESULT(MPI_Testall,
                         (n, &requests[0], &flag, MPI_STATUSES_IGNORE));
  return flag != 0;
}
Now, I wonder what m_handler is for.

MPI has no intrinsic support for complex C++ objects such as std::string. That's why Boost.MPI serialises such objects when sending them as MPI messages and deserialises them on receipt. From a semantic point of view, the non-blocking operation started by irecv() should complete only once the data has been received and the std::string object has been filled in appropriately. The additional step of processing the received message and deserialising it is performed by a special handler function, a pointer to which is stored in the m_handler variable:
if (m_handler) {
// This request is a receive for a serialized type. Use the
// handler to test for completion.
return m_handler(this, ra_test);
} else ...
No such handling is needed for simple datatypes.
The same applies to isend() when it operates on C++ objects. In that case a handler is not attached, but the class data is sent in the form of two separate messages and special care is taken for both sends to complete. That's what the second boolean expression (m_requests[1] != MPI_REQUEST_NULL) is for.
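The early return described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyRequest, toy_test_all and toy_test_each are hypothetical names, and the struct is not Boost.MPI's real class layout. It merely mimics the two conditions quoted from test_all and contrasts the bulk test with polling each request individually.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy stand-in for a request; NOT Boost.MPI's real class layout.
struct ToyRequest {
    bool raw_done = false;          // models completion of the underlying MPI request
    bool second_pending = false;    // models m_requests[1] != MPI_REQUEST_NULL
    std::function<bool()> handler;  // models m_handler (set for serialized receives)

    // Per-request test: defer to the handler when one is attached.
    bool test() const { return handler ? handler() : raw_done; }
};

// Mirrors the early return quoted from test_all: any non-trivial request
// makes the bulk test report "not done" without testing anything.
bool toy_test_all(const std::vector<ToyRequest>& reqs) {
    for (const auto& r : reqs)
        if (r.handler || r.second_pending) return false;
    for (const auto& r : reqs)
        if (!r.raw_done) return false;
    return true;
}

// Hypothetical alternative: poll each request individually.
bool toy_test_each(const std::vector<ToyRequest>& reqs) {
    for (const auto& r : reqs)
        if (!r.test()) return false;
    return true;
}
```

In this model a request with a handler attached is never reported complete by the bulk test, even when its handler would report completion, which matches the behaviour seen in the question.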


boost::interprocess message queue not removing

Using boost::interprocess message queue, I ran into a problem with removing the queue, was hoping someone could provide an explanation for the behavior, or correct my misunderstanding.
I'm on a Linux RHEL7 box. I have two processes, one that creates and feeds the queue, and the other one that opens the message queue and reads from it. For clarity, the initial process is started first, before the second process that does the reading/removing msgs from queue.
The creating process uses create_only_t, because I want creation to fail if the queue already exists. However, creating the queue always fails, even the first time. The specific exception it throws is File exists.
When I switch to an open_or_create_t queue, it works fine. I took this to mean that I wasn't cleaning up properly, so I made sure to remove the queue before trying to create it, as well as after the process finished sending all messages.
I would log if the remove was successful. One of my assumptions is that if the remove returns true it means it successfully removed the queue.
The Boost docs for remove read: "Removes the message queue from the system. Returns false on error." I wasn't sure whether true might just mean that it made a successful 'attempt' at removing it. Looking further, another Boost.Interprocess PDF explains:
The remove operation might fail returning false if the shared memory does not exist, the file is open or the file is still memory mapped by other processes
Either case, I feel like one would expect the queue to be removed if it always returns true, which is currently my case.
Still, creating a create_only_t message queue continues to fail, while open_or_create_t still works.
I had a hard time understanding the behavior, so I also tried to remove the message queue twice in a row before trying to initialize a create_only_t queue, to see if the second call would fail and return false. However, both returned true, which was not what I expected based on the documentation. If the first remove succeeded, the second should fail, as the message queue should no longer exist.
I've attached a snippet of my create process code. And I'll note, that this error happens without the "open process" being run.
Maybe I'm missing something obvious, thank you in advance.
try {
    bool first_removal = remove(msg_queue_name);
    if (first_removal) {
        log_info("first removal - true"); // this log always prints
        bool second_removal = remove(msg_queue_name);
        if (second_removal) {
            log_info("removal was again true"); // this also always prints
        } else {
            log_info("second removal - false");
        }
    } else {
        log_info("did not remove queue before creation");
    }
    log_info("attempting to initialize msg queue");
    message_queue mq(ooc, msg_queue_name, max_num_msgs, max_msg_size); // this is where it will fail (File exists)
    while(1) {
        // insertion logic, but does not get here
    }
} catch ( interprocess_exception& err ) {
    log_error(err.what()); // File exists
    bool removal_after_failure = remove(msg_queue_name);
    if (removal_after_failure) {
        log_info("Message queue was successfully removed"); // always logs here after every failure
    } else {
        log_warn("Message queue was NOT removed");
    }
}
It worked for me.
Then it dawned on me. You're probably using namespace. Don't. For this reason:
bool first_removal = remove(msg_queue_name);
This doesn't call the function you expect. It calls ::remove from the C standard library.
Simply qualify your call:
bool first_removal = message_queue::remove(msg_queue_name);
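To see why the unqualified call always appeared to succeed, here is a minimal sketch using only the C standard library. The C remove() returns 0 on success and non-zero on failure, so storing its result directly in a bool inverts the meaning: true means the call failed, for instance because no file of that name exists in the current directory. The helper names c_remove_as_bool and c_remove_succeeded are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstdio>

// What the question's unqualified `remove(name)` stores in the bool:
// the C library's remove() yields 0 on success and non-zero on failure.
bool c_remove_as_bool(const char* path) {
    return std::remove(path);  // implicit int -> bool, as in the question
}

// The conventional way to read the C result: success means "== 0".
bool c_remove_succeeded(const char* path) {
    return std::remove(path) == 0;
}
```

Calling c_remove_as_bool on a path that does not exist yields true every time, which is consistent with both removals in the question logging "true" while the real message queue was never touched.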
What you can do:
write hygienic code
avoid using namespace directives
avoid ADL traps
use warnings (-Wall -Wextra -pedantic at least)
use linters. See below
check your assumptions (a simple trip into the debugger would have shown you what's happening)
E.g. clang-tidy reported:
test.cpp|27 col 30| warning: implicit conversion 'int' -> bool [readability-implicit-bool-conversion]
|| bool first_removal = remove(msg_queue_name);
Suggesting to write:
bool first_removal = remove(msg_queue_name) != 0;
This tipped me off that something might be fishy.
Several of these fixes later, the code runs:
Live On Coliru
#include <boost/interprocess/ipc/message_queue.hpp>
#include <chrono>
#include <iostream>
namespace bip = boost::interprocess;
using bip::message_queue;
using bip::interprocess_exception;
using namespace std::chrono_literals;
using C = std::chrono::high_resolution_clock;
static constexpr char const* msg_queue_name = "my_mq";
static constexpr bip::open_or_create_t ooc;
static constexpr auto max_num_msgs = 10;
static constexpr auto max_msg_size = 10;
static auto log_impl = [start=C::now()](auto severity, auto const& ... args) {
    std::clog << severity << " at " << std::fixed << (C::now()-start)/1.ms << "ms ";
    (std::clog << ... << args) << std::endl;
};
static auto log_error = [](auto const& ... args) { log_impl("error", args...); };
static auto log_warn = [](auto const& ... args) { log_impl("warn", args...); };
static auto log_info = [](auto const& ... args) { log_impl("info", args...); };
int main() {
    try {
        bool first_removal = message_queue::remove(msg_queue_name);
        if (first_removal) {
            log_info("first removal - true"); // this log always prints
            bool second_removal = message_queue::remove(msg_queue_name);
            if (second_removal) {
                log_info("removal was again true"); // this also always prints
            } else {
                log_info("second removal - false");
            }
        } else {
            log_info("did not remove queue before creation");
        }
        log_info("attempting to initialize msg queue");
        message_queue mq(
            ooc, msg_queue_name, max_num_msgs,
            max_msg_size); // this is where it will fail (File exists)
        log_info("Start insertion");
    } catch (interprocess_exception& err) {
        log_error(err.what()); // File exists
        bool removal_after_failure = message_queue::remove(msg_queue_name);
        if (removal_after_failure) {
            log_info("Message queue was successfully removed"); // always logs
                                                                // here after
                                                                // every failure
        } else {
            log_warn("Message queue was NOT removed");
        }
    }
}
Printing, on Coliru:
info at 22.723521ms did not remove queue before creation
info at 22.879425ms attempting to initialize msg queue
error at 23.098989ms Function not implemented
warn at 23.153540ms Message queue was NOT removed
On my system:
info at 0.148484ms first removal - true
info at 0.210316ms second removal - false
info at 0.232181ms attempting to initialize msg queue
info at 0.299645ms Start insertion
info at 0.099407ms first removal - true
info at 0.173156ms second removal - false
info at 0.188026ms attempting to initialize msg queue
info at 0.257117ms Start insertion
Of course now your logic can be greatly simplified:
Live On Coliru
int main() {
    try {
        bool removal = message_queue::remove(msg_queue_name);
        log_info("attempting to initialize msg queue (removal:", removal, ")");
        message_queue mq(
            ooc, msg_queue_name, max_num_msgs,
            max_msg_size); // this is where it will fail (File exists)
        log_info("insertion");
    } catch (interprocess_exception const& err) {
        bool removal = message_queue::remove(msg_queue_name);
        log_info(err.what(), " (removal:", removal, ")");
    }
}
Printing, on Coliru:
info at 0.462333ms attempting to initialize msg queue (removal:false)
info at 0.653085ms Function not implemented (removal:false)
On my system:
info at 0.097283ms attempting to initialize msg queue (removal:true)
info at 0.239138ms insertion

App crashes when it takes too long to reply in a ZMQ REQ/REP pattern

I am writing a plugin that interfaces with a desktop application through a ZeroMQ REQ/REP request-reply communication archetype. I can currently receive a request, but the application seemingly crashes if a reply is not sent quick enough.
I receive the request on a spawned thread and put it in a queue. This queue is processed in another thread, in which the processing function is invoked by the application periodically.
The message is correctly being received and processed, but the response cannot be sent until the next iteration of the function, as I cannot get the data from the application until then.
When this function is conditioned to send the response on the next iteration, the application will crash. However, if I send fake data as the response soon after receiving the request, in the first iteration, the application will not crash.
Constructing the socket
zmq::socket_t socket(m_context, ZMQ_REP);
socket.bind("tcp://*:" + std::to_string(port));
Receiving the message in the spawned thread
void ZMQReceiverV2::receiveRequests() {
    nInfo(*m_logger) << "Preparing to receive requests";
    while (m_isReceiving) {
        zmq::message_t zmq_msg;
        bool ok = m_respSocket.recv(&zmq_msg, ZMQ_NOBLOCK);
        if (ok) {
            // msg_str will be a binary string
            std::string msg_str;
            msg_str.assign(static_cast<char *>(zmq_msg.data()), zmq_msg.size());
            nInfo(*m_logger) << "Received the message: " << msg_str;
            std::pair<std::string, std::string> pair("", msg_str);
            // adding to message queue
        }
    }
    nInfo(*m_logger) << "Done receiving requests";
}
Processing function on separate thread
void ZMQReceiverV2::exportFrameAvailable()
{
    // checking messages
    // if the queue is not empty
    if (!m_messages.empty()) {
        nInfo(*m_logger) << "Reading message in queue";
        smart_target::SMARTTargetCreateRequest id_msg;
        std::pair<std::string, std::string> pair = m_messages.front();
        std::string topic = pair.first;
        std::string msg_str = pair.second;
        // removing just read message
        //m_respSocket.send(zmq::message_t()); won't crash if I reply here in this invocation
    }
    // sending back the ID that has just been made, for it to be mapped
    if (timeToSendReply()) {
        sendReply(); // will crash, if I wait for this to be executed on next invocation
    }
}
My research suggests that there is no time limit for sending the response, so this apparent timing issue is strange.
Is there something that I am missing that will let me send the response on the second iteration of the processing function?
Revision 1:
I have edited my code so that the responding socket only ever exists on one thread. Since I need information from the processing function in order to reply, I created another queue, which is checked in the revised function running on its own thread.
void ZMQReceiverV2::receiveRequests() {
    zmq::socket_t socket = setupBindSocket(ZMQ_REP, 5557, "responder");
    nInfo(*m_logger) << "Preparing to receive requests";
    while (m_isReceiving) {
        zmq::message_t zmq_msg;
        bool ok = socket.recv(&zmq_msg, ZMQ_NOBLOCK);
        if (ok) {
            // does not crash if I call send helper here
            // msg_str will be a binary string
            std::string msg_str;
            msg_str.assign(static_cast<char *>(zmq_msg.data()), zmq_msg.size());
            NLogger::nInfo(*m_logger) << "Received the message: " << msg_str;
            std::pair<std::string, std::string> pair("", msg_str);
            // adding to message queue
        }
        if (!sendQueue.empty()) {
            sendEntityCreationMessage(socket, sendQueue.front());
        }
    }
    nInfo(*m_logger) << "Done receiving requests";
}
The function sendEntityCreationMessage() is a helper function that ultimately calls socket.send().
void ZMQReceiverV2::sendEntityCreationMessage(zmq::socket_t &socket, NUniqueID id) {
This code seems to be following the thread safety guidelines for sockets. Any suggestions?
Q : "Is there something that I am missing"
Yes. The ZeroMQ evangelisation, called the Zen-of-Zero, has always promoted the same rules: never try to share a Socket-instance, never try to block, and never expect the world to act as one wishes.
This said, avoid touching the same Socket-instance from any non-local thread, except the one that has instantiated and owns the socket.
Last, but not least, the REQ/REP-Scalable Formal Communication Pattern Archetype is prone to fall into a deadlock, as a mandatory two-step dance must be obeyed - where one must keep the alternating sequence of calling .send()-.recv()-.send()-.recv()-.send()-...-methods, otherwise the principally distributed-system tandem of Finite State Automata (FSA) will unsalvageably end up in a mutual self-deadlock state of the dFSA.
In case one is planning to professionally build on ZeroMQ, the best next step is to re-read the fabulous Pieter HINTJENS' book "Code Connected: Volume 1". A piece of a hard read, yet definitely worth one's time, sweat, tears & efforts put in.
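The mandatory two-step dance can be pictured with a toy state machine for the REP side. This is a sketch only: ToyRepSocket is a hypothetical class and not the libzmq or cppzmq API. In real ZeroMQ, an out-of-sequence call on a REQ or REP socket is rejected with the EFSM error rather than a plain false.

```cpp
#include <cassert>

// Toy model of the REP side's state machine; NOT the libzmq API.
class ToyRepSocket {
public:
    // A request may only be received when no reply is outstanding.
    bool recv() {
        if (awaiting_reply_) return false;  // out of sequence
        awaiting_reply_ = true;
        return true;
    }

    // A reply may only be sent for a request that has been received.
    bool send() {
        if (!awaiting_reply_) return false;  // out of sequence
        awaiting_reply_ = false;
        return true;
    }

private:
    bool awaiting_reply_ = false;
};
```

In this model, a second recv() before the pending reply has been sent is refused, and so is a send() with no request outstanding; the REQ side obeys the mirror-image rule.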

Can't find bug in my threading/atomic values code

I am using CodeBlocks with MinGW compiler and wxWidgets library.
I am writing a program that reads some data from a microcontroller by sending messages (using an index and subindex) and receiving response messages containing that data.
My plan was to send messages one by one and wait for each response, using atomic int variables to check when the response message has arrived.
This is my function for sending a message:
typedef std::chrono::high_resolution_clock Clock;
void sendSDO(int index, int subindex)
{
    int nSent = 0;
    canOpenClient->SDORead(index, subindex);
    auto start = Clock::now();
    while ((atomic_index.load() != 0) && (atomic_subindex.load() != 0))
    {
        auto t = chrono::duration_cast<chrono::milliseconds>(Clock::now() - start);
        if(t.count() > 20)
        {
            if (nSent > 5)
            {
                MainFrame->printTxt("[LOG] response not received\n");
            }
            canOpenClient->SDORead(index, subindex);
            start = Clock::now();
        }
    }
}
In pseudocode: set the atomic ints to the index and subindex of the value I want to read from the microcontroller, then send the message with SDORead(); if no response is received within 20 ms, send the message again, up to 5 times.
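For reference, the send, wait and retry scheme described above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the code from the question: it uses a single std::atomic<bool> flag instead of the two index variables, and response_pending, send_with_retry and the send callback are hypothetical names. In a real program the flag would be cleared by the receiving callback on the other thread.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical flag: set before sending, cleared when the response arrives.
std::atomic<bool> response_pending{false};

// Sends once, then resends after every `timeout` without a response.
// Returns the number of sends used, or -1 after giving up.
int send_with_retry(const std::function<void()>& send,
                    int max_retries,
                    std::chrono::milliseconds timeout) {
    using Clock = std::chrono::steady_clock;
    response_pending.store(true);
    send();
    int sent = 1;
    auto start = Clock::now();
    while (response_pending.load()) {
        if (Clock::now() - start > timeout) {
            if (sent > max_retries) return -1;  // give up
            send();                             // resend and restart the timer
            ++sent;
            start = Clock::now();
        }
        std::this_thread::yield();  // let the receiving thread run
    }
    return sent;
}
```

The loop condition depends on one flag only, so it does not matter whether a valid index or subindex happens to be 0.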
For receiving messages, I have a separate thread with a callback function which is called when I get a response message from the controller:
void notifyEvent(unsigned char ev_type)
{
    SDO_msg_t msg;
    msg = canOpenClient->Cmd_CustomMessageGet(); //get response message
    if(ev_type == CO_EVENT_SDO_READ)
    {
        if ((msg.index == atomic_index.load()) && (msg.subindex == atomic_subindex.load()))
        {
            //does stuff, like saves message data to set container
            if (message data not in container) // pseudocode
                printf("not in container!");
        }
    }
}
Here I set the same atomic int values to 0, when the correct response message is received, and save response message data
I also have variables nSentMessages and nReceivedMessages, which hold the number of messages sent and messages received. I check at the end if these values are the same. Normally, I wouldn't need this (since I wait for every response), I put it there as an extra safety measure.
Now onto the problem:
1) My problem is in the callback function notifyEvent(), where I supposedly save the response message data to a container, yet I still sometimes get "not in container!" from that if statement, and I don't know why. (My container is just a normal set, set<EDSobject, cmp> container; it's not atomic or anything, since I know there won't be simultaneous reads/writes to it from different threads.)
2) If you check my function sendSDO(), there is a line Sleep(10). The program works ok with it, but if I remove it, the program returns a different value for nSentMessages and nReceivedMessages - 576 and 575. This happens every time I run the program and I don't understand why.

How to edit SIM800l library to ensure that a call is established

I use a SIM800l to make calls from an Arduino UNO using AT commands. With this library I place calls through the gprsTest.callUp(number) function. The problem is that it returns true even if the number is wrong or there is no credit.
This part of the code from the GPRS_Shield_Arduino.cpp library makes it clear why this happens: it doesn't check the response to ATDnumberhere;
bool GPRS::callUp(char *number)
{
    //char cmd[24];
    if(!sim900_check_with_cmd("AT+COLP=1\r\n","OK\r\n",CMD)) {
        return false;
    }
    // TODO: remove sprintf to save memory ???
    //sprintf(cmd,"ATD%s;\r\n", number);
    return true;
}
The return of ATDnumberhere; on software serial communication is:
If number is wrong
If there is no credit
MO CONNECTED //instant response
+COLP: "003069XXXXXXXX",129,"",0,"" // after 3 sec
If it is calling and no answer
MO RING //instant response, it is ringing
NO ANSWER // after some sec
If it is calling and hang up
MO RING //instant response
NO CARRIER // after some sec
If the receiver has not carrier
If it is calling , answer and hang up
+COLP: "69XXXXXXXX",129,"",0,""
The question is how to use different returns for every case by this function gprsTest.callUp(number) , or at least how to return true if it is ringing ?
At first glance this library code seems better than the worst I have seen, but it still has some issues. The most severe is its final result code handling.
The sim900_check_with_cmd function is conceptually almost there, however only checking for OK is in no way acceptable. It should check for every single possible final result code the modem might send.
From your output examples you have the following final result codes
but there exist a few more as well. You can look at the code for atinout for an example of an is_final_result_code function (you can also compare to isFinalResponseError and isFinalResponseSuccess [1] in ST-Ericsson's U300 RIL).
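As an illustration of what such a check can look like, here is a minimal sketch. It is not the atinout implementation; the function body and the exact list of codes are assumptions based on the final result codes defined in V.250 and the extended error forms from the 3GPP AT command set, so check the modem's own manual before relying on it.

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Returns true if `line` is a final result code. The list below is an
// assumption of what a modem may emit, not taken from any library.
bool is_final_result_code(std::string_view line) {
    // Strip trailing CR/LF so "OK\r\n" and "OK" compare equal.
    while (!line.empty() && (line.back() == '\r' || line.back() == '\n'))
        line.remove_suffix(1);

    static constexpr std::array<std::string_view, 6> exact = {
        "OK", "ERROR", "NO CARRIER", "NO ANSWER", "NO DIALTONE", "BUSY"};
    for (auto code : exact)
        if (line == code) return true;

    // Extended errors carry a numeric or text suffix, so match by prefix.
    static constexpr std::array<std::string_view, 2> prefixes = {
        "+CME ERROR:", "+CMS ERROR:"};
    for (auto p : prefixes)
        if (line.substr(0, p.size()) == p) return true;

    return false;
}
```

Lines such as MO RING or +COLP: ... are not final result codes under this check, so a caller would keep reading until NO ANSWER, NO CARRIER, OK or an error arrives.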
The unconditional return true; at the end of GPRS::callUp is an error, but it might be deliberate due to lack of ideas for implementing a better API so that the calling client could check the intermediate result codes. But that is such a wrong way to do it.
The library really should do all the stateful command line invocation and final result code parsing with no exceptions. Just doing parts of that in the library and leaving some of it up to the client is just bad design.
When clients want to inspect or act on intermediate result codes or information text that comes between the command line and the final result code, the correct way to do it is to let the library "deframe" everything it receives from the modem into individual complete lines, and for everything that is not a final result code provide this to the client through a callback function.
The following is from an unfinished update to my atinout program:
bool send_commandline(
    const char *cmdline,
    const char *prefix,
    void (*handler)(const char *response_line, void *ptr),
    void *ptr,
    FILE *modem)
{
    int res;
    char response_line[1024];
    DEBUG(DEBUG_MODEM_WRITE, ">%s\n", cmdline);
    res = fputs(cmdline, modem);
    if (res < 0) {
        error(ERR "failed to send '%s' to modem (res = %d)", cmdline, res);
        return false;
    }
    /*
     * Adding a tiny delay here to avoid losing input data which
     * sometimes happens when immediately jumping into reading
     * responses from the modem.
     */
    do {
        const char *line;
        line = fgets(response_line, (int)sizeof(response_line), modem);
        if (line == NULL) {
            error(ERR "EOF from modem");
            return false;
        }
        DEBUG(DEBUG_MODEM_READ, "<%s\n", line);
        if (prefix[0] == '\0') {
            handler(response_line, ptr);
        } else if (STARTS_WITH(response_line, prefix)) {
            handler(response_line + strlen(prefix) + strlen(" "), ptr);
        }
    } while (! is_final_result(response_line));
    return strcmp(response_line, "OK\r\n") == 0;
}
You can use that as a basis for implementing proper handling. If you want to
get error responses out of the function, add an additional callback argument and change to
success = strcmp(response_line, "OK\r\n") == 0;
if (!success) {
    error_handler(response_line, ptr);
}
return success;
Tip: Read all of chapter 5 in the V.250 specification; it will teach you almost everything you need to know about command lines, result codes and response handling. For instance, a command line should be terminated with \r only, not \r\n.
1 Note that CONNECT is not a final result code, it is an intermediate result code, so the name isFinalResponseSuccess is strictly speaking not 100% correct.

strange behavior in concurrently executing a function for objects in queue

My program has a shared queue, and is largely divided into two parts:
one pushes instances of class request to the queue, and the other accesses multiple request objects in the queue and processes them. request is a very simple class (just for testing) with a string req field.
I am working on the second part, and in doing so, I want to keep one scheduling thread, and multiple (in my example, two) executing threads.
The reason I want to have a separate scheduling thread is to reduce the number of lock and unlock operation to access the queue by multiple executing threads.
I am using pthread library, and my scheduling and executing function look like the following:
void * sched(void* elem) {
    queue<request> *qr = static_cast<queue<request>*>(elem);
    pthread_t pt1, pt2;
    if(pthread_mutex_lock(&mut) == 0) {
        if(!qr->empty()) {
            int result1 = pthread_create(&pt1, NULL, execQueue, &(qr->front()));
            if (result1 != 0) cout << "error sched1" << endl;
            qr->pop();
        }
        if(!qr->empty()) {
            int result2 = pthread_create(&pt2, NULL, execQueue, &(qr->front()));
            if (result2 != 0) cout << "error sched2" << endl;
            qr->pop();
        }
        pthread_join(pt1, NULL);
        pthread_join(pt2, NULL);
        pthread_mutex_unlock(&mut);
    }
    return 0;
}
void * execQueue(void* elem) {
    request *r = static_cast<request*>(elem);
    cout << "req is: " << r->req << endl; // req is a string field
    return 0;
}
Simply, each of execQueue has one thread to be executed on, and just outputs a request passed to it through void* elem parameter.
sched is called in main(), with a thread, (in case you're wondering how, it is called in main() like below)
pthread_t schedpt;
int schresult = pthread_create(&schedpt, NULL, sched, &q);
if (schresult != 0) cout << "error sch" << endl;
pthread_join(schedpt, NULL);
and the sched function itself creates multiple (two here) executing threads, pops requests from the queue, and executes them by calling execQueue on multiple threads (pthread_create and then pthread_join).
The problem is the weird behavior by the program.
When I checked the size and the elements in the queue without creating threads and calling them on multiple threads, they were exactly what I expected.
However, when I ran the program with multiple threads, it prints out
1 items are in the queue.
2 items are in the queue.
req is:
req is: FIRST! �(x'�j|1��rj|p�rj|1����FIRST!�'�j|!�'�j|�'�j| P��(�(��(1���i|p��i|
with the last line constantly varying.
The desired output is
1 items are in the queue.
2 items are in the queue.
req is: FIRST
req is: FIRST
I guess either the way I call the execQueue on multiple threads, or the way I pop() is wrong, but I could not figure out the problem, nor could I find any source to refer to for a correct usage.
Please help me on this. Bear with me for clumsy use of pthread, as I am a beginner.
Your queue holds objects, not pointers to objects. You can take the address of the object at the front of the queue via operator &() as you do, but as soon as you pop the queue that object is gone and that address is no longer valid. Of course, sched doesn't care, but the execQueue function you sent that address to certainly does.
The most immediate fix for your code is this:
Change this:
pthread_create(&pt1, NULL, execQueue, &(qr->front()));
To this:
// send a dynamic *copy* of the front queue node to the thread
pthread_create(&pt1, NULL, execQueue, new request(qr->front()));
And your thread proc should be changed to this:
void * execQueue(void* elem)
{
    request *r = static_cast<request*>(elem);
    cout << "req is: " << r->req << endl; // req is a string field
    delete r;
    return nullptr;
}
That said, I can think of better ways to do this, but this should address your immediate problem, assuming your request object class is copy-constructible, and if it has dynamic members, follows the Rule Of Three.
And here's your mildly sanitized c++11 version just because I needed a simple test thingie for MSVC2013 installation :)
See it Live On Coliru
#include <iostream>
#include <thread>
#include <future>
#include <mutex>
#include <queue>
#include <string>
struct request { std::string req; };
std::queue<request> q;
std::mutex queue_mutex;
void execQueue(request r) {
    std::cout << "req is: " << r.req << std::endl; // req is a string field
}
bool sched(std::queue<request>& qr) {
    std::thread pt1, pt2;
    std::lock_guard<std::mutex> lk(queue_mutex);
    if (!qr.empty()) {
        pt1 = std::thread(&execQueue, std::move(qr.front()));
        qr.pop();
    }
    if (!qr.empty()) {
        pt2 = std::thread(&execQueue, std::move(qr.front()));
        qr.pop();
    }
    if (pt1.joinable()) pt1.join();
    if (pt2.joinable()) pt2.join();
    return true;
}
int main()
{
    auto fut = std::async(sched, std::ref(q));
    if (!fut.get())
        std::cout << "error" << std::endl;
}
Of course it doesn't actually do much now (because there's no tasks in the queue).