asio: best way to store a message to be broadcast

I want to make a buffer of characters, write to it using sprintf, then pass it to multiple calls of async_write() (i.e. distribute it to a set of clients). My question is what is the best data structure to use for this? If there are compromises then I guess the priorities for defining "best" would be:
fewer CPU cycles
code clarity
less memory usage
Here is what I have currently, that appears to work:
void broadcast(){
    char buf[512];
    sprintf(buf,"Hello %s","World!");
    boost::shared_ptr<std::string> msg(new std::string(buf));
    msg->append(1,0); //NUL byte at the end
    for(std::vector< boost::shared_ptr<client_session> >::iterator i=clients.begin();
        i!=clients.end();++i) (*i)->write(buf);
}
Then:
void client_session::write(boost::shared_ptr<std::string> msg){
    if(!socket->is_open())return;
    boost::asio::async_write(*socket,
        boost::asio::buffer(*msg),
        boost::bind(&client_session::handle_write, shared_from_this(),_1,_2,msg)
        );
}
NOTES:
Typical message size is going to be less than 64 bytes; the 512 buffer size is just paranoia.
I pass a NUL byte to mark the end of each message; this is part of the protocol.
msg has to out-live my first code snippet (an asio requirement), hence the use of a shared pointer.
I think I can do better than this on all my criteria. I wondered about using boost::shared_array? Or creating an asio::buffer (wrapped in a smart pointer) directly from my char buf[512]? But reading the docs on these and other choices left me overwhelmed with all the possibilities.
Also, in my current code I pass msg as a parameter to handle_write(), to ensure the smart pointer is not released until handle_write() is reached. That is required isn't it?
UPDATE: If you can argue that it is better overall, I'm open to replacing sprintf with a std::stringstream or similar. The point of the question is that I need to compose a message and then broadcast it, and I want to do this efficiently.
UPDATE #2 (Feb 26 2012): I appreciate the trouble people have gone to post answers, but I feel none of them has really answered the question. No-one has posted code showing a better way, nor given any numbers to support them. In fact I'm getting the impression that people think the current approach is as good as it gets.

First of all, note that you are passing your raw buffer (buf) instead of your message (msg) to the write function; I assume you did not mean to do that.
If you're planning to send plain-text messages, you could simply use std::string and std::stringstream to begin with; there is no need for fixed-size arrays.
If you need to do more binary/bytewise formatting, I would certainly start by replacing that fixed-size array with a vector of chars. In this case I also wouldn't take the roundtrip of converting it to a string first, but would construct the asio buffer directly from the byte vector. If you do not have to work with a predefined protocol, an even better solution is to use something like Protocol Buffers or Thrift or any viable alternative. That way you do not have to worry about things like endianness, repetition, variable-length items, or backwards compatibility.
The shared_ptr trick is indeed necessary: you do need to store the data referenced by the buffer somewhere until the buffer is consumed. Do not forget there are alternatives that could be clearer, like simply storing it in the client_session object itself. Whether this is feasible, however, depends a bit on how your messaging objects are constructed ;).
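As a minimal sketch of the plain-text route described above, the message could be composed with a string stream and handed out through a shared pointer. The helper name make_message is hypothetical, and std::shared_ptr (C++11) stands in for boost::shared_ptr; the NUL terminator follows the protocol described in the question.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper (not from the question): composes the text with a
// string stream, appends the protocol's NUL terminator, and returns the
// result behind a shared pointer so every client can reference one copy.
std::shared_ptr<const std::string> make_message(const std::string& name) {
    std::ostringstream out;
    out << "Hello " << name;
    std::string text = out.str();
    text.push_back('\0');  // each message ends with a NUL byte
    return std::make_shared<const std::string>(std::move(text));
}
```

The returned pointer can be passed to each client's write() as in the question; the string grows as needed, so no fixed 512-byte limit is involved.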

You could store a std::list<boost::shared_ptr<std::string> > in your client_session object, and have client_session::write() do a push_back() on it. I think that is cleverly avoiding the functionality of boost.asio, though.

As I understand it, you need to send the same messages to many clients. The implementation would be a bit more complicated.
I would recommend preparing the message as a boost::shared_ptr<std::string> (as @KillianDS recommended) to avoid the additional memory usage and the copy from your char buf[512]; (which is not safe in any case: you cannot be sure how your program will evolve in the future and whether this capacity will be sufficient in all cases).
Then push this message onto each client's internal std::queue. If no write is pending for that particular client (use a boolean flag to track this), take the message at the front of the queue and async_write it to the socket, passing the shared_ptr as a parameter to the completion handler (the functor that you pass to async_write). Once the completion handler is called, you can take the next message from the queue. The shared_ptr reference count will keep the message alive until the last client has successfully written it to its socket.
In addition, I would recommend limiting the maximum queue size, so that message creation slows down when the network is too slow.
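The queueing scheme above could be sketched roughly as follows. This is an illustration only: outgoing_queue, message_ptr and the start_write callback are hypothetical names, std::shared_ptr stands in for boost::shared_ptr, and the callback is where a real session would call boost::asio::async_write with boost::asio::buffer(*msg) and bind msg into its completion handler, as the question's code already does.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <utility>

using message_ptr = std::shared_ptr<const std::string>;

// Hypothetical per-client queue: at most one write is in flight, and the
// message being written stays at the front until its completion is reported.
class outgoing_queue {
public:
    using start_write_fn = std::function<void(const message_ptr&)>;

    outgoing_queue(std::size_t max_pending, start_write_fn start_write)
        : max_pending_(max_pending), start_write_(std::move(start_write)) {}

    // Returns false when the queue is full, so the producer can slow down.
    bool push(message_ptr msg) {
        if (pending_.size() >= max_pending_) return false;
        pending_.push_back(std::move(msg));
        if (!write_in_progress_) start_next();
        return true;
    }

    // Call this from the write completion handler.
    void on_write_complete() {
        if (pending_.empty()) return;
        pending_.pop_front();        // drops this client's reference
        write_in_progress_ = false;
        if (!pending_.empty()) start_next();
    }

    std::size_t size() const { return pending_.size(); }

private:
    void start_next() {
        write_in_progress_ = true;
        start_write_(pending_.front());
    }

    std::size_t max_pending_;
    start_write_fn start_write_;
    std::deque<message_ptr> pending_;
    bool write_in_progress_ = false;
};
```

A broadcast would then create one message_ptr and call push() on every client's queue. The sketch is not thread-safe; it assumes all calls for one client happen on the same thread or strand.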
EDIT
Usually sprintf is more efficient, at the cost of safety. If performance is critical and std::stringstream is a bottleneck, you can still use sprintf with std::string:
std::string buf(512, '\0');
sprintf(&buf[0],"Hello %s","World!");
Please note that C++03 does not guarantee that std::string stores its data in a contiguous memory block, unlike std::vector; C++11 does guarantee it. In practice, all popular implementations of std::string use contiguous memory anyway. Alternatively, you can use std::vector<char> in the example above.
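One detail the two-line example leaves open is that the string keeps its full 512-character size after formatting. A possible variant, sketched here with snprintf (C++11) and a hypothetical function name format_greeting, trims the string to the formatted text plus the single NUL byte that the question's protocol requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats into a std::string's own storage, then
// shrinks the string to the text plus one trailing NUL byte.
std::string format_greeting(const char* name) {
    std::string buf(512, '\0');
    // snprintf never writes more than buf.size() bytes, terminator included.
    const int n = std::snprintf(&buf[0], buf.size(), "Hello %s", name);
    assert(n >= 0);  // a negative result signals an encoding error
    std::size_t len = static_cast<std::size_t>(n);
    if (len >= buf.size()) len = buf.size() - 1;  // output was truncated
    buf.resize(len + 1);  // keep exactly one NUL after the text
    return buf;
}
```

For "World!" this yields a 13-byte string: the 12 characters of "Hello World!" followed by the NUL terminator.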

Related

std::string vs. byte buffer (difference in c++)

I have a project where I transfer data between client and server using boost.asio sockets. Once one side of the connection receives data, it converts it into a std::vector of std::strings, which then gets passed on to the actual recipient object of the data via previously defined "callback" functions. That works fine so far, except that at this point I am using functions like atoi() and to_string to convert data types other than strings into a sendable format and back. This method is of course a bit wasteful in terms of network usage (especially when transferring bigger amounts of data than just single ints and floats). Therefore I'd like to serialize and deserialize the data. Since, effectively, any serialisation method will produce a byte array or buffer, it would be convenient for me to just use std::string instead. Is there any disadvantage to doing that? I do not see why there should be one, since strings should be nothing more than byte arrays.

In terms of functionality, there's no real difference.
Both for performance reasons and for code clarity reasons, however, I would recommend using std::vector<uint8_t> instead, as it makes it far more clear to anyone maintaining the code that it's a sequence of bytes, not a String.

You should use std::string when you work with strings; when you work with a binary blob, you are better off with std::vector<uint8_t>. There are many benefits:
your intention is clear, so the code is less error prone
you would not pass your binary buffer by mistake to a function that expects a std::string
you can overload operator<< on std::ostream for this type to print the blob in a proper format (usually a hex dump). It is very unlikely that you would want to print a binary blob as a string.
There could be more. The only benefit of std::string that I can see is that you do not need a typedef.
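To illustrate the hex-dump point, here is a small sketch. It wraps the vector in a hypothetical struct named blob rather than using a typedef, so that the operator<< overload applies only to this type; that wrapper is an assumption of the example, not something the answer prescribes.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical wrapper type: a distinct type for binary data, so it cannot
// be mistaken for text and can carry its own stream output.
struct blob {
    std::vector<std::uint8_t> bytes;
};

// Prints the bytes as space-separated two-digit hex values.
std::ostream& operator<<(std::ostream& os, const blob& b) {
    std::ostringstream hex;  // separate stream: caller's format flags stay untouched
    hex << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < b.bytes.size(); ++i) {
        if (i != 0) hex << ' ';
        hex << std::setw(2) << static_cast<unsigned>(b.bytes[i]);
    }
    return os << hex.str();
}
```

Streaming blob{{0x00, 0x0f, 0xff}} to an output stream writes "00 0f ff".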

You're right. Strings are nothing more than byte arrays. std::string is just a convenient way to manage the buffer array that represents the string. That's it!
There's no disadvantage of using std::string unless you are working on something REALLY REALLY performance critical, like a kernel, for example... then working with std::string would have a considerable overhead. Besides that, feel free to use it.
--
Behind the scenes, a std::string needs to do a bunch of checks on the state of the string in order to decide whether it uses the small-string optimization or not. Today pretty much all standard library implementations use a small-string optimization. They all use different techniques, but basically the string needs to test bit flags that tell whether its characters are stored inside the string object itself or on the heap. This overhead doesn't exist if you use a plain char[]. But again, unless you are working on something REALLY critical, like a kernel, you won't notice anything, and std::string is much more convenient.
Again, this is just ONE of the things that happens under the hood, just as an example to show the difference of them.

Depending on how often you're firing network messages, std::string should be fine. It's a convenience class that handles a lot of char work for you. If you have a lot of data to push through, though, it might be worth using a char array directly and converting it to bytes, just to minimise the extra overhead std::string has.
Edit: if someone could comment and point out why you think my answer is bad, that'd be great and help me learn too.

What is the best way to implement a serial buffer using vector<char>?

I need to read data from a serial device and put it into a buffer to be consumed by another thread. Basically, I want to achieve this:
while(!exit){
// read from fd and push into the vector<char> buffer
}
And do it the right way in C++. I know how to get this done in C, and I'd really appreciate it if someone could point me in the right direction.
From what I've found so far, people have been suggesting:
read(fd, &vector[0], vector.size());
But, I'm not convinced. Especially since modifying the &vector[0] directly doesn't update size() (or does it?) and seems like an indirect way to modify the underlying array. I'd like to avoid using open() and read() as well, if I could help, as they aren't really C++. Some form of istream would be awesome here!
Also, I couldn't find any examples of how to neatly "pop" the data from this vector when the data needs to be consumed from the other thread. I believe, and I'm certainly not 100% sure about this, that if there's only one writer thread and one reader thread for this vector, I wouldn't need any special code for thread safety. Please correct me if I'm wrong.
If it matters at all, the data in the vector is binary.
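On the size() concern raised above: writing through &vector[0] indeed does not change size(), so the usual pattern is to resize first and trim afterwards. The following is a rough sketch using POSIX read(); the function name read_some and the default chunk size are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: appends up to `chunk` bytes read from `fd` to the end
// of `buffer` and returns whatever read() returned (-1 on error, 0 on EOF).
ssize_t read_some(int fd, std::vector<char>& buffer, std::size_t chunk = 256) {
    const std::size_t old_size = buffer.size();
    buffer.resize(old_size + chunk);  // make room so the write stays inside size()
    const ssize_t n = ::read(fd, buffer.data() + old_size, chunk);
    // Trim back to the bytes that actually arrived (none on error or EOF).
    buffer.resize(old_size + (n > 0 ? static_cast<std::size_t>(n) : 0));
    return n;
}
```

The sketch covers only the reading side. std::vector is not safe for concurrent access, even with a single writer and a single reader, so handing the data to the consumer thread would still need a mutex or a similar mechanism.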

In my experience, I've always used a fixed size array of uint8_t for serial communications. This provides a faster access than going through a vector; and most serial I/O has been time sensitive.
A fixed size means no time spent resizing.

Which dynamic container type to use?

I'm writing code for a router (aka gateway), and as I'm receiving and sending packets I need a type of container that can support the logic of a router. When receiving a packet, I want to place it at the end of the dynamic container (from here on referred to as DC). When taking a packet out of the DC for processing, I want to take it from the front of the DC.
Any suggestion on which one to use?
I've heard that a vector would be a good idea, but I'm not quite sure whether vectors are dynamic.
EDIT: The type of element that it should contain is a raw packet of type "unsigned char *". How would I write the code for the DC to contain such a type?

std::deque<unsigned char *> is the obvious choice here, since it supports efficient FIFO semantics (use push_back and pop_front, or push_front and pop_back, the performance should be the same).
In my experience the std::queue (which is a container adapter normally built over std::deque) is not worth the effort, it only restricts the interface without adding anything useful.

For a router, you probably should use a fixed size custom container (probably based around std::array or a C array). You can then introduce some logic to allow it to be used as a circular buffer. The fixed size is extremely important because you need to deal with the scenario where packets are coming in faster than you can send them off. When you reach your size limit, you then flow off.
With dynamically resizable containers, you may end up running out of memory or introducing unacceptable amounts of latency into the system.
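A fixed-size circular buffer of the kind described here might look like the sketch below. The class name ring_buffer and its interface are hypothetical; push() simply reports failure when the buffer is full, leaving it to the caller to drop the packet or apply flow control.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-capacity FIFO built on std::array. No allocation
// happens after construction, so the memory footprint is bounded.
template <typename T, std::size_t N>
class ring_buffer {
public:
    // Returns false when the buffer is full; the element is not stored.
    bool push(const T& value) {
        if (count_ == N) return false;
        items_[(head_ + count_) % N] = value;  // slot just past the newest element
        ++count_;
        return true;
    }

    // Copies the oldest element into `out`; returns false when empty.
    bool pop(T& out) {
        if (count_ == 0) return false;
        out = items_[head_];
        head_ = (head_ + 1) % N;  // wrap around at the end of the array
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::array<T, N> items_{};
    std::size_t head_ = 0;   // index of the oldest element
    std::size_t count_ = 0;  // number of stored elements
};
```

For raw packets, T could be unsigned char* as in the question, or a small struct holding a pointer and a length.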

You can use std::queue. You insert elements at the end using push() and remove elements from the front using pop(). front() returns the front element.
To store unsigned char* elements, you'd declare a queue like this:
std::queue<unsigned char*> packetQueue;

Output binary buffer with STL

I'm trying to use something that could best be described as a binary output queue. In short, one thread will fill a queue with binary data and another will pop this data from the queue, sending it to a client socket.
What's the best way to do this with STL? I'm looking for something like std::queue but for many items at a time.
Thanks

What does "binary data" mean? Just memory buffers? Do you want to be able to push/pop one buffer at a time? Then you should wrap a buffer in a class, or use std::vector<char>, and push/pop those onto a std::deque.

I've needed this sort of thing for a network communications system in a multi-threaded environment.
In my case I just wrapped std::queue with an object that handled locking (std::queue is not thread-safe, generally speaking). The objects in the queue were just very lightweight wrappers over char*-style arrays.
Those wrappers also provided the following member functions which I find extremely useful.
insertByte(unsigned int location, char value)
insertWord(unsigned int location, int value)
insertLong(unsigned int location, long value)
getByte/Word/Long(unsigned int location)
These were particularly useful in this context, since the word and long values had to be byteswapped, and I could isolate that issue to the class that actually handled it at the end.
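A wrapper along those lines could be sketched as follows. Everything here is an assumption made for illustration: the class name packet, a 16-bit "word", a 32-bit "long", and big-endian (network) byte order on the wire. The shifts work on values rather than memory, so the result does not depend on the host's endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packet wrapper: stores multi-byte values in big-endian order
// so callers never have to byteswap by hand.
class packet {
public:
    explicit packet(std::size_t size) : bytes_(size, 0) {}

    void insert_byte(std::size_t location, std::uint8_t value) {
        assert(location < bytes_.size());
        bytes_[location] = value;
    }
    void insert_word(std::size_t location, std::uint16_t value) {
        assert(location + 2 <= bytes_.size());
        bytes_[location] = static_cast<std::uint8_t>(value >> 8);
        bytes_[location + 1] = static_cast<std::uint8_t>(value & 0xff);
    }
    void insert_long(std::size_t location, std::uint32_t value) {
        assert(location + 4 <= bytes_.size());
        for (std::size_t i = 0; i < 4; ++i)  // most significant byte first
            bytes_[location + i] = static_cast<std::uint8_t>((value >> (24 - 8 * i)) & 0xff);
    }
    std::uint16_t get_word(std::size_t location) const {
        assert(location + 2 <= bytes_.size());
        return static_cast<std::uint16_t>((bytes_[location] << 8) | bytes_[location + 1]);
    }
    std::uint32_t get_long(std::size_t location) const {
        assert(location + 4 <= bytes_.size());
        std::uint32_t value = 0;
        for (std::size_t i = 0; i < 4; ++i)
            value = (value << 8) | bytes_[location + i];
        return value;
    }
    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
};
```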
There were some slightly strange things we were doing with "larger than 4 byte" chunks of the binary data, which I thought at the time would prevent us from using std::vector, although these days I would just use it and play around with &vector[x].
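The locking wrapper around std::queue mentioned at the start of this answer might look roughly like this. The name locked_queue and the try_pop interface are assumptions; a production version would probably add a condition variable so the consumer can block instead of polling.

```cpp
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical wrapper: every access to the underlying std::queue happens
// while holding the mutex, so one thread can push while another pops.
template <typename T>
class locked_queue {
public:
    void push(T item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(std::move(item));
    }

    // Moves the oldest item into `out`; returns false if the queue is empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<T> items_;
};
```

With T set to std::vector<char>, each queue entry is one complete buffer, which gives the "many items at a time" behaviour the question asks for.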

Thread Local Memory, Using std::string's internal buffer for c-style Scratch Memory

I am using Protocol Buffers and OpenSSL to generate HMACs and then CBC-encrypt the two fields to obfuscate the session cookies, similar to Kerberos tokens.
Protocol Buffers' API communicates with std::strings and has a buffer caching mechanism; I exploit the caching mechanism, for successive calls in the same thread, by placing it in thread local memory; additionally, the OpenSSL HMAC and EVP CTXs are also placed in the same thread local memory structure (see this question for some detail on why I use thread local memory and the massive speedup it enables even with a single thread).
The generation and deserialization of these cookie strings ("my algorithms") use intermediary void *s and std::strings, and since Protocol Buffers has an internal memory retention mechanism, I want the same characteristics for "my algorithms".
So how do I implement a common scratch memory? I don't know much about the rdbuf (streambuf, stringbuf?) of the std::string object. I would presumably need to grow it to the lowest common size ever encountered during the execution of "my algorithms". Thoughts?
My question I guess would be: " is the internal buffer of a string re-usable, and if so, how ?"
Edit (new question):
It seems, upon reflection after Vlad's post, that I do need a std::string as well as a void * C-style scratch buffer. My question would then be: do popular STL string implementations retain memory when they don't need it? (My needs will probably stay between 128 bytes and 10 KB.)

You shouldn't expect the whole content of your std::string to reside in TLS, since std::string makes allocations and reallocations for data on its own. A simple idea would be to allocate a structure on heap and store a pointer to it in the TLS.
Edit:
AFAIK rdbuf is a feature of streams, not of string (see here and here).
Edit:
I would suggest using std::vector instead of string, it should be contiguous. Again, it's perhaps better to put just a pointer to the vector into TLS. The comments to the same article say that the standard requires even string to be contiguous, starting from &(str[0]) char.
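As a sketch of the reuse question, a per-thread scratch string could be kept in a thread_local variable (C++11) and cleared between uses. The function name scratch_buffer is hypothetical. In practice the mainstream standard library implementations keep the allocation when clear() is called, but this is an observation about implementations rather than a guarantee, so the sketch re-reserves defensively.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns this thread's scratch string, emptied and
// with at least `min_capacity` characters of storage available.
std::string& scratch_buffer(std::size_t min_capacity) {
    thread_local std::string buffer;  // one instance per thread
    buffer.clear();                   // length becomes 0; storage is normally kept
    if (buffer.capacity() < min_capacity)
        buffer.reserve(min_capacity); // grows only when the request is larger
    return buffer;
}
```

After a resize(), &buffer[0] can be passed to C-style APIs that expect a writable char array, since C++11 requires the characters to be contiguous. The string object itself lives in thread-local storage, while its character data is still heap-allocated, as noted above.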
