Design pattern for handling packets - c++

I'm writing a TCP Network for a game project.
When a packet comes in, the first byte of the packet determines its handling type. The packet should then be forwarded to a method that handles it based on that type.
I could have a bunch of logic cases that then call a method based on the packet type, but I wanted to see what better design patterns I could implement to reduce code duplication.
I've already thought about using the subscriber/notifier pattern. I'm not fully against it, but I feel I'd end up with a bunch of Subscribe(packetType, funcReference) calls, so perhaps it isn't ideal either.

Having a big switch statement that handles each packet type is perfectly acceptable. Even where there are multiple resolvers for a given packet type, you can just trigger the subscribed callbacks in that case.
In my experience this is one of those cases where people (myself included, in the past) will over-complicate for the sake of what feels like "better" code. Switch then handle is very easy to grok at first glance, and easy to extend.

Since your packet type marker is only a byte, you can make an array of pointers to handling functions with size of 256 elements. Initialize it once upon program start.
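A minimal sketch of that 256-entry table, assuming a handler takes the payload that follows the type byte. The names PacketHandler, PacketDispatcher, set_handler and dispatch are made up for illustration and are not from the question.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical handler signature: receives the bytes after the type byte.
using PacketHandler = void (*)(const std::uint8_t* payload, std::size_t size);

class PacketDispatcher {
public:
    // Register the handler for one packet type; call once at program start.
    void set_handler(std::uint8_t type, PacketHandler handler) {
        handlers_[type] = handler;
    }

    // Route a raw packet by its first byte. Returns false for an empty
    // packet or for a type that has no registered handler.
    bool dispatch(const std::uint8_t* packet, std::size_t size) const {
        if (packet == nullptr || size == 0) return false;
        const PacketHandler handler = handlers_[packet[0]];
        if (handler == nullptr) return false;
        handler(packet + 1, size - 1);
        return true;
    }

private:
    // One slot per possible type byte; value-initialised to nullptr.
    std::array<PacketHandler, 256> handlers_{};
};
```

Indexing by the type byte can never go out of range, and unregistered types are rejected rather than crashing. The table still needs one set_handler call per packet type, much like the Subscribe calls mentioned in the question.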

Related

How can I recv TCP socket data in one package without dividing

I created a TCP socket, and it works fine when sending small amounts of data: no fragmentation, all the data arrives in one packet. But as the data gets bigger, the TCP packet is split into pieces, which is really annoying. Is there an option I can set on the socket so that it automatically reassembles the pieces into one packet for me?
It's a byte stream. All the bytes will arrive correctly and in the right order, but not necessarily when you want them. If you need to send anything more complex than one byte, you need another protocol on top of TCP. That's why there are all those other TCP/IP protocols like HTTP, SMTP etc.
No, there is not. There are even situations where you might receive just 1 byte.
Consider using higher level messaging libraries like ZMQ. It handles all the message packing and unpacking for you.
TCP provides you with a reliable bi-directional byte stream. It takes care of sequencing, transport-layer packetization, retransmission, and flow control. Decades of research went into optimizing its performance. Pretty nifty. The small price you pay for all this convenience is that you have to read and write the stream in a loop: when receiving, you watch for a complete application-protocol message you can process, and when sending, you keep writing until the socket has accepted every byte.
Welcome to socket programming!
I'll chime in here and say that there's pretty much nothing you can do to solve your issue without adding extra dependencies on libraries which handle application protocols for you. There are some lower-level message packing libraries (Google's Protocol Buffers, among others) which may help.
It's probably most beneficial to get used to reading and writing TCP data in a loop. It's proven and very portable, even if you pay a small price in actually writing the streaming codecs yourself.
Try it a few times. It's a useful experience which you can re-use, and it's really not as difficult and annoying once you get the hang of it (like anything else, really).
Furthermore, it's fairly easy to unit-test (rather than dealing with esoteric libraries and uncommon protocols with badly or sparsely documented options).
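As a rough illustration of the kind of streaming codec meant here, the sketch below accumulates whatever each recv() call returns and hands back complete messages only. The wire format (a 2-byte big-endian length followed by the payload) and the MessageFramer name are assumptions for the example, not something the thread specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed framing: 2-byte big-endian payload length, then the payload.
constexpr std::size_t kHeaderSize = 2;

class MessageFramer {
public:
    // Append whatever the last recv() returned, however small or large.
    void feed(const std::uint8_t* data, std::size_t size) {
        buffer_.insert(buffer_.end(), data, data + size);
    }

    // Extract one complete message into `out`. Returns false when the
    // buffered bytes do not yet contain a whole message.
    bool next(std::string& out) {
        if (buffer_.size() < kHeaderSize) return false;
        const std::size_t length =
            (static_cast<std::size_t>(buffer_[0]) << 8) | buffer_[1];
        const std::size_t total = kHeaderSize + length;
        if (buffer_.size() < total) return false;
        const auto first = buffer_.begin();
        out.assign(first + static_cast<std::ptrdiff_t>(kHeaderSize),
                   first + static_cast<std::ptrdiff_t>(total));
        buffer_.erase(first, first + static_cast<std::ptrdiff_t>(total));
        return true;
    }

private:
    std::vector<std::uint8_t> buffer_;  // bytes received but not yet consumed
};
```

Because the framer never touches a socket, it can be unit-tested by feeding it a message split at arbitrary points, which is the easy testability the answer refers to.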
You can optimize socket reads to return larger chunks, on platforms that support it, by setting the low-water mark using setsockopt() and SO_RCVLOWAT. But you will still have to handle the possibility of getting fewer bytes than the watermark.
I think you want SOCK_SEQPACKET (or possibly SOCK_RDM). See socket(2).

C++ IPC Communication

I am in a dilemma about which of the scenarios below to choose, and would appreciate expert help.
Scenario: There is TCP/IP communication between two processes running on two boxes.
Communication Method 1: Stream-based communication on the socket. The receiver receives the entire byte buffer, interprets the first few fixed bytes as a header, deserializes it to learn the message length, takes a message of that length and deserializes it, then proceeds to the next message header, and so on.
Communication Method 2: Put all the messages in a vector that resides in a class object. Serialize the class object in one go and send it to the receiver. The receiver deserializes the class object and reads the vector elements one by one.
Please let me know which approach is more efficient, and if there is any other approach, please guide me.
Also, what are the pros and cons of class-based data transmission and structure-based data transmission, and which is suitable for which scenario?
Your question lacks some key details, and mixes different concerns, frustrating any attempt to provide a good answer.
Specifically, Method 2 mysteriously "serialises" and "deserialises" the object and contained vector without specifying any details of how that's being done. In practice, the details are of the kind alluded to in Method 1. So, 1 and 2 aren't alternatives unless you're choosing between using a serialisation library and doing it from scratch (in which case I'd say use the library as you're new to this and the library's more likely to get it right).
What I can say:
- At the TCP level, it's most efficient to read into a decent-sized block (given I tend to work on PC/server hardware, I'd just use 64k, though smaller may be enough to get the same kind of throughput) and have each read() or recv() read as much data from the socket as possible.
- After reading enough bytes (in however many read/recvs) to attempt some interpretation of the data, it's necessary to recognise the end of particular parts of the serialised input: sometimes that's implicit in the data type involved, other times it's communicated using some sentinel (e.g. a linefeed or NUL), and other times there can be a prefixed fixed-size "expect N bytes" header. This consideration often applies hierarchically to the stream of objects and nested sub-objects.
- The TCP read/recvs may deliver more data than was sent in any single request, so you may have 1 or more bytes at the end of the block assembled above that logically belong to the subsequent, still incomplete, message.
- The process of reading larger blocks and then accessing various fixed- and variable-sized elements inside the buffers is already supported by C++ iostreams, but you can roll your own if you want.
So, let me emphasise this: do NOT assume you will receive any more than 1 byte from any given read of the socket: if you have, say, a 20-byte header, you should loop reading until you either hit an error or have assembled all 20 bytes. Sending 20 bytes in a single write() or send() does not mean the 20 bytes will be presented to a single read() / recv(). TCP is a byte stream protocol, and you have to take arbitrary numbers of bytes as and when they're provided, waiting until you have enough data to interpret it. Similarly, be prepared to get more data than the client could write in a single write()/send().
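The "loop until you have all 20 bytes" rule can be sketched as follows. To keep the example independent of any particular socket API, the read itself is passed in as a callable with the same shape as recv(): it returns the number of bytes read, 0 when the peer has closed the connection, or a negative value on error. The name read_exact is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Keep calling `read_some` until exactly `wanted` bytes are in `out`.
// `read_some(buffer, max_bytes)` must behave like recv(): it returns the
// number of bytes delivered (possibly just 1), 0 on orderly close, or a
// negative value on error.
template <typename ReadSome>
bool read_exact(ReadSome&& read_some, std::uint8_t* out, std::size_t wanted) {
    std::size_t have = 0;
    while (have < wanted) {
        const long n = read_some(out + have, wanted - have);
        if (n <= 0) return false;  // connection closed or error
        have += static_cast<std::size_t>(n);
    }
    return true;
}
```

With a real socket the callable would typically be a small lambda wrapping recv(fd, buffer, max_bytes, 0). This sketch treats every failure the same way; production code would usually retry on EINTR and report the error code.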
Also pros and cons of class based data transmission and structure based data transmission and which is suitable for which scenario ?
These terms are completely bogus. Classes and structures are almost identical things in C++: mechanisms for grouping data and related functions (they differ only in how they, by default, expose the base classes and data members to client code). Either can have or not have member functions or support code that helps serialise and deserialise the data. For example, the simplest and most typical support is a pair of operator<< and/or operator>> streaming functions.
If you want to contrast this kind of streaming function with an ad-hoc "write a binary block, read a binary block" approach (perhaps from thinking of structs as being POD without support code), then I'd say prefer streaming functions where possible, starting with streaming to human-readable representations, as they'll make your system easier and quicker to develop, debug and support. Once you're really comfortable with that, if the runtime performance requires it, then optimise with a binary representation. If you write the serialisation code well, you won't notice much difference in performance between a cruder void*/#bytes model of data and proper per-member serialisation, but the latter can more easily support unusual cases: portability across systems with different size ints/longs, different byte ordering, intentional choices about shallow vs. deep copying of pointed-to data, etc.
I'd also recommend looking at the boost serialisation library. Even if you don't use it, it should give you a better understanding of how this kind of thing is reasonably implemented in C++.
Both methods are equivalent. In both, you must send a header with the message size and an identifier in order to deserialize. If you treat each message in the first option as a serialized 'class', you end up implementing the same code.
Another thing to keep in mind is message size: filling the TCP buffers makes communication more efficient. If the messages in your 1st method are very small, try to improve the ratio of payload to overhead by sending bigger messages, as in the 2nd option you describe.
Keep in mind that it's not safe to simply stream out a struct or class directly by interpreting it as a sequence of bytes, even if it's a simple POD: there are issues like endianness (which is unlikely to be a real-world problem for most of us) and structure alignment/padding (which is a potential problem).
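One way to sidestep both problems is to write each member out explicitly in a fixed byte order. The sketch below does that for a made-up PlayerState struct; the struct, its fields and the choice of big-endian order are assumptions for illustration only.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical message. In memory, most ABIs insert padding after `team`
// and after `ammo`, so sizeof(PlayerState) is usually larger than 7.
struct PlayerState {
    std::uint8_t  team;
    std::uint32_t health;
    std::uint16_t ammo;
};

// Append a 16-bit value in big-endian order.
void put_u16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
}

// Append a 32-bit value in big-endian order.
void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(v >> shift));
}

// Member-by-member encoding: always 7 bytes, no padding, fixed byte order.
std::vector<std::uint8_t> serialise(const PlayerState& s) {
    std::vector<std::uint8_t> out;
    out.push_back(s.team);
    put_u32(out, s.health);
    put_u16(out, s.ammo);
    return out;
}

// Inverse of serialise(); rejects input of the wrong length.
bool deserialise(const std::vector<std::uint8_t>& in, PlayerState& s) {
    if (in.size() != 7) return false;
    s.team = in[0];
    s.health = (static_cast<std::uint32_t>(in[1]) << 24) |
               (static_cast<std::uint32_t>(in[2]) << 16) |
               (static_cast<std::uint32_t>(in[3]) << 8) |
               static_cast<std::uint32_t>(in[4]);
    s.ammo = static_cast<std::uint16_t>((in[5] << 8) | in[6]);
    return true;
}
```

The encoded form is the same on every platform regardless of how the compiler lays out the struct, at the cost of a few lines of code per message type.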
C++ doesn't have any built-in serialization/deserialization, so you'll either have to roll your own or take a look at things like Boost.Serialization or Google's protobuf.
If it is not a homework or study project, there may be little point in fiddling with IPC at TCP stream level especially if that's not something one is familiar with.
Use a messaging library, like ØMQ to send and receive complete messages rather than streams of bytes.

Networking Method

Hey guys, I've noticed that when I send a complete packet (collect its data in a buffer and send) it is much slower than sending the packet byte by byte.
Will it be okay if I make an online game using this method?
Sounds like a Nagle-related problem.
You have to disable Nagle's algorithm for latency-sensitive applications. (See setsockopt, TCP_NODELAY.)
Explanation:
The TCP stack behaves differently for small chunks, trying to coalesce them, in sometimes surprising ways, into fewer IP datagrams. This is a performance optimization proposed by John Nagle (hence Nagle's algorithm). Keep in mind that with TCP_NODELAY enabled, each small send() is likely to go out as its own segment (and every send() call is a kernel-mode transition in any case), so you may wish to pack your data into chunks yourself, by means of memory copying, before feeding them into send() if performance is an issue for what you are doing.
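For reference, turning the option on looks roughly like this with POSIX sockets (on Winsock the call is the same in spirit, but the headers and the option-value pointer type differ). The helper name disable_nagle is made up for the example.

```cpp
#include <netinet/in.h>   // IPPROTO_TCP
#include <netinet/tcp.h>  // TCP_NODELAY
#include <sys/socket.h>   // setsockopt

// Disable Nagle's algorithm on a TCP socket (POSIX sockets assumed).
// Returns true on success, false if setsockopt() fails, for example
// because the descriptor is not a valid socket.
bool disable_nagle(int socket_fd) {
    const int enabled = 1;
    return setsockopt(socket_fd, IPPROTO_TCP, TCP_NODELAY,
                      &enabled, sizeof enabled) == 0;
}
```

This would typically be called once, right after the socket is created or accepted, and before any latency-sensitive traffic is sent.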
I think you need to define what are your measurement points (what exactly are you measuring). By the way is this TCP or UDP?
Anyway Winsock has its own internal buffers that you can modify by calls to setsockopt.
That sounds bizarre. There is much more overhead in sending data byte by byte. Your transport headers will far outweigh the payload! Not to mention O(n) send calls (where n is the number of bytes).
You're doing something wrong if that's what you experience.
Well, I didn't really measure anything. I'm pretty sure it has something to do with sending the data and not with collecting it.
I'm using C# on the server side and C++ on the client side. On the server side I wrapped the socket with a BinaryWriter and BinaryReader, and on the client I just used send and recv to send every byte.

How to delta encode a C/C++ struct for transmission via sockets

I need to send a C struct over the wire (using UDP sockets, and possibly XDR at some point) at a fairly high update rate of several kHz, which potentially causes lots of redundant and unnecessary traffic.
This is because some of the data in the struct may not have changed between updates, so I thought that delta-encoding the current C struct against the previous C struct would be a good idea, pretty much like a "diff".
But I am wondering, what's the best approach of doing something like this, ideally in a portable manner that also ensures that data integrity is maintained? Would it be possible to simply XOR the data and proceed like this?
Similarly, it would be important that the approach remains extensible enough, so that new fields can be added to the struct or reordered if necessary (padding), which sounds as if it'd require versioning information, as well.
Any ideas or pointers (are there existing libraries?) would be highly appreciated!
Thanks
EDIT: Thanks to everyone who provided an answer; the level of detail is really appreciated. I realize that I probably should not have mentioned UDP, because that is in fact not the main problem: there is already a corresponding protocol implemented on top of UDP that accounts for the mentioned difficulties. The question was really meant to be about feasible means of delta encoding a struct, and not so much about using UDP in particular as a transport mechanism.
UDP does not guarantee that a given packet was actually received, so encoding whatever you transmit as a "difference from last time" is problematic -- you can't know that your counterpart has the same idea as you about what the "last time" was. Essentially you'd have to build some overhead on top of UDP to check what packets have been received (tagging each packet with a unique ID) -- everybody who's tried to go this route will agree that more often than not you find yourself more or less duplicating the TCP streaming infrastructure on top of UDP... only, most likely, not as solid and well-developed (although admittedly sometimes you can take advantage of very special characteristics of your payloads in order to gain some modest advantage over plain good old TCP).
Does your transmission need to be one-way, sender to receiver? If that's the case (i.e., it's not acceptable for the receiver to send acknowledgments or retransmits) then there's really not much you can do along these lines. The one thing that comes to mind: if it's OK for the receiver to be out of sync for a while, then the sender could send two kinds of packets -- one with a complete picture of the current value of the struct, and an identifying unique tag, to be sent at least every (say) 5 minutes (so realistically the receiver may be out of sync for up to 15 minutes if it misses two of these "big packets"); one with just an update (diff) from the last "big packet", including the big packet's identifying unique tag and (e.g.) a run-length-encoded version of the XOR you mention.
Of course, once it has prepared the run-length-encoded version, the server will compare its size with the size of the whole struct, and only send the delta kind of packet if the savings are substantial; otherwise it might as well send the big packet a bit earlier than needed (gains in reliability). The receiver will keep track of the unique tag of the last big packet it has received and only apply deltas which pertain to it (this helps against missing packets and packets delivered out of order, depending on how sophisticated you want to make your client).
The need for versioning &c, depending on what exactly you mean (will senders and receivers with different ideas about how the struct's C layout should look need to communicate regularly? how do they handshake about what versions are known to both? etc), will add a whole further universe of complications, but that is really another question, and your core question as summarized in the title is already plenty big enough;-).
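A minimal sketch of the XOR-plus-run-length idea, treating each snapshot of the struct as a plain byte vector of equal size. The (count, value) run-length format and the function names are assumptions chosen for brevity, not a recommendation of a particular encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// XOR two equal-sized snapshots. Bytes that did not change become zero.
// XOR-ing the result with either input reproduces the other input.
Bytes xor_bytes(const Bytes& a, const Bytes& b) {
    assert(a.size() == b.size());
    Bytes out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = static_cast<std::uint8_t>(a[i] ^ b[i]);
    return out;
}

// Run-length encode as (count, value) pairs, with count in 1..255.
Bytes rle_encode(const Bytes& in) {
    Bytes out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 255) ++run;
        out.push_back(static_cast<std::uint8_t>(run));
        out.push_back(in[i]);
        i += run;
    }
    return out;
}

// Inverse of rle_encode().
Bytes rle_decode(const Bytes& in) {
    Bytes out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2)
        out.insert(out.end(), static_cast<std::size_t>(in[i]), in[i + 1]);
    return out;
}
```

The sender would compute rle_encode(xor_bytes(big_packet, current)) and compare its size with the full struct before deciding which kind of packet to send. The receiver recovers the current state with xor_bytes(big_packet, rle_decode(delta)). Note that for data with few repeated bytes this simple format can double the size, which is exactly why the size comparison matters.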
If you can afford occasional meta-messages from the receiver back to the sender (acks or requests to resend) then depending on the various numerical parameters in play you may design different strategies. I suspect acks would have to be pretty frequent to do much good, so a request to resend a big-packet (either a specifically identified one or "whatever you have that's freshest") may be the best meta-strategy to cull the options space (which otherwise threatens to explode;-). If so then the sender may be blissfully ignorant of whatever strategy the receiver is using to request big-packet-resends, and you can experiment on the receiver side with various such strategies without needing to redeploy the sender as well.
It's hard to offer much more help without some specifics, i.e., at least ballpark numbers for all the numerical parameters -- packet sizes, frequencies of sending, how long is it tolerable for the sender to be out of sync with the receiver, a bundle of network parameters, etc etc. But I hope this somewhat generic analysis and suggestions still help.
To delta encode:
1) Send "key frames" periodically (e.g. once a second). A key frame is a complete copy (rather than a delta) so that if you lose comms for any reason, you only lose a small amount of data before you can "acquire the signal" again. Use a simple packet header that allows you to detect the start of a packet and know what type of data it contains.
2) Calculate the delta from the previous packet and encode that in a compact form. By examining the type of data you are sending and the way it typically changes, you should be able to devise a pretty compact delta. However, you may need to check the size of the delta - in some cases it may not be an efficient encoding - if it's bigger than a key frame you can just send another key frame instead. You can also decide at this point whether your deltas are lossy or lossless.
3) Add a CRC check to the packet (search for CRC32). This will allow the receiver to verify that the packet has been received intact, allowing them to skip invalid packets.
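For step 3, a compact (table-free, so not the fastest) CRC-32 can be sketched as below. It uses the common reflected polynomial 0xEDB88320, the variant used by zlib and Ethernet. Appending the checksum as four big-endian bytes at the end of the packet is an assumption made for this example, as are the function names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and
// final XOR of 0xFFFFFFFF).
std::uint32_t crc32_ieee(const std::uint8_t* data, std::size_t size) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < size; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// Sender: append the CRC of the body as 4 big-endian bytes.
std::vector<std::uint8_t> with_crc(std::vector<std::uint8_t> body) {
    const std::uint32_t crc = crc32_ieee(body.data(), body.size());
    for (int shift = 24; shift >= 0; shift -= 8)
        body.push_back(static_cast<std::uint8_t>(crc >> shift));
    return body;
}

// Receiver: recompute the CRC and compare it with the trailing 4 bytes.
bool crc_ok(const std::vector<std::uint8_t>& packet) {
    if (packet.size() < 4) return false;
    const std::size_t n = packet.size() - 4;
    std::uint32_t stored = 0;
    for (std::size_t i = 0; i < 4; ++i) stored = (stored << 8) | packet[n + i];
    return stored == crc32_ieee(packet.data(), n);
}
```

A receiver would call crc_ok() on every incoming packet and silently drop the ones that fail, then wait for the next key frame if a delta was lost.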
NOTES:
Be careful about doing this over UDP: it gives no guarantee that your packets will arrive in the same order you sent them. Obviously a delta will only work if packets are in order. In this case, you will need to add some form of sequence ID to each packet (first packet is "1", second packet is "2" etc) so that you can detect out-of-order receiving. You may even need to keep a buffer of "n" packets in the receiver so that you can reassemble them in the correct order when you come to decode them (but of course, this could introduce some latency). You will probably also miss some packets over UDP, in which case you'll need to wait until the next key frame before you'll be able to "re-acquire the signal", so the key frames must be frequent enough to avoid catastrophic outages in your comms.
Consider using compression (e.g. zip etc). You may find a full packet can be built in a zip-friendly manner (e.g. rearrange data to group bytes that are likely to have similar values (especially zeros) together) and then compressed so well that it is smaller than an uncompressed delta, and you won't need to go to all the effort of deltas at all (and you won't have to worry about packet ordering etc).
edit
- Always use a version number (or packet type) in your packets so you can add new fields or change the delta encoding in the future! You'll need this for differentiating key/delta frames anyway.
I'm not convinced that delta encoding values on UDP - which is inherently unreliable and out of order - is going to be particularly easy. Instead, I'd send an ID of the field which has changed and its current value. That also doesn't require anything to change if you want to add extra fields to the data structure you're sending. If you want a standard way of doing this, look at SNMP; that may be something you can drop in, or it may be a bit baggy for you (it qualifies the field names globally and uses ASN.1 - both of which give maximum interoperability, but at the cost of some bytes in the packet).
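The "field ID plus current value" approach could look something like the sketch below. It models the struct as a map from a numeric field ID to a 32-bit value purely for illustration; the FieldUpdate type, the ID width and the function names are all assumptions.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// One update on the wire: which field changed and its current value.
struct FieldUpdate {
    std::uint16_t id;
    std::int32_t  value;
};

// Hypothetical representation of the struct: field ID -> current value.
using State = std::map<std::uint16_t, std::int32_t>;

// Sender: list every field that is new or differs from what was last sent.
std::vector<FieldUpdate> changed_fields(const State& last_sent,
                                        const State& current) {
    std::vector<FieldUpdate> updates;
    for (const auto& entry : current) {
        const auto it = last_sent.find(entry.first);
        if (it == last_sent.end() || it->second != entry.second)
            updates.push_back({entry.first, entry.second});
    }
    return updates;
}

// Receiver: updates carry absolute values, so each one simply overwrites.
void apply_updates(State& state, const std::vector<FieldUpdate>& updates) {
    for (const auto& update : updates) state[update.id] = update.value;
}
```

Because each update carries an absolute value rather than a difference, a lost packet leaves only the affected fields stale until they are sent again, and a new field is just a new ID. Stale, reordered packets can still overwrite newer values, so a sequence number is probably still worth having.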
Use an RPC mechanism like CORBA or Protocol Buffers
Use DTLS with a compression option
Use a packed format
Repurpose an existing header compression library

Options for a message passing system for a game

I'm working on an RTS game in C++ targeted at handheld hardware (Pandora). For reference, the Pandora has a single ARM processor at ~600 MHz and runs Linux. We're trying to settle on a good message passing system (both internal and external), and this is new territory for me.
It may help to give an example of a message we'd like to pass. A unit may make this call to load its models into memory:
sendMessage("model-loader", "load-model", my_model.path, model_id );
In return, the unit could expect some kind of message containing a model object for the particular model_id, which can then be passed to the graphics system. Please note that this sendMessage function is in no way final. It just reflects my current understanding of message passing systems, which is probably not correct :)
From what I can tell there are two pretty distinct choices. One is to pass messages in memory, and only pass through the network when you need to talk to an external machine. I like this idea because the overhead seems low, but the big problem here is it seems like you need to make extensive use of mutex locking on your message queues. I'd really like to avoid excess locking if possible. I've read a few ways to implement simple queues without locking (by relying on atomic int operations) but these assume there is only one reader and one writer for a queue. This doesn't seem useful to our particular case, as an object's queue will have many writers and one reader.
The other choice is to go completely over the network layer. This has some fun advantages like getting asynchronous message passing pretty much for free. Also, we gain the ability to pass messages to other machines using the exact same calls as passing locally. However, this solution rubs me the wrong way, probably because I don't fully understand it :) Would we need a socket for every object that is going to be sending/receiving messages? If so, this seems excessive. A given game will have thousands of objects. For a somewhat underpowered device like the Pandora, I fear that abusing the network like that may end up being our bottleneck. But, I haven't run any tests yet, so this is just speculation.
MPI seems to be popular for message passing but it sure feels like overkill for what we want. This code is never going to touch a cluster or need to do heavy calculation.
Any insight into what options we have for accomplishing this is much appreciated.
The network will be using locking as well. It will just be where you cannot see it, in the OS kernel.
What I would do is create your own message queue object that you can rewrite as you need to. Start simple and make it better as needed. That way you can make it use any implementation you like behind the scenes without changing the rest of your code.
Look at several possible implementations that you might like to do in the future and design your API so that you can handle them all efficiently if you decide to implement in those terms.
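A "start simple" version of such a queue object might look like this: a mutex-protected FIFO with a small interface that could later be reimplemented (lock-free, or backed by a socket) without touching callers. The Message fields loosely mirror the sendMessage call in the question and, like the class name, are hypothetical.

```cpp
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical message shape, loosely following sendMessage(...) above.
struct Message {
    std::string target;
    std::string command;
    std::string argument;
};

// Many writers, one reader. Callers only see push() and try_pop(), so the
// locking strategy behind them can change later.
class MessageQueue {
public:
    // Enqueue a message; safe to call from any thread.
    void push(Message message) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(message));
    }

    // Dequeue the oldest message into `out`. Returns false if the queue
    // is empty; never blocks waiting for a message.
    bool try_pop(Message& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<Message> queue_;
};
```

The lock is held only for the duration of a push or pop, so contention should stay low unless many threads post at once. Whether that is fast enough on the target hardware is something only measurement can answer.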
If you want really efficient message passing look at some of the open source L4 microkernels. Those guys put a lot of time into fast message passing.
Since this is a small platform, it might be worth timing both approaches.
However, barring some kind of big speed issue, I'd always go for the approach that is simpler to code. That is probably going to be using the network stack, as it will be the same code no matter where the recipient is, and you won't have to manually code and debug your mutual exclusions, message buffering, allocations, etc.
If you find out it is too slow, you can always recode the local stuff using memory later. But why waste the time doing that up front if you might not have to?
I agree with Zan's recommendation to pass messages in memory whenever possible.
One reason is that you can pass complex C++ objects without needing to marshal and unmarshal (serialize and de-serialize) them.
The cost of protecting your message queue with a semaphore is most likely going to be less than the cost of making networking code calls.
If you protect your message queue with some lock-free algorithm (using atomic operations as you alluded to yourself) you can avoid a lot of context switches into and out of the kernel.