On the client side I use:
mosquitto_pub -t tpc -m msg
On the server side I use a nonblocking socket and the socket() API:
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_72/rzab6/xnonblock.htm
After the first received packet I send a connect acknowledge packet.
For each received packet I print how many bytes were received and the whole buffer in hex.
I compare the received data with a Wireshark capture.
Sometimes it works well:
37 bytes received - Connect Command
10 bytes received - Publish Message [tpc]
2 bytes received - Disconnect Req
Sometimes I get the Disconnect Req inside the Publish Message [tpc]:
37 bytes received - Connect Command
12 bytes received - Publish Message [tpc] + Disconnect Req
The last two bytes are the Disconnect Req:
30
8
0
3
74
70
63
6d
73
67
ffffffe0 <--
0 <--
How can I avoid these situations and always get 3 packets?
Short answer: you can't. You have to actually parse the messages to determine the length.
The constant used to create a TCP socket is called SOCK_STREAM for a reason. A socket has to be treated as such: a stream of bytes. Nobody guarantees that one send() on one side results in one recv() on the other side. The only guarantee is that the sequence is preserved: abcd may become (ab, cd), but will not become acbd.
The packets may be split somewhere along the way. So it may be that the client sends 2048 bytes, but on the server side you'll first receive ~1400 bytes and then the rest. So N sends do not result in N recvs.
Another thing is that the client also treats the socket as a stream. It may send byte by byte, or send a batch of messages with one send(). N messages are not N sends.
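For MQTT specifically, that means buffering whatever recv() returns and splitting packets yourself using the fixed header's Remaining Length field. A minimal sketch in C (my own illustration, not your server code) could look like this:
/* Sketch: find the length of the first complete MQTT control packet in an
 * accumulated buffer. Returns the total packet length, 0 if more bytes are
 * still needed, or -1 if the Remaining Length field is malformed. */
#include <stddef.h>

static long mqtt_packet_length(const unsigned char *buf, size_t buflen)
{
    size_t i = 1;                   /* byte 0 is the packet type/flags */
    unsigned long remaining = 0;
    unsigned int shift = 0;

    for (;;) {
        if (i >= buflen)
            return 0;               /* Remaining Length not fully received yet */
        unsigned char b = buf[i++];
        remaining |= (unsigned long)(b & 0x7F) << shift;
        shift += 7;
        if ((b & 0x80) == 0)
            break;                  /* last length byte */
        if (shift > 21)
            return -1;              /* more than 4 length bytes: malformed */
    }
    if (buflen < i + remaining)
        return 0;                   /* rest of the packet not received yet */
    return (long)(i + remaining);
}
Each time this returns a positive length, you can process that many bytes as one packet (CONNECT, PUBLISH, DISCONNECT, ...) and shift whatever is left to the front of the buffer before the next recv().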
Related
I'm using the Boost.Asio TCP echo example (1.62 is what is currently shipping in the main Ubuntu package):
https://www.boost.org/doc/libs/1_62_0/doc/html/boost_asio/example/cpp11/echo/async_tcp_echo_server.cpp
It works great for small things, you can see it has a buffer of 1024 and uses async_read_some.
But when I try to send it the Python string ("A"*4096)+("B"*4096)+("C"*4096)..., I see 4 calls to the read handler of 1024 bytes each, i.e. it prints all the As but never any Bs or Cs.
Expected behavior: if there are 4096*3 bytes of data in the socket, subsequent calls to async_read_some should pull all the data out 1024 bytes at a time??
One cannot use async_read in such an echo protocol, because variable-length data is passed over the wire. The problem is that async_read_some appears to be ignoring/deleting data that is still to be read from the socket.
How to fix the example code?
I took that sample and ran it with your alleged client code:
#!/usr/bin/env python
import socket
TCP_IP = '127.0.0.1'
TCP_PORT = 6767
BUFFER_SIZE = 1024
MESSAGE = ("A"*4096)+("B"*4096)+("C"*4096);
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(MESSAGE)
received = "";
while (len(received) < len(MESSAGE)):
    data = s.recv(BUFFER_SIZE)
    print "received data: %d bytes ending in ...%s" % (len(data), data[-10:])
    received += data
s.close()
It correctly runs and prints
sehe ~ Projects stackoverflow ./sotest 6767&
sehe ~ Projects stackoverflow python ./test.py
received data: 1024 bytes ending in ...AAAAAAAAAA
received data: 1024 bytes ending in ...AAAAAAAAAA
received data: 1024 bytes ending in ...AAAAAAAAAA
received data: 1024 bytes ending in ...AAAAAAAAAA
received data: 1024 bytes ending in ...BBBBBBBBBB
received data: 1024 bytes ending in ...BBBBBBBBBB
received data: 1024 bytes ending in ...BBBBBBBBBB
received data: 1024 bytes ending in ...BBBBBBBBBB
received data: 1024 bytes ending in ...CCCCCCCCCC
received data: 1024 bytes ending in ...CCCCCCCCCC
received data: 1024 bytes ending in ...CCCCCCCCCC
received data: 1024 bytes ending in ...CCCCCCCCCC
So you're doing something wrong.
Expected behavior: if there are 4096*3 bytes of data in the socket, subsequent calls to async_read_some should pull all the data out 1024 bytes at a time??
Yes. This is exactly what happens. Mind you, you should not assume the data "arrives" in 1024-byte blocks. It could happen to arrive in smaller chunks depending on the buffering in intermediate OS/network layers. IOW: TCP is a stream protocol and packetization is an implementation detail you should not usually depend on¹
One cannot use async_read in such an echo protocol, because variable-length data is passed over the wire.
Data is always variable (otherwise there would be no reason to send it). async_read can always be used where read can be, because it's merely the asynchronous IO version of the same function.
¹ using various advanced techniques/flags you can somewhat control these effects but they're partly platform dependent and nearly always operate with margins giving the OS/network layers leeway to optimize network performance.
I have written a simple DPDK send and receive application. When the packet length is <= 60 bytes, the send and receive applications work, but when the packet length is > 60 bytes, the send application shows that it has sent the packet out, yet the receive application does not receive anything.
In send application:
mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS,
        MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
pkt = rte_pktmbuf_alloc(mbuf_pool);
pkt->data_len = packlen; //if packlen<=60, it works, but when packlen>60, receiver cannot receive anything.
I tried both l2fwd and basicfwd as the receive application. The result is the same.
The issue is here:
pchar[12] = 0;
pchar[13] = 0;
This means Ethertype is 0. From the list of assigned Ethertypes:
https://www.iana.org/assignments/ieee-802-numbers/ieee-802-numbers.xhtml
We see that 0 means a zero Ethernet frame length: values in this range are interpreted as an IEEE 802.3 length field rather than an EtherType. Since the minimum Ethernet frame length is 64 bytes (60 + 4 bytes of FCS), that is why you have trouble sending packets longer than 60 bytes.
To fix the issue, simply use a reasonable EtherType from the list above.
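For example, assuming pchar points at the start of the Ethernet frame as in your code, a minimal sketch using the IEEE 802 "Local Experimental" EtherType 0x88B5 would be:
/* Give the frame a real EtherType so it is not parsed as an IEEE 802.3
 * length field. pchar is assumed to point at the first byte of the
 * Ethernet header, as in the question. */
#include <stdint.h>

static void set_ethertype(uint8_t *pchar)
{
    pchar[12] = 0x88;   /* 0x88B5: IEEE 802 Local Experimental EtherType */
    pchar[13] = 0xB5;   /* 0x0800 (IPv4) also works if the payload really is IPv4 */
}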
I have been reading some socket guides such as Beej's guide to network programming. It is quite clear now that there is no guarantee on how many bytes are received in a single recv() call. Therefore a mechanism is needed where, e.g., two bytes stating the message length are sent first and then the message itself; the receiver reads the first two bytes and then receives in a loop until the whole message has arrived. All good and dandy!?
I was asked by a colleague about messages going out of sync. E.g. what if, somehow, I receive two bytes in one recv() call that are actually in the middle of the message itself, and they would appear as an integer of some value? Does that mean that the rest of the data sent will be out of sync? And what about receiving the header partially, i.e. one byte at a time?
Maybe this is overthinking, but I can't find this mentioned anywhere and I just want to be sure that I would handle this if it could be a possible threat to the integrity of the communication.
Thanks.
It is not overthinking. TCP presents a stream so you should treat it this way. A lot of problems concerning TCP are due to network issues and will probably not happen during development.
Start a message with a (4-byte) magic value that you can look for, followed by a (4-byte) length in an agreed byte order (normally big-endian). When receiving, read the header one byte at a time, so you can handle it however the bytes happen to arrive. Based on that you can accept messages over a long-lived TCP connection.
Mind you, when starting a new connection per message, you know the starting point. However, it doesn't hurt to send a magic value either, if only to filter out some invalid messages.
A checksum is not necessary because TCP provides a reliable stream of bytes that has already been checked by the receiving side's TCP, and resyncing will only be needed if there is a coding issue in the sending/receiving logic.
On the other hand, UDP sends discrete packets, so you know what to expect, but then delivery and order are not guaranteed.
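A minimal sketch of such a framed receive in C (the magic value, helper names and 4+4-byte layout are just illustrative assumptions, not a fixed protocol):
/* Sketch: framed receive for a hypothetical protocol of 4-byte magic,
 * 4-byte big-endian payload length, then the payload. */
#include <arpa/inet.h>   /* ntohl */
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>

#define FRAME_MAGIC 0x4D534721u  /* "MSG!" - made-up value */

/* Keep calling recv() until exactly len bytes are read (or error/EOF). */
static int recv_exact(int fd, void *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, (char *)buf + got, len - got, 0);
        if (n <= 0)
            return -1;           /* error or peer closed the connection */
        got += (size_t)n;
    }
    return 0;
}

/* Returns a malloc'd payload and its length, or NULL on error. */
static uint8_t *recv_frame(int fd, uint32_t *out_len)
{
    uint32_t hdr[2];
    if (recv_exact(fd, hdr, sizeof hdr) < 0)
        return NULL;
    if (ntohl(hdr[0]) != FRAME_MAGIC)
        return NULL;             /* stream is out of sync or garbage */
    uint32_t len = ntohl(hdr[1]);
    uint8_t *payload = malloc(len);
    if (payload == NULL || recv_exact(fd, payload, len) < 0) {
        free(payload);
        return NULL;
    }
    *out_len = len;
    return payload;
}
For brevity this reads the whole header at once; to resynchronize a corrupted stream you would instead scan byte by byte for the magic value, as described above.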
Your colleague is mistaken. TCP data cannot arrive out of order. However, you should investigate the MSG_WAITALL flag to recv() to overcome the possibility of the two length bytes arriving separately, and to eliminate the need for a loop when receiving the message body.
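For example, with the two-byte length prefix from the question, a sketch using MSG_WAITALL on a blocking socket might look like this (a signal or a closed connection can still produce a short read, which is simply treated as an error here):
/* Sketch: length-prefixed receive using MSG_WAITALL on a blocking socket. */
#include <arpa/inet.h>   /* ntohs */
#include <stdint.h>
#include <sys/socket.h>

static ssize_t recv_message(int fd, char *buf, size_t bufsize)
{
    uint16_t len_be;
    /* MSG_WAITALL asks the kernel to block until all requested bytes are in. */
    if (recv(fd, &len_be, sizeof len_be, MSG_WAITALL) != (ssize_t)sizeof len_be)
        return -1;
    uint16_t len = ntohs(len_be);
    if (len > bufsize)
        return -1;               /* message too large for the caller's buffer */
    if (recv(fd, buf, len, MSG_WAITALL) != (ssize_t)len)
        return -1;
    return len;
}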
It's your responsibility to keep your client and server in sync. However, in TCP there is no out-of-order delivery: if you got something by calling recv(), you can assume there is nothing earlier in the stream that you haven't received.
So the question is how to synchronize sender and receiver? It's easy: as stefaanv said, the sender and receiver know their starting point, so you can define a protocol for your network communication. For example, a protocol could be defined this way:
4 bytes of header, including the message type and payload length
The rest of the message is the payload, with the length given in the header
With this, you send the 4-byte header first, followed by the actual payload.
Because TCP guarantees in-order reliable delivery, you can make two recv() calls for each packet: one recv() call of 4 bytes to get the next payload size, and another recv() call with the size specified in the header. It's necessary to make both recv() calls blocking to stay synchronized all the time.
An example would be like this:
#define MAX_BUF_SIZE 1024 // something you know

char buf[MAX_BUF_SIZE];

// Peek first: is a full 4-byte header available yet?
int recvLen = recv(fd, buf, 4, MSG_PEEK);
if (recvLen == 4) {
    recvLen = recv(fd, buf, 4, 0);               // actually consume the header
    if (recvLen != 4) {
        // fatal error
    }
    int payloadLen = extractPayloadLenFromHeader(buf);

    // Peek again: is the whole payload available yet?
    recvLen = recv(fd, buf, payloadLen, MSG_PEEK);
    if (recvLen == payloadLen) {
        recvLen = recv(fd, buf, payloadLen, 0);  // actual recv
        if (recvLen != payloadLen) {
            // fatal error
        }
        // do something with received payload
    }
}
As you can see, I first call recv() with the MSG_PEEK flag to check whether 4 bytes are really available, and then receive the actual header. The same goes for the payload.
I'm doing something similar to Stack Overflow question Handling partial return from recv() TCP in C.
The data received is bigger than the buffer initialised (for example, 1000 bytes). Therefore a temporary buffer of a bigger size (for example, 10000 bytes) is used. The problem is that the data accumulated over multiple receives is rubbish. I've already checked the offset used to memcpy into the temporary buffer, but I keep getting rubbish data.
This sample shows what I do:
First message received:
memcpy(tmpBuff, dataRecv, 1000);
offSet = offSet + 1000;
Second msg onwards:
memcpy(tmpBuffer + offSet, dataRecv, 1000);
Is there something I should check?
I've checked the TCP hex that was sent out. Apparently, the sender is sending an incomplete message. How my program works is that when the sender sends a message, it packs (message header + actual message). The message header has some metadata, and one of the fields is the message length.
When the receiver receives the packet, it will get the message header using the message header offset and message header length. It will extract the message length, check if the current packet size is more than or equal to the message length and return the correct message size to the users. If there's a remaining amount of message left in the packet, it will store it into a temporary buffer and wait to receive the next packet. When it receives the next packet, it will check the message header for the message length and do the same thing.
If the sender packs three messages into a packet, each message has its own message header indicating the message length. Assume all three messages are 300 bytes each. Also assume that the second message sent is incomplete and turns out to be only 100 bytes.
When the receiver receives the three messages in a packet, it will return the first message correctly. Since the second message is incomplete, my program wouldn't know, and so it will return 100 bytes from the second message and 200 bytes from the third message since the message header indicates the total size is 300 bytes. Thus the second message returned will have some rubbish data.
As for the third message, my program will try to get the message length from the message header. Since the first 200 bytes are already returned, the message header is invalid. Thus, the message length returned to my program will be rubbish as well. Is there a way to check for a complete message?
Suppose you are expecting 7000 bytes over the TCP connection. In this case it is very likely that your message will be split into TCP packets with an actual payload size of, let's say, 1400 bytes (so 5 packets).
In this case it is perfectly possible that consecutive recv() calls with a target buffer of 1000 bytes will behave as follows:
recv -> reads 1000 bytes (packet 1)
recv -> reads 400 bytes (packet 1)
recv -> reads 1000 bytes (packet 2)
recv -> reads 400 bytes (packet 2)
...
Now, in this case, when reading the 400-byte remainder you still copy the full 1000 bytes into your larger buffer, actually pasting 600 bytes of rubbish in between. You should only memcpy the number of bytes actually received, which is the return value of recv() itself. Of course you should also check whether this value is 0 (socket closed) or less than zero (socket error).
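A minimal sketch of that receive loop in C (buffer names loosely follow the question's tmpBuff/dataRecv/offSet, and the sizes are the 1000/10000 bytes mentioned above):
/* Sketch: accumulate a TCP stream into a larger buffer, copying only as
 * many bytes as recv() actually returned each time. */
#include <string.h>
#include <sys/socket.h>

#define RECV_CHUNK   1000
#define TMP_BUF_SIZE 10000

static size_t receive_into(int fd, char *tmpBuff, size_t tmpSize)
{
    char dataRecv[RECV_CHUNK];
    size_t offSet = 0;

    for (;;) {
        ssize_t n = recv(fd, dataRecv, sizeof dataRecv, 0);
        if (n == 0)
            break;                        /* peer closed the connection */
        if (n < 0)
            break;                        /* socket error - handle/report */
        if (offSet + (size_t)n > tmpSize)
            break;                        /* temporary buffer is full */
        memcpy(tmpBuff + offSet, dataRecv, (size_t)n);  /* only n bytes! */
        offSet += (size_t)n;
    }
    return offSet;                        /* total bytes accumulated */
}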
I am having an issue trying to communicate between a python TCP server and a c++ TCP client.
After the first call, which works fine, the subsequent calls cause issues.
As far as WinSock is concerned, the send() function worked properly, it returns the proper length and WSAGetLastError() does not return anything of significance.
However, when watching the packets using Wireshark, I notice that the first call sends two packets: a PSH,ACK with all of the data in it, and an ACK right after. The subsequent calls, which don't work, only send the PSH,ACK packet and not a subsequent ACK packet.
The receiving computer's Wireshark corroborates this, and the Python server does nothing; it doesn't get any data out of the socket, and I cannot debug deeper, since socket is a native class.
When I run a C++ client and a C++ server (a hacked replica of what the Python one would do), the client faithfully sends both the PSH,ACK and ACK packets the whole time, even after the first call.
Is the WinSock send() function supposed to always send a PSH,ACK and an ACK?
If so, why would it do so when connected to my C++ server and not the python server?
Has anyone had any issues similar to this?
client sends a PSH,ACK and then the server sends a PSH,ACK and a FIN,PSH,ACK
There is a FIN, so could it be that the Python version of your server is closing the connection immediately after the initial read?
If you are not explicitly closing the server's socket, it's probable that the server's remote socket variable is going out of scope, thus closing it (and that this bug is not present in your C++ version)?
Assuming that this is the case, I can cause a very similar TCP sequence with this code for the server:
# server.py
import socket
from time import sleep
def f(s):
    r, a = s.accept()
    print r.recv(100)
s = socket.socket()
s.bind(('localhost',1234))
s.listen(1)
f(s)
# wait around a bit for the client to send its second packet
sleep(10)
and this for the client:
# client.py
import socket
from time import sleep
s = socket.socket()
s.connect(('localhost',1234))
s.send('hello 1')
# wait around for a while so that the socket in server.py goes out of scope
sleep(5)
s.send('hello 2')
Start your packet sniffer, then run server.py and then client.py. Here is the output of tcpdump -A -i lo, which matches your observations:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
12:42:37.683710 IP localhost:33491 > localhost.1234: S 1129726741:1129726741(0) win 32792 <mss 16396,sackOK,timestamp 640881101 0,nop,wscale 7>
E..<R.#.#...............CVC.........I|....#....
&3..........
12:42:37.684049 IP localhost.1234 > localhost:33491: S 1128039653:1128039653(0) ack 1129726742 win 32768 <mss 16396,sackOK,timestamp 640881101 640881101,nop,wscale 7>
E..<..#.#.<.............C<..CVC.....Ia....#....
&3..&3......
12:42:37.684087 IP localhost:33491 > localhost.1234: . ack 1 win 257 <nop,nop,timestamp 640881102 640881101>
E..4R.#.#...............CVC.C<......1......
&3..&3..
12:42:37.684220 IP localhost:33491 > localhost.1234: P 1:8(7) ack 1 win 257 <nop,nop,timestamp 640881102 640881101>
E..;R.#.#...............CVC.C<......./.....
&3..&3..hello 1
12:42:37.684271 IP localhost.1234 > localhost:33491: . ack 8 win 256 <nop,nop,timestamp 640881102 640881102>
E..4.(#.#...............C<..CVC.....1}.....
&3..&3..
12:42:37.684755 IP localhost.1234 > localhost:33491: F 1:1(0) ack 8 win 256 <nop,nop,timestamp 640881103 640881102>
E..4.)#.#...............C<..CVC.....1{.....
&3..&3..
12:42:37.685639 IP localhost:33491 > localhost.1234: . ack 2 win 257 <nop,nop,timestamp 640881104 640881103>
E..4R.#.#...............CVC.C<......1x.....
&3..&3..
12:42:42.683367 IP localhost:33491 > localhost.1234: P 8:15(7) ack 2 win 257 <nop,nop,timestamp 640886103 640881103>
E..;R.#.#...............CVC.C<......./.....
&3%W&3..hello 2
12:42:42.683401 IP localhost.1234 > localhost:33491: R 1128039655:1128039655(0) win 0
E..(..#.#.<.............C<......P...b...
9 packets captured
27 packets received by filter
0 packets dropped by kernel
What size of packets do you send?
If they are small, maybe Nagle's Algorithm and the Delayed ACK Algorithm are your headache? From what you described, I think Delayed ACK is involved...
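If Nagle's algorithm turns out to be the problem, a minimal sketch of disabling it with TCP_NODELAY on a BSD-style socket is shown below (on WinSock the same setsockopt() call takes a (const char *) option pointer); note that delayed ACK is controlled on the receiving side and is not affected by this option:
/* Sketch: disable Nagle's algorithm so small sends are not coalesced. */
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_NODELAY */
#include <sys/socket.h>

static int disable_nagle(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}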