gpsd client data buffer - c++

I am developing a C++ application that should retrieve the received NMEA sentences of type $GPGGA, using gpsd. The idea is to read from gpsd approximately once per second and to parse the last $GPGGA received sentence, extracting the two fields of my interest: the quality indicator and the reference station ID. I used the C++ libgpsmm library, periodically calling to gpsmm::read() and to gpsmm::data(), accessing directly to the client data buffer.
At first, I have made several tests using gpsfake and a fake GPS log (specifying the gpsfake option "-c 0.5", in order to have two sentences per second). The results are OK when the time between two requests to gpsd is less or equal to 400ms. If I try with a greater time, the results are unexpected, having in each reading a piece of NMEA sentences with lots of repeated data as well as some truncated sentences. The things are really worse when I try with a real GPS that writes ~40 sentences per second: in this case the time between reading should be ~ 10ms or even less in order to have correct results.
The following is a simpler program that prints the NMEA sentences that are received. It works well, with the simulated and even with the real GPS. But if I uncomment the usleep() call, which makes the program to check the buffer once per second, client data buffer does not give reasonable results.
#include <iostream>
#include "libgpsmm.h"
using namespace std;
#define WAITING_TIME 5000000
#define RETRY_TIME 5
#define ONE_SECOND 1000000
int main(void)
{
for(;;){
//For version 3.7
gpsmm gps_rec("localhost", DEFAULT_GPSD_PORT);
if (gps_rec.stream(WATCH_ENABLE|WATCH_NMEA) == NULL) {
cout << "No GPSD running. Retry to connect in " << RETRY_TIME << " seconds." << endl;
usleep(RETRY_TIME * ONE_SECOND);
continue; // It will try to connect to gpsd again
}
const char* buffer = NULL;
for (;;) {
struct gps_data_t* newdata;
if (!gps_rec.waiting(WAITING_TIME))
continue;
if ((newdata = gps_rec.read()) == NULL) {
cerr << "Read error.\n";
break;
} else {
buffer = gps_rec.data();
// We print the NMEA sentences!
cout << "***********" << endl;
cout << buffer << endl;
//usleep(1000000);
}
}
}
}
Here is the output having the usleep() call commented (ie. continually reading data):
$ ./GPSTest1
***********
{"class":"VERSION","release":"3.7","rev":"3.7","proto_major":3,"proto_minor":7}
***********
{"class":"WATCH","enable":true,"json":false,"nmea":true,"raw":0,"scaled":false,"timing":false}
***********
$GPGGA,202010.00,3313.9555651,S,06019.3785868,W,4,09,1.0,39.384,M,16.110,M,10.0,*46<CR><LF>
***********
$GPGGA,202011.00,3313.9555664,S,06019.3785876,W,4,09,1.0,39.386,M,16.110,M,11.0,*4D<CR><LF>
***********
$GPGGA,202012.00,3313.9555668,S,06019.3785882,W,4,09,1.0,39.394,M,16.110,M,12.0,*49<CR><LF>
***********
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,4,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
***********
$GPGGA,202014.00,3313.9555670,S,06019.3785907,W,4,09,1.0,39.409,M,16.110,M,14.0,*4F<CR><LF>
***********
$GPGGA,202015.00,3313.9555657,S,06019.3785905,W,4,09,1.0,39.395,M,16.110,M,15.0,*4A<CR><LF>
And this is the output when the line is commented (ie. the buffer is checked once per second):
$ ./GPSTest2
***********
{"class":"VERSION","release":"3.7","rev":"3.7","proto_major":3,"proto_minor":7}
***********
{"class":"DEVICE","path":"/dev/pts/0","activated":"2012-11-05T23:48:38.110Z","driver":"Generic NMEA","native":0,"bps":4800,"parity":"N","stopbits":1,"cycle":1.00}
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,1,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
0}
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,1,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
":"Generic NMEA","native":0,"bps":4800,"parity":"N","stopbits":1,"cycle":1.00}
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,1,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
***********
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,1,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
***********
$GPGGA,202016.00,3313.9555642,S,06019.3785894,W,1,09,1.0,39.402,M,16.110,M,16.0,*4E<CR><LF>
$GPGGA,202017.00,3313.9555643,S,06019.3785925,W,1,09,1.0,39.404,M,16.110,M,17.0,*42<CR><LF>
$GPGGA,202017.00,3313.9555643,S,06019.3785925,W,1,09,1.0,39.404,M,16.110,M,17.0,*42<CR><LF>
$GPGGA,202017.00,3313.9555643,S,06019.3785925,W,1,09,1.0,39.404,M,16.110,M,17.0,*42<CR><LF>
***********
Any suggestion? At first, I tried to directly analyze the gps_data_t structure, but It seems to be harder to identify the quality indicator and the reference station ID that way, among all the fields of the structure, in comparison with the search within a NMEA sentence.

I am not familiar with the gpsd service, but what you describe looks a lot like the receive buffer is being corrupted (overwritten). The GPS receiver is outputting NMEA information continually and when your application is sleeping these characters will accumulate in the buffer, if too many characters are recieved then the buffer will get overwritten.
Either increase the serial port receiver buffer size (if possible) or maybe clear the buffer after the wakeup and then wait for the next GGA message (which may be up to one second away in the worst case).
The GPS receiver should be configured to output information at a 1Hz (once per second), in this case the the device should only output about 8 sentances per second. If you're seeing 40 sentances then your receiver would seem to be outputting information at around 5Hz which sounds like overkill for your particular interest.

Related

I don't understand buffered output and cout

I've got a really simple program that prints lines using cout and sleeps after each line. All is well and good for about 7 iterations, as the buffer is clearly not flushed at any point. After then, what I assume is only part of the buffer is flushed on every iteration.
I have a few questions about this behaviour:
If the buffer is supposedly big enough to fit ~7 lines of output, why is the buffer flushed one line at a time?
If this buffer is indeed flushed in this way, what is the advantage to this? Why isn't the whole buffer flushed at once?
Is it just a coincidence that the exact same number of characters are flushed to the output as my line length, or is the cout buffer internally flushed based on end-of-line delimiters such as '\n'?
int main(){
for(int i = 0; i < 100; ++i){
std::cout << "This is line " << i << '\n';
Sleep(1000);
}
return 0;
}
You seem to assume that the buffer will not be written until it is full. Probably what happens is that an asynchronous write is started with as little as one output byte. The empty buffer space is used to receive characters while the asynchronous write is in progress. When the current write completes, if/when there are additional characters in the buffer, a new asynchronous write is started. The process would only need to block on writing if the buffer got full.

Wrapping a commandline program with pstream

I want to be able to read and write to a program from C++. It seems like pstream can do the job, but I find the documentation difficult to understand and have not yet find an example.
I have setup the following minimum working example. This opens python, which in turn (1) prints hello (2) ask input, and (3) prints hello2:
#include <iostream>
#include <cstdio>
#include "pstream.h"
using namespace std;
int main(){
std::cout << "start";
redi::pstream proc(R"(python -c "if 1:
print 'hello'
raw_input()
print 'hello2'
")");
std::string line;
//std::cout.flush();
while (std::getline(proc.out(), line)){
std::cout << " " << "stdout: " << line << '\n';
}
std::cout << "end";
return 0;
}
If I run this with the "ask input" part commented out (i.e. #raw_input()), I get as output:
start stdout: hello
stdout: hello2
end
But if I leave the "ask input" part in (i.e. uncommented raw_input()), all I get is blank, not even start, but rather what seems like a program waiting for input.
My question is, how can one interact with this pstream, how can one establish a little read-write-read-write session? Why does the program not even show start or the first hello?
EDIT:
I don't seem to be making much progress. I don't think I really grasp what is going on. Here are some further attempts with commentary.
1) It seems like I can successfully feed raw_inputI prove this by writing to the child's stderr:
int main(){
cout << "start" <<endl;
redi::pstream proc(R"(python -c "if 1:
import sys
print 'hello'
sys.stdout.flush()
a = raw_input()
sys.stdin.flush()
sys.stderr.write('hello2 '+ a)
sys.stderr.flush()
")");
string line;
getline(proc.out(), line);
cout << line << endl;
proc.write("foo",3).flush();
cout << "end" << endl;
return 0;
}
output:
start
hello
end
hello2 foo
But it locks if I try to read from the stdout again
int main(){
...
a = raw_input()
sys.stdin.flush()
print 'hello2', a
sys.stdout.flush()
")");
...
proc.write("foo",3).flush();
std::getline(proc.out(), line);
cout << line << endl;
...
}
output
start
hello
2) I can't get the readsome approach to work at all
int main(){
cout << "start" <<endl;
redi::pstream proc(R"(python -c "if 1:
import sys
print 'hello'
sys.stdout.flush()
a = raw_input()
sys.stdin.flush()
")");
std::streamsize n;
char buf[1024];
while ((n = proc.out().readsome(buf, sizeof(buf))) > 0)
std::cout.write(buf, n).flush();
proc.write("foo",3).flush();
cout << "end" << endl;
return 0;
}
output
start
end
Traceback (most recent call last):
File "<string>", line 5, in <module>
IOError: [Errno 32] Broken pipe
The output contains a Python error, it seems like the C++ program finished while the Python pipe were still open.
Question: Can anyone provide a working example of how this sequential communication should be coded?
But if I leave the "ask input" part in (i.e. uncommented raw_input()), all I get is blank, not even start, but rather what seems like a program waiting for input.
The Python process is waiting for input, from its stdin, which is connected to a pipe in your C++ program. If you don't write to the pstream then the Python process will never receive anything.
The reason you don't see "start" is that Python thinks it's not connected to a terminal, so it doesn't bother flushing every write to stdout. Try import sys and then sys.stdout.flush() after printing in the Python program. If you need it to be interactive then you need to flush regularly, or set stdout to non-buffered mode (I don't know how to do that in Python).
You should also be aware that just using getline in a loop will block waiting for more input, and if the Python process is also blocking waiting for input you have a deadlock. See the usage example on the pstreams home page showing how to use readsome() for non-blocking reads. That will allow you to read as much as is available, process it, then send a response back to the child process, so that it produces more output.
EDIT:
I don't think I really grasp what is going on.
Your problems are not really problems with pstreams or python, you're just not thinking through the interactions between two communicating processes and what each is waiting for.
Get a pen and paper and draw state diagrams or some kind of chart that shows where the two processes have got to, and what they are waiting for.
1) It seems like I can successfully feed raw_input
Yes, but you're doing it wrong. raw_input() reads a line, you aren't writing a line, you're writing three characters, "foo". That's not a line.
That means the python process keeps trying to read from its stdin. The parent C++ process writes the three characters then exits, running the pstream destructor which closes the pipes. Closing the pipes causes the Python process so get EOF, so it stops reading (after only getting three characters not a whole line). The Python process then prints to stderr, which is connected to your terminal, because you didn't tell the pstream to attach a pipe to the child's stderr, and so you see that output.
But it locks if I try to read from the stdout again
Because now the parent C++ process doesn't exit, so doesn't close the pipes, so the child Python process doesn't read EOF and keeps waiting for more input. The parent C++ process is also waiting for input, but that will never come.
If you want to send a line to be read by raw_input() then write a newline!
This works fine, because it sends a newline, which causes the Python process to get past the raw_input() line:
cout << "start" <<endl;
redi::pstream proc(R"(python -c "if 1:
import sys
print 'hello'
sys.stdout.flush()
a = raw_input()
print 'hello2', a
sys.stdout.flush()
")");
string line;
getline(proc, line);
cout << line << endl;
proc << "foo" << endl; // write to child FOLLOWED BY NEWLINE!
std::getline(proc, line); // read child's response
cout << line << endl;
cout << "end" << endl;
N.B. you don't need to use proc.out() because you haven't attached a pipe to the process' stderr, so it always reads from proc.out(). You would only need to use that when reading from both stdout and stderr, where you would use proc.out() and proc.err() to distinguish them.
2) I can't get the readsome approach to work at all
Again, you have the same problem that you're only writing three characters and so the Python processes waits forever. The C++ process is trying to read as well, so it also waits forever. Deadlock.
If you fix that by sending a newline (as shown above) you have another problem: the C++ program will run so fast that it will get to the while loop that calls readsome before the Python process has even started. It will find nothing to read in the pipe, and so the first readsome call returns 0 and you exit the loop. Then the C++ program gets to the second while loop, and the child python process still hasn't started printing anything yet, so that loop also reads nothing and exits. Then the whole C++ program exits, and finally the Python child is ready to run and tries to print "hello" but by then it's parent is gone and it can't write to the pipe.
You need readsome to keep trying if there's nothing to read the first time you call it_, so it waits long enough for the first data to be readable.
For your simple program you don't really need readsome because the Python process only writes a single line at a time, so you can just read it with getline. But if it might write more than one line you need to be able to keep reading until there's no more data coming, which readsome can do (it reads only if there's data available). But you also need some way to tell whether more data is still going to come (maybe the child is busy doing some calculations before it sends more data) or if it's really finished. There's no general way to know that, it depends on what the child process is doing. Maybe you need the child to send some sentinel value, like "---END OF RESPONSE---", which the parent can look for to know when to stop trying to read more.
For the purposes of your simple example, let's just assume that if readsome gets more than 4 bytes it received the whole response:
cout << "start" <<endl;
redi::pstream proc(R"(python -c "if 1:
import sys
print 'hello'
sys.stdout.flush()
a = raw_input()
sys.stdin.flush()
print 'hello2', a
sys.stdout.flush()
")");
string reply;
streamsize n;
char buf[1024];
while ((n = proc.readsome(buf, sizeof(buf))) != -1)
{
if (n > 0)
reply.append(buf, n);
else
{
// Didn't read anything. Is that a problem?
// Need to try to process the content of 'reply' and see if
// it's what we're expecting, or if it seems to be incomplete.
//
// Let's assume that if we've already read more than 4 characters
// it's a complete response and there's no more to come:
if (reply.length() > 3)
break;
}
}
cout << reply << std::flush;
proc << "foo" << std::endl;
while (getline(proc, reply)) // maybe use readsome again here
cout << reply << std::endl;
cout << "end" << endl;
This loops while readsome() != -1, so it keeps retrying if it reads nothing and only stops the loop if there's an error. In the loop body it decides what to do if nothing was read. You'll need to insert your own logic in here that makes sense for whatever you're trying to do, but basically if readsome() hasn't read anything yet, then you should loop and retry. That makes the C++ program wait long enough for the Python program to print something.
You'd probably want to split out the while loop into a separate function that reads a whole reply into a std::string and returns it, so that you can re-use that function every time you want to read a response. If the child sends some sentinel value that function would be easy to write, as it would just stop every time it receives the sentinel string.

How to prevent input text breaking while other thread is outputting to the console?

I have 2 threads: one of them is constantly cout'ing to the console some value, let's say increments an int value every second - so every second on the console is 1,2,3... and so on.
Another thread is waiting for user input - with the command cin.
Here is my problem: when I start typing something, when the time comes to cout the int value, my input gets erased from the input field, and put into the console with the int value. So when I want to type in "hello" it looks something like this:
1
2
3
he4
l5
lo6
7
8
Is there a way to prevent my input from getting put to the console, while other thread is writing to the console?
FYI this is needed for a chat app at client side - one thread is listening for messages and outputs this message as soon as it comes in, and the other thread is listening for user input to be sent to a server app.
Usually the terminal itself echos the keys typed. You can turn this off and get your program to echo it. This question will give you pointers on how to do it Hide password input on terminal
You can then just get the one thread to handle output.
If you are a slow typer, then the solution to your problem can be, as I said, making it a single thread, but that may make the app to receive only after it sends.
Another way is to increase your receiving thread's sleep time, which would provide you some more time to type (without interruption)
You could make a GUI (or use ncurses if you really want to work in the console). This way you avoid having std::cout shared by the threads.
I think you could solve this problem with a semaphore. When you have an incoming message you check to see if the user is writing something. If he does you wait until he finishes to print the message.
Is there a way to prevent my input from getting put to the console, while other thread is writing to the console?
It is the other way around. The other thread shouldn't interrupt the display of what you are typing.
Say you have typed "Hel" and then a new message comes in from the other thread. What do you do? How should it be displayed?
Totally disable echoing of what you type and only display it after you hit enter. In this way you can display messages from the different threads properly, in an atomic fashion. The big drawback is that you cannot see what you have typed already... :(
You immediately echo what you type. When the new message comes in, you undo the "Hel", print the new message and print again "Hel" on a new line and you can continue typing. Doable but a bit ugly.
You echo what you type in a separate place. That is, you split somehow the display. In one place you display the posted/received messages in order; and in another place you display what you are typing. You either need a GUI or at least some console library to do this. This would be the nicest solution but perhaps the most difficult to port to another OS due to the library dependencies.
In any case, you need a (preferably internally) synchronized stream that you can safely call from different threads and can write strings into it atomically. That is, you need to write your own synchronized stream class.
Hope this helps.
Well i recently solved this same issue with a basic workaround. This might not be the #1 solution but worked like a charm for me, as a newbie;
#include <iostream> // I/O
#include <Windows.h> // Sleep();
#include <conio.h> // _getch();
#include <string> // MessageBuffer
#include <thread> // Thread
using namespace std;
void ThreadedOutput();
string MessageBuffer; // or make it static
void main()
{
thread output(ThreadedOutput); // Attach the output thread
int count = 0;
char cur = 'a'; // Temporary at start
while (cur != '\r')
{
cur = _getch(); // Take 1 input
if (cur >= 32 && cur <= 126) // Check if input lies in alphanumeric and special keys
{
MessageBuffer += cur; // Store input in buffer
cout << cur; // Output the value user entered
count++;
}
else if (cur == 8) // If input key was backspace
{
cout << "\b \b"; // Move cursor 1 step back, overwrite previous character with space, move cursor 1 step back
MessageBuffer = MessageBuffer.substr(0, MessageBuffer.size() - 1); // Remove last character from buffer
count--;
}
else if (cur == 13) // If input was 'return' key
{
for (int i = 0; i < (signed)MessageBuffer.length(); i++) // Remove the written input
cout << "\b \b";
// "MessageBuffer" has your input, use it somewhere
MessageBuffer = ""; // Clear the buffer
}
}
output.join(); // Join the thread
}
void ThreadedOutput()
{
int i = 0;
while (true)
{
for (int i = 0; i < (signed)MessageBuffer.length(); i++) // Remove the written input
cout << "\b \b";
cout << ++i << endl; // Give parallel output with input
cout << MessageBuffer; // Rewrite the stored buffer
Sleep(1000); // Prevent this example spam
}
}

Reading using fstream

I am using fstream to read a binary file, but strangely I get different values for the same input file each time I execute the code.
if(fs->is_open())
{
while (!fs->eof())
{
fs->seekg( pos );
fs->read( (char *)&mdfHeader, sizeof(mdfHeader_t) );
pos += mdfHeader.length;
fs->read( (char *)&eventHeader, sizeof(eventHeader_t) );
fs->read( (char *)&rawHeader, sizeof(rawHeader_t) );
fs->read( (char *)&ingressHeader, sizeof(ingressHeader_t) );
fs->read( (char *)&l1Header_xc0, sizeof(l1Header_xc0_t) );
fs->read(data, dataLength);
printf("Data=%#x\n",data);
std::cout << "counter: " << c << "\n";
c++;
}
fs->close();
}
As you can see, I print out data, which should be the same each time, but yields a different value. mdfHeader.length is the length of one block of data.
The first things to change are:
The condition eof() is only really useful to determine why reading data failed but it isn't a useful condition for a loop.
You need to check after reading that you successfully read the data you are interested in.
That, the loop would look something like this:
while (*fs) {
// read data from fs
if (*fs) {
// do something with the data
}
else if (!fs->eof()) {
std::cout << "ERROR: failed to read record\n";
}
}
I'd also guess that you don't need the seeks and it is a good idea to get rid of them: seeking is relatively expensive because it looses any buffer. You didn't show the entire code but the initial value of pos has a fair chance to provide some level of randomness. Also, you assume that the sequence of bytes you are reading matches how the data is laid out in your computer. Typically, that isn't the case and you generally need to adjust the binary format, e.g., to accommodate different sizes of words, different endianess, padding, etc.
Computer is like mathematics, every thing is certain(even for functions like rand if input be the same, the output is also same as before) So if you run a code a hundred time with same input and state you will certainly get same output, unless input or running state changed.
You say that input is same each time you execute the code, so only thing that is changed is running state( for example malloc may return 2 different value each time that you run the program, because it may work in different state, because its state will be indicated by the OS ).
In your code you use printf("Data=%#x\n",data); to output your data, but it actually just print address of data as HEX value, so it is very natural that in multiple runs of the program this address may changed because OS map your executive to different positions or anything else. You should output content of the data and you will see that it will be same as previous run

Strange performance issues reading from stdout

I'm working on some code that will be used to test other executables. For convenience I'll refer to my code as the tester and the code being tested as the client. The tester will spawn the client and send commands to the client's stdin and receive results from the client's stdout.
I wanted to do some performance testing first so I wrote a very simple example tester and client. The tester waits for the client to write "READY" to its stdout and in response it sends "GO" to the client's stdin. The client then writes some number of bytes to stdout, configured via a command line flag, and then writes "\nREADY\n" at which point the tester will again write "GO". This repeats 10,000 times after which I calculate the time it took to complete the test and the "throughput", the 10,000 divided by the time to complete.
I ran the above test having the client send 0, 10, 100, 1000, 10000, and 100000 bytes of data before it sends "READY". For each byte size I repeated the test 10 times and took the average. When run on my laptop in an Ubuntu VMWare instance I got a throughput of about 100k GO/READY pairs per second. The performance was fairly stable and had virtually no dependence on the number of binary bytes the client sends to the tester. I then repeated the test on a very fast, 24 core server running CentOS. With a 0 byte payload I observed only about 55k GO/READY pairs per second and the performance degraded noticably as the number of bytes the client sent increased. When the client sends 100k bytes between "GO" and "READY" the throughput was only about 6k operations per second.
So I have three questions
Why would the same code run much more slowly on a faster machine
Why would the performance in the virtual machine be independent of payload size but the performance on the fast server be heavily dependent on payload size?
Is there anything I can do to make things faster on the server
One possible explanation is that I recompiled the code on the fast server and it is using a different version of the C++ libraries. The VMWare machine is running Ubuntu 11.10 and the fast sever is running CentOS 6. Both are 64 bit machines.
The relevant tester code is as follows:
ios_base::sync_with_stdio(false);
const int BUFFER_SIZE = 2 << 20;
char buffer[BUFFER_SIZE];
process_stdout->rdbuf()->pubsetbuf(buffer, BUFFER_SIZE);
Timer timer;
// Wait until the process is ready
string line;
line.reserve(2 << 20);
getline(*process_stdout, line);
CHECK(line == "READY");
timer.Start();
for (int i = 0; i < num_trials; ++i) {
*process_stdin << "GO\n";
process_stdin->flush();
line = "";
while (line != "READY") {
getline(*process_stdout, line);
}
}
double elapsed = timer.Elapsed();
cout << "Done. Did " << num_trials << " iterations in "
<< elapsed << " seconds. Throughput: "
<< double(num_trials) / elapsed << " per second." << endl;
I also tried versions using read() calls (from unistd.h) into a 1MB buffer and calls to memchr to find the "\n" characters and look for READY but got the same performance results.
The relevant client code is as follows:
// Create a vector of binary data. Some portion of the data will be sent
// to stdout each time a "GO" is received before sending "READY"
vector<char> byte_source;
const int MAX_BYTES = 1 << 20;
for (int i = 0; i < MAX_BYTES; ++i) {
byte_source.push_back(i % 256);
}
cout << "READY" << endl;
while (cin.good()) {
string line;
getline(cin, line);
if (line == "GO") {
// The value of response_bytes comes from a command line flag
OutputData(response_bytes, byte_source);
cout << "READY" << endl;
}
}
// write bytes worth of data from byte_source to stdout
void OutputData(unsigned int bytes,
const vector<char>& byte_source) {
if (bytes == 0) {
return;
}
cout.write(&byte_source[0], bytes);
cout << "\n";
}
Any help would be greatly appreciated!
The fact that the speed in VM is independent of the payload size indicates that you're doing something wrong. These are not complete programs so it's hard to pinpoint what. Use strace to see what's going on, i.e., whether the client actually does send all data you believe it is (and also check that the tester is receiving all data it should be).
100k READY/GO pairs is way too much; it's basically near the upper limit of the number of context switches per second, without doing anything else.