Strange performance issues reading from stdout - c++

I'm working on some code that will be used to test other executables. For convenience I'll refer to my code as the tester and the code being tested as the client. The tester will spawn the client and send commands to the client's stdin and receive results from the client's stdout.
I wanted to do some performance testing first so I wrote a very simple example tester and client. The tester waits for the client to write "READY" to its stdout and in response it sends "GO" to the client's stdin. The client then writes some number of bytes to stdout, configured via a command line flag, and then writes "\nREADY\n" at which point the tester will again write "GO". This repeats 10,000 times after which I calculate the time it took to complete the test and the "throughput", the 10,000 divided by the time to complete.
I ran the above test having the client send 0, 10, 100, 1000, 10000, and 100000 bytes of data before it sends "READY". For each byte size I repeated the test 10 times and took the average. When run on my laptop in an Ubuntu VMWare instance I got a throughput of about 100k GO/READY pairs per second. The performance was fairly stable and had virtually no dependence on the number of binary bytes the client sends to the tester. I then repeated the test on a very fast, 24 core server running CentOS. With a 0 byte payload I observed only about 55k GO/READY pairs per second, and the performance degraded noticeably as the number of bytes the client sent increased. When the client sends 100k bytes between "GO" and "READY" the throughput was only about 6k operations per second.
So I have three questions:
Why would the same code run much more slowly on a faster machine?
Why would the performance in the virtual machine be independent of payload size but the performance on the fast server be heavily dependent on payload size?
Is there anything I can do to make things faster on the server?
One possible explanation is that I recompiled the code on the fast server and it is using a different version of the C++ libraries. The VMWare machine is running Ubuntu 11.10 and the fast server is running CentOS 6. Both are 64 bit machines.
The relevant tester code is as follows:
ios_base::sync_with_stdio(false);
const int BUFFER_SIZE = 2 << 20;
char buffer[BUFFER_SIZE];
process_stdout->rdbuf()->pubsetbuf(buffer, BUFFER_SIZE);
Timer timer;
// Wait until the process is ready
string line;
line.reserve(2 << 20);
getline(*process_stdout, line);
CHECK(line == "READY");
timer.Start();
for (int i = 0; i < num_trials; ++i) {
  *process_stdin << "GO\n";
  process_stdin->flush();
  line = "";
  while (line != "READY") {
    getline(*process_stdout, line);
  }
}
double elapsed = timer.Elapsed();
cout << "Done. Did " << num_trials << " iterations in "
     << elapsed << " seconds. Throughput: "
     << double(num_trials) / elapsed << " per second." << endl;
I also tried versions using read() calls (from unistd.h) into a 1MB buffer and calls to memchr to find the "\n" characters and look for READY but got the same performance results.
The relevant client code is as follows:
// Create a vector of binary data. Some portion of the data will be sent
// to stdout each time a "GO" is received before sending "READY"
vector<char> byte_source;
const int MAX_BYTES = 1 << 20;
for (int i = 0; i < MAX_BYTES; ++i) {
  byte_source.push_back(i % 256);
}
cout << "READY" << endl;
while (cin.good()) {
  string line;
  getline(cin, line);
  if (line == "GO") {
    // The value of response_bytes comes from a command line flag
    OutputData(response_bytes, byte_source);
    cout << "READY" << endl;
  }
}
// write bytes worth of data from byte_source to stdout
void OutputData(unsigned int bytes,
                const vector<char>& byte_source) {
  if (bytes == 0) {
    return;
  }
  cout.write(&byte_source[0], bytes);
  cout << "\n";
}
Any help would be greatly appreciated!

The fact that the speed in the VM is independent of the payload size indicates that you're doing something wrong. These are not complete programs, so it's hard to pinpoint what. Use strace to see what's going on, i.e., whether the client actually sends all the data you believe it does (and also check that the tester is receiving all the data it should).
100k READY/GO pairs per second is way too much; it's basically near the upper limit of the number of context switches per second, without doing anything else.
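One way to act on that advice from inside the tester is to count the bytes that actually arrive before each "READY" line. The following is only a sketch, assuming a POSIX pipe file descriptor and the protocol described in the question; the function name count_payload_until_ready is hypothetical, and it assumes nothing follows "READY\n" until the next "GO" is sent.

```cpp
#include <cstddef>
#include <string>
#include <unistd.h>

// Reads from fd until a line consisting of exactly "READY" arrives.
// Returns the number of bytes received before that line (for a non-empty
// payload this includes the '\n' the client writes after the data),
// or -1 if EOF or a read error occurs first.
long count_payload_until_ready(int fd) {
    const std::string marker = "READY\n";
    std::string pending;
    char chunk[65536];
    for (;;) {
        ssize_t n = read(fd, chunk, sizeof chunk);
        if (n <= 0) {
            return -1;  // EOF or error before READY was seen
        }
        pending.append(chunk, static_cast<std::size_t>(n));
        // Accept the marker only at the start of a line, since the binary
        // payload may itself contain '\n' bytes.
        for (std::size_t pos = pending.find(marker);
             pos != std::string::npos;
             pos = pending.find(marker, pos + 1)) {
            if (pos == 0 || pending[pos - 1] == '\n') {
                return static_cast<long>(pos);
            }
        }
    }
}
```

If the count stays the same for every value of the command line flag, the client is not sending what the test assumes.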

Data not written with ofstream, even though success is returned

I'm writing a program which fetches a large number of email files using libcurl and then writes the file to disk, and then generates a receipt.
My problem is that, whilst most of the receipts seem to get written, the majority of the emails aren't written to disk. Worse, even though the file doesn't get written, ofstream returns success - so the receipt gets written even if the file write didn't complete successfully.
My guess is that, because ofstream is asynchronous, if a write doesn't complete in time then it'll get dropped on the floor - only a certain number of writes being possible concurrently. I am just guessing here.
Perhaps I need to refactor my code to write synchronously - but I can't believe that that's necessary. Does anyone have any idea how I can make this work?
The email sizes range from a few KBytes to a couple of MBytes.
int write_file(string filename, string mail_item) {
    ofstream out(filename.c_str());
    out << mail_item;
    out.close();
    out.flush();
    if (!out) {
        return FUNCTION_FAILED;
    }
    return FUNCTION_SUCCESS;
}
This is part of another function, and has been cut out so that only the salient code for this question is shown.
vector<string> directory = curl_listroot(curl);
for (int i=0; i<directory.size(); i++) {
    vector<int> mail_list = curl_search(curl,directory[i],make_vector<string>() << "SEEN" << "RECENT" << "NEW" << "ANSWERED" << "FLAGGED");
    for (int j=0; j<mail_list.size(); j++) {
        curl_reset(curl, imap.username, imap.password);
        string mail_item = curl_fetch(curl,directory[i],mail_list[j]);
        if (mail_item.compare("") != 0) {
            string m_id = getMessageID(mail_item);
            string filename = save_path+"/"+RECEIPTNAME+"/"+clean_filename(m_id) + ".eml";
            if (!file_exists(filename)) {
                string real_filename;
                real_filename = save_path+"/"+INBOXNAME+"/"+clean_filename(m_id) + ".eml";
                int success = write_file(real_filename, mail_item);
                if (success == FUNCTION_SUCCESS) {
                    write_file(filename, ""); //write empty receipt
                }
            }
        }
    }
}
All suggestions gratefully received! Thank you!
Okay. I've found an answer - there may be better answers - but this one works for me. The problem seems to be in the OS (Linux, in this case) - ofstream completes, having handed the responsibility for writing the file to the OS, but the file hasn't actually been written yet (so whilst ofstream may be synchronous, the end to end write of the file, from data to file safely written to disk, isn't). Given that I'm banging away with a huge number of writes in quick succession (potentially thousands), this won't necessarily work. The OS may throw its hands in the air and drop a significant number of the file writes on the floor (hence my original request for a synchronous way of writing the files - end to end).
My solution is to pause after each write to give the OS time to catch up. It's inelegant though, and not as performant as it should be - it doesn't take half a second to write an empty file. Additionally, on slow storage, half a second might not be enough time. I'd welcome any clever suggestions for how to improve my code.
int write_file(string filename, string mail_item) {
    ofstream out(filename.c_str());
    if (!out) {
        return FUNCTION_FAILED;
    }
    out << mail_item << endl;
    out.flush();
    usleep(500000); //wait for half a second to give the OS time to output the file
    if (!out) {
        return FUNCTION_FAILED;
    }
    out.close();
    if (!out) {
        return FUNCTION_FAILED;
    }
    return FUNCTION_SUCCESS;
}

Strange SIGPIPE in loop

After dealing with a very strange error in a C++ program I was writing, I decided to write the following test code, confirming my suspicion. In the original program, calling send() and this_thread::sleep_for() (with any amount of time) in a loop 16 times caused send to fail with a SIGPIPE signal. In this example however, it fails after 4 times.
I have a server running on port 25565 bound to localhost. The original program was designed to communicate with this server. I'm using the same one in this test code because it doesn't terminate connections early.
int main()
{
    struct sockaddr_in sa;
    memset(sa.sin_zero, 0, 8);
    sa.sin_family = AF_INET;
    inet_pton(AF_INET, "127.0.0.1", &(sa.sin_addr));
    sa.sin_port = htons(25565);
    cout << "mark 1" << endl;
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    connect(sock, (struct sockaddr *) &sa, sizeof(sa));
    cout << "mark 2" << endl;
    for (int i = 0; i < 16; i++)
    {
        cout << "mark 3" << endl;
        cout << "sent " << send(sock, &i, 1, 0) << " byte" << endl;
        cout << "errno == " << errno << endl;
        cout << "i == " << i << endl;
        this_thread::sleep_for(chrono::milliseconds(2));
    }
    return 0;
}
Running it in GDB is how I discovered it was emitting SIGPIPE. Here is the output of that: http://pastebin.com/gXg2Y6g1
In another test, I called this_thread::sleep_for() 16 times in a loop, THEN called send() once. This did NOT produce the same error. It ran without issue.
In yet another test, I commented out the thread sleeping line, and it ran all the way through just fine. I did this in both the original program and the above test code.
These results make me believe it's not a case of the server closing the connection, even though that's usually what SIGPIPE means (why did it run fine when there was no call to this_thread::sleep_for()?).
Any ideas as to what could be causing this? I've been messing around with it for a week and have gotten no further.
Running this on my machine prints up to mark 3 once, as I expected it to. The fact that it does run several times on your end tells me that you have a server listening on port 25565, which you have not included in this question.
Your problem is that you are not testing to see whether the server, of which you have not told us, closed the connection. When it does, your process gets a SIGPIPE. Since you do not handle that signal, your process quits.
What you can do in order to fix this:
Start checking return values of functions. It wouldn't have helped in this particular case, but you ignore potential errors from both connect and send. I'm hoping this is because of minimizing the program, but it is worth mentioning.
Handle the signal. If you prefer to handle server closes from the main flow of your code, you can either register a handler that ignores the signal, or pass the flag MSG_NOSIGNAL to send. In both cases, send will return -1 with errno set to EPIPE when the server has closed the connection.
RTFM. Seriously. A simple man send and a search for SIGPIPE would give you this answer.
As for why the server closed, I cannot answer this without knowing what server it is and what protocol it is running. No, don't answer that question in the comments. It is irrelevant to this question. The simple truth of the matter is that a server you are talking to might close the connection at any time, and your code must be able to deal with that.

Clearing a read() buffer while using a socket

Recently I've been messing around with some sockets by trying to make a client/server program. So far I have been successful, but it seems I hit a roadblock. For some quick background information, I made a server that can accept a connection, and once everything is set up and a connection to a client is made, this block of code begins to execute:
while(1){
    read(newsockfd, &inbuffer, 256);
    std::cout << "Message from client " << inet_ntoa(cli_addr.sin_addr) << " : ";
    for(int i = 0; i < sizeof(inbuffer); i++){
        std::cout << inbuffer[i];
    }
    std::cout << std::endl;
}
Now the client simply, when executed, connects to the server and writes to the socket, and then exits. So since one message was sent, this loop should only run once, and then wait for another message if what I read was correct.
But what ends up happening is that this loop continues over and over, printing the same message over and over. From what I read (on this site and others) about the read() function, after it is called once it waits for another message to be received. I may be making a stupid mistake here, but is there any way I can have this read() function wait for a new message, instead of using the same old message over and over? Or is there another function that could replace read() to do what I want it to?
Thanks for any help.
You don't check the return value of read. So if the other end closes the connection or there's an error, you'll just loop forever outputting whatever happened to be in the buffer. You probably want:
while(1){
    int msglen = read(newsockfd, &inbuffer, 256);
    if (msglen <= 0) break;
    std::cout << "Data from client " << inet_ntoa(cli_addr.sin_addr) << " : ";
    for(int i = 0; i < msglen; i++){
        std::cout << inbuffer[i];
    }
    std::cout << std::endl;
}
Notice that I changed the word "message" to "data". Here's why:
"So since one message was sent, this loop should only run once, and then wait for another message if what I read was correct."
This is incorrect. The code above does not have any concept of a "message", and TCP does not preserve application message boundaries. So not only is this wrong, there's no way it could be correct because the word "message" has no meaning that could possibly apply in this context. TCP does not "glue together" the bytes that happened to be passed in a single call to a sending function.
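If the application does need messages, it has to define their boundaries itself. Here is a small sketch of one common choice, newline-terminated messages, assuming the client ends each message with '\n' (the question does not specify a protocol, so that is an assumption). The class name LineFramer is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Accumulates raw bytes from a stream socket and hands back complete
// '\n'-terminated messages, however the bytes were split across reads.
class LineFramer {
public:
    // Feed the bytes returned by one read(); returns the messages they complete.
    std::vector<std::string> feed(const char* data, std::size_t len) {
        pending_.append(data, len);
        std::vector<std::string> messages;
        std::size_t start = 0;
        for (;;) {
            std::size_t nl = pending_.find('\n', start);
            if (nl == std::string::npos) break;  // last message still incomplete
            messages.push_back(pending_.substr(start, nl - start));
            start = nl + 1;
        }
        pending_.erase(0, start);  // keep only the unfinished tail
        return messages;
    }

private:
    std::string pending_;
};
```

In the loop above, each successful read would be passed as framer.feed(inbuffer, msglen), and only the returned strings would be treated as messages.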

How to multithread file processing in C++?

I'm working on one problem where I need to process 24 files (each size = 3 GB) and write the output into multiple files (24). Each file takes around 1 hour to process. Is it possible to write data into multiple files concurrently using multi-threading with below code?
int _tmain(int argc, _TCHAR* argv[])
{
    std::string path;
    cout << "Enter the folder of the logs: " << endl;
    cin >> path;
    WIN32_FIND_DATA FileInformation; // File information
    memset(&FileInformation, 0, sizeof(WIN32_FIND_DATA));
    std::string strExt = "\\*.txt";
    std::string strEscape = "\\";
    std::string strPattern = path + strExt;
    HANDLE hFile = ::FindFirstFile(strPattern.c_str(), &FileInformation);
    while(hFile != INVALID_HANDLE_VALUE)
    {
        int offset;
        std::ifstream Myfile;
        std::string strFileName = FileInformation.cFileName;
        std::string fullPath = path + strEscape + strFileName;
        std::string outputFile = path + strEscape + strFileName.substr(0, strFileName.length()-3) + "processed"+".txt";
        std::ofstream ofs(outputFile, std::ofstream::out);
        Myfile.open (fullPath);
        std::string line;
        if(Myfile.is_open())
        {
            while(!Myfile.eof())
            {
                // -------Processing--------
            }
            Myfile.close();
        }
        else
            cout<<"Cannot open file."<<endl;
        if(FindNextFile(hFile, &FileInformation) == FALSE)
            break;
    }
    // Close handle
    ::FindClose(hFile);
    return 0;
}
Looking at your code, I assume you produce one output file from each input file. In that case you do not need to write multithreaded code to check whether processing multiple files at once will speed things up. Just modify your program to accept a file name as a parameter and run several instances in parallel. But unless you are reading from and writing to an SSD, such parallel processing will most probably slow things down, as the hard drive will have to switch between reading and writing at multiple positions, and head positioning is slow.
It is not clear what your processing does, but if it uses 100% CPU then you will most probably speed things up significantly by processing one file with multiple threads. You would have one thread reading, then a thread pool processing, then one thread writing. The tricky part would be synchronizing the data so that it does not appear in the output file in the wrong order.
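The ordering problem can be sketched with std::async: the futures are kept in input order, so results are collected in that order no matter which task finishes first. This is only an illustration under assumptions; the name process_in_order and the chunking of a file into strings are hypothetical, and it starts one task per chunk, whereas a real pipeline would cap the number of concurrent tasks.

```cpp
#include <functional>
#include <future>
#include <string>
#include <vector>

// Runs process on every chunk concurrently and returns the results
// in the same order as the input chunks.
std::vector<std::string> process_in_order(
        const std::vector<std::string>& chunks,
        const std::function<std::string(const std::string&)>& process) {
    std::vector<std::future<std::string>> pending;
    pending.reserve(chunks.size());
    for (const std::string& chunk : chunks) {
        // Each task only reads its own chunk, so no locking is needed here.
        pending.push_back(std::async(std::launch::async,
                                     [&process, &chunk] { return process(chunk); }));
    }
    std::vector<std::string> results;
    results.reserve(pending.size());
    for (std::future<std::string>& f : pending) {
        results.push_back(f.get());  // waits for this chunk; preserves input order
    }
    return results;
}
```

A single writer thread would then append the returned strings to the output file one after another.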
Don't write multithreaded code here, write multiprocess code. That is, have your program process one file (which is passed as an argument), and call it multiple times in parallel from a script.
Don't run your program 24 times concurrently (unless you have 24 cores and 72GB of memory available). Try running 2, 4 or 6 instances concurrently and see what's best. I guess it'll be the number of cores, maybe the number of cores * 2 - 1 (hyperthreading does help). Try it out.
Also, if your program reads the file at the start, then performs the calculations, then writes the result, measure the time it takes to read the 3GB of data. If it's, for example, 30 seconds, and you run 4 processes concurrently, have your run script start the first instance, then wait 45 seconds, then start the second one and so on until the fourth. Start the fifth instance once one of the first four is finished. Every time another instance finishes, run the next one until all 24 have been run.

gpsd client data buffer

I am developing a C++ application that should retrieve the received NMEA sentences of type $GPGGA, using gpsd. The idea is to read from gpsd approximately once per second and to parse the last $GPGGA sentence received, extracting the two fields I am interested in: the quality indicator and the reference station ID. I used the C++ libgpsmm library, periodically calling gpsmm::read() and gpsmm::data() to access the client data buffer directly.
At first, I made several tests using gpsfake and a fake GPS log (specifying the gpsfake option "-c 0.5", in order to have two sentences per second). The results are OK when the time between two requests to gpsd is less than or equal to 400ms. If I try with a greater time, the results are unexpected: each reading contains a piece of NMEA sentences with lots of repeated data as well as some truncated sentences. Things are much worse when I try with a real GPS that writes ~40 sentences per second: in this case the time between readings has to be ~10ms or even less in order to get correct results.
The following is a simpler program that prints the NMEA sentences that are received. It works well, with the simulated and even with the real GPS. But if I uncomment the usleep() call, which makes the program check the buffer once per second, the client data buffer does not give reasonable results.
#include <iostream>
#include <unistd.h> // for usleep()
#include "libgpsmm.h"
using namespace std;
#define WAITING_TIME 5000000
#define RETRY_TIME 5
#define ONE_SECOND 1000000
int main(void)
{
    for(;;){
        //For version 3.7
        gpsmm gps_rec("localhost", DEFAULT_GPSD_PORT);
        if (gps_rec.stream(WATCH_ENABLE|WATCH_NMEA) == NULL) {
            cout << "No GPSD running. Retry to connect in " << RETRY_TIME << " seconds." << endl;
            usleep(RETRY_TIME * ONE_SECOND);
            continue; // It will try to connect to gpsd again
        }
        const char* buffer = NULL;
        for (;;) {
            struct gps_data_t* newdata;
            if (!gps_rec.waiting(WAITING_TIME))
                continue;
            if ((newdata = gps_rec.read()) == NULL) {
                cerr << "Read error.\n";
                break;
            } else {
                buffer = gps_rec.data();
                // We print the NMEA sentences!
                cout << "***********" << endl;
                cout << buffer << endl;
                //usleep(1000000);
            }
        }
    }
}
Here is the output having the usleep() call commented (ie. continually reading data):
$ ./GPSTest1
***********
{"class":"VERSION","release":"3.7","rev":"3.7","proto_major":3,"proto_minor":7}
***********
{"class":"WATCH","enable":true,"json":false,"nmea":true,"raw":0,"scaled":false,"timing":false}
***********
$GPGGA,202010.00,3313.9555651,S,06019.3785868,W,4,09,1.0,39.384,M,16.110,M,10.0,*46<CR><LF>
***********
$GPGGA,202011.00,3313.9555664,S,06019.3785876,W,4,09,1.0,39.386,M,16.110,M,11.0,*4D<CR><LF>
***********
$GPGGA,202012.00,3313.9555668,S,06019.3785882,W,4,09,1.0,39.394,M,16.110,M,12.0,*49<CR><LF>
***********
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,4,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
***********
$GPGGA,202014.00,3313.9555670,S,06019.3785907,W,4,09,1.0,39.409,M,16.110,M,14.0,*4F<CR><LF>
***********
$GPGGA,202015.00,3313.9555657,S,06019.3785905,W,4,09,1.0,39.395,M,16.110,M,15.0,*4A<CR><LF>
And this is the output when the line is uncommented (i.e. the buffer is checked once per second):
$ ./GPSTest2
***********
{"class":"VERSION","release":"3.7","rev":"3.7","proto_major":3,"proto_minor":7}
***********
{"class":"DEVICE","path":"/dev/pts/0","activated":"2012-11-05T23:48:38.110Z","driver":"Generic NMEA","native":0,"bps":4800,"parity":"N","stopbits":1,"cycle":1.00}
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,1,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
0}
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,1,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
":"Generic NMEA","native":0,"bps":4800,"parity":"N","stopbits":1,"cycle":1.00}
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,1,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
***********
$GPGGA,202013.00,3313.9555673,S,06019.3785911,W,1,09,1.0,39.395,M,16.110,M,13.0,*49<CR><LF>
***********
$GPGGA,202016.00,3313.9555642,S,06019.3785894,W,1,09,1.0,39.402,M,16.110,M,16.0,*4E<CR><LF>
$GPGGA,202017.00,3313.9555643,S,06019.3785925,W,1,09,1.0,39.404,M,16.110,M,17.0,*42<CR><LF>
$GPGGA,202017.00,3313.9555643,S,06019.3785925,W,1,09,1.0,39.404,M,16.110,M,17.0,*42<CR><LF>
$GPGGA,202017.00,3313.9555643,S,06019.3785925,W,1,09,1.0,39.404,M,16.110,M,17.0,*42<CR><LF>
***********
Any suggestion? At first, I tried to directly analyze the gps_data_t structure, but it seems to be harder to identify the quality indicator and the reference station ID that way, among all the fields of the structure, in comparison with the search within an NMEA sentence.
I am not familiar with the gpsd service, but what you describe looks a lot like the receive buffer being corrupted (overwritten). The GPS receiver outputs NMEA information continually, and while your application is sleeping these characters accumulate in the buffer; if too many characters are received, the buffer gets overwritten.
Either increase the serial port receiver buffer size (if possible) or maybe clear the buffer after the wakeup and then wait for the next GGA message (which may be up to one second away in the worst case).
The GPS receiver should be configured to output information at 1 Hz (once per second); in that case the device should only output about 8 sentences per second. If you're seeing 40 sentences, then your receiver would seem to be outputting information at around 5 Hz, which sounds like overkill for your particular interest.
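Once a complete $GPGGA sentence is in hand, the two fields the question asks about can be pulled out by splitting on commas. The sketch below assumes the standard GGA field order (fix quality is the 6th data field, differential reference station ID the 14th and last, just before the '*' checksum). The names GgaFields and parse_gga are hypothetical, and the checksum is not verified.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct GgaFields {
    std::string quality;     // fix quality indicator, e.g. "4"
    std::string station_id;  // differential reference station ID, may be empty
};

// Extracts the quality indicator and reference station ID from one
// $GPGGA sentence. Returns false if the sentence is not a GGA sentence
// or has too few fields.
bool parse_gga(const std::string& sentence, GgaFields& out) {
    if (sentence.compare(0, 7, "$GPGGA,") != 0) {
        return false;
    }
    // Drop the "*hh" checksum and anything after it (such as CR/LF).
    const std::string body = sentence.substr(0, sentence.find('*'));
    std::vector<std::string> fields;
    std::size_t start = 0;
    for (;;) {
        std::size_t comma = body.find(',', start);
        if (comma == std::string::npos) {
            fields.push_back(body.substr(start));  // last field, possibly empty
            break;
        }
        fields.push_back(body.substr(start, comma - start));
        start = comma + 1;
    }
    if (fields.size() < 15) {
        return false;  // truncated sentence
    }
    out.quality = fields[6];
    out.station_id = fields[14];
    return true;
}
```

For the first sentence in the output above this yields quality "4" and an empty station ID; truncated sentences like the ones in the second output are rejected rather than misparsed.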