We are using the following method to write the log to a log file. The log entries are kept in a vector named m_LogList (STL string entries are kept in the vector), and the method is called when the size of the vector exceeds 100. The CPU utilization of the log server is around 20-40% when we call the FlushLog method; if we comment out the call to FlushLog, the CPU utilization drops to the 10-20% range.
What optimizations can I use to reduce the CPU utilization? We are using an fstream object for writing the log entries to the file.
void CLogFileWriter::FlushLog()
{
    CRCCriticalSectionLock lock(m_pFileCriticalSection);
    // The entire content of the vector is written to the file
    if(0 < m_LogList.size())
    {
        for (int i = 0; i < (int)m_LogList.size(); ++i)
        {
            m_ofstreamLogFile << m_LogList[i].c_str() << endl;
            m_nSize = m_ofstreamLogFile.tellp();
            if(m_pLogMngr->NeedsToBackupFile(m_nSize))
            {
                // Backup the log file
            }
        }
        m_ofstreamLogFile.flush();
        m_LogList.clear(); // Clearing the content of the log list
    }
}
The first optimization I'd use is to drop the .c_str() in << m_LogList[i].c_str(). It forces operator<< to do a strlen (O(n)) instead of relying on string::size (O(1)).
Also, I'd just sum string sizes, instead of calling tellp.
Finally, << endl performs a flush on every line. Just use << '\n'; you already have the flush at the end.
I'd consider first of all dumping the log in one stdlib call like this:
std::copy(list.begin(), list.end(), std::ostream_iterator<std::string>(m_ofstreamLogFile, "\n"));
This removes the per-line flush caused by endl and the unnecessary conversion to a C string. CPU-wise this should be quite efficient.
You can do the backup afterwards unless you really care about a very specific limit, but even in that case I'd say: back up at some lower threshold so that you can account for some overflow.
Also, remove if(0 < m_LogList.size()), it's not really necessary for anything.
A few comments:
if(0 < m_LogList.size())
Should be:
if(!m_LogList.empty())
Although with a vector it shouldn't make a difference.
Also you should consider moving the
m_nSize = m_ofstreamLogFile.tellp();
if(m_pLogMngr->NeedsToBackupFile(m_nSize)) { /*...*/ }
out of the loop. You don't say how much CPU it uses, but I'd bet it is heavy.
You could also iterate using iterators:
for (int i = 0; i < (int)m_LogList.size(); ++i)
Should be:
for (std::vector<std::string>::iterator it = m_LogList.begin();
it != m_LogList.end(); ++it)
Lastly, change the line:
m_ofstreamLogFile << m_LogList[i].c_str() << endl;
into:
m_ofstreamLogFile << m_LogList[i] << '\n';
The .c_str() is unnecessary, and endl writes an EOL and flushes the stream. You do not want to do that, as you are already flushing once after the loop.
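Putting the answers together, a rewritten FlushLog could look roughly like this (just a sketch reusing the member names from the question; it assumes, as suggested above, that the backup check only needs to run once per flush rather than once per line):
void CLogFileWriter::FlushLog()
{
    CRCCriticalSectionLock lock(m_pFileCriticalSection);
    if (m_LogList.empty())
        return;

    // '\n' instead of endl: nothing is flushed per line
    for (std::vector<std::string>::const_iterator it = m_LogList.begin();
         it != m_LogList.end(); ++it)
    {
        m_ofstreamLogFile << *it << '\n';
    }

    // One flush and one backup check for the whole batch
    m_ofstreamLogFile.flush();
    m_nSize = m_ofstreamLogFile.tellp();
    if (m_pLogMngr->NeedsToBackupFile(m_nSize))
    {
        // Backup the log file
    }

    m_LogList.clear();
}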
My program reads a file, processes it and saves the results in a csv file.
The whole thing runs in a loop in which many different files are processed, and a separate csv file is generated for each of these files.
I was able to implement the processing itself very efficiently, so saving the respective results is now the longest step in the loop.
The results are available as a vector<float*> and are currently saved as follows:
std::vector<float*> out = calculation(bla);
fstream data;
data.open(savepfad + name + ".csv", ios::out);
data << sizex << endl;
data << sizey << endl;
data << dim << endl;
for (int d = 0; d < dim; d++)
{
    for (int x = 0; x < sizex * sizey; x++)
    {
        data << out[d][x] << ",";
    }
    data << endl;
}
data.close();
My first thought was to simply offload the saving to a new thread (possibly with a fork) so that I could continue with the main loop, but I am on Windows.
Can I somehow write the data to the hard drive faster?
Does anyone have a brilliant idea?
EDIT:
So I rebuilt the code according to the suggestions, but there is no real speed advantage. The code now looks like this:
std::vector<float*> out = calculation(bla);
string line = std::to_string(sizex) + "\n" + std::to_string(sizey) + "\n" + std::to_string(dim) + "\n";
for (int d = 0; d < dim; d++)
{
    for (int x = 0; x < sizex * sizey; x++)
    {
        line += std::to_string(out[d][x]);
        line += ",";
    }
    line += "\n";
}
fstream data;
data.open(savepfad + name + ".csv", ios::out);
data << line;
data.close();
I also noticed that if out[][] == 0, std::to_string(out[][]) turns the 0 into 0.000000, whereas data << out[][] writes only 0 into the file; this grows the file from about 8000 KB to 36000 KB.
So if I can dump 100 MB onto the hard disk almost instantly in Python, I should be able to write 8000 KB relatively quickly; currently it takes between 1 and 2 minutes.
example size:
sizex = 638
sizey = 958
dim = 8
The time measurement shows that almost the entire time is spent in the two loops. out is a vector of arrays; is the access to out too slow?
data << endl sends a newline AND flushes the result to disk.
You could do
data << "\n";
instead to send a newline without flushing.
The end result is that you flush fewer times, which means you spend less time waiting for the OS.
If that is still not fast enough, consider buffering everything into a std::ostringstream and dumping that into data in one go.
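A minimal sketch of that buffering idea, reusing the variable names from the question (out, sizex, sizey, dim, savepfad, name) and assuming they are in scope:
#include <fstream>
#include <sstream>

std::ostringstream buffer;
buffer << sizex << '\n' << sizey << '\n' << dim << '\n';
for (int d = 0; d < dim; d++)
{
    for (int x = 0; x < sizex * sizey; x++)
        buffer << out[d][x] << ',';
    buffer << '\n';
}

std::ofstream data(savepfad + name + ".csv");
data << buffer.str();   // one big write; the file is flushed once when data is closed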
There are a couple of things you can do which may help, I would try implementing them one after another and measure the performance.
Don't flush after every line:
std::endl actually flushes the buffer and forces the data out to the drive; that is probably what is killing the performance, so use << '\n' instead.
You can try to minimize memory allocation and copying if you buffer every line (or several lines) before writing it out. I would reserve a big string (std::string line; line.reserve(<a number big enough for the full line>);) and do line += std::to_string(out[d][x]); line += ',';
You can optimize this even further by using std::to_chars.
+1. If you are on Windows, you can try to use the latest MSVC: they reported a 5x speedup in float-to-string conversion (compared to the CRT functions) after implementing to_chars. https://www.youtube.com/watch?v=4P_kbF0EbZM
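For reference, a sketch of what the std::to_chars route might look like (requires C++17 <charconv> with floating-point support, e.g. a recent MSVC; append_float is a hypothetical helper, not an existing API):
#include <charconv>
#include <string>

// Append one float to the output line without the locale and allocation
// overhead of std::to_string.
inline void append_float(std::string& line, float value)
{
    char buf[32];   // comfortably large enough for a float
    std::to_chars_result res = std::to_chars(buf, buf + sizeof(buf), value);
    line.append(buf, res.ptr);
    line += ',';
}
Calling append_float(line, out[d][x]) in the inner loop would then replace the std::to_string call.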
I made this method to read a file and put its lines into a vector of strings:
std::vector<std::string> read_file_lines1(const char* filepath){
    std::vector<std::string> file;
    std::ifstream input(filepath);

    Timer timer;
    float time = 0;

    std::string line;
    int i = 0;
    while (getline(input, line)){
        timer.reset();
        file.push_back(line);
        time += timer.elapsed();
        if (i == 10000)
            std::cout << "10000 done" << std::endl;
        i = ((i + 1) % 10001);
    }
    std::cout << time << std::endl;
    return file;
}
But the performance was really bad in my opinion (200k lines in ~22 seconds).
With a small change, making it a vector<string*> (using file.push_back(new std::string(line))), the push_back calls went from ~16 seconds to ~1.2 seconds, which was a huge improvement (still behind my goals), but it has a small disadvantage: memory usage. If I want to free the memory used here I have to remember to loop over the vector and delete each string*.
Now the whole method takes ~6 seconds, ~5 of which are spent in getline, and I would really like to know how to optimize that or find an alternative.
PS: I am doing this to load a 3D model. Using the same model in Java it takes ~0.8 seconds to read everything AND filter it (putting each line into the vertex/texture/... arrays and then putting them in index order), so I am really disappointed that reading each line from a file takes this long in C++ (both measurements were made in debug mode, which probably makes it a poor benchmark, but I am still really disappointed).
The main reason it is slow is that the vector has to reallocate memory and move all of the strings to a new location every time its capacity is reached. Use std::deque instead of std::vector: a deque does not reallocate its existing storage, it just adds new chunks. Alternatively, you could preallocate the vector with the reserve method to avoid the reallocations.
Also, debug C++ code can be much slower than a release build, especially with a lot of template and/or inline code, so you really need to measure release performance. And you should time the whole loop once rather than per iteration; otherwise I suspect that in release mode you would be spending a lot of time in the timer code itself.
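A sketch of the reserve variant (the capacity figure is an arbitrary guess above the ~200k lines mentioned in the question; std::deque could be swapped in without the reserve call):
#include <fstream>
#include <string>
#include <vector>

std::vector<std::string> read_file_lines1(const char* filepath)
{
    std::vector<std::string> file;
    file.reserve(250000);                 // rough guess, avoids repeated reallocation

    std::ifstream input(filepath);
    std::string line;
    while (std::getline(input, line))
        file.push_back(std::move(line));  // C++11: hand the line's buffer to the vector instead of copying it

    return file;
}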
Another small optimization: instead of
if (i == 10000)
    std::cout << "10000 done" << std::endl;
i = ((i + 1) % 10001);
use:
if (i == 10000)
{
    std::cout << "10000 done" << std::endl;
    i = 0;
}
++i;
I was experimenting with c++ trying to figure out how I could print the numbers from 0 to n as fast as possible.
At first I just printed all the numbers with a loop:
for (int i = 0; i < n; i++)
{
    std::cout << i << std::endl;
}
However, I think this flushes the buffer after every single number that it outputs, and surely that must take some time, so I tried to first print all the numbers to the buffer (or actually until it is full, since it then seems to flush automatically) and flush it all at once afterwards. However, it seemed that printing a '\n' after each number flushes the buffer just like std::endl, so I omitted it:
for (int i = 0; i < n; i++)
{
    std::cout << i << ' ';
}
std::cout << std::endl;
This seems to run about 10 times faster than the first example. However, I want to know how to store all the values in the buffer and flush it all at once, rather than letting it flush every time it becomes full, so I have a few questions:
Is it possible to print a newline without flushing the buffer?
How can I change the buffer size so that I could store all the values inside it and flush it at the very end?
Is this method of outputting text dumb? If so, why, and what would be a better alternative to it?
EDIT: It seems that my results were biased by a laggy system (Terminal app of a smartphone)... With a faster system the execution times show no significant difference.
TL;DR: In general, using '\n' instead of std::endl is faster, since std::endl forces a flush of the buffer.
Explanation:
std::endl causes a flushing of the buffer, whereas '\n' does not.
However, you might or might not notice any speedup whatsoever depending upon the method of testing that you apply.
Consider the following test files:
endl.cpp:
#include <iostream>
int main() {
    for ( int i = 0 ; i < 1000000 ; i++ ) {
        std::cout << i << std::endl;
    }
}
slashn.cpp:
#include <iostream>
int main() {
    for ( int i = 0 ; i < 1000000 ; i++ ) {
        std::cout << i << '\n';
    }
}
Both of these are compiled using g++ on my Linux system and undergo the following tests:
1. time ./a.out
For endl.cpp, it takes 19.415s.
For slashn.cpp, it takes 19.312s.
2. time ./a.out >/dev/null
For endl.cpp, it takes 0.397s
For slashn.cpp, it takes 0.153s
3. time ./a.out >temp
For endl.cpp, it takes 2.255s
For slashn.cpp, it takes 0.165s
Conclusion: '\n' is definitely faster (even in practice), but the difference in speed can be dependent upon other factors. In the case of a terminal window, the limiting factor seems to be how fast the terminal itself can display the text: as the text is shown on screen and auto-scrolling etc. needs to happen, massive slowdowns occur in the execution. On the other hand, for normal files (like the temp example above), the rate at which the buffer is flushed affects it a lot. In the case of some special files (like /dev/null above), since the data is just sunk into a black hole, the flushing doesn't seem to have an effect.
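Regarding the second question above (storing everything and flushing only at the very end): one option, sketched here rather than prescribed, is to format the whole output into an in-memory string and hand it to std::cout with a single write:
#include <iostream>
#include <string>

int main() {
    std::string out;
    out.reserve(8 * 1000000);                  // rough upper bound: ~7 digits plus '\n' per number

    for (int i = 0; i < 1000000; i++) {
        out += std::to_string(i);
        out += '\n';
    }

    std::cout.write(out.data(), out.size());   // one large write instead of a million small ones
    std::cout.flush();                         // the only explicit flush
}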
I know endl or calling flush() will flush it. I also know that when you call cin after cout, it flushes too, and also when the program exits. Are there other situations in which cout flushes?
I just wrote a simple loop, and I didn't flush it, but I can see it being printed to the screen. Why? Thanks!
for (int i =0; i<399999; i++) {
cout<<i<<"\n";
}
Also, the time for it to finish is the same as with endl, both about 7 seconds.
for (int i =0; i<399999; i++) {
cout<<i<<endl;
}
There is no strict rule in the standard - only that endl WILL flush; beyond that, the implementation may flush at any time it "likes".
And of course, nearly 400K numbers at roughly 6 characters each comes to about 6 * 400K = 2.4MB of output, which is very unlikely to fit in the buffer, and the loop runs fast enough that you won't notice if there is a while between each batch of output. Try something like this:
for(int i = 0; i < 100; i++)
{
    cout << i << "\n";
    Sleep(1000);
}
(If you are using a Unix based OS, use sleep(1) instead - or add a loop that takes some time, etc)
Edit: It should be noted that this is not guaranteed to show any difference. I know that on my Linux machine, if you don't have a flush in this particular type of scenario, it doesn't output anything - however, some systems may do "flush on \n" or something similar.
Is there a more effective way to compare data bytewise than using the comparison
operator of the C++ list container?
I have to compare [large? 10 kByte < size < 500 kByte] amounts of data bytewise, to verify the integrity of external storage devices.
Therefore I read files bytewise and store the values in a list of unsigned chars.
The resources of this list are handled by a shared_ptr, so that I can pass it around in the program without the need to worry about the size of the list:
typedef boost::shared_ptr< list< unsigned char > > contentPtr;
namespace fs = boost::filesystem;

contentPtr GetContent( fs::path filePath ){
    contentPtr actualContent (new list< unsigned char > );
    // Read the file with a stream, put the read values into actualContent
    return actualContent;
}
This is done twice, because there are always two copies of the file.
The content of these two files has to be compared, and an exception is thrown if a mismatch is found:
void CompareContent() throw( NotMatchingException )
{
    // this part is very fast, below 50 ms
    contentPtr contentA = GetContent("/fileA");
    contentPtr contentB = GetContent("/fileB");

    // the next part takes about 2 secs with a file size of ~64 kByte
    if( *contentA != *contentB )
        throw( NotMatchingException() );
}
My problem is this:
With increasing file size, the comparison of the lists gets very slow. Working with file sizes of about 100 kByte, it takes up to two seconds to compare the content, and the time rises and falls with the file size.
Is there a more effective way of doing this comparison? Is it a problem of the used container?
Don't use a std::list use a std::vector.
std::list is a linked-list, elements are not guaranteed to be stored contiguously.
std::vector on the other hand seems far better suited for the specified task (storing contiguous bytes and comparing large chunks of data).
If you have to compare several files multiple times and don't care about where the differences are, you may also compute a hash of each file and compare the hashes. This would be even faster.
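If the hashing route is attractive, a self-contained sketch might look like the following (FNV-1a is used here purely as an illustration; any decent hash or an existing library would do, and equal hashes should still be confirmed with a byte-wise comparison if false positives are unacceptable):
#include <cstdint>
#include <fstream>
#include <string>

// FNV-1a over the raw bytes of a file: different hashes prove the files differ.
std::uint64_t HashFile(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    std::uint64_t hash = 14695981039346656037ULL;   // FNV offset basis
    char byte;
    while (in.get(byte))
    {
        hash ^= static_cast<unsigned char>(byte);
        hash *= 1099511628211ULL;                   // FNV prime
    }
    return hash;
}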
My first piece of advice would be to profile your code.
The reason I say that is that, no matter how slow your comparison code is, I suspect your file I/O time dwarfs it. You don't want to waste days trying to optimize a part of your code that only takes 1% of your runtime as-is.
It could even be that there is something else you didn't notice before that is actually causing the slowness. You won't know until you profile.
If there's nothing else to be done with the contents of those files (looks like you're going to let them get deleted by shared_ptr at the end of CompareContent()'s scope), why not compare the files using iterators, not creating any containers at all?
Here's a piece of my code that compares two files bytewise:
// compare files
if (equal(std::istreambuf_iterator<char>(local_f),
std::istreambuf_iterator<char>(),
std::istreambuf_iterator<char>(host_f)))
{
// we're good: move table to OutPath, remove other files
EDIT: if you do need to keep the contents around, I think std::deque might be slightly more efficient than std::vector for the reasons explained in GOTW #54 - or not; profiling will tell. And still, only one of the two identical files would need to be loaded into memory: I'd read one into a deque and compare it with the other file's istreambuf_iterator.
As you write, you are comparing the contents of two files. Then you can make use of Boost's mapped_file. You really do not need to read the whole file: you can read on the fly (in an optimized way, as Boost does) and stop when you find the first unequal byte...
Just like the very elegant solution in Cubbi's answer here: http://www.cplusplus.com/forum/general/94032/ Note that just below it he also adds some benchmarks which clearly show this is the fastest way. I will just rewrite his answer a bit, add a zero-file-size check (mapped_file_source throws an exception otherwise), and enclose the test in a function to benefit from early returns:
#include <iostream>
#include <algorithm>
#include <boost/iostreams/device/mapped_file.hpp>
#include <boost/filesystem.hpp>

namespace io = boost::iostreams;
namespace fs = boost::filesystem;

bool files_equal(const std::string& path1, const std::string& path2)
{
    fs::path f1(path1);
    fs::path f2(path2);

    if (fs::file_size(f1) != fs::file_size(f2))
        return false;

    // zero-sized files cannot be opened with mapped_file_source
    // hence we consider all zero-sized files equal
    if (fs::file_size(f1) == 0)
        return true;

    io::mapped_file_source mf1(f1.string());
    io::mapped_file_source mf2(f2.string());
    return std::equal(mf1.data(), mf1.data() + mf1.size(), mf2.data());
}

int main()
{
    if (files_equal("test.1", "test.2"))
        std::cout << "The files are equal.\n";
    else
        std::cout << "The files are not equal.\n";
}
std::list is monumentally inefficient for a char element - there is overhead for every element to facilitate O(1) insertion and removal, which is really not what your task requires.
If you must use STL, then either std::vector or the iterator approach suggested would be preferable to std::list, but why not just read the data into a char* wrapped in some smart pointer of your choice and use memcmp?
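A sketch of that approach (a hypothetical helper with error handling omitted; std::vector<char> serves as the contiguous buffer): read each file into memory once, then compare the buffers with a single memcmp.
#include <cstring>
#include <fstream>
#include <iterator>
#include <vector>

bool FilesEqual(const char* pathA, const char* pathB)
{
    std::ifstream a(pathA, std::ios::binary);
    std::ifstream b(pathB, std::ios::binary);

    // Slurp both files into contiguous storage
    std::vector<char> bufA((std::istreambuf_iterator<char>(a)), std::istreambuf_iterator<char>());
    std::vector<char> bufB((std::istreambuf_iterator<char>(b)), std::istreambuf_iterator<char>());

    if (bufA.size() != bufB.size())
        return false;
    return bufA.empty() || std::memcmp(&bufA[0], &bufB[0], bufA.size()) == 0;
}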
It is crazy to use anything other than memcmp for the comparison. (Unless you want it even faster, in which case you might want to code it in assembly language.)
In the interest of objectivity in the memcmp-vs-equal debate, I offer the following benchmark program, so that you can see for yourselves which, if any, is faster on your system. It also tests operator==. On my system (Borland C++ 5.5.1 for Win32):
std::equal: 1375 clock ticks
operator==: 1297 clock ticks
memcmp: 1297 clock ticks
What happens on your system?
#include <algorithm>
#include <cstdlib>
#include <cstring>
#include <ctime>
#include <vector>
#include <iostream>
using namespace std;

static char* buff ;
static int const BufferSize = 100000 ;

static clock_t StartTimer() ;
static clock_t EndTimer (clock_t t) ;

int main (int argc, char** argv)
{
    // Allocate and fill a buffer
    buff = new char[BufferSize] ;
    memset (buff, 'x', BufferSize) ;

    // Create two identical vectors from it
    vector<char> v0 (buff, buff + BufferSize) ;
    vector<char> v1 (buff, buff + BufferSize) ;

    clock_t t ;

    // Compare them 10000 times using std::equal
    t = StartTimer() ;
    for (int i = 0 ; i < 10000 ; i++)
        if (!equal (v0.begin(), v0.end(), v1.begin()))
            cout << "Error in std::equal\n", exit (1) ;
    t = EndTimer (t) ;
    cout << "std::equal: " << t << " clock ticks\n" ;

    // Compare them 10000 times using operator==
    t = StartTimer() ;
    for (int i = 0 ; i < 10000 ; i++)
        if (v0 != v1)
            cout << "Error in operator==\n", exit (1) ;
    t = EndTimer (t) ;
    cout << "operator==: " << t << " clock ticks\n" ;

    // Compare them 10000 times using memcmp
    t = StartTimer() ;
    for (int i = 0 ; i < 10000 ; i++)
        if (memcmp (&v0[0], &v1[0], v0.size()))
            cout << "Error in memcmp\n", exit (1) ;
    t = EndTimer (t) ;
    cout << "memcmp: " << t << " clock ticks\n" ;

    return 0 ;
}

static clock_t StartTimer()
{
    // Start on a clock tick, to enhance reproducibility
    clock_t t = clock() ;
    while (clock() == t)
        ;
    return clock() ;
}

static clock_t EndTimer (clock_t t)
{
    return clock() - t ;
}