how to properly write a benchmark program under release mode? - c++

while running some benchmark programs, the wall clock may surprisingly give a very small duration, which is because of various compiler optimization like dead code elimination, loop unwinding... that optimize away the tested code.
I may add some "exterior depency" via static/volatile qualifier, but that's not always work.
Any idea?

Generally, adding "some "exterior depency" via static/volatile qualifier" will change the actual behavior, and timing so is not advised.
The way I generally make sure that code can be tested is by using argc (the first arg to main)
If I need a loop, I change the input data using argc. Also have some dummy calculated value that is dependant on the code and printed after the loops and timer to make sure the result can't be optimized out.
If your fill function can take a seed, you can also just use argc as part of the seed.
So if you have random test data, or a loop used to test your code, add argc to it.
Without knowing exactly what you are testing the best I can do is someting like
seed=argc; // the compiler cannot count on any value of argc
dummy=0;
total_time=0;
for (rep=0; rep < max_rep; ++rep)
{
for (i =0; i < max_array; ++i)
{
test_array[i]=seed+gen_random(); // Note use of seed
}
start=time_now();
dummy+=function_to_time(test_array); // this is what you are testing
end=time_now();
total_time+=end-start;
seed++; // change seed just to be extra paranoid
}
std::cout << "Time=" << total_time << " Dummy=" << dummy << std::endl;

Related

Is it problematic to declare variables in the main loop of a OpenGL program? [duplicate]

Question #1: Is declaring a variable inside a loop a good practice or bad practice?
I've read the other threads about whether or not there is a performance issue (most said no), and that you should always declare variables as close to where they are going to be used. What I'm wondering is whether or not this should be avoided or if it's actually preferred.
Example:
for(int counter = 0; counter <= 10; counter++)
{
string someString = "testing";
cout << someString;
}
Question #2: Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
This is excellent practice.
By creating variables inside loops, you ensure their scope is restricted to inside the loop. It cannot be referenced nor called outside of the loop.
This way:
If the name of the variable is a bit "generic" (like "i"), there is no risk to mix it with another variable of same name somewhere later in your code (can also be mitigated using the -Wshadow warning instruction on GCC)
The compiler knows that the variable scope is limited to inside the loop, and therefore will issue a proper error message if the variable is by mistake referenced elsewhere.
Last but not least, some dedicated optimization can be performed more efficiently by the compiler (most importantly register allocation), since it knows that the variable cannot be used outside of the loop. For example, no need to store the result for later re-use.
In short, you are right to do it.
Note however that the variable is not supposed to retain its value between each loop. In such case, you may need to initialize it every time. You can also create a larger block, encompassing the loop, whose sole purpose is to declare variables which must retain their value from one loop to another. This typically includes the loop counter itself.
{
int i, retainValue;
for (i=0; i<N; i++)
{
int tmpValue;
/* tmpValue is uninitialized */
/* retainValue still has its previous value from previous loop */
/* Do some stuff here */
}
/* Here, retainValue is still valid; tmpValue no longer */
}
For question #2:
The variable is allocated once, when the function is called. In fact, from an allocation perspective, it is (nearly) the same as declaring the variable at the beginning of the function. The only difference is the scope: the variable cannot be used outside of the loop. It may even be possible that the variable is not allocated, just re-using some free slot (from other variable whose scope has ended).
With restricted and more precise scope come more accurate optimizations. But more importantly, it makes your code safer, with less states (i.e. variables) to worry about when reading other parts of the code.
This is true even outside of an if(){...} block. Typically, instead of :
int result;
(...)
result = f1();
if (result) then { (...) }
(...)
result = f2();
if (result) then { (...) }
it's safer to write :
(...)
{
int const result = f1();
if (result) then { (...) }
}
(...)
{
int const result = f2();
if (result) then { (...) }
}
The difference may seem minor, especially on such a small example.
But on a larger code base, it will help : now there is no risk to transport some result value from f1() to f2() block. Each result is strictly limited to its own scope, making its role more accurate. From a reviewer perspective, it's much nicer, since he has less long range state variables to worry about and track.
Even the compiler will help better : assuming that, in the future, after some erroneous change of code, result is not properly initialized with f2(). The second version will simply refuse to work, stating a clear error message at compile time (way better than run time). The first version will not spot anything, the result of f1() will simply be tested a second time, being confused for the result of f2().
Complementary information
The open-source tool CppCheck (a static analysis tool for C/C++ code) provides some excellent hints regarding optimal scope of variables.
In response to comment on allocation:
The above rule is true in C, but might not be for some C++ classes.
For standard types and structures, the size of variable is known at compilation time. There is no such thing as "construction" in C, so the space for the variable will simply be allocated into the stack (without any initialization), when the function is called. That's why there is a "zero" cost when declaring the variable inside a loop.
However, for C++ classes, there is this constructor thing which I know much less about. I guess allocation is probably not going to be the issue, since the compiler shall be clever enough to reuse the same space, but the initialization is likely to take place at each loop iteration.
Generally, it's a very good practice to keep it very close.
In some cases, there will be a consideration such as performance which justifies pulling the variable out of the loop.
In your example, the program creates and destroys the string each time. Some libraries use a small string optimization (SSO), so the dynamic allocation could be avoided in some cases.
Suppose you wanted to avoid those redundant creations/allocations, you would write it as:
for (int counter = 0; counter <= 10; counter++) {
// compiler can pull this out
const char testing[] = "testing";
cout << testing;
}
or you can pull the constant out:
const std::string testing = "testing";
for (int counter = 0; counter <= 10; counter++) {
cout << testing;
}
Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
It can reuse the space the variable consumes, and it can pull invariants out of your loop. In the case of the const char array (above) - that array could be pulled out. However, the constructor and destructor must be executed at each iteration in the case of an object (such as std::string). In the case of the std::string, that 'space' includes a pointer which contains the dynamic allocation representing the characters. So this:
for (int counter = 0; counter <= 10; counter++) {
string testing = "testing";
cout << testing;
}
would require redundant copying in each case, and dynamic allocation and free if the variable sits above the threshold for SSO character count (and SSO is implemented by your std library).
Doing this:
string testing;
for (int counter = 0; counter <= 10; counter++) {
testing = "testing";
cout << testing;
}
would still require a physical copy of the characters at each iteration, but the form could result in one dynamic allocation because you assign the string and the implementation should see there is no need to resize the string's backing allocation. Of course, you wouldn't do that in this example (because multiple superior alternatives have already been demonstrated), but you might consider it when the string or vector's content varies.
So what do you do with all those options (and more)? Keep it very close as a default -- until you understand the costs well and know when you should deviate.
I didn't post to answer JeremyRR's questions (as they have already been answered); instead, I posted merely to give a suggestion.
To JeremyRR, you could do this:
{
string someString = "testing";
for(int counter = 0; counter <= 10; counter++)
{
cout << someString;
}
// The variable is in scope.
}
// The variable is no longer in scope.
I don't know if you realize (I didn't when I first started programming), that brackets (as long they are in pairs) can be placed anywhere within the code, not just after "if", "for", "while", etc.
My code compiled in Microsoft Visual C++ 2010 Express, so I know it works; also, I have tried to to use the variable outside of the brackets that it was defined in and I received an error, so I know that the variable was "destroyed".
I don't know if it is bad practice to use this method, as a lot of unlabeled brackets could quickly make the code unreadable, but maybe some comments could clear things up.
For C++ it depends on what you are doing.
OK, it is stupid code but imagine
class myTimeEatingClass
{
public:
//constructor
myTimeEatingClass()
{
sleep(2000);
ms_usedTime+=2;
}
~myTimeEatingClass()
{
sleep(3000);
ms_usedTime+=3;
}
const unsigned int getTime() const
{
return ms_usedTime;
}
static unsigned int ms_usedTime;
};
myTimeEatingClass::ms_CreationTime=0;
myFunc()
{
for (int counter = 0; counter <= 10; counter++) {
myTimeEatingClass timeEater();
//do something
}
cout << "Creating class took " << timeEater.getTime() << "seconds at all" << endl;
}
myOtherFunc()
{
myTimeEatingClass timeEater();
for (int counter = 0; counter <= 10; counter++) {
//do something
}
cout << "Creating class took " << timeEater.getTime() << "seconds at all" << endl;
}
You will wait 55 seconds until you get the output of myFunc.
Just because each loop constructor and destructor together need 5 seconds to finish.
You will need 5 seconds until you get the output of myOtherFunc.
Of course, this is a crazy example.
But it illustrates that it might become a performance issue when each loop the same construction is done when the constructor and / or destructor needs some time.
Since your second question is more concrete, I'm going to address it first, and then take up your first question with the context given by the second. I wanted to give a more evidence-based answer than what's here already.
Question #2: Do most compilers realize that the variable has already
been declared and just skip that portion, or does it actually create a
spot for it in memory each time?
You can answer this question for yourself by stopping your compiler before the assembler is run and looking at the asm. (Use the -S flag if your compiler has a gcc-style interface, and -masm=intel if you want the syntax style I'm using here.)
In any case, with modern compilers (gcc 10.2, clang 11.0) for x86-64, they only reload the variable on each loop pass if you disable optimizations. Consider the following C++ program—for intuitive mapping to asm, I'm keeping things mostly C-style and using an integer instead of a string, although the same principles apply in the string case:
#include <iostream>
static constexpr std::size_t LEN = 10;
void fill_arr(int a[LEN])
{
/* *** */
for (std::size_t i = 0; i < LEN; ++i) {
const int t = 8;
a[i] = t;
}
/* *** */
}
int main(void)
{
int a[LEN];
fill_arr(a);
for (std::size_t i = 0; i < LEN; ++i) {
std::cout << a[i] << " ";
}
std::cout << "\n";
return 0;
}
We can compare this to a version with the following difference:
/* *** */
const int t = 8;
for (std::size_t i = 0; i < LEN; ++i) {
a[i] = t;
}
/* *** */
With optimization disabled, gcc 10.2 puts 8 on the stack on every pass of the loop for the declaration-in-loop version:
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
mov DWORD PTR -12[rbp], 8 ;✷
whereas it only does it once for the out-of-loop version:
mov DWORD PTR -12[rbp], 8 ;✷
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
Does this make a performance impact? I didn't see an appreciable difference in runtime between them with my CPU (Intel i7-7700K) until I pushed the number of iterations into the billions, and even then the average difference was less than 0.01s. It's only a single extra operation in the loop, after all. (For a string, the difference in in-loop operations is obviously a bit greater, but not dramatically so.)
What's more, the question is largely academic, because with an optimization level of -O1 or higher gcc outputs identical asm for both source files, as does clang. So, at least for simple cases like this, it's unlikely to make any performance impact either way. Of course, in a real-world program, you should always profile rather than make assumptions.
Question #1: Is declaring a variable inside a loop a good practice or
bad practice?
As with practically every question like this, it depends. If the declaration is inside a very tight loop and you're compiling without optimizations, say for debugging purposes, it's theoretically possible that moving it outside the loop would improve performance enough to be handy during your debugging efforts. If so, it might be sensible, at least while you're debugging. And although I don't think it's likely to make any difference in an optimized build, if you do observe one, you/your pair/your team can make a judgement call as to whether it's worth it.
At the same time, you have to consider not only how the compiler reads your code, but also how it comes off to humans, yourself included. I think you'll agree that a variable declared in the smallest scope possible is easier to keep track of. If it's outside the loop, it implies that it's needed outside the loop, which is confusing if that's not actually the case. In a big codebase, little confusions like this add up over time and become fatiguing after hours of work, and can lead to silly bugs. That can be much more costly than what you reap from a slight performance improvement, depending on the use case.
Once upon a time (pre C++98); the following would break:
{
for (int i=0; i<.; ++i) {std::string foo;}
for (int i=0; i<.; ++i) {std::string foo;}
}
with the warning that i was already declared (foo was fine as that's scoped within the {}). This is likely the WHY people would first argue it's bad. It stopped being true a long time ago though.
If you STILL have to support such an old compiler (some people are on Borland) then the answer is yes, a case could be made to put the i out the loop, because not doing so makes it makes it "harder" for people to put multiple loops in with the same variable, though honestly the compiler will still fail, which is all you want if there's going to be a problem.
If you no longer have to support such an old compiler, variables should be kept to the smallest scope you can get them so that you not only minimise the memory usage; but also make understanding the project easier. It's a bit like asking why don't you have all your variables global. Same argument applies, but the scopes just change a bit.
It's a very good practice, as all above answer provide very good theoretical aspect of the question let me give a glimpse of code, i was trying to solve DFS over GEEKSFORGEEKS, i encounter the optimization problem......
If you try to solve the code declaring the integer outside the loop will give you Optimization Error..
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
int flag=0;
int top=0;
while(!st.empty()){
top = st.top();
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
Now put integers inside the loop this will give you correct answer...
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
// int flag=0;
// int top=0;
while(!st.empty()){
int top = st.top();
int flag = 0;
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
this completely reflect what sir #justin was saying in 2nd comment....
try this here
https://practice.geeksforgeeks.org/problems/depth-first-traversal-for-a-graph/1. just give it a shot.... you will get it.Hope this help.
Chapter 4.8 Block Structure in K&R's The C Programming Language 2.Ed.:
An automatic variable declared and initialized in a
block is initialized each time the block is entered.
I might have missed seeing the relevant description in the book like:
An automatic variable declared and initialized in a
block is allocated only one time before the block is entered.
But a simple test can prove the assumption held:
#include <stdio.h>
int main(int argc, char *argv[]) {
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
int k;
printf("%p\n", &k);
}
}
return 0;
}
The two snippets below generate the same assembly.
// snippet 1
void test() {
int var;
while(1) var = 4;
}
// snippet 2
void test() {
while(1) int var = 4;
}
output:
test():
push rbp
mov rbp, rsp
.L2:
mov DWORD PTR [rbp-4], 4
jmp .L2
Link: https://godbolt.org/z/36hsM6Pen
So, until profiling opposes or computation extensive constructor is involved, keeping declation close to its usage should be the default approach.

random numbers within a function c++

I am trying to better use and understand functions and in this case I need to figure out how to make this function in particular return 2 different random numbers.I have set up the ctime and can successfully call my function and make my variable(message) equal the random number but when I call it again and ask it to print out the newest random number they are both the same.
#include <iostream>
#include <ctime>
#include <cstdlib>
using namespace std;
int RandomNumberGen (int x);
int main()
{
srand(unsigned(time(0)));
int Ran;
int message;
message = RandomNumberGen (Ran);
cout << "Number 1 " << message << endl;
message = RandomNumberGen (Ran);
cout << "Number 2 " << message << endl;
return 0;
}
int RandomNumberGen (int Ran)
{
unsigned int RandomNum = 0;
RandomNum = rand()%8 + 4;
return RandomNum;
}
As you can see I set the function to output a random number and write it on screen and then I call the same function again and write that one on screen(write the second # on screen).Yet every time I call the function both numbers are written on the screen as the same even though I am generating a random number each time.
I know this is a simple and easy task but please let me know if what I am attempting to do is possible or do I need another separate function for the 2nd number.
My end goal is to base a lot of events and things off of one random number function so I can essentially call the function for a random number and then let it determine what happens next.
I placed in the code as you asked.I hope this is what you meant.I appreciate all the help and am very grateful for the answers.I plan to be using this more so I will be sure to get it right for you guys!
You called srand in the function RandomNumberGen, which is called multiple times. That's wrong, srand should be called only once, try put it in main instead.
Or better, instead of the C library crandom functions, use the methods in random if that's available.
Take a look at this question and particularly the answer. It should clarify some things about srand() and rand()
I quote the answer:
Seed is usually taken from the current time, which are the seconds, as in time(NULL), so if you always set the seed before taking the random number, you will get the same number as long as you call the srand/rand combo multiple times in the same second.
Basically what you should do, is call srand() only once at the beginning of your application and not in the function. Because your function will be called twice at a very short interval. Almost all the time in the same second. Generating the same starting sequence for the rand() function
you shouldn't seed the random number generator with each call, but only once. your two calls to the function probably execute so fast that the value of time(null) is still the same, and the same seed will produce the same pseudo random sequence.
(btw., you should use null or NULL, not 0 for a null pointer, and you do not use the parameter you pass to your randomnumbergen function, so you could just remove it entirely.)

What are some possible causes of program crashing when returning a value?

I have a bunch of code roughly equivalent to this:
bool test(double e, short a, short b, short c) {
// Things being calculated here...
cout << "debug_3" << endl;
return (1 - abs(cos_th)) < (1 - cos(e));
}
int main() {
// something...
cout << "debug_0" << endl;
if(test(e,1,2,0)) {
cout << "debug_4" << endl;
// Bunch of useful operations...
}
// something...
}
Running the code generates the output:
debug_3
After which the program crashes (displaying "The program has stopped working..." in Windows). I have never encountered crashing at value return and I don't know what causes it or how I could fix it. Any thoughts on the issue?
EDIT: Some more info:
In my builds I also verify that the values of cos_th and e are valid.
People seem to point to the second something as the source of problems but my problem seems resolved (i.e. no crashes) when I get rid of the if-statement with a call to test()...
The only things we can fix without knowing what system is, is to change the type of a b and c to unsigned short since they are just array indexes, and make sure they are within array bounds. You might also need to make sure this is not zero since you divide by the result:
sqrt((Xca*Xca+Yca*Yca+Zca*Zca)*(Xba*Xba+Yba*Yba+Zba*Zba))
Use cerr instead of cout to make sure the output is flushed but you still don't see debug 4.
Put more output inside an else condition or after the if: maybe the function returns false?
If you can't locate the error precisely, use a debugger.
Crash at return usually means that your function overwrites stack (and thus the return address) and your program jumps to nowhere. You can verify this by stepping instruction by instruction at the disassembly level.

cout or printf which of the two has a faster execution speed C++?

I have been coding in C++ for a long time. I always wondered which has a faster execution speed printf or cout?
Situation: I am designing an application in C++ and I have certain constraints such as time limit for execution. My application has loads printing commands on the console. So which one would be preferable printf or cout?
Each has its own overheads. Depending on what you print, either may be faster.
Here are two points that come to mind -
printf() has to parse the "format" string and act upon it, which adds a cost.
cout has a more complex inheritance hierarchy and passes around objects.
In practice, the difference shouldn't matter for all but the weirdest cases. If you think it really matters - measure!
EDIT -
Oh, heck, I don't believe I'm doing this, but for the record, on my very specific test case, with my very specific machine and its very specific load, compiling in Release using MSVC -
Printing 150,000 "Hello, World!"s (without using endl) takes about -
90ms for printf(), 79ms for cout.
Printing 150,000 random doubles takes about -
3450ms for printf(), 3420ms for cout.
(averaged over 10 runs).
The differences are so slim this probably means nothing...
Do you really need to care which has a faster execution speed? They are both used simply for printing text to the console/stdout, which typically isn't a task that demands ultra-high effiency. For that matter, I wouldn't imagine there to be a large difference in speed anyway (though one might expect printf to be marginally quicker because it lacks the minor complications of object-orientedness). Yet given that we're dealing with I/O operations here, even a minor difference would probably be swamped by the I/O overhead. Certainly, if you compared the equivalent methods for writing to files, that would be the case.
printf is simply the standard way to output text to stdout in C.
'cout' piping is simply the standard way to output text to stdout in C++.
Saying all this, there is a thread on the comp.lang.cc group discussing the same issue. Consensus does however seem to be that you should choose one over the other for reasons other than performance.
The reason C++ cout is slow is the default sync with stdio.
Try executing the following to deactivate this issue.
ios_base::sync_with_stdio(false)
http://www.cplusplus.com/reference/iostream/ios_base/sync_with_stdio/
http://msdn.microsoft.com/es-es/library/7yxhba01.aspx
On Windows at least, writing to the console is a huge bottleneck, so a "noisy" console mode program will be far slower than a silent one. So on that platform, slight differences in the library functions used to address the console will probably make no significant difference in practice.
On other platforms it may be different. Also it depends just how much console output you are doing, relative to other useful work.
Finally, it depends on your platform's implementation of the C and C++ I/O libraries.
So there is no general answer to this question.
Performance is a non-issue for comparison; can't think of anything where it actually counts (developing a console-program). However, there's a few points you should take into account:
Iostreams use operator chaining instead of va_args. This means that your program can't crash because you passed the wrong number of arguments. This can happen with printf.
Iostreams use operator overloading instead of va_args -- this means your program can't crash because you passed an int and it was expecting a string. This can happen with printf.
Iostreams don't have native support for format strings (which is the major root cause of #1 and #2). This is generally a good thing, but sometimes they're useful. The Boost format library brings this functionality to Iostreams for those who need it with defined behavior (throws an exception) rather than undefined behavior (as is the case with printf). This currently falls outside the standard.
Iostreams, unlike their printf equivilants, can handle variable length buffers directly themselves instead of you being forced to deal with hardcoded cruft.
Go for cout.
I recently was working on a C++ console application on windows that copied files using CopyFileEx and was echoing the 'to' and 'from' paths to the console for each copy and then displaying the average throughput at the end of the operation.
When I ran the console application using printf to echo out the strings I was getting 4mb/sec, when replacing the printf with std::cout the throughput dropped to 800kb/sec.
I was wondering why the std::cout call was so much more expensive and even went so far as to echo out the same string on each copy to get a better comparison on the calls. I did multiple runs to even out the comparison, but the 4x difference persisted.
Then I found this answer on stackoverflow..
Switching on buffering for stdout did the trick, now my throughput numbers for printf and std::cout are pretty much the same.
I have not dug any deeper into how printf and cout differ in console output buffering, but setting the output buffer before I begin writing to the console solved my problem.
Another Stack Overflow question addressed the relative speed of C-style formatted I/O vs. C++ iostreams:
Why is snprintf faster than ostringstream or is it?
http://www.fastformat.org/performance.html
Note, however, that the benchmarks discussed were for formatting to memory buffers. I'd guess that if you're actually performing the I/O to a console or file that the relative speed differences would be much smaller due to the I/O taking more of the overall time.
If you're using C++, you should use cout instead as printf belongs to the C family of functions. There are many improvements made for cout that you may benefit from. As for speed, it isn't an issue as console I/O is going to be slow anyway.
In practical terms I have always found printf to be faster than cout. But then again, cout does a lot more for you in terms of type safety. Also remember printf is a simple function whereas cout is an object based on a complex streams hierarchy, so it's not really fair to compare execution times.
To settle this:
#include <iostream>
#include <cstdio>
#include <ctime>
using namespace std;
int main( int argc, char * argcv[] ) {
const char * const s1 = "some text";
const char * const s2 = "some more text";
int x = 1, y = 2, z = 3;
const int BIG = 2000;
time_t now = time(0);
for ( int i = 0; i < BIG; i++ ) {
if ( argc == 1 ) {
cout << i << s1 << s2 << x << y << z << "\n";
}
else {
printf( "%d%s%s%d%d%d\n", i, s1, s2, x, y, z );
}
}
cout << (argc == 1 ? "cout " : "printf " ) << time(0) - now << endl;
}
produces identical timings for cout and printf.
Why don't you do an experiment? On average for me, printing the string helloperson;\n using printf takes, on average, 2 clock ticks, while cout using endl takes a huge amount of time - 1248996720685 clock ticks. Using cout with "\n" as the newline takes only 41981 clock ticks. The short URL for my code is below:
cpp.sh/94qoj
link may have expired.
To answer your question, printf is faster.
#include <iostream>
#include <string>
#include <ctime>
#include <stdio.h>
using namespace std;
int main()
{
clock_t one;
clock_t two;
clock_t averagePrintf;
clock_t averageCout;
clock_t averagedumbHybrid;
for (int j = 0; j < 100; j++) {
one = clock();
for (int d = 0; d < 20; d++) {
printf("helloperson;");
printf("\n");
}
two = clock();
averagePrintf += two-one;
one = clock();
for (int d = 0; d < 20; d++) {
cout << "helloperson;";
cout << endl;
}
two = clock();
averageCout += two-one;
one = clock();
for (int d = 0; d < 20; d++) {
cout << "helloperson;";
cout << "\n";
}
two = clock();
averagedumbHybrid += two-one;
}
averagePrintf /= 100;
averageCout /= 100;
averagedumbHybrid /= 100;
cout << "printf took " << averagePrintf << endl;
cout << "cout took " << averageCout << endl;
cout << "hybrid took " << averagedumbHybrid << endl;
}
Yes, I did use the word dumb. I first made it for myself, thinking that the results were crazy, so I searched it up, which ended up with me posting my code.
Hope it helps,
Ndrewffght
If you ever need to find out for performance reasons, something else is fundamentally wrong with your application - consider using some other logging facility or UI ;)
Under the hood, they will both use the same code, so speed differences will not matter.
If you are running on Windows only, the non-standard cprintf() might be faster as it bypasses a lot of the streams stuff.
However it is an odd requirement. Nobody can read that fast. Why not write output to a file, then the user can browse the file at their leisure?
Anecdotical evidence:
I've once designed a logging class to use ostream operators - the implementation was insanely slow (for huge amounts of data).
I didn't analyze that to much, so it might as well have been caused by not using ostreams correctly, or simply due to the amount of data logged to disk. (The class has been scrapped because of the performance problems and in practice printf / fmtmsg style was preferred.)
I agree with the other replies that in most cases, it doesn't matter. If output really is a problem, you should consider ways to avoid / delay it, as the actual display updates typically cost more than a correctly implemented string build. Thousands of lines scrolling by within milliseconds isn't very informative anyway.
You should never need to ask this question, as the user will only be able to read slower than both of them.
If you need fast execution, don't use either.
As others have mentioned, use some kind of logging if you need a record of the operations.

Condition evaluation in loops?

string strLine;//not constant
int index = 0;
while(index < strLine.length()){//strLine is not modified};
how many times strLine.length() is evaluated
do we need to put use nLength with nLength assigned to strLine.length() just before loop
length will be evaluated every time you go via the loop, however since length is constant time (O(1)) it doesn't make much difference and adding a variable for storing this value will probably have a negligible effect with a small hit on code readability (as well as breaking the code if the string is ever changed).
length() is defined in headers which are included in your source file, so it can be inlined by compiler, it's internal calls also could be inlined, so if compiler would be able to detect that your string instance isn't changed in the loop then it can optimize access to length of a string, so it will be evaluated only once.
In any case I don't think that storing value of string's length is really necessary. Maybe it can save you some nanosecs, but your code will be bigger and there will be some risk when you will decide to change that string inside loop.
Each time it is called ... (each while evaluation).
If you are not changing the string lenght you are better of with a temporary variable like:
string strLine;
int stringLength = strLine.length();
int index = 0;
while(index < stringLength);
I think there is a second question lurking inside this, and that's "which implementation is more clear?"
If, semantically, you mean for the length of strLine to never change inside the body of the loop, make it obvious by assigning to a well named variable. I'd even make it const. This makes it clear to other programmers (and yourself) that the comparison value is never changing.
The other thing this does it make it easier to see what that value is when you're stepping through the code in a debugger. Hover-over works a lot better on a local than it does on a function call.
Saying, "leave it as a function call; the compiler will optimize it" strikes me as premature pessimization. Even though length() is O(1), if not inlined (you can't guarantee that optimizations aren't disabled) it's is still a nontrivial function call. By using a local variable, you clarify your meaning, and you get a possibly non-trivial performance optimization.
Do what makes your intent most clear.
strLine.length() will be evaluated while( i < strLine.length() )
Having said that if the string is constant, most compilers will optimize this( with proper settings ).
If you are going to use a temporally variable use a const qualifier, so the compiler can add optimizations knowing that the value will not change:
string strLine;//not constant
int index = 0;
const int strLenght = strLine.Length();
while(index < strLine.length()){//strLine is not modified};
Chances are that the compiler itself make those optimizations when accessing the Length() method anyway.
Edit: my assembly is a little rusty, but i think that the evaluation takes place just once.
Given this code:
int main()
{
std::string strLine="hello world";
for (int i=0; i < strLine.length(); ++i)
{
std::cout << strLine[i] <<std::endl;
}
}
Generates this assembly:
for (int i=0; i < strLine.length(); ++i)
0040104A cmp dword ptr [esp+20h],esi
0040104E jbe main+86h (401086h)
But for this code
std::string strLine="hello world";
const int strLength = strLine.length();
for (int i=0; i < strLength ; ++i)
{
std::cout << strLine[i] <<std::endl;
}
generates this one:
for (int i=0; i < strLength ; ++i)
0040104F cmp edi,esi
00401051 jle main+87h (401087h)
The same assembly is generated if a const qualifier is not used, so in this case it doesn't make a difference.
Tried with VSC++ 2005
As stated, since the string::length function is likely entirely defined in a header, and is required to be O(1), it's almost certain to evaluate to a simple member access, and get inlined into your code. Since you don't declare the string as volatile, the compiler is allowed to imagine that no outside code is going to change it, and optimize the call to a single memory access and leave the value in a register if it finds that that is a good idea.
By grabbing and caching the value yourself, you increase the chances that the compiler will be able to do the same thing. In many cases, the compiler will not even generate the code to write the string length into the stack, and just leave it in a register. Of course, if you call out to different functions that the compiler cannot inline, then the value will have to be written to the stack to prevent the function calls from turfing the register.
Since you are not changing the string, shouldn't you be using
const string strLine;
Just, because then the compiler gets some more information on what can and what cannot change - not sure exactly how smart a C++ compiler can get, though.
strLine.length() will be evaluated every time you go around the loop.
You're correct in that it would be more efficient to use nLength, especially if strLine is long.