In Visual C++ 2010, I'm timing the execution as follows:
#include <ctime>
#include <iostream>
using namespace std;

int main() {
    clock_t start = clock();
    // run some stuff
    clock_t time_ = clock() - start;
    cout << "this took " << time_ << " msecs.";
    cin.get();
    return 0;
}
In debug mode the time reflects the complexity of the work, but in release mode, while it feels a bit faster than debug, it shows 0 for the time no matter what. Is there a setting I'm not aware of? Thanks.
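A minimal sketch of the same measurement with std::chrono::steady_clock, which typically has finer granularity than clock() and shows non-zero times even for very fast Release runs (this assumes a compiler with C++11 <chrono> support, which Visual C++ 2010 itself may lack; the loop is a placeholder for the timed work):

#include <chrono>
#include <iostream>

int main() {
    auto start = std::chrono::steady_clock::now();
    // run some stuff (placeholder for the timed work)
    volatile long long sum = 0;
    for (int i = 0; i < 1000000; ++i) sum += i;
    auto end = std::chrono::steady_clock::now();
    auto us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    std::cout << "this took " << us << " microseconds\n";
    return 0;
}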
Related
I tried to port a project from Code::Blocks to Visual C++ 2019.
After that, I ran into some performance problems.
The most considerable performance drop is when I need to generate a large number of random double variables.
After searching for information on the internet, I still can't give a reasonable explanation for this.
I briefly extracted part of my code below:
#include <cstdlib>   // for system()
#include <iostream>
#include <random>
#include <time.h>

class PublicFunction {
public:
    static double RandomDouble(double, double);
private:
    static std::mt19937 generator_;
};

std::mt19937 PublicFunction::generator_(static_cast<int>(time(0)));

double PublicFunction::RandomDouble(double min_value, double max_value) {
    std::uniform_real_distribution<double> distribution(min_value, max_value);
    return distribution(PublicFunction::generator_);
}

int main() {
    double d = 0.0;
    clock_t t_start, t_end;
    t_start = clock();
    for (unsigned i = 0; i < 100000000; ++i) {
        d = PublicFunction::RandomDouble(-100, 100);
    }
    t_end = clock();
    std::cout << (t_end - t_start) << std::endl;
    system("PAUSE");
    return 0;
}
After running the code in Code::Blocks (using C++14) and in Visual C++ 2019:
Code::Blocks finished with an execution time of 6155 ms, whereas Visual C++ took 40305 ms, almost 7 times slower.
I'm wondering why this happens, and is there any way to fix it? Thanks.
First, thanks to @churill for his comment; his advice is very useful.
There are a lot of differences between Debug and Release builds.
The key factor in this problem is that the compiler optimizes the code in Release builds.
For more information:
https://stackoverflow.com/a/938665/6367757
medium article
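As a quick sanity check that you are actually timing an optimized build, you can print which configuration produced the binary. This is a minimal sketch assuming the default Visual Studio project settings, where _DEBUG is defined in Debug builds and NDEBUG in Release builds:

#include <iostream>

int main() {
    // Assumes default Visual Studio settings: NDEBUG is defined for Release
    // configurations, _DEBUG for Debug configurations.
#ifdef NDEBUG
    std::cout << "Release-style build (optimizations likely enabled)\n";
#else
    std::cout << "Debug-style build (optimizations likely disabled)\n";
#endif
    return 0;
}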
See the following code, which is my attempt to print the time elapsed between loops.
#include <stdio.h>
#include <time.h>

int main()
{
    while (true)
    {
        static clock_t timer;
        clock_t t = clock();
        clock_t elapsed = t - timer;
        float elapsed_sec = (float)elapsed / (float)CLOCKS_PER_SEC;
        timer = t;
        printf("[dt:%d ms].\n", (int)(elapsed_sec * 1000));
    }
}
However, if I set a breakpoint and sit there for 10 seconds, when I continue execution the elapsed time includes that pause -- and I don't want it to, for my intended usage.
I assume clock() is the wrong function, but what is the correct one?
Note that if there IS no standard C or C++ single call for this -- well, how do you compute it? Is there a POSIX way?
I suspect that this is actually information only knowable with platform-specific calls. If that is the case, I'd like to at least know how to do so on Windows (MSVC).
Yes, trying to measure the CPU time of your process would be dependent on support from your operating system. Rather than look up what support is available from various operating systems, though, I would propose that your approach is flawed.
Debugging typically uses a debug build that has most optimizations turned off (to make it easier to do things like set breakpoints accurately). Timings on a non-optimized build lack practical value. Hence any timings of your program should usually be ignored when you are using breakpoints–even if the breakpoint is outside the timed section.
To combine using breakpoints with timings, I would do the debugging in two phases. In one phase, you set breakpoints and look at what is happening in the debug build. In the other phase, you use identical input (redirect a file into std::cin if it helps) and time the process in the release build. There may be some back-and-forth between the stages as you work out what is going on. That's fine; the point is not to have exactly two phases, but to keep breakpoints and timings separate.
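For reference, if you do go the platform-specific route on Windows that the question asks about, a minimal sketch using GetProcessTimes could look like the following. It reports kernel plus user CPU time rather than wall-clock time, so time spent suspended at a breakpoint does not accumulate; treat it as an illustration, not a recommendation to time debug builds:

#include <windows.h>
#include <iostream>

// CPU time (kernel + user) consumed by this process, in seconds.
// CPU time does not advance while all threads are stopped at a breakpoint.
static double processCpuSeconds() {
    FILETIME creationTime, exitTime, kernelTime, userTime;
    if (!GetProcessTimes(GetCurrentProcess(), &creationTime, &exitTime, &kernelTime, &userTime))
        return -1.0;
    ULARGE_INTEGER k, u;
    k.LowPart = kernelTime.dwLowDateTime;  k.HighPart = kernelTime.dwHighDateTime;
    u.LowPart = userTime.dwLowDateTime;    u.HighPart = userTime.dwHighDateTime;
    return (double)(k.QuadPart + u.QuadPart) / 1e7;  // FILETIME uses 100 ns units
}

int main() {
    double before = processCpuSeconds();
    // ... work to be timed (placeholder) ...
    double after = processCpuSeconds();
    std::cout << "CPU time: " << (after - before) << " s\n";
    return 0;
}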
Although JaMit gives a better answer (here), it is possible, but it depends entirely on your compiler, and the overhead this creates will probably slow down your program too much to get an accurate result. You can use whatever time-recording function you please, but either way you would have to:
Record the start of the loop
Record the start of the breakpoint
Programmatically cause a breakpoint
Record the end of the breakpoint
Record the end of the loop.
If you're looking for speed though, you really need to be testing in an optimized release mode without breakpoints and writing the output to the console or a file. Nonetheless, it is possible to do what you're trying to do, and here's a sample solution.
#include <chrono>
#include <intrin.h> //include for visual studio break
#include <iostream>

int main(void) {
    for (int c = 0; c < 100; c++) {
        //Start of loop
        auto start = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::high_resolution_clock::now().time_since_epoch()).count();

        /*Do stuff here*/

        auto startBreak = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::high_resolution_clock::now().time_since_epoch()).count();

        //visual studio only
        __debugbreak();

        auto endBreak = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::high_resolution_clock::now().time_since_epoch()).count();

        /*Do more stuff here*/

        auto end = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::high_resolution_clock::now().time_since_epoch()).count();

        /*Time for 1 pass of the loop, including all the records of getting the time*/
        std::cout << end - start - (endBreak - startBreak) << "\n";
    }
    return 0;
}
Given your objective ("my attempt to print the time elapsed between loops"), in the C language:
Note that clock() returns the number of clock ticks since the program started.
This code measures the elapsed time for each loop by showing the start/end time of each iteration:
#include <stdio.h>
#include <time.h>

int main( void )
{
    clock_t loopStart = clock();
    clock_t loopEnd;
    for( int i = 0; i < 10000; i++ )
    {
        // something here to be timed
        loopEnd = clock();
        // print the start and end time of this iteration, in seconds
        printf( "[%lf : %lf s].\n", (double)loopStart/CLOCKS_PER_SEC, (double)loopEnd/CLOCKS_PER_SEC );
        loopStart = loopEnd;
    }
    return 0;
}
Of course, if you want to display the actual number of clock ticks per loop, remove the division by CLOCKS_PER_SEC, calculate the difference, and display only the difference.
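As a rough sketch, that tick-difference variant could look like this (the loop body is a placeholder for the work being timed):

#include <stdio.h>
#include <time.h>

int main( void )
{
    clock_t prev = clock();
    for( int i = 0; i < 10000; i++ )
    {
        // something here to be timed
        clock_t now = clock();
        printf( "[%ld clock ticks].\n", (long)(now - prev) );
        prev = now;
    }
    return 0;
}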
So I think there was a bad version of C++ where unordered_set was incredibly slow. This piece of code took one minute to execute:
#include <iostream>
#include <unordered_set>
using namespace std;

int main() {
    unordered_set<int> blah;
    for (int i = 0; i < 1000000; ++i)
    {
        blah.insert(i);
    }
    cout << "done 2" << endl;
    return 0;
}
It took about 40 seconds to get to the output statement, then another 20 seconds to deallocate the object. Its C# counterpart with 10 times the insertions executes in about a second:
static void Main(string[] args)
{
    HashSet<int> set = new HashSet<int>();
    for (int i = 0; i < 10000000; ++i)
    {
        set.Add(i);
    }
}
This caused my Facebook Hacker Cup solution to not run in time :(. How can this be fixed for someone using the Visual Studio C++ IDE? I don't know what version it runs via the command line or how to upgrade it.
Run without debugging
Need to compile in Release mode
Potentially need to turn on optimizations by going to project properties -> Configuration Properties -> C/C++ -> Optimization -> Maximize Speed (/O2). Mine was already set for Release, not for Debug.
This makes the original code run in about a second, and the 10,000,000 variant run in about 8 seconds (so still roughly 4 times slower than C# in Debug mode with debugging). EDIT: It seems C#'s speed does not change much with or without debugging, or when building in Release mode, for this particular program.
I've made a small application that averages the numbers between 1 and 1000000. It's not hard to see (using a very basic algebraic formula) that the average is 500000.5 but this was more of a project in learning C++ than anything else.
Anyway, I made clock variables that were designed to find the number of clock ticks required for the application to run. The first time I ran it, it said that it took 3770000 clock ticks, but every time I've run it since then, it has taken "0.0" seconds...
I've attached my code at the bottom.
Either a.) It's saved the variables from the first time I ran it, and it's just running quickly to the answer...
or b.) something is wrong with how I'm declaring the time variables.
Regardless... it doesn't make sense.
Any help would be appreciated.
FYI, I'm running this on a Linux machine; not sure if that matters.
#include <stdio.h>
#include <time.h>

double avg (int arr[], int beg, int end)
{
    int nums = end - beg + 1;
    double sum = 0.0;
    for(int i = beg; i <= end; i++)
    {
        sum += arr[i];
    }
    //for(int p = 0; p < nums*10000; p ++){}
    return sum/nums;
}

int main (int argc, char *argv[])
{
    int nums = 1000000;//atoi(argv[0]);
    int myarray[nums];   // variable-length array: a compiler extension in C++
    double timediff;
    //printf("Arg is: %d\n",argv[0]);
    printf("Nums is: %d\n",nums);
    clock_t begin_time = clock();
    for(int i = 0; i < nums; i++)
    {
        myarray[i] = i+1;
    }
    double average = avg(myarray, 0, nums - 1);
    printf("%f\n",average);
    clock_t end_time = clock();
    timediff = (double) difftime(end_time, begin_time);
    printf("Time to Average: %f\n", timediff);
    return 0;
}
You are measuring the I/O operation too (printf), which depends on external factors and might be affecting the run time. Also, clock() might not be as precise as needed to measure such a small task; look into higher-resolution functions such as clock_get_time(). Even then, other processes might affect the run time by generating page-fault interrupts, occupying the memory bus, etc. So this kind of fluctuation is not abnormal at all.
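On Linux, a common higher-resolution option is the POSIX clock_gettime call; here is a minimal sketch (CLOCK_MONOTONIC gives wall-clock elapsed time with nanosecond resolution; very old glibc versions may need -lrt when linking):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec start, stop;
    clock_gettime(CLOCK_MONOTONIC, &start);
    // work to be timed (placeholder)
    clock_gettime(CLOCK_MONOTONIC, &stop);
    double seconds = (stop.tv_sec - start.tv_sec)
                   + (stop.tv_nsec - start.tv_nsec) / 1e9;
    printf("Elapsed: %f seconds\n", seconds);
    return 0;
}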
On the machine I tested, Linux's clock call was only accurate to 1/100th of a second. If your code runs in less than 0.01 seconds, it will usually say zero seconds have passed. Also, I ran your program a total of 50 times in 0.13 seconds, so I find it suspicious that you claim it takes 2 seconds to run it once on your computer.
Your code incorrectly uses difftime, which may display incorrect output even when clock says time did pass.
I'd guess that the first timing you got was with different code than that posted in this question, because I can't think of any way the code in this question could produce a time of 3770000.
Finally, benchmarking is hard, and your code has several benchmarking mistakes:
You're timing how long it takes to (1) fill an array, (2) calculate an average, (3) format the result string, and (4) make an OS call (slow) that prints said string in the right language/font/color/etc., which is especially slow.
You're attempting to time a task which takes less than a hundredth of a second, which is WAY too small for any accurate measurement.
Here is my take on your code, measuring that the average takes ~0.001968 seconds on this machine.
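As an illustrative sketch of that approach (not the answerer's exact code), timing only the averaging step and keeping the printf calls outside the timed region might look like this:

#include <cstdio>
#include <ctime>
#include <vector>

static double avg(const std::vector<int>& arr)
{
    double sum = 0.0;
    for (size_t i = 0; i < arr.size(); i++)
        sum += arr[i];
    return sum / arr.size();
}

int main()
{
    const int nums = 1000000;
    std::vector<int> myarray(nums);
    for (int i = 0; i < nums; i++)
        myarray[i] = i + 1;

    clock_t begin_time = clock();
    double average = avg(myarray);   // only the averaging step is timed
    clock_t end_time = clock();      // repeat the call many times if this is still too fast to measure

    printf("%f\n", average);         // I/O stays outside the timed region
    printf("Time to Average: %f seconds\n",
           (double)(end_time - begin_time) / CLOCKS_PER_SEC);
    return 0;
}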
I'm running the following code using Visual Studio 2008 SP1, on Windows Vista Business x64, on a quad-core machine with 8 GB of RAM.
If I build a release build and run it from the command line, it reports 31 ms. If I then start it from the IDE, using F5, it reports 23,353 ms.
Here are the times: (all Win32 builds)
DEBUG, command line: 421ms
DEBUG, from the IDE: 24,570ms
RELEASE, command line: 31ms
RELEASE, from IDE: 23,353ms
code:
#include <windows.h>
#include <iostream>
#include <set>
#include <algorithm>

using namespace std;

int runIntersectionTestAlgo()
{
    set<int> set1;
    set<int> set2;
    set<int> intersection;

    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.insert(value);
    }

    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() % 200000 + 1;
        random *= 10;
        int value = 1000000000 + random;
        set2.insert(value);
    }

    set_intersection(set1.begin(), set1.end(), set2.begin(), set2.end(),
                     inserter(intersection, intersection.end()));
    return intersection.size();
}

int main(){
    DWORD start = GetTickCount();
    runIntersectionTestAlgo();
    DWORD span = GetTickCount() - start;
    std::cout << span << " milliseconds\n";
}
Running under a Microsoft debugger (windbg, kd, cdb, Visual Studio Debugger) by default forces Windows to use the debug heap instead of the default heap. On Windows 2000 and above, the default heap is the Low Fragmentation Heap, which is insanely good compared to the debug heap. You can query the kind of heap you are using with HeapQueryInformation.
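If you want to check this in your own process, a minimal sketch of such a query against the default process heap (HeapCompatibilityInformation reports 2 when the Low Fragmentation Heap is enabled) could look like this:

#include <windows.h>
#include <iostream>

int main() {
    // Ask the default process heap which heap implementation backs it.
    // 0 = standard heap, 1 = look-aside lists, 2 = Low Fragmentation Heap (LFH).
    ULONG heapType = 0;
    SIZE_T returned = 0;
    if (HeapQueryInformation(GetProcessHeap(), HeapCompatibilityInformation,
                             &heapType, sizeof(heapType), &returned)) {
        std::cout << "Heap compatibility value: " << heapType
                  << (heapType == 2 ? " (LFH enabled)" : "") << "\n";
    } else {
        std::cout << "HeapQueryInformation failed: " << GetLastError() << "\n";
    }
    return 0;
}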
To solve your particular problem, you can use one of the many options recommended in this KB article: Why the low fragmentation heap (LFH) mechanism may be disabled on some computers that are running Windows Server 2003, Windows XP, or Windows 2000
For Visual Studio, I prefer adding _NO_DEBUG_HEAP=1 to Project Properties->Configuration Properties->Debugging->Environment. That always does the trick for me.
Pressing pause while in the VS IDE shows that the additional time appears to be spent in malloc/free. This leads me to believe that the debugging support in MS's malloc and free implementation has additional logic when the debugger is attached. This would explain the discrepancy in times between the console and the debugger.
EDIT: Confirmed by running with CTRL+F5 vs. F5 (1047 ms vs. 9088 ms on my machine).
So it sounds like this may just be what happens when one attaches the debugger. However, I just can't get my head around the performance changing from 30 ms to 23,000 ms because of that, especially when the rest of my code seems to run just as fast whether or not the debugger is attached.