Force Visual Studio to Step Into STL classes/functions - c++

Let's assume we have this code snippet:
#include <vector>

int main()
{
    std::vector<int> a = { 1, 2 };
    a.push_back(3);
    return 0;
}
In VS 2019 I am attempting to Step Into (F11) the constructor and the push_back call, but VS simply steps over them.
There are related questions such as "Debugging C++ app in Visual Studio 2017 steps into not my code, is there a way to turn this off?" and "Skip STL Code when debugging C++ Code in Visual Studio 2012?", which ask for the opposite (turning step-into off). So I tried to reverse their solutions, e.g. adding
<Function><Name>std::.*</Name><Action>StepInto</Action></Function>
to
C:\...\Visual Studio\2019\Professional\Common7\Packages\Debugger\Visualizers\default.natstepfilter
but it doesn't work.
I am running Debug x64 with these options
/JMC /permissive- /GS /W4 /Zc:wchar_t /ZI /Gm- /Od /sdl /Fd"x64\Debug\vc142.pdb" /Zc:inline /fp:precise /D "_CRT_SECURE_NO_WARNINGS" /D "_MBCS" /errorReport:prompt /WX- /Zc:forScope /RTC1 /Gd /MDd /std:c++17 /FC /Fa"x64\Debug\" /EHsc /nologo /Fo"x64\Debug\" /Fp"x64\Debug\EnvTest.pch" /diagnostics:column
What's the right setting to force VS to step into STL classes/functions?

Turn off: Tools > Options > Debugging > General > [X] Enable Just My Code.
This is not a build/project setting but an IDE option. With it enabled, the debugger steps over all standard library code.
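For completeness, the default.natstepfilter fragment from the question needs a complete XML document around it; the sketch below shows what such a file is assumed to look like (the schema namespace is taken from the natstepfilter documentation and has not been verified against VS 2019):

<?xml version="1.0" encoding="utf-8"?>
<StepFilter xmlns="http://schemas.microsoft.com/vstudio/debugger/natstepfilter/2010">
  <!-- ask the debugger to step into anything in namespace std -->
  <Function>
    <Name>std::.*</Name>
    <Action>StepInto</Action>
  </Function>
</StepFilter>

Per the answer above, though, disabling Just My Code is the key step; with it enabled the debugger steps over standard library code in any case.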

Related

Is std::any supported in MSVC 2017?

I try to compile a piece of code with:
cl /c /std:c++latest /Gm- /sdl /Zc:inline /RTC1 /Oy /MDd /FA /EHs main.cxx
but I get this error:
error C2039: 'any': is not a member of 'std'
and I wonder how (if at all) I can get this feature. I don't see anything about it on their site, but knowing how long they take to update it, maybe it can be done.
Yes, <any> has shipped with every release of VS 2017.
It is, but one has to make sure that the correct C++ language standard is used.
Right-click the project and, under Properties->C/C++->Language->C++ Language Standard, make sure it is set to ISO C++17 (/std:c++17) or later.
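As a quick sanity check, a minimal program along these lines should compile once the language standard is set as described (the file name and variable names here are illustrative):

// any_check.cpp - compile with e.g. cl /std:c++17 /EHsc any_check.cpp (or /std:c++latest)
#include <any>
#include <iostream>
#include <string>

int main()
{
    std::any value = 42;                           // store an int
    std::cout << std::any_cast<int>(value) << '\n';
    value = std::string("hello");                  // rebind to a different type
    std::cout << std::any_cast<std::string&>(value) << '\n';
    return 0;
}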

Eigen library: eigenvalue computation performance, gcc vs visual studio 2015

I am trying to improve the performance of eigenvalue and eigenvector calculation with the Eigen library, using the following piece of code:
MatrixXd eigMat = m.ToMatrixXd(); // internal conversion to MatrixXd
EigenSolver<MatrixXd> es(eigMat, ShouldComputeEigenVectors);
Initially I was using an older version of Eigen with tdm-gcc 4.8 and compiled the code with optimization at the O2 level. The calculation of eigenvalues and eigenvectors for a 1000 by 1000 matrix took around 5.4 seconds.
A few months ago I switched to Visual Studio Community 2015 and upgraded the Eigen library to Eigen 3.3.2. Now the same calculation takes around 18.7 seconds. Why am I getting worse performance than with gcc 4.8? Is there anything I can do to get back to 5.4 seconds? (Needless to say, the goal is to catch Matlab, which does it in 0.8 seconds.)
The settings for VS 2015:
/GS /Qpar /GL /analyze- /W3 /Gy /Zc:wchar_t /I"C:\wxWidgets-3.1.0\lib\vc_dll\mswu" /Zi /Gm- /O2 /Ob2 /sdl /Fd"Release\vc140.pdb" /Zc:inline /fp:precise /D "_CRT_SECURE_NO_WARNINGS" /D "WIN32" /D "_UNICODE" /D "__WXMSW__" /D "UNICODE" /D "WXUSINGDLL" /D "NDEBUG" /D "EIGEN_NO_DEBUG" /D "_MBCS" /errorReport:prompt /WX- /Zc:forScope /arch:SSE2 /Gd /Oy- /Oi /MD /openmp /Fa"Release\" /EHsc /nologo /Fo"Release\" /Ot /Fp"Release\sciencesuit.pch"
Btw, I tried the following with no or very little (about 1 second) performance gain:
Different instruction sets, such as AVX2
Floating point model set to Fast
OpenMP and no-OpenMP options
Full optimization (/Ox)
Thanks in advance.
The short answer is that cl (the Visual Studio compiler) doesn't do as good a job as gcc on performance, especially with template-heavy libraries such as Eigen.
That being said, try using the older version of Eigen with Visual Studio. There were some changes in Eigen that created drops in performance with Visual Studio (e.g. this).
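If you want to compare the two setups in isolation, a self-contained timing sketch along these lines (a random matrix stands in for the question's m.ToMatrixXd(), which is not shown) takes the conversion and the rest of the application out of the picture:

#include <Eigen/Dense>
#include <chrono>
#include <iostream>

int main()
{
    using namespace Eigen;
    MatrixXd eigMat = MatrixXd::Random(1000, 1000);    // stand-in for m.ToMatrixXd()
    auto start = std::chrono::steady_clock::now();
    EigenSolver<MatrixXd> es(eigMat, /*computeEigenvectors=*/true);
    auto stop = std::chrono::steady_clock::now();
    std::cout << "eigen decomposition took "
              << std::chrono::duration<double>(stop - start).count() << " s\n";
    return 0;
}

Building this one file with both gcc -O2 and the VS 2015 /O2 settings above shows whether the regression is in Eigen/the compiler or elsewhere in the application.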

log10() performance on Visual Studio 2015 a lot slower than Visual Studio 2013 for x86

We have ported a VS2013 C++/MFC application to VS2015 and are having some rather disturbing issues with the performance and code generated by the VS2015 compiler.
Note this is for x86.
It is orders of magnitude slower on log10() calls. When profiling a Release build using CPU sampling, we see that these calls take up far more time than they did before, going from 49 samples for a run under VS2013 to a whopping 7,545 samples for the same run under VS2015. This means this function goes from 0.6% of CPU load to 50% for the application in question.
In VS2013 profiler shows:
Function Name Inclusive Samples Exclusive Samples Inclusive Samples % Exclusive Samples %
__libm_sse2_log10 49 49 0.61 0.61
In VS2015 profiler shows:
Function Name Inclusive Samples Exclusive Samples Inclusive Samples % Exclusive Samples %
___sse2_log102 7,545 7,545 50.43 50.43
Why a different function name?
We have looked briefly at the generated assembly for log10. On VS2013 this forwards to disp_pentium4.inc and log10_pentium4.asm. On VS2015 this is different. It seems VS2015 goes back to __libm_sse2_log10 in Debug.
Could __sse2_log102 alone be the cause of this performance difference? We have checked that the results of functions calling these are within expected floating-point differences.
We are compiling with target v140_xp and have the following compile options:
/Yu"stdafx.h" /MP /GS- /GL /analyze- /W4 /wd"4510" /wd"4610" /Zc:wchar_t /Z7 /Gm- /Ox /Ob2 /Zc:inline /fp:fast /D "WINVER=0x0501" /D "WIN32" /D "_WINDOWS" /D "NDEBUG" /D "_CRT_SECURE_NO_WARNINGS" /D "_CRT_SECURE_NO_DEPRECATE" /D "_SCL_SECURE_NO_WARNINGS" /D "_USING_V110_SDK71_" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /GR /arch:SSE2 /Gd /Oy /Oi /MT
The same options are shown when viewing the project's property pages. All project settings are the same for both VS2013 and VS2015. Note we are using SSE2 and have the floating point model set to fast.
Has anyone encountered the same issue and know how to fix this?
Here is my comment as an answer.
It appears that VS2015 changed the implementation of log10 in release builds: it calls the new __sse2_log102 function instead of the old __libm_sse2_log10, and this new implementation is the cause of the huge performance difference.
The fix for us in this case was to call an implementation from Intel's Integrated Performance Primitives (IPP) library. E.g. instead of calling:
return log10(v);
Call this instead:
double result;
ippsLog10_64f_A53(&v, &result, 1);
return result;
This made the performance issue disappear; in fact it was slightly faster even with an old IPP 7.0 release. Not everyone can use and pay for IPP, though, so we hope Microsoft fixes this.
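Wrapped up, the replacement looks roughly like the sketch below; it assumes the IPP headers and libraries are available to the project, and that ippsLog10_64f_A53 has the (source, destination, length) signature used above:

#include <ipp.h>   // assumed umbrella header; the vector-math declarations may live in ippvm.h depending on the IPP version

// Drop-in replacement for log10(v), backed by IPP's 53-bit-accurate routine.
inline double ipp_log10(double v)
{
    double result;
    ippsLog10_64f_A53(&v, &result, 1);   // process a single element
    return result;
}

Since the IPP routine is really a vector function, converting whole arrays in one call (length > 1) should give even more benefit than these element-by-element calls.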
(The exact VS2015 version that showed this issue was given in a screenshot, omitted here.)

Maximum performance configuration for a release build VS2010

I wanted to know the optimum performance configuration I can obtain for a release build. I do not need any debugging info in a release build, and if omitting it helps boost performance I am more than happy to make that change.
Kindly let me know if these settings are acceptable or if any of them should be changed for better performance. This is the configuration I have:
Build Type : Release
Debug Information Format : Program Database (/Zi)
Preprocessor Definitions : the following are defined
WIN32 QT_LARGEFILE_SUPPORT QT_DLL QT_NO_DEBUG NDEBUG QT_CORE_LIB
QT_GUI_LIB
Generate Debug Info : Yes (/Debug)
Optimization : Maximize Speed (/O2)
Whole Program Optimization : No
Overview of entire configuration
/I".\GeneratedFiles" /I"." /I"C:\Qt\4.8.4\include"
/I".\GeneratedFiles\Release" /I"C:\Qt\4.8.4\include\QtCore"
/I"C:\Qt\4.8.4\include\QtGui"
/I"....\External\boost-win-1.47-32bit-vs2010\include\boost-1_47"
/I"....\External\ta-lib-0.4.0-msvc\ta-lib\c\include\"
/I"....\External\Qpid-32Bit\Debug\include\" /I"..\Common\"
/I"....\External\log4cplus-1.1.2-rc1\include" /Zi /nologo /W1 /WX-
/O2 /Oy- /D "WIN32" /D "QT_LARGEFILE_SUPPORT" /D "QT_DLL" /D
"QT_NO_DEBUG" /D "NDEBUG" /D "QT_CORE_LIB" /D "QT_GUI_LIB" /Gm- /EHsc
/MD /GS /fp:precise /Zc:wchar_t- /Zc:forScope /Fp"Release\WOPR.pch"
/Fa"Release\" /Fo"Release\" /Fd"Release\vc100.pdb" /Gd /analyze-
/errorReport:queue
Should any of the above options be changed in order to obtain maximum runtime performance?
If I have omitted any options kindly let me know.
If you want to get the most optimized code from your compiler, you can try profile-guided optimization (PGO) of your critical code. However, this kind of optimization is not as easy to achieve as simply tweaking compiler options.
To achieve this, you will need a suite of tests that represents real-life scenarios. Instrument your code, run these tests, and then:
The instrumentation data will tell you where you spend most of your CPU time. Try to optimize (by hand) the parts of your code that seem to take a lot of CPU time.
Recompile your critical code with the instrumentation data as input.
I have never used this with Visual Studio (only Intel compilers). VS2010 seems to have profile-guided optimization features; a rough command-line sketch follows.
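For reference, the MSVC PGO workflow roughly follows these steps, sketched here for a hypothetical single source file main.cpp (a real project would drive this through the IDE or its build scripts):

rem 1. Compile with whole program optimization
cl /c /O2 /GL main.cpp
rem 2. Link an instrumented binary
link /LTCG:PGINSTRUMENT /OUT:app.exe main.obj
rem 3. Run the representative test suite; this writes *.pgc profile data next to app.pgd
app.exe
rem 4. Relink using the collected profile data
link /LTCG:PGOPTIMIZE /OUT:app.exe main.obj

Note that /GL is required for PGO, so the "Whole Program Optimization : No" setting in the question would need to change for this route.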

Compiling Windows C++ application with long doubles in VS2010

At work we have MSVS2010 Ultimate, and I'm writing a program which runs exhaustive simulations using real numbers. I'm getting non-trivial round-off errors and I've already taken reasonable steps to ensure my algorithm is as numerically stable as possible.
I'd like to switch to 128-bit quadruple precision floating point numbers (long double, right?), to see how much of a difference it makes.
I've replaced all relevant instances of double with long double, recompiled, and ran my dummy simulation again, but I get exactly the same result as before.
These are my (debug) compiler options as per my project property page in C/C++:
/ZI /nologo /W3 /WX- /Od /Oy- /D "_MBCS" /Gm /EHsc /RTC1 /GS /fp:precise /Zc:wchar_t /Zc:forScope /Fp"Debug\FFTU.pch" /Fa"Debug\" /Fo"Debug\" /Fd"Debug\vc100.pdb" /Gd /analyze- /errorReport:queue
My dev CPU is a Core2 Duo T7300 but the target machine will be an i7. Both installations are Windows 7 64-bit.
You could switch to a non-Microsoft compiler such as gcc, Borland, or Intel. Those all recognize long double as 80-bit extended precision, the native internal format of the 8087.
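A quick way to see what a given compiler actually gives you for long double is to print its size and mantissa width; this check uses only the standard library (on MSVC, long double is the same 64-bit format as double, so both lines report 53 mantissa bits):

#include <iostream>
#include <limits>

int main()
{
    std::cout << "sizeof(double)            = " << sizeof(double) << '\n';
    std::cout << "sizeof(long double)       = " << sizeof(long double) << '\n';
    std::cout << "double mantissa bits      = " << std::numeric_limits<double>::digits << '\n';
    std::cout << "long double mantissa bits = " << std::numeric_limits<long double>::digits << '\n';
    return 0;
}

With a compiler that maps long double to the x87 80-bit extended format, the last line reports 64 bits instead of 53; true 128-bit quadruple precision (113 mantissa bits) would need something like gcc's __float128 or a multiprecision library.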