Using visual profiler - profiling

I have some CUDA code that I want to profile. Unfortunately, Visual Profiler does not work on the machine I use. Would it be possible to run the code under Visual Profiler on some other machine, or something like that?
(Basically, I am looking for a workaround so that I can find bottlenecks.)

Use this guide: Profiling CUDA Applications on Windows with NVIDIA Compute Visual Profiler

Since an answer hasn't been accepted yet, I suggest giving the newest version of Visual Profiler a try.
The new NVIDIA Visual Profiler (v4.1) supports automated performance analysis to identify performance improvement opportunities in your application. It also links directly to the most useful sections of the Best Practices Guide for the issues it detects. The Visual Profiler is still available for free as part of the CUDA Toolkit on NVIDIA's developer web site: http://www.nvidia.com/getcuda.
If you're still not able to get it working, please file a bug via your (free) NVIDIA registered developer account so the team working on Visual Profiler can investigate further.

Related

Is DirectX 12 development now available to the public?

With Windows Technical Preview build 10074, D3D12.dll, d3d12SDKLayers.dll and d3d12warp.dll are included in %WINDIR%\System32. With Visual Studio 2015 RC with Tools for Windows 10 (aka Windows Kits 10 - 10069), d3d12.lib, d3d12.h etc. are included. Although there seems to be no press release from Microsoft about its availability, the inclusion of these would seem to indicate that it is now available. Is this correct reasoning, or is something else required?
You can begin developing DirectX 12 applications using the resources you described above. The API itself is not yet complete and the GPU drivers available are not yet of a final shipping quality, so do not be surprised if not everything is fully functional or bug free. Try and validate your application against more than one manufacturer's GPU and also against the WARP driver if you encounter problems.
Preliminary documentation is available on MSDN.

How to find out what part of my code is slowing my C++ program

I wrote two versions of my program, which is an evolutionary algorithm in C++. The first version is procedural and runs fine and very fast. The second version is completely OOP; it finds results, but it is very slow (about 10 times slower than the first version). Is there a way to measure the time spent in segments of code inside loops, or something like that? Any advice or idea would help.
Thanks in advance.
Use a profiler. Which one is best depends on the platform/operating environment; e.g. with g++ you can use gprof, or if you don't want to recompile you can use oprofile, assuming Linux. On Solaris you could use dtrace. On other platforms, such as Windows or Mac, add the tag for your platform to the question...
You need a profiler to find performance related issues in your program.
Depending on the Visual Studio edition, you have various levels of profiling support in your Visual Studio. If you're lucky enough to be at the Visual Studio Ultimate or Premium edition, you have very good profiling support built right in.
If you're on Visual Studio Express or Visual Studio Professional, there is sadly no profiling support built into Visual Studio, but the info at this link explains how to do it manually, for free, with those editions anyway.
Use a profiler. If you're compiling with gcc, look up gprof, for example.
For your particular case, I suggest downloading and using this tool: http://www.codersnotes.com/sleepy/
It is a very simple (but efficient) sampling profiler.
Just launch your app with Ctrl+F5 (release) in Visual Studio, run this program (Very Sleepy), double click your exe name, wait, and you will see a detailed report with function names.
For the next level, if needed, use VTune.
You can use the /callcap compiler flag in VS. You can read about it here.
Basically you can add this flag only for the .cpp file that you want to analyze, define the enter/exit functions, rebuild your app, and run it. I suggest you split the code you are trying to analyze (and suspecting of being slow) into functions, and then you can see which piece of code takes more time to execute.
It's a little more work, compared to an already available profiler, but it's worth giving it a try.
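For the manual approach the question asks about (timing segments of code inside loops), a minimal sketch using std::chrono could look like the following. `SegmentTimer` and `Scope` are hypothetical names made up for this illustration and are not part of any tool mentioned above. It measures wall-clock time only, so treat it as a rough comparison aid rather than a replacement for a profiler.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical helper (illustration only): accumulates wall-clock time
// per named code segment so that parts of a loop body can be compared.
class SegmentTimer {
public:
    using Clock = std::chrono::steady_clock;

    // RAII guard: starts timing when constructed and adds the elapsed
    // time to the owner's running total when it goes out of scope.
    class Scope {
    public:
        Scope(SegmentTimer& owner, const std::string& name)
            : owner_(owner), name_(name), start_(Clock::now()) {
            assert(!name.empty());  // an empty label would be useless in the report
        }
        ~Scope() {
            owner_.totals_[name_] += Clock::now() - start_;
            ++owner_.calls_[name_];
        }
        Scope(const Scope&) = delete;
        Scope& operator=(const Scope&) = delete;

    private:
        SegmentTimer& owner_;
        std::string name_;
        Clock::time_point start_;
    };

    // Total seconds recorded for a segment (0.0 if it was never measured).
    double seconds(const std::string& name) const {
        auto it = totals_.find(name);
        if (it == totals_.end()) return 0.0;
        return std::chrono::duration<double>(it->second).count();
    }

    // Number of times a segment was measured (0 if never).
    std::size_t calls(const std::string& name) const {
        auto it = calls_.find(name);
        return it == calls_.end() ? 0 : it->second;
    }

    // Prints one line per segment: name, total seconds, call count.
    void report() const {
        for (const auto& entry : totals_) {
            std::printf("%-16s %12.6f s over %zu calls\n",
                        entry.first.c_str(),
                        seconds(entry.first),
                        calls(entry.first));
        }
    }

private:
    std::map<std::string, Clock::duration> totals_;
    std::map<std::string, std::size_t> calls_;
};
```

To use it, create one `SegmentTimer` outside the loop, wrap each suspect segment in its own block containing a `SegmentTimer::Scope guard(timer, "selection");` (the label is whatever you choose), and call `timer.report()` after the loop. Build in release mode, since timings from unoptimized builds can point at the wrong code.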

How do I perform post-mortem debugging for Windows applications?

I develop unmanaged C++ applications in MSVC2008, and occasionally the application crashes at the customer site. I found an article on this, but it was written in 2002 for Visual Studio .NET. Have things changed since? Can the same technique be used? Is there a newer method?
If you are debugging C and C++ apps for Windows, you want to learn how to use WinDBG (distributed as "debugging tools for Windows"). It has a bit of a learning curve, but the documentation is really good and it really is the best the platform has to offer.
As to your question, you can view a crash dump with windbg -z <dump filename>.
Usually release mode binaries (which typically run at customer site) are built with optimization (for speed/memory etc). Troubleshooting optimized binaries is usually not as easy.
So, first check whether the crash is reproducible with release mode binaries built without optimization. If it is, the job is easier.
Here is some info.
Also look at a tool called ADPlus from Microsoft.

Any way to profile code for cache behavior?

As the title says I'd like to somehow get the cache behavior of my code. I'm running Windows 7 64-bit edition, compiling on Visual Studio 2008 Professional Edition, compiling C++ code.
I understand that there's Valgrind under Linux, but are there any free alternatives I could use, or methods otherwise?
VTune will give you pretty detailed cache and pipeline analysis. It's not cheap though. I believe some level/edition of VS (I remember it was "team edition" on XP) had a decent profiler.
Try AQTime. I'm pretty sure that some of its options include cache profiling.
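As for "methods otherwise": one crude, tool-free check is to time the same work under a cache-friendly and a cache-unfriendly access pattern and compare. The sketch below is only an illustration under that assumption. `sum_matrix` and `TraversalResult` are hypothetical names, and the numbers it produces are wall-clock times, not cache-miss counts, so it cannot replace a profiler that reads hardware counters.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Result of one traversal: the computed sum and the wall-clock time it took.
struct TraversalResult {
    long long sum;
    double seconds;
};

// Sums a rows x cols matrix stored in row-major order.
// contiguous == true  : walks memory sequentially (row by row).
// contiguous == false : walks column by column, jumping `cols` elements
//                       between consecutive accesses.
TraversalResult sum_matrix(const std::vector<int>& data,
                           std::size_t rows, std::size_t cols,
                           bool contiguous) {
    assert(data.size() == rows * cols);

    const auto start = std::chrono::steady_clock::now();
    long long sum = 0;
    if (contiguous) {
        for (std::size_t r = 0; r < rows; ++r)
            for (std::size_t c = 0; c < cols; ++c)
                sum += data[r * cols + c];
    } else {
        for (std::size_t c = 0; c < cols; ++c)
            for (std::size_t r = 0; r < rows; ++r)
                sum += data[r * cols + c];
    }
    const auto stop = std::chrono::steady_clock::now();

    return {sum, std::chrono::duration<double>(stop - start).count()};
}
```

Calling it twice on a matrix much larger than the CPU's last-level cache, once with `contiguous` set to `true` and once with `false`, and comparing the `seconds` fields gives a rough feel for how sensitive an access pattern is. Print or otherwise use both sums so the optimizer cannot drop the loops, and repeat the runs, since single timings are noisy.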

Which IDE should I use for this art project?

I have an art project that will require processing a live video feed to use as the basis of a particle system, which will be rendered using OpenGL and projected on a stage. I have a CUDA enabled graphics card, and I was thinking it would be nice to be able to use that for the image and particle system processing. This project only needs to run on my computer.
I am normally a C# ASP.NET Visual Studio kinda guy, but for this project I plan on using C++. Should I do the work in Eclipse on Ubuntu or Visual Studio in Windows?
I realize this can be fairly arbitrary, but I am wondering if one IDE/OS might be better suited for this kind of work than the other.
Are you aware of OpenFrameworks? This might just help shortcut to what you need.
As far as CUDA or OpenGL support is concerned, you are fine with either of them. The NVIDIA examples are also multiplatform.
The real question is whether you plan on using any GUI toolkit, as there are only a few choices that are really portable.
In the end I'd recommend going with what you feel more comfortable with, or where you will have the biggest knowledge gain (if learning something is a goal of the project).
+1 for Visual Studio.
I haven't heard about any IDE especially good for such tasks.
If you already know VS, I see no reason to learn anything else.
While the CUDA toolkit is cross-platform, I recommend Linux in this case:
The debugger is based on gdb, and the usability of the gcc toolchain is just much better on *nixes. You also don't seem to have any Windows-specific dependencies.
Since you're already familiar with Visual Studio you should probably stick with it. In addition, you'll be able to use the Nexus debugger to debug both the OpenGL and CUDA components.