We have C++ code that we want to profile with NVIDIA Nsight Eclipse Edition (Linux version) before adding CUDA code to it. The idea is to profile the C++ first, find hotspots, convert those to CUDA, profile again, and iterate through this process to successively speed up the code. However, when we profile C++ only, it looks like the profiler requires some existing CUDA code before it generates a timeline and profile output. Has anyone else encountered this?
Nsight Eclipse Edition can only profile CUDA code. You may want to install 3rd party profiling plug-ins to profile host code.
You may try installing the OProfile integration from the Eclipse Foundation site (paste http://download.eclipse.org/releases/indigo/ into the Help/Install New Software... dialog). I just tried it but was unable to set up the oprofile command line properly.
You can manually instrument your code using NVTX (NVIDIA Tools Extension) and have the resulting ranges shown on the timeline in Nsight, but automatic profiling and detailed counters are available for GPU code only.
Yes, Nsight Eclipse Edition can also profile host (CPU) C++ code, although by default it only profiles GPU code. CPU profiling is a much more manual task; it will not profile functions automatically.
You need to use NVTX. Like so:
#include "nvToolsExt.h"
nvtxNameOsThread(0,"InputVideo");
nvtxRangePush(__FUNCTION__);
// .. do some CPU computing here
nvtxRangePop();
Build with -lnvToolsExt -L/usr/local/cuda/lib64
The path to libnvToolsExt.so will be different for everyone. NVTX comes with the CUDA Toolkit.
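To avoid forgetting the matching nvtxRangePop(), the push/pop pair above can be wrapped in a small RAII helper. This is only a sketch: ScopedRange, sum_of_squares and the USE_NVTX build flag are names I made up for illustration, not part of NVTX. Without -DUSE_NVTX the NVTX calls are replaced by no-op stand-ins so the file still compiles on a machine without the CUDA Toolkit.

```cpp
#include <cassert>

#ifdef USE_NVTX  // hypothetical build flag, not defined by NVTX itself
#include "nvToolsExt.h"
#else
// No-op stand-ins so the sketch builds without the CUDA Toolkit installed.
inline int nvtxRangePushA(const char*) { return 0; }
inline int nvtxRangePop() { return 0; }
#endif

// Hypothetical helper: opens an NVTX range on construction and closes it
// when the object goes out of scope, including on early return or exception.
class ScopedRange {
public:
    explicit ScopedRange(const char* name) {
        nvtxRangePushA(name);
        ++depth();
    }
    ~ScopedRange() {
        assert(depth() > 0);  // every pop must match an earlier push
        nvtxRangePop();
        --depth();
    }
    ScopedRange(const ScopedRange&) = delete;
    ScopedRange& operator=(const ScopedRange&) = delete;

    // Number of ranges currently open on this thread (bookkeeping for the
    // sketch only; NVTX does not expose this).
    static int& depth() {
        static thread_local int d = 0;
        return d;
    }
};

// Example CPU hotspot; the range is named after the function.
long long sum_of_squares(int n) {
    ScopedRange range(__func__);
    long long total = 0;
    for (int i = 1; i <= n; ++i) {
        total += 1LL * i * i;
    }
    return total;
}
```

Build with -DUSE_NVTX plus the linker flags mentioned above to get real ranges on the timeline; the range closes automatically when sum_of_squares returns.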
The CUDA blog has a post on this.
I saw that the libstdc++ Profile Mode has been deprecated recently (see the GCC 7 changes).
I know that Profile Mode provides some useful suggestions about the usage of the C++ standard library. Since it is deprecated, how can I get similar suggestions now?
I'd suggest looking at Callgrind, with KCacheGrind as the UI. A quick search turned up these results:
How to profile C++ application with Callgrind / KCacheGrind
Callgrind: Profile a specific part of my code
I had previously been using OpenGL Profiler for Mac to debug my graphics work, and it worked like a charm with Xcode 7.2.
I then upgraded Xcode to version 8 when it came out, and the profiler was gone. I redownloaded it, but ever since I have not been able to record any trace or stop at any breakpoint, and therefore cannot inspect any resources either.
There is currently no profiler newer than the one developed for Xcode 7.2.
Is there any way to use the last OpenGL Profiler with Xcode 8.x?
Thanks in advance.
You can find a similar post on the Apple developer forum here.
A bug report is currently under review.
The only way I found to use OpenGL Profiler on macOS 10.12 is to
enable remote profiling from the preferences. (checkbox is disabled
while you don't set a password) Once enabled, you can connect to it
using another machine (it may be 10.12) and attach profiler to the
running application you want to debug.
I would like to debug some CUDA code on Linux. However, I ran into an error about X11 not being able to share the GPU with the visual debugger in Nsight Eclipse Edition.
However, today I came across this:
3.4.2. Single-GPU Debugging with the Desktop Manager Running
CUDA-GDB can be used to debug CUDA applications on the same GPU that
is running the desktop GUI.
Note: This is a BETA feature available on Linux and supports devices
with SM3.5 compute capability. There are two ways to enable this
functionality:
Use the following command: set cuda software_preemption on
Export the following environment variable: CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1
Either of the options above will activate software preemption. These
options must be set prior to running the application. When the GPU
hits a breakpoint or any other event that would normally cause the GPU
to freeze, CUDA-GDB releases the GPU for use by the desktop or other
applications. This enables CUDA-GDB to debug a CUDA application on the
same GPU that is running the desktop GUI, and also enables debugging
of multiple CUDA applications context-switching on the same GPU.
Note: The options listed above are ignored for GPUs with less than
SM3.5 compute capability.
From here: http://docs.nvidia.com/cuda/cuda-gdb/index.html#single-gpu-debugging-with-desktop-manager-running
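Since the quoted text says the options are ignored below SM3.5, it may be worth checking what an existing card reports before buying anything. A rough sketch, assuming the CUDA runtime is available: meets_sm35, device_meets_sm35 and the USE_CUDA_RUNTIME flag are names I invented for illustration, while cudaGetDeviceProperties and the major/minor fields come from the CUDA runtime API. The comparison itself is plain C++ and compiles without CUDA.

```cpp
#include <cassert>

// Hypothetical helper: true when compute capability major.minor is at
// least 3.5, the minimum the quoted documentation gives for software
// preemption.
constexpr bool meets_sm35(int major, int minor) {
    return major > 3 || (major == 3 && minor >= 5);
}

static_assert(meets_sm35(3, 5), "SM 3.5 itself qualifies");
static_assert(!meets_sm35(3, 0), "SM 3.0 is below the threshold");

#ifdef USE_CUDA_RUNTIME  // hypothetical build flag; requires the CUDA Toolkit
#include <cuda_runtime.h>

// Queries the given device and applies the check above.
// Returns false if the query itself fails.
bool device_meets_sm35(int device) {
    assert(device >= 0);
    cudaDeviceProp prop{};
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return false;
    }
    return meets_sm35(prop.major, prop.minor);
}
#endif
```

This only tells you whether the card is in the supported range; it says nothing about how well the beta feature behaves in practice.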
Question:
So before I ask my project manager for a new graphics card with SM3.5 compute capability, can anyone verify that this is working?
Does it work well?
My platform is CentOS 7.0, Intel 64-bit.
I got the card anyway, and on CentOS 7 it works well. There's some slowdown when it steps into the kernel, but it does what I wanted it to. I can see the variable values inside the kernel.
One thing though: as of today, 18/2/2016, I cannot press "stop" when debugging kernels. It hangs the whole system. Oh well, it did say it is a beta feature.
I usually program on Linux, but I've now set up a Windows environment just to debug with Nsight Visual Studio Edition.
But when I try to start the debugger (either Graphics or CUDA Debugging), it doesn't work. The CUDA debugger just disconnects, and the Graphics debugger disconnects with
FrameDebugger: Unsupported operation encountered; saving compatibility log to 'C:\Users\##\Documents\NVIDIA Nsight\nvcompatlog.txt'
The file then says
cuGraphicsGLRegisterImage (Registering GL textures for CUDA-Interop is unsupported)
Does this mean there is no way to debug CUDA when interop is present? That's hard to believe, so I want to make sure the problem is not limited to my computer.
cuGraphicsGLRegisterImage is not supported in the Graphics Debugger, as nvcompatlog.txt says.
The CUDA Debugger should work. Please contact devtools-support@nvidia.com; you may be asked for the code.
I have some CUDA code that I want to profile. Unfortunately, Visual Profiler does not work on the machine I work on. Would it be possible to test the code with Visual Profiler on some other machine, or something like that?
(Basically, I am looking for a workaround so that I can find bottlenecks.)
Use this guide: Profiling CUDA Applications on Windows with NVIDIA Compute Visual Profiler
Since an answer hasn't been accepted yet, I suggest giving the newest version of Visual Profiler a try.
The new NVIDIA Visual Profiler (v4.1) supports automated performance analysis to identify performance improvement opportunities in your application. It also links directly to the most useful sections of the Best Practices Guide for the issues it detects. The Visual Profiler is still available for free as part of the CUDA Toolkit on NVIDIA's developer web site: http://www.nvidia.com/getcuda.
If you're still not able to get it working, please file a bug via your (free) NVIDIA registered developer account so the team working on Visual Profiler can investigate further.