Nsys does not show the CUDA kernels profiling output - c++

My system has a V100 GPU with the following configuration:
| NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.6 |
NVIDIA Nsight Systems version 2021.5.2.53-28d0e6e
sudo sh -c "echo 2 >/proc/sys/kernel/perf_event_paranoid"
/bin/bash: /proc/sys/kernel/perf_event_paranoid: Read-only file system
Note that perf_event_paranoid is 3.
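Since /proc is read-only here, it can help to at least read the level the profiler will see. A minimal sketch in Python, assuming a Linux host; the function names are made up for illustration, and the default threshold of 2 just mirrors the value the echo command above tries to set.

```python
from pathlib import Path

# Standard Linux location of the setting (assumes a Linux host).
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def parse_paranoid_level(text):
    """Convert the raw file contents (e.g. '3\n') to an integer level."""
    return int(text.strip())


def sampling_allowed(level, max_level=2):
    """Return True if the level is at or below the assumed threshold."""
    return level <= max_level


def read_paranoid_level(path=PARANOID_PATH):
    """Read the current level from /proc without modifying it."""
    return parse_paranoid_level(Path(path).read_text())
```

Calling sampling_allowed(read_paranoid_level()) on the machine that runs nsys shows whether the value seen there is 3, as noted above, or something else.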
Output:
Generated:
/home/build/Baseline.nsys-rep
This is my command prefix:
nsys profile --capture-range=cudaProfilerApi --trace-fork-before-exec true --force-overwrite true -s cpu --cudabacktrace=all --stats=true -t cuda,nvtx,osrt,cudnn,cublas -o Baseline -w true
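One variable that could be isolated here, offered as an assumption rather than a diagnosis: with --capture-range=cudaProfilerApi, nsys is documented to start collecting only once the application calls cudaProfilerStart(), so an application that never makes that call can yield a report without kernel data. The Python sketch below (the helper name is hypothetical) strips that option from a command prefix so the two runs can be compared.

```python
import shlex

# The command prefix from the question, unchanged.
PREFIX = (
    "nsys profile --capture-range=cudaProfilerApi --trace-fork-before-exec true "
    "--force-overwrite true -s cpu --cudabacktrace=all --stats=true "
    "-t cuda,nvtx,osrt,cudnn,cublas -o Baseline -w true"
)


def without_capture_range(command):
    """Return the argument list with any --capture-range option removed.

    Handles both '--capture-range=VALUE' and '--capture-range VALUE'.
    """
    result = []
    skip_next = False
    for arg in shlex.split(command):
        if skip_next:
            # This token is the value of a preceding '--capture-range'.
            skip_next = False
            continue
        if arg == "--capture-range":
            skip_next = True
            continue
        if arg.startswith("--capture-range="):
            continue
        result.append(arg)
    return result


if __name__ == "__main__":
    print(" ".join(without_capture_range(PREFIX)))
```

If the run without the option shows kernels and the run with it does not, the application is probably not reaching cudaProfilerStart().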
This is what nsys status reports:
nsys status -e
Timestamp counter supported: No
Sampling Environment Check
Linux Kernel Paranoid Level = -1: OK
Linux Distribution = Ubuntu
Linux Kernel Version = 5.0.0-1032-azure: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Not Available
Sampling Environment: OK
This is the output from the Nsight viewer (no kernel data):
[Image: Profile Output]
This is the diagnostics view:
[Image: Diagnostics View]

I tried CUDA version 11.0, and that was the only version for which Nsight produced profiles with my device driver. With other CUDA versions I did not get any Nsight profiles.
Please check the following post for more details:
https://forums.developer.nvidia.com/t/nsys-does-not-show-the-kernels-output/229526/17

Related

openocd fails to load board/ti_cc26x0_launchpad.cfg

Has anyone got openocd to work with the TI cc2640r2 launchpad? I built the latest openocd source but it fails to initialise.
OS is Ubuntu 18.04.1 LTS and openocd was built with
configure --enable-xds110 --enable-cmsis-dap
make
make install
Running
openocd -f board/ti_cc26x0_launchpad.cfg
gets the output
Open On-Chip Debugger 0.10.0+dev-00676-g346ce2f1 (2019-02-05-00:53)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
adapter speed: 2500 kHz
Error: The 'jtag configure' command must be used after 'init'.
Placing 'debug_level 3' statements inside the script files shows that it fails in target/ti_cc26x0.cfg at line 25, which is
jtag configure $_CHIPNAME.cpu -event tap-enable "icepick_c_tapenable $_CHIPNAME.jrc 0"
The scripts must have worked (at least once) as they are part of the source distribution.
I use the Zephyr fork of OpenOCD:
git clone https://github.com/zephyrproject-rtos/openocd.git
cd openocd
./configure
make
make install
I also needed to reduce the JTAG clock speed:
diff --git a/tcl/board/ti_cc26x0_launchpad.cfg b/tcl/board/ti_cc26x0_launchpad.cfg
index 3613a47f7..2580faa52 100644
--- a/tcl/board/ti_cc26x0_launchpad.cfg
+++ b/tcl/board/ti_cc26x0_launchpad.cfg
@@ -2,6 +2,6 @@
 # TI CC26x0 LaunchPad Evaluation Kit
 #
 source [find interface/xds110.cfg]
-adapter_khz 2500
+adapter_khz 1500
 transport select jtag
 source [find target/ti_cc26x0.cfg]

OProfile failed to generate callgraph

I'm trying to generate a callgraph using oprofile and for some reason it fails.
I'm using the commands below to configure it:
opcontrol --shutdown
opcontrol --reset
opcontrol --no-vmlinux
opcontrol --separate=library
opcontrol --event=default
opcontrol --callgraph=20
opcontrol --status
Here I get:
Daemon not running
Event 0: CPU_CLK_UNHALTED:100000:0:1:1
Separate options: library
vmlinux file: none
Image filter: none
Call-graph depth: 20
Buffer size: 10000000
CPU buffer watershed: 2560000
CPU buffer size: 160000
Then, when I try to generate the callgraph (for example, using opreport pdpd -l --callgraph -o profile_pdp.txt),
I get:
30 0.7659 libpthread-2.5.so pthread_mutex_lock
30 100.000 libpthread-2.5.so pthread_mutex_lock [self]
My Linux kernel version is 2.6.18.
I do get the following error when running opreport (I don't know if it is relevant):
opreport: /usr/lib64/libstdc++.so.6: no version information available (required by opreport)
Any idea why I can't get the full callgraph?
Found the issue: I was running a 64-bit kernel while profiling a 32-bit executable. I don't know why that is a problem for OProfile.
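To check for this kind of mismatch up front, the ELF header of the executable can be inspected: byte 4 of the identification field (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. A minimal Python sketch, with illustrative function names that are not part of OProfile:

```python
ELF_MAGIC = b"\x7fELF"
# EI_CLASS values from the ELF specification.
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def elf_class(header):
    """Classify an ELF object from the first bytes of the file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return ELF_CLASSES.get(header[4], "unknown")


def elf_class_of_file(path):
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))
```

Comparing elf_class_of_file() for the profiled binary with the output of platform.machine() (for example x86_64) shows whether a 32-bit executable is running on a 64-bit kernel.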

SVM - AMD VM extensions = 0 (1)

I get an error while installing Android OS on VirtualBox (virtualbox-5.0_5.0.14-105127):
Kernel panic - not syncing: Attempted to kill the idle task!
Virtualbox->Settings->System->Acceleration->Default (all enabled)
Host OS - Linux Mint 17.3
motherboard - FM2A85X-ITX
CPU - A10-5800K
In log:
SVM - AMD VM extensions = 0 (1)
Tried different versions of Virtualbox.
I tried the following, without success:
sudo killall VBoxSVC
export VBOX_HWVIRTEX_IGNORE_SVM_IN_USE=true
VirtualBox
Meanwhile, on the same machine under Windows 7 with VirtualBox 4.2.x, it works fine.
How can I overcome the problem?
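A quick host-side check is whether the CPU advertises AMD-V at all, which on Linux appears as the svm flag in /proc/cpuinfo. The Python sketch below uses made-up helper names; note that the flag only shows what the CPU reports and may not prove that the extension is enabled in the firmware or free for VirtualBox to use.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_amd_v(cpuinfo_text):
    """True if the 'svm' (AMD-V) flag is present."""
    return "svm" in cpu_flags(cpuinfo_text)


def host_has_amd_v(path="/proc/cpuinfo"):
    """Check the running Linux host."""
    with open(path) as f:
        return has_amd_v(f.read())
```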

apportable debug gives me an attach error with

Environment: OS X 10.9, XCode 4.6.3,
tweejump git:(master) ✗ apportable --version
Apportable SDK version release_1.0.31 (53ea42fec9b094b91c988f3bfde6dff8ba683a4d starter)
clang version 7fc8b05e4f57f61dbbbe5c8e62581b0e0c42941e
gdb version ff0611b8b721b3bf393c655c7d147de52cc850ac
android sdk version r21.0.1.1
android ndk version r8d.1
unknown ninja
I downloaded tweejump, built it, and installed the game.
Then I wanted to check whether I can debug with gdb using
apportable just_debug
and
ROOTED=yes apportable just_debug
Both commands gave me the same output:
building with TARGET_ARCH_ABI:armeabi ARM_NEON:False
Building to /Users/xxx/.apportable/SDK/Build/android-armeabi-debug
Loading configuration.
Finished parsing configuration.
scons: Building targets ...
Debugging...
Starting: Intent { cmp=com.iplayful.tweejump/com.apportable.activity.VerdeActivity (has extras) }
Warning: Activity not started, its current task has been brought to the front
Failed to load one the Breakpoints files:
/Users/xxx/workspace/tweejump/tweejump.xcodeproj/xcuserdata/xxx.xcuserdatad/xcdebugger/Breakpoints.xcbkptlist
/Users/xxx/workspace/tweejump/tweejump.xcodeproj/xcuserdata/xxx.xcuserdatad/xcdebugger/Breakpoints_v2.xcbkptlist
Attaching to pid 8085
Cannot attach to lwp 8085: Operation not permitted (1)
Exiting
I saw some answers mentioning run-as, but how can an Android newbie work that out? Could I have a step-by-step tutorial?
Edit1:
device: SAMSUNG SCH-I739
Android version: 4.1.2
Edit2:
I searched and found a debug solution:
$ adb shell
$ su
$ /data/data/com.iplayful.tweejump/lib/gdbserver :1111 --attach 26337
On my Mac:
$ ~/.apportable/toolchain/macosx/gdb/bin/arm-elf-linux-gdb
(gdb) file ./gdb/app_process
(gdb) shell adb forward tcp:1111 tcp:1111
(gdb) target remote :1111
(gdb) continue
Then gdb attached to gdbserver.
But gdb can't find the symbols, which leads to my second question:
if I use this method to debug the game, where do I find the game's symbols and libraries?
It looks like there is a gdbserver running on the device in a bad state.
Try rebooting the device and then run apportable just_debug.
If there are still issues, add the Android device and Android version to the question.

Compiling on Vortex86: "Illegal instruction"

I'm using an embedded PC which has a Vortex86-SG CPU and runs Ubuntu 10.04 with kernel 2.6.34.10-vortex86-sg. Unfortunately, we can't compile a new kernel because we don't have any source code, not even drivers or patches.
I have to run a small project written in C++ with OpenFrameworks. The framework compiles correctly after running each script in of_v0071_linux_release/scripts/linux/ubuntu/install_*.sh.
I noticed that in order to compile against Vortex86/Ubuntu 10.04, the following options must be added in every config.make file:
USER_CFLAGS = -march=i486
USER_LDFLAGS = -lGLEW
In fact, it compiles without errors, but the generated binary doesn't start at all:
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin# ./emptyExample
Illegal instruction
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin# echo $?
132
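The exit status itself confirms the cause: shells report 128 plus the signal number when a process is killed by a signal, and 132 - 128 = 4 is SIGILL on Linux. A small Python sketch of that decoding (the function name is illustrative):

```python
import signal


def describe_exit_status(status):
    """Decode a shell exit status using the 128 + signal number convention."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = "signal %d" % signum
        return "killed by %s" % name
    return "exited with code %d" % status


if __name__ == "__main__":
    print(describe_exit_status(132))
```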
Strace last lines:
munmap(0xb77c3000, 4096) = 0
rt_sigprocmask(SIG_BLOCK, [PIPE], NULL, 8) = 0
--- SIGILL (Illegal instruction) @ 0 (0) ---
+++ killed by SIGILL +++
Illegal instruction
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin#
Any idea to solve this problem?
I know I am a bit late on this, but I recently had my own issues trying to compile the kernel for the Vortex86DX, and I was finally able to build it. Use these steps at your own risk, as I am not a Linux guru, and you may have to change some settings to suit your own preferences/hardware:
Download and use a Linux distribution that runs a kernel version similar to the one you plan to compile. Since I was compiling Linux 2.6.34.14, I downloaded and installed Debian 6 in VirtualBox with adequate RAM and processor allocations. You could potentially compile on the Vortex86DX itself, but that would likely take forever.
Make sure you have the dependencies: # apt-get install ncurses-dev kernel-package
Download kernel from kernel.org (I grabbed Linux-2.6.34.14.tar.xz). Extract files from package.
Grab the config file from the DMP FTP site: ftp://vxmx:gc301@ftp.dmp.com.tw/Linux/Source/config-2.6.34-vortex86-sg-r1.zip. Please note the vxmx user name. Copy the config file to the freshly extracted Linux source folder.
Grab the patch at ftp://vxdx:gc301@ftp.dmp.com.tw/Driver/Linux/config%26patch/patch-2.6.34-hda.zip. Please note the vxdx user name. Copy it to the kernel source folder.
Patch Kernel: #patch -p1 < patchfilename
Configure the kernel with: # make menuconfig
Load Alternate Configuration File
Enable generic x86 support
Enable Math Emulation
I disabled generic IDE support because I will be using legacy mode (selectable in the BIOS)
Under Device Drivers -> Ethernet (10 or 100Mbit) -> Make sure RDC R6040 Fast Ethernet Adapter Support is selected
USB support -> Select Support for Host-side USB, EHCI HCD (USB 2.0) support, OHCI HCD support
Save the config as .config
Check the serial ports: edit .config manually and make sure CONFIG_SERIAL_8250_NR_UARTS=4 (or more if you have additional ports) and CONFIG_SERIAL_8250_RUNTIME_UARTS=4 (or more if you have additional ports). If you are going to use more than 4 serial ports, make sure CONFIG_SERIAL_8250_MANY_PORTS is set.
Compile the kernel headers and source: # make-kpkg --initrd kernel_image kernel_source kernel_headers modules_image
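The serial port check from the steps above can be automated before starting the long build. This is a rough Python sketch under the assumption that .config uses the usual KEY=value lines; the function names and the wanted parameter are illustrative only.

```python
def parse_kconfig(text):
    """Parse KEY=value lines of a kernel .config, skipping comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # '# CONFIG_X is not set' lines count as unset
        key, sep, value = line.partition("=")
        if sep:
            options[key.strip()] = value.strip().strip('"')
    return options


def check_serial_ports(options, wanted=4):
    """Return a list of problems for the 8250 serial settings."""
    problems = []
    for key in ("CONFIG_SERIAL_8250_NR_UARTS",
                "CONFIG_SERIAL_8250_RUNTIME_UARTS"):
        value = options.get(key)
        if value is None or not value.isdigit() or int(value) < wanted:
            problems.append("%s should be at least %d (found %s)"
                            % (key, wanted, value))
    # More than 4 ports additionally needs MANY_PORTS, per the step above.
    if wanted > 4 and options.get("CONFIG_SERIAL_8250_MANY_PORTS") != "y":
        problems.append("CONFIG_SERIAL_8250_MANY_PORTS should be set to y")
    return problems
```

For example, check_serial_ports(parse_kconfig(open(".config").read())) returns an empty list when both UART counts are at least 4.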