Core file truncated for a DPDK application running inside pod/container - CentOS 7

I am running a DPDK application inside a container (running in a Kubernetes cluster). When the application crashes with a segmentation fault, I get a truncated core file. ulimit -c is set to unlimited, and there is no disk-space issue. The expected core file is around 10 GB in size; only rarely do I get a complete one. Tracing through the kernel code, I see that the dump_interrupted function (fs/coredump.c) is returning true.
OS: CentOS 7
Kernel version used: 3.10.0-693.2.2.rt56.623.el7.x86_64
Is there any way to increase the time for core collection in the kernel?
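From the kernel source, dump_interrupted() returns true when a signal is pending on the dumping task (typically SIGKILL), so this looks less like a kernel timeout and more like something, perhaps the kubelet's liveness probe or container teardown, delivering SIGKILL mid-dump. One mitigation I am considering is to relax the probe and grace period so nothing kills the process while the core is being written (a sketch only; my-dpdk-app and dpdk-app are hypothetical names, and the numbers are illustrative, not tuned):
# Sketch: lengthen the grace period and relax the liveness probe so the
# kubelet does not SIGKILL the container while the 10 GB core is written.
# Resource and container names below are hypothetical.
kubectl patch deployment my-dpdk-app -p '
{"spec":{"template":{"spec":{
  "terminationGracePeriodSeconds": 600,
  "containers": [{"name": "dpdk-app",
    "livenessProbe": {"failureThreshold": 10, "periodSeconds": 60}}]
}}}}'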

Getting user events from VTune doesn't work with attach to process

TL;DR
I am attempting to run a command-line VTune attach-to-process analysis on code instrumented with the ITT API library supplied by Intel. I have succeeded in collecting user events when launching the application under VTune (both command line and GUI). When I use the -target-pid command-line option to connect to the same application, the user events do not show up in the profile. The environment setup suggested in the instructions for attaching to a process does not work.
The long version
I have pared this down again and again to the minimum reproduction. I am running Ubuntu 20.04 with Intel VTune installed as part of the oneAPI installer package. I have built an example application, which I can share, but it basically spawns threads and does some random computations. I have instrumented the code with ITT as follows:
#include <ittnotify.h>
// the second argument is the event name length: 7 == strlen("CloudIn")
__itt_event cloud_in_event = __itt_event_create( "CloudIn", 7 );
...
void add() {
    __itt_event_start( cloud_in_event );
    ...
This works correctly when run through the GUI. That is, I compile my application with the following:
g++ -g -O3 -fno-asm -std=c++17 -I/opt/intel/oneapi/vtune/latest/sdk/include -DUSE_THR example.cpp -o ./example -lpthread -lm -L/opt/intel/oneapi/vtune/2021.4.0/sdk/lib64 -littnotify -ldl -D_LINUX
I start the GUI using:
. /opt/intel/oneapi/setvars.sh && vtune-gui &
Run it using the CPU hotspots analysis in hw mode. The application runs and I get this in the output:
[screenshot: VTune results showing the CloudIn user event on the timeline]
Yay, my user event is there. All is well.
The equivalent command line also works:
/opt/intel/oneapi/vtune/2021.4.0/bin64/vtune -collect hotspots -knob sampling-mode=hw -knob stack-size=0 --app-working-dir=/home/development/example/example -- /home/development/hovermap/example/example
However, if I run the application on its own (with the collector path set in the INTEL_LIBITTNOTIFY environment variables) and then attach to that process with the GUI (or with the command line), there are no user events (i.e., the CloudIn event in the above image) in the profiler data.
If I print out the environment variables in the application, there are substantial differences between the environments when profiling directly vs. when attaching. For example, the following:
INTEL_JIT_PROFILER32=/opt/intel/oneapi/vtune/2021.4.0/lib32/runtime/libittnotify_collector.so
INTEL_JIT_PROFILER64=/opt/intel/oneapi/vtune/2021.4.0/lib64/runtime/libittnotify_collector.so
ENABLE_JITPROFILING=1
These exist in the GUI-launched run environment, but the setup instructions say nothing about these environment variables. I have also tried setting them, with no luck.
Any ideas what else I need to set up?
If you want to attach to an application that uses the ITT API, you need to set additional environment variables before running it, for example:
export INTEL_LIBITTNOTIFY32=/opt/intel/oneapi/vtune/2021.4.0/lib32/runtime/libittnotify_collector.so
export INTEL_LIBITTNOTIFY64=/opt/intel/oneapi/vtune/2021.4.0/lib64/runtime/libittnotify_collector.so
./example
These environment variables are described in the "Attach ITT APIs to a Launched Application" help topic in the VTune User Guide.
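As a quick sanity check before attaching (a sketch; pidof example assumes your binary is still named example), you can confirm the collector actually got loaded into the running process:
# If the INTEL_LIBITTNOTIFY* variables took effect, the collector library
# shows up in the target's memory map once the first __itt_* call is made:
grep libittnotify_collector /proc/$(pidof example)/maps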

java.lang.OutOfMemoryError when running bazel build

I have been trying to install the ONOS controller on an Ubuntu VM on my Mac, following the steps in this link: Download ONOS code & Build ONOS.
However, the build fails after executing the following command:
~/onos$ bazel build onos
The above command outputs the following:
Starting local Bazel server and connecting to it...
INFO: Analysed target //:onos (759 packages loaded, 12923 targets configured).
INFO: Found 1 target...
.
.
.
enconfig-native; [2,128 / 2,367] //models/openconfig:onos-models-openconfig-native; ERROR: /home/mohamedzidan/onos/models/openconfig/BUILD:11:1: Building models/openconfig/libonos-models-openconfig-native-class.jar (2 source jars) failed (Exit 1)
[2,128 / 2,367] //models/openconfig:onos-models-openconfig-native; An exception has occurred in the compiler (10.0.1). Please file a bug against the Java compiler via the Java bug reporting page (http://bugreport.java.com) after checking the Bug Database (http://bugs.java.com) for duplicates. Include your program and the following diagnostic in your report. Thank you.
java.lang.OutOfMemoryError: Java heap space
at jdk.compiler/com.sun.tools.javac.util.ArrayUtils.ensureCapacity(ArrayUtils.java:60)
at jdk.compiler/com.sun.tools.javac.util.SharedNameTable.fromUtf(SharedNameTable.java:132)
at jdk.compiler/com.sun.tools.javac.util.Names.fromUtf(Names.java:392)
at jdk.compiler/com.sun.tools.javac.util.ByteBuffer.toName(ByteBuffer.java:159)
at jdk.compiler/com.sun.tools.javac.jvm.ClassWriter$CWSignatureGenerator.toName(ClassWriter.java:320)
at jdk.compiler/com.sun.tools.javac.jvm.ClassWriter$CWSignatureGenerator.access$300(ClassWriter.java:266)
at jdk.compiler/com.sun.tools.javac.jvm.ClassWriter.typeSig(ClassWriter.java:335)
at jdk.compiler/com.sun.tools.javac.jvm.ClassWriter.writeMethod(ClassWriter.java:1153)
at jdk.compiler/com.sun.tools.javac.jvm.ClassWriter.writeMethods(ClassWriter.java:1653)
at jdk.compiler/com.sun.tools.javac.jvm.ClassWriter.writeClassFile(ClassWriter.java:1761)
at jdk.compiler/com.sun.tools.javac.jvm.ClassWriter.writeClass(ClassWriter.java:1679)
at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.genCode(JavaCompiler.java:743)
at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.generate(JavaCompiler.java:1641)
at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.generate(JavaCompiler.java:1609)
at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.compile(JavaCompiler.java:959)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.lambda$doCall$0(JavacTaskImpl.java:100)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl$$Lambda$97/1225568095.call(Unknown Source)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.handleExceptions(JavacTaskImpl.java:142)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.doCall(JavacTaskImpl.java:96)
at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.call(JavacTaskImpl.java:90)
at com.google.devtools.build.buildjar.javac.BlazeJavacMain.compile(BlazeJavacMain.java:113)
at com.google.devtools.build.buildjar.SimpleJavaLibraryBuilder$$Lambda$70/778731861.invokeJavac(Unknown Source)
at com.google.devtools.build.buildjar.ReducedClasspathJavaLibraryBuilder.compileSources(ReducedClasspathJavaLibraryBuilder.java:57)
at com.google.devtools.build.buildjar.SimpleJavaLibraryBuilder.compileJavaLibrary(SimpleJavaLibraryBuilder.java:116)
at com.google.devtools.build.buildjar.SimpleJavaLibraryBuilder.run(SimpleJavaLibraryBuilder.java:123)
at com.google.devtools.build.buildjar.BazelJavaBuilder.processRequest(BazelJavaBuilder.java:105)
at com.google.devtools.build.buildjar.BazelJavaBuilder.runPersistentWorker(BazelJavaBuilder.java:67)
at com.google.devtools.build.buildjar.BazelJavaBuilder.main(BazelJavaBuilder.java:45)
[2,128 / 2,367] //models/openconfig:onos-models-openconfig-native; Target //:onos failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 1386.685s, Critical Path: 117.31s
INFO: 379 processes: 125 linux-sandbox, 254 worker.
FAILED: Build did NOT complete successfully
Your output shows java.lang.OutOfMemoryError: Java heap space. You can increase the amount of memory available to javac with something like this:
BAZEL_JAVAC_OPTS="-J-Xms384m -J-Xmx512m"
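For example (a sketch, assuming your Bazel version honors this variable as discussed in the issue linked below; the 1 GB figure is illustrative):
export BAZEL_JAVAC_OPTS="-J-Xms384m -J-Xmx1g"
bazel shutdown    # restart the persistent worker JVMs so the new flags apply
bazel build onos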
If that still doesn't work, try progressively increasing sizes for -Xmx. This issue is discussed further at:
https://github.com/bazelbuild/bazel/issues/1308
Summary
If bazel runs out of memory while building, and you see this error:
java.lang.OutOfMemoryError: Java heap space
...then do this:
Increase your RAM or your virtual-memory swap file size to emulate having more RAM (details on how to do this are below).
From now on, build with a bazel command like the following, which gives Bazel more heap space (RAM) while building. In this case I am giving it a 32 GB maximum heap:
# Do this to give Bazel up to 32GB of RAM while building
time bazel --host_jvm_args=-Xmx32g build //...
# ...instead of doing this
time bazel build //...
Details
If Bazel fails with any version of the following error, it's because it ran out of heap space while trying to build.
Example error:
java.lang.OutOfMemoryError: Java heap space
I see that error in the output you pasted. Although not widely known, some monster-sized projects and monorepos can require a heap of 16 GB or more, so I recommend you just create a massive 32-64 GB swap file (virtual memory) on your Linux build machine and let the build use the whole thing!
CAUTION: if you have a standard HDD (spinning hard disk drive), using swap this way may make the build run dozens or even hundreds of times slower than building in physical RAM, because HDDs are horribly slow!
BUT: if you have a 2.5" or 3.5" SSD (solid-state drive), it works okay, and an m.2 form-factor SSD is better still! An m.2 SSD is fast enough that huge swap files can stand in for RAM.
With a top-of-the-line internal m.2 SSD, I expect a build that spills into virtual memory to be only ~2x slower than building entirely in physical RAM of the same size. With a super-slow spinning HDD, however, the same build that takes 2 hrs with the swap file on an internal m.2 SSD might take several days or more.
Your results may vary, of course; the slower you expect your swap file (virtual memory) to be, the smaller you should keep Bazel's JVM heap.
Increase your system’s swap file (virtual memory) to at least 32~64 GB. To add or remove a swapfile, follow the detailed instructions here: https://linuxize.com/post/how-to-add-swap-space-on-ubuntu-18-04/. UPDATE: use my own instructions here instead: How do I increase the size of swapfile without removing it in the terminal?. My instructions avoid the pitfalls of fallocate by using dd instead, as I explain in my answer there.
In short, here is how to add a swapfile:
sudo dd if=/dev/zero of=/swapfile count=64 bs=1G # Create a 64 GiB file
sudo mkswap /swapfile # turn this new file into swap space
sudo chmod 0600 /swapfile # only let root read from/write to it,
# for security
sudo swapon /swapfile # enable it
swapon --show # verify this new 64GB swap file is
# now active
sudo gedit /etc/fstab # edit the /etc/fstab file to make these
# changes persistent (load them each boot)
# ADD this line to bottom (w/out the # comment symbol):
# /swapfile none swap sw 0 0
cat /proc/sys/vm/swappiness # not required: verify your systems
# "swappiness" value. Note: values now range 0 to 200 (they used to only
# go up to 100), and have a default value of 60. I highly recommend
# you follow my instructions here to set your swappiness to 0,
# however, to improve your system's performance:
# https://askubuntu.com/a/1445347/327339
To resize or delete your swap file: if you ever need to resize the swap file you made above, first delete it like this:
sudo swapoff -v /swapfile # turn swap file off
sudo swapon --show # verify the swap file is off
free -h # you can also look at this as an
# indication the swap file is off
sudo rm /swapfile # remove the swap file
Then, you can either follow the instructions above again to recreate it at a new size, or if you are permanently deleting it you'll need to edit your /etc/fstab file to remove the /swapfile none swap sw 0 0 line you previously added to the bottom of it.
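If you prefer to make that fstab edit non-interactively, something like this works (a sketch; the pattern must exactly match the line you added earlier):
# Delete the swap file entry from /etc/fstab in place
sudo sed -i '\|^/swapfile none swap sw 0 0$|d' /etc/fstab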
Add --host_jvm_args=-Xmx32g to any bazel command, right after the word bazel. This sets the max Java virtual machine heap for the Bazel build to 32 GB, which spills into your swap file once your physical RAM is full. If you have a high-speed SSD, which handles swap surprisingly well, expect to wait a few hours at most for your build to complete, depending on the repo size. If you have an old spinning HDD, a repo that takes 2 hrs to build with the swap file on an internal m.2 SSD might take up to several days with the swap file on a slow spinning HDD, especially if it's an external rather than an internal drive.
Here is a sample full bazel command with this bazel startup option added, to build an entire repo:
time bazel --host_jvm_args=-Xmx32g build //...
...instead of this:
time bazel build //...
The time prefix just prints a readable report of how long the build took (I like it). Just be sure to set the max JVM heap allotted to Bazel for any bazel build command by putting --host_jvm_args=-Xmx32g (or similar) after the word bazel whenever you need it.
Note that setting the max heap, as we are doing here with -Xmx, is NOT the same thing as setting the initial heap, as others might do with -Xms. Setting only the max heap starts from the default initial size but lets the heap grow as needed. The other answer shows setting both via an environment variable.
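Concretely, the startup flag can be repeated if you also want to pin the initial heap (sizes here are illustrative):
# Max heap only: starts at the JVM's default initial size, grows up to 32GB
time bazel --host_jvm_args=-Xmx32g build //...

# Initial + max heap: starts at 4GB, can still grow to 32GB
time bazel --host_jvm_args=-Xms4g --host_jvm_args=-Xmx32g build //...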
Done!
References:
[my own answer] Ask Ubuntu: How do I increase the size of swapfile without removing it in the terminal?
https://linuxize.com/post/how-to-add-swap-space-on-ubuntu-18-04/
https://serverfault.com/questions/684771/best-way-to-disable-swap-in-linux/684792#684792
My answer: How do I configure swappiness?
See also:
https://github.com/bazelbuild/bazel/issues/1308

"virtual memory exhausted" when building Docker image

When building a Docker image, there are some compilations of C++ sources, and I ended up with errors like:
src/amun/CMakeFiles/cpumode.dir/build.make:134: recipe for target 'src/amun/CMakeFiles/cpumode.dir/cpu/decoder/encoder_decoder_state.cpp.o' failed
virtual memory exhausted: Cannot allocate memory
But when building the same .cpp code on the host machine, it works fine.
After some checking, the error message seems similar to the one people get on a Raspberry Pi: https://www.bitpi.co/2015/02/11/how-to-change-raspberry-pis-swapfile-size-on-rasbian/
And after some more googling, this post on a Mac forum says:
Swapfiles are dynamically created as needed, until either the disk is
full, or the kernel runs out of page table space. I do not think you
can change the page table space limits in the Mac OS X kernel. I have
not seen anything in all the years I've been using OS X.
Is there a way to increase the swap space for Docker build on Mac OS?
If not, what else can be done to overcome the "virtual memory exhausted" error when building a Docker image?
That does not seem trivial to do with xhyve.
As stated in this thread:
I think the default size of the VM is 16GB. I kept running out of swap space even after bumping the ram on the VM up to 16GB.
Check if the method used for a VirtualBox VM would apply in XHyve: see "How to increase the swap space available in the boot2docker virtual machine?"
boot2docker ssh                    # enter the boot2docker VM
export SWAPFILE=/mnt/sda1/swapfile
sudo dd if=/dev/zero of=$SWAPFILE bs=1024 count=4194304    # 4 GiB of zeros
sudo mkswap $SWAPFILE
sudo chmod 600 $SWAPFILE
sudo swapon $SWAPFILE
exit
Check also this Digital Ocean Setup, again to test in your XHyve context.
mkswap is also seen here or in docker-root-xhyve/contrib/makehdd/makehdd.sh.
Since you have enough available memory on your host, I recommend assigning more memory to the Docker VM behind it.
As stated here:
As I can see, you are on OSX, which runs Docker inside a Linux VM. Configure the max memory by clicking the whale icon in the task bar. By default it is 2 GB.
For further information: https://docs.docker.com/docker-for-mac/#memory
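If you can't give the VM more memory, a complementary workaround is to lower the build parallelism so fewer g++ processes hold large translation units in memory at once (a sketch; it trades build time for a smaller peak footprint and assumes the project builds with make):
# Inside the Dockerfile build step: compile with a single job,
# i.e. RUN make -j1 instead of RUN make -j"$(nproc)"
make -j1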

Compiling on Vortex86: "Illegal instruction"

I'm using an embedded PC with a Vortex86-SG CPU, running Ubuntu 10.04 with kernel 2.6.34.10-vortex86-sg. Unfortunately we can't compile a new kernel, because we don't have any source code, not even drivers or patches.
I have to run a small project written in C++ with openFrameworks. The framework compiles fine using the scripts in of_v0071_linux_release/scripts/linux/ubuntu/install_*.sh.
I noticed that in order to compile on Vortex86/Ubuntu 10.04, the following options must be added to every config.make file:
USER_CFLAGS = -march=i486
USER_LDFLAGS = -lGLEW
In effect, it compiles without errors, but the generated binary doesn't start at all:
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin# ./emptyExample
Illegal instruction
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin# echo $?
132
Last lines of the strace output:
munmap(0xb77c3000, 4096) = 0
rt_sigprocmask(SIG_BLOCK, [PIPE], NULL, 8) = 0
--- SIGILL (Illegal instruction) @ 0 (0) ---
+++ killed by SIGILL +++
Illegal instruction
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin#
Any idea to solve this problem?
I know I am a bit late on this, but I recently had my own issues trying to compile the kernel for the Vortex86DX, and I was finally able to build it. Use these steps at your own risk, as I am not a Linux guru and you may have to change some settings for your own preferences/hardware:
Download and use a Linux distribution that runs a kernel version similar to the one you plan on compiling. Since I will be compiling Linux 2.6.34.14, I downloaded and installed Debian 6 in VirtualBox with adequate RAM and processor allocations. You could potentially compile on the Vortex86DX itself, but that would likely take forever.
Make sure you have the dependencies: # apt-get install ncurses-dev kernel-package
Download the kernel from kernel.org (I grabbed linux-2.6.34.14.tar.xz) and extract the files from the package.
Grab the config file from the DMP FTP site: ftp://vxmx:gc301@ftp.dmp.com.tw/Linux/Source/config-2.6.34-vortex86-sg-r1.zip. Please note the vxmx user name. Copy the config file to the freshly extracted Linux source folder.
Grab the patch at ftp://vxdx:gc301@ftp.dmp.com.tw/Driver/Linux/config%26patch/patch-2.6.34-hda.zip. Please note the vxdx user name. Copy it to the kernel source folder.
Patch the kernel: # patch -p1 < patchfilename
Configure the kernel with # make menuconfig and apply the following:
Load Alternate Configuration File
Enable generic x86 support
Enable Math Emulation
I disabled generic IDE support because I will be using legacy mode (selectable in the BIOS)
Under Device Drivers -> Ethernet (10 or 100Mbit) -> make sure RDC R6040 Fast Ethernet Adapter Support is selected
USB support -> select Support for Host-side USB, EHCI HCD (USB 2.0) support, OHCI HCD support
Save the config as .config
Check the serial ports: edit .config manually and make sure CONFIG_SERIAL_8250_NR_UARTS=4 and CONFIG_SERIAL_8250_RUNTIME_UARTS=4 (or more if you have additional ports). If you are going to use more than 4 serial ports, make sure CONFIG_SERIAL_8250_MANY_PORTS is set.
Compile the kernel, headers, and source: # make-kpkg --initrd kernel_image kernel_source kernel_headers modules_image
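make-kpkg drops the resulting .deb packages in the directory above the kernel tree; installing them on the target is then something like this (a sketch; the exact file names depend on your version string):
# Copy the packages to the Vortex86DX target, then:
sudo dpkg -i linux-image-2.6.34.14*.deb linux-headers-2.6.34.14*.deb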

How to program the STM32 flash using OpenOCD and GDB

I'm using an Olimex ARM-USB-OCD dongle with OpenOCD and GDB to program and debug an STM32F103 micro. The IDE I'm using came from the Olimex dev-kit CD and is based on Eclipse Ganymede.
I can load a small program into the RAM and step through the code without any problems.
I now have a much larger program which doesn't fit into RAM (which is only 20K) and so I'd like to run it from flash (which is 128K).
I've modified the linker script to indicate that the program code should go in the flash section (address 0x8000000), but gdb fails to load the program:
(gdb)
20 load main.out
&"load main.out\n"
load main.out
~"Loading section .text, size 0xb0e6 lma 0x8000000\n"
Loading section .text, size 0xb0e6 lma 0x8000000
&"Load failed\n"
Load failed
What should I do to get gdb to load the program into flash?
Have you considered flashing directly with OpenOCD? I am doing this in a similar setup, but with an ARM7 microcontroller.
openocd -f flash.cfg
Here is my flash.cfg:
set CHIPNAME at91sam7x512
source [find interface/olimex-arm-usb-ocd.cfg]
source [find target/at91sam7sx.cfg]
init
halt
flash probe 0
flash probe 1
flash erase_sector 0 0 15
flash erase_sector 1 0 15
flash write_image my-image.elf
at91sam7 gpnvm 0 set
at91sam7 gpnvm 1 set
at91sam7 gpnvm 2 set
shutdown
The GPNVM stuff is Atmel SAM7 specific, but I think this script should give you a good starting point for making a STM32 version. Openocd can be a bit confusing in the beginning, but the documentation is good and worth reading (http://openocd.berlios.de/). The current stable version (0.4.0) is quite old, so if you have problems, download the latest source code and compile your own.