GCC compilation times slower on Debian Stretch/Buster than Wheezy/Jessie - c++

In my company we build programs for several versions of Debian. We use Jenkins build chains with virtual machines on ESXi.
The programs are compiled with GCC. Based on some tests, we found that compilation on Stretch/Buster is roughly 50-100% slower than on Wheezy/Jessie.
For example, a simple Hello World program:
jessie
------
real 0m0.099s
user 0m0.076s
sys 0m0.012s
buster
------
real 0m0,201s
user 0m0,168s
sys 0m0,032s
For small programs it's not really important, but for bigger projects the time difference is clearly visible (even with the -O3 flag):
jessie
------
real 0m29.996s
user 0m26.636s
sys 0m1.688s
buster
------
real 0m59,051s
user 0m53,226s
sys 0m5,164s
Our biggest project takes 25 minutes to compile on Jessie versus 45 minutes on Stretch.
Note that these tests were run on two different virtual machines, but on the same physical machine. The CPU model is: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz.
I think one reason might be the Meltdown and Spectre patches applied to the kernel, but I don't know whether these mitigations are enabled on Stretch.
Do you have any idea what could cause this performance difference? How can I check it, and how can I fix it if possible?
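For reference, one way to check whether the Meltdown/Spectre mitigations are active in each VM, assuming the kernel exposes the sysfs interface (recent Stretch and Buster kernels should):
grep . /sys/devices/system/cpu/vulnerabilities/*   # prints each vulnerability and its mitigation status
On kernels that support it, a test VM can also be booted with the mitigations=off parameter (or the older nopti / nospectre_v2 flags) to measure the impact; only do this on an isolated build machine.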
Regards.

Qt, QML and C++: tips to increase compilation speed

I am a Qt developer. I have a very fast Ryzen 3900X and I'm looking for tips to speed up my daily work (faster builds on Linux).
This is a sample project:
full rebuild 24 cores: Elapsed time: 00:38.
modify main.cpp: Elapsed time: 00:02.
modify main.qml: Elapsed time: 00:05.
Now I am using these options, which noticeably increase compilation speed (in the project's .pro file):
unix {
    CONFIG += use_gold_linker        # better link speed
    QMAKE_CXX = ccache $$QMAKE_CXX   # use ccache (apt install ccache)
    QMAKE_CC = ccache $$QMAKE_CC     # use ccache
}
CONFIG += qtquickcompiler            # pre-compile QML in both debug & release (Qt >= 5.11); QML debugging may not work
(You can use CONFIG+=qtquickcompiler or always build in release.)
And I get:
full rebuild 24 cores: Elapsed time: 00:05.
modify main.cpp: Elapsed time: 00:01.
modify main.qml: Elapsed time: 00:01.
Now it's really very fast, and I can modify a QML file, build, and execute in just one second.
More ideas to increase compilation speed?
Any ideas for better compilation times on Windows + VS2019?
The qmake build system seems very slow on win32.
Any ideas for macOS? (I haven't tried ccache and the gold linker on macOS.)
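One thing worth checking on any platform is whether ccache is actually getting cache hits and has enough room; a minimal sketch from the shell (the 20G size is just an example value):
ccache -z               # reset the statistics before a test build
ccache -s               # after the build, inspect the hit/miss counts
ccache --max-size=20G   # enlarge the cache if entries are being evicted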

OpenGL program with tensorflow C++ gives failed call to cuInit : CUDA_ERROR_OUT_OF_MEMORY

I have trained a model with no issues using TensorFlow in Python. I am now trying to integrate inference for this model into pre-existing OpenGL-enabled software. However, I get a CUDA_ERROR_OUT_OF_MEMORY during cuInit (that is, even before loading the model, just at session creation). It does seem that OpenGL has taken some memory (around 300 MiB), as shown by gpustat or nvidia-smi.
Is it possible there is a clash because both TF and OpenGL are trying to access/allocate GPU memory? Has anyone encountered this problem before? Most references I found by googling concern errors at model loading time, not at session/CUDA initialization. Is this completely unrelated to OpenGL and am I just barking up the wrong tree? A simple TF C++ inference example works. Any help is appreciated.
Here is the tensorflow logging output, for completeness:
2018-01-08 12:11:38.321136: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-01-08 12:11:38.379100: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_OUT_OF_MEMORY
2018-01-08 12:11:38.379388: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: rosenblatt
2018-01-08 12:11:38.379413: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: rosenblatt
2018-01-08 12:11:38.379508: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 384.98.0
2018-01-08 12:11:38.380425: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.98 Thu Oct 26 15:16:01 PDT 2017 GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5)"""
2018-01-08 12:11:38.380481: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 384.98.0
2018-01-08 12:11:38.380497: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 384.98.0
EDIT: Removing all references to OpenGL resulted in the same problem, so it has nothing to do with a clash between the libraries.
OK, the problem was the use of the sanitizer in the debug version of the binary. The release version, or a debug version with no sanitizer, works as expected.
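For anyone hitting the same symptom, a quick way to confirm whether a given binary was built with a sanitizer, assuming it is not stripped (the binary name here is just a placeholder):
nm ./my_inference_app | grep -i __asan    # ASan builds contain __asan_* symbols
ldd ./my_inference_app | grep -i asan     # GCC usually links libasan dynamically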

tensorflow unusual CUDA related error

I've been using TensorFlow for nearly two years and have never seen this one. On a new Ubuntu box, I have a fresh install of TensorFlow in a virtualenv. When I ran some sample code, I got an invalid device error. It occurred when tf.Session() was called.
WARNING:tensorflow:From full_code.py:27: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
2017-06-05 11:01:55.853842: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.853867: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.853876: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.853886: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.853893: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-05 11:01:55.937978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:
name: GeForce GTX 660 Ti
major: 3 minor: 0 memoryClockRate (GHz) 1.0455
pciBusID 0000:04:00.0
Total memory: 2.95GiB
Free memory: 2.91GiB
2017-06-05 11:01:55.938063: W tensorflow/stream_executor/cuda/cuda_driver.cc:485] creating context when one is currently active; existing: 0x19e5370
2017-06-05 11:01:56.014220: E tensorflow/core/common_runtime/direct_session.cc:137] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE
Here is the full spec.
Ubuntu 14.04
CUDA 8.0
GeForce GTX 660 Ti
python 3.4.3
Thanks to someone from Google, I figured out what went wrong. This Dell box has two NVIDIA graphics cards. The first one came with the machine from the manufacturer and is an NVS 310. As far as I know, it doesn't have any usable compute capability and I never intended to use it much.
I then added a second card, a GTX 660 Ti, and I intended to use this one for all computations.
When TensorFlow is invoked, it defaults to device 0, which is the NVS 310, and of course it throws an invalid device error.
When I do this,
CUDA_VISIBLE_DEVICES=1 python myscript.py
it works.
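If the device index is not obvious, a minimal check from the shell (note that nvidia-smi ordering and CUDA ordering can differ; forcing PCI bus order makes them match):
nvidia-smi -L                                              # list GPUs with their indices
CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 python myscript.py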

Fix crash due to SIGILL in OpenSSL

I've read several Q&As here about the fact that OpenSSL tries different instructions to test whether the CPU supports them, which causes SIGILL. However, those answers usually state that the OP was running the app under gdb, and I'm not. My app on an OpenWrt MIPS router actually crashes whenever I make a call to the OpenSSL library; the crash is an illegal instruction. I don't have a backtrace, even though my app is a debug build. It works fine on Ubuntu and macOS.
I made sure that both my executable and the SSL libraries are built for the same CPU architecture.
Result of cat /proc/cpuinfo:
system type : Atheros AR9330 rev 1
machine : 8devices Carambola2 board
processor : 0
cpu model : MIPS 24Kc V7.4
BogoMIPS : 265.42
wait instruction : yes
microsecond timers : yes
tlb_entries : 16
extra interrupt vector : yes
hardware watchpoint : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
isa : mips1 mips2 mips32r1 mips32r2
ASEs implemented : mips16
shadow register sets : 1
kscratch registers : 0
package : 0
core : 0
VCED exceptions : not available
VCEI exceptions : not available
What worries me is that the toolchain, toolchain-mips_34kc_gcc-5.2.0_musl-1.1.11, mentions 34kc in its name. I wonder whether it is OK to build for a 24Kc CPU with this toolchain, although everything else except OpenSSL works fine.
So what are my options to fix this?
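A quick way to compare what the executable and the library were actually built for, assuming the usual binutils are available (the paths and the cross-readelf prefix are placeholders for your own toolchain):
file ./my_app /usr/lib/libssl.so.1.0.0        # shows the target, e.g. "MIPS, MIPS32 rel2"
mips-openwrt-linux-readelf -h ./my_app        # the Flags line shows the ISA the binary was built for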
I don't know exactly what the problem was, but the app didn't work with the OpenSSL library provided in the toolchain and copied to the target board. Once libopenssl was installed via opkg from the official Carambola2 repos, the problem was gone, so it must have been some incompatibility.

Compiling on Vortex86: "Illegal instruction"

I'm using an embedded PC with a Vortex86-SG CPU, running Ubuntu 10.04 with kernel 2.6.34.10-vortex86-sg. Unfortunately we can't compile a new kernel, because we don't have any source code, not even drivers or patches.
I have to run a small project written in C++ with OpenFrameworks. The framework compiles correctly after running the scripts in of_v0071_linux_release/scripts/linux/ubuntu/install_*.sh.
I noticed that in order to compile for Vortex86/Ubuntu 10.04, the following options must be added to every config.make file:
USER_CFLAGS = -march=i486
USER_LDFLAGS = -lGLEW
Indeed, it compiles without errors, but the generated binary doesn't start at all:
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin# ./emptyExample
Illegal instruction
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin# echo $?
132
Strace last lines:
munmap(0xb77c3000, 4096) = 0
rt_sigprocmask(SIG_BLOCK, [PIPE], NULL, 8) = 0
--- SIGILL (Illegal instruction) @ 0 (0) ---
+++ killed by SIGILL +++
Illegal instruction
root@jb:~/openframeworks/of_v0071_linux_release/apps/myApps/emptyExample/bin#
Any ideas on how to solve this problem?
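A quick way to see which instruction actually faults, assuming gdb can be installed on the board (otherwise a core dump plus a cross gdb works too):
gdb ./emptyExample
(gdb) run
# when it stops on SIGILL:
(gdb) x/i $pc              # disassemble the faulting instruction
(gdb) info sharedlibrary   # see which library the address belongs to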
I know I am a bit late on this, but I recently had my own issues trying to compile the kernel for the Vortex86DX, and I was finally able to build it. Use these steps at your own risk, as I am not a Linux guru and you may have to change some settings to match your own preferences/hardware:
Download and use a Linux distribution that runs a kernel version similar to the one you plan on compiling. Since I will be compiling Linux 2.6.34.14, I downloaded and installed Debian 6 in VirtualBox with adequate RAM and processor allocations. You could potentially compile on the Vortex86DX itself, but that would likely take forever.
Make sure you have the dependencies: # apt-get install ncurses-dev kernel-package
Download the kernel from kernel.org (I grabbed linux-2.6.34.14.tar.xz) and extract the files from the package.
Grab the config file from the DMP FTP site: ftp://vxmx:gc301@ftp.dmp.com.tw/Linux/Source/config-2.6.34-vortex86-sg-r1.zip (note the vxmx user name). Copy the config file to the freshly extracted Linux source folder.
Grab the patch at ftp://vxdx:gc301@ftp.dmp.com.tw/Driver/Linux/config%26patch/patch-2.6.34-hda.zip (note the vxdx user name). Copy it to the kernel source folder.
Patch the kernel: # patch -p1 < patchfilename
Configure the kernel with # make menuconfig:
Load Alternate Configuration File
Enable generic x86 support
Enable Math Emulation
I disabled generic IDE support because I will be using legacy mode (selectable in the BIOS)
Under Device Drivers -> Ethernet (10 or 100Mbit) -> make sure RDC R6040 Fast Ethernet Adapter support is selected
USB support -> select Support for Host-side USB, EHCI HCD (USB 2.0) support, OHCI HCD support
Save the config as .config
Check the serial ports: edit .config manually and make sure CONFIG_SERIAL_8250_NR_UARTS=4 and CONFIG_SERIAL_8250_RUNTIME_UARTS=4 (or more if you have additional ports). If you are going to use more than 4 serial ports, make sure CONFIG_SERIAL_8250_MANY_PORTS is set.
Compile the kernel image, headers, source, and module packages: # make-kpkg --initrd kernel_image kernel_source kernel_headers modules_image
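make-kpkg drops the resulting .deb packages in the parent directory; assuming default naming, they can then be copied to the Vortex86 machine and installed with dpkg (the exact file names will differ):
dpkg -i linux-image-2.6.34.14*.deb linux-headers-2.6.34.14*.deb
update-grub   # or whichever bootloader update step your board actually uses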