Finding which version of valgrind is running - c++

In C/C++, I can include the valgrind headers to find out at runtime whether my software is running under valgrind:
#include <valgrind/valgrind.h>
bool RunningOnValgrind()
{
return RUNNING_ON_VALGRIND ? true : false;
}
This is documented in the valgrind manual.
I would like to know whether the valgrind I am running under supports AVX instructions. How do I write a function that returns this information?
From the valgrind release notes, I know that these are supported from version 3.8 onwards. Hence one solution would be to spawn a process to execute valgrind --version and then parse the output, but there must be a better way.

If you look in valgrind/valgrind.h, you will see that you can check the valgrind version number at compile time after including the header:
#if defined(__VALGRIND_MAJOR__) && defined(__VALGRIND_MINOR__) \
&& (__VALGRIND_MAJOR__ > 3 \
|| (__VALGRIND_MAJOR__ == 3 && __VALGRIND_MINOR__ >= 8))
/* code to say avx is supported */
#endif
This has serious limitations, however. The macros describe the valgrind headers you compiled against, not the valgrind binary you end up running under. The check therefore assumes that you are using a globally installed version and that your path isn't pointing to some personal, user-built valgrind outside the default location (which I have done). It also assumes that you build on the machine where you will run valgrind, rather than shipping around a pre-built executable (which in my experience happens often). So you can't rely on it.
With that in mind, spawning a sub-process with valgrind --version and checking the output may truly be the best alternative.
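A minimal sketch of the sub-process approach, written in Python for brevity (the same logic could be ported to popen in C++). It assumes that valgrind --version prints a string of the form valgrind-3.8.1 and that the 3.8 threshold from the release notes is the right cut-off for AVX; the function names are made up for this illustration.

```python
import re
import subprocess


def parse_valgrind_version(text):
    """Extract (major, minor) from output such as 'valgrind-3.8.1'."""
    match = re.search(r"valgrind-(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def valgrind_supports_avx(version):
    """AVX support is assumed from 3.8 onwards, per the release notes."""
    return version is not None and version >= (3, 8)


def installed_valgrind_version(executable="valgrind"):
    """Run '<executable> --version'; return None if it cannot be run."""
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_valgrind_version(result.stdout)
```

Calling valgrind_supports_avx(installed_valgrind_version()) then answers the question for whichever valgrind is first on the path, which is not necessarily the one the process is currently running under.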

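You can use the __VALGRIND_MAJOR__ and __VALGRIND_MINOR__ macros from valgrind.h to detect the version of the headers at compile time, or run valgrind with the --version flag to get the exact version of the installed binary.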

Related

gdb script: How can a script determine if it is invoked under `gdb` or `gdb-multiarch`?

I'd like to define a command which does X under gdb-multiarch, but prints out a helpful message when run under normal gdb. How can my script determine which of the two it is running under?
Why? When I start gdb-multiarch, I can bind to a qemu-arm session. When I try that in gdb, I get bizarre errors. It's easy to forget and run gdb (and not -multiarch), and I want my bind-to-qemu command to tell me "This must be run under gdb-multiarch".
Your question presumes that there is some difference between gdb and gdb-multiarch, but there doesn't have to be any such difference.
Presumably, on the OS you are using, gdb and gdb-multiarch are configured differently, with gdb supporting only the native architecture, while gdb-multiarch supports cross-architecture debugging.
Presumably what you actually want to detect is whether the target architecture you need (arm?) is supported by the current binary.
In the bind-to-qemu user-defined function, you can try to set architecture arm.
If that errors out, the rest of bind-to-qemu should not execute.
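The same idea can be sketched with gdb's Python scripting. This is an illustration under assumptions rather than a drop-in script: the function names, the returned messages, and the localhost:1234 remote address are hypothetical. Inside gdb you would pass gdb.execute as the execute argument; gdb.error derives from RuntimeError, which is what the sketch catches.

```python
def architecture_supported(execute, arch="arm"):
    """Return True if this gdb accepts 'set architecture <arch>'.

    'execute' is any callable that runs one gdb command; inside gdb
    that would be gdb.execute. On success the architecture stays set.
    """
    try:
        execute("set architecture " + arch)
    except RuntimeError:
        # gdb.error derives from RuntimeError, so an unsupported
        # architecture ends up here.
        return False
    return True


def bind_to_qemu(execute, remote="localhost:1234"):
    """Hypothetical body of the bind-to-qemu command."""
    if not architecture_supported(execute):
        return "This must be run under gdb-multiarch"
    # Only reached when the architecture could be selected.
    execute("target remote " + remote)
    return "connected to " + remote
```

Passing the command runner in as a parameter keeps the logic testable outside gdb, where the gdb module cannot be imported.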

No source file for Netaccel_link error on running program

I have an OCaml program that worked fine on Ubuntu 16 but when recompiled and run on Ubuntu 20 I get the following error:-
$ ocamldebug ./linearizer
OCaml Debugger version 4.08.1
(ocd) r
Loading program... done.
Time: 89534
Program end.
Uncaught exception: Sys_error "Illegal seek"
(ocd) b
Time: 89533 - pc: 624888 - module Netaccel_link
No source file for Netaccel_link.
I thought this was due to missing dev libraries but:-
$ sudo apt install libocamlnet-ocaml-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
libocamlnet-ocaml-dev is already the newest version (4.1.6-1build6).
0 upgraded, 0 newly installed, 0 to remove and 20 not upgraded.
What setup step am I missing on Ubuntu 20?
This looks like a regression in libocamlnet, and you should report an issue there. I am a bit pessimistic that you will get any response, though, so you can also try to debug the issue yourself.
The problem that you are facing has nothing to do with missing libraries (those would be reported during installation or, if the package is broken, end up as linker errors). It may, however, result from some misconfiguration of the system. If that is the case, you're lucky, as you can fix it yourself.
I will give you some advice that might help you in debugging this issue. For more, please try discuss.ocaml.org, which is a more suitable medium (SO doesn't favor this kind of discussion and it might get deleted by admins).
The illegal seek exception is thrown when a seek operation is applied to a non-regular file such as a pipe or a socket (the ESPIPE Unix error). So check your inputs. It could be that what was previously a regular file on Ubuntu is now a pipe or a socket.
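To see the ESPIPE behaviour in isolation, here is a small Python sketch (not OCaml, and unrelated to ocamlnet's internals) that checks whether a file descriptor refers to a regular file and what errno a seek on it produces. The helper names are made up for this illustration.

```python
import errno
import os
import stat


def is_regular_file(fd):
    """True if the descriptor refers to a regular (seekable) file."""
    return stat.S_ISREG(os.fstat(fd).st_mode)


def seek_errno(fd):
    """Try to seek to offset 0; return None on success, else the errno."""
    try:
        os.lseek(fd, 0, os.SEEK_SET)
    except OSError as exc:
        return exc.errno
    return None


def demo_pipe():
    """Seeking on a pipe fails with ESPIPE ('Illegal seek')."""
    read_end, write_end = os.pipe()
    try:
        return is_regular_file(read_end), seek_errno(read_end)
    finally:
        os.close(read_end)
        os.close(write_end)
```

Running the same two checks on descriptor 0 of your program (for example is_regular_file(0)) shows quickly whether its standard input is a file or a pipe in the new environment.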
Try to use ltrace or strace to pinpoint the culprit e.g.,
ltrace ./linearizer
or, if it overwhelms you, try strace
strace ./linearizer
Instead of using ocamldebug you can use plain gdb. You can use gdb's interfaces to provide the path to the source code (though most likely it won't work since ocamlnet is not compiled with debug information). I believe that it will give you a more meaningful backtrace.
Instead of using the system installation try using opam. Install your dependencies with opam and try older versions as well as newer versions of the OCaml compiler. Also, try different versions of ocamlnet. Ideally, try to reproduce the environment that used to work for you.
When nothing else works, you can use objdump -d and look at the disassembly of your binary. OCaml is using a pretty readable and intuitive name mangling scheme (<module_name>__<function_name>_<uid>), so you can easily find the source code (search for <module_name>.ml file and look for the <function_name> there)
Finally, just use docker or any other container to run your application. Consider switching from ocamlnet to something more modern and supported.

Installing HDF5 library on Cygwin: "make check" stuck at testswmr.sh, no error message

I am currently installing the HDF5 library, more precisely hdf5-1.10.0-patch1, on Cygwin, as I want to use it with Fortran. Following the instructions from the hdfgroup website, I did the following:
./configure --enable-fortran
make > "out1_check.txt" 2> "warn1_check.txt" &
make check > "out2_check.txt" 2> "warn2_check.txt" &
The execution of the last command (make check) proceeds as it should, until it gets stuck. The process does not stop and something is happening (sh.exe is using 8-12% CPU and has already accumulated 39 hours of CPU time), but "out2_check.txt" looks like
Making check in src
...
[many successful checks]
...
============================
No need to test testlinks_env.sh again.
============================
============================
Testing testswmr.sh
Unfortunately, I do not have the output file from the first run of make check, but it did not contain more information on Testing testswmr.sh. There was never any error message.
So, what is this testswmr.sh, why does it get stuck and how can I finalize the installation process? Maybe I can skip the remaining checks and just proceed to make install?
Important note: an older version of HDF5 is already installed from the Cygwin repo. It does not seem to support Fortran however, so I decided to install the current version myself.
Available (and used) compilers are gcc and gfortran.
As far as I can tell, only Intel Fortran is supported on Windows. There is no Cygwin download here https://support.hdfgroup.org/HDF5/release/obtain518.html and I have never come across an experience report for Cygwin/Fortran/HDF5.
Your options:
Use Intel Fortran
Use Linux or Mac
Sorry

How can I use valgrind with Python C++ extensions?

I have Python extensions implemented on C++ classes. I don't have a C++ target to run valgrind with. I want to use valgrind for memory check.
Can I use valgrind with Python?
Yes, you can use valgrind with Python. You just need to use the valgrind suppression file provided by the Python developers, so you don't get a bunch of false positives due to Python's custom memory allocation/reallocation functions.
The valgrind suppression file can be found here: http://svn.python.org/projects/python/trunk/Misc/valgrind-python.supp
IMPORTANT: You need to uncomment the lines for PyObject_Free and PyObject_Realloc in the suppression file*.
The recommended usage syntax is:
$ valgrind --tool=memcheck --suppressions=valgrind-python.supp \
python -E -tt ./my_python_script.py
See also this README file from the Python SVN repo which describes the different ways of using Python with valgrind:
http://svn.python.org/projects/python/trunk/Misc/README.valgrind
* - Alternatively, you can recompile Python with PyMalloc disabled, which allows you to catch more memory leaks that won't show up if you just suppress PyMalloc.
In Python 2.7 and 3.2 there is now a --with-valgrind compile-time flag that allows the Python interpreter to detect when it runs under valgrind and disables PyMalloc. This should allow you to more accurately monitor your memory allocations than otherwise, as PyMalloc just allocates memory in big chunks.
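If you launch such runs from a script, the recommended invocation above can be assembled programmatically. This is only a sketch: the helper name and its defaults are hypothetical, the -E and -tt interpreter flags are copied from the command quoted above and may need adjusting for your interpreter, and valgrind-python.supp is assumed to sit in the current directory.

```python
def valgrind_python_command(script,
                            suppressions="valgrind-python.supp",
                            python="python",
                            python_flags=("-E", "-tt")):
    """Build the argv list for running a Python script under memcheck."""
    command = ["valgrind", "--tool=memcheck"]
    if suppressions:
        # Suppress the false positives caused by PyMalloc.
        command.append("--suppressions=" + suppressions)
    command.append(python)
    command.extend(python_flags)
    command.append(script)
    return command
```

The resulting list can be handed to subprocess.call. Passing suppressions=None drops the suppression file, which is what you would want with an interpreter built with PyMalloc disabled.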
Yes you can: you do have a target to run valgrind with -- it's the python interpreter itself:
valgrind python foo.py
However, the results of the above may not be very satisfactory: Python is built in opt mode and with a special malloc, which may drown you in false positives.
You'll likely get better results by first building a debug version of Python. Start here.

analysis of core file

I'm using Red Hat Linux 3. Can someone explain how it is possible that I am able to analyze, with gdb, a core dump generated on Red Hat Linux 5?
Not that I'm complaining :) but I need to be sure this will always work.
EDIT: the shared libraries are the same version, so no worries about that; they are placed on shared storage so they can be accessed from both Linux 5 and Linux 3.
Thanks.
You can try the following GDB commands to open a core file:
gdb
(gdb) exec-file <executable address>
(gdb) set solib-absolute-prefix <path to shared library>
(gdb) core-file <path to core file>
The reason you can't rely on it is that every process uses libc or other system shared libraries, which will certainly have changed between Red Hat 3 and Red Hat 5. The instruction addresses and the number of instructions in the native functions will therefore differ, and that is where the debugger gets confused and may show you wrong data to analyze. So it is always best to analyze the core on the same platform, or else to copy all the required shared libraries to the other machine and point gdb at them with set solib-absolute-prefix.
In my experience, analysing a core file generated on another system does not work, because the standard library (and the other libraries your program probably uses) will typically be different, so the addresses of the functions are different and you cannot even get a sensible backtrace.
Don't do it, because even if it works sometimes, you cannot rely on it.
You can always run gdb -c /path/to/corefile /path/to/program_that_crashed. However, if program_that_crashed has no debug info (i.e. was not compiled and linked with the -g gcc/ld flag), the coredump is not that useful unless you're a hard-core debugging expert ;-)
Note that the generation of corefiles can be disabled (and it's very likely that it is disabled by default on most distros). See man ulimit. Call ulimit -c to see the limit of core files, "0" means disabled. Try ulimit -c unlimited in this case. If a size limit is imposed the coredump will not exceed the limit size, thus maybe cutting off valuable information.
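The limit can also be inspected from inside a process. The following Python sketch uses the standard resource module; the function names and the wording of the returned strings are made up for this illustration, and the value reported by getrlimit is in bytes.

```python
import resource


def describe_core_limit(soft_limit):
    """Turn the soft RLIMIT_CORE value into a short description."""
    if soft_limit == 0:
        return "disabled"
    if soft_limit == resource.RLIM_INFINITY:
        return "unlimited"
    return "limited to %d bytes" % soft_limit


def current_core_limit():
    """Describe the core file size limit of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return describe_core_limit(soft)
```

A result of "disabled" corresponds to ulimit -c printing 0.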
Also, the path where a coredump is generated depends on /proc/sys/kernel/core_pattern. Use cat /proc/sys/kernel/core_pattern to query the current pattern. It's actually a path, and if it doesn't start with / then the file will be generated in the current working directory of the process. And if cat /proc/sys/kernel/core_uses_pid returns "1" then the coredump will have the PID of the crashed process as its file extension. You can also set both values, e.g. echo -n /tmp/core > /proc/sys/kernel/core_pattern will force all coredumps to be generated in /tmp.
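The rule for where the file ends up can be written down as a small Python sketch. The function names are hypothetical, % specifiers in the pattern are left unexpanded, and the handling of a leading | (the kernel pipes the dump to a program instead of writing a file) is an addition not mentioned above.

```python
import os


def core_dump_location(pattern, cwd):
    """Resolve a core_pattern value against the crashing process's cwd."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # The dump is piped to a helper program rather than written out.
        return "piped to " + pattern[1:].strip()
    if pattern.startswith("/"):
        return pattern
    # Relative patterns land in the process's working directory.
    return os.path.join(cwd, pattern)


def read_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Read the current pattern (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()
```

For example, core_dump_location(read_core_pattern(), os.getcwd()) describes where a crash of the current process would be dumped.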
I understand the question as:
how is it possible that I am able to analyse a core that was produced under one version of an OS under another version of that OS?
Just because you are lucky (even that is questionable). There are a lot of things that can go wrong by trying to do so:
the tool chains (gcc, gdb, etc.) will be of different versions
the shared libraries will be of different versions
So no, you shouldn't rely on that.
You have asked a similar question and accepted an answer (your own, of course) here: Analyzing core file of shared object
Once you load the core file you can get the stack trace and get the last function call and check the code for the reason of crash.
There is a small tutorial here to get started with.
EDIT:
I am assuming you want to know how to analyse a core file using gdb on Linux, as your question is a little unclear.