Creating debug builds of OCaml code with Jbuilder

I have been reading the tutorials for the OCaml language and for Jbuilder. The official tutorial indicates that one must compile OCaml code with ocamlc's '-g' flag in order to then run ocamldebug.
I cannot find any mention of debug builds in the Jbuilder documentation. The only section that seems close is https://jbuilder.readthedocs.io/en/latest/jbuild.html#ocaml-flags. However, even if I add '-g' as a compilation flag:
(executable
 ((name [REDACTED])
  (public_name [REDACTED])
  (libraries ([REDACTED]))
  (flags (:standard -w -9+27-30-32-40@8
          -safe-string
          -linkall
          -g))
  (modules ([REDACTED]))))
I still don't seem to get a debug binary:
$ ocamldebug [REDACTED]
OCaml Debugger version 4.04.2
(ocd) r
Loading program... [REDACTED] is not a bytecode file.
Am I doing something wrong? If not, what is the recommended way to produce debug builds from jbuilder?

ocamldebug only works with bytecode executables. You're producing native code. To get a bytecode executable, ask jbuilder to build the prog.bc target instead of prog.exe.
Note that this might not be what you're after: you can also debug native programs using plain old gdb, but you'll need to be somewhat familiar with the OCaml runtime.
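As a rough sketch of the bytecode route: the helper names bc_target and debug_build below are made up for illustration, and _build/default is assumed to be the build directory (the default). The build-then-debug cycle could be wrapped like this:

```shell
# Hypothetical helper: map a native target name (prog.exe) to the
# corresponding bytecode target (prog.bc).
bc_target() {
  printf '%s\n' "${1%.exe}.bc"
}

# Hypothetical helper: build the bytecode target, then open it in ocamldebug.
# Assumes the default build directory, _build/default.
debug_build() {
  target=$(bc_target "$1")
  jbuilder build "$target" && ocamldebug "_build/default/$target"
}
```

Calling debug_build prog.exe from the project root would then build prog.bc and start the debugger on it. Here prog is a placeholder for the actual executable name.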

Related

Fatal error: debugger does not support channel locks

I am trying to use ocamldebug with my project, to understand why a 3rd party lib I'm using is not behaving the way I expected.
https://ocaml.org/manual/debugger.html
The OCaml debugger is invoked by running the program ocamldebug with the name of the bytecode executable file as first argument
I have added (modes byte exe) to my dune file.
When I run dune build I can see the bytecode file output, alongside the exe, as _build/default/bin/cli.bc
When I pass this to ocamldebug I get the following error:
ocamldebug _build/default/bin/cli.bc
OCaml Debugger version 4.12.0
(ocd) r
Loading program... done.
Fatal error: debugger does not support channel locks
Lost connection with process 33035 (active process)
between time 170000 and time 180000
Restart from time 170000 and try to get closer of the problem ? (y or n)
If I choose y the console seems to hang indefinitely.
I found the source of the error here:
https://github.com/ocaml/ocaml/blob/f68acd1a618ac54790a8347fad466084f15a9a9e/runtime/debugger.c#L144
/* The code in this file does not bracket channel I/O operations with
   Lock and Unlock, so fail if those are not no-ops. */
if (caml_channel_mutex_lock != NULL ||
    caml_channel_mutex_unlock != NULL ||
    caml_channel_mutex_unlock_exn != NULL)
  caml_fatal_error("debugger does not support channel locks");
...but I don't know what might be triggering it.
My project is using cmdliner and lwt ...I think at this early point of execution it hasn't hit any lwt code though.
Is ocamldebug incompatible with cmdliner?
If that's the case then I will need to make a new entrypoint just for debugging I guess. (currently the bin/cli is the only executable artefact in my project, the code I need to debug is all under lib/s)
It looks like the OCaml debugger is broken on your version of macOS. Please report the issue to the OCaml issue tracker, including detailed information about your system. I can't reproduce it on my machine, but I am using a pretty old version of macOS (10.11.6), where the 4.12 debugger works flawlessly.
As a workaround, try an older version of OCaml. This channel lock check was introduced very recently, so any version prior to 4.12 should do:
opam switch create 4.11.0
eval $(opam env)
Then do not forget to rebuild your project, after first installing the required dependencies:
opam install lwt cmdliner
dune build
After that you can use the debugger as usual.

No source file for Netaccel_link error on running program

I have an OCaml program that worked fine on Ubuntu 16, but when it is recompiled and run on Ubuntu 20 I get the following error:
$ ocamldebug ./linearizer
OCaml Debugger version 4.08.1
(ocd) r
Loading program... done.
Time: 89534
Program end.
Uncaught exception: Sys_error "Illegal seek"
(ocd) b
Time: 89533 - pc: 624888 - module Netaccel_link
No source file for Netaccel_link.
I thought this was due to missing dev libraries, but:
$ sudo apt install libocamlnet-ocaml-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
libocamlnet-ocaml-dev is already the newest version (4.1.6-1build6).
0 upgraded, 0 newly installed, 0 to remove and 20 not upgraded.
What setup step am I missing on Ubuntu 20?
This looks like a regression in libocamlnet, and you should report an issue there. I am a bit pessimistic that you will get any response, though, so you can also try to debug the issue yourself.
The problem that you are facing has nothing to do with missing libraries (those would be reported during installation or, if the package is broken, would show up as linker errors). It may, however, result from some misconfiguration of the system. If so, you're in luck, as you can fix it yourself.
I will give you some advice that might help you in debugging this issue. For more, please try discuss.ocaml.org, which is a more suitable medium (SO doesn't favor this kind of discussion and we might get deleted by admins).
The illegal seek exception is raised when a seek operation is applied to a file that does not support seeking, such as a pipe or a socket (the Unix ESPIPE error). So check your inputs. It could be that what was a regular file on the old Ubuntu is now a pipe or a socket.
Try to use ltrace or strace to pinpoint the culprit e.g.,
ltrace ./linearizer
or, if it overwhelms you, try strace
strace ./linearizer
Instead of using ocamldebug you can use plain gdb. You can use gdb's interfaces to provide the path to the source code (though most likely it won't work since ocamlnet is not compiled with debug information). I believe that it will give you a more meaningful backtrace.
Instead of using the system installation try using opam. Install your dependencies with opam and try older versions as well as newer versions of the OCaml compiler. Also, try different versions of ocamlnet. Ideally, try to reproduce the environment that used to work for you.
When nothing else works, you can use objdump -d and look at the disassembly of your binary. OCaml is using a pretty readable and intuitive name mangling scheme (<module_name>__<function_name>_<uid>), so you can easily find the source code (search for <module_name>.ml file and look for the <function_name> there)
Finally, just use docker or any other container to run your application. Consider switching from ocamlnet to something more modern and supported.
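To follow up on the "check your inputs" advice above, here is a small sketch that reports what kind of file a path refers to. The function name input_kind is made up for illustration. Seeking generally only works on regular files, so anything reported as a pipe or socket is a candidate for the ESPIPE error:

```shell
# Hypothetical helper: classify a path by file type.
# Pipes and sockets do not support seeking; for character devices it varies.
input_kind() {
  if [ -f "$1" ]; then
    echo regular
  elif [ -p "$1" ]; then
    echo pipe
  elif [ -S "$1" ]; then
    echo socket
  elif [ -c "$1" ]; then
    echo chardev
  else
    echo other
  fi
}
```

Running input_kind on each path the program opens, on both the old and the new system, should show whether any input changed type between the two.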

OCaml memory profiling with Memprof - TypeRex Utility

My program uses all of the available memory, so I wanted to find out which functions and data structures are responsible. I decided to use Memprof, so I installed their compiler and compiled my code with the command
ocamlfind ocamlopt -package xml-light unix.cmxa str.cmxa -c -g NKJPxmlbasics.ml NKJP.mli NKJP.ml test.ml
and then ran it as suggested in the tutorial:
ocp-memprof --exec ./test
But I get an error instead of a result:
Error: no memory profiling information found. Possible causes:
- the application was not compiled with memory profiling support;
- the application exited before any major garbage collection was performed.
I even managed to make it work once, but I have no idea how it happened:
http://memprof.typerex.org/users/97beffbaec332eb7b2a048b94f7a38cf/2015-12-15_17-33-50_ab17218e800fe0a68fc2cfa54c13bfa6_16194/index.html
Is there any way to use this tool properly in this situation? What am I missing?
ocamlfind ... -c ... only compiles; it does not link, so it does not generate any executable. The ./test that you are running was therefore probably generated by an earlier command, most likely one that did not use the memprof switch.
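A sketch of what a linking build might look like, assuming the same packages and file order as in the question and that xml-light, unix and str are all available as findlib packages. The function names link_test and is_stale are invented here. Dropping -c and adding -linkpkg and -o asks ocamlfind to produce an executable, while is_stale is a quick way to check whether ./test predates the sources:

```shell
# Compile and link in one step (no -c), so that ./test is actually regenerated.
# Assumes xml-light, unix and str are installed as findlib packages.
link_test() {
  ocamlfind ocamlopt -package xml-light,unix,str -linkpkg -g \
    NKJPxmlbasics.ml NKJP.mli NKJP.ml test.ml -o test
}

# Succeeds if the binary is missing or older than any of the given sources.
is_stale() {
  bin=$1; shift
  [ -e "$bin" ] || return 0
  for src in "$@"; do
    if [ -n "$(find "$src" -newer "$bin" 2>/dev/null)" ]; then
      return 0
    fi
  done
  return 1
}
```

For example, is_stale ./test NKJP.ml test.ml && link_test would relink only when the executable is out of date.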

`cabal repl` causes GHC panic on simple project with C++ files

I've uploaded the project as a zip file so you can try it out.
https://dl.dropboxusercontent.com/u/35032740/ShareX/2015/11/Buggy.zip
I wanted to write a wrapper around the clipper library. The code compiles fine with cabal build, runs with cabal run but cabal repl produces this error:
Preprocessing executable 'Buggy' for Buggy-0.1.0.0...
GHCi, version 7.10.2: http://www.haskell.org/ghc/ :? for help
GHC runtime linker: fatal error: I found a duplicate definition for symbol
_ZNSt6vectorIN10ClipperLib8IntPointESaIS1_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS1_S3_EERKS1_
whilst processing object file
dist\build\Buggy\Buggy-tmp\wrapper.o
This could be caused by:
* Loading two different object files which export the same symbol
* Specifying the same object file twice on the GHCi command line
* An incorrect `package.conf' entry, causing some object to be
loaded twice.
ghc.exe: panic! (the 'impossible' happened)
(GHC version 7.10.2 for x86_64-unknown-mingw32):
loadObj "dist\\build\\Buggy\\Buggy-tmp\\wrapper.o": failed
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
For reference, here's the cabal file
-- Initial Buggy.cabal generated by cabal init. For further documentation,
-- see http://haskell.org/cabal/users-guide/
name: Buggy
version: 0.1.0.0
-- synopsis:
-- description:
-- license:
license-file: LICENSE
author: Luka Horvat
maintainer: lukahorvat9#gmail.com
-- copyright:
-- category:
build-type: Simple
-- extra-source-files:
cabal-version: >=1.10

executable Buggy
  main-is: Main.hs
  c-sources: clipper.cpp
           , wrapper.cpp
  -- other-modules:
  -- other-extensions:
  build-depends: base >=4.8 && <4.9
  -- hs-source-dirs:
  default-language: Haskell2010
  extra-libraries: stdc++
Any ideas what the cause might be here?
I'm running Windows 10, 64bit.
I don't know the details of object file formats on Windows, so I'm guessing a bit.
Probably clipper.o and wrapper.o both define a weak symbol named _ZNSt6vectorIN10ClipperLib8IntPointESaIS1_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS1_S3_EERKS1_. (I see the same on Linux.) This probably came from a template instantiation (of vector). Weak symbols instruct the system linker to just pick any copy of the symbol if it encounters duplicates.
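One way to test this guess is sketched below. It assumes that nm is available and that the object files are in a format where weak definitions show up with type W or V, as they do for ELF on Linux; the output for Windows object files may look different. The helper names are invented. The idea is to list the weak symbols each object defines and print those that appear in both:

```shell
# Print the names of defined weak symbols from `nm` output on stdin.
# In nm's default format the columns are: value, type letter, name.
weak_names() {
  awk '$2 == "W" || $2 == "V" { print $3 }' | sort -u
}

# Hypothetical helper: list weak symbols defined in both object files.
common_weak() {
  a=$(mktemp)
  b=$(mktemp)
  nm "$1" | weak_names > "$a"
  nm "$2" | weak_names > "$b"
  comm -12 "$a" "$b"
  rm -f "$a" "$b"
}
```

Running common_weak clipper.o wrapper.o should then print the mangled vector symbol from the error message if both objects really define it weakly.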
GHCi on Windows doesn't use the system linker, it has its own runtime linker that can load object files into itself while it runs. As a result it is generally not feature compatible with the system linker. Probably the runtime linker does not understand weak symbols, at least on Windows (https://ghc.haskell.org/trac/ghc/ticket/3333). From the error you got, we can assume that it treats them as regular symbols, and two regular symbols are not allowed to have the same name.
As a workaround, you may be able to build your C++ files with -fno-weak as described in https://stackoverflow.com/a/26454930/190376.
If that doesn't work, an alternative is to build your C++ files into a DLL, which you can have GHCi load using the system dynamic loader, avoiding this whole issue. On Linux this would look like
g++ wrapper.cpp clipper.cpp -shared -fPIC -o libclipper.so
ghci -L. -lclipper
though I imagine the details are different on Windows.
The specific error isn't what I'm used to seeing, but those backslashes say you're on Windows, and this otherwise looks like GHC bug #3242 which has been causing pain for years now. Good news: the cause was finally isolated two weeks ago. Bad news: the fix didn't make the deadline for 7.10.3, though at least the 8.0.1 milestone seems secure at this point.
Probably still worth posting your error text to that bug's thread; mine is only an educated guess, someone there will know for sure.

Need GLIBC debug information from rpmbuild of updated source

I'm working on RHEL WS 4.5.
I've obtained the glibc source rpm matching this system, opened it to get its contents using rpm2cpio.
Working in that tree, I've created a patch to mtrace.c (I want to add more stack backtrace levels), incorporated it in the spec file, and created a new set of RPMs, including the debuginfo RPMs.
I installed all of these on a test vm (created from the same RH base image) and can confirm that my changes are included.
But with more complex executions, I crash in mtrace.c ... but gdb can't find the debug information so I don't get line number info and I can't actually debug the failure.
Based on dates, I think I can confirm that the debug information is installed on the test system in /usr/src/debug/glibc-2.3.6/
I tried
sharedlibrary libc*
in gdb and it tells me the symbols are already loaded.
My test includes a locally built python and full symbols are found for python.
My sense is that perhaps glibc isn't being built under rpmbuild with debug enabled. I've reviewed the glibc.spec file and even built with
_enable_debug_packages
defined as 1 which looked like it might influence the result. My review of the configure scripts invoked during the rpmbuild build step didn't give me any hints.
Hmmmm .. just found /usr/lib/debug/lib/libc-2.3.4.so.debug
and /usr/lib/debug/lib/tls/i486/libc-2.3.4.so.debug
but both of these are reported as stripped by the file command.
It appears that you are installing non-matching RPMs:
/usr/src/debug/glibc-2.3.6
just found /usr/lib/debug/lib/libc-2.3.4.so.debug
These are not for the same version; there is no way they came from the same -debuginfo RPM.
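The mismatch can be checked mechanically. The following is only a sketch with invented helper names. It pulls the version that follows "libc-" out of each path and compares the two:

```shell
# Hypothetical helper: extract the version that follows "libc-" in a path
# (this also matches "glibc-"). Prints nothing if no version is found.
glibc_version() {
  printf '%s\n' "$1" | sed -n 's/.*libc-\([0-9][0-9.]*[0-9]\).*/\1/p'
}

# Succeeds only if both paths carry the same, non-empty version.
same_glibc_version() {
  a=$(glibc_version "$1")
  b=$(glibc_version "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

With the two paths from the question, glibc_version prints 2.3.6 and 2.3.4 respectively, so same_glibc_version fails. On an RPM-based system, rpm -qf on each path should also show which package owns it.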
both of these are reported as stripped by the file command.
These should not show as stripped. Either they were not built correctly, or your strip is busted.
Also note that you don't actually have to get all of this working to debug your problem. In the RPMBUILD directory, you should be able to find the glibc build directory, with a full-debug libc.so.6. Just copy that library into your VM, and you won't have to worry about the debuginfo RPM.
Try verifying that debug info for mtrace.c is indeed present. First see if the separate debug info for GLIBC knows about a compilation unit called mtrace.c:
$ eu-readelf -w /usr/lib/debug/lib64/libc-2.15.so.debug > t
$ grep mtrace t
name (strp) "mtrace.c"
name (strp) "mtrace"
1 0 0 0 mtrace.c
[10480] "mtrace.c"
[104bb] "mtrace"
[5052] symbol: mtrace, CUs: 446
Then see if GDB actually finds the source file from the glibc-debuginfo RPM:
(gdb) set pagination off
(gdb) start # run to the beginning of main() and pause there
(gdb) set logging on
Copying output to gdb.txt.
(gdb) info sources
Quit GDB then grep for mtrace in gdb.txt and you should find something like /usr/src/debug/glibc-2.15-a316c1f/malloc/mtrace.c
This works with GDB 7.4. I'm not sure the GDB version shipped with RHEL 4.5 supports all the commands used above. Building upstream GDB from source is in fact easier than building Python, though.
When trying to add stack traces to mtrace, make sure you don't call malloc() directly or indirectly in the GLIBC malloc hooks.
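The interactive session above could also be scripted. This is a sketch with invented function names, and it assumes a GDB that supports the -batch and -ex options. The output of "info sources" is a comma-separated list of paths, so it is split into lines before filtering:

```shell
# Filter the output of GDB's "info sources" (read from stdin) down to
# entries whose file name is mtrace.c.
mtrace_sources() {
  tr ',' '\n' | sed 's/^[[:space:]]*//' | grep -E '(^|/)mtrace\.c$'
}

# Hypothetical wrapper: run the given program to main() in batch mode and
# list the matching source files known to GDB.
find_mtrace_source() {
  gdb -batch -ex 'start' -ex 'info sources' "$1" 2>/dev/null | mtrace_sources
}
```

If find_mtrace_source ./your_test_program prints nothing, GDB has not picked up debug info covering mtrace.c.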