Say I have mylibrary.ml, which provides wrappers for library.c, and I want to bytecode-compile mylibrary.ml and provide it as a library for other OCaml code. Compiling this to bytecode (I am not considering compiling OCaml to native code here) produces a number of files, and I am wondering whether there is any reason to keep them all, or to provide them all to other users of the library?
I understand (so far) that I need the bytecode library object mylibrary.cma so that I can use mylibrary in the OCaml toplevel as
ocaml mylibrary.cma
or that I can
#load "mylibrary.cma";;
from an OCaml script. The compiled interface mylibrary.cmi and dllmylibrary.so (which contains the C parts of the code) are also needed for the above to work. And the uncompiled interface definition file mylibrary.mli is nice to keep for documentation purposes.
But is there any reason to also retain the mylibrary.cmo file if I have the mylibrary.cma file? In what kind of case would someone want to have that, too?
EDIT: I mean, I need to build the .cmo in the makefile and then use it to build the .cma, but I was thinking of removing the .cmo afterwards, to keep the directory marginally cleaner.
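For reference, one hedged sketch of such a build sequence (file names are taken from the question; whether you drive this by hand or via ocamlmklib, and the exact flags, depend on your setup):

```sh
ocamlc -c mylibrary.mli        # -> mylibrary.cmi
ocamlc -c mylibrary.ml         # -> mylibrary.cmo
ocamlc -c library.c            # -> library.o (ocamlc can drive the C compiler)
# bundle everything into the bytecode library plus the C stub libraries
ocamlmklib -o mylibrary library.o mylibrary.cmo
# -> mylibrary.cma, dllmylibrary.so, libmylibrary.a
```

After this, the .cmo is indeed only an intermediate unless someone wants to link against it directly.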
So, apparently the purposes of the different files are (limiting this to the bytecode compiler):
mylibrary.mli - human readable interface definition (not strictly needed, compiler only needs .cmi)
mylibrary.cmi - compiled interface, needed when compiling code calling mylibrary
library.o - C object file
dlllibrary.so - shared library object made of the .o
liblibrary.a - static library object made of the .o
mylibrary.cmo - bytecode object compiled from mylibrary.ml
mylibrary.cma - bytecode library
Then mylibrary.cma (together with mylibrary.cmi and dlllibrary.so) is needed when loading mylibrary from the toplevel:
#load "mylibrary.cma";;
OR
One could compile a bytecode program that dynamically links with mylibrary.cma (mylibrary.cmi and dlllibrary.so also needed):
ocamlc mylibrary.cma <program>.ml
OR
dynamically linking with the bytecode object instead of the bytecode library (files needed: mylibrary.cmo, mylibrary.cmi, dlllibrary.so):
ocamlc dlllibrary.so mylibrary.cmo <program>.ml
(Note: then run the bytecode with: ocamlrun -I . <program>, assuming dlllibrary.so is in the current directory.)
OR
statically linking with objects (files needed: mylibrary.cmo, mylibrary.cmi liblibrary.a)
ocamlc -custom liblibrary.a mylibrary.cmo <program>.ml
OR
statically linking with library objects (files needed: mylibrary.cma, mylibrary.cmi, liblibrary.a)
ocamlc -custom -I . mylibrary.cma <program>.ml
So depending on how the library is going to be used in the future, different files are needed. The .mli is only needed by human readers, and the library.o object file compiled from the C source is not needed (though it would be in some cases when compiling to native code).
You should also keep the source code of the library. Very often when upgrading the OCaml compiler (e.g. from 3.12 to a future 3.13), your previous *.cmo or *.cma files won't work without recompilation.
Assuming you can always clean and recompile things (e.g. your makefile has the two traditional targets clean and all), you can keep only the *.cma.
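A minimal makefile along those lines might look like this (a sketch; file names are assumed from the question, recipe lines must be tab-indented, and clean deliberately removes everything that all can rebuild):

```make
all: mylibrary.cma

mylibrary.cmi: mylibrary.mli
	ocamlc -c mylibrary.mli

mylibrary.cmo: mylibrary.ml mylibrary.cmi
	ocamlc -c mylibrary.ml

library.o: library.c
	ocamlc -c library.c

mylibrary.cma: mylibrary.cmo library.o
	ocamlmklib -o mylibrary library.o mylibrary.cmo

clean:
	rm -f *.cmi *.cmo *.cma *.o *.so *.a
```

With this layout the .cmo can be deleted after the build and regenerated on demand.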
Related
I have multiple files in my project which are compiled to object files, either with -fPIC or without it. When those files go into a shared library this option is necessary; otherwise it is not. Thus when I compile the project into a shared library, the option is required.
Nevertheless, I am never sure whether the last compilation was done for the shared library or not. If it wasn't, and I then want to build the shared library, the build fails and I have to delete the generated object files. Recompiling those object files takes quite a lot of time, which I would like to avoid.
Is there a way for the makefile to detect, right at the beginning, whether the object files were compiled with or without this option, so that the build can either continue or recompile everything as needed, without raising an error or wasting time in an unnecessary recompilation loop?
Q: "Is there a way for the makefile to detect if the object files have been compiled with or without this option"
Short answer: No
Long answer: if source files can be built with different options and you need access to the different builds at the same time, then you must have several output folders, and the makefile must generate the objects into the correct folder.
Say, something like this for a simple debug/release ("-g" flag) setup:

src/
include/
BUILD/
    obj/
        debug/
        release/
    bin/
        debug/
        release/
Of course, this approach has limitations. For example, if you need to have both "debug/release" and "PIC/not PIC", then you will need 4 output folders.
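Make rules for such a layout could be sketched like this (assuming GNU make; directory names follow the tree above, flags are purely illustrative, and recipe lines are tab-indented):

```make
CFLAGS_debug   = -g
CFLAGS_release = -O2

# same sources, two object directories -- the target path encodes the options
BUILD/obj/debug/%.o: src/%.c
	mkdir -p $(dir $@)
	$(CC) $(CFLAGS_debug) -c $< -o $@

BUILD/obj/release/%.o: src/%.c
	mkdir -p $(dir $@)
	$(CC) $(CFLAGS_release) -c $< -o $@
```

For the PIC/non-PIC case you would do the same with -fPIC in one of the two rules.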
You can also mix my approach with the one proposed by @BasileStarynkevitch (generating specific names).
A possible approach could be to have your own convention about object files and their naming. For example, file foo.c would be compiled into foo.pic.o when it is position-independent code, and into foo.o when it is not. And you could adapt that idea to other cases (e.g. foo.dbg.o for a file with DWARF debug information, compiled with -g). You could also use subdirectories, e.g. compile foo.c into PICOBJ/foo.o or PICOBJ/foo.pic.o for a PIC object file, but this might be less convenient with make (beware of recursive makes).
Writing appropriate rules for your Makefile is then quite easy.
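In GNU make, that naming convention can be captured with a couple of pattern rules (a sketch; foo and bar are hypothetical file names, and recipes are tab-indented):

```make
# position-independent objects get a .pic.o suffix
%.pic.o: %.c
	$(CC) $(CFLAGS) -fPIC -c $< -o $@

# plain objects keep the usual .o suffix
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@

libfoo.so: foo.pic.o bar.pic.o
	$(CC) -shared $^ -o $@

myprog: foo.o bar.o
	$(CC) $^ -o $@
```

Since the two object flavors never share a file name, a stale non-PIC object can no longer be picked up by the shared-library link.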
Be aware of other build automation systems, e.g. ninja.
I am trying to incorporate a C library into some Rcpp code.
I can use the C library in a C++ program easily. I 'make' the C library, which creates the .a and .dll files in the /lib folder. I can then use the package by including the header in the program and running something like this from command line:
cc myfile.cpp -o myfile -Ipath.to.header path.to.lib.a -lz
This essentially tells the compiler to take the .cpp program, include headers from -I, and to link to two libraries.
Getting this to work with Rcpp shouldn't be overly difficult if I understand makevars correctly (which I unfortunately don't seem to).
I add the library to a folder in my package, and in src I add a makevars and makevars.win that look like this:
PKG_CFLAGS=
# specify header location
PKG_CPPFLAGS=-Ipath.to.lib/include
# specify libs to link to
PKG_LIBS=path.to.lib/lib/file.a -lz

# make library
path.to.lib/lib/file.a:
	cd path.to.lib; $(MAKE)
This correctly 'makes' the .a and .dll files for the library; however, none of the Rcpp magic runs (i.e. during the build I never see the g++ system calls that compile the files in src), so "no DLL was created".
I am fairly certain this is a problem in my makevars target that makes the library. When I remove that portion from the makevars, and 'make' the library from the command line myself before building the package, I get the proper g++ calls with my -I and -l statements, but I get errors about undefined references.
I notice that the -l statements are only included in the final g++ call where the final .dll is made, but isn't included in the earlier g++ calls where the files with the library headers are compiled.
So I have two problems:
How do I fix my makevars so that it 'makes' the library, but doesn't stop Rcpp from compiling the files in src?
How do I deal with the undefined references? The library is clearly not header-only, so I am guessing it needs the -l statement in the earlier g++ calls, but that may not even be possible.
The best approach is to avoid a complicated src/Makevars file altogether.
One easy-ish way around this: use configure to build your static library, and then, when the package itself is built, just refer to the library in src/Makevars.
I use that scheme in Rblpapi (where we copy an externally supplied library in) and in nloptr, where we download the nlopt sources and build them 'when needed' (i.e. when no libnlopt is on the system).
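A skeleton of that scheme, keeping the question's placeholder paths as-is (this is a sketch, not the actual Rblpapi or nloptr files): a top-level configure script builds the static library before R compiles src/, and src/Makevars is left with nothing but flags, so the default build rules for src/*.cpp still run:

```sh
#!/bin/sh
# configure -- run by "R CMD INSTALL" before src/ is compiled
(cd path.to.lib && ${MAKE-make})

# src/Makevars then contains only plain variable assignments, no targets:
#   PKG_CPPFLAGS = -Ipath.to.lib/include
#   PKG_LIBS     = path.to.lib/lib/file.a -lz
```

Because Makevars no longer defines a first target of its own, the library rule cannot shadow R's default all target, which is what was suppressing the g++ calls.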
Super-simple, totally boring setup: I have a directory full of .hpp and .cpp files. Some of these .cpp files need to be built into executables; naturally, these .cpp files #include some of the .hpp files in the same directory, which may then include others, etc. etc. Most of those .hpp files have corresponding .cpp files, which is to say: if some_application.cpp #includes foo.hpp, either directly or transitively, then chances are there's also a foo.cpp file that needs to be compiled and linked into the some_application executable.
Super-simple, but I'm still clueless about what the "best" way to build it is, either in SCons or CMake (neither of which I have any expertise in yet, other than staring at documentation for the last day or so and becoming sad). I fear that the sort of solution I want may actually be impossible (or at least grossly overcomplicated) to pull off in most build systems, but if so, it'd be nice to know that so I can just give up and be less picky. Naturally, I'm hoping I'm wrong, which wouldn't be surprising given how ignorant I am about build systems (in general, and about CMake and SCons in particular).
CMake and SCons can, of course, both automatically detect that some_application.cpp needs to be recompiled whenever any of the header files it depends on (either directly or transitively) changes, since they can "parse" C++ files well enough to pick out those dependencies. OK, great: we don't have to list each .cpp-#includes-.hpp dependency by hand. But: we still need to decide what subset of object files need to get sent to the linker when it's time to actually generate each executable.
As I understand it, the two most straightforward alternatives to dealing with that part of the problem are:
A. Explicitly and laboriously enumerating the "anything using this object file needs to use these other object files too" dependencies by hand, even though those dependencies are exactly mirrored by the corresponding-.cpp-transitively-includes-the-corresponding-.hpp dependencies that the build system already went to the trouble of figuring out for us. Why? Because computers.
B. Dumping all the object files in this directory into a single "library", and then having all executables depend on and link in that one library. This is much simpler, and what I understand most people would do, but it's also kinda sloppy. Most of the executables don't actually need everything in that library, and wouldn't actually need to be rebuilt if only the contents of one or two .cpp files changed. Isn't this setting up exactly the kind of unnecessary computation a supposed "build system" should be avoiding? (I suppose maybe they wouldn't need to be rebuilt if the library were dynamically linked, but suffice it to say I dislike dynamically linked libraries for other reasons.)
Can either CMake or SCons do better than this in any remotely straightforward fashion? I see a bunch of limited ways to twiddle the automatically generated dependency graph, but no general-purpose way to do so interactively ("OK, build system, what do you think the dependencies are? Ah. Well, based on that, add the following dependencies and think again: ..."). I'm not too surprised about that. I haven't yet found a special-purpose mechanism in either build system for dealing with the super-common case where link-time dependencies should mirror corresponding compile-time #include dependencies, though. Did I miss something in my (admittedly somewhat cursory) reading of the documentation, or does everyone just go with option (B) and quietly hate themselves and/or their build systems?
Your statement in point A), "anything using this object file needs to use these other object files too", is something that will indeed need to be done by hand. Compilers don't automatically find the object files needed by a binary; you have to list them explicitly at link time. If I understand your question correctly, you don't want to have to explicitly list the objects needed by a binary, but want the build tool to find them automatically. I doubt there is any build tool that does this: SCons and CMake definitely don't.
If you have an application some_application.cpp that includes foo.hpp (or other headers used by these cpp files), and subsequently needs to link the foo.cpp object, then in SCons, you will need to do something like this:
env = Environment()
env.Program(target = 'some_application',
            source = ['some_application.cpp', 'foo.cpp'])
This will only relink when 'some_application.cpp', 'foo.hpp', or 'foo.cpp' has changed. Assuming g++, this effectively translates to something like the following, independently of SCons or CMake.
g++ -c foo.cpp -o foo.o
g++ some_application.cpp foo.o -o some_application
You mention you have "a directory full of .hpp and .cpp files"; I would suggest you organize those files into libraries. Not all in one library, but logically, into smaller, cohesive libraries. Then your applications/binaries would link only the libraries they need, thus minimizing recompilation due to unused objects.
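In SCons that organization might look like the following (the library split and all file names are hypothetical):

```python
# SConstruct sketch: group sources into small, cohesive static libraries
env = Environment()

io_lib   = env.StaticLibrary('io',   ['file_io.cpp', 'net_io.cpp'])
math_lib = env.StaticLibrary('math', ['algebra.cpp', 'stats.cpp'])

# each program links only the libraries it actually uses
env.Program('converter', ['converter.cpp'], LIBS=[io_lib])
env.Program('solver',    ['solver.cpp'],    LIBS=[math_lib, io_lib])
```

A change in stats.cpp then triggers a rebuild of math_lib and a relink of solver, but leaves converter untouched.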
I had more or less the same problem as you have and I solved it as follows:
import SCons.Scanner
import os

def header_to_source(header_file):
    """Specify the location of the source file corresponding to a given
    header file."""
    return header_file.replace('include/', 'src/').replace('.hpp', '.cpp')

def source_files(main_file, env):
    """Returns list of source files the given main_file depends on. With
    the function header_to_source one must specify where to look for
    the source file corresponding to a given header. The resulting
    list is filtered for existing files. The resulting list contains
    main_file as first element."""
    ## get the dependencies
    node = File(main_file)
    scanner = SCons.Scanner.C.CScanner()
    path = SCons.Scanner.FindPathDirs("CPPPATH")(env)
    deps = node.get_implicit_deps(env, scanner, path)
    ## collect corresponding source files
    root_path = env.Dir('#').get_abspath()
    res = [main_file]
    for dep in deps:
        source_path = header_to_source(
            os.path.relpath(dep.get_abspath(), root_path))
        if os.path.exists(os.path.join(root_path, source_path)):
            res.append(source_path)
    return res
The header_to_source function is the one you need to modify so that it returns the source file corresponding to a given header file. The function source_files then gives you all the source files needed to build the given main_file (with main_file itself as the first element). Non-existing files are automatically removed. So the following should be sufficient to define the target for an executable:
env.Program(source_files('main.cpp', env))
I am not sure whether this works in all possible setups, but at least for me it works.
I have access to a large C++ project, full of files and with a very complicated makefile courtesy of automake & friends
Here is an idea of the directory structure.
otherproject/
    folder1/
        some_headers.h
        some_files.cpp
    ...
    folderN/
        more_headers.h
        more_files.cpp
    build/
        lots of things here
        objs/
            lots_of_stuff.o
        an_executable_I_dont_need.exe
my_stuff/
    my_program.cpp
I want to use a class from the big project, declared in say, "some_header.h"
/* my_program.cpp */
#include "some_header.h"

int main()
{
    ThatClass x;
    x.frobnicate();
}
I managed to compile my file by painstakingly passing lots of "-I" options to g++ so that it could find all the header files:
g++ my_program.cpp -c -o myprog.o -I../other/folder1 ... -I../other/folderN
When it comes to linking, I have to manually include all of his ".o" files, which is probably overkill:
g++ -o my_executable myprog.o ../other/build/objs/*.o
However, not only do I have to manually remove his "main.o" from the list, but even this isn't enough, since I also forgot to link against all the libraries he happened to use:
otherproject/build/objs/StreamBuffer.h:50: undefined reference to `gzread'
At this point I am starting to feel I am probably doing something very wrong. How should I proceed? What is the usual, and what is the best, approach to this kind of issue?
I need this to work on Linux in case something platform-specific needs to be done.
Generally, the project's .o files should be grouped together into a library (on Linux, a .a file for a static library or a .so for a dynamic one), and you link to the library using the -L option to specify its location and the -l option to specify its name.
For example, if the library file is at /path/to/big_project/libbig_project.a, you would add the options -L /path/to/big_project -l big_project to your gcc command line.
If the project doesn't have a library file that you can link to (e.g. it's not a library but an executable program, and you just want some of the code used by that executable), you might want to ask the project's author to create such a library file (if he/she is familiar with "automake and friends" it shouldn't be too much trouble), or try doing so yourself.
EDIT: Another suggestion: you said the project comes with a makefile. Try making it with the makefile and see what its compiler command lines look like. Do they have many includes and individual object files as well?
Treating an application which was not developed as a library as if it was a library isn't likely to work. As an offhand example, omitting the main might wind up cutting out initialization code that the class you want depends upon.
The responsible thing to do here is to read the code, understand it, and turn the functionality you want into a proper library. Build the "exe you don't need" with debug symbols and set breakpoints in the constructors and methods of the class. Step into them so you get a grasp on the functionality and what parts of the program are relevant and irrelevant to your needs.
Hopefully the code is under some kind of version control system that supports branching (such as Git). If not, make your own repository that does. Edit the files until you've organized them into a library and code that uses the library. Make sure it works properly within the context of the original program. Then turn around and use this library in your own program.
If you've done a good job, you might be able to convince the original authors to accept the separation back into their original codebase. If not, at least version control has your back so you can manage integration of future changes.
I have some doubts about how programs use shared libraries.
When I build a shared library (with the -shared -fPIC switches) I make some functions available to external programs.
Usually I call dlopen() to load the library and then dlsym() to bind those functions to function pointers.
This approach does not involve including any .h file.
Is there a way to avoid doing dlopen() & dlsym() and just including the .h of the shared library?
I guess this may be how C++ programs use code stored in system shared libraries, i.e. by just including stdlib.h etc.
Nick, I think all the other answers are actually answering your question, which is how you link libraries, but the way you phrase your question suggests you have a misunderstanding of the difference between headers files and libraries. They are not the same. You need both, and they are not doing the same thing.
Building an executable has two main phases, compilation (which turns your source into an intermediate form, containing executable binary instructions, but is not a runnable program), and linking (which combines these intermediate files into a single running executable or library).
When you do gcc -c program.c, you are compiling, and you generate program.o. This step is where headers matter. You need to #include <stdlib.h> in program.c to (for example) use malloc and free. (Similarly, you need #include <dlfcn.h> for dlopen and dlsym.) If you don't, the compiler will complain that it doesn't know what those names are and halt with an error. But if you do #include the header, the compiler does not insert the code of the functions you call into program.o; it merely inserts references to them. The reason is to avoid duplication of code: one copy of each function is enough for your whole program, so if you had further files (module1.c, module2.c and so on) that all used malloc, you would merely end up with many references to a single copy of malloc. That single copy is present in the standard library, in either its shared or static form (libc.so or libc.a), but those are not referenced in your source, and the compiler is not aware of them.
The linker is. In the linking phase you do gcc -o program program.o. The linker will then search all libraries you pass it on the command line and find the single definition of all functions you've called which are not defined in your own code. That is what the -l does (as the others have explained): tell the linker the list of libraries you need to use. Their names often have little to do with the headers you used in the previous step. For example to get use of dlsym you need libdl.so or libdl.a, so your command-line would be gcc -o program program.o -ldl. To use malloc or most of the functions in the std*.h headers you need libc, but because that library is used by every C program it is automatically linked (as if you had done -lc).
Sorry if I'm going into a lot of detail but if you don't know the difference you will want to. It's very hard to make sense of how C compilation works if you don't.
One last thing: dlopen and dlsym are not the normal method of linking. They are used for special cases where you want to dynamically determine what behavior you want based on information that is, for whatever reason, only available at runtime. If you know what functions you want to call at compile time (true in 99% of the cases) you do not need to use the dl* functions.
You can link shared libraries like static ones; they are then searched for when the program is launched. As a matter of fact, by default -lXXX will prefer libXXX.so to libXXX.a.
You need to give the linker the proper instructions to link your shared library.
The shared library names are like libNAME.so, so for linking you should use -lNAME
Call it libmysharedlib.so and then link your main program as:
gcc -o myprogram myprogram.c -lmysharedlib
If you use CMake to build your project, you can use
TARGET_LINK_LIBRARIES(targetname libraryname)
As in:
TARGET_LINK_LIBRARIES(myprogram mylibrary)
To create the library "mylibrary", you can use
ADD_LIBRARY(targetname sourceslist)
As in:
ADD_LIBRARY(mylibrary ${mylibrary_SRCS})
Additionally, this method is cross-platform (whereas simply passing flags to gcc is not).
Shared libraries (.so) are object files where the compiled code of functions/classes/... is stored (in binary form).
Header files (.h) are what tell the compiler the declarations of the functions/classes/... (defined in the .so) that the main code requires.
Therefore, you need both of them.