Linking an unknown list of outs to cc_library

I have a genrule that takes "in" a config file, and spits "out" a large number of built files (*.so*s and *.h files):
genrule(
    name = "ros_auto",
    local = True,
    srcs = [":package_list"],
    outs = ["ros_built.tgz"],
    tools = [":build_packages.bash"],
    cmd = "$(location :build_packages.bash) $< $@",
)
Next up, I need to take all of the files output by the above genrule and create a cc_library from them, like below (at the moment the only manageable way I can find to register the genrule's output files is to tarball them and declare the tar):
cc_library(
    name = "ros",
    srcs = glob(["lib/*.so*"]),
    hdrs = glob([
        "include/**/*.h",
        "include/**/*.hpp",
    ]),
    strip_include_prefix = "include",
    visibility = ["//visibility:public"],
)
No matter where I look, I seem to hit dead end after dead end (http_archive uses a download_and_extract method which assumes the *.tgz is remote, the cc_library implementation is inaccessible/unextendable Java, etc.).
I would've thought the problem of "I have node A that generates a tonne of output files, which node B depends on" would be extremely common and have a simple solution. Am I missing something obvious?
Context:
I have this working with a local repository rule that takes in the local directory and uses the cc_library rule above as the build_file parameter (but that means the first step of this process has to be built completely outside of the Bazel build process, which is not how this should be done):
new_local_repository(
    name = "ros",
    path = "/tmp/ros_debug",
    build_file = "//ros:BUILD.bazel",
    licenses = ["https://docs.ros.org/diamondback/api/licenses.html"],
)

A basic Bazel philosophy is that a build should depend on unambiguous inputs/dependencies and output a strictly defined set of files, in order to guarantee reproducible builds. Having a genrule generate a "random" number of files that a target then depends on goes against this philosophy. That's why you can't find a good way to achieve this.
A way to work around this is to have a single outs file in your genrule where you write down the names of the generated files (or a log of your script, or whatever). Then you can define an additional filegroup containing all the generated files, using glob. Then add your genrule as a dependency of node B's rule and the filegroup to its srcs, as sketched below. This guarantees that the files are generated before node B is built.
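A minimal sketch of that arrangement (hypothetical names; note that glob() only matches files in the source tree, so this assumes the local script materializes its outputs there):

genrule(
    name = "ros_auto",
    local = True,
    srcs = [":package_list"],
    outs = ["ros_manifest.txt"],  # single fixed out: a manifest/log of what was generated
    tools = [":build_packages.bash"],
    cmd = "$(location :build_packages.bash) $< > $@",
)

# Collects whatever the script produced.
filegroup(
    name = "ros_libs",
    srcs = glob(["lib/*.so*"]),
)

# Node B: consumes the filegroup and depends on the genrule (here via
# data) so the generation step runs as part of the build.
cc_library(
    name = "ros",
    srcs = [":ros_libs"],
    hdrs = glob(["include/**/*.h", "include/**/*.hpp"]),
    strip_include_prefix = "include",
    data = [":ros_auto"],
)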

Why does bazel not see includes defined in bazelrc?

I am migrating a large legacy makefiles project to Bazel. The project used to copy all sources and headers into a single "build dir" before building, and because of this all source and header files use single-level includes without any prefix (#include "1.hpp").
Bazel requires that modules (libraries) reference headers by a path relative to the WORKSPACE file; however, my goal is to introduce Bazel build files with zero modifications to the source code.
I use bazelrc to globally set paths to includes as if structure was flat:
.bazelrc:
build --copt=-Ia/b/c
a/b/BUILD:
cc_library(
    name = "lib",
    srcs = ["c/1.cpp"],
    hdrs = ["c/1.hpp"],
    visibility = ["//visibility:public"],
)
When I build this target, I see my -I flag in the compiler invocation, but compilation fails because Bazel cannot find the header 1.hpp:
$ bazel build -s //a/b:lib
...
a/b/c/1.cpp:13:10: fatal error: 1.hpp: No such file or directory
13 | #include "1.hpp"
|
Interestingly enough, Bazel prints the gcc command it invokes during the build, and if I run that command myself, the compiler is able to find 1.hpp and 1.cpp compiles.
How do I make Bazel "see" these includes? Do I really need to additionally specify copts for every target on top of the global -I flags?
Bazel uses sandboxing: for each action (compile a C++ file, link a library) a dedicated build directory is prepared. That directory contains only the files (via symlinks and other Linux sorcery) that are explicitly declared as a dependency/source/header of the given target.
The trick with --copt=-Ia/b/c is a bad idea, because that option only works for targets that depend on //a/b:lib; only they have the header staged in their sandbox.
Use the includes or strip_include_prefix attribute instead:
cc_library(
    name = "lib",
    srcs = ["c/1.cpp"],
    hdrs = ["c/1.hpp"],
    strip_include_prefix = "c",
    visibility = ["//visibility:public"],
)
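For comparison, a sketch of the includes variant of the same fix (paths in includes are resolved relative to the package and propagated to dependents as -isystem directories):

cc_library(
    name = "lib",
    srcs = ["c/1.cpp"],
    hdrs = ["c/1.hpp"],
    includes = ["c"],  # a/b/c ends up on the include search path
    visibility = ["//visibility:public"],
)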
and add the lib as a dependency to every target that needs access to these headers:
cc_binary(
    name = "some_bin",
    srcs = ["foo.cpp"],
    deps = ["//a/b:lib"],
)

Bazel cc_library with no srcs doesn't compile on its own

I have a cc_library that is header-only. Whenever I try to compile such a library by itself, it won't actually compile anything. I purposely put some errors in the header to provoke compilation errors, but Bazel doesn't actually compile anything. Here's a small example.
// test.h
This should not compile fdsafdsafdsa
int foo() { return 1; }
# BUILD
cc_library(
    name = 'test',
    hdrs = ['test.h'],
)
$ bazel build :test
INFO: Analyzed target //:test (2 packages loaded, 3 targets configured).
INFO: Found 1 target...
Target //:test up-to-date (nothing to build)
INFO: Elapsed time: 0.083s, Critical Path: 0.00s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
Is this behavior intended?
I also ran the same experiment but split the code into .h and .cc files, and in that case I got the error when I compiled.
cc_library (and other rules as well, pkg_tar for instance) does not have to have any sources. This is also valid:
cc_library(
    name = "empty",
    srcs = [],
)
And it is actually quite useful, too. You may have configurable attributes such as deps (or srcs) where the actual content is only applicable under certain conditions:
cc_binary(
    name = "mybinary",
    srcs = ["main.c"],
    deps = select({
        ":platform1": [":some_plat1_only_lib"],
        ":platform2": [":empty"],  # as defined in the above snippet
    }),
)
Or (since above you could just as well have used [] for the :platform2 deps) where you have a larger tree and you expect developers to simply depend on //somelib:somelib, you could use this empty library through an alias to give them a single label, so they don't have to worry about the platform-specific details of how a given piece of functionality is provided where:
# somelib/BUILD:
alias(
    name = "somelib",
    actual = select({
        ":platform1": ":some_plat1_only_lib",
        ":platform2": ":empty",  # as defined in the above snippet
    }),
    visibility = ["//visibility:public"],  # set appropriately
)
And mybinary or any other target could now say:
cc_binary(
    name = "mybinary",
    srcs = ["main.c"],
    deps = ["//somelib"],
)
And of course, as the other answer here states, there are header only libraries.
Also, regarding the example in your question: Bazel or not, you would normally not compile a header file on its own (nor would it be very useful to). You only use its content, and only then see the compiler fail, once you attempt to build a source file the header is #included from. That is, for bazel build to fail, another target would have to depend on test and #include "test.h".
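For instance, a minimal dependent target like this (hypothetical) would force the header to be compiled and surface the error:

cc_binary(
    name = 'use_test',
    srcs = ['main.cc'],  # main.cc does #include "test.h" and calls foo()
    deps = [':test'],
)

Now bazel build :use_test fails with the expected error while compiling main.cc.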
A header-only library means that you don't need to build anything: simply include the headers you need in your program and use them.

What is the correct way to create a system header only library in bazel?

We are migrating a CMake project to Bazel. We have several header-only libraries that are marked SYSTEM in the CMake project to suppress some warnings. When migrating these to Bazel, the only way we were able to make this work is the following:
cc_library(
    name = "lib",
    srcs = ["include/header1.h", ...],
    includes = ["include"],
)
This works, but per the Bazel C++ documentation it is not recommended to put interface/public headers in srcs; those should be part of hdrs. Adding them to hdrs doesn't work, because then the regular -I based inclusion is used instead of -isystem.
Is our way of doing this fine, although not recommended by Bazel? If not, what would be the correct way of doing it?
EDIT:
After some digging, I found the textual_hdrs attribute on cc_library, and using that seems to work too. It is a cleaner approach than adding the public headers to srcs. Now the rule looks like this:
cc_library(
    name = "lib",
    textual_hdrs = ["include/header1.h", ...],
    includes = ["include"],
)
This looks like a good solution for us, except that the documentation on textual_hdrs isn't clear enough to indicate that this is what it is meant for.
PS: It is really not possible for us to refactor the code to fix the warnings; there are numerous libraries like this, and doing so is completely outside the scope of this migration effort.
It turns out adding the headers to hdrs does work, provided strip_include_prefix is None (or not passed at all). We had a macro wrapping the cc_library instance, and it was defaulting strip_include_prefix to the empty string. Interestingly, this doesn't affect textual_hdrs or srcs, but it does affect hdrs.
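A sketch of the kind of wrapper macro that causes this (a hypothetical reconstruction; the empty-string default is the bug):

# defs.bzl (hypothetical)
def my_cc_library(name, strip_include_prefix = "", **kwargs):
    # Bug: defaulting to "" engages the include-prefix handling for hdrs.
    # The fix is to default to None, which leaves the attribute unset.
    native.cc_library(
        name = name,
        strip_include_prefix = strip_include_prefix,
        **kwargs
    )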
In summary, the below seems to work fine:
cc_library(
    name = "lib",
    hdrs = ["include/header1.h", ...],
    includes = ["include"],
)

Building LLVM with Bazel

I've got a project currently using CMake which I would like to switch over to Bazel. The primary dependency is LLVM, which I use to generate LLVM IR. Looking around, there doesn't seem to be a whole lot of guidance on this, as only TensorFlow seems to use LLVM from Bazel (and auto-generates its config as far as I can tell). There was also a thread on bazel-discuss that discussed a similar issue, though my attempts to replicate it have failed.
Currently, my best attempt is this (fetcher.bzl):
def _impl(ctx):
    # Download LLVM master.
    ctx.download_and_extract(url = "https://github.com/llvm-mirror/llvm/archive/master.zip")
    # Run `cmake llvm-master` to generate configuration.
    ctx.execute(["cmake", "llvm-master"])
    # The bazel-discuss thread says to delete llvm-master, but I've
    # found that only generated files are pulled out of master, so all
    # the non-generated ones get dropped if I delete this.
    # ctx.execute(["rm", "-r", "llvm-master"])
    # Generate a BUILD file for the LLVM dependency.
    ctx.file('BUILD', """
# Build a library with all the LLVM code in it.
cc_library(
    name = "lib",
    srcs = glob(["**/*.cpp"]),
    hdrs = glob(["**/*.h"]),
    # Include the x86 target and all include files.
    # Add those under llvm-master/... as well because only built files
    # seem to appear under include/...
    copts = [
        "-Ilib/Target/X86",
        "-Iinclude",
        "-Illvm-master/lib/Target/X86",
        "-Illvm-master/include",
    ],
    # Include here as well, not sure whether this or copts is
    # actually doing the work.
    includes = [
        "include",
        "llvm-master/include",
    ],
    visibility = ["//visibility:public"],
    # Currently picking up some gtest targets; I have that dependency
    # already, so just link it here until I filter those out.
    deps = [
        "@gtest//:gtest_main",
    ],
)
""")
    # Generate an empty WORKSPACE file.
    ctx.file('WORKSPACE', '')

get_llvm = repository_rule(implementation = _impl)
And then my WORKSPACE file looks like the following:
load(":fetcher.bzl", "get_llvm")
git_repository(
name = "gflags",
commit = "46f73f88b18aee341538c0dfc22b1710a6abedef", # 2.2.1
remote = "https://github.com/gflags/gflags.git",
)
new_http_archive(
name = "gtest",
url = "https://github.com/google/googletest/archive/release-1.8.0.zip",
sha256 = "f3ed3b58511efd272eb074a3a6d6fb79d7c2e6a0e374323d1e6bcbcc1ef141bf",
build_file = "gtest.BUILD",
strip_prefix = "googletest-release-1.8.0",
)
get_llvm(name = "llvm")
I would then run this with bazel build @llvm//:lib --verbose_failures.
I would consistently get errors from missing header files. Eventually I found that running cmake llvm-master generates many header files into the current directory, but seems to leave the non-generated ones in llvm-master/. I added the same include directories under llvm-master/ and that seems to catch a lot of the files. However, it currently seems that tblgen is not running, so I am still missing critical headers required for the compilation. My current error is:
In file included from external/llvm/llvm-master/include/llvm/CodeGen/MachineOperand.h:18:0,
from external/llvm/llvm-master/include/llvm/CodeGen/MachineInstr.h:24,
from external/llvm/llvm-master/include/llvm/CodeGen/MachineBasicBlock.h:22,
from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:20,
from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/ConstantFoldingMIRBuilder.h:13,
from external/llvm/llvm-master/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp:10:
external/llvm/llvm-master/include/llvm/IR/Intrinsics.h:42:38: fatal error: llvm/IR/IntrinsicEnums.inc: No such file or directory
Attempting to find this file in particular, I don't see any IntrinsicEnums.inc, IntrinsicEnums.h, or IntrinsicEnums.td. I do see a lot of Intrinsics*.td, so maybe one of them generates this particular file?
It seems like tblgen is supposed to convert the *.td files to *.h and *.cpp files (please correct me if I am misunderstanding). However, this doesn't seem to be running. I saw that in TensorFlow's project they have a gentbl() BUILD macro, though it is not practical for me to copy it, as it has far too many dependencies on the rest of TensorFlow's build infrastructure.
Is there any way to do this without something as big and complex as Tensorflow's system?
I posted to the llvm-dev mailing list here and got some interesting responses. LLVM definitely wasn't designed to support Bazel and doesn't do so particularly well. It appears to be theoretically possible by using Ninja to output all the compile commands and then consuming them from Bazel. This is likely to be pretty difficult and would require a separate tool that outputs Skylark code to be run by Bazel.
This seemed pretty complex for the scale of project I was working on, so my workaround was to download the pre-built binaries from releases.llvm.org. This included all the necessary headers, libraries, and tooling binaries. I was able to make a simple but powerful toolchain based around this in Bazel for my custom programming language.
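As a sketch, that setup can look something like the following (the URL, version, and archive layout here are assumptions; add a sha256 for reproducibility):

# WORKSPACE
new_http_archive(
    name = "llvm",
    url = "https://releases.llvm.org/6.0.0/clang+llvm-6.0.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz",
    strip_prefix = "clang+llvm-6.0.0-x86_64-linux-gnu-ubuntu-16.04",
    build_file = "llvm.BUILD",
)

# llvm.BUILD
cc_library(
    name = "lib",
    srcs = glob(["lib/*.a"]),
    hdrs = glob(["include/**"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)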
Simple example (limited but focused): https://github.com/dgp1130/llvm-bazel-foolang
Full example (more complex and less focused): https://github.com/dgp1130/sanity-lang

How incremental is cc_library in bazel

In bazel documentation (https://docs.bazel.build/versions/master/cpp-use-cases.html) there's an example like this:
cc_library(
    name = "build-all-the-files",
    srcs = glob(["*.cc"]),
    hdrs = glob(["*.h"]),
)
How incremental is it? I.e., if I change only one of the *.cc files, will it rebuild the whole target or only what's required?
It will just recompile the modified file. Bazel will then relink the library only if the object file changed (so if you just change a comment, it may skip the link step).
Still have doubts?
Add the flag -s when you build and you will see what Bazel actually runs.
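For example, using the target from the question:

$ bazel build -s //:build-all-the-files

Touch one of the *.cc files and run it again; only that file's compile action (and, if the object file changed, the link step) is re-run.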