I have a project that builds with CMake and requires a lot of manual installation of additional dependencies. I want to migrate this project to Bazel and make these libs automatically downloadable. I found a solution for Boost, but I can't figure out how to add icu4c and other libs that are built with other tools.
There are many ways to make use of third-party libraries with Bazel. The right approach depends on the properties of the third-party library, e.g.: Does the library already support Bazel? Is it available only as a pre-built package? Does it use code generators, other tools, or transitive dependencies?
Taking {fmt}, which uses CMake as its build system, as an example, you can proceed as follows:
First approach: Inject a BUILD file
In your WORKSPACE file you can do something like:
load("@bazel_tools//tools/build_defs/repo:git.bzl", "new_git_repository")
load("@bazel_tools//tools/build_defs/repo:utils.bzl", "maybe")

maybe(
    new_git_repository,
    name = "fmt",
    branch = "master",
    remote = "https://github.com/fmtlib/fmt",
    build_file = "//third_party:fmt.BUILD",
)
The corresponding fmt.BUILD file can look like this:
cc_library(
    name = "fmt",
    srcs = [
        # "src/fmt.cc",  # No C++ module support
        "src/format.cc",
        "src/os.cc",
    ],
    hdrs = [
        "include/fmt/args.h",
        "include/fmt/chrono.h",
        "include/fmt/color.h",
        "include/fmt/compile.h",
        "include/fmt/core.h",
        "include/fmt/format.h",
        "include/fmt/format-inl.h",
        "include/fmt/locale.h",
        "include/fmt/os.h",
        "include/fmt/ostream.h",
        "include/fmt/printf.h",
        "include/fmt/ranges.h",
        "include/fmt/xchar.h",
    ],
    includes = [
        "include",
        "src",
    ],
    strip_include_prefix = "include",
    visibility = ["//visibility:public"],
)
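A consuming target can then depend on the injected library. A minimal sketch (main.cpp stands in for your own source file):

cc_binary(
    name = "demo",
    srcs = ["main.cpp"],
    deps = ["@fmt"],  # shorthand for @fmt//:fmt
)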
Advantages:
fmt 8.0.1 does not have out-of-the-box support for Bazel. This way Bazel can make use of {fmt} without {fmt} having to know anything about Bazel.
fmt 8.0.1 does not need to be modified.
Disadvantages:
Reinventing the wheel: Every Bazel project that wants to use {fmt} has to reinvent this fmt.BUILD file.
Maintenance costs: If different Bazel projects want to adapt to future versions of {fmt}, every single project has to do this maintenance on its own. Maybe new files will be introduced.
Missing knowledge: Maybe for some reason it makes sense to set some specific defines upfront, etc. It takes some time and knowledge of {fmt} to set up such a BUILD file, and without that knowledge it is unclear what the best practice for building the lib is.
Second approach: Bazelize {fmt}
Add a WORKSPACE file and BUILD file to the {fmt} repository.
This way {fmt} gets bazelized and can be used in your Bazel builds.
You could then use it like this:
Example
Create a WORKSPACE.bazel file with the following content:
load("#bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
# Fetch bazelized fmt
git_repository(
name = "fmt",
branch = "bazel-support", # A copy of master where BUILD.bazel, WORKSPACE.bazel, .bazelrc and .bazelversion are moved to root
remote = "https://github.com/<user_or_organisation>/fmt", # replace <user_or_organisation> by a valid account
)
Create a BUILD.bazel file and add a dependency on {fmt} (with the content of fmt.BUILD).
In favor of keeping the {fmt} project root directory clean, those files were not added to the root of the project itself (see here for details).
Third approach: Using the {fmt} repository with Bazel
Even though the {fmt} repository does not contain a WORKSPACE file in its root directory, there is an easy approach to use the {fmt} repository with Bazel out of the box. This is demonstrated in the following example.
Add to your WORKSPACE file:
load("#bazel_tools//tools/build_defs/repo:git.bzl", "new_git_repository")
# Fetch all files from fmt including the BUILD file `support/bazel/BUILD.bazel`
new_git_repository(
name = "fmt_workaround",
branch = "master",
remote = "https://github.com/fmtlib/fmt/",
build_file_content = "# Empty build file on purpose"
)
# Now the BUILD file `support/bazel/BUILD.bazel` can be used:
new_git_repository(
name = "fmt",
branch = "master",
remote = "https://github.com/fmtlib/fmt/",
build_file = "#fmt_workaround//:support/bazel/BUILD.bazel"
)
Create a BUILD.bazel file and add a dependency on {fmt}:
cc_binary(                  # Build a binary
    name = "Demo",          # Name of the binary
    srcs = ["main.cpp"],    # List of files - we only have main.cpp
    deps = ["@fmt//:fmt"],  # Depend on fmt
)
Make use of {fmt} in main.cpp:
#include "fmt/core.h"C
int main() {
fmt::print("The answer is {}.\n", 42);
}
The expected output of this example is The answer is 42.
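You can build and run the example with bazel run //:Demo (assuming the BUILD.bazel file sits at the workspace root).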
Fourth approach: Make use of patch_cmds
load("#bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
git_repository(
name = "fmt",
branch = "master",
patch_cmds = [
"mv support/bazel/.bazelrc .bazelrc",
"mv support/bazel/.bazelversion .bazelversion",
"mv support/bazel/BUILD.bazel BUILD.bazel",
"mv support/bazel/WORKSPACE.bazel WORKSPACE.bazel",
],
# Windows related patch commands are only needed in the case MSYS2 is not installed
patch_cmds_win = [
"Move-Item -Path support/bazel/.bazelrc -Destination .bazelrc",
"Move-Item -Path support/bazel/.bazelversion -Destination .bazelversion",
"Move-Item -Path support/bazel/BUILD.bazel -Destination BUILD.bazel",
"Move-Item -Path support/bazel/WORKSPACE.bazel -Destination WORKSPACE.bazel",
],
remote = "https://github.com/fmtlib/fmt",
)
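Note that branch = "master" is not reproducible. For hermetic builds you can pin a specific revision instead; a sketch, with a placeholder hash:

git_repository(
    name = "fmt",
    commit = "<pinned-commit-hash>",  # pin a revision instead of tracking a branch
    remote = "https://github.com/fmtlib/fmt",
    # patch_cmds / patch_cmds_win as above
)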
More details here.
Other libraries
I have written a few blog posts about Bazelizing different libs:
Bazel: Bazelizing Qt5 for macOS
Bazel: Bazelizing Embree 3.13.0
Bazel: Bazelizing Qt5 & Qt6
Bazel: Handling external dependencies in OpenEXR
Bazel: Official support for OpenEXR
Bazel: Bazelizing OpenEXR
Bazel: Bazelizing Embree 3.12.1
Related
I am working on a C++ project with the Bazel BUILD system in the VS Code IDE environment. To illustrate, one could take one of the large open-source projects such as TensorFlow.
While the IntelliSense functionality works very well for source/header dependencies within the project folder itself, VS Code seems unable to recognize headers included from third_party/external dependencies, such as protobuf headers in the TensorFlow project (see the screenshot below). So is there a way for VS Code to recognize such headers, with the help of both the C++/clang and Bazel plugins?
To provide more details:
This protobuf header is included by the following Bazel target in tensorflow/lite/toco/BUILD,
cc_library(
    name = "toco_port",
    srcs = [
        "toco_port.cc",
    ],
    hdrs = [
        "format_port.h",
        "toco_port.h",
        "toco_types.h",
    ],
    deps = [
        "//tensorflow/core:framework_lite",
        "//tensorflow/core:lib",
        "//tensorflow/core:lib_internal",
        "@com_google_absl//absl/status",
        "@com_google_protobuf//:protobuf_headers",
    ],
)
which in turn is defined by the following workspace rule in tensorflow/workspace2.bzl
tf_http_archive(
    name = "com_google_protobuf",
    patch_file = ["//third_party/protobuf:protobuf.patch"],
    sha256 = "cfcba2df10feec52a84208693937c17a4b5df7775e1635c1e3baffc487b24c9b",
    strip_prefix = "protobuf-3.9.2",
    system_build_file = "//third_party/systemlibs:protobuf.BUILD",
    system_link_files = {
        "//third_party/systemlibs:protobuf.bzl": "protobuf.bzl",
        "//third_party/systemlibs:protobuf_deps.bzl": "protobuf_deps.bzl",
    },
    urls = tf_mirror_urls("https://github.com/protocolbuffers/protobuf/archive/v3.9.2.zip"),
)
The downloaded external repository is usually stored under Bazel's output user root, a local directory such as ~/.cache/bazel, which can be changed with the --output_user_root flag.
I am migrating a large legacy makefile project to Bazel. The project used to copy all sources and headers into a single "build dir" before building, and because of this all source and header files use single-level includes without any prefix (#include "1.hpp").
Bazel requires that modules (libraries) use header paths relative to the WORKSPACE file, but my goal is to introduce Bazel build files that require zero modifications of the source code.
I use bazelrc to globally set include paths as if the structure were flat:
.bazelrc:
build --copt=-Ia/b/c
/a/b/BUILD
cc_library(
    name = "lib",
    srcs = ["c/1.cpp"],
    hdrs = ["c/1.hpp"],
    visibility = ["//visibility:public"],
)
When I build this target, I see my -I flag in the compiler invocation, but compilation fails because Bazel cannot find the header 1.hpp:
$ bazel build -s //a/b:lib
...
a/b/c/1.cpp:13:10: fatal error: 1.hpp: No such file or directory
   13 | #include "1.hpp"
      |          ^~~~~~~
Interestingly enough, Bazel prints the gcc command it invokes during the build, and if I run that command myself, the compiler is able to find 1.hpp and 1.cpp compiles.
How do I make Bazel "see" these includes? Do I really need to additionally specify copts for every target on top of the global -I flags?
Bazel uses sandboxing: for each action (compile a C++ file, link a library) a specific build directory is prepared. That directory contains only the files (via symlinks and other Linux sorcery) that are explicitly declared as a dependency/source/header of the given target.
The trick with --copt=-Ia/b/c is a bad idea, because that option only works for targets that depend on //a/b:lib.
Use the includes or strip_include_prefix attribute instead (an includes variant is sketched after the dependency example below):
cc_library(
    name = "lib",
    srcs = ["c/1.cpp"],
    hdrs = ["c/1.hpp"],
    strip_include_prefix = "c",
    visibility = ["//visibility:public"],
)
and add the lib as a dependency of every target that needs access to these headers:
cc_binary(
    name = "some_bin",
    srcs = ["foo.cpp"],
    deps = ["//a/b:lib"],
)
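As mentioned above, the includes attribute is an alternative to strip_include_prefix. Where strip_include_prefix exposes only the declared hdrs through a virtual include directory, includes adds the real directory to the (system) include path of the target and of everything that depends on it. A minimal sketch:

cc_library(
    name = "lib",
    srcs = ["c/1.cpp"],
    hdrs = ["c/1.hpp"],
    includes = ["c"],  # dependents can now write #include "1.hpp"
    visibility = ["//visibility:public"],
)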
I would just like to know if someone has tried doing this.
I am currently using nelhage/rules_boost for my Boost dependencies (just to make some things compile in the meantime), but since the code I'm working with is only 100% compatible with 1.55, I cannot use his rules for long.
I could also try adapting his code to work with Boost 1.55, but I think it would be a lot easier to just make Bazel depend on an installation of Boost, since I am also working with containers.
I usually use Boost as a pre-built external dependency with Bazel. I just reference the local installation in my WORKSPACE file and then create a BUILD file for it, e.g.:
# WORKSPACE file
new_local_repository(
    name = "boost",
    path = "/your/path/to/boost",
    build_file = "third_party/boost.BUILD",
)
In the BUILD file you can choose to split headers and libs into separate rules or combine them into one (a combined variant is sketched after the usage example below). In the following example I keep all the headers in one rule and separate the libraries into different rules:
# third_party/boost.BUILD
cc_library(
    name = "boost-headers",
    hdrs = glob(["include/boost/**"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)

cc_library(
    name = "boost-atomic",
    srcs = ["lib/libboost_atomic.a"],
    visibility = ["//visibility:public"],
)

cc_library(
    name = "boost-chrono",
    srcs = ["lib/libboost_chrono.a"],
    visibility = ["//visibility:public"],
)
...
Then in my binary/library I pick up the dependencies:
cc_binary(
    name = "main",
    srcs = ["main.cc"],
    deps = [
        "@boost//:boost-headers",
        "@boost//:boost-regex",
    ],
)
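If you prefer a single combined target, something like the following should also work (a sketch; the glob assumes the usual lib/libboost_*.a layout, and link order between the static archives may still need attention):

cc_library(
    name = "boost",
    srcs = glob(["lib/libboost_*.a"]),
    hdrs = glob(["include/boost/**"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)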
This should also work if you have Boost installed into /usr/include / /usr/lib, but I haven't tried it, to be honest.
Hope this helps.
I've got a project currently using CMake which I would like to switch over to Bazel. The primary dependency is LLVM, which I use to generate LLVM IR. Looking around, there doesn't seem to be a whole lot of guidance on this, as only TensorFlow seems to use LLVM from Bazel (and auto-generates its config as far as I can tell). There was also a thread on bazel-discuss I found which discussed a similar issue, though my attempts to replicate it have failed.
Currently, my best attempt looks like this (fetcher.bzl):
def _impl(ctx):
    # Download LLVM master
    ctx.download_and_extract(url = "https://github.com/llvm-mirror/llvm/archive/master.zip")

    # Run `cmake llvm-master` to generate configuration.
    ctx.execute(["cmake", "llvm-master"])

    # The bazel-discuss thread says to delete llvm-master, but I've
    # found that only generated files are pulled out of master, so all
    # the non-generated ones get dropped if I delete this.
    # ctx.execute(["rm", "-r", "llvm-master"])

    # Generate a BUILD file for the LLVM dependency.
    ctx.file("BUILD", """
# Build a library with all the LLVM code in it.
cc_library(
    name = "lib",
    srcs = glob(["**/*.cpp"]),
    hdrs = glob(["**/*.h"]),
    # Include the x86 target and all include files.
    # Add those under llvm-master/... as well because only built files
    # seem to appear under include/...
    copts = [
        "-Ilib/Target/X86",
        "-Iinclude",
        "-Illvm-master/lib/Target/X86",
        "-Illvm-master/include",
    ],
    # Include here as well, not sure whether this or copts is
    # actually doing the work.
    includes = [
        "include",
        "llvm-master/include",
    ],
    visibility = ["//visibility:public"],
    # Currently picking up some gtest targets; I have that dependency
    # already, so just link it here until I filter those out.
    deps = [
        "@gtest//:gtest_main",
    ],
)
""")

    # Generate an empty workspace file
    ctx.file("WORKSPACE", "")

get_llvm = repository_rule(implementation = _impl)
And then my WORKSPACE file looks like the following:
load(":fetcher.bzl", "get_llvm")
git_repository(
name = "gflags",
commit = "46f73f88b18aee341538c0dfc22b1710a6abedef", # 2.2.1
remote = "https://github.com/gflags/gflags.git",
)
new_http_archive(
name = "gtest",
url = "https://github.com/google/googletest/archive/release-1.8.0.zip",
sha256 = "f3ed3b58511efd272eb074a3a6d6fb79d7c2e6a0e374323d1e6bcbcc1ef141bf",
build_file = "gtest.BUILD",
strip_prefix = "googletest-release-1.8.0",
)
get_llvm(name = "llvm")
I would then run this with bazel build @llvm//:lib --verbose_failures.
I would consistently get errors about missing header files. Eventually I found that running cmake llvm-master generates many header files into the current directory, but seems to leave the non-generated ones in llvm-master/. I added the same include directories under llvm-master/ and that catches a lot of the files. However, it currently seems that tblgen is not running, so I am still missing critical headers required for the compilation. My current error is:
In file included from external/llvm/llvm-master/include/llvm/CodeGen/MachineOperand.h:18:0,
from external/llvm/llvm-master/include/llvm/CodeGen/MachineInstr.h:24,
from external/llvm/llvm-master/include/llvm/CodeGen/MachineBasicBlock.h:22,
from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:20,
from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/ConstantFoldingMIRBuilder.h:13,
from external/llvm/llvm-master/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp:10:
external/llvm/llvm-master/include/llvm/IR/Intrinsics.h:42:38: fatal error: llvm/IR/IntrinsicEnums.inc: No such file or directory
Attempting to find this file in particular, I don't see any IntrinsicEnums.inc, IntrinsicEnums.h, or IntrinsicEnums.td. I do see a lot of Intrinsics*.td, so maybe one of them generates this particular file?
It seems like tblgen is supposed to convert the *.td files into *.h and *.cpp files (please correct me if I am misunderstanding). However, this doesn't seem to be running. I saw that TensorFlow's project has a gentbl() BUILD macro, though it is not practical for me to copy it, as it has far too many dependencies on the rest of TensorFlow's build infrastructure.
Is there any way to do this without something as big and complex as TensorFlow's system?
I posted to the llvm-dev mailing list here and got some interesting responses. LLVM definitely wasn't designed to support Bazel and doesn't do so particularly well. It appears to be theoretically possible by using Ninja to output all the compile commands and then consume them from Bazel. This is likely to be pretty difficult and would require a separate tool which outputs Skylark code to be run by Bazel.
This seemed pretty complex for the scale of project I was working on, so my workaround was to download the pre-built binaries from releases.llvm.org. This included all the necessary headers, libraries, and tooling binaries. I was able to make a simple but powerful toolchain based around this in Bazel for my custom programming language.
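The core of that workaround looks roughly like this; a sketch, where the release URL, strip_prefix, and library names are illustrative and depend on the LLVM version and platform you pick from releases.llvm.org:

# WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "llvm",
    urls = ["https://releases.llvm.org/8.0.0/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-18.04.tar.xz"],
    strip_prefix = "clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-18.04",
    build_file = "//third_party:llvm.BUILD",
)

# third_party/llvm.BUILD
cc_library(
    name = "lib",
    srcs = glob(["lib/libLLVM*.a"]),  # pre-built static archives from the release tarball
    hdrs = glob(["include/**"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)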
Simple example (limited but focused): https://github.com/dgp1130/llvm-bazel-foolang
Full example (more complex and less focused): https://github.com/dgp1130/sanity-lang
I'm building a program with the gRPC library using Bazel. My WORKSPACE file:
http_archive(
    name = "com_github_grpc_grpc",
    urls = ["https://github.com/grpc/grpc/archive/v1.8.3.zip"],
    sha256 = "57a2c67abe789ce9e80d49f473515c7479ae494e87dba84463b10bbd0990ad62",
    strip_prefix = "grpc-1.8.3",
)

load("@com_github_grpc_grpc//bazel:grpc_deps.bzl", "grpc_deps")

grpc_deps()
BUILD file:
proto_library(
    name = "test_proto",
    srcs = ["test.proto"],
)

cc_proto_library(
    name = "test_cc_proto",
    deps = [":test_proto"],
)

cc_binary(
    name = "hello",
    srcs = ["hello.cc"],
    deps = [
        ":test_cc_proto",
        "@com_github_grpc_grpc//:grpc++",
    ],
)
Compiling this throws the error:
every rule of type proto_library implicitly depends upon the target '@com_google_protobuf_cc//:cc_toolchain', but this target could not be found because of: no such package '@com_google_protobuf_cc//': The repository could not be resolved.
If I include the com_google_protobuf_cc repository manually, the version doesn't match and I get an error saying test.pb.h was generated by a newer version of protoc.
How do I make gRPC load the right version of com_google_protobuf_cc?
How are you including Protobuf manually? What did you put in your BUILD and/or WORKSPACE file(s) to achieve this? It's hard to comment on what could be wrong without knowing exactly what you have tried.
As far as I know, you can include it by downloading the version you require and then adding something like the following to your WORKSPACE file:
local_repository(
    name = "com_google_protobuf",
    path = "../protobuf-3.4.1",
)

local_repository(
    name = "com_google_protobuf_cc",
    path = "../protobuf-3.4.1",
)
Of course, change the paths to match the version and location of your downloaded copy of Protobuf. Alternatively, you can probably use http_archive to point it directly at where it should be downloaded from, in the same way as you have done for gRPC.
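For example, something along these lines (a sketch; the version is illustrative — use the one your gRPC release expects — and fill in the archive's real sha256):

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "com_google_protobuf",
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.4.1.zip"],
    strip_prefix = "protobuf-3.4.1",
    # sha256 = "...",  # add the archive checksum here
)

http_archive(
    name = "com_google_protobuf_cc",
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.4.1.zip"],
    strip_prefix = "protobuf-3.4.1",
    # sha256 = "...",  # same archive, registered under the _cc name
)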