C++ TensorFlow API with TensorRT

My goal is to run a TensorRT-optimized TensorFlow graph in a C++ application. I am using TensorFlow 1.8 with TensorRT 4. Using the Python API I am able to optimize the graph and see a nice performance increase.
Trying to run the graph in C++ fails with the following error:
Not found: Op type not registered 'TRTEngineOp' in binary running on e15ff5301262. Make sure the Op and Kernel are registered in the binary running in this process.
Other, non-TensorRT graphs work. I had a similar error with the Python API, but solved it by importing tensorflow.contrib.tensorrt. From the error I am fairly certain the kernel and op are not registered, but I don't know how to register them in the application after TensorFlow has been built. On a side note, I cannot use Bazel but am required to use CMake. So far I link against libtensorflow_cc.so and libtensorflow_framework.so.
Can anyone help me here? Thanks!
Update:
Using the C or C++ API to load _trt_engine_op.so does not throw an error while loading, but fails to run with:
Invalid argument: No OpKernel was registered to support Op 'TRTEngineOp' with these attrs. Registered devices: [CPU,GPU], Registered kernels:
<no registered kernels>
[[Node: my_trt_op3 = TRTEngineOp[InT=[DT_FLOAT, DT_FLOAT], OutT=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], input_nodes=["tower_0/down_0/conv_0/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer", "tower_0/down_0/conv_skip/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer"], output_nodes=["tower_0/down_0/conv_skip/Relu", "tower_0/down_1/conv_skip/Relu", "tower_0/down_2/conv_skip/Relu", "tower_0/down_3/conv_skip/Relu"], serialized_engine="\220{I\000...00\000\000"](tower_0/down_0/conv_0/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, tower_0/down_0/conv_skip/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer)]]

Another way to solve the problem with the error "Not found: Op type not registered 'TRTEngineOp'" on TensorFlow 1.8:
1) In the file tensorflow/contrib/tensorrt/BUILD, add a new section with the following content:
cc_library(
    name = "trt_engine_op_kernel_cc",
    srcs = [
        "kernels/trt_calib_op.cc",
        "kernels/trt_engine_op.cc",
        "ops/trt_calib_op.cc",
        "ops/trt_engine_op.cc",
        "shape_fn/trt_shfn.cc",
    ],
    hdrs = [
        "kernels/trt_calib_op.h",
        "kernels/trt_engine_op.h",
        "shape_fn/trt_shfn.h",
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":trt_logging",
        ":trt_plugins",
        ":trt_resources",
        "//tensorflow/core:gpu_headers_lib",
        "//tensorflow/core:lib_proto_parsing",
        "//tensorflow/core:stream_executor_headers_lib",
    ] + if_tensorrt([
        "@local_config_tensorrt//:nv_infer",
    ]) + tf_custom_op_library_additional_deps(),
    alwayslink = 1,  # buildozer: disable=alwayslink-with-hdrs
)
2) Add //tensorflow/contrib/tensorrt:trt_engine_op_kernel_cc as a dependency to the corresponding Bazel target you want to build.
PS: There is no need to load the library _trt_engine_op.so with TF_LoadLibrary.

Here are my findings (and some kind of solution) for this problem (TensorFlow 1.8.0, TensorRT 3.0.4):
I wanted to include the TensorRT support in a library which loads a graph from a given *.pb file.
Just adding //tensorflow/contrib/tensorrt:trt_engine_op_kernel to my Bazel BUILD file didn't do the trick for me. I still got a message indicating that the ops were not registered:
2018-05-21 12:22:07.286665: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "TRTCalibOp" device_type: "GPU"') for unknown op: TRTCalibOp
2018-05-21 12:22:07.286856: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "TRTEngineOp" device_type: "GPU"') for unknown op: TRTEngineOp
2018-05-21 12:22:07.296024: E tensorflow/examples/tf_inference_lib/cTfInference.cpp:56] Not found: Op type not registered 'TRTEngineOp' in binary running on ***.
Make sure the Op and Kernel are registered in the binary running in this process.
The solution was that I had to load the ops library (tf_custom_op_library) within my C++ code using the C API:
#include "tensorflow/c/c_api.h"
...
TF_Status* status = TF_NewStatus();
TF_LoadLibrary("_trt_engine_op.so", status);
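For robustness, here is a minimal error-checked sketch of the same call; it assumes the TF 1.8 C API, where TF_LoadLibrary returns a TF_Library* handle and reports failures through the passed TF_Status (the helper name and path are placeholders):

#include <cstdio>

#include "tensorflow/c/c_api.h"

// Loads the custom-op library so TRTEngineOp/TRTCalibOp get registered.
// The path is a placeholder; point it at your built _trt_engine_op.so.
bool LoadTrtOps(const char* path) {
  TF_Status* status = TF_NewStatus();
  TF_Library* lib = TF_LoadLibrary(path, status);
  const bool ok = (TF_GetCode(status) == TF_OK);
  if (!ok) {
    std::fprintf(stderr, "TF_LoadLibrary failed: %s\n", TF_Message(status));
  }
  TF_DeleteStatus(status);
  // Keep the handle for the lifetime of the process; the registered ops
  // must stay available for every session that uses the graph.
  (void)lib;
  return ok;
}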
The shared object _trt_engine_op.so is built by the Bazel target //tensorflow/contrib/tensorrt:python/ops/_trt_engine_op.so:
bazel build --config=opt --config=cuda --config=monolithic \
//tensorflow/contrib/tensorrt:python/ops/_trt_engine_op.so
Now I only have to make sure that _trt_engine_op.so is available whenever it is needed, e.g. via LD_LIBRARY_PATH.
If anybody has an idea how to do this in a more elegant way (why do we have two artifacts that have to be built? Can't we just have one?), I'm happy to hear any suggestion.
TL;DR:
Add //tensorflow/contrib/tensorrt:trt_engine_op_kernel as a dependency to the corresponding Bazel target you want to build.
Load the ops library _trt_engine_op.so in your code using the C API (see the sketch below).
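Putting the pieces together, a minimal sketch of loading and running the converted graph with the TF 1.8 C++ API; it assumes the op library was already loaded as shown above, and the *.pb path and function name are placeholders:

#include <memory>
#include <string>

#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/lib/core/errors.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/public/session.h"

tensorflow::Status RunTrtGraph(const std::string& pb_path) {
  // Read the TensorRT-optimized GraphDef produced by the Python conversion.
  tensorflow::GraphDef graph_def;
  TF_RETURN_IF_ERROR(tensorflow::ReadBinaryProto(
      tensorflow::Env::Default(), pb_path, &graph_def));

  // With TRTEngineOp registered, session creation succeeds as usual.
  std::unique_ptr<tensorflow::Session> session(
      tensorflow::NewSession(tensorflow::SessionOptions()));
  TF_RETURN_IF_ERROR(session->Create(graph_def));

  // Feed/fetch names depend on your graph; call session->Run(...) here.
  return tensorflow::Status::OK();
}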

For TensorFlow r1.8, the additions shown below in two BUILD files, plus building libtensorflow_cc.so with the monolithic option, worked for me.
diff --git a/tensorflow/BUILD b/tensorflow/BUILD
index cfafffd..fb8eb31 100644
--- a/tensorflow/BUILD
+++ b/tensorflow/BUILD
@@ -525,6 +525,8 @@ tf_cc_shared_object(
         "//tensorflow/cc:scope",
         "//tensorflow/cc/profiler",
         "//tensorflow/core:tensorflow",
+        "//tensorflow/contrib/tensorrt:trt_conversion",
+        "//tensorflow/contrib/tensorrt:trt_engine_op_kernel",
     ],
 )
diff --git a/tensorflow/contrib/tensorrt/BUILD b/tensorflow/contrib/tensorrt/BUILD
index fd3582e..a6566b9 100644
--- a/tensorflow/contrib/tensorrt/BUILD
+++ b/tensorflow/contrib/tensorrt/BUILD
@@ -76,6 +76,8 @@ cc_library(
     srcs = [
         "kernels/trt_calib_op.cc",
         "kernels/trt_engine_op.cc",
+        "ops/trt_calib_op.cc",
+        "ops/trt_engine_op.cc",
     ],
     hdrs = [
         "kernels/trt_calib_op.h",
@@ -86,6 +88,7 @@ cc_library(
     deps = [
         ":trt_logging",
         ":trt_resources",
+        ":trt_shape_function",
         "//tensorflow/core:gpu_headers_lib",
         "//tensorflow/core:lib_proto_parsing",
         "//tensorflow/core:stream_executor_headers_lib",

As you mentioned, it should work when you add //tensorflow/contrib/tensorrt:trt_engine_op_kernel to the dependency list. Currently the TensorFlow-TensorRT integration is still in progress and may work well only for the Python API; for C++ you'll need to call ConvertGraphDefToTensorRT() from tensorflow/contrib/tensorrt/convert/convert_graph.h for the conversion.
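For illustration, a hedged sketch of that conversion call; the parameter list follows the TF 1.8 contrib header as I understand it (output names, max batch size, workspace size, output GraphDef), so double-check it against your checkout. The wrapper name and "output_node" are placeholders:

#include <string>
#include <vector>

#include "tensorflow/contrib/tensorrt/convert/convert_graph.h"
#include "tensorflow/core/framework/graph.pb.h"

tensorflow::Status ConvertToTrt(const tensorflow::GraphDef& frozen_graph,
                                tensorflow::GraphDef* trt_graph) {
  // "output_node" is a placeholder; list your graph's real output names.
  std::vector<std::string> outputs = {"output_node"};
  return tensorflow::tensorrt::convert::ConvertGraphDefToTensorRT(
      frozen_graph, outputs,
      /*max_batch_size=*/1,
      /*max_workspace_size_bytes=*/1 << 30,
      trt_graph);
}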
Let me know if you have any questions.

Solution: add the import
from tensorflow.python.compiler.tensorrt import trt_convert as trt
Related discussion: https://github.com/tensorflow/tensorflow/issues/26525

Here is my solution for TensorFlow 1.14.
In your BUILD file, e.g. tensorflow/examples/your_workspace/BUILD, in tf_cc_binary:
srcs = [..., "//tensorflow/compiler/tf2tensorrt:ops/trt_engine_op.cc"]
deps = [..., "//tensorflow/compiler/tf2tensorrt:trt_op_kernels"]

Related

ROS personal global planner - undefined symbol: _ZN18base_local_planner12CostmapModelC1ERKN10costmap_2d9Costmap2DE

My setup is ROS Melodic on Ubuntu 18.04.
I want to simulate a TurtleBot3 moving with my own global planner and have been following this tutorial to get started: http://wiki.ros.org/navigation/Tutorials/Writing%20A%20Global%20Path%20Planner%20As%20Plugin%20in%20ROS#Running_the_Plugin_on_the_Turtlebot. The tutorial seems to be made for ROS Hydro, but as it was the best source of guidance I could find, I hoped it would work. I have been using this TurtleBot3 tutorial and its commands to get started: https://emanual.robotis.com/docs/en/platform/turtlebot3/nav_simulation/
There is no problem having the robot navigate with 2D Nav Goal in rviz using the built-in planning packages, but when I try to run the global path planner in my own package, I get the following error when I launch the 'turtlebot3_navigation.launch' file:
[ INFO] [1661178206.728674676, 7.359000000]: global_costmap: Using plugin "static_layer"
[ INFO] [1661178206.742733426, 7.372000000]: Requesting the map...
[ INFO] [1661178206.945370142, 7.575000000]: Resizing costmap to 384 X 384 at 0.050000 m/pix
[ INFO] [1661178207.047423541, 7.676000000]: Received a 384 X 384 map at 0.050000 m/pix
[ INFO] [1661178207.053220010, 7.678000000]: global_costmap: Using plugin "obstacle_layer"
[ INFO] [1661178207.056864268, 7.685000000]: Subscribed to Topics: scan
[ INFO] [1661178207.079615282, 7.706000000]: global_costmap: Using plugin "inflation_layer"
/opt/ros/melodic/lib/move_base/move_base: symbol lookup error: /home/aut/catkin_ws/devel/lib//libmy_global_planner_lib.so: undefined symbol: _ZN18base_local_planner12CostmapModelC1ERKN10costmap_2d9Costmap2DE
[move_base-4] process has died [pid 625, exit code 127, cmd /opt/ros/melodic/lib/move_base/move_base cmd_vel:=/cmd_vel odom:=odom __name:=move_base __log:=/home/aut/.ros/log/f4c41f78-2225-11ed-befb-b8ca3a965376/move_base-4.log].
log file: /home/aut/.ros/log/f4c41f78-2225-11ed-befb-b8ca3a965376/move_base-4*.log
I ran c++filt on the symbol lookup error and got:
c++filt _ZN18base_local_planner12CostmapModelC1ERKN10costmap_2d9Costmap2DE
base_local_planner::CostmapModel::CostmapModel(costmap_2d::Costmap2D const&)
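For reference, the planner code I copied invokes this constructor, roughly like this (a snippet equivalent to the carrot_planner source; the function name here is just for illustration):

#include <base_local_planner/costmap_model.h>
#include <costmap_2d/costmap_2d.h>

// The demangled symbol is exactly this constructor: the planner builds a
// CostmapModel from the costmap, so the plugin library has to resolve
// base_local_planner's symbols when move_base loads it.
base_local_planner::CostmapModel* MakeWorldModel(costmap_2d::Costmap2D& costmap) {
  return new base_local_planner::CostmapModel(costmap);
}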
I've been using this code (https://github.com/ros-planning/navigation/blob/noetic-devel/carrot_planner/src/carrot_planner.cpp and https://github.com/ros-planning/navigation/blob/noetic-devel/carrot_planner/include/carrot_planner/carrot_planner.h), changing carrot_planner and CarrotPlanner to my_global_planner and MyGlobalPlanner, figuring that starting from code that already works was a good way to avoid confusion about whether my code or something else caused errors.
My CMakeLists.txt currently looks like this:
cmake_minimum_required(VERSION 3.0.2)
project(my_global_planner)

find_package(catkin REQUIRED
  actionlib
  roscpp
  rospy
  std_msgs
)

catkin_package(
#  INCLUDE_DIRS include
#  LIBRARIES my_global_planner
#  CATKIN_DEPENDS other_catkin_pkg
#  DEPENDS system_lib
)

include_directories(
  include
  ${catkin_INCLUDE_DIRS}
)

add_library(my_global_planner_lib src/my_global_planner/my_global_planner.cpp)
I've been experimenting with it, adding stuff like:
find_package(catkin REQUIRED
  COMPONENTS
    angles
    base_local_planner
    costmap_2d
    nav_core
    pluginlib
    roscpp
    tf2
    tf2_geometry_msgs
    tf2_ros
)
and such in catkin_package() as well, but it doesn't seem to have worked, so I've returned it to how it was. I've also tried adding more than just:
<buildtool_depend>catkin</buildtool_depend>
<build_depend>nav_core</build_depend>
<exec_depend>nav_core</exec_depend>
to my package.xml, but no luck there either.
I hope I've made the problem clear and provided the needed information without dumping a massive sheet of code here. I feel that I've exhausted all my options at this point and any help or guidance would be greatly appreciated.
From what the error shows, your build found the correct header file; however, when looking up the library at load time, there is no corresponding symbol.
There could be many reasons for that, e.g.:
You may have multiple incompatible versions of costmap_2d. To solve this, delete all costmap installations in both your workspace and /opt/ros, then reinstall costmap_2d from the correct branch: https://github.com/ros-planning/navigation/tree/melodic-devel
Secondly, in your CMakeLists.txt, include costmap_2d and navigation inside find_package.
Also include costmap_2d and navigation in your manifest (package.xml).

Cross-compiling Bazel Docker Rust image on macOS to Linux: C++ toolchain not found

I am trying to cross-compile my Bazel Docker Rust image on macOS to Linux. Unfortunately, I keep getting an error that the C++ toolchain cannot be found.
While resolving toolchains for target //service1:service1: No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type. Maybe --incompatible_use_cc_configure_from_rules_cc has been flipped and there is no default C++ toolchain added in the WORKSPACE file? See https://github.com/bazelbuild/bazel/issues/10134 for details and migration instructions.
Unfortunately, the ticket mentioned doesn't provide much useful information.
I think I am missing a default C++ toolchain, but I can't find an easy way to add one.
Here are the relevant snippets from my project:
WORKSPACE.bazel
rust_repository_set(
    name = "rust_darwin_linux_cross",
    exec_triple = "x86_64-apple-darwin",
    extra_target_triples = ["x86_64-unknown-linux-gnu-musleabihf"],
    iso_date = "2021-06-09",
    version = "nightly",
)
service1/BUILD.bazel
platform(
    name = "linux-x86_64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
)

rust_image(
    name = "image",
    srcs = ["src/main.rs"],
)
The command that I run: bazel build --platforms //service1:linux-x86_64 //service1:image
When I run this command with --toolchain_resolution_debug=@bazel_tools//tools/cpp:toolchain_type I see that
Type @bazel_tools//tools/cpp:toolchain_type: target platform //:linux-x86_64: No toolchains found.
I really hope someone can point me in the right direction, as this is a quite confusing topic and there are almost no clear guides or examples on it 🙏

Building SpiderMonkey for Windows

I'm trying to build SpiderMonkey (32-bit) for Windows. Following the answer here, I performed the instructions here.
The command line I used for building is:
PATH=$PATH:"/c/Program Files/LLVM/bin/" JS_STANDALONE=1 ../configure.in --enable-nspr-build --disable-jemalloc --disable-js-shell --disable-tests --target=i686-pc-mingw32 --host=i686-pc-mingw32 --with-libclang-path="C:/Program Files/LLVM/bin"
However, I'm getting various linker errors where SpiderMonkey doesn't find Rust encoding functions, such as:
lld-link: error: undefined symbol: _encoding_mem_convert_latin1_to_utf8_partial
referenced by c:\firefox_90_0\js\src\vm\CharacterEncoding.cpp:109
..\Unified_cpp_js_src17.obj:(unsigned int __cdecl JS::DeflateStringToUTF8Buffer(class
JSLinearString *, class mozilla::Span<char, 4294967295>))
After looking at the SpiderMonkey config files (Cargo.toml files), it seems to me that during compilation SpiderMonkey should build jsrust.lib out of the Rust bindings, but in fact this doesn't happen and I get the linker errors. Any idea?
Yes, you are right: during compilation, SpiderMonkey's mach/mozbuild should build jsrust.lib and link it into the resulting DLL/js-shell executable.
Also, in my case, building jsrust.lib was missing a bcrypt import.
This can be easily fixed by applying the following patch to the sources, which enables mozbuild to traverse into the js/rust directory and fixes the aforementioned missing import too (tested on esr91 and up):
--- a/js/src/moz.build
+++ b/js/src/moz.build
@@ -7,6 +7,10 @@
 include("js-config.mozbuild")
 include("js-cxxflags.mozbuild")
+if CONFIG["JS_STANDALONE"]:
+    DIRS += ["rust"]
+    include("js-standalone.mozbuild")
+
 # Directory metadata
 component_engine = ("Core", "JavaScript Engine")
 component_gc = ("Core", "JavaScript: GC")
@@ -51,10 +55,7 @@ if CONFIG["ENABLE_WASM_CRANELIFT"]:
     CONFIGURE_SUBST_FILES += ["rust/extra-bindgen-flags"]
 if not CONFIG["JS_DISABLE_SHELL"]:
-    DIRS += [
-        "rust",
-        "shell",
-    ]
+    DIRS += ["shell"]
 TEST_DIRS += [
     "gdb",
--- a/js/src/rust/moz.build
+++ b/js/src/rust/moz.build
@@ -37,4 +37,5 @@ elif CONFIG["OS_ARCH"] == "WINNT":
         "shell32",
         "userenv",
         "ws2_32",
+        "bcrypt"
     ]
(The patch is available as a gist, alongside a tested mozbuild config which builds a 32-bit .dll, here: https://gist.github.com/razielanarki/a890f21a037312a46450e244beeba983)

Can't build a project package using boost/iostreams from Bazel

I am using https://github.com/nelhage/rules_boost in a Bazel project; everything is working fine except when I try to use boost/iostreams.
The problem occurs on Windows 10 and not on Linux. boost/iostreams depends on zlib, and the file that is downloaded is https://zlib.net/zlib-1.2.11.tar.gz
The error I get is:
ERROR: .../external/net_zlib_zlib/BUILD.bazel:6:1: in cc_library rule @net_zlib_zlib//:zlib: Expected action_config for 'preprocess-assemble' to be configured
ERROR: Analysis of target '.../storage:storage' failed; build aborted: Analysis of target '@net_zlib_zlib//:zlib' failed; build aborted
This is the BUILD file:
cc_library(
    name = "storage",
    srcs = [
        "blobstore.cc",
        "blobstore.h",
    ],
    hdrs = [
        "blobstore.h",
    ],
    deps = [
        "@boost//:iostreams",
    ],
    defines = ["BOOST_ALL_NO_LIB"],
)
Does anyone have an idea what the problem might be?
This is unfortunately a bug in our MSVC crosstool. What needs to be done is to add the missing action_config and make sure the other compilation flags are compatible. Would you mind creating a GitHub issue?

Protobuf version conflict when using OpenCV and TensorFlow C++

I am currently trying to use TensorFlow's shared library in a non-Bazel project, so I created a .so file from TensorFlow using Bazel.
But when I launch a C++ program that uses both OpenCV and TensorFlow, I get the following error:
[libprotobuf FATAL external/protobuf/src/google/protobuf/stubs/common.cc:78] This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.1.0). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "/build/mir-pkdHET/mir-0.21.0+16.04.20160330/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.1.0). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "/build/mir-pkdHET/mir-0.21.0+16.04.20160330/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
Abandon (core dumped)
Can you help me?
Thank you
You should rebuild TensorFlow with a linker script to avoid making third-party symbols global in the shared library that Bazel creates. This is how the Android Java/JNI library for TensorFlow is able to coexist with the pre-installed protobuf library on the device (look at the build rules in tensorflow/contrib/android for a working example).
Here's a BUILD file that I adapted from the Android library to do this:
package(default_visibility = ["//visibility:public"])

licenses(["notice"])  # Apache 2.0

exports_files(["LICENSE"])

load(
    "//tensorflow:tensorflow.bzl",
    "tf_copts",
    "if_android",
)

exports_files([
    "version_script.lds",
])

# Build the native .so.
# bazel build //tensorflow/contrib/android_ndk:libtensorflow_cc_inference.so \
#   --crosstool_top=//external:android/crosstool \
#   --host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
#   --cpu=armeabi-v7a
LINKER_SCRIPT = "//tensorflow/contrib/android:version_script.lds"

cc_binary(
    name = "libtensorflow_cc_inference.so",
    srcs = [],
    copts = tf_copts() + [
        "-ffunction-sections",
        "-fdata-sections",
    ],
    linkopts = if_android([
        "-landroid",
        "-latomic",
        "-ldl",
        "-llog",
        "-lm",
        "-z defs",
        "-s",
        "-Wl,--gc-sections",
        "-Wl,--version-script",  # This line must be directly followed by LINKER_SCRIPT.
        LINKER_SCRIPT,
    ]),
    linkshared = 1,
    linkstatic = 1,
    tags = [
        "manual",
        "notap",
    ],
    deps = [
        "//tensorflow/core:android_tensorflow_lib",
        LINKER_SCRIPT,
    ],
)
And the contents of version_script.lds:
{
  global:
    extern "C++" {
      tensorflow::*;
    };
  local:
    *;
};
This will make everything in the tensorflow namespace global and available through the library, while hiding the rest and preventing it from conflicting with protobuf.
(wasted a ton of time on this so I hope it helps!)
The error indicates that the program was compiled using headers (.h files) from protobuf 2.6.1. These headers are typically found in /usr/include/google/protobuf or /usr/local/include/google/protobuf, though they could be in other places depending on your OS and how the program is being built. You need to update these headers to version 3.1.0 and recompile the program.
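For background, this abort comes from protobuf's version handshake between the compiled headers and the runtime library. A minimal sketch of how application code triggers the same check (generated .pb.cc files run an equivalent internal verification):

#include <google/protobuf/stubs/common.h>

int main() {
  // Aborts with the "compiled against version X ... installed version Y"
  // message when the headers seen at compile time and the linked runtime
  // library disagree.
  GOOGLE_PROTOBUF_VERIFY_VERSION;
  return 0;
}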
This is indeed a pretty serious problem! I get the below error, similar to yours:
$./ceres_single_test
[libprotobuf FATAL google/protobuf/stubs/common.cc:78] This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.1.0). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "/build/mir-pkdHET/mir-0.21.0+16.04.20160330/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
terminate called after throwing an instance of 'google::protobuf::FatalException'
Aborted
My workaround:
cd /usr/lib/x86_64-linux-gnu
sudo mkdir BACKUP
sudo mv libmirprotobuf.so* ./BACKUP/
Now, the executable under test works, cool. What is not cool, however, is that things like gedit no longer work without running from a shell that has the BACKUP path added to LD_LIBRARY_PATH :-(
Hopefully there's a better fix out there?
The error complains that the Protocol Buffer runtime library is not compatible with the installed version. This error is coming from the GTK3 library: GTK3 uses Protocol Buffers 2.6.1, so if you build OpenCV with GTK3 support, you get this error. The easiest way to fix this is to use Qt instead of GTK3.
If you use the CMake GUI to configure OpenCV, just select Qt support instead of GTK3. You can install Qt using the following command.
sudo apt install qtbase5-dev
Rebuild libprotobuf with -Dprotobuf_BUILD_SHARED_LIBS=ON,
then make install to overwrite the older version.