How to set the input to an LSTM network in C++

I'm new to libtorch and I need to load an LSTM network in C++. Before that, I tested it with the following Python script, which works well:
actuator_net_file = "resources/actuator_nets/anydrive_v3_lstm.pt"
actuator_network = torch.jit.load(actuator_net_file)
actuator_network.eval()
num_envs = 1
num_actions = 1
sea_input = torch.zeros(num_envs*num_actions, 1, 2, requires_grad=False)
sea_hidden_state = torch.zeros(2, num_envs*num_actions, 8, requires_grad=False)
sea_cell_state = torch.zeros(2, num_envs*num_actions, 8, requires_grad=False)
torques, (sea_hidden_state[:], sea_cell_state[:]) = actuator_network(sea_input, (sea_hidden_state, sea_cell_state))
And the next step is to write a simple C++ program to test the forward evaluation of the network. But I don't know how to give arguments to the forward function. Here is what I got:
#include <torch/script.h> // One-stop header.
#include <torch/torch.h>
#include <iostream>
#include <memory>
#include <vector>
int main(int argc, const char* argv[]) {
if (argc != 2) {
std::cerr << "usage: example-app <path-to-exported-script-module>\n";
return -1;
}
std::string actuator_net_file = "/home/fenglongsong/Desktop/example-app/anydrive_v3_lstm.pt";
torch::jit::script::Module actuator_network;
try {
actuator_network = torch::jit::load(actuator_net_file);
actuator_network.eval();
}
catch (const c10::Error& e) {
std::cerr << "error loading the model\n";
return -1;
}
std::cout << "load model ok\n";
const int num_envs = 1;
const int num_actions = 1;
auto u0 = torch::zeros({num_envs*num_actions, 1, 2});
auto h0 = torch::zeros({2, num_envs*num_actions, 8});
auto c0 = torch::zeros({2, num_envs*num_actions, 8});
std::vector<torch::jit::IValue> inputs;
inputs.push_back(u0);
std::vector<torch::jit::IValue> tuple;
tuple.push_back(h0);
tuple.push_back(c0);
inputs.push_back(c10::ivalue::Tuple::create(tuple));
std::cout << "before forward" << std::endl;
actuator_network.forward(inputs).toTensor();
}
Compilation succeeds, but running the executable produces the following error:
fenglongsong@alvaro-rsl ~/Desktop/example-app/build $ ./example-app .
load model ok
before forward
terminate called after throwing an instance of 'c10::Error'
what(): Expected Tensor but got Tuple
Exception raised from reportToTensorTypeError at ../aten/src/ATen/core/ivalue.cpp:908 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f9153dc07ab in /home/fenglongsong/Documents/ocs2_ws/src/libtorch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7f9153dbc15e in /home/fenglongsong/Documents/ocs2_ws/src/libtorch/lib/libc10.so)
frame #2: c10::IValue::reportToTensorTypeError() const + 0x64 (0x7f913dd6d304 in /home/fenglongsong/Documents/ocs2_ws/src/libtorch/lib/libtorch_cpu.so)
frame #3: c10::IValue::toTensor() && + 0x4b (0x55d1ce3cd311 in ./example-app)
frame #4: main + 0x54e (0x55d1ce3ca0ec in ./example-app)
frame #5: __libc_start_main + 0xf3 (0x7f913c7c6083 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: _start + 0x2e (0x55d1ce3c986e in ./example-app)
Aborted (core dumped)
My question is: what is the C++ equivalent of
torques, (sea_hidden_state[:], sea_cell_state[:]) = actuator_network(sea_input, (sea_hidden_state, sea_cell_state)) ? Any suggestions will be much appreciated!

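I have experience only with torch::Tensor directly (the torch::nn API), not with torch::jit::IValue. For torch::nn::LSTM, the forward signature is roughly:
forward(const Tensor & input, torch::optional<std::tuple<Tensor,Tensor>> hx_opt)
So with that API, on a torch::nn::LSTM instance (called lstm here for illustration), you would call it like this:
lstm->forward(u0, std::make_tuple(h0, c0));
A torch::jit::script::Module is different: its forward takes a std::vector<torch::jit::IValue>, which is what your code already builds, and returns an IValue. The error "Expected Tensor but got Tuple" is raised from IValue::toTensor() (frame #3 of your trace), which suggests that the inputs are accepted and the failure is the .toTensor() call on the result. In your Python script the network returns torques, (hidden, cell), so the result is a tuple and should be unpacked with toTuple() rather than toTensor().
What exactly comes back depends on the network architecture, so next time include the actuator network architecture in your question to make it clearer.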

Related

Creating 2D std::vector as input vector for Tensor Flow Lite results in crashing ESP although there is enough heap memory

I want to create a 2D input vector for my machine learning model. The model runs on an ESP32, but I am running into issues when setting up such a vector.
I initialise a vector by std::vector<std::vector<float>> testing_vector;
and reserve memory in my setup routine by testing_vector.reserve(1000);
This works fine when I reserve for 1000 elements. When I reserve for over 4700 elements though,
my ESP crashes, although ESP.getFreeHeap() and ESP.getMaxAllocHeap() show that enough heap memory should be available.
Edit:
Thanks to the answers of "Some programmer dude" and "molbdnilo" below, I realised that my first calculation of the required space was wrong. Reserving 4700 elements of 12-byte vector objects results in 56,400 bytes, which is still below the Max Alloc Heap of 110580, though.
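The arithmetic in the edit above can be checked with a small helper. This is only a sketch: outer_reserve_bytes is a hypothetical name, and the 12-byte object size is the figure quoted in the question for the ESP32, not something the code measures on the target.

```cpp
#include <cstddef>
#include <vector>

// Bytes that outer.reserve(n) has to obtain in one contiguous allocation.
// Only the inner std::vector<float> objects are counted here; the floats
// they will later hold are allocated separately by each inner vector.
constexpr std::size_t outer_reserve_bytes(std::size_t n,
                                          std::size_t inner_object_size) {
    return n * inner_object_size;
}

// Size of one inner vector object on the machine compiling this code.
// This value is platform dependent (the question reports 12 on the ESP32).
constexpr std::size_t kInnerObjectSizeHere = sizeof(std::vector<float>);
```

For the numbers in the question, outer_reserve_bytes(4700, 12) gives 56400. On the target itself, passing sizeof(std::vector<float>) instead of a hard-coded 12 avoids relying on the assumed object size.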
Input
When I reserve memory for 4,700 elements with testing_vector.reserve(4700); the ESP crashes.
Outcome
My serial monitor with enabled "ESP32 Exception Decoder" gives me:
Rebooting...
ets Jul 29 2019 12:21:46
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13132
load:0x40080400,len:3036
entry 0x400805e4
Total heap: 298076
Free heap: 274232
Total PSRAM: 0
Free PSRAM: 0
Max Alloc Heap: 110580
Max Vector Size: 357913941
abort() was called at PC 0x4013457b on core 1
Backtrace:0x400834e1:0x3ffb26900x40088c5d:0x3ffb26b0 0x4008d6b5:0x3ffb26d0 0x4013457b:0x3ffb2750 0x401345c2:0x3ffb2770 0x401346bb:0x3ffb2790 0x4013461a:0x3ffb27b0 0x400d1b16:0x3ffb27d0 0x400d1d95:0x3ffb27f0 0x40129fe7:0x3ffb2820
#0 0x400834e1:0x3ffb2690 in panic_abort at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/panic.c:402
#1 0x40088c5d:0x3ffb26b0 in esp_system_abort at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/esp_system.c:128
#2 0x4008d6b5:0x3ffb26d0 in abort at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/newlib/abort.c:46
#3 0x4013457b:0x3ffb2750 in __cxxabiv1::__terminate(void (*)()) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x401345c2:0x3ffb2770 in std::terminate() at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5 0x401346bb:0x3ffb2790 in __cxa_throw at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:95
#6 0x4013461a:0x3ffb27b0 in operator new(unsigned int) at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/new_op.cc:54
#7 0x400d1b16:0x3ffb27d0 in __gnu_cxx::new_allocator<std::vector<float, std::allocator<float> > >::allocate(unsigned int, void const*) at c:\users\d.fluhr\.platformio\packages\toolchain-xtensa-esp32\xtensa-esp32-elf\include\c++\8.4.0\ext/new_allocator.h:111
(inlined by) std::allocator_traits<std::allocator<std::vector<float, std::allocator<float> > > >::allocate(std::allocator<std::vector<float, std::allocator<float> > >&, unsigned int) at c:\users\d.fluhr\.platformio\packages\toolchain-xtensa-esp32\xtensa-esp32-elf\include\c++\8.4.0\bits/alloc_traits.h:436
(inlined by) std::_Vector_base<std::vector<float, std::allocator<float> >, std::allocator<std::vector<float, std::allocator<float> > > >::_M_allocate(unsigned int) at c:\users\d.fluhr\.platformio\packages\toolchain-xtensa-esp32\xtensa-esp32-elf\include\c++\8.4.0\bits/stl_vector.h:296
(inlined by) std::vector<float, std::allocator<float> >* std::vector<std::vector<float, std::allocator<float> >, std::allocator<std::vector<float, std::allocator<float> > > >::_M_allocate_and_copy<std::move_iterator<std::vector<float, std::allocator<float> >*> >(unsigned int, std::move_iterator<std::vector<float, std::allocator<float> >*>, std::move_iterator<std::vector<float, std::allocator<float> >*>) at c:\users\d.fluhr\.platformio\packages\toolchain-xtensa-esp32\xtensa-esp32-elf\include\c++\8.4.0\bits/stl_vector.h:1398
(inlined by) std::vector<std::vector<float, std::allocator<float> >, std::allocator<std::vector<float, std::allocator<float> > > >::reserve(unsigned int) at c:\users\d.fluhr\.platformio\packages\toolchain-xtensa-esp32\xtensa-esp32-elf\include\c++\8.4.0\bits/vector.tcc:74
#8 0x400d1d95:0x3ffb27f0 in setup() at src/main.cc:104
#9 0x40129fe7:0x3ffb2820 in loopTask(void*) at C:/Users/d.fluhr/.platformio/packages/framework-arduinoespressif32/cores/esp32/main.cpp:42
Conclusion
I don't understand this output very well, but I can see that the crash is triggered by my reserve call.
What could be the reason for this crash?
Code
Unfortunately the TensorFlow library is very big, so I only present my main.cc here.
I reduced the main loop but kept the setup part as it is, to make it easier to see what is going on in the background.
/* Copyright 2020 The TensorFlow Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/system_setup.h"
#include "tensorflow/lite/schema/schema_generated.h"
// #include "main_functions.h"
#include "model.h"
#include "constants.h"
#include "output_handler.h"
// additional libraries by Daniel
#include "Arduino.h"
#include <chrono>
#include <iostream>
#include <vector>
#include "input_data.h"
//setting timer
using namespace std::chrono;
unsigned long interval = 2000;
auto t_0 = high_resolution_clock::from_time_t(0);
auto now = high_resolution_clock::now();
auto previousMillis = duration_cast<milliseconds>(now - t_0).count();
bool debug_flag = true;
// Globals, used for compatibility with Arduino-style sketches.
namespace {
const tflite::Model* model = nullptr;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
TfLiteTensor* output = nullptr;
int inference_count = 0;
// increase if esp spits out "Failed to resize buffer"
constexpr int kTensorArenaSize = 55000;
uint8_t tensor_arena[kTensorArenaSize];
} // namespace
// initialise test vector
std::vector<std::vector<float>> testing_vector;
// The name of this function is important for Arduino compatibility.
void setup() {
// Map the model into a usable data structure. This doesn't involve any
// copying or parsing, it's a very lightweight operation.
model = tflite::GetModel(g_model);
if (model->version() != TFLITE_SCHEMA_VERSION) {
MicroPrintf("Model provided is schema version %d not equal to supported "
"version %d.", model->version(), TFLITE_SCHEMA_VERSION);
return;
}
// This pulls in all the operation implementations we need.
// NOLINTNEXTLINE(runtime-global-variables)
static tflite::AllOpsResolver resolver;
// Build an interpreter to run the model with.
static tflite::MicroInterpreter static_interpreter(
model, resolver, tensor_arena, kTensorArenaSize);
interpreter = &static_interpreter;
// Allocate memory from the tensor_arena for the model's tensors.
TfLiteStatus allocate_status = interpreter->AllocateTensors();
if (allocate_status != kTfLiteOk) {
MicroPrintf("AllocateTensors() failed");
return;
}
// Obtain pointers to the model's input and output tensors.
input = interpreter->input(0);
output = interpreter->output(0);
// Keep track of how many inferences we have performed.
inference_count = 0;
//debugg information
std::cout << "Total heap: " << ESP.getHeapSize() << "\n";
std::cout << "Free heap: " << ESP.getFreeHeap() << "\n";
std::cout << "Total PSRAM: " << ESP.getPsramSize() << "\n";
std::cout << "Free PSRAM: " << ESP.getFreePsram() << "\n";
std::cout << "Max Alloc Heap: " << ESP.getMaxAllocHeap() << "\n";
std::cout << "Max Vector Size: " << testing_vector.max_size() << "\n";
// reserve memory for vector
testing_vector.reserve(4700);
}
// The name of this function is important for Arduino compatibility.
void loop() {
// setting up timer for test ouput
now = high_resolution_clock::now();
auto mseconds = duration_cast<milliseconds>(now - t_0).count();
if (mseconds- interval > previousMillis){
std::cout << "test ouput: I am in main loop";
previousMillis = duration_cast<milliseconds>(now - t_0).count();
}
}
Further Information:
I have a partition table and have already tried configuring different sizes.
Partition Table
[partition table image]
Hardware
ESP32-S (NODEMCU-32)
SPI Flash 32Mbit
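Frames #4 to #7 of the backtrace show reserve calling operator new, which throws, and the uncaught exception ending in std::terminate and abort. A minimal sketch of a guarded reserve is shown below. It assumes C++ exceptions are enabled in the build, which may not be the case for every Arduino-ESP32 configuration, and try_reserve is a hypothetical helper name. It does not explain why the allocation fails; it only turns the abort into a result that can be checked.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Attempts v.reserve(n) and reports failure instead of terminating.
// Returns true if the capacity is now at least n, false otherwise.
template <typename T>
bool try_reserve(std::vector<T>& v, std::size_t n) {
    try {
        v.reserve(n);
        return true;
    } catch (const std::bad_alloc&) {
        // The allocator could not provide one contiguous block.
        return false;
    } catch (const std::length_error&) {
        // n exceeds v.max_size().
        return false;
    }
}
```

In setup(), the call would become something like if (!try_reserve(testing_vector, 4700)) followed by printing the heap figures again, so the values right before the failing allocation are visible.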

ITK image allocation and sysmalloc

I have inherited old code and am trying to run it. As part of this, an image is generated through ITK (which has been built and installed on the system).
The (truncated) function currently causing the issue is the following:
void PrintDensityImage(std::vector<float> *HU, imageDimensions dimensions, std::string nameFile)
{
ImageType::Pointer image = ImageType::New();
ImageType::RegionType region;
ImageType::IndexType start;
start[0] = 0;
start[1] = 0;
start[2] = 0;
ImageType::SizeType size;
size[0] = 512;//dimensions.nbVoxel.x;
size[1] = 512;//dimensions.nbVoxel.y;
size[2] = 8;//dimensions.nbVoxel.z;
ImageType::SpacingType inputSpacing;
inputSpacing[0] = 0.9;//dimensions.voxelSize.x;
inputSpacing[1] = 0.9;//dimensions.voxelSize.y;
inputSpacing[2] = 1.1;//dimensions.voxelSize.z;
std::cout << inputSpacing << endl;
std::cout << size << " " << start << " " << region << endl;
region.SetSize(size);
region.SetIndex(start);
image->SetRegions(region);
image->SetSpacing(inputSpacing);
printf("I hit here...\n");
std::cout << region << endl;
image->Allocate();
printf("But I do not get here\n");
ImageType::IndexType pixelIndex;
.........
}
And the header includes
#include <itkImageFileReader.h>
#include <itkImageFileWriter.h>
#include <itkHDF5ImageIO.h>
#include "itkGDCMImageIO.h"
#include "itkGDCMSeriesFileNames.h"
#include "itkNumericSeriesFileNames.h"
#include "itkImageSeriesReader.h"
typedef itk::Image< float, 3 > ImageType;
The current console output is
[0.9, 0.9, 1.1]
[512, 512, 8] [0, 0, 0] ImageRegion (0x7ffd0e7a6910)
Dimension: 3
Index: [0, 0, 0]
Size: [0, 0, 0]
I hit here...
ImageRegion (0x7ffd0e7a6910)
Dimension: 3
Index: [0, 0, 0]
Size: [512, 512, 8]
Followed by the error
CT_GPUMCD: malloc.c:2379: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Aborted (core dumped)
I am not certain what causes this error, but it seems to stem from the image->Allocate() line, and I can't quite grasp why. As far as I can tell from the ITK docs (https://itk.org/ITKSoftwareGuide/html/Book1/ITKSoftwareGuide-Book1ch4.html), this should be fine.
Any insight into the matter would be greatly appreciated, as I really don't see what the issue is here.

The error comes from malloc.c, so from the C run-time library. Are you using some experimental or beta version of the compiler? Or some modified CRT? Or some software which replaces malloc with its own version (e.g. to track memory leaks)? I doubt this has much to do with ITK. What happens if you replace image->Allocate(); with float * p = new float[512*512*8];?
For reference, Allocate boils down to new T[size].
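The experiment suggested above can be written as a small standalone check. This is a sketch only: can_allocate_floats is a hypothetical helper, and it tests a plain allocation of the same size, not ITK itself. An assertion inside sysmalloc often indicates that heap metadata was already corrupted by an earlier out-of-bounds write, so if this check passes in isolation but fails when called just before image->Allocate(), the code that runs earlier is worth inspecting with a tool such as Valgrind or AddressSanitizer.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Tries to allocate `count` floats the same way new T[size] would,
// but without throwing. Returns true if the allocation succeeded.
bool can_allocate_floats(std::size_t count) {
    if (count == 0) {
        return true;  // nothing to allocate
    }
    std::unique_ptr<float[]> buffer(new (std::nothrow) float[count]);
    if (!buffer) {
        return false;
    }
    // Touch both ends so the memory is actually used.
    buffer[0] = 0.0f;
    buffer[count - 1] = 0.0f;
    return true;
}
```

For the image in the question the call would be can_allocate_floats(std::size_t{512} * 512 * 8), which requests 8 MiB.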

Unhandled exception at the end of thread

I'm trying to use a thread to read some information about two images using OpenCV 2.3.1. I can retrieve the sizes of the two images without a problem, but an error occurs when I reach the end of the thread.
Unhandled exception at 0x00348A56 in OpenCvDemo.exe: An invalid parameter was passed to a function that considers invalid parameters fatal.
Since I can read the size of the images just fine I know they aren't empty or anything. In another test I detached the thread and put it to Sleep for 5 seconds, the main function executed fine and displayed the image, but once the thread woke up and reached the end it crashed the application and displayed the same error as above.
Sadly I need to use OpenCV 2.3.1 since this is just a test for a bigger application.
How can I solve this problem, or at least what is the cause of it?
#include "stdafx.h"
#include <windows.h>
#include <iostream>
#include <thread>
#include <opencv2/opencv.hpp>
void Loop(cv::Mat &image1, cv::Mat &image2) {
std::vector<uchar> buff1;
std::vector<uchar> buff2;
cv::imencode(".png", image1, buff1);
cv::imencode(".png", image2, buff2);
int imageSize1 = buff1.size();
int imageSize2 = buff2.size();
std::cout << "Size " << imageSize1 << "\n";
std::cout << "Size " << imageSize2 << "\n";
}
int main()
{
cv::Mat image1 = cv::imread("C:\\Users\\Cesar\\Pictures\\happyface.png", 1);
cv::Mat image2 = cv::imread("C:\\Users\\Cesar\\Pictures\\happyface.png", 1);
std::thread mythread(Loop, std::ref(image1), std::ref(image2));
mythread.join();
cv::imshow("name", image1);
cv::waitKey(0);
return 0;
}
The callstack
OpenCvDemo.exe!_invoke_watson(const wchar_t * expression, const wchar_t * function_name, const wchar_t * file_name, unsigned int line_number, unsigned int reserved) Line 224 C++
OpenCvDemo.exe!_invalid_parameter(const wchar_t * expression, const wchar_t * function_name, const wchar_t * file_name, unsigned int line_number, unsigned int reserved) Line 113 C++
[External Code]
OpenCvDemo.exe!Loop(cv::Mat & image1, cv::Mat & image2) Line 23 C++
[External Code]
OpenCvDemo.exe!invoke_thread_procedure(unsigned int(__stdcall*)(void *) procedure, void * const context) Line 92 C++
OpenCvDemo.exe!thread_start<unsigned int (__stdcall*)(void *)>(void * const parameter) Line 115 C++
[External Code]
[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]

valgrind reporting invalid read with std::string

I'm working on code that runs on a Raspberry Pi 3 and got the following error in my logging classes.
==1297== Invalid read of size 8
==1297== at 0x4865D1C: ??? (in /usr/lib/arm-linux-gnueabihf/libarmmem.so)
==1297== Address 0x4c8d45c is 100 bytes inside a block of size 107 alloc'd
==1297== at 0x4847DA4: operator new(unsigned int) (vg_replace_malloc.c:328)
==1297== by 0x49C3D9B: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned int) (in /usr/lib/arm-linux-gnueabihf/libstdc++.so.6.0.22)
==1297== by 0x4AE65: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (basic_string.tcc:1155)
==1297== by 0xF82B5: Log::Book::addField(std::unique_ptr<Log::Entry, std::default_delete<Log::Entry> >&, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (LogBook.cpp:149)
==1297== by 0xF7CCB: Log::Book::record(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long long, std::ratio<1ll, 1000000000ll> > >) (LogBook.cpp:87)
GCC version: gcc version 6.3.0 20170516 (Raspbian 6.3.0-18+rpi1+deb9u1)
valgrind version: valgrind-3.13.0
I can't seem to locate the problem, since the function Log::Book::record() gets its arguments by value. The error isn't shown every time the function is called, but it is deterministic in the sense that it always shows up for the same calling lines and never for the others. Can anybody point me in the direction of what this problem is and how to solve it? A code snippet is below, with comments marking the indicated lines.
/** log message */
void Book::record(std::string file, const int line, const unsigned int level, Identifier id, const std::string message,
const std::chrono::high_resolution_clock::time_point timeStamp)
{
if (!(fileLevels & level) && !(consoleLevels & level)) { return; }
auto now = Time::keeper->now();
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(timeStamp - Time::globalEpoch);
//generate message
auto entry = std::make_unique<Entry>(level);
// Time since startup
addField(entry, 0, std::to_string(duration.count()));
//UTC Time
addField(entry, 1, now.dateTime());
// File
std::string stringFile;
if (!file.empty())
{
stringFile = URL{file}.lastPathComponent();
}
addField(entry, 2, stringFile);
//Line number
addField(entry, 3, std::to_string(line));
//ID
addField(entry, 4, id);
//Message
std::string stringMessage;
if(!message.empty())
{
addField(entry, 5, message); //this is line LogBook.cpp:87
}
else
{
addField(entry, 5, " empty message.");
}
*entry << ";";
//queue message
this->append(std::move(entry));
}
void Book::addField(std::unique_ptr<Entry> &entry, unsigned int index, const std::string &text)
{
std::string textOutput;
if ((spacings.at(index) != 0) && (text.length() > (spacings.at(index) - 1)))
{
spacings.at(index) = (uint8_t) (text.length() + 2);
}
entry->setWidth(spacings.at(index));
if(entry->empty())
textOutput = text;
else
textOutput = ";" + text; //This is line LogBook.cpp:149
if(!textOutput.empty())
(*entry) << textOutput;
}
The code where this function gets called and this problem occurs.
auto node = child(items, "item", index);
auto enabled = boolValue(node, "enabled", false);
auto file = pathValue(node, key::path);
auto name = stringValue(node, "name", "");
auto type = stringValue(node, "type");
CLOG(CLOG::WARNING, "Yard item " + name + " not enabled, path:" + file.path());
Update 1:
I compile with CMake using the options below, and I added the extra options to disable vectorization. These didn't solve the problem.
add_compile_options(-ggdb)
add_compile_options(-O1)
#Extra disable vectorization
add_compile_options(-fno-tree-vectorize)
add_compile_options(-fno-tree-loop-vectorize)
add_compile_options(-fno-tree-slp-vectorize)
Update 2:
I've found another place where string concatenation is used and Valgrind reports the same errors.
Update 3:
Some time and interesting discoveries later.
The error happens in the shared library libarmmem.so. It is loaded dynamically and for this reason always sits at a different address. I used a combination of gdb and Valgrind to break when the error happens.
gdb loaded shared libraries with starting address.
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x0483246c 0x04832750 Yes /usr/local/lib/valgrind/vgpreload_core-arm-linux.so
0x04846e60 0x04850c10 Yes /usr/local/lib/valgrind/vgpreload_memcheck-arm-linux.so
0x04863588 0x048672fc Yes (*) /usr/lib/arm-linux-gnueabihf/libarmmem.so
...
Error reported by valgrind.
==9442== Invalid read of size 8
==9442== at 0x4865D34: ??? (in /usr/lib/arm-linux-gnueabi/libarmmem.so)
We know from readelf on libarmmem.so that the .text section begins at 0x588 and that memcpy sits at 0x710. The disassembly at this breakpoint shows we are in memcpy, which starts at address 0x04863710. Checking the offsets: 0x04863710 - 0x04863588 = 0x188, and 0x188 + 0x588 (starting address of .text) = 0x710.
The disassembly shows that it happens on the following instruction. vldmia is an instruction that loads vector floating-point registers from memory.
0x04865d34 <+9764>: vldmia r1!, {d9}
No solution yet.

Most probably the code inside libarmmem.so has been vectorized in such a way that it notices the terminating char only after reading a full 8-byte chunk. This will not trigger a processor exception (as the algorithm ensures that the pointer is aligned and the read thus stays in the same page), but it will cause tools like Valgrind to report false positives.
Problems like this are getting worse over time and are making Valgrind less useful in practice. See Valgrind vs Optimising Compilers for an in-depth discussion, or this bug in diff for a real-world example (or my Debian suppression list for even more examples).
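The numbers in the Valgrind report fit this explanation: an 8-byte read that starts 100 bytes into a 107-byte block covers offsets 100 to 107, so only the last byte lies outside the block. A minimal sketch of that arithmetic, with bytes_past_end as a hypothetical helper name:

```cpp
#include <cstddef>

// Number of bytes a read of `read_size` bytes starting at `offset`
// extends beyond the end of a block of `block_size` bytes.
constexpr std::size_t bytes_past_end(std::size_t block_size,
                                     std::size_t offset,
                                     std::size_t read_size) {
    return (offset + read_size > block_size)
               ? (offset + read_size - block_size)
               : 0;
}
```

With the values from the report, bytes_past_end(107, 100, 8) is 1, which matches a chunked copy that reads one byte more than the string buffer holds rather than a genuinely stray pointer.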

C++ unexpected change of members data

One of my classes has a problem: its data members can change unexpectedly.
I have read similar questions on SO, and it seems to be related to undefined behavior or a pointer problem. But even with a very simple version of my code, I still have it:
aid.cpp:
#include "aid.h"
bool AID::Detect(t_arr3d x, t_arr3d x_p1, t_arr3d x_p2, t_arr3d x_p3, t_arr3d x_p4, int fp) {
return false;
}
AID::AID() {
this->counter = 0;
maxErrorBound = 0.1;
maxErrorBound2 = 0.02; // = maxErrorBound * lambda
}
aid.h
#ifndef AID_H_
#define AID_H_
#include "detector.h"
#include "vec.h"
#include <map>
#include <vector>
#include <limits>
#include <cstddef>
#include <boost/assign/list_of.hpp>
#include <boost/unordered_map.hpp>
#include "constants.h"
using namespace rode;
using boost::assign::map_list_of;
using namespace std;
class AID: public Detector {
public:
bool Detect(t_arr3d x, t_arr3d x_p1, t_arr3d x_p2, t_arr3d x_p3, t_arr3d x_p4, int fp);
AID();
private:
int counter;
float maxErrorBound ;
float maxErrorBound2;
};
#endif
The class is called in another one (rode.cpp):
...
if(a_detector == "AID"){
AID d = AID( );
this->aid = &d;
}
...
With LLDB, I have put a watchpoint to check what is going on:
Watchpoint 1 hit:
old value: 0
new value: 1606405696
Process 38408 stopped
* thread #1: tid = 0x74cd2, 0x00007fff5fc12171 dyld`ImageLoaderMachO::findExportedSymbol(char const*, bool, ImageLoader const**) const + 13, queue = 'com.apple.main-thread', stop reason = watchpoint 1
frame #0: 0x00007fff5fc12171 dyld`ImageLoaderMachO::findExportedSymbol(char const*, bool, ImageLoader const**) const + 13
dyld`ImageLoaderMachO::findExportedSymbol:
-> 0x7fff5fc12171 <+13>: pushq %rax
0x7fff5fc12172 <+14>: movq %rcx, %r14
0x7fff5fc12175 <+17>: movl %edx, -0x2c(%rbp)
0x7fff5fc12178 <+20>: movq %rsi, %r15
(lldb) bt
* thread #1: tid = 0x74cd2, 0x00007fff5fc12171 dyld`ImageLoaderMachO::findExportedSymbol(char const*, bool, ImageLoader const**) const + 13, queue = 'com.apple.main-thread', stop reason = watchpoint 1
* frame #0: 0x00007fff5fc12171 dyld`ImageLoaderMachO::findExportedSymbol(char const*, bool, ImageLoader const**) const + 13
frame #1: 0x00007fff5fc184f6 dyld`ImageLoaderMachOCompressed::resolveTwolevel(ImageLoader::LinkContext const&, ImageLoader const*, bool, char const*, bool, ImageLoader const**) + 86
frame #2: 0x00007fff5fc18784 dyld`ImageLoaderMachOCompressed::resolve(ImageLoader::LinkContext const&, char const*, unsigned char, long, ImageLoader const**, ImageLoaderMachOCompressed::LastLookup*, bool) + 276
frame #3: 0x00007fff5fc1a09b dyld`ImageLoaderMachOCompressed::doBindFastLazySymbol(unsigned int, ImageLoader::LinkContext const&, void (*)(), void (*)()) + 235
frame #4: 0x00007fff5fc0424e dyld`dyld::fastBindLazySymbol(ImageLoader**, unsigned long) + 90
frame #5: 0x00007fff9610b3ba libdyld.dylib`dyld_stub_binder + 282
frame #6: 0x000000010004f268 wrf2sl`GCC_except_table678 + 3660
frame #7: 0x00000001000270f9 wrf2sl`main(argc=15, argv=0x00007fff5fbffaa0) + 21337 at wrf2sl.cc:170
frame #8: 0x00007fff9610d5c9 libdyld.dylib`start + 1
wrf2sl is my program, but the rest of the backtrace is not related to it.
Have you ever seen a similar problem?
What should I check to understand what is going on?

The problem is here
if(a_detector == "AID"){
AID d = AID( );
this->aid = &d;
}
Here you create a local variable in the scope of the if body, and it's local only inside there. Then you store a pointer to that local variable, a pointer to an object that is destructed once the if statement is done. That will lead to undefined behavior when you try to dereference the pointer to a non-existent object.
My advice is to not use pointers to start with, and instead store the object as a value (i.e. an actual instance of the AID class). If you must use pointers, then allocate it dynamically with new, and remember to delete it when you're done with it (or optionally depending on use-case use a smart pointer).
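The smart-pointer variant of this advice can be sketched as follows. The classes here are simplified stand-ins: Detector and AID only mirror the names in the question, and DetectorOwner is a hypothetical replacement for the class in rode.cpp. The sketch assumes C++14 or later for std::make_unique.

```cpp
#include <memory>
#include <string>

// Simplified stand-in for the question's base class.
struct Detector {
    virtual ~Detector() = default;
    virtual int counter() const = 0;
};

// Simplified stand-in for the question's AID class.
struct AID : Detector {
    int counter() const override { return counter_; }

private:
    int counter_ = 0;
    float maxErrorBound_ = 0.1f;
    float maxErrorBound2_ = 0.02f;
};

// Hypothetical owner class: it holds the detector itself instead of a
// pointer to a local variable, so the object lives as long as the owner.
class DetectorOwner {
public:
    explicit DetectorOwner(const std::string& a_detector) {
        if (a_detector == "AID") {
            detector_ = std::make_unique<AID>();  // outlives the if block
        }
    }

    // Non-owning access; returns nullptr if no detector was selected.
    const Detector* detector() const { return detector_.get(); }

private:
    std::unique_ptr<Detector> detector_;
};
```

Because the unique_ptr is a member, the AID object is destroyed only when the owner is destroyed, and no explicit delete is needed. If only one detector type is ever used, storing an AID member by value is simpler still.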
