I have a class A as follows:
class A
{
public:
A()
{
printf("A constructed\n");
}
~A();
//no other constructors/assignment operators
};
I have the following elsewhere
A * _a;
I initialize it with:
int count = ...
...
_a = new A[count];
and I access it with
int key = ....
...
A *a_inst = &(_a[key]);
....
It runs normally, and the printf in the constructor is executed, and all the fields in A are fine.
I ran Valgrind with the following args:
valgrind --leak-check=full --show-reachable=yes --track-origins=yes -v ./A_app
and Valgrind keeps yelling about
Conditional jump or move depends on uninitialised value(s)
followed by a stack trace pointing at the access statements.
Can anyone explain why this is happening? Specifically if what Valgrind says is true, why is the constructor executed?
This can mean that key or count contains an uninitialized value. Even if you do initialize it in the declaration, e.g. int key = foo + bar;, it could be that either foo or bar is uninitialized, and valgrind carries this over to key.
Edit: Try setting A *a = 0;
Running your code in a simplified scenario does not produce any warnings from Valgrind. Consider the following code:
#include <iostream>
class A
{
public:
A()
{
std::cout << "A" << std::endl;
}
};
int main()
{
A *a;
int count = 10;
a = new A[count];
int key = 1;
A *inst = &(a[key]);
return 0;
}
Compiled with:
$ g++ -g main.cc -o main
and run with:
$ valgrind --leak-check=full --show-reachable=yes --track-origins=yes ./main
So, I think more information is needed. You are likely doing something between defining _a and actually allocating memory on the heap. Might I simply suggest that you merge the definition and allocation into one line?
int count = 10;
A *a = new A[count];
I was tinkering with the example given on the cppreference launder web page.
The example shown below suggests that either I misunderstood something and introduced UB, or there is a bug somewhere, or clang is too lax or too good.
In doit1(), I believe the optimization done by GCC is incorrect (the function returns 2) and does not take into account the fact that we use the placement new return value.
In doit2(), I believe the code is also legal, but with GCC no code is produced?
In both situations, clang provides the behavior I expect. On GCC, it will depend on the optimization level. I tried GCC 12.1 but this is not the only GCC version showing this behavior.
#include <new>
struct A {
virtual A* transmogrify(int& i);
};
struct B : A {
A* transmogrify(int& i) override {
i = 2;
return new (this) A;
}
};
A* A::transmogrify(int& i) {
i = 1;
return new (this) B;
}
static_assert(sizeof(B) == sizeof(A), "");
int doit1() {
A i;
int n;
int m;
A* b_ptr = i.transmogrify(n);
// std::launder(&i)->transmogrify(m); // OK, launder is NOT redundant
// std::launder(b_ptr)->transmogrify(m); // OK, launder IS redundant
(b_ptr)->transmogrify(m); // KO, launder IS redundant, we use the return value of placement new
return m + n; // 3 expected, OK == 3, else KO
}
int doit2() {
A i;
int n;
int m;
A* b_ptr = i.transmogrify(n);
// b_ptr->transmogrify(m); // KO, as shown in doit1
static_cast<B*>(b_ptr)->transmogrify(m); // VERY KO see the ASM, but we really do have a B in the memory pointed by b_ptr
return m + n; // 3 expected, OK == 3, else KO
}
int main() {
return doit1();
// return doit2();
}
Code available at: https://godbolt.org/z/43ebKf1q6
The UB comes from accessing A i; to call the destructor at the end of scope without laundering the pointer. That can let the compiler assume i has not been destroyed by the storage reuse before then.
You need something more like:
alignas(B) std::byte storage[sizeof(B)];
A& i = *new (storage) A;
// ...
static_cast<B*>(std::launder(&i))->~B();
// or: b_ptr->~B();
// or: simply don't call the destructor
I get more debug info when my program is built on Windows than when it is built on Linux.
Here is my code:
#include <iostream>
#include <vector>
using namespace std;
class Base
{
public:
Base() = default;
virtual ~Base() = default;
};
class Derived : public Base
{
public:
Derived() = default;
~Derived() = default;
private:
int i = 1;
};
int main()
{
vector<Base*> a;
a.push_back(new Derived());
return 0;
}
Build on Linux: [debugger screenshot not preserved]
Build on Windows: [debugger screenshot not preserved]
The Windows build clearly gives me more information: the vector contents, the real type of each vector element, the derived object's members, and so on. With the Linux build I only get the pointer address. Both are being debugged from Visual Studio. Is there a way to add more debug info to the program built by the GNU compiler, for example with compiler flags?
I don't use Visual Studio, so I'm not sure how you would apply this answer from within that environment. However, I think you can get what you want with the GDB setting: set print object on.
To show this in action, here's my GDB (12.1) session, using the same test program that you posted:
$ gdb -q vec.x
Reading symbols from vec.x...
(gdb) b 26
Breakpoint 1 at 0x401245: file vec.cc, line 26.
(gdb) r
Starting program: /tmp/vec.x
Breakpoint 1, main () at vec.cc:26
26 return 0;
(gdb) p a
$1 = std::vector of length 1, capacity 1 = {0x418eb0}
(gdb) p a[0]
$2 = (Base *) 0x418eb0
(gdb) p *a[0]
$3 = {
_vptr.Base = 0x403040 <vtable for Derived+16>
}
(gdb) set print object on
(gdb) p a[0]
$4 = (Derived *) 0x418eb0
(gdb) p *a[0]
$5 = (Derived) {
<Base> = {
_vptr.Base = 0x403040 <vtable for Derived+16>
},
members of Derived:
i = 1
}
(gdb) q
I fixed this problem by adding an init command to ~/.gdbinit.
Add the following command as the first line of ~/.gdbinit:
set print object on
My goal is to 'fill' a class that resides in device memory from the host. Since that class contains a pointer to data, my understanding is that, after allocating the class itself, I need to allocate space for the data separately and then point the device class's pointer at the newly allocated memory.
I've tried to orient my solution according to this post which, in my eyes, seems to do exactly what I want, however I am doing something wrong and thus would like help.
I have the following setup of classes and relevant code:
class A {
public:
HostB host_B;
B *dev_B;
void moveBToGPU();
}
class HostB {
public:
vector<int> info;
}
class B {
public:
int *info;
}
void A::moveBToGPU() {
cudaMalloc(this->dev_B, sizeof(B));
int* dev_data;
cudaMalloc(&dev_data, sizeof(int) * host_B->info.size());
cudaMemcpy(&this->dev_B->info, &dev_data, sizeof(int *), cudaMemcpyHostToDevice); //Not sure if correct
//I would like to do the following, but that results in a segfault
cudaMemcpy(this->dev_B->info, host_B->info.data(), host_B->info.size(), cudaMemcpyHostToDevice);
//As expected, this works
cudaMemcpy(dev_data, host_B->info.data(), host_B->info.size(), cudaMemcpyHostToDevice;
Just get rid of the line causing the segfault. The line that comes after it does what you want, correctly. The segfault arises because this->dev_B->info requires dereferencing a device pointer in host code (illegal), whereas dev_data does not. Also note that you probably want to multiply host_B->info.size() by sizeof(int), as you did with cudaMalloc.
Here is an example. Your posted code could not compile, it had numerous compile errors (in moveBToGPU). I'm not going to try and list every compile error. Please study the example below for the changes:
$ cat t1676.cu
#include <cstdio>
#include <vector>
using namespace std;
class HostB {
public:
vector<int> info;
};
class B {
public:
int *info;
};
class A {
public:
HostB host_B;
B *dev_B;
void moveBToGPU();
};
__global__ void k(A a){
printf("%d\n",a.dev_B->info[0]);
}
void A::moveBToGPU() {
cudaMalloc(&dev_B, sizeof(B));
int* dev_data;
cudaMalloc(&dev_data, sizeof(int) * host_B.info.size());
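cudaMemcpy(&dev_B->info, &dev_data, sizeof(int *), cudaMemcpyHostToDevice); // store the device data pointer in the device-side B::info (only an address is computed on the host, nothing is dereferenced)
// copy the payload through the host-visible pointer dev_data, with the size in bytes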
cudaMemcpy(dev_data, host_B.info.data(), sizeof(int)*host_B.info.size(), cudaMemcpyHostToDevice);
}
int main(){
A a;
a.host_B.info.push_back(12);
a.moveBToGPU();
k<<<1,1>>>(a);
cudaDeviceSynchronize();
}
$ nvcc -o t1676 t1676.cu
$ cuda-memcheck ./t1676
========= CUDA-MEMCHECK
12
========= ERROR SUMMARY: 0 errors
$
I wrote the following benchmark to estimate the overhead of virtual functions:
struct A{
int i = 0 ;
virtual void inc() __attribute__((noinline));
};
#ifdef VIRT
struct B : public A{
void inc() override __attribute__((noinline));
};
void A::inc() { }
void B::inc() { i++; }
#else
void A::inc() { i++; }
#endif
int main(){
#ifdef VIRT
B b;
A* p = &b;
#else
A a;
A* p = &a;
#endif
for( ;p->i < IT; p->inc()) {; }
return 0;
}
I compile it with
G=$((1000**3))
g++ -O1 -DIT=$((1*G)) -DVIRT virt.cc -o virt
g++ -O1 -DIT=$((1*G)) virt.cc -o nonvirt
And the results I got were that nonvirt was about 0.6ns slower than virt per function call at -O1 and about 0.3ns slower than virt at -O2 per function call.
How is this possible? I thought virtual functions were supposed to be slower.
First, just because you invoke a method through a pointer doesn't mean the compiler can't figure out the target type and make the call non-virtual. Plus, your program does nothing else, so everything will be well-predicted and in cache. Finally, a difference of 0.3 ns is one cycle, which is hardly worth noting. If you really want to dig into it, you could inspect the assembly code for each case on whatever your platform is.
On my system (Clang, OS X, old Macbook Air), the virtual case is a little slower, but it's hardly measurable with -O1 (e.g. 3.7 vs 3.6 seconds for non-virtual). And with -O2 there's no difference I can distinguish.
EDIT: Has been corrected
Your main is wrong: the for loop is defined twice in one case and once in the other. This should not impact performance, though, since the second time the loop exits immediately.
Correct it like this:
int main(){
#ifdef VIRT
B b;
A* p = &b;
/* removed this for loop */
#else
A a;
A* p = &a;
#endif
for( ;p->i < IT; p->inc()) {; }
return 0;
}
This problem only occurs with g++ 4.8.2 for ARMv6 (stock pidora); it compiles without error or warning on x86_64 w/ clang 3.4.2 and g++ 4.8.3. I am having a hard time not seeing it as a compiler bug, but wanted to get some other opinions.
It involves a simple member variable that g++ keeps insisting is an array, reporting:
error: array must be initialized with a brace-enclosed initializer
The header for the class looks like this:
namespace SystemStateMonitor {
class ramInput : public input, public inputFile {
public:
typedef enum {
RATIO,
PERCENT,
KiBYTES,
MiBYTES
} style_t;
ramInput (
const std::string &label = "RAM",
style_t style = style_t::PERCENT
);
unsigned int getAvailable ();
double getDV ();
double ratio ();
protected:
style_t style;
unsigned int available;
void setStyle (style_t);
friend input* jsonInputRAM (jsonObject);
};
}
The constructor looks like this:
#define PROC_FILE "/proc/meminfo"
using namespace std;
using namespace SystemStateMonitor;
ramInput::ramInput (
const string &label,
ramInput::style_t s
) :
input (label),
inputFile (PROC_FILE),
style (s),
available (0)
{
setStyle(style);
}
And when I compile this with the ARMv6 g++, I get:
inputs/ramInput.cpp:19:14: error: array must be initialized with a brace-enclosed initializer
available (0)
^
The superclasses do not have any member "available"; there is no potential weird collision. Interestingly, if I then modify the constructor:
) :
input (label),
inputFile (PROC_FILE),
style (s)
{
available = 0;
setStyle(style);
}
I now get the same error for style (s). If I then do the same thing with style (move initialization into the body), I get the error for inputFile (PROC_FILE), which is even more bizarre because that's a super constructor call.
inputs/ramInput.cpp:17:22: error: array must be initialized with a brace-enclosed initializer
inputFile (PROC_FILE)
^
Unfortunately but not surprisingly, an SSCCE starting with this:
class test {
public:
test () : x(0) { };
unsigned int x;
};
Does not reproduce the problem.
What could be going wrong here? Am I right to believe this isn't my fault?
As Mike Seymour points out in comments on the question, the cascading nature of the error indicated the compiler was simply pointing to the end of the problem initialization list, rather than the correct entry. The array turned out to be in a superclass constructor default initializer:
std::array<double,2> range = { 0 }
That particular g++ chokes on that, and:
std::array<double,2> range = { 0.0, 0.0 }
But:
std::array<double,2> range = { }
Works. Good thing I didn't want any non-zero values there...