Casting/dereferencing member variable pointer from void*, is this safe? - c++

I had a problem while hacking on a bigger project, so I made a simple test case. Unless I'm omitting something, my test code works fine, but maybe it only works by accident, so I wanted to show it to you and ask whether there are any pitfalls in this approach.
I have an OutObj which has a member variable (pointer) InObj. InObj has a member function. I send the address of this member variable object (InObj) to a callback function as void*. The type of this object never changes, so inside the callback I cast it back to its original type and call its aFunc member function. In this example it works as expected, but in the project I'm working on it doesn't. So I might be omitting something, or maybe there is a pitfall here and this works accidentally. Any comments? Thanks a lot in advance.
(The problem I have in my original code is that InObj.data is garbage).
#include <stdio.h>
class InObj
{
public:
int data;
InObj(int argData);
void aFunc()
{
printf("Inside aFunc! data is: %d\n", data);
};
};
InObj::InObj(int argData)
{
data = argData;
}
class OutObj
{
public:
InObj* objPtr;
OutObj(int data);
~OutObj();
};
OutObj::OutObj(int data)
{
objPtr = new InObj(data);
}
OutObj::~OutObj()
{
delete objPtr;
}
void callback(void* context)
{
((InObj*)context)->aFunc();
}
int main ()
{
OutObj a(42);
callback((void*)a.objPtr);
}

Yes, this is safe.
A pointer to any object type can be converted to a pointer to void and back again to the same type.
Note that the conversion to void* is implicit, so you don't need the cast.
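A minimal sketch of that round trip, assuming a hypothetical Widget type and roundTrip helper that are not part of the question. It only illustrates that the conversion to void* is implicit and that static_cast back to the original type recovers the same object:

```cpp
#include <cassert>

// Hypothetical type standing in for InObj from the question.
struct Widget {
    int data;
    int value() const { return data; }
};

// The callback receives an untyped context pointer, as in the question.
int callback(void* context)
{
    assert(context != nullptr);
    // Cast back to exactly the type the pointer had before it became void*.
    Widget* w = static_cast<Widget*>(context);
    return w->value();
}

// Hypothetical helper performing the round trip.
int roundTrip(int n)
{
    Widget w{n};
    return callback(&w);   // Widget* converts to void* implicitly, no cast needed
}
```

The round trip is only well defined when the pointer is cast back to the type it originally had. If the real project passes some other address (for example the address of the enclosing object, or of the pointer member itself), the cast still compiles but the data read through it is garbage.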

What you have posted should be "safe" insofar as a non-type-safe operation like this can be safe. I would replace the C-style casts with static_cast, though, because static_cast does not allow conversions between unrelated pointer types. If you try to do something like that with static_cast, the compiler will tell you instead of leaving you guessing.
(Side unrelated note: InObj's constructor should use initialization rather than assignment:
InObj::InObj(int argData) : data(argData)
{
}
)

Related

Is casting an address by reinterpret_cast an undefined behaviour?

I want to find a way to encapsulate a header-only 3rd party library without exposing its header files. In our other projects, we encapsulate by using void*: in the implementation, we allocate memory and assign to it, and cast to pointer of its original type when we use it. But this time, the encapsulated class is used frequently, hence dynamic allocation is unacceptable. Here is another solution I'm currently considering.
Assuming that the encapsulated class need N bytes, I will make a char array member variable of size N in the wrapper class, named data, for instance. In the implementation, when I try to assign an object of the encapsulated class to the wrapper, or forward a function call, I need to cast &data to the pointer of encapsulated class by reinterpret_cast, firstly. The char array is completely a placeholder. To make this clear, here is a sample code.
#include <iostream>
struct Inner {
void print() const {
std::cout << "Inner::print()\n";
}
};
struct Wrapper;
Inner* addressof(Wrapper&);
const Inner* addressof(const Wrapper&);
struct Wrapper {
Wrapper() {
Inner* ptr = addressof(*this);
*ptr = Inner();
}
void run() const {
addressof(*this)->print();
}
char data[1];
};
Inner* addressof(Wrapper& w) {
return reinterpret_cast<Inner*>(&(w.data));
}
const Inner* addressof(const Wrapper& w) {
return reinterpret_cast<const Inner*>(&(w.data));
}
int main() {
Wrapper wrp;
wrp.run();
}
From the point of view of memory, this seems to make sense. But I'm not sure whether this is some kind of undefined behaviour.
Additionally, I want to know if there is a list of undefined behaviour. It seems that cppreference doesn't contain such a thing, and the C++ standard specification is really hard to understand.
What you have here is undefined behavior. The reason is when you reinterpret an object to a different type, you are not allowed to modify it until you cast it back to the original type.
In your code, you originally have the data as a char[1]. Later, in your constructor, you reinterpret_cast &data as Inner*. At this point, modifying its value produces undefined behavior.
What you could do, however, is first create an Inner object, then cast it and store it in the char[1]. Later you can cast the char[1] back to the Inner object and do anything you want with the Inner object.
So now your constructor would look like this:
Wrapper() {
Inner inner;
char* ptr = reinterpret_cast<char*>(&inner);
std::memcpy(data, ptr, 1);
}
However, if you did it like this, then you don't even need the reinterpret_cast there as you can directly memcpy from inner:
Wrapper() {
Inner inner;
std::memcpy(data, &inner, 1);
}
Better, if you have C++20, then you can and should use std::bit_cast, along with std::byte(C++17) and std::array(C++11):
struct Wrapper {
Wrapper()
: data(std::bit_cast<decltype(data)>(Inner{}))
{}
void run() const {
std::bit_cast<Inner>(data).print();
}
std::array<std::byte, 1> data;
};
Demo: https://godbolt.org/z/MaT5sasaT
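Another commonly used way to avoid dynamic allocation is to construct the object inside suitably sized and aligned storage with placement new. The following is only a sketch under assumptions: it needs C++17 for std::launder, the constants kInnerSize and kInnerAlign are hypothetical names, and here they are computed from Inner directly, whereas a header that hides Inner would have to hard-code them and verify them with static_assert in the implementation file.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Stand-in for the third-party class; in a real layout this definition
// would be visible only in the implementation file.
struct Inner {
    int value = 7;
    int get() const { return value; }
};

// Assumed constants (hypothetical names). A header that hides Inner would
// hard-code these and check them with static_assert in the .cpp file.
constexpr std::size_t kInnerSize = sizeof(Inner);
constexpr std::size_t kInnerAlign = alignof(Inner);

class Wrapper {
public:
    Wrapper() { new (storage_) Inner(); }   // start Inner's lifetime inside the buffer
    ~Wrapper() { inner().~Inner(); }        // end it explicitly
    Wrapper(const Wrapper&) = delete;       // copying is left out of this sketch
    Wrapper& operator=(const Wrapper&) = delete;

    int run() const { return inner().get(); }

private:
    Inner& inner()
    {
        return *std::launder(reinterpret_cast<Inner*>(storage_));
    }
    const Inner& inner() const
    {
        return *std::launder(reinterpret_cast<const Inner*>(storage_));
    }

    alignas(kInnerAlign) unsigned char storage_[kInnerSize];
};

static_assert(sizeof(Wrapper) >= sizeof(Inner), "buffer must hold an Inner");
```

Unlike the code in the question, an Inner object really exists in the buffer here, because placement new started its lifetime before anything is accessed through the Inner pointer.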

Why doesn't shared_ptr permit direct assignment

So when using shared_ptr<Type> you can write:
shared_ptr<Type> var(new Type());
I wonder why they didn't allow a much simpler and better (imo):
shared_ptr<Type> var = new Type();
Instead to achieve such functionality you need to use .reset():
shared_ptr<Type> var;
var.reset(new Type());
I am used to OpenCV Ptr class that is a smart pointer that allows direct assignment and everything works fine
The syntax:
shared_ptr<Type> var = new Type();
Is copy initialization. This is the type of initialization used for function arguments.
If it were allowed, you could accidentally pass a plain pointer to a function taking a smart pointer. Moreover, if during maintenance, someone changed void foo(P*) to void foo(std::shared_ptr<P>) that would compile just as fine, resulting in undefined behaviour.
Since this operation is essentially taking an ownership of a plain pointer this operation has to be done explicitly. This is why the shared_ptr constructor that takes a plain pointer is made explicit - to avoid accidental implicit conversions.
The safer and more efficient alternative is:
auto var = std::make_shared<Type>();
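The effect of the explicit constructor can be checked with standard type traits. This is a small sketch; readValue and demo are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Direct initialization is allowed: std::shared_ptr<int> p(new int(1));
static_assert(std::is_constructible<std::shared_ptr<int>, int*>::value,
              "explicit construction from a raw pointer is allowed");

// Copy initialization, and therefore argument passing, is rejected.
static_assert(!std::is_convertible<int*, std::shared_ptr<int>>::value,
              "no implicit conversion from a raw pointer");

// Hypothetical function taking a shared_ptr by value.
int readValue(std::shared_ptr<int> p) { return *p; }

int demo()
{
    auto p = std::make_shared<int>(10);  // preferred way to create the object
    int first = readValue(p);            // shares ownership; p stays valid
    assert(p.use_count() == 1);          // the copy made for the call is gone
    // readValue(new int(10));           // would not compile: the constructor is explicit
    return first + *p;
}
```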
The issue with allowing a raw pointer to be implicitly converted into a std::shared_ptr can be demonstrated with
void foo(std::shared_ptr<int> bar) { /*do something, doesn't matter what*/ }
int main()
{
int * bar = new int(10);
foo(bar);
std::cout << *bar;
}
Now if the implicit conversion worked the memory bar points to would be deleted by the shared_ptr destructor at the end of the foo(). When we go to access it in std::cout << *bar; we now have undefined behavior as we are dereferencing a deleted pointer.
In your case you create the pointer directly at the call site so it does not matter but as you can see from the example it can cause problems.
Allowing this allows you to call functions with pointer arguments directly, which is error prone because you're not necessarily aware at call site that you're creating a shared pointer from it.
void f(std::shared_ptr<int> arg);
int a;
f(&a); // bug
Even if you disregard this, you create the invisible temporary at the call site, and creating shared_ptr is quite expensive.
I wonder why they didn't allow a much simpler and better...
Your opinion will change as you become more experienced and encounter more badly written, buggy code.
shared_ptr<>, like all standard library objects, is written in such a way as to make it as difficult as possible to cause undefined behaviour (i.e. hard-to-find bugs that waste everyone's time and destroy our will to live).
consider:
#include<memory>
struct Foo {};
void do_something(std::shared_ptr<Foo> pfoo)
{
// ... some things
}
int main()
{
auto p = std::make_shared<Foo>(/* args */);
do_something(p.get());
p.reset(); // BOOM!
}
This code cannot compile, and that's a good thing. Because if it did, the program would exhibit undefined behaviour.
This is because we'd be deleting the same Foo twice.
This program will compile, and is well-formed.
#include<memory>
struct Foo {};
void do_something(std::shared_ptr<Foo> pfoo)
{
// ... some things
}
int main()
{
auto p = std::make_shared<Foo>(/* args */);
do_something(p);
p.reset(); // OK
}
Why [doesn't] shared_ptr permit direct assignment [copy initialization]?
Because it is explicit, see here and here.
I wonder what the rationale [is] behind it? (From a comment now removed)
TL;DR, making any constructor (or cast) explicit is to prevent it from participating in implicit conversion sequences.
The need for explicit is better illustrated when the shared_ptr<> is an argument to a function.
void func(std::shared_ptr<Type> arg)
{
//...
}
And called as;
Type a;
func(&a);
If the constructor were not explicit, this would compile, and as written it is undesired and wrong; it won't behave as expected.
It gets further complicated with adding user defined (implicit) conversions (casting operators) into the mix.
struct Type {
};
struct Type2 {
operator Type*() const { return nullptr; }
};
Then the following function (if not explicit) would compile, but offers a horrible bug...
Type2 a;
func(a);

Reference and pointer in polymorphism

Base abstract class:
class Satellite
{
public:
Satellite();
virtual void center()=0;
virtual ~Satellite(){}
};
First derived class
class Comm_sat:public Satellite
{
public:
Comm_sat();
void center() override{cout << "comm satellite override\n";}
};
Second derived class
class Space_station:public Satellite
{
public:
Space_station();
void center() override{cout << "space station override\n";}
};
Pointer version of the functions
void f(Satellite* ms){
ms->center();
delete ms;
}
int main()
{
Comm_sat* cs = new Comm_sat;
Space_station* ss = new Space_station;
f(cs);
f(ss);
}
The objects created using new in main() are properly destroyed in f(), right?
Reference version of the functions
void f(Satellite& ms){
ms.center();
}
int main()
{
Comm_sat cs;
Space_station ss;
f(cs);
f(ss);
}
Is the reference version better?
Besides, I try to use unique_ptr, however, I get errors
void f(Satellite* ms){
ms->center();
}
int main()
{
unique_ptr<Comm_sat> cs{new Comm_sat};
unique_ptr<Space_station> ss{new Space_station};
f(cs);
f(ss);
}
Error: cannot convert std::unique_ptr<Comm_sat> to Satellite* for argument 1 to void f(Satellite*)
Error: type class std::unique_ptr<Comm_sat> argument given to delete, expected pointer delete cs;
Same error for the other derived class.
Is the reference version better?
Yes, although a better way to put this would be "the pointer version is worse". The problem with the pointer version is that you pass it a valid pointer, and get a dangling pointer when the function returns. This is not intuitive, and leads to maintenance headaches when someone modifies your code thinking that you have forgotten to delete cs and ss in the main, not realizing that f deletes its argument.
The version that uses a reference is much better in this respect, because the resources are managed automatically for you. Readers of your code do not need to track the place where the memory of cs and ss gets released, because the allocation and release happen automatically.
I try to use unique_ptr, however, I get errors
There is no implicit conversion from std::unique_ptr<T> to T*. You need to call get() if you want to pass a raw pointer:
f(cs.get());
f(ss.get());
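A compact sketch of both call styles with a unique_ptr. To keep it self-contained and checkable, center() returns a string here instead of printing, and viaPointer, viaReference and demo are hypothetical names:

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Satellite {
    virtual std::string center() const = 0;
    virtual ~Satellite() = default;
};

struct Comm_sat : Satellite {
    std::string center() const override { return "comm satellite override"; }
};

// Non-owning raw pointer parameter: the function must not delete it.
std::string viaPointer(const Satellite* s)
{
    assert(s != nullptr);
    return s->center();
}

// Reference parameter: cannot be null and clearly does not take ownership.
std::string viaReference(const Satellite& s) { return s.center(); }

std::string demo()
{
    std::unique_ptr<Comm_sat> cs(new Comm_sat);
    std::string a = viaPointer(cs.get());   // get() yields the raw pointer
    std::string b = viaReference(*cs);      // operator* yields a reference
    return a == b ? a : std::string("mismatch");
}   // cs deletes the Comm_sat here; neither callee ever owned it
```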
The objects created using new in main() are properly destroyed in f(), right?
They're destroyed, and cleaned up correctly, yes. "Properly" is a stretch though, since all this manual-new-and-delete-raw-pointers stuff is poor style.
The reason unique_ptr isn't working for you is that ... it's a unique_ptr, not a raw pointer. You can't just pass it as a raw pointer.
Try
void f(Satellite* ms){
ms->center();
}
// ...
f(cs.get());
or better, unless you really need to pass nullptr sometimes,
void f(Satellite& ms){
ms.center();
}
// ...
f(*cs);
or best of all, since you don't show any reason to require dynamic allocation at all:
void f(Satellite& ms);
// ...
{
Comm_sat cs;
f(cs);
} // no new, no delete, cs goes out of scope here

C++11 best practice to use rvalue

I am new to C++11. In fact until recently, I programmed only using dynamic allocation, in a way similar to Java, e.g.
void some_function(A *a){
a->changeInternalState();
}
A *a = new A();
some_function(a);
delete a;
// example 2
some_function( new A() ); // suppose there is **no** memory leak.
Now I want to reproduce similar code with C++11, but without pointers.
I need to be able to pass a newly created object of class A directly to the function useA(). There seems to be a problem if I try to do so with a normal non-const reference, but it works if I do it with an rvalue reference.
Here is the code:
#include <stdio.h>
class A{
public:
void print(){
++p; // e.g. change internal state
printf("%d\n", p);
}
int p;
};
// normal reference
void useA(A & x){
x.print();
}
// rvalue reference
void useA(A && x){
useA(x);
}
int main(int argc, char** argv)
{
useA( A{45} ); // <--- newly created class
A b{20};
useA(b);
return 0;
}
It compiles and executes correctly, but I am not sure, if this is the correct acceptable way to do the work?
Are there some best practices for this kind of operations?
Normally you would not design the code so that a temporary object gets modified. In that case you would write your useA function as:
void useA(A const & x){
x.print();
}
and declare A::print as const. This binds to both rvalues and lvalues. You can use mutable for class member variables which might change value but without the object logically changing state.
Another plan is to keep just A &, but write:
{ A temp{45}; useA(temp); }
If you really do want to modify a temporary object, you can write the pair of lvalue and rvalue overloads as you have done in your question. I believe this is acceptable practice for that case.
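The const-reference variant, including the mutable idea, might look like the sketch below. The calls counter, callCount() and demo() are hypothetical additions, and print() returns the value instead of printing so the result can be checked:

```cpp
#include <cassert>

class A {
public:
    explicit A(int start) : p(start) {}

    // const member function: callable through a const reference.
    int print() const
    {
        ++calls;   // allowed because calls is mutable
        return p;
    }
    int callCount() const { return calls; }

private:
    int p;
    mutable int calls = 0;   // bookkeeping only, not part of the logical state
};

// A single overload that binds to both lvalues and rvalues.
int useA(const A& x) { return x.print(); }

int demo()
{
    int fromTemporary = useA(A{45});   // temporary binds to const A&
    A b{20};
    int fromLvalue = useA(b);          // lvalue binds as well
    assert(b.callCount() == 1);
    return fromTemporary + fromLvalue;
}
```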
The best thing about C++11 move semantics is that most of the time, you get them "for free" without having to explicitly add any &&s or std::move()s in your code. Usually, you only need to use these things explicitly if you're writing code that does manual memory management, such as the implementation of a smart pointer or a container class, where you would have had to write a custom destructor and copy constructor anyway.
In your example, A is just an int. For ints, a move is no different from a copy, because there's no opportunity for optimization even if the int happens to be a disposable temporary. Just provide a single useA() function that takes an ordinary reference. It'll have the same behavior.

C++ "smart pointer" template that auto-converts to bare pointer but can't be explicitly deleted

I am working in a very large legacy C++ code base which shall remain nameless. Being a legacy code base, it passes raw pointers around all over the place. But we are gradually trying to modernize it and so there are some smart pointer templates as well. These smart pointers (unlike, say, Boost's scoped_ptr) have an implicit conversion to the raw pointer, so that you can pass one of them into a routine that takes a raw pointer without having to write .get(). A big downside of this is that you can also accidentally use one in a delete statement, and then you have a double free bug, which can be a real pain to track down.
Is there a way to modify the template so that it still has the implicit conversion to the raw pointer, but causes a compile error if used in a delete statement? Like this:
#include <my_scoped_ptr>
struct A {};
extern void f(A*);
struct B
{
scoped_ptr<A> a;
B();
~B();
};
B::B()
: a(new A)
{
f(a); // this should compile
}
B::~B()
{
delete a; // this should NOT compile
}
The Standard says
The operand shall have a pointer type, or a class type having a single conversion function (12.3.2) to a pointer type. If the operand has a class type, the operand is converted to a pointer type by calling the above-mentioned conversion function, and the converted operand is used in place of the original operand for the remainder of this section.
You can (ab)-use the absence of overload resolution by declaring a const version of the conversion function. On a conforming compiler that's enough to make it not work anymore with delete:
struct A {
operator int*() { return 0; }
operator int*() const { return 0; }
};
int main() {
A a;
int *p = a; // works
delete a; // doesn't work
}
Results in the following
[js#HOST2 cpp]$ clang++ main1.cpp
main1.cpp:9:3: error: ambiguous conversion of delete expression of type 'A' to a pointer
delete a; // doesn't work
^ ~
main1.cpp:2:3: note: candidate function
operator int*() { return 0; }
^
main1.cpp:3:3: note: candidate function
operator int*() const { return 0; }
^
1 error generated.
On compilers that are less conforming in that regard (EDG/Comeau, GCC) you can make the conversion function a template. delete does not expect a particular type, so this would work:
template<typename T>
operator T*() { return /* ... */ }
However, this has the downside that your smart pointer is now convertible to any pointer type. The actual conversion is still type-checked, but invalid conversions are not ruled out up front; they give a compile-time error much later instead. Sadly, SFINAE does not seem to be possible with conversion functions in C++03 :) A different way is to return a private nested type pointer from the other function:
struct A {
operator int*() { return 0; }
private:
struct nested { };
operator nested*() { return 0; }
};
The only problem now is with a conversion to void*, in which case both conversion functions are equally viable. A work-around suggested by #Luther is to return a function pointer type from the other conversion function, which works with both GCC and Comeau and gets rid of the void* problem while having no other problems on the usual conversion paths, unlike the template solution
struct A {
operator int*() { return 0; }
private:
typedef void fty();
operator fty*() { return 0; }
};
Notice that these workarounds are only needed for compilers that are not conforming, though.
There isn't a way to stop one and not the other. Anywhere it can be implicitly converted to a pointer for a function call, it can be implicitly converted for a delete expression.
Your best bet is to remove the conversion function. Your situation is exactly why user-defined conversion operators are dangerous and shouldn't be used often.
I'm wrong. :(
You can use a technique presented by Boost, but my concern is that you're allowing implicit conversions from a smart pointer to a raw pointer, which is generally frowned upon. Besides, users can call delete on a pointer obtained through the -> operator, so there's really nothing you can do to prevent a determined idiot from working around whatever mechanism you come up with.
You really should just implement a get() method instead of providing operator T*() so that at least calls to delete smartptr will not compile. Non-idiots should be able to figure out that there's probably a reason why that won't work.
Yes, it's more work to type out LegacyFunc(smartptr.get()) than LegacyFunc(smartptr), but the former is preferred since it makes it explicit and prevents unexpected conversions from happening, like delete smartptr.
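As a rough illustration of the get()-only design, here is a minimal sketch assuming C++11. This my_scoped_ptr is a hypothetical stand-in, not the template from the question, and legacyFunc and demo are likewise made-up names:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical minimal scoped pointer with no conversion operator.
template <typename T>
class my_scoped_ptr {
public:
    explicit my_scoped_ptr(T* p) : ptr_(p) {}
    ~my_scoped_ptr() { delete ptr_; }
    my_scoped_ptr(const my_scoped_ptr&) = delete;
    my_scoped_ptr& operator=(const my_scoped_ptr&) = delete;

    T* get() const { return ptr_; }   // the only way to reach the raw pointer
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }

private:
    T* ptr_;
};

struct A { int n = 3; };

// Stand-in for a legacy routine that takes a raw, non-owning pointer.
int legacyFunc(A* a)
{
    assert(a != nullptr);
    return a->n;
}

// Without a conversion function, `delete smartptr;` has nothing to convert to.
static_assert(!std::is_convertible<my_scoped_ptr<A>, A*>::value,
              "no implicit conversion to a raw pointer");

int demo()
{
    my_scoped_ptr<A> a(new A);
    return legacyFunc(a.get());   // explicit, visible at the call site
}
```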
What if you have functions like this:
void LegacyOwnPointer(SomeType* ptr);
where the function will store the pointer somewhere? This will screw up the smart pointer, because now it's not aware that something else is owning the raw pointer.
Either way, you have some work to do. Smart pointers are like raw pointers, but they are not the same, so you can't just find-and-replace all instances of T* and replace it with my_scoped_ptr<T> and expect it to work just as well as before.
I have not thought much about this, but ... can you provide an overload of operator delete which is strongly typed for instances of your template class, such that compilation fails when the code is included? If this is in your header file, then implicit conversion in calls to delete should be prevented in favour of a call to your overload.
operator delete(my_scoped_ptr)
{
//... uncompilable code goes here
}
Apologies if this turns out to be a stupid idea.
I can see why you do not want to apply .get() all over the place. Have you ever considered a much smaller replacement of the deletes?
struct A
{
friend void Delete( A* p) { delete p; } // friend cannot be combined with static; found via ADL
private:
~A(){}
};
struct B
{
};
int main()
{
delete new B(); //ok
Delete( new A ); //ok
delete new A; //compiler error
return (0);
}