Apparent encapsulation - c++

The new command allocates memory in the heap to store an object.
Static allocation may cause the object to be put on the stack instead.
But neither is a protected area of memory.
I can take the address of the object (which is what this holds) and then use the indirection operator to reach the object's fields:
string str=string("hello");
void** str_this=(void**)&str;
char* str_data= (char*)*str_this;
str_data[0]='s';
str_data[1]=0;
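cout <<str_data; // prints "s" where this happens to work, since index 1 now holds the terminator ("sello" without the previous line)
So is this still considered encapsulation? Is it the responsibility of the class user (who instantiates the object) to avoid pointing to its data?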

Encapsulation in programming does not usually mean "impossible to work around if you really really try".
It usually means something closer to "being able to make a clear distinction between what is supposed to be exposed and what is not, and not having things accidentally exposed or used".
I don't think anyone will mistake your code for accessing a string the way it's supposed to be accessed.
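For contrast, a minimal sketch of the same edit made through std::string's public interface. The helper name replace_first_char is made up for illustration and is not part of any library:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: changes the first character using only the
// public interface of std::string, so no knowledge of its layout is needed.
std::string replace_first_char(std::string s, char c) {
    if (!s.empty()) {
        s[0] = c;  // operator[] is the supported way to write a character
    }
    return s;
}

// Usage: the result is well defined on every conforming implementation.
void usage_example() {
    assert(replace_first_char("hello", 's') == "sello");
    assert(replace_first_char("", 's').empty());
}
```

Nothing here depends on where the string keeps its buffer, which is the distinction encapsulation is meant to give you.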

C++ is a powerful language. With great power comes great responsibility, as a comic book superhero once said.
Once you cast an object pointer to an unrelated pointer type (here void**), read through it, and then treat the result as a char*, you have entered the land of undefined behavior. The language makes no guarantees about what will and what will not work if you try to use that char* pointer. The only round trip the language guarantees is casting a void* back to the original type (string*).
Invoking undefined behavior is not a legitimate way of breaking encapsulation, because once you cross that undefined behavior threshold, anything is possible (just not portable).

Your code implicitly relies on a particular implementation of std::string. In other words, your code is breaking the encapsulation of std::string. Worse yet, it invokes undefined behavior, so it may work, or it may not, or it may crash...

Related

const_cast failing in c++

CallingClass::CallingFunc()
{
SomeClass obj;
obj.Construct(*Singleton::GetInstance()); // passing the listener
// Singleton::GetInstance() returns a static pointer.
//Singleton is derived from IListener
}
SomeClass::Construct(const IListener &listener)
{
IListener* pListener = const_cast<IListener*>(&listener);
}
After const_cast pListener is null.
Is it possible to perform such typecasting?
Thanks
So let me see. You have two-phase initialization, a Singleton, and casting away const, and you're de-referencing an object just to take its address again? A stray NULL pointer is the least of your concerns, my friend.
Throw it away and write it again from scratch. And pick up a C++ book first.
Just so you know, const_cast cannot produce a null pointer unless it was passed one. GetInstance() must be returning NULL to produce this behaviour, which is formally UB as soon as you de-reference it.
const_cast is basically an instruction to the compiler to ignore the constness of something. Use of it is to be avoided, because you are overriding the compiler protection, and it can lead to a crash as you write something that attempts to update read-only memory.
However, it doesn't actually cause any code to be generated.
Therefore, if this:
IListener* pListener = const_cast<IListener*>(&listener);
results in pListener being NULL, then &listener is NULL, which is impossible (or you are returning a null reference for your singleton, or you are missing something out from your description of the problem).
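A small sketch that shows the point: the cast changes only the constness of the type, never the address. DummyListener and strip_const are stand-ins invented for this example; they are not from the question:

```cpp
#include <cassert>

struct IListener {
    virtual ~IListener() = default;
};

// Stand-in for the Singleton in the question (hypothetical type).
struct DummyListener : IListener {};

// Same cast as in the question: only constness changes, never the address.
IListener* strip_const(const IListener& listener) {
    return const_cast<IListener*>(&listener);
}

void usage_example() {
    DummyListener d;
    IListener* p = strip_const(d);
    assert(p != nullptr);  // a reference bound to a real object has a non-null address
    assert(p == &d);       // the cast returns the very same address
}
```

So if the result is NULL in the real program, the place to look is whatever GetInstance() returned, not the cast.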
Having said which I agree strongly with the answer from DeadMG.
Creating an empty object and doing an Init on it (2-phase construction) is to be avoided. Properly created objects should be valid, and if you have an Init method, it isn't.
Removing the constness from anything is to be avoided - it is extremely likely to produce surprising behaviour.
The amount of de-and-rereferencing in that code is going to give anyone a headache.
Two questions:
What are you trying to achieve here?
How much control have you got over the code? (i.e. what are you able to change?)
Without wishing to be unkind I would honestly say that it might be better to start again. There are a couple of issues I would have with this code:
Firstly, the Singleton pattern should ensure that only one of a specific object is ever created, therefore it is usually returned by pointer, reference or some derivative thereof (e.g. a boost shared pointer). It need not necessarily be const though, and the fact that it is here indicates that the author did not intend it to be used in a non-const way.
Secondly, you're then passing this object by reference into a function. There is no need. That's one of the major features (and drawbacks) of the singleton pattern: you can access it from anywhere. So you could just as easily write:
SomeClass::Construct()
{
IListener* pListener = const_cast<IListener*>(Singleton::GetInstance()); // cast the pointer itself; assumes GetInstance() returns a const IListener*
}
Although this still doesn't really help you. One thing it does do is make your interface a bit clearer. You see, when you write SomeClass::Construct(const IListener&listener), anyone reading your code could reasonably infer that listener is treated as const within the function, and by using const_cast you've broken that implied contract. This is a very good reason not to use const_cast, at least not in these circumstances.
The fundamental question that you need to ask yourself is when your IListener is const, why do you need to use it in a non-const way within Construct? Either the singleton should not return a const object or your function should not need it to be non-const.
This is a design issue that you need to sort out before you take any further steps.
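One way the design could look if the listener really must be modified: take it by non-const reference in the constructor, so neither a cast nor a separate Construct() step is needed. This is only a sketch; onEvent(), fire() and CountingListener are invented for illustration and are not from the question:

```cpp
#include <cassert>

struct IListener {
    virtual ~IListener() = default;
    virtual void onEvent() = 0;  // hypothetical notification hook
};

// Hypothetical listener that just counts notifications.
class CountingListener : public IListener {
public:
    void onEvent() override { ++count_; }
    int count() const { return count_; }
private:
    int count_ = 0;
};

class SomeClass {
public:
    // The signature says openly that the listener may be modified.
    explicit SomeClass(IListener& listener) : listener_(listener) {}
    void fire() { listener_.onEvent(); }
private:
    IListener& listener_;  // always valid: set once, in the constructor
};

void usage_example() {
    CountingListener listener;
    SomeClass obj(listener);  // fully constructed in one step
    obj.fire();
    obj.fire();
    assert(listener.count() == 2);
}
```

If the function only ever reads from the listener, the opposite fix applies: keep the const reference and drop the cast.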

Return as pointer, reference or object? [duplicate]

I'm moving from Java to C++ and am a bit confused by the language's flexibility. One point is that there are three ways to store objects: a pointer, a reference and a scalar (storing the object itself, if I understand it correctly).
I tend to use references where possible, because that is as close to Java as possible. In some cases, e.g. getters for derived attributes, this is not possible:
MyType &MyClass::getSomeAttribute() {
MyType t;
return t;
}
This does not work: t exists only within the scope of getSomeAttribute(), so the returned reference would point nowhere before the client can use it (most compilers only emit a warning here).
Therefore I'm left with two options:
Return a pointer
Return a scalar
Returning a pointer would look like this:
MyType *MyClass::getSomeAttribute() {
MyType *t = new MyType;
return t;
}
This'd work, but the client would have to check this pointer for NULL in order to be really sure, something that's not necessary with references. Another problem is that the caller would have to make sure that t is deallocated, I'd rather not deal with that if I can avoid it.
The alternative would be to return the object itself (scalar):
MyType MyClass::getSomeAttribute() {
MyType t;
return t;
}
That's pretty straightforward and just what I want in this case: It feels like a reference and it can't be null. If the object is out of scope in the client's code, it is deleted. Pretty handy. However, I rarely see anyone doing that, is there a reason for that? Is there some kind of performance problem if I return a scalar instead of a pointer or reference?
What is the most common/elegant approach to handle this problem?
Return by value. The compiler can optimize away the copy, so the end result is what you want. An object is created, and returned to the caller.
I think the reason why you rarely see people do this is because you're looking at the wrong C++ code. ;)
Most people coming from Java feel uncomfortable doing something like this, so they call new all over the place. And then they get memory leaks all over the place, have to check for NULL and all the other problems that can cause. :)
It might also be worth pointing out that C++ references have very little in common with Java references.
A reference in Java is much more similar to a pointer (it can be reseated, or set to NULL).
In fact the only real differences are that a pointer can point to a garbage value as well (if it is uninitialized, or it points to an object that has gone out of scope), and that you can do pointer arithmetic on a pointer into an array.
A C++ reference is an alias for an object. A Java reference doesn't behave like that.
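A short sketch of the difference, using two made-up functions. Assigning through a C++ reference writes to the original object, while assigning to a pointer reseats it, which is closer to what a Java reference does:

```cpp
#include <cassert>

// A reference is another name for one object and cannot be rebound.
int alias_demo() {
    int a = 1;
    int b = 2;
    int& r = a;  // r is an alias for a
    r = b;       // copies b's value into a; r still refers to a
    b = 10;      // does not affect a
    return a;
}

// A pointer can be reseated to point at a different object.
int pointer_demo() {
    int a = 1;
    int b = 2;
    int* p = &a;
    p = &b;      // p now points at b; a is untouched
    b = 10;
    return *p;
}

void usage_example() {
    assert(alias_demo() == 2);
    assert(pointer_demo() == 10);
}
```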
Quite simply, avoid using pointers and dynamic allocation by new wherever possible. Use values, references and automatically allocated objects instead. Of course you can't always avoid dynamic allocation, but it should be a last resort, not a first.
Returning by value can introduce performance penalties because this means the object needs to be copied. If it is a large object, like a list, that operation might be very expensive.
But modern compilers are very good about making this not happen. The C++ standard explicitly states that the compiler is allowed to elide copies in certain circumstances. The particular instance that would be relevant in the example code you gave is called the 'return value optimization'.
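A small sketch of returning a container by value. make_squares is a made-up function; whether the copy is elided or replaced by a move depends on the compiler and language version, but the calling code is the same either way:

```cpp
#include <cassert>
#include <vector>

// Builds the result in a local variable and returns it by value.
std::vector<int> make_squares(int n) {
    std::vector<int> result;
    for (int i = 0; i < n; ++i) {
        result.push_back(i * i);
    }
    return result;  // named local returned by value; eligible for copy elision
}

void usage_example() {
    std::vector<int> squares = make_squares(4);
    assert(squares.size() == 4);
    assert(squares[3] == 9);
}
```

The caller never sees a pointer and never has to delete anything.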
Personally, I return by (usually const) reference when I'm returning a member variable, and return some kind of smart pointer object (frequently ::std::auto_ptr) when I need to dynamically allocate something. Otherwise I return by value.
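The three cases might be sketched like this. Account and makeAccount are hypothetical names, and std::unique_ptr is swapped in for std::auto_ptr here because auto_ptr was deprecated in C++11 and removed in C++17:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

class Account {  // hypothetical example class
public:
    explicit Account(std::string owner) : owner_(std::move(owner)) {}

    // Member variable: return by const reference, no copy.
    const std::string& owner() const { return owner_; }

    // Derived attribute: computed on the fly, so return by value.
    std::size_t ownerLength() const { return owner_.size(); }

private:
    std::string owner_;
};

// Dynamically allocated object: hand ownership to the caller explicitly.
std::unique_ptr<Account> makeAccount(const std::string& owner) {
    return std::unique_ptr<Account>(new Account(owner));
}

void usage_example() {
    std::unique_ptr<Account> a = makeAccount("ann");
    assert(a->owner() == "ann");
    assert(a->ownerLength() == 3);
}  // the Account is deleted here automatically
```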
I also very frequently have const reference parameters, and this is very common in C++. This is a way of passing a parameter and saying "the function is not allowed to touch this". Basically a read-only parameter. It should only be used for objects that are more complex than a single integer or pointer though.
I think one big change from Java is that const is important and used very frequently. Learn to understand it and make it your friend.
I also think Neil's answer is correct in stating that avoiding dynamic allocation whenever possible is a good idea. You should not contort your design too much to make that happen, but you should definitely prefer design choices in which it doesn't have to happen.
Returning by value is a common thing practised in C++. However, when you are passing an object, you pass by reference.
Example
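bool isTraderAllowed(const equity& trdobj); // declared before its first use
int main()
{
equity trader;
isTraderAllowed(trader); // passed by const reference: no copy is made
....
}
bool isTraderAllowed(const equity& trdobj)
{
... // Perform your function routine here.
}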
The above is a simple example of passing an object by reference. In reality, you would have a method called isTraderAllowed for the class equity, but I was showing you a real use of passing by reference.
A point regarding passing by value or reference:
Considering optimizations, and assuming the function is inlined: if a parameter is declared as "const DataType objectName" (where DataType can be anything, even a primitive), an optimizing compiler will typically not copy the object; and if it is declared as "const DataType & objectName" or "DataType & objectName", typically no address is taken and no pointer is involved. In both cases the arguments are used directly in the generated assembly code.
A point regarding references:
A reference is not always a pointer. For instance, when you have the following code in the body of a function, the reference is not a pointer:
int adad=5;
int & reference=adad;
A point regarding returning by value:
As some people have mentioned, with a good optimizing compiler, returning by value will usually not cause an extra copy.
A point regarding return by reference:
In case of inline functions and optimizations, returning by reference will not involve address taking or pointer.

If de-referencing a NULL pointer is an invalid thing to do, how should auto pointers be implemented?

I thought dereferencing a NULL pointer was dangerous, if so then what about this implementation of an auto_ptr?
http://ootips.org/yonat/4dev/smart-pointers.html
If the default constructor is invoked without a parameter the internal pointer will be NULL, then when operator*() is invoked won't that be dereferencing a null pointer?
Therefore what is the industrial strength implementation of this function?
Yes, dereferencing NULL pointer = bad.
Yes, constructing an auto_ptr with NULL creates a NULL auto_ptr.
Yes, dereferencing a NULL auto_ptr = bad.
Therefore what is the industrial strength implementation of this function?
I don't understand the question. If the definition of the function in question created by the industry itself is not "industrial strength" then I have a very hard time figuring out what might be.
std::auto_ptr is intended to provide essentially the same performance as a "raw" pointer. To that end, it doesn't (necessarily) do any run-time checking that the pointer is valid before being dereferenced.
If you want a pointer that checks validity, it's relatively easy to provide that, but it's not the intended purpose of auto_ptr. In fairness, I should add that the real intent of auto_ptr is rather an interesting question -- its specification was changed several times during the original standardization process, largely because of disagreements over what it should try to accomplish. The version that made it into the standard does have some uses, but quite frankly, not very many. In particular, it has transfer-of-ownership semantics that make it unsuitable for storage in a standard container (among other things), removing one of the obvious purposes for smart pointers in general.
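A minimal sketch of what such a checking pointer could look like. checked_ptr is a made-up class, not a standard or Boost type, and throwing std::logic_error is just one possible policy:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical owning pointer that checks for null on every dereference.
template <typename T>
class checked_ptr {
public:
    explicit checked_ptr(T* p = nullptr) : p_(p) {}
    ~checked_ptr() { delete p_; }

    // Non-copyable, so there is always exactly one owner.
    checked_ptr(const checked_ptr&) = delete;
    checked_ptr& operator=(const checked_ptr&) = delete;

    T& operator*() const { return *checked(); }
    T* operator->() const { return checked(); }
    T* get() const { return p_; }

private:
    // The run-time check that std::auto_ptr deliberately leaves out.
    T* checked() const {
        if (!p_) throw std::logic_error("dereferencing a null checked_ptr");
        return p_;
    }
    T* p_;
};

void usage_example() {
    checked_ptr<int> p(new int(7));
    assert(*p == 7);

    checked_ptr<int> empty;
    bool threw = false;
    try {
        (void)*empty;  // null: throws instead of invoking undefined behavior
    } catch (const std::logic_error&) {
        threw = true;
    }
    assert(threw);
}
```

The cost is a branch on every access, which is exactly the overhead auto_ptr was designed to avoid.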
Its purpose is to help prevent memory leaks by ensuring that delete is performed on the underlying pointer whenever the auto_ptr goes out of scope (or is itself deleted).
Just like in higher-level languages such as C#, trying to dereference a null pointer/object will still explode, as it should.
Do what you would do if you dereferenced a NULL pointer. On many platforms, this means throw an exception.
Well, just like you said: dereferencing null pointer is illegal, leads to undefined behavior. This immediately means that you must not use operator * on a default-constructed auto_ptr. Period.
Where exactly you see a problem with "industrial strength" of this implementation is not clear to me.
@Jerry Coffin: It is naughty of me to answer your answer here rather than the OP's question, but I need more lines than a comment allows.
You are completely right about the ridiculous semantics of the current version, it is so completely rotten that a new feature: "mutable" HAD to be added to the language just to allow these insane semantics to be implemented.
The original purpose of "auto_ptr" was exactly what boost::scoped_ptr does (AFAIK), and I'm happy to see a version of that finally made it into the new Standard. The reason for the name "auto_ptr" is that it should model a first class pointer on the stack, i.e. an automatic variable.
This auto_ptr was a National Body requirement, based on the following logic: if we have catchable exceptions in C++, we MUST have a way to manage pointers which is exception safe IN the Standard. This also applies to non-static class members (although that's a slightly harder problem which required a change to the syntax and semantics of constructors).
In addition a reference counting pointer was required but due to a lot of different possible implementation with different tradeoffs, one can accept that this be left out of the Standard until a later time.
Have you ever played that game where you pass a message around a ring of people and at the end someone reads out the input and output messages? That's what happened. The original intent got lost because some people thought that the auto_ptr, now we HAD to have it, could be made to do more... and finally what got put in the standard can't even do what the original simple scope_ptr style one did (auto_ptr semantics don't assure the pointed at object is destroyed because it could be moved elsewhere).
If I recall, the key problem was returning the value of an auto_ptr: the core design simply doesn't allow that (it's uncopyable). A sane solution like
return ap.swap(NULL)
unfortunately still destroys the intended invariant. The right way is probably closer to:
return ap.clone();
which copies the object and returns the copy, destroying the original: the compiler is then free to optimise away the copy (as written not exception safe .. the clone might leak if another exception is thrown before it returns: a ref counted pointer solves this of course).
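For comparison, std::unique_ptr (C++11) handles the return case with a move rather than a copy, so there is still only ever one owner. A hedged sketch, with make_value and pass_along as made-up names:

```cpp
#include <cassert>
#include <memory>
#include <utility>

std::unique_ptr<int> make_value(int v) {
    std::unique_ptr<int> p(new int(v));
    return p;  // ownership leaves the function; the int itself is not copied
}

std::unique_ptr<int> pass_along(std::unique_ptr<int> p) {
    return p;  // moved out again; still exactly one owner at any time
}

void usage_example() {
    std::unique_ptr<int> a = make_value(5);
    std::unique_ptr<int> b = pass_along(std::move(a));
    assert(!a);       // a moved-from unique_ptr is null
    assert(*b == 5);  // b is now the sole owner
}
```

The transfer has to be spelled out with std::move at the call site, which avoids the silent ownership change that made auto_ptr surprising.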

Is there a practical benefit to casting a NULL pointer to an object and calling one of its member functions?

Ok, so I know that technically this is undefined behavior, but nonetheless, I've seen this more than once in production code. And please correct me if I'm wrong, but I've also heard that some people use this "feature" as a somewhat legitimate substitute for a lacking aspect of the current C++ standard, namely, the inability to obtain the address (well, offset really) of a member function. For example, this is out of a popular implementation of a PCRE (Perl-compatible Regular Expression) library:
#ifndef offsetof
#define offsetof(p_type,field) ((size_t)&(((p_type *)0)->field))
#endif
One can debate whether the exploitation of such a language subtlety in a case like this is valid or not, or even necessary, but I've also seen it used like this:
struct Result
{
void stat()
{
if(this) {
// do something...
} else {
// do something else...
}
}
};
// ...somewhere else in the code...
((Result*)0)->stat();
This works just fine! It avoids a null pointer dereference by testing for the existence of this, and it does not try to access class members in the else block. So long as these guards are in place, it's legitimate code, right? So the question remains: Is there a practical use case, where one would benefit from using such a construct? I'm especially concerned about the second case, since the first case is more of a workaround for a language limitation. Or is it?
PS. Sorry about the C-style casts, unfortunately people still prefer to type less if they can.
The first case is not calling anything. It's taking the address. That's a defined, permitted operation. It yields the offset in bytes from the start of the object to the specified field. This is a very common practice, since offsets like this are very commonly needed. Not all objects can be created on the stack, after all.
The second case is reasonably silly. The sensible thing would be to declare that method static.
I don't see any benefit of ((Result*)0)->stat(); it is an ugly hack which will likely break sooner rather than later. The proper C++ approach would be using a static method Result::stat().
offsetof() on the other hand is legal, as the offsetof() macro never actually calls a method or accesses a member, but only performs address calculations.
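The standard library already provides offsetof in <cstddef>, so there is rarely a reason to write the null-pointer version by hand. A small sketch; Packet is a hypothetical struct:

```cpp
#include <cassert>
#include <cstddef>

struct Packet {  // hypothetical standard-layout struct
    char   tag;
    int    length;
    double payload;
};

// Offset in bytes of the length field from the start of a Packet.
std::size_t length_offset() {
    return offsetof(Packet, length);
}

void usage_example() {
    assert(offsetof(Packet, tag) == 0);       // the first member sits at offset 0
    assert(length_offset() >= sizeof(char));  // length comes after tag
    assert(length_offset() + sizeof(int) <= sizeof(Packet));
}
```

The exact offsets depend on the platform's padding rules, which is why the checks above only compare against sizes.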
Everybody else has done a good job of reiterating that the behavior is undefined. But let's pretend it wasn't, and that p->member is allowed to behave in a consistent manner under certain circumstances even if p isn't a valid pointer.
Your second construct would still serve almost no purpose. From a design perspective, you've probably done something wrong if a single function can do its job both with and without accessing members, and if it can then splitting the static portion of the code into a separate, static function would be much more reasonable than expecting your users to create a null pointer to operate on.
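A sketch of that split, with invented names (statWithoutObject, describe): the part that needs no object becomes a static function, and the null check moves to the call site where it is well defined:

```cpp
#include <cassert>
#include <string>

struct Result {
    int code = 0;

    // Needs an object: reads a member.
    std::string stat() const { return "code " + std::to_string(code); }

    // Needs no object: static, callable as Result::statWithoutObject().
    static std::string statWithoutObject() { return "no result"; }
};

// Hypothetical free helper: tests the pointer before any member call.
std::string describe(const Result* r) {
    return r ? r->stat() : Result::statWithoutObject();
}

void usage_example() {
    Result r;
    r.code = 3;
    assert(describe(&r) == "code 3");
    assert(describe(nullptr) == "no result");
}
```

No member function is ever called through a null pointer here, so the behavior does not depend on the compiler.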
From a safety perspective, you've only protected against a small portion of the ways an invalid this pointer can be created. There are uninitialized pointers, for starters:
Result* p;
p->stat(); //Oops, 'this' is some random value
There are pointers that have been initialized, but are still invalid:
Result* p = new Result;
delete p;
p->stat(); //'this' points to "safe" memory, but the data doesn't belong to you
And even if you always initialize your pointers, and absolutely never accidentally reuse free'd memory:
struct Struct {
int i;
Result r;
};
int main() {
((Struct*)0)->r.stat(); //'this' is likely sizeof(int), not 0
}
So really, even if it weren't undefined behavior, it is worthless behavior.
Although libraries targeting specific C++ implementations may do this, that doesn't mean it's "legitimate" generally.
This works just fine! It avoids a null pointer dereference by testing for the existence of this, and it does not try to access class members in the else block. So long as these guards are in place, it's legitimate code, right?
No, because although it might work fine on some C++ implementations, it is perfectly okay for it to not work on any conforming C++ implementation.
Dereferencing a null-pointer is undefined behavior and anything can happen if you do it. Don't do it if you want a program that works.
Just because it doesn't immediately crash in one specific test case doesn't mean that it won't get you into all kinds of trouble.
Undefined behaviour is undefined behaviour. Do these tricks "work" for your particular compiler? Well, possibly. Will they work for the next iteration of it, or for another compiler? Possibly not. You pays your money and you takes your choice. I can only say that in nearly 25 years of C++ programming I've never felt the need to do any of these things.
Regarding the statement:
It avoids a null pointer dereference by testing for the existence of this, and it does not try to access class members in the else block. So long as these guards are in place, it's legitimate code, right?
The code is not legitimate. There is no guarantee that the compiler and/or runtime will actually call the method when the pointer is NULL. The check inside the method is of no help, because you can't assume that the method will actually end up being called with a NULL this pointer.