safety in std::vector memory manipulation - c++

I have the following code:
size_t returnSize(const char* s)
{
string string(s);
return string.size();
};
size_t returnSize(const int& i)
{
return sizeof(i);
};
template<typename T>
vector<char> Serialize(const T& t)
{
T* pt = new T(t);
vector<char> CasttoChar;
for (int i =0 ;i<returnSize(t);i++)
{
CasttoChar.push_back(reinterpret_cast<const char*>(pt)[i]);
}
delete pt;
return CasttoChar;
};
template<typename T>
T DeSerialize(const vector<char> cstr)
{
T* a = (T*)(&cstr[0]);
return *a;
}
int _tmain(int argc, _TCHAR* argv[])
{
int x = 97;
vector<char> c = Serialize(x);
cout << DeSerialize<int>(c) << endl;
string k = "blabla";
vector<char> c3 = Serialize(k.c_str());
cout << DeSerialize<const char*>(c3) << endl;
system("PAUSE");
return EXIT_SUCCESS;
}
//output is
//97
//blabla
Is this line T* a = (T*)(&cstr[0]); safe?
Also, I tried reinterpret_cast<T*>(&cstr[0]); instead of T* a = (T*)(&cstr[0]); but the compiler complained about not being able to convert const char* to int*. So why does the C-style cast work?

Refer to the standard.
Why does reinterpret_cast fail?
5.2.10 Reinterpret cast [expr.reinterpret.cast]
The reinterpret_cast operator shall not cast away constness (5.2.11).
An expression of integral, enumeration, pointer, or pointer-to-member
type can be explicitly converted to its own type; such a cast yields
the value of its operand.
Should I use a C-style cast?
No. Prefer the C++ casts over a C-style cast. Here the C-style cast silently removes the constness of the object, and writing through the resulting pointer to an object that was originally const is undefined behaviour.
Using reinterpret_cast traps this error and tells you about the potential pitfall at compile time.
If you really need to remove constness, use const_cast. It's the only C++ cast that can legally convert a pointer to a const object into a pointer to a non-const object.
But why does a C-style cast work?
Quoting from the accepted answer to the question When should static_cast, dynamic_cast and reinterpret_cast be used?
A C-style cast is defined as the first of the following which
succeeds:
const_cast
static_cast
static_cast, then const_cast
reinterpret_cast
reinterpret_cast, then const_cast
So the C-style cast tries const_cast first, and falls back to the later, more powerful combinations when that alone does not succeed.

The C-style cast works because it takes many steps in order to make the cast succeed. It uses the first of the following that succeeds:
const_cast
static_cast
static_cast + const_cast
reinterpret_cast
reinterpret_cast + const_cast
In this case, it's doing the most 'powerful' cast, a reinterpret_cast to const int * followed by const_cast to int*.
The reinterpret_cast alone won't compile, because you're casting away const-ness. The const_cast is required to cast to int*. Doing a reinterpret_cast to const int* would be fine, however.
As an aside, what you're doing is generally unsafe, unless you're using a compiler extension to ensure that any user-defined type you're deserializing to isn't padded.

C-style casting in C++ is not a good idea precisely because it bypasses the checks that prevent you from removing const or changing the type arbitrarily. If you want to make the code work as is, you first need to const_cast and then reinterpret_cast, but really try to avoid casting away const. To avoid the error with reinterpret_cast, simply declare a as const T* and cast to const T*.

Stick to C++ casts. The reason the reinterpret_cast didn't work is that you were casting away constness, which isn't cool; you have to use a const_cast for that, and you just shouldn't. C-style casts ignore this.
Having said that, what are you trying to achieve here? You are effectively casting to a char array and memcpying, without the efficiency that would bring.

Sorry to chime in here, but your code is broken in several ways, and the casting is just one of them. Concerning the casting: as soon as you use the conversion from/to vector on something that is not just a simple int or the like but requires a constructor, it will fail. In any case, a two-step conversion from char const* to void const* to T const* is unfortunately necessary.
Now, other problems:
Try the whole thing with a zero-size string. This should now fully answer your actual question, too: No, it's not safe.
You are returning a pointer to a char from DeSerialize<char const*>(). This pointer points into memory owned by the given vector, which is passed by value and after returning from that function ceases to exist! It is pure luck that it seems to work.
Even if you managed to somehow return a char const* from the function, think about who owns that memory now. The point is that this owner must also release the memory. Consider using std::string and making the char const* variant not compile using a specialization of your template.
In general, if you are serious about this code, begin adding unit tests. It might slow you down now, but it avoids errors as you go, thus saving time overall. Search for "test-driven development".
There is nothing that assures that the string is NUL-terminated.
Don't use new/delete unless you have to, prefer "plain" stack variables. If you do, take care of properly releasing the memory in case of exceptions (from push_back()). Use auto_ptr (C++98) or unique_ptr (C++11) to make sure the memory is released correctly.
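A minimal sketch of what the points above could look like in code, under stated assumptions: the names SerializePod, DeserializePod, SerializeString and DeserializeString are hypothetical (they are not from the question), and the byte copy is restricted to trivially copyable types. It copies bytes with std::memcpy instead of dereferencing a cast pointer, checks the buffer size, and returns strings by value so nothing points into a destroyed vector.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: copy the object representation of a trivially
// copyable value into a byte buffer. No new/delete, no pointer casts.
template <typename T>
std::vector<char> SerializePod(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be copied bytewise");
    std::vector<char> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Hypothetical helper: copy the bytes back into an existing object.
// Returns false instead of reading out of bounds when the size is wrong.
template <typename T>
bool DeserializePod(const std::vector<char>& bytes, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be copied bytewise");
    if (bytes.size() != sizeof(T)) {
        return false;
    }
    std::memcpy(&out, bytes.data(), sizeof(T));
    return true;
}

// Strings get their own functions: the result owns its characters,
// so an empty string or a missing NUL terminator is not a problem.
std::vector<char> SerializeString(const std::string& s) {
    return std::vector<char>(s.begin(), s.end());
}

std::string DeserializeString(const std::vector<char>& bytes) {
    return std::string(bytes.begin(), bytes.end());
}
```

With these, the int from the question round-trips through SerializePod and DeserializePod, and "blabla" round-trips through the string pair without any char const* escaping from a temporary vector.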

Related

With or without reinterpret_cast

int main()
{
class_name object;
object.method();
fstream file("writeobject.dat" , ios::out|ios::app);
file.write(reinterpret_cast<char*>(&object), sizeof(object));
return 0;
}
//////////////////////////////////////////////////////////////////////////////////////////
int main()
{
class_name object;
object.method();
fstream file("writeobject.dat" , ios::out|ios::app);
file.write((char*)&object, sizeof(object));
return 0;
}
What is the difference between the two functions above? What is reinterpret_cast doing here? I don't see any difference between the output of the two main() functions.
A C style cast is nothing but the C++ cast that succeeds from the predefined order of:
const_cast
static_cast
static_cast, then const_cast
reinterpret_cast
reinterpret_cast, then const_cast
In this case they are doing the same thing. However, if you are using C++ it is better to use the C++ style of explicit casting, because it is more indicative of the intent, and it's always better to be explicit about what cast you need than to be at the mercy of the compiler to choose one for you.
There is no functional difference between the C-style cast and the reinterpret_cast in the above case.
However the reinterpret_cast might be considered to be preferred by many because it is explicit in what it is doing, both to the compiler and the other humans reading the code.
Being explicit to the compiler is valuable in cases where automatic conversions might take place at times when it is not desired. Consider:
class Foo
{
public:
operator double() const
{
return mD;
}
Foo () : mD (4.12) {};
private:
double mD;
};
int main()
{
Foo foo;
double d = (double) foo;
double d2 = reinterpret_cast <double> (foo);
}
The code:
double d = (double) foo;
compiles, and when it is run the conversion operator operator double() is called. However the reinterpret_cast will not compile because Foo cannot be converted to a double.
To carry the "be explicit" philosophy forward, in cases where you do want the automatic conversion to be available, you can use a static_cast:
double d3 = static_cast <double> (foo);
This will again call the conversion operator.
Aside from the good answers already given, reinterpret_cast is MUCH easier to search for in the code than (char *), which may occur in other places, never mind that a regex type search will need some escaping to not interpret the * as a wildcard.
More importantly, if you want to find EVERY place that you cast something from one pointer type to another, finding all reinterpret_cast is quite easy, where finding all variants of (int *), (char *), (uint8_t *) and (foo **) would be quite a bit of effort to come up with the right regex to match all those without missing something out and not adding some extra finds that you didn't want.
One uses a C++-style reinterpret_cast, the other a C-style cast. The C++ style is better because it's more explicit about what it's doing, and highlights a potentially dangerous operation with verbose syntax.
In this case they do the same thing, because for this combination of source and target type the C-style cast is carried out as a reinterpret_cast. But just a slight change could alter that.
Say you had
const class_name object;
and the first form would not compile while the second switched to "reinterpret_cast, then const_cast". And if the function you pass it to actually modified the data, you'd discover that only at runtime.
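A small sketch of the const case, assuming a hypothetical POD type Record and helper ToBytes (neither is from the question). Since write() only reads the bytes, casting to const char* is enough, and the reinterpret_cast never has to touch constness.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical POD type standing in for class_name.
struct Record {
    int id;
    double price;
};

// Hypothetical helper: writes the raw bytes of a const object to a stream.
std::string ToBytes(const Record& record) {
    std::ostringstream out(std::ios::out | std::ios::binary);
    // OK: const is preserved, only the pointed-to type is reinterpreted.
    out.write(reinterpret_cast<const char*>(&record), sizeof(record));
    // reinterpret_cast<char*>(&record) would not compile here (it casts
    // away const), whereas (char*)&record would compile silently.
    return out.str();
}
```

Calling ToBytes on any Record yields a string of sizeof(Record) bytes.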

const_cast vs reinterpret_cast

Referring to the SO C++ FAQ When should static_cast, dynamic_cast and reinterpret_cast be used?.
const_cast is used to remove or add const to a variable, and it's the only reliable, defined and legal way to remove the constness.
reinterpret_cast is used to change the interpretation of a type.
I understand, in a reasonable way, why a const variable should be cast to non-const only using const_cast, but I cannot figure out a reasonable justification for the issues with using reinterpret_cast instead of const_cast to add constness.
I understand that using reinterpret_cast just to add constness is not sane, but would it be UB or a potential time bomb to use reinterpret_cast to add constness?
The reason I was confused here is because of the statement
Largely, the only guarantee you get with reinterpret_cast is that if
you cast the result back to the original type, you will get the exact
same value.
So if I add constness using reinterpret_cast and if you reinterpret_cast the result back to the original type, it should result back to the original type and should not be UB, but that violates the fact that one should only use const_cast to remove the constness
On a separate note, the standard guarantees that you can add constness using reinterpret_cast:
5.2.10 Reinterpret cast (7) ......When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is
static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are
standard-layout types (3.9) and the alignment requirements of T2 are
no stricter than those of T1........
reinterpret_cast changes the interpretation of the data within the object. const_cast adds or removes the const qualifier. Data representation and constness are orthogonal. So it makes sense to have different cast keywords.
So if I add constness using reinterpret_cast and if you reinterpret_cast the result back to the original type, it should result back to the original type and should not be UB, but that violates the fact that one should only use const_cast to remove the constness
That wouldn't even compile:
int * n = new int;
const int * const_added = reinterpret_cast<const int *>(n);
int * original_type = reinterpret_cast<int*>(const_added);
// error: reinterpret_cast from type ‘const int*’ to type ‘int*’ casts away qualifiers
You shouldn't just be adding const with reinterpret_cast. A reinterpret_cast should be primarily that: reinterpreting the pointer (or whatever).
In other words, if you're going from const char* to char* (hopefully because there's a bad API you can't change), then const_cast is your friend. That's really all it's intended to be.
But if you need to go from MyPODType* to const char*, you need reinterpret_cast, and it's just being nice by not requiring a const_cast on top of it.
There is one thing to keep in mind: You can't use const_cast to make a const variable writable. You can only use it to retrieve a non-const reference from a const reference if that const reference refers to a non-const object. Sounds complicated? Example:
// valid:
int x;
int const& x1 = x;
const_cast<int&>(x1) = 0;
// invalid:
int const y = 42;
int const& y1 = y;
const_cast<int&>(y1) = 0;
In reality, both of these will compile and sometimes even "work". However, the second one causes undefined behaviour and in many cases will terminate the program when the constant object is placed in read-only memory.
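A minimal sketch of the valid case only, using hypothetical helper names (ResetThroughConstRef and ResetNonConstLocal are not from the answer). The write is defined here because the object that the const reference refers to was never declared const.

```cpp
#include <cassert>

// Hypothetical helper: writes through a const reference.
// Only defined behaviour when the referenced object itself is non-const.
void ResetThroughConstRef(const int& ref) {
    const_cast<int&>(ref) = 0;
}

// Resets a non-const local through a const reference and returns it.
int ResetNonConstLocal() {
    int x = 42;               // not declared const, so the write is fine
    ResetThroughConstRef(x);
    return x;
}
```

Passing a variable declared as int const to ResetThroughConstRef would compile just the same, which is exactly why the cast deserves a close look wherever it appears.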
That said, a few more things: reinterpret_cast is the most powerful cast, but also the most dangerous one, so don't use it unless you have to. When you need to go from void* to sometype*, use static_cast. When going the opposite direction, use the built-in implicit conversion or use an explicit static_cast, too. Similarly with adding or removing const, which is also added implicitly. Concerning reinterpret_cast, see also the discussion at C++ When should we prefer to use a two chained static_cast over reinterpret_cast where an alternative that is less hackish is discussed.
Uli
The only place where I can think of for relating reinterpret_cast with const-ness is when passing a const object to an API that accepts a void pointer -
UINT ThreadFunction(void* param)
{
const MyClass* ptr = reinterpret_cast<const MyClass*>(param);
// ... use ptr ...
return 0;
}
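For the void* case a static_cast is already sufficient, and adding const needs no cast at all. A small sketch, assuming a hypothetical type Payload and function ReadValue as stand-ins for MyClass and the thread function:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for MyClass.
struct Payload {
    int value;
};

// Adding const is an implicit conversion; removing it is not.
static_assert(std::is_convertible<int*, const int*>::value,
              "adding const needs no cast");
static_assert(!std::is_convertible<const int*, int*>::value,
              "removing const needs const_cast");

// Hypothetical callback: recovers the typed pointer from void*.
int ReadValue(void* param) {
    // static_cast may convert void* to an object pointer and add const.
    const Payload* payload = static_cast<const Payload*>(param);
    return payload->value;
}
```

This is only valid when param really was obtained from a Payload*, the same precondition the reinterpret_cast version has.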
Yes, as you know, const_cast removes constness from a type.
But what about when we need to add constness to a type? Is there a reason we would have to do that?
For example:
void PrintAnything(void* pData)
{
const CObject* pObject = reinterpret_cast<CObject*>(pData);
// below is bla-bla-bla.
}
reinterpret_cast has nothing to do with const.
const_cast serves two purposes.
The first is to remove constness from a type; the other is to make the code explicit. You could do the same with a C-style cast, but that is not explicit and is therefore not recommended.
The two casts do not do the same thing; they are definitely different.

Is const_cast safer than normal cast?

Which is safer to use?
int main()
{
const int i=5;
int *ptr;
ptr=(int*)&i;             // first
ptr=const_cast<int*>(&i); // second
return 0;
}
It's safer in the sense that you won't get a cast that's something other than just removing const:
int main()
{
const char i=5;
int *ptr;
ptr=(int*)&i; // the compiler won't complain
ptr=const_cast<int*>(&i); // will fail, since `i` isn't an int
return 0;
}
which doesn't necessarily mean that the const_cast<> is safe:
const int i=5;
int main()
{
int const& cri(i);
int& ri = const_cast<int&>(cri); // unsafe
ri = 0; // will likely crash;
return 0;
}
They are entirely equivalent, except that C-style casts present more of a maintenance headache over the const_cast. If the code were frozen in time, they would be identical. The Standard says that the C-style cast may devolve to static, reinterpret, or const cast or a combination of the three, or a strange funky cast that can access private bases for some reason. The point is, in this use case it is exactly equivalent to const_cast.
I'm not sure about safety - I'm sure someone is more well-versed in this than I am - but C++-style casts are part of the language standard and should always be preferred over C-style casts (as a matter of both style as well as readability).
To amend my answer: it appears that C++-style casts are checked more strictly by the compiler, whereas a mistaken C-style cast compiles silently and only shows up as a bug at runtime; in that regard, C++-style casts are definitely safer.
Neither is safer than the other. In both cases undefined behavior will occur should you modify the value through one of the pointers that have been cast. const_cast has the benefit of doing only what you want and expressing it clearly, while the C-style cast could be anything and is not sensitive to the actual type of its argument.
It's safer, but in a different way than you imagine.
It's safer because you explicitly state you're casting away constness.
When someone sees your code, they think - "ok, here's a const_cast, this argument must have been const. Let's take a closer look at this", whereas a regular cast just gets lost in the back of the mind when reading big chunks of code.
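The compile-time check mentioned in the answers above can be made visible with a small detection trait. This is only a sketch: CanConstCast is a hypothetical name, and it relies on the usual SFINAE rule that an ill-formed cast expression in a partial specialization just removes that specialization from consideration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when const_cast<To>(From) is well-formed.
template <typename To, typename From, typename = void>
struct CanConstCast : std::false_type {};

template <typename To, typename From>
struct CanConstCast<To, From,
                    decltype(void(const_cast<To>(std::declval<From>())))>
    : std::true_type {};

// Removing or adding const on the same pointee type is accepted.
static_assert(CanConstCast<int*, const int*>::value, "removes const");
static_assert(CanConstCast<const int*, int*>::value, "adds const");

// Changing the pointee type is rejected, unlike with (int*)&i.
static_assert(!CanConstCast<int*, const char*>::value,
              "const_cast cannot change the pointee type");
```

A C-style cast has no equivalent restriction: the same source and target types would be accepted as reinterpret_cast followed by const_cast.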

Is there any harm if I don't do const_cast<char*> and simply use ( char * ) typecasting?

I just want to know whether there is a disadvantage to not using const_cast when passing a const char* and simply casting it with (char *), or whether both are basically one and the same.
#include <iostream>
#include<conio.h>
using namespace std;
void print(char * str)
{
cout << str << endl;
}
int main ()
{
const char * c = "sample text";
// print( const_cast<char *> (c) ); // This one is advantageous or the below one
print((char *) (c) ); // Does the above one and this are same?
getch();
return 0;
}
Is there some disadvantage to using print((char *) (c) ); over print( const_cast<char *> (c) ); or are both basically the same?
First of all, your print function should take a const char* parameter instead of just char*, since it does not modify it. This eliminates the need for either cast.
As for your question, C++-style casts (i.e. const_cast, dynamic_cast, etc.) are preferred over C-style casts because they express the intent of the cast and they are easy to search for. If I accidentally use a variable of type int instead of const char*, using const_cast will result in a compile-time error. However, if I use a C-style cast it will compile successfully but produce some difficult-to-diagnose memory issues at runtime.
In this context, they are identical (casting from a "const char*" to a "char*"). The advantages of const_cast are:
It will help catch typos (if you accidentally cast a "const wchar_t*" to a "char*", then const_cast will complain.)
It's easier to search for.
It's easier to see.
The C-style cast (char *) is equivalent if used properly. If you mess up the const_cast, the compiler will warn you, if you mess up the C-style cast you just get a bug.
const_cast is more appropriate because it only casts away constness, and otherwise will warn you about other possible mistakes (like converting one pointer type to another etc), and (char *) will just silently interpret anything you give it as char *. So if you can - better use const_cast for better type safety.
Independently of what the C-style cast does in this particular case, C-style casts and C++ casts are not the same: C++ distinguishes between reinterpret, static, dynamic and const casts.
The semantics of these casts are different, and they are not always equally possible.
A C-style cast can act as a static cast or as a reinterpret cast (where a static cast is not possible), optionally combined with a const cast. Use it only where such ambiguity is actually required (I cannot imagine how or when); avoid it wherever well-defined, expected behavior is needed.
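A sketch of the one situation where const_cast is commonly tolerated: wrapping an old function that takes char* but only reads. LegacyLength and Length are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical legacy function that we cannot change: it takes char*
// even though it only reads the characters.
std::size_t LegacyLength(char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') {
        ++n;
    }
    return n;
}

// Const-correct wrapper: the cast is confined to one searchable place.
// Safe only because LegacyLength never writes through the pointer.
std::size_t Length(const char* text) {
    return LegacyLength(const_cast<char*>(text));
}
```

Callers then pass string literals or other const char* values to Length without any cast of their own. If text were accidentally a const wchar_t*, the const_cast would be rejected at compile time, while (char *) would not be.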

Why do we have reinterpret_cast in C++ when two chained static_cast can do its job?

Say I want to cast A* to char* and vice versa. We have two choices (I mean, many of us think we have two choices, because both seem to work! Hence the confusion!):
struct A
{
int age;
char name[128];
};
A a;
char *buffer = static_cast<char*>(static_cast<void*>(&a)); //choice 1
char *buffer = reinterpret_cast<char*>(&a); //choice 2
Both work fine.
//convert back
A *pA = static_cast<A*>(static_cast<void*>(buffer)); //choice 1
A *pA = reinterpret_cast<A*>(buffer); //choice 2
Even this works fine!
So why do we have reinterpret_cast in C++ when two chained static_cast can do its job?
Some of you might think this topic is a duplicate of previous topics such as those listed at the bottom of this post, but it's not. Those topics discuss the matter only theoretically, and none of them gives even a single example demonstrating why reinterpret_cast is really needed and two static_casts would surely fail. I agree, one static_cast would fail. But how about two?
If the syntax of two chained static_cast looks cumbersome, then we can write a function template to make it more programmer-friendly:
template<class To, class From>
To any_cast(From v)
{
return static_cast<To>(static_cast<void*>(v));
}
And then we can use this, as:
char *buffer = any_cast<char*>(&a); //choice 1
char *buffer = reinterpret_cast<char*>(&a); //choice 2
//convert back
A *pA = any_cast<A*>(buffer); //choice 1
A *pA = reinterpret_cast<A*>(buffer); //choice 2
Also, see this situation where any_cast can be useful: Proper casting for fstream read and write member functions.
So my question basically is,
Why do we have reinterpret_cast in C++?
Please show me even a single example where two chained static_cast would surely fail to do the same job?
Which cast to use; static_cast or reinterpret_cast?
Cast from Void* to TYPE* : static_cast or reinterpret_cast
There are things that reinterpret_cast can do that no sequence of static_casts can do (all from C++03 5.2.10):
A pointer can be explicitly converted to any integral type large enough to hold it.
A value of integral type or enumeration type can be explicitly converted to a pointer.
A pointer to a function can be explicitly converted to a pointer to a function of a different type.
An rvalue of type "pointer to member of X of type T1" can be explicitly converted to an rvalue of type "pointer to member of Y of type T2" if T1 and T2 are both function types or both object types.
Also, from C++03 9.2/17:
A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
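The first two items in the list above can be sketched as a round trip between a pointer and an integer. ToInteger and FromInteger are hypothetical names, and the sketch assumes the implementation provides std::uintptr_t (an optional type, though available on mainstream platforms).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pointer to integer. No static_cast sequence,
// not even one going through void*, can perform this conversion.
std::uintptr_t ToInteger(int* pointer) {
    return reinterpret_cast<std::uintptr_t>(pointer);
}

// Hypothetical helper: integer back to the same pointer type.
// Converting back to the original pointer type yields the original value.
int* FromInteger(std::uintptr_t address) {
    return reinterpret_cast<int*>(address);
}
```

For a local int x, FromInteger(ToInteger(&x)) compares equal to &x. Replacing either reinterpret_cast with static_cast makes the code fail to compile.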
You need reinterpret_cast to get a pointer with a hardcoded address (like here):
int* pointer = reinterpret_cast<int*>( 0x1234 );
you might want to have such code to get to some memory-mapped device input-output port.
A concrete example:
char a[4] = "Hi\n";
char* p = a; // the array decays to a pointer; the size is no longer part of the type
f(reinterpret_cast<char (&)[4]>(*p)); // call f after restoring full type
// ^-- any_cast<> can't do this...
// e.g. given...
template <typename T, int N> // <=--- can match this function
void f(T (&)[N]) { std::cout << "array size " << N << '\n'; }
Other than the practical reasons that others have given, where there is a difference in what the casts can do, it's a good thing to have because it's doing a different job.
static_cast is saying please convert data of type X to Y.
reinterpret_cast is saying please interpret the data in X as a Y.
It may well be that the underlying operations are the same, and that either would work in many cases. But there is a conceptual difference between saying please convert X into a Y, and saying "yes I know this data is declared as a X but please use it as if it was really a Y".
As far as I can tell, your choice 1 (two chained static_casts) is the dreaded undefined behaviour. static_cast only guarantees that casting a pointer to void* and then back to the original pointer type works, in the sense that the pointer resulting from these two conversions still points to the original object. All other conversions are UB. For pointers to objects (instances of user-defined classes), static_cast may alter the pointer value.
For the reinterpret_cast - it only alters the type of the pointer and as far as I know - it never touches the pointer value.
So technically speaking the two choices are not equivalent.
EDIT: For reference, static_cast is described in section 5.2.9 of the current C++0x draft (sorry, I don't have the C++03 standard; the draft I consider current is n3225.pdf). It describes all allowed conversions, and I guess anything not specifically listed is UB. So it can blow up your PC if it chooses to do so.
Using C-style casts is not safer. They never check whether unrelated types are being mixed together.
C++ casts help you make sure that type conversions are done between related objects (depending on the cast you use). They are the recommended way to cast, rather than the traditional C-style casts, which are far more error-prone.
Look, people, you don't really need reinterpret_cast, static_cast, or even the other two C++ styles casts (dynamic* and const).
Using a C style cast is both shorter and allows you to do everything the four C++-style cast let you do.
anyType someVar = (anyOtherType)otherVar;
So why use the C++-style casts? Readability. Secondly: because the more restrictive casts allow more code safety.
*okay, you might need dynamic