Is pointer conversion expensive or not? - c++

Is pointer conversion considered expensive (e.g. how many CPU cycles does it take to convert a pointer/address), especially when you have to do it quite frequently? For instance (just an example to show the scale of frequency; I know there are better ways for this particular case):
unsigned long long *x;
/* fill data into x */
for (int i = 0; i < 1000*1000*1000; i++)
{
    A[i] = foo((unsigned char*)x + i);
}

(e.g. how many CPU cycles it takes to convert a pointer/address)
In most machine code languages there is only 1 "type" of pointer and so it doesn't cost anything to convert between them. Keep in mind that C++ types really only exist at compile time.
The real issue is that this sort of code can break strict aliasing rules. You can read more about this elsewhere, but essentially the compiler will either produce incorrect code through undefined behavior, or be forced to make conservative assumptions and thus produce slower code. (Note that char* and friends are somewhat exempt from the undefined-behavior part.)
Optimizers often have to make conservative assumptions about variables in the presence of pointers. For example, a constant propagation process that knows the value of variable x is 5 would not be able to keep using this information after an assignment to another variable (for example, *y = 10) because it could be that *y is an alias of x. This could be the case after an assignment like y = &x.
As an effect of the assignment to *y, the value of x would be changed as well, so propagating the information that x is 5 to the statements following *y = 10 would be potentially wrong (if *y is indeed an alias of x). However, if we have information about pointers, the constant propagation process could make a query like: can x be an alias of *y? Then, if the answer is no, x = 5 can be propagated safely.
Another optimization impacted by aliasing is code reordering. If the compiler decides that x is not aliased by *y, then code that uses or changes the value of x can be moved before the assignment *y = 10, if this would improve scheduling or enable more loop optimizations to be carried out.
To enable such optimizations in a predictable manner, the ISO standard for the C programming language (including its newer C99 edition, see section 6.5, paragraph 7) specifies that it is illegal (with some exceptions) for pointers of different types to reference the same memory location. This rule, known as "strict aliasing", sometimes allows for impressive increases in performance,[1] but has been known to break some otherwise valid code. Several software projects intentionally violate this portion of the C99 standard. For example, Python 2.x did so to implement reference counting,[2] and required changes to the basic object structs in Python 3 to enable this optimisation. The Linux kernel does this because strict aliasing causes problems with optimization of inlined code.[3] In such cases, when compiled with gcc, the option -fno-strict-aliasing is invoked to prevent unwanted optimizations that could yield unexpected code.
http://en.wikipedia.org/wiki/Aliasing_(computing)#Conflicts_with_optimization
What is the strict aliasing rule?

On any architecture you're likely to encounter, all pointer types have the same representation, and so conversion between different pointer types representing the same address has no run-time cost. This applies to all pointer conversions in C.
In C++, some pointer conversions have a cost and some don't:
reinterpret_cast and const_cast (or an equivalent C-style cast, such as the one in the question) and conversion to or from void* will simply reinterpret the pointer value, with no cost.
Conversion between pointer-to-base-class and pointer-to-derived class (either implicitly, or with static_cast or an equivalent C-style cast) may require adding a fixed offset to the pointer value if there are multiple base classes.
dynamic_cast will do a non-trivial amount of work to look up the pointer value based on the dynamic type of the object pointed to.
Historically, some architectures (e.g. PDP-10) had different representations for pointer-to-byte and pointer-to-word; there may be some runtime cost for conversions there.

unsigned long long *x;
/* fill data into x */
for (int i = 0; i < 1000*1000*1000; i++)
{
    A[i] = foo((unsigned char*)x + i); // bad cast
}
Remember, the machine only knows memory addresses, data and code. Everything else (such as types) is known only to the compiler, which aids the programmer: the compiler does all the pointer arithmetic, and only the compiler knows the size of each type, and so forth.
At runtime, no machine cycles are wasted converting one pointer type to another, because the conversion does not happen at runtime. All pointers are treated as 4 bytes long (on a 32-bit machine), nothing more and nothing less.

It all depends on your underlying hardware.
On most machine architectures, all pointers are byte pointers, and converting between one byte pointer type and another is a no-op. On some architectures, a pointer conversion may under some circumstances require extra manipulation (there are machines that work with word-based addresses, for instance, and converting a word pointer to a byte pointer or vice versa will require extra manipulation).
Moreover, this is in general an unsafe technique, as the compiler can't perform any sanity checking on what you are doing, and you can end up overwriting data you didn't expect.

Related

Undefined behaviour in RE2 which is stated to be well defined

Recently I've found that the RE2 library uses this technique for fast set lookups. During the lookup it uses values from an uninitialized array, which, as far as I know, is undefined behaviour.
I've even found an issue with valgrind warnings about use of uninitialized memory. But the issue was closed with a comment that this behaviour is intended.
I suppose that in reality an uninitialized array will just contain some random data on all modern compilers and architectures. But on the other hand I treat the 'undefined behaviour' statement as 'literally anything can happen' (including your program formats your hard drive or Godzilla comes and destroys your city).
The question is: is it legal to use uninitialized data in C++?
When C was originally designed, if arr was an array of some type T occupying N bytes, an expression like arr[i] meant "take the base address of arr, add i*N to it, fetch N bytes at the resulting address, and interpret them as a T". If every possible combination of N bytes would have a meaning when interpreted as a type T, fetching an uninitialized array element may yield an arbitrary value, but the behavior would otherwise be predictable. If T is a 32-bit type, an attempt to read an uninitialized array element of type T would yield one of at most 4294967296 possible behaviors; such action would be safe if and only if every one of those 4294967296 behaviors would meet a program's requirements. As you note, there are situations where such a guarantee is useful.
The C Standard, however, describes a semantically-weaker language which does not guarantee that an attempt to read an uninitialized array element will behave in a fashion consistent with any bit pattern the storage might have contained. Compiler writers want to process this weaker language, rather than the one Dennis Ritchie invented, because it allows them to apply a number of optimizations without regard for how they interact. For example, if code performs a=x; and later performs b=a; and c=a;, and if a compiler can't "see" anything between the original assignment and the later ones that could change a or x, it could omit the first assignment and change the latter two assignments to b=x; and c=x;. If, however, something happens between the latter two assignments that would change x, that could result in b and c getting different values--something that should be impossible if nothing changes a.
Applying that optimization by itself wouldn't be a problem if nothing changed x that shouldn't. On the other hand, consider code which uses some allocated storage as type float, frees it, re-allocates it, and uses it as type int. If the compiler knows that the original and replacement request are of the same size, it could recycle the storage without freeing and reallocating it. That could, however, cause the code sequence:
float *fp = malloc(4);
...
*fp = slowCalculation();
somethingElse = *fp;
free(fp);
int *ip = malloc(4);
...
a=*ip;
b=a;
...
c=a;
to get rewritten as:
float *fp = malloc(4);
...
startSlowCalculation(); // Use some pipelined computation unit
int *ip = (int*)fp;
...
b=*ip;
*fp = resultOfSlowCalculation(); // ** Moved from up above
somethingElse = *fp;
...
c=*ip;
It would be rare for performance to benefit particularly from processing the result of the slow calculation between the writes to b and c. Unfortunately, compilers aren't designed in a way that would make it convenient to guarantee that a deferred calculation wouldn't by chance land in exactly the spot where it would cause trouble.
Personally, I regard compiler writers' philosophy as severely misguided: if a programmer in a certain situation knows that a guarantee would be useful, requiring the programmer to work around the lack of it will impose significant cost with 100% certainty. By contrast, a requirement that compilers refrain from optimizations that are predicated on the lack of that guarantee would rarely cost anything (since code to work around its absence would almost certainly block the "optimization" anyway). Unfortunately, some people seem more interested in optimizing the performance of those source texts which don't need guarantees beyond what the Standard mandates, than in optimizing the efficiency with which a compiler can generate code to accomplish useful tasks.

Cast between a pointer and integer in x86_32/64

I have a simple virtual machine which I made for fun. It works at a very low level and doesn't have any notion of types: everything is just an integer. There are some instructions for getting a pointer and accessing memory through a pointer. The problem is that these pointers are simply stored as uint64_t, and any pointer arithmetic is integer arithmetic; the machine casts the value to a void* when using it as a pointer. This kind of code is obvious at the assembly level, but the C and C++ standards don't let programmers do integer-pointer casts safely, especially when the integer is used to change the value of the pointer.
I'm not making this toy VM portable everywhere. It is only expected to work on x86_32/64 machines, and it does seem to work fine even after full compiler optimizations. I think that's because pointers are represented no differently from integers in the x86 architecture.
What kind of solution is usually applied in such situations, where the language standard doesn't declare certain code as safe, but it really should be safe on the targeted hardware, and the results do seem okay?
Or as a more practical question, how can I let a compiler (gcc) not perform breaking optimizations on code like
uint64_t registers[0x100];
registers[0] = (uint64_t)malloc(8);
registers[0] += 4;
registers[2] = 0;
memcpy(&registers[2], (void*)registers[0], 4);
The above isn't real code, but a certain sequence of bytecode instructions would actually do something similar as above.
If you really need to cast a pointer to an integer, use at least uintptr_t, for spaghetti monster's sake! This type (along with its signed counterpart) is meant to be cast safely to/from a pointer. It is, however, not meant to be used for arithmetic (though that might be safe on a flat, linear memory model where the cast does not modify the representation of either value).
Still, your code does not seem to make sense, and it might actively hinder the compiler's optimizations. Without deeper knowledge of what you intend to accomplish, I would say chances are there are better ways.

Does typecasting consume extra CPU cycles

Does typecasting in C/C++ result in extra CPU cycles?
My understanding is that it should consume extra CPU cycles, at least in certain cases, like typecasting from float to integer, where the CPU has to convert the float representation to an integer.
float a = 2.0;
int b = (int)a;
I would like to understand the cases where it would/would not consume extra CPU cycles.
I would like to say that "converting between types" is what we should be looking at, not whether there is a cast or not. For example
int a = 10;
float b = a;
will be the same as :
int a = 10;
float b = (float)a;
This also applies to changing the size of a type, e.g.
char c = 'a';
int b = c;
this will "extend c into an int size from a single byte [using byte in the C sense, not the 8-bit sense]", which may add an extra instruction (or extra clock cycle(s) to the instruction used) above and beyond the data movement itself.
Note that sometimes these conversions aren't at all obvious. On x86-64, a typical example is using int instead of unsigned int for indices in arrays. Since pointers are 64-bit, the index needs to be converted to 64-bit. In the case of an unsigned, that's trivial - just use the 64-bit version of the register the value is already in, since a 32-bit load operation will zero-fill the top part of the register. But if you have an int, it could be negative. So the compiler will have to use the "sign extend this to 64 bits" instruction. This is typically not an issue where the index is calculated based on a fixed loop and all values are positive, but if you call a function where it is not clear if the parameter is positive or negative, the compiler will definitely have to extend the value. Likewise if a function returns a value that is used as an index.
However, any reasonably competent compiler will not mindlessly add instructions to convert something from its own type to itself. (With optimization turned off it may do so, but even minimal optimization should see that "we're converting from type X to type X; that doesn't mean anything, let's take it away".)
So, in short, the above example does not add any extra penalty, but there are certainly cases where converting data from one type to another does add extra instructions and/or clock cycles to the code.
It'll consume cycles where it alters the underlying representation. So it will consume cycles if you convert a float to an int or vice-versa. Depending on architecture casts such as int to char or long long to int may or may not consume cycles (but more often than not they will). Casting between pointer types will only consume cycles if there is multiple inheritance involved.
There are different types of casts. C++ has different types of cast operators for the different types of casts. If we look at it in those terms, ...
static_cast will usually have a cost if you're converting from one type to another, especially if the target type is a different size than the source type. static_casts are sometimes used to cast a pointer from a derived type to a base type. This may also have a cost, especially if the derived class has multiple bases.
reinterpret_cast will usually not have a direct cost. Loosely speaking, this type of cast doesn't change the value, it just changes how it's interpreted. Note, however, that this may have an indirect cost. If you reinterpret a pointer to an array of bytes as a pointer to an int, then you may pay a cost each time you dereference that pointer unless the pointer is aligned as the platform expects.
const_cast should not cost anything if you're adding or removing constness, as it's mostly an annotation to the compiler. If you're using it to add or remove a volatile qualifier, then I suppose there may be a performance difference because it would enable or disable certain optimizations.
dynamic_cast, which is used to cast from a pointer to a base class to a pointer to a derived class, most certainly has a cost, as it must--at a minimum--check if the conversion is appropriate.
When you use a traditional C cast, you're essentially just asking the compiler to choose the more specific type of cast. So to figure out if your C cast has a cost, you need to figure out what type of cast it really is.
DL and enjoy Agner Fog's manuals:
http://www.agner.org/optimize/
1. Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms
It is a huge PDF, but for a start you can check out:
14.7 Don't mix float and double
14.8 Conversions between floating point numbers and integers

What are the uses of pointer variables?

I've recently tried to really come to grips with references and pointers in C++, and I'm getting a little bit confused. I understand the * and & operators which can respectively get the value at an address and get the address of a value, however why can't these simply be used with basic types like ints?
I don't understand why you can't, for example, do something like the following and not use any weird pointer variable creation:
string x = "Hello";
int y = &x; //Set 'y' to the memory address of 'x'
cout << *y; //Output the value at the address 'y' (which is the memory address of 'x')
The code above should, theoretically in my mind, output the value of 'x': 'y' contains the memory address of 'x', and hence '*y' should be 'x'. On trying to compile it, however, it doesn't work -- it tells me it can't convert from a string to an int, which doesn't make much sense, since you'd think a memory address could be stored in an int just fine.
Why do we need to use special pointer variable declarations (e.g. string *y = &x)?
And within this: if we take the * operator in the pointer declaration above literally, we are setting the value of '*y' to the memory address of 'x', but then later, when we want to access the value at that memory address, we use the very same '*y' that we previously set to the memory address.
C and C++ resolve type information at compile-time, not runtime. Even runtime polymorphism relies on the compiler constructing a table of function pointers with offsets fixed at compile time.
For that reason, the only way the program can know that cout << *y; is printing a string is because y is strongly typed as a pointer-to-string (std::string*). The program cannot, from the address alone, determine that the object stored at address y is a std::string. (Even C++ RTTI does not allow this, you need enough type information to identify a polymorphic base class.)
In short, C is a typed language. You cannot store arbitrary things in variables.
Check the type safety article at Wikipedia. C/C++ prevents problematic operations and function calls at compilation time by checking the types of the operands and function parameters (but note that with explicit casts you can change the type of an expression).
It doesn't make sense to store a string in an integer -> The same way it doesn't make sense to store a pointer in it.
Simply put, a memory address has a type, which is pointer. Pointers are not ints, so you can't store a pointer in an int variable. If you're curious why ints and pointers are not fungible, it's because the size of each is implementation defined (with certain restrictions) and there is no guarantee that they will be the same size.
For instance, as #Damien_The_Unbeliever pointed out, pointers on a 64-bit system must be 64 bits long, but it is perfectly legal for an int to be 32 bits, as long as it is no longer than a long and no shorter than a short.
As to why each data type has its own pointer type, that's because each type (especially user-defined types) is structured differently in memory. If we were to dereference typeless (or void) pointers, there would be no information indicating how that data should be interpreted. If, on the other hand, you were to create a universal pointer and do away with the "inconvenience" of specifying types, each entity in memory would probably have to be stored alongside its type information. While this is doable, it's far from efficient, and efficiency is one of C++'s design goals.
Some very low-level languages... like machine language... operate exactly as you describe. A number is a number, and it's up to the programmer to hold it in their heads what it represents. Generally speaking, the hope of higher level languages is to keep you from the concerns and potential for error that comes from that style of development.
You can actually disregard C++'s type-safety, at your peril. For instance, the gcc on a 32-bit machine I have will print "Hello" when I run this:
string x = "Hello";
int y = reinterpret_cast<int>(&x);
cout << *reinterpret_cast<string*>(y) << endl;
But as pretty much every other answerer has pointed out, there's no guarantee it would work on another computer. If I try this on a 64-bit machine, I get:
error: cast from ‘std::string*’ to ‘int’ loses precision
Which I can work around by changing it to a long:
string x = "Hello";
long y = reinterpret_cast<long>(&x);
cout << *reinterpret_cast<string*>(y) << endl;
The C++ standard specifies minimums for these types, but not maximums, so you really don't know what you're going to be dealing with when you face a new compiler. See: What does the C++ standard state the size of int, long type to be?
So the potential for writing non-portable code is high once you start going this route and "casting away" the safeties in the language. reinterpret_cast is the most dangerous type of casting...
When should static_cast, dynamic_cast, const_cast and reinterpret_cast be used?
But that's just technically drilling down into the "why not int" part specifically, in case you were interested. Note that, as #BenVoight points out in the comment below, there does exist an integer type as of C99 called intptr_t which is guaranteed to hold any pointer. So there are much larger problems when you throw away type information than losing precision... like accidentally casting back to the wrong type!
C++ is a strongly typed language, and pointers and integers are different types. By making those separate types the compiler is able to detect misuses and tell you that what you are doing is incorrect.
At the same time, the pointer type maintains information on the type of the pointed-to object: if you obtain the address of a double, you have to store that in a double*, and the compiler knows that dereferencing that pointer will yield a double. In your example code, int y = &x; cout << *y; the compiler would lose the information of what y points to; the type of the expression *y would be unknown, and it would not be able to determine which of the different overloads of operator<< to call. Compare that with std::string *y = &x;: when the compiler sees y it knows it is a std::string* and knows that dereferencing it yields a std::string (and not a double or any other type), enabling the compiler to statically check all expressions that contain y.
Finally, while you might think that a pointer is just the address of the object and should therefore be representable by an integral type (which on 64-bit architectures would have to be a 64-bit integer rather than int), that is not always the case. There are architectures on which pointers are not really representable by integral values. For example, in architectures with segmented memory, the address of an object can contain both a segment (an integral value) and an offset into the segment (another integral value). On other architectures, the size of pointers was different from the size of any integral type.
The language is trying to protect you from conflating two different concepts, even though at the hardware level they are both just sets of bits:
Outside of needing to pass values manually between various parts of a debugger, you never need to know the numerical value.
Outside of archaic uses of arrays, it doesn't make sense to "add 10" to a pointer - so you shouldn't treat them as numeric values.
By the compiler retaining type information, it also prevents you from making mistakes - if all pointers were equal, then the compiler couldn't, helpfully, point out that what you're trying to dereference as an int is a pointer to a string.

What does this C++ construct do?

Somewhere in lines of code, I came across this construct...
//void* v = void* value from an iterator
int i = (int)(long(v));
What possible purpose can this construct serve?
Why not simply use int(v) instead? Why the cast to long first?
Most likely it silences warnings.
Assuming a target where sizeof(int) < sizeof(long) and sizeof(long) == sizeof(void *) (e.g. an LP64 system), you would get a warning if you cast a void * directly to an int, and no warning if you cast a void * to a long, as you're not truncating. You would then get a warning assigning a long to an int (possible truncation), which is removed by explicitly casting from long to int.
Without knowing the compiler it's hard to say, but I've certainly seen multi-step casts required to prevent warnings. Why not try converting the construct to what you think it should be and see what the compiler says (of course that only helps you to work out what was in the mind of the original programmer if you're using the same compiler and same warning level as they were).
It does eeevil.
On most architectures, a pointer can be considered to be just another kind of number. On most architectures, long has as many bits as a pointer, so there is a 1-to-1 map between long values and pointers. But exceptions, especially to the second assumption, are not uncommon!
long(v) is an alias for reinterpret_cast<long>(v), which carries no guarantees. Not really fit for any purpose, unless your ABI spec says otherwise.
However, for whatever reason, whoever wrote that code prefers int to long. So they again cross their fingers and hope that no essential information is thrown away in the bits that may be lost in the long to int cast.
Two uses of this are creating a unique object identifier, or trying to somehow package the pointer for some kind of arithmetic otherwise unsupported by pointers.
An opaque identifier can be a void*, so casting to an integral type is unnecessary.
"Extracting" an integer from a pointer (e.g. for a division operation) can always be done by subtracting a base pointer to obtain a difference of type ptrdiff_t, which is usually long.