Example of error caused by UB of incrementing a NULL pointer - c++

This code:
int *p = nullptr;
p++;
causes undefined behaviour, as discussed in Is incrementing a null pointer well-defined?
But when explaining to colleagues why they should avoid UB, besides saying it is bad because UB means that anything could happen, I like to have an example demonstrating it. I have tons of them for accessing an array past its bounds, but I could not find a single one for this case.
I even tried
int testptr(int *p) {
intptr_t ip;
int *p2 = p + 1;
ip = (intptr_t) p2;
if (p == nullptr) {
ip *= 2;
}
else {
ip *= -2;
}
return (int) ip;
}
in a separate compilation unit, hoping that an optimizing compiler would skip the test, because when p is null the line int *p2 = p + 1; is UB, and compilers are allowed to assume that code does not contain UB.
But gcc 4.8.2 (I have no usable gcc 4.9) and clang 3.4.1 both return a positive value!
Could someone suggest some more clever code, or another optimizing compiler, that exhibits a problem when incrementing a null pointer?

How about this example:
int main(int argc, char* argv[])
{
int a[] = { 111, 222 };
int *p = (argc > 1) ? &a[0] : nullptr;
p++;
p--;
return (p == nullptr);
}
At face value, this code says: 'If there are any command line arguments, initialise p to point to the first member of a[], otherwise initialise it to null. Then increment it, then decrement it, and tell me if it's null.'
Read that way, it should return '0' (indicating p is non-null) if we supply a command line argument, and '1' (indicating null) if we don't.
Note that at no point do we dereference p, and if we supply an argument then p always points within the bounds of a[].
Compiling with the command line clang -S --std=c++11 -O2 nulltest.cpp (Cygwin clang 3.5.1) yields the following generated code:
.text
.def main;
.scl 2;
.type 32;
.endef
.globl main
.align 16, 0x90
main: # #main
.Ltmp0:
.seh_proc main
# BB#0:
pushq %rbp
.Ltmp1:
.seh_pushreg 5
movq %rsp, %rbp
.Ltmp2:
.seh_setframe 5, 0
.Ltmp3:
.seh_endprologue
callq __main
xorl %eax, %eax
popq %rbp
retq
.Leh_func_end0:
.Ltmp4:
.seh_endproc
This code says 'return 0'. It doesn't even bother to check the number of command line args.
(And interestingly, commenting out the decrement has no effect on the generated code.)

Extracted from http://c-faq.com/null/machexamp.html:
Q: Seriously, have any actual machines really used nonzero null
pointers, or different representations for pointers to different
types?
A: The Prime 50 series used segment 07777, offset 0 for the null
pointer, at least for PL/I. Later models used segment 0, offset 0 for
null pointers in C, necessitating new instructions such as TCNP (Test
C Null Pointer), evidently as a sop to [footnote] all the extant
poorly-written C code which made incorrect assumptions. Older,
word-addressed Prime machines were also notorious for requiring larger
byte pointers (char *'s) than word pointers (int *'s).
The Eclipse MV series from Data General has three architecturally
supported pointer formats (word, byte, and bit pointers), two of which
are used by C compilers: byte pointers for char * and void *, and word
pointers for everything else. For historical reasons during the
evolution of the 32-bit MV line from the 16-bit Nova line, word
pointers and byte pointers had the offset, indirection, and ring
protection bits in different places in the word. Passing a mismatched
pointer format to a function resulted in protection faults.
Eventually, the MV C compiler added many compatibility options to try
to deal with code that had pointer type mismatch errors.
Some Honeywell-Bull mainframes use the bit pattern 06000 for
(internal) null pointers.
The CDC Cyber 180 Series has 48-bit pointers consisting of a ring,
segment, and offset. Most users (in ring 11) have null pointers of
0xB00000000000. It was common on old CDC ones-complement machines to
use an all-one-bits word as a special flag for all kinds of data,
including invalid addresses.
The old HP 3000 series uses a different addressing scheme for byte
addresses than for word addresses; like several of the machines above
it therefore uses different representations for char * and void *
pointers than for other pointers.
The Symbolics Lisp Machine, a tagged architecture, does not even have
conventional numeric pointers; it uses the pair <NIL, 0> (basically a
nonexistent <object, offset> handle) as a C null pointer.
Depending on the ``memory model'' in use, 8086-family processors (PC
compatibles) may use 16-bit data pointers and 32-bit function
pointers, or vice versa.
Some 64-bit Cray machines represent int * in the lower 48 bits of a
word; char * additionally uses some of the upper 16 bits to indicate a
byte address within a word.
Given that null pointers have an unusual bit-pattern representation on the machines quoted above, the code you posted:
int *p = nullptr;
p++;
would not give the value most people would expect (0 + sizeof(*p)).
Instead you would get a value based on your machine-specific null pointer bit pattern (unless the compiler special-cases null pointer arithmetic; since the standard does not mandate that, you will most likely see Undefined Behaviour with a "visible", concrete effect).

An ideal C implementation would, when not being used for kinds of systems programming that would require using pointers which the programmer knew to have meaning but the compiler did not, ensure that every pointer was either valid or was recognizable as invalid, and would trap any time code either tried to dereference an invalid pointer (including null) or used illegitimate means to create something that wasn't a valid pointer but might be mistaken for one. On most platforms, having generated code enforce such a constraint in all situations would be quite expensive, but guarding against many common erroneous scenarios is much cheaper.
On many platforms, it is relatively inexpensive to have the compiler generate for *foo=23 code equivalent to if (!foo) NULL_POINTER_TRAP(); else *foo=23;. Even primitive compilers in the 1980s often had an option for that. The usefulness of such trapping may be largely lost, however, if compilers allow a null pointer to be incremented in such a fashion that it is no longer recognizable as a null pointer. Consequently, a good compiler should, when error-trapping is enabled, replace foo++; with foo = (foo ? foo+1 : (NULL_POINTER_TRAP(),0));. Arguably, the real "billion dollar mistake" wasn't inventing null pointers, but lay rather with the fact that some compilers would trap direct null-pointer stores, but would not trap null-pointer arithmetic.
Given that an ideal compiler would trap on an attempt to increment a null pointer (many compilers fail to do so for reasons of performance rather than semantics), I can see no reason why code should expect such an increment to have meaning. In just about any case where a programmer might expect a compiler to assign a meaning to such a construct [e.g. ((char*)0)+5 yielding a pointer to address 5], it would be better for the programmer to instead use some other construct to form the desired pointer (e.g. ((char*)5)).

This is just for completeness, but the link proposed by @HansPassant in a comment really deserves to be cited as an answer.
All references are here; the following are just some extracts:
This article is about a new memory-safe interpretation of the C abstract
machine that provides stronger protection to benefit
security and debugging ... [Writers] demonstrate that it is possible for a memory-safe implementation
of C to support not just the C abstract machine
as specified, but a broader interpretation that is still compatible
with existing code. By enforcing the model in hardware,
our implementation provides memory safety that can be used
to provide high-level security properties for C ...
[Implementation] memory capabilities are represented as the
triplet (base, bound, permissions), which is loosely packed
into a 256-bit value. Here base provides an offset into a virtual
address region, and bound limits the size of the region
accessed ... Special capability
load and store instructions allow capabilities to be spilled
to the stack or stored in data structures, just like pointers ... with
the caveat that pointer subtraction is not allowed.
The addition of permissions allows capabilities to
be tokens granting certain rights to the referenced memory.
For example, a memory capability may have permissions
to read data and capabilities, but not to write them (or just
to write data but not capabilities). Attempting any of the
operations that is not permitted will cause a trap.
[The] results confirm that it is possible to retain the strong
semantics of a capability-system memory model (which provides
non-bypassable memory protection) without sacrificing
the advantages of a low-level language.
(emphasis mine)
That means that even if it is not yet an operational compiler, research on building one that could trap on incorrect pointer usage exists and has already been published.

Related

What is a valid pointer in gcc linux x86-64 C++?

I am programming C++ using gcc on an obscure system called linux x86-64. I was hoping that maybe there are a few folks out there who have used this same, specific system (and might also be able to help me understand what a valid pointer is on this system). I do not care to access the location pointed to by the pointer; I just want to calculate it via pointer arithmetic.
According to section 3.9.2 of the standard:
A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null pointer.
And according to [expr.add]/4:
When an expression that has integral type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
expression P points to element x[i] of an array object x with n
elements, the expressions P + J and J + P (where J has the value j)
point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤
n; otherwise, the behavior is undefined. Likewise, the expression P -
J points to the (possibly-hypothetical) element x[i − j] if 0 ≤ i − j
≤ n; otherwise, the behavior is undefined.
And according to a stackoverflow question on valid C++ pointers in general:
Is 0x1 a valid memory address on your system? Well, for some embedded systems it is. For most OSes using virtual memory, the page beginning at zero is reserved as invalid.
Well, that makes it perfectly clear! So, besides NULL, a valid pointer is a byte in memory, no, wait, it's an array element including the element right after the array, no, wait, it's a virtual memory page, no, wait, it's Superman!
(I guess that by "Superman" here I mean "garbage collectors"... not that I read that anywhere, just smelled it. Seriously, though, all the best garbage collectors don't break in a serious way if you have bogus pointers lying around; at worst they just don't collect a few dead objects every now and then. Doesn't seem like anything worth messing up pointer arithmetic for.).
So, basically, a proper compiler would have to support all of the above flavors of valid pointers. I mean, a hypothetical compiler having the audacity to generate undefined behavior just because a pointer calculation is bad would be dodging at least the 3 bullets above, right? (OK, language lawyers, that one's yours).
Furthermore, many of these definitions are next to impossible for a compiler to know about. There are just so many ways of creating a valid memory byte (think lazy segfault trap microcode, sideband hints to a custom pagetable system that I'm about to access part of an array, ...), mapping a page, or simply creating an array.
Take, for example, a largish array I created myself, and a smallish array that I let the default memory manager create inside of that:
#include <iostream>
#include <inttypes.h>
#include <assert.h>
using namespace std;
extern const char largish[1000000000000000000L];
asm("largish = 0");
int main()
{
char* smallish = new char[1000000000];
cout << "largish base = " << (long)largish << "\n"
<< "largish length = " << sizeof(largish) << "\n"
<< "smallish base = " << (long)smallish << "\n";
}
Result:
largish base = 0
largish length = 1000000000000000000
smallish base = 23173885579280
(Don't ask how I knew that the default memory manager would allocate something inside of the other array. It's an obscure system setting. The point is I went through weeks of debugging torment to make this example work, just to prove to you that different allocation techniques can be oblivious to one another).
Given the number of ways of managing memory and combining program modules that are supported in linux x86-64, a C++ compiler really can't know about all of the arrays and various styles of page mappings.
Finally, why do I mention gcc specifically? Because it often seems to treat any pointer as a valid pointer... Take, for instance:
char* super_tricky_add_operation(char* a, long b) {return a + b;}
While after reading all the language specs you might expect the implementation of super_tricky_add_operation(a, b) to be rife with undefined behavior, it is in fact very boring, just an add or lea instruction. Which is so great, because I can use it for very convenient and practical things like non-zero-based arrays if nobody is putzing with my add instructions just to make a point about invalid pointers. I love gcc.
In summary, it seems that any C++ compiler supporting standard linkage tools on linux x86-64 would almost have to treat any pointer as a valid pointer, and gcc appears to be a member of that club. But I'm not quite 100% sure (given enough fractional precision, that is).
So... can anyone give a solid example of an invalid pointer in gcc linux x86-64? By solid I mean leading to undefined behavior. And explain what gives rise to the undefined behavior allowed by the language specs?
(or provide gcc documentation proving the contrary: that all pointers are valid).
Usually pointer math does exactly what you'd expect regardless of whether pointers are pointing at objects or not.
UB doesn't mean it has to fail. Only that it's allowed to make the whole rest of the program behave strangely in some way. UB doesn't mean that just the pointer-compare result can be "wrong", it means the entire behaviour of the whole program is undefined. This tends to happen with optimizations that depend on a violated assumption.
Interesting corner cases include an array at the very top of virtual address space: a pointer to one-past-the-end would wrap to zero, so start < end would be false?!? But pointer comparison doesn't have to handle that case, because the Linux kernel won't ever map the top page, so pointers into it can't be pointing into or just past objects. See Why can't I mmap(MAP_FIXED) the highest virtual page in a 32-bit Linux process on a 64-bit kernel?
Related:
GCC does have a max object size of PTRDIFF_MAX (which is a signed type). So for example, on 32-bit x86, an array larger than 2GB isn't fully supported for all cases of code-gen, although you can mmap one.
See my comment on What is the maximum size of an array in C? - this restriction lets gcc implement pointer subtraction (to get a size) without keeping the carry-out from the high bit, for types wider than char where the C subtraction result is in objects, not bytes, so in asm it's (a - b) / sizeof(T).
(Don't ask how I knew that the default memory manager would allocate something inside of the other array. It's an obscure system setting. The point is I went through weeks of debugging torment to make this example work, just to prove to you that different allocation techniques can be oblivious to one another).
First of all, you never actually allocated the space for largish[]. You used inline asm to make it start at address 0, but did nothing to actually get those pages mapped.
The kernel won't overlap existing mapped pages when new uses brk or mmap to get new memory from the kernel, so in fact static and dynamic allocation can't overlap.
Second, char[1000000000000000000L] ~= 2^59 bytes. Current x86-64 hardware and software only support canonical 48-bit virtual addresses (sign-extended to 64-bit). This will change with a future generation of Intel hardware which adds another level of page tables, taking us up to 48+9 = 57-bit addresses. (Still with the top half used by the kernel, and a big hole in the middle.)
Your unallocated space from 0 to ~2^59 covers all user-space virtual memory addresses that are possible on x86-64 Linux, so of course anything you allocate (including other static arrays) will be somewhere "inside" this fake array.
Removing the extern const from the declaration (so the array is actually allocated, https://godbolt.org/z/Hp2Exc) runs into the following problems:
//extern const
char largish[1000000000000000000L];
//asm("largish = 0");
/* rest of the code unchanged */
RIP-relative or 32-bit absolute (-fno-pie -no-pie) addressing can't reach static data that gets linked after largish[] in the BSS, with the default code model (-mcmodel=small, where all static code+data is assumed to fit in 2GB):
$ g++ -O2 large.cpp
/usr/bin/ld: /tmp/cc876exP.o: in function `_GLOBAL__sub_I_largish':
large.cpp:(.text.startup+0xd7): relocation truncated to fit: R_X86_64_PC32 against `.bss'
/usr/bin/ld: large.cpp:(.text.startup+0xf5): relocation truncated to fit: R_X86_64_PC32 against `.bss'
collect2: error: ld returned 1 exit status
Compiling with -mcmodel=medium places largish[] in a large-data section where it doesn't interfere with addressing other static data, but it itself is addressed using 64-bit absolute addressing. (Or -mcmodel=large does that for all static code/data, so every call is indirect movabs reg,imm64 / call reg instead of call rel32.)
That lets us compile and link, but then the executable won't run because the kernel knows that only 48-bit virtual addresses are supported and won't map the program in its ELF loader before running it, or for PIE before running ld.so on it.
peter@volta:/tmp$ g++ -fno-pie -no-pie -mcmodel=medium -O2 large.cpp
peter@volta:/tmp$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffd788a4b60 /* 52 vars */) = -1 EINVAL (Invalid argument)
+++ killed by SIGSEGV +++
Segmentation fault (core dumped)
peter@volta:/tmp$ g++ -mcmodel=medium -O2 large.cpp
peter@volta:/tmp$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffdd3bbad00 /* 52 vars */) = -1 ENOMEM (Cannot allocate memory)
+++ killed by SIGSEGV +++
Segmentation fault (core dumped)
(Interesting that we get different error codes for PIE vs non-PIE executables, but still before execve() even completes.)
Tricking the compiler + linker + runtime with asm("largish = 0"); is not very interesting, and creates obvious undefined behaviour.
Fun fact #2: x64 MSVC doesn't support static objects larger than 2^31-1 bytes. IDK if it has a -mcmodel=medium equivalent. Basically GCC fails to warn about objects too large for the selected memory model.
<source>(7): error C2148: total size of array must not exceed 0x7fffffff bytes
<source>(13): warning C4311: 'type cast': pointer truncation from 'char *' to 'long'
<source>(14): error C2070: 'char [-1486618624]': illegal sizeof operand
<source>(15): warning C4311: 'type cast': pointer truncation from 'char *' to 'long'
Also, it points out that long is the wrong type for pointers in general (because Windows x64 is an LLP64 ABI, where long is 32 bits). You want intptr_t or uintptr_t, or something equivalent to printf("%p") that prints a raw void*.
The Standard does not anticipate the existence of any storage beyond that which the implementation provides via objects of static, automatic, or thread duration, or the use of standard-library functions like calloc. It consequently imposes no restrictions on how implementations process pointers to such storage, since from its perspective such storage doesn't exist, pointers that meaningfully identify non-existent storage don't exist, and things that don't exist don't need to have rules written about them.
That doesn't mean that the people on the Committee weren't well aware that many execution environments provided forms of storage that C implementations might know nothing about. They expected, however, that people who actually worked with various platforms would be better placed than the Committee to determine what kinds of things programmers would need to do with such "outside" addresses, and how to best support such needs. There was no need for the Standard to concern itself with such things.
As it happens, there are some execution environments where it is more convenient for a compiler to treat pointer arithmetic like integer math than to do anything else, and many compilers for such platforms treat pointer arithmetic usefully even in cases where they're not required to do so. For 32-bit x86 and 64-bit x64, I don't think there are any bit patterns for invalid non-null addresses, but it may be possible to form pointers that don't behave as valid pointers to the objects they address.
For example, given something like:
char x=1,y=2;
ptrdiff_t delta = (uintptr_t)&y - (uintptr_t)&x;
char *p = &x+delta;
*p = 3;
even if pointer representation is defined in such a way that using integer arithmetic to add delta to the address of x would yield y, that would in no way guarantee that a compiler would recognize that operations on *p might affect y, even if p holds y's address. Pointer p would effectively behave as though its address was invalid even though the bit pattern would match that of y's address.
The following examples show that GCC specifically assumes at least the following:
A global array cannot be at address 0.
An array cannot wrap around address 0.
Examples of unexpected behavior arising from arithmetic on invalid pointers in gcc linux x86-64 C++ (thank you melpomene):
largish == NULL evaluates to false in the program in the question.
unsigned n = ...; if (ptr + n < ptr) { /*overflow */ } can be optimized to if (false).
int arr[123]; int n = ...; if (arr + n < arr || arr + n > arr + 123) can be optimized to if (false).
Note that these examples all involve comparison of the invalid pointers, and therefore may not affect the practical case of non-zero-based arrays. Therefore I have opened a new question of a more practical nature.
Thank you everyone in the chat for helping to narrow down the question.

Where are expressions and constants stored if not in memory?

From C Programming Language by Brian W. Kernighan
& operator only applies to objects in memory: variables and array
elements. It cannot be applied to expressions, constants or register
variables.
Where are expressions and constants stored if not in memory?
What does that quote mean?
E.g:
&(2 + 3)
Why can't we take its address? Where is it stored?
Will the answer be same for C++ also since C has been its parent?
This linked question explains that such expressions are rvalue objects and all rvalue objects do not have addresses.
My question is where are these expressions stored such that their addresses can't be retrieved?
Consider the following function:
unsigned sum_evens (unsigned number) {
number &= ~1; // ~1 = 0xfffffffe (32-bit CPU)
unsigned result = 0;
while (number) {
result += number;
number -= 2;
}
return result;
}
Now, let's play the compiler game and try to compile this by hand. I'm going to assume you're using x86 because that's what most desktop computers use. (x86 is the instruction set for Intel compatible CPUs.)
Let's go through a simple (unoptimized) version of what this routine could look like when compiled:
sum_evens:
and edi, 0xfffffffe ;edi is where the first argument goes
xor eax, eax ;set register eax to 0
cmp edi, 0 ;compare number to 0
jz .done ;if edi = 0, jump to .done
.loop:
add eax, edi ;eax = eax + edi
sub edi, 2 ;edi = edi - 2
jnz .loop ;if edi != 0, go back to .loop
.done:
ret ;return (value in eax is returned to caller)
Now, as you can see, the constants in the code (0, 2, 1) actually show up as part of the CPU instructions! In fact, 1 doesn't show up at all; the compiler (in this case, just me) already calculates ~1 and uses the result in the code.
While you can take the address of a CPU instruction, it often makes no sense to take the address of a part of it (in x86 you sometimes can, but in many other CPUs you simply cannot do this at all), and code addresses are fundamentally different from data addresses (which is why you cannot treat a function pointer (a code address) as a regular pointer (a data address)). In some CPU architectures, code addresses and data addresses are completely incompatible (although this is not the case of x86 in the way most modern OSes use it).
Do notice that while (number) is equivalent to while (number != 0). That 0 doesn't show up in the compiled code at all! It's implied by the jnz instruction (jump if not zero). This is another reason why you cannot take the address of that 0 — it doesn't have one, it's literally nowhere.
I hope this makes it clearer for you.
where are these expressions stored such that their addresses can't be retrieved?
Your question is not well-formed.
Conceptually
It's like asking why people can discuss ownership of nouns but not verbs. Nouns refer to things that may (potentially) be owned, and verbs refer to actions that are performed. You can't own an action or perform a thing.
In terms of language specification
Expressions are not stored in the first place, they are evaluated.
They may be evaluated by the compiler, at compile time, or they may be evaluated by the processor, at run time.
In terms of language implementation
Consider the statement
int a = 0;
This does two things: first, it declares an integer variable a. This is defined to be something whose address you can take. It's up to the compiler to do whatever makes sense on a given platform, to allow you to take the address of a.
Secondly, it sets that variable's value to zero. This does not mean an integer with value zero exists somewhere in your compiled program. It might commonly be implemented as
xor eax,eax
which is to say, XOR (exclusive-or) the eax register with itself. This always results in zero, whatever was there before. However, there is no fixed object of value 0 in the compiled code to match the integer literal 0 you wrote in the source.
As an aside, when I say that a above is something whose address you can take - it's worth pointing out that it may not really have an address unless you take it. For example, the eax register used in that example doesn't have an address. If the compiler can prove the program is still correct, a can live its whole life in that register and never exist in main memory. Conversely, if you use the expression &a somewhere, the compiler will take care to create some addressable space to store a's value in.
Note for comparison that I can easily choose a different language where I can take the address of an expression.
It'll probably be interpreted, because compilation usually discards these structures once the machine-executable output replaces them. For example Python has runtime introspection and code objects.
Or I can start from LISP and extend it to provide some kind of addressof operation on S-expressions.
The key thing they both have in common is that they are not C, which as a matter of design and definition does not provide those mechanisms.
Such expressions end up part of the machine code. An expression 2 + 3 likely gets translated to the machine code instruction "load 5 into register A". CPU registers don't have addresses.
It does not really make sense to take the address of an expression. The closest thing you can do is a function pointer. Expressions are not stored in the same sense as variables and objects.
Expressions are stored in the actual machine code. Of course you could find the address where the expression is evaluated, but it just doesn't make sense to do it.
Read a bit about assembly. Expressions are stored in the text segment, while variables are stored in other segments, such as data or stack.
https://en.wikipedia.org/wiki/Data_segment
Another way to explain it is that expressions are cpu instructions, while variables are pure data.
One more thing to consider: The compiler often optimizes away things. Consider this code:
int x=0;
while(x<10)
x+=1;
This code will probably be optimized to:
int x=10;
So what would the address to (x+=1) mean in this case? It is not even present in the machine code, so it has - by definition - no address at all.
Where are expressions and constants stored if not in memory
In some (actually many) cases, a constant expression is not stored at all. In particular, think about optimizing compilers, and see CppCon 2017: Matt Godbolt's talk “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”
In your particular case of some C code having 2 + 3, most optimizing compilers would have constant folded that into 5, and that 5 constant might be just inside some machine code instruction (as some bitfield) of your code segment and not even have a well defined memory location. If that constant 5 was a loop limit, some compilers could have done loop unrolling, and that constant won't appear anymore in the binary code.
See also this answer, etc...
Be aware that C11 is a specification written in English. Read its n1570 standard. Read also the much bigger specification of C++11 (or later).
Taking the address of a constant is forbidden by the semantics of C (and of C++).

Are pointer variables just integers with some operators or are they "symbolic"?

EDIT: The original word choice was confusing. The term "symbolic" is much better than the original ("mystical").
In the discussion about my previous C++ question, I have been told that pointers are
"a simple value type much like an integer"
not "mystical"
"The Bit pattern (object representation) contains the value (value representation) (§3.9/4) for trivially copyable types, which a pointer is."
This does not sound right! If nothing is symbolic and a pointer is its representation, then I can do the following. Can I?
#include <stdio.h>
#include <string.h>
int main() {
int a[1] = { 0 }, *pa1 = &a[0] + 1, b = 1, *pb = &b;
if (memcmp (&pa1, &pb, sizeof pa1) == 0) {
printf ("pa1 == pb\n");
*pa1 = 2;
}
else {
printf ("pa1 != pb\n");
pa1 = &a[0]; // ensure well defined behaviour in printf
}
printf ("b = %d *pa1 = %d\n", b, *pa1);
return 0;
}
This is a C and C++ question.
Testing with Compile and Execute C Online with GNU GCC v4.8.3: gcc -O2 -Wall gives
pa1 == pb
b = 1 *pa1 = 2
Testing with Compile and Execute C++ Online with GNU GCC v4.8.3: g++ -O2 -Wall
pa1 == pb
b = 1 *pa1 = 2
So the modification of b via (&a)[1] fails with GCC in C and C++.
Of course, I would like an answer based on standard quotes.
EDIT: To respond to criticism about UB on &a + 1, now a is an array of 1 element.
Related: Dereferencing an out of bound pointer that contains the address of an object (array of array)
Additional note: the term "mystical" was first used, I think, by Tony Delroy here. I was wrong to borrow it.
The first thing to say is that a sample of one test on one compiler generating code on one architecture is not the basis on which to draw a conclusion on the behaviour of the language.
c++ (and c) are general purpose languages created with the intention of being portable. i.e. a well formed program written in c++ on one system should run on any other (barring calls to system-specific services).
Once upon a time, for various reasons including backward-compatibility and cost, memory maps were not contiguous on all processors.
For example I used to write code on a 6809 system where half the memory was paged in via a PIA addressed in the non-paged part of the memory map. My c compiler was able to cope with this because pointers were, for that compiler, a 'mystical' type which knew how to write to the PIA.
The 8086 family has a segmented addressing mode (still available as real mode on the 80386) in which segments start every 16 bytes. Look up FAR pointers and you'll see different pointer arithmetic.
This is the history of pointer development in C++. Not all chip manufacturers have been "well behaved" and the language accommodates them all (usually) without needing to rewrite source code.
Stealing the quote from TartanLlama:
[expr.add]/5 "[for pointer addition, ] if both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."
So the compiler can assume that your pointer points into the array a, or one past its end. If it points one past the end, you cannot dereference it. But as you do dereference it, it surely can't be one past the end, so it can only be inside the array.
So now you have your code (reduced)
b = 1;
*pa1 = 2;
where pa1 points inside an array a and b is a separate variable. And when you print them, you get exactly 1 and 2, the values you have assigned them.
An optimizing compiler can figure that out, without even storing a 1 or a 2 to memory. It can just print the final result.
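To make the one-past-the-end rule from the quote concrete, here is a minimal sketch of the only things you may legitimately do with such a pointer: form it and compare against it. The function name sum_range is just an illustration, not something from the question.

```cpp
#include <cassert>

// Sums the half-open range [first, last).
// `last` may be the one-past-the-end pointer of the array: it is only
// compared against, never dereferenced.
int sum_range(const int* first, const int* last) {
    assert(first != nullptr && last != nullptr);
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;  // p is strictly inside the array whenever we get here
    }
    return total;
}
```

Calling sum_range(a, a + 3) on an int a[3] is fine; writing through a + 3, as the question's code does with pa1, is what the standard leaves undefined.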
If you turn off the optimiser the code works as expected.
By using pointer arithmetic that is undefined you are fooling the optimiser.
The optimiser has figured out that there is no code writing to b, so it can safely store it in a register. As it turns out, you have acquired the address of b in a non-standard way and modify the value in a way the optimiser doesn't see.
If you read the C standard, it says that pointers may be mystical. gcc pointers are not mystical. They are stored in ordinary memory and consist of the same type of bytes that make up all other data types. The behaviour you encountered is due to your code not respecting the limitations stated for the optimisation level you have chosen.
Edit:
The revised code is still UB. The standard doesn't allow referencing a[1] even if the pointer value happens to be identical to another pointer value. So the optimiser is allowed to store the value of b in a register.
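For what it's worth, the memcmp test from the question can be isolated as below. This is only a sketch (same_representation is a made-up helper name): it tells you whether two pointer objects hold the same bytes, and nothing more.

```cpp
#include <cassert>
#include <cstring>

// Returns true when the two pointer objects hold identical bytes.
// Identical bytes do NOT imply that both pointers may be used to access
// the same object; that depends on how each pointer was obtained.
bool same_representation(const int* a, const int* b) {
    return std::memcmp(&a, &b, sizeof a) == 0;
}
```

So a true result here is perfectly compatible with the optimiser treating a write through one pointer as unable to affect the object the other one points to.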
C was conceived as a language in which pointers and integers were very intimately related, with the exact relationship depending upon the target platform. The relationship between pointers and integers made the language very suitable for purposes of low-level or systems programming. For purposes of discussion below, I'll thus call this language "Low-Level C" [LLC].
The C Standards Committee wrote up a description of a different language, where such a relationship is not expressly forbidden, but is not acknowledged in any useful fashion, even when an implementation generates code for a target and application field where such a relationship would be useful. I'll call this language "High Level Only C" [HLOC].
In the days when the Standard was written, most things that called themselves C implementations processed a dialect of LLC. Most useful compilers process a dialect which defines useful semantics in more cases than HLOC, but not as many as LLC. Whether pointers behave more like integers or more like abstract mystical entities depends upon which exact dialect one is using. If one is doing systems programming, it is reasonable to view C as treating pointers and integers as intimately related, because LLC dialects suitable for that purpose do so, and HLOC dialects that don't do so aren't suitable for that purpose. When doing high-end number crunching, however, one would far more often be using dialects of HLOC which do not recognize such a relationship.
The real problem, and source of so much contention, lies in the fact that LLC and HLOC are increasingly divergent, and yet are both referred to by the name C.

Is pointer conversion expensive or not?

Is pointer conversion considered expensive (e.g. how many CPU cycles does it take to convert a pointer/address), especially when you have to do it quite frequently? For instance (just an example to show the scale of frequency; I know there are better ways for this particular case):
unsigned long long *x;
/* fill data to x*/
for (int i = 0; i < 1000*1000*1000; i++)
{
A[i]=foo((unsigned char*)x+i);
};
(e.g. how many CPU cycles it takes to convert a pointer/address)
In most machine code languages there is only 1 "type" of pointer and so it doesn't cost anything to convert between them. Keep in mind that C++ types really only exist at compile time.
The real issue is that this sort of code can break strict aliasing rules. You can read more about this elsewhere, but essentially the compiler will either produce incorrect code through undefined behavior, or be forced to make conservative assumptions and thus produce slower code. (Note that char* and friends are somewhat exempt from the undefined behavior part.)
Optimizers often have to make conservative assumptions about variables in the presence of pointers. For example, a constant propagation process that knows the value of variable x is 5 would not be able to keep using this information after an assignment to another variable (for example, *y = 10) because it could be that *y is an alias of x. This could be the case after an assignment like y = &x.
As an effect of the assignment to *y, the value of x would be changed as well, so propagating the information that x is 5 to the statements following *y = 10 would be potentially wrong (if *y is indeed an alias of x). However, if we have information about pointers, the constant propagation process could make a query like: can x be an alias of *y? Then, if the answer is no, x = 5 can be propagated safely.
Another optimization impacted by aliasing is code reordering. If the compiler decides that x is not aliased by *y, then code that uses or changes the value of x can be moved before the assignment *y = 10, if this would improve scheduling or enable more loop optimizations to be carried out.
To enable such optimizations in a predictable manner, the ISO standard for the C programming language (including its newer C99 edition, see section 6.5, paragraph 7) specifies that it is illegal (with some exceptions) for pointers of different types to reference the same memory location. This rule, known as "strict aliasing", sometimes allows for impressive increases in performance,[1] but has been known to break some otherwise valid code. Several software projects intentionally violate this portion of the C99 standard. For example, Python 2.x did so to implement reference counting,[2] and required changes to the basic object structs in Python 3 to enable this optimisation. The Linux kernel does this because strict aliasing causes problems with optimization of inlined code.[3] In such cases, when compiled with gcc, the option -fno-strict-aliasing is invoked to prevent unwanted optimizations that could yield unexpected code.
http://en.wikipedia.org/wiki/Aliasing_(computing)#Conflicts_with_optimization
What is the strict aliasing rule?
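The x / *y situation described in the quoted text can be sketched like this. The function name store_then_load is made up for illustration; the point is only that with two pointers of the same type the compiler has to allow for aliasing.

```cpp
#include <cassert>

// Both parameters have the same type, so the compiler must assume that
// x and y may refer to the same int and reload *x after the store via y.
int store_then_load(int* x, int* y) {
    assert(x != nullptr && y != nullptr);
    *x = 5;
    *y = 10;    // overwrites *x only when x and y alias
    return *x;  // 10 if x == y, otherwise 5
}
```

If y were declared as, say, long*, the strict aliasing rule would let the compiler assume the store cannot touch *x and simply return 5 without reloading, which is the kind of optimisation the quote is about.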
On any architecture you're likely to encounter, all pointer types have the same representation, and so conversion between different pointer types representing the same address has no run-time cost. This applies to all pointer conversions in C.
In C++, some pointer conversions have a cost and some don't:
reinterpret_cast and const_cast (or an equivalent C-style cast, such as the one in the question) and conversion to or from void* will simply reinterpret the pointer value, with no cost.
Conversion between pointer-to-base-class and pointer-to-derived class (either implicitly, or with static_cast or an equivalent C-style cast) may require adding a fixed offset to the pointer value if there are multiple base classes.
dynamic_cast will do a non-trivial amount of work to look up the pointer value based on the dynamic type of the object pointed to.
Historically, some architectures (e.g. PDP-10) had different representations for pointer-to-byte and pointer-to-word; there may be some runtime cost for conversions there.
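As a rough illustration of the base/derived case above, here is a sketch with made-up types Left, Right and Both. On the common ABIs I know of the second base sits at a nonzero offset, but that layout is an assumption, not something the standard promises.

```cpp
#include <cassert>
#include <cstddef>

struct Left  { int l = 1; };
struct Right { int r = 2; };
struct Both : Left, Right { int b = 3; };

// Byte distance from the start of a Both object to its Right subobject.
// The derived-to-base conversion is where the compiler adds the offset.
std::ptrdiff_t right_base_offset(Both& obj) {
    Right* r = &obj;  // implicit conversion, may adjust the address
    return reinterpret_cast<char*>(r) - reinterpret_cast<char*>(&obj);
}

// static_cast back to the derived type undoes the adjustment.
bool round_trips(Both& obj) {
    Right* r = &obj;
    return static_cast<Both*>(r) == &obj;
}
```

The adjustment is a single add or subtract of a compile-time constant, so it is about as cheap as a conversion with a cost can be.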
unsigned long long *x;
/* fill data to x*/
for (int i = 0; i < 1000*1000*1000; i++)
{
A[i]=foo((unsigned char*)x+i); // bad cast
}
Remember, the machine only knows memory addresses, data and code. Everything else (types and so on) is known only to the compiler, which is there to aid the programmer. The compiler does all the pointer arithmetic, because only the compiler knows the size of each type.
At runtime, no machine cycles are spent converting one pointer type to another, because the conversion does not happen at runtime. All pointers are simply addresses of the same size (4 bytes on a typical 32-bit machine), nothing more and nothing less.
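A small sketch of what "the compiler does the arithmetic" means for the loop in the question (the helper names step_in_bytes and cast_keeps_address are mine, purely for illustration):

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes between p and p + 1. The "+ 1" is scaled by
// sizeof(unsigned long long) at compile time.
std::size_t step_in_bytes(unsigned long long* p) {
    assert(p != nullptr);
    unsigned char* here = reinterpret_cast<unsigned char*>(p);
    unsigned char* next = reinterpret_cast<unsigned char*>(p + 1);
    return static_cast<std::size_t>(next - here);
}

// The cast itself changes only the static type, not the address.
bool cast_keeps_address(unsigned long long* p) {
    return static_cast<void*>(p) ==
           static_cast<void*>(reinterpret_cast<unsigned char*>(p));
}
```

So (unsigned char*)x + i advances by i bytes, whereas x + i would advance by i * sizeof(unsigned long long) bytes; the cast changes which scale factor the compiler bakes in, not the work done at runtime.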
It all depends on your underlying hardware.
On most machine architectures, all pointers are byte pointers, and converting between a byte pointer and a byte pointer is a no-op. On some architectures, a pointer conversion may under some circumstances require extra manipulation (there are machines that work with word based addresses for instance, and converting a word pointer to a byte pointer or vice versa will require extra manipulation).
Moreover, this is in general an unsafe technique, as the compiler can't perform any sanity checking on what you are doing, and you can end up overwriting data you didn't expect.

Could I ever want to access the address zero?

The constant 0 is used as the null pointer in C and C++. But as in the question "Pointer to a specific fixed address" there seems to be some possible use of assigning fixed addresses. Is there ever any conceivable need, in any system, for whatever low level task, for accessing the address 0?
If there is, how is that solved with 0 being the null pointer and all?
If not, what makes it certain that there is not such a need?
In neither C nor C++ is the null-pointer value in any way tied to physical address 0. The fact that you use constant 0 in the source code to set a pointer to null-pointer value is nothing more than just a piece of syntactic sugar. The compiler is required to translate it into the actual physical address used as null-pointer value on the specific platform.
In other words, 0 in the source code has no physical importance whatsoever. It could have been 42 or 13, for example. I.e. the language authors, if they so pleased, could have made it so that you'd have to do p = 42 in order to set the pointer p to null-pointer value. Again, this does not mean that the physical address 42 would have to be reserved for null pointers. The compiler would be required to translate source code p = 42 into machine code that would stuff the actual physical null-pointer value (0x0000 or 0xBAAD) into the pointer p. That's exactly how it is now with constant 0.
Also note, that neither C nor C++ provides a strictly defined feature that would allow you to assign a specific physical address to a pointer. So your question about "how one would assign 0 address to a pointer" formally has no answer. You simply can't assign a specific address to a pointer in C/C++. However, in the realm of implementation-defined features, the explicit integer-to-pointer conversion is intended to have that effect. So, you'd do it as follows
uintptr_t address = 0;
void *p = (void *) address;
Note, that this is not the same as doing
void *p = 0;
The latter always produces the null-pointer value, while the former in general case does not. The former will normally produce a pointer to physical address 0, which might or might not be the null-pointer value on the given platform.
On a tangential note: you might be interested to know that with Microsoft's C++ compiler, a NULL pointer to member will be represented as the bit pattern 0xFFFFFFFF on a 32-bit machine. That is:
struct foo
{
int field;
};
int foo::*pmember = 0; // 'null' member pointer
pmember will have the bit pattern 'all ones'. This is because you need this value to distinguish it from
int foo::*pmember = &foo::field;
where the bit pattern will indeed be 'all zeroes' -- since we want offset 0 into the structure foo.
Other C++ compilers may choose a different bit pattern for a null pointer to member, but the key observation is that it won't be the all-zeroes bit pattern you might have been expecting.
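If you want to look at this on your own compiler, something along these lines should do. It is a sketch only; is_null_member and all_bytes_zero are helper names I made up, and what all_bytes_zero reports is implementation-specific.

```cpp
#include <cassert>
#include <cstring>

struct foo {
    int field;
};

// Comparing against nullptr works regardless of the bit pattern used.
bool is_null_member(int foo::*pm) {
    return pm == nullptr;
}

// True when every byte of the member pointer's representation is zero.
// Which member pointers satisfy this depends on the compiler's ABI.
bool all_bytes_zero(int foo::*pm) {
    unsigned char bytes[sizeof pm];
    std::memcpy(bytes, &pm, sizeof pm);
    for (unsigned char b : bytes) {
        if (b != 0) {
            return false;
        }
    }
    return true;
}
```

The language only guarantees that the null member pointer compares unequal to &foo::field; inspecting the bytes just shows how a particular compiler chose to arrange that.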
You're starting from a mistaken premise. When you assign an integer constant with the value 0 to a pointer, that becomes a null pointer constant. This does not, however, mean that a null pointer necessarily refers to address 0. Quite the contrary, the C and C++ standards are both very clear that a null pointer may refer to some address other than zero.
What it comes down to is this: you do have to set aside an address that a null pointer would refer to -- but it can be essentially any address you choose. When you convert zero to a pointer, it has to refer to that chosen address -- but that's all that's really required. Just for example, if you decided that converting an integer to a pointer would mean adding 0x8000 to the integer, then the null pointer would actually refer to address 0x8000 instead of address 0.
It's also worth noting that dereferencing a null pointer results in undefined behavior. That means you can't do it in portable code, but it does not mean you can't do it at all. When you're writing code for small microcontrollers and such, it's fairly common to include some bits and pieces of code that aren't portable at all. Reading from one address may give you the value from some sensor, while writing to the same address could activate a stepper motor (just for example). The next device (even using exactly the same processor) might be connected up so both of those addresses referred to normal RAM instead.
Even if a null pointer does refer to address 0, that doesn't prevent you from using it to read and/or write whatever happens to be at that address -- it just prevents you from doing so portably -- but that doesn't really matter a whole lot. The only reason address zero would normally be important would be if it was decoded to connect to something other than normal storage, so you probably can't use it entirely portably anyway.
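For the microcontroller case, the usual non-portable idiom looks roughly like the sketch below. The names write_reg and read_reg are hypothetical, and the address you pass is whatever your hardware manual says; the integer-to-pointer mapping is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Writes a 32-bit value to the location at a fixed numeric address.
// volatile stops the compiler from optimising the access away.
inline void write_reg(std::uintptr_t address, std::uint32_t value) {
    *reinterpret_cast<volatile std::uint32_t*>(address) = value;
}

// Reads a 32-bit value from the location at a fixed numeric address.
inline std::uint32_t read_reg(std::uintptr_t address) {
    return *reinterpret_cast<volatile std::uint32_t*>(address);
}
```

On a hosted system you can try it safely by passing the address of an ordinary std::uint32_t variable converted with reinterpret_cast<std::uintptr_t>; passing a literal 0 is only meaningful on a target where something actually lives at that address.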
The compiler takes care of this for you (comp.lang.c FAQ):
If a machine uses a nonzero bit pattern for null pointers, it is the compiler's responsibility to generate it when the programmer requests, by writing "0" or "NULL," a null pointer. Therefore, #defining NULL as 0 on a machine for which internal null pointers are nonzero is as valid as on any other, because the compiler must (and can) still generate the machine's correct null pointers in response to unadorned 0's seen in pointer contexts.
You can get to address zero by referencing zero from a non-pointer context.
In practice, C compilers will happily let your program attempt to write to address 0. Checking every pointer operation at run time for a NULL pointer would be a tad expensive. On desktop computers, the program will usually crash because the operating system forbids the access. On embedded systems without memory protection, the program will indeed write to address 0, which will often crash the whole system.
The address 0 might be useful on an embedded system (a general term for a CPU that's not in a computer; they run everything from your stereo to your digital camera). Usually, the systems are designed so that you wouldn't need to write to address 0. In every case I know of, it's some kind of special address. Even if the programmer needs to write to it (e.g., to set up an interrupt table), they would only need to write to it during the initial boot sequence (usually a short bit of assembly language to set up the environment for C).
Memory address 0 is also called the Zero Page. This is populated by the BIOS, and contains information about the hardware running on your system. All modern kernels protect this region of memory. You should never need to access this memory, but if you want to you need to do it from within kernel land, a kernel module will do the trick.
On the x86, address 0 (or rather, 0000:0000) and its vicinity in real mode is the location of the interrupt vector. In the bad old days, you would typically write values to the interrupt vector to install interrupt handlers (or if you were more disciplined, used the MS-DOS service 0x25). C compilers for MS-DOS defined a far pointer type which when assigned NULL or 0 would receive the bit pattern 0000 in its segment part and 0000 in its offset part.
Of course, a misbehaving program that accidentally wrote to a far pointer whose value was 0000:0000 would cause very bad things to happen on the machine, typically locking it up and forcing a reboot.
In the question from the link, people are discussing setting to fixed addresses in a microcontroller. When you program a microcontroller everything is at a much lower level there.
You don't even have an OS in the desktop/server PC sense, and you don't have virtual memory and that stuff. So there it is OK and even necessary to access memory at a specific address. On a modern desktop/server PC it is useless and even dangerous.
I compiled some code using gcc for the Motorola HC11, which has no MMU and 0 is a perfectly good address, and was disappointed to find out that to write to address 0, you just write to it. There's no difference between NULL and address 0.
And I can see why. I mean, it's not really possible to define a unique NULL on an architecture where every memory location is potentially valid, so I guess the gcc authors just said 0 was good enough for NULL whether it's a valid address or not.
char *null = 0;
; Clears 8-bit AR and BR and stores it as a 16-bit pointer on the stack.
; The stack pointer, ironically, is stored at address 0.
1b: 4f clra
1c: 5f clrb
1d: de 00 ldx *0 <main>
1f: ed 05 std 5,x
When I compare it with another pointer, the compiler generates a regular comparison. Meaning that it in no way considers char *null = 0 to be a special NULL pointer, and in fact a pointer to address 0 and a "NULL" pointer will be equal.
; addr is a pointer stored at 7,x (offset of 7 from the address in XR) and
; the "NULL" pointer is at 5,y (offset of 5 from the address in YR). It doesn't
; treat the so-called NULL pointer as a special pointer, which is not standards
; compliant as far as I know.
37: de 00 ldx *0 <main>
39: ec 07 ldd 7,x
3b: 18 de 00 ldy *0 <main>
3e: cd a3 05 cpd 5,y
41: 26 10 bne 53 <.LM7>
So to address the original question, I guess my answer is to check your compiler implementation and find out whether they even bothered to implement a unique-value NULL. If not, you don't have to worry about it. ;)
(Of course this answer is not standard compliant.)
It all depends on whether the machine has virtual memory. Systems with it will typically put an unwritable page there, which is probably the behaviour that you are used to. However in systems without it (typically microcontrollers these days, but they used to be far more common) then there's often very interesting things in that area such as an interrupt table. I remember hacking around with those things back in the days of 8-bit systems; fun, and not too big a pain when you had to hard-reset the system and start over. :-)
Yes, you might want to access memory address 0x0. Why you would want to do this is platform-dependent. A processor might use this for a reset vector, such that writing to it causes the CPU to reset. It could also be used for an interrupt vector, as a memory-mapped interface to some hardware resource (program counter, system clock, etc), or it could even be valid as a plain old memory address. There is nothing necessarily magical about memory address zero, it is just one that was historically used for special purposes (reset vectors and the like). C-like languages follow this tradition by using zero as the address for a NULL pointer, but in reality the underlying hardware may or may not see address zero as special.
The need to access address zero usually arises only in low-level details like bootloaders or drivers. In these cases, the compiler can provide options/pragmas to compile a section of code without optimizations (to prevent the access through a zero pointer from being optimized away as a NULL pointer dereference), or inline assembly can be used to access the true address zero.
C and C++ do not stop you from writing to any address. It is the OS that can raise a signal when a program accesses some forbidden address. C and C++ guarantee that any memory successfully obtained from the heap will have an address different from the null pointer.
I have at times used loads from address zero (on a known platform where that would be guaranteed to segfault) to deliberately crash at an informatively named symbol in library code if the user violates some necessary condition and there isn't any good way to throw an exception available to me. "Segfault at someFunction$xWasnt16ByteAligned" is a pretty effective error message to alert someone to what they did wrong and how to fix it. That said, I wouldn't recommend making a habit of that sort of thing.
Writing to address zero can be done, but it depends upon several factors such as your OS, target architecture and MMU configuration. In fact, it can be a useful debugging tool (but not always).
For example, a few years ago while working on an embedded system (with few debugging tools available), we had a problem which was resulting in a warm reboot. To help locate the problem, we were debugging using sprintf(NULL, ...); and a 9600 baud serial cable. As I said--few debugging tools available. With our setup, we knew that a warm reboot would not corrupt the first 256 bytes of memory. Thus after the warm reboot we could pause the loader and dump the memory contents to find out what happened prior to reboot.
Remember that in all normal cases, you don't actually see specific addresses.
When you allocate memory, the OS supplies you with the address of that chunk of memory.
When you take the address of a variable, the variable has already been allocated at an address determined by the system.
So accessing address zero is not really a problem, because when you follow a pointer, you don't care what address it points to, only that it is valid:
int* i = new int(); // suppose this returns a pointer to address zero
*i = 42; // now we're accessing address zero, writing the value 42 to it
So if you need to access address zero, it'll generally work just fine.
The 0 == null thing only really becomes an issue if for some reason you're accessing physical memory directly. Perhaps you're writing an OS kernel or something like that yourself. In that case, you're going to be writing to specific memory addresses (especially those mapped to hardware registers), and so you might conceivably need to write to address zero. But then you're really bypassing C++ and relying on the specifics of your compiler and hardware platform.
Of course, if you need to write to address zero, that is possible. Only the constant 0 represents a null pointer. The non-constant integer value zero will not, if assigned to a pointer, yield a null pointer.
So you could simply do something like this:
int i = 0;
int* zeroaddr = (int*)i;
Now zeroaddr will point to address zero(*), but it will not, strictly speaking, be a null pointer, because the zero value was not constant.
(*): that's not entirely true. The C++ standard only guarantees an "implementation-defined mapping" between integers and addresses. It could convert the 0 to address 0x1633de20 or any other address it likes. But the mapping is usually the intuitive and obvious one, where the integer 0 is mapped to the address zero.
It may surprise many people, but in the core C language there is no such thing as a special null pointer. You are totally free to read and write to address 0 if it's physically possible.
The code below does not even compile, as NULL is not defined:
int main(int argc, char *argv[])
{
void *p = NULL;
return 0;
}
OTOH, the code below compiles, and you can read and write address 0, if the hardware/OS allows:
int main(int argc, char *argv[])
{
int *p = 0;
*p = 42;
int x = *p; /* let's assume C99 */
}
Please note, I did not include anything in the above examples.
If we start including stuff from the standard C library, NULL becomes magically defined. As far as I remember it comes from string.h (it is also defined by stddef.h and several other standard headers, such as stdio.h).
NULL is still not a core C feature, it's a CONVENTION of many C library functions to indicate the invalidity of pointers. The C library on the given platform will define NULL to a memory location which is not accessible anyway. Let's try it on a Linux PC:
#include <stdio.h>
int main(int argc, char *argv[])
{
int *p = NULL;
printf("NULL is address %p\n", p);
printf("Contents of address NULL is %d\n", *p);
return 0;
}
The result is:
NULL is address 0x0
Segmentation fault (core dumped)
So our C library defines NULL to address zero, which it turns out is inaccessible.
But it was not the C compiler, or even the C library function printf(), that handled the zero address specially. They all happily tried to work with it normally. It was the OS that detected a segmentation fault when the program tried to read from address zero.
If I remember correctly, in an AVR microcontroller the register file is mapped into the RAM address space and register R0 is at the address 0x00. It was clearly done on purpose and apparently Atmel thinks there are situations when it's convenient to access address 0x00 instead of writing R0 explicitly.
In the program memory, at the address 0x0000 there is a reset interrupt vector and again this address is clearly intended to be accessed when programming the chip.