As I know the memory location 0x00000000 isn't accessible through a pointer in C, but recently in my project I was able to access the zeroth location using a pointer.
The the zeroth location is flash memory.
But according to the post Why is address zero used for the null pointer? , referencing zero through pointer is a reserved value. So, my question is as it is a reserved value why this doesn't hold true for memory mapped to flash at location zero?
Processor : TMS470Rxx
OS : Micro C OS-II
Thanks
There are many platforms where address zero behaves differently than on the typical desktop PC.
There are microcontrollers where hardware devices are memory-mapped at device zero.
There are operating systems that will happily let you specifically map something at address zero if you wish to. If you have writable flash mapped there, then you can write there.
Different platforms work in different ways.
The decision as to what happens when using what address is rather complex. It depends on the processor architecture, OS and sometimes "what some software does".
The processor may not have memory protection, or address zero may indeed be "used for something" (in x86 real-mode, that DOS runs in, address zero contains the vector table, where interrupts and such jump to, so it's incredibly sensitive to being overwritten - hence badly written programs could and would crash not just themselves but the entire machine in DOS).
Typically, modern OS's on processors that have "virtual memory mapping" (so the physical address that the processor actually uses is not (or may not be) the same as the virtual address that the program sees) will map address zero as "not accessible" for your typical applications.
The OS may allow access to address zero at times:
I had a bug in a Windows driver many years ago, that used a NULL-pointer, and that that particular situation, address 0 was not causing a crash. The crash happened much later when the content of address zero was being used for something - at which point the system blue-screened (at least until I attached the debugger and debugged the problem, and then fixed it so it didn't try to use a NULL pointer - I don't remember if I had to allocate some memory or just skipped that bit if the pointer was NULL).
The choice of 0 for NULL is based on a combination of "we have to choose some value (and there is not one value that can be 100% sure that nobody will ever need to use)" - particularly on machines with a 16 or 32-bit address range. Normally, however, address zero, if it is used, contains something "special" (vector table for interrupts, bootstrap code for the processor, or similar), so you are unlikely to meaningfully store data in there. C and C++ as languages do not require that NULL-pointers can't be accessed [but also does not "allow you" to freely access this location], just that this value can be used as "pointer that doesn't point at anything" - and the spec also provides for the case where your NULL value is not ACTUALLY zero. But the compiler has to deal with "ZERO means NULL for pointers, so replace it with X".
The value zero does, at least sometimes, have a benefit over other values - mostly that many processors have special instructions to compare with or identify zero.
Related
In C/C++, it is not allowed to access data at address 0.
However, the physical memory are numbered from 0. And, in DOS era, the interrupt vector table was located at physical address 0. The first interrupt vector was the handler of the division-by-zero exception.
My question is:
Under what cases is it allowed to access physical address 0?
To access physical address zero, it depends on which platform you are talking.
The language has no idea on the underlying addressing model, it depends on the OS.
On bare metal environment, you have total control on the page table if paging is enabled, or just de-reference zero if paging is not enabled.
On some Unix and Linux variation, you do mmap and perhaps also open /dev/mem to get a non-null pointer with logical address non-zero but physical address zero, it may require some access rights.
I'm not sure on Windows.
PS. Other answers seems make a confusion on language level pointer and physical address.
In C/C++, it is not allowed to access address 0.
Yes you can, as long as there's addressable memory there. On most platforms, there won't be.
Under what cases is it allowed to access physical address 0?
You can access any physical address if it's mapped into virtual memory. If there's anything sensitive there, then the OS probably won't allow that in user code. Within the kernel, it's just a case of setting up the page tables to include that address.
Generally, the address space in virtual memory is managed by the operating system.
A freestanding C (or C++) implementation could certainly allow you to dereference (void*)0 in an implementation specific way. But beware of undefined behavior.
The C and C++ standards are very careful about the NULL pointer (and C++11 added the nullptr keyword for several good reasons).
A C compiler (at least a hosted implementation) is allowed to suppose that after a successful dereference a pointer is not null. Some optimizations in GCC are doing that.
Most hosted C or C++ implementations have a null pointer which is an all-zero-bits word, but that is not required by the standard (however, it is very common; it helps the compiler and the libc).
However, pragmatically, a lot of software is supposing that NULL is represented by all-zero-bits (in theory it is a mistake).
I know no C implementation where NULL is not all-zero-bits, but that is not required. However, coding such a compiler would be a headache.
On some operating systems, an application can change its address space, e.g. with mmap(2) on POSIX or Linux.
If you really wanted, you could access address 0 in C, but you really should never want to do that.
There's no specification in either C or C++ that would allow you to assign a specific physical address to a pointer. So your question about "how one would access address 0" could be translated to "how one would assign 0 address to a pointer" formally has no answer. You simply can't assign a specific address to a pointer in C/C++.
But you can get that effect through integer-to-pointer conversion:-
uintptr_t null_address = 0;
void *ptr = (void *) null_address;
The C Standard does not require that platforms provide access to any particular physical memory location, zero or otherwise. While it would be common on many embedded platforms for (char*)0x1234 to access physical location 0x1234, nothing in the C Standard requires that.
The C Standard also does not require that platforms do anything in particular if an attempt is made to access a null pointer. Consequently, if a compiler writer wanted to treat a null pointer dereference as an access to physical location zero, such behavior would be conforming.
Compilers where an access to address zero would be meaningful and useful will typically interpret a null pointer access as an access to location zero. The only problem would be with platforms where accesses to location zero are useful but compiler writers don't know that. That situation can generally only be handled by checking the compiler's documentation for ways to force the compiler to treat a null pointer like any other address.
Your question is a bit confusing. You ask whether it is possible to "access physical address zero." In a modern paged operating system, applications cannot specify physical addresses at all. Physical addresses can only be accessed through kernel mode.
In virtual memory systems, it became common not to map the first page of virtual memory into the process address space. Thus, accessing virtual address zero would trigger an access violation.
In such systems, it is usually possible for the application to map the first page to the process access space through system services. Even if the page is not there by default, it can be added by the application.
The answer is that it is possible to access physical memory address zero in kernel mode and it is possible to access virtual address zero IF the application maps that page to memory (and the OS allows that).
I was wondering... what if when you do a new, the address where the reservation starts is 0x0? I guess it is not possible, but why?
is the new operator prepared for that? is that part of the first byte not usable? it is always reserved when the OS starts?
Thanks!
The null pointer is not necessarily address 0x0, so potentially an architecture could choose another address to represent the null pointer and you could get 0x0 from new as a valid address. (I don't think anyone does that, btw, it would break the logic behind tons of memset calls and its just harder to implement anyway).
Whether the null pointer is reserved by the Operative System or the C++ implementation is unspecified, but plain new will never return a null pointer, whatever its address is (nothrow new is a different beast). So, to answer your question:
Is memory address 0x0 usable?
Maybe, it depends on the particular implementation/architecture.
"Early" memory addresses are typically reserved for the operating system. The OS does not use early physical memory addresses to match to virtual memory addresses for use by user programs. Depending on the OS, many things can be there - the Interrupt Vector Table, Page table, etc.
Here is a non-specific graph of layout of physical and virtual memory in Linux; could vary sligthly from distro to distro and release to release:
http://etutorials.org/shared/images/tutorials/tutorial_101/bels_0206.gif
^Don't be confused by the graphic - the Bootloader IS NOT in physical memory... don't know why they included that... but otherwise it's accurate.
I think you're asking why virtual memory doesn't map all the way down to 0x0. One of the biggest reasons is so that it's painfully obvious when you failed to assign a pointer - if it's 0x0, it's pointing to "nothing" and always wrong.
Of course, it's possible for NULL to be any value (as it's implementation-dependent), but as an uninitialized int's value is 0, on every implementation I've seen they've chosen to keep NULL 0 for consistency's sake.
There are a whole number of other reasons, but this is a good one. Here is a Wikipedia article talking a little bit more about virtual addressing.
Many memory addresses are reserved by the system to help with debugging.
0x00000000 Returned by keyword "new" if memory allocation failed
0xCDCDCDCD Allocated in heap, but not initialized
0xDDDDDDDD Released heap memory.
0xFDFDFDFD "NoMansLand" fences automatically placed at boundary of heap memory. Should never be overwritten. If you do overwrite one, you're probably walking off the end of an array.
0xCCCCCCCC Allocated on stack, but not initialized
But like a few others have pointed out, there is a distinction between physical memory addresses which is what the OS uses, and logical memory addresses which are assigned to your application by the OS. Example image shown here.
In C++ when i do new (or even malloc) is there any guarantee that the return address will be greater than a certain value? Because... in this project i find it -very- useful to use 0-1k as a enum. But i wouldn't want to do that if its possible to get a value that low. My only target systems are 32 or 64bit CPUs with the OS window/linux and mac.
Does the standard say anything about pointers? Does windows or linux say anything about their C runtime and what the lowest memory address (for ram) is?
-edit- i end up modifying my new overload to check if the address is above >1k. I call std::terminate if it doesn't.
In terms of standard, there is nothing. But in reality, it depends on the target OS, windows for instance reserves the first 64kb of memory as a no-mans land (depending on the build it is read-only memory, else it is marked as PAGE_NOACCESS), while it uses the upper 0x80000000+ for kernel memory, but it can be changed, see this & this on MSDN.
On x64 you can also use the higher bits of the address (only 47bits are used for addresses currently), but its not such a good idea, as later on it will change and your program will break (AMD who set the standard also advise against it).
There's no such guarantee. You can try using placement new if you need very specific memory locations but it has certain problems that you'll have to work hard to avoid. Why don't you try using a map with an integer key that has the pointer as its value instead? That way you wouldn't have to rely on specific memory addresses and ranges.
In theory, no -- a pointer's not even guaranteed to be > 0. However, in practice, viewed as an unsigned integer (don't forget that a pointer may have a high-order "1" bit), no system that I know of would have a pointer value less than about 1000. But relying on that is relying on "undefined behavior".
There is no standard for where valid memory addresses come from; to write safe system-independent code, you cannot rely on certain addresses (and even with anecdotal support, you never know when that will change with a new system update).
It's very platform-specific, so I would discourage relying on this kind of information unless you have a very good reason and are aware of consequences for portability, maintainability etc.
NULL is guaranteed to be 0x0 always. If I recall correctly, x86 reserves the first 128 MB of address space as "NULL-equivalent", so that valid pointers can't take on values in this range. On x64 there are some additional addresses which you shouldn't encounter in practice, at least for now.
As for address space reserved for the operating system, it will clearly depend on the OS. On Linux, the kernel-user space division is configurable in the kernel, so at least the 3 splits: 1-3 GB, 2-2 GB and 3-1 GB are common on 32-bit systems. You can find more details on kerneltrap.
Or are those things are reserved for the operation system and things like that?
Thanks.
While it's unlikely that 0x00000001, etc. will be valid pointers (especially if you use odd numbers on many processors) using a pointer to store an integer value will be highly system dependent.
Are you really that strapped for space?
Edit:
You could make it portable like this:
char *base = malloc(NUM_MAGIC_VALUES);
#define MAGIC_VALUE_1 (base + 0)
#define MAGIC_VALUE_2 (base + 1)
...
Well the OS is going to give each program it's own virtual memory space, so when the application references memory spaces 0x0000001 or 0x0000002, it's actually referencing some other physical memory address. I would take a look at paging and virtual memory. So a program will never have access to memory the operating system is using. However I would stay away from manually assigning a memory address for a pointer rather than using malloc() because those memory addresses might be text or reserved space.
This depends on operating system layout. For User space applications running in general purpose operating systems, these are inaccessible addresses.
This problem is related to a architecture's virtual address space. Have a loot at this http://web.cs.wpi.edu/~cs3013/c07/lectures/Section09.1-Intel.pdf
Of course, you can do this:
int* myPointer1 = 0x000001;
int* myPointer2 = 0x000032;
But do not try to dereference addresses, cause it will end in an Access Violation.
The OS gives you the memory, by the way these addresses are just virtual
the OS hides the details and shows it like a big, continous stripe.
Maybe the 0x000000-0x211501 part is on a webserver and you read/write it through net,
and remaining is on your hard disk. Physical memory is just an illusion from your current viewpoint.
You tagged your question C++. I believe that in C++ the address at 0 is reserved and is normally referred to as NULL. Other than that you cannot assume anything. If you want to ask about a particular implementation on a particular OS then that would be a different question.
It depends on the compiler/platform, but many older compilers actually have something like the string "(null)" at address 0x00000000. This is a debug feature because that string will show up if a NULL pointer is ever used by accident. On newer systems like Windows, a pointer to this area will most likely cause a processor exception.
I can pretty much guarantee that address 1 and 2 will either be in use or will raise a processor exception if they're ever used. You can store any value you like in a pointer. But if you try and dereference a pointer with a random value, you're definitely asking for problems.
How about a nice integer instead?
Although the standard requires that NULL is 0, a pointer that is NULL does not have to consist of all zero bits, although it will do in many implementations. That is also something you have to beware of if you memset a POD struct that contains some pointers, and then rely on the pointers holding "NULL" as their value.
If you want to use the same space as a pointer you could use a union, but I guess what you really want is something that doubles up as a pointer and something else, and you know it is not a pointer to a real address if it contains low-numbered values. (With a union you still need to know which type you have).
I'd be interested to know what the magic other value is really being used for. Is this some lazy-evaluation issue where the pointer gives an indication of how to load the data when it is not yet loaded and a genuine pointer when it is?
Yes, on some platforms address 0x00000001 and 0x00000002 are valid addresses. On other platforms they are not.
In the embedded systems world, the validity depends on what resides at those locations. Some platforms may put interrupt or reset vectors at those addresses. Other embedded platforms may place Position Independent executable code there.
There is no standard specification for the layout of addresses. One cannot assume anything. If you want your code to be portable then forget about accessing specific addresses and leave that to the OS.
Also, the structure of a pointer is platform dependent. So is the conversion of the value in a pointer to a physical address. Some systems may only decode a portion of the pointer, others use the entire pointer value. Some may use indirection (a.k.a. virtual addressing) to access real objects. Still no standardization here either.
I understand the purpose of the NULL constant in C/C++, and I understand that it needs to be represented some way internally.
My question is: Is there some fundamental reason why the 0-address would be an invalid memory-location for an object in C/C++? Or are we in theory "wasting" one byte of memory due to this reservation?
The null pointer does not actually have to be 0. It's guaranteed in the C spec that when a constant 0 value is given in the context of a pointer it is treated as null by the compiler, however if you do
char *foo = (void *)1;
--foo;
// do something with foo
You will access the 0-address, not necessarily the null pointer. In most cases this happens to actually be the case, but it's not necessary, so we don't really have to waste that byte. Although, in the larger picture, if it isn't 0, it has to be something, so a byte is being wasted somewhere
Edit: Edited out the use of NULL due to the confusion in the comments. Also, the main message here is "null pointer != 0, and here's some C/pseudo code that shows the point I'm trying to make." Please don't actually try to compile this or worry about whether the types are proper; the meaning is clear.
This has nothing to do with wasting memory and more with memory organization.
When you work with the memory space, you have to assume that anything not directly "Belonging to you" is shared by the entire system or illegal for you to access. An address "belongs to you" if you have taken the address of something on the stack that is still on the stack, or if you have received it from a dynamic memory allocator and have not yet recycled it. Some OS calls will also provide you with legal areas.
In the good old days of real mode (e.g., DOS), all the beginning of the machine's address space was not meant to be written by user programs at all. Some of it even mapped to things like I/O.
For instance, writing to the address space at 0xB800 (fairly low) would actually let you capture the screen! Nothing was ever placed at address 0, and many memory controller would not let you access it, so it was a great choice for NULL. In fact, the memory controller on some PCs would have gone bonkers if you tried writing there.
Today the operating system protects you with a virtual address space. Nevertheless, no process is allowed to access addresses not allocated to it. Most of the addresses are not even mapped to an actual memory page, so accessing them will trigger a general protection fault or the equivalent in your operating system. This is why 0 is not wasted - even though all the processes on your machine "have an address 0", if they try to access it, it is not mapped anywhere.
There is no requirement that a null pointer be equal to the 0-address, it's just that most compilers implement it this way. It is perfectly possible to implement a null pointer by storing some other value and in fact some systems do this. The C99 specification ยง6.3.2.3 (Pointers) specifies only that an integer constant expression with the value 0 is a null pointer constant, but it does not say that a null pointer when converted to an integer has value 0.
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.
Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
On some embedded systems the zero memory address is used for something addressable.
The zero address and the NULL pointer are not (necessarily) the same thing. Only a literal zero is a null pointer. In other words:
char* p = 0; // p is a null pointer
char* q = 1;
q--; // q is NOT necessarily a null pointer
Systems are free to represent the null pointer internally in any way they choose, and this representation may or may not "waste" a byte of memory by making the actual 0 address illegal. However, a compiler is required to convert a literal zero pointer into whatever the system's internal representation of NULL is. A pointer that comes to point to the zero address by some way other than being assigned a literal zero is not necessarily null.
Now, most systems do use 0 for NULL, but they don't have to.
It is not necessarily an illegal memory location. I have stored data by dereferencing a pointer to zero... it happens the datum was an interrupt vector being stored at the vector located at address zero.
By convention it is not normally used by application code since historically many systems had important system information starting at zero. It could be the boot rom or a vector table or even unused address space.
On many processors address zero is the reset vector, wherein lies the bootrom (BIOS on a PC), so you are unlikely to be storing anything at that physical address. On a processor with an MMU and a supporting OS, the physical and logical address addresses need not be the same, and the address zero may not be a valid logical address in the executing process context.
NULL is typically the zero address, but it is the zero address in your applications virtual address space. The virtual addresses that you use in most modern operating systems have exactly nothing to do with actual physical addresses, the OS maps from the virtual address space to the physical addresses for you. So, no, having the virtual address 0 representing NULL does not waste any memory.
Read up on virtual memory for a more involved discussion if you're curious.
I don't see the answers directly addressing what i think you were asking, so here goes:
Yes, at least 1 address value is "wasted" (made unavailable for use) because of the constant used for null. Whether it maps to 0 in linear map of process memory is not relevant.
And the reason that address won't be used for data storage is that you need that special status of the null pointer, to be able to distinguish from any other real pointer. Just like in the case of ASCIIZ strings (C-string, NUL-terminated), where the NUL character is designated as end of character string and cannot be used inside strings. Can you still use it inside? Yeah but that will mislead library functions as of where string ends.
I can think of at least one implementation of LISP i was learning, in which NIL (Lisp's null) was not 0, nor was it an invalid address but a real object. The reason was very clever - the standard required that CAR(NIL)=NIL and CDR(NIL)=NIL (Note: CAR(l) returns pointer to the head/first element of a list, where CDR(l) returns ptr to the tail/rest of the list.). So instead of adding if-checks in CAR and CDR whether the pointer is NIL - which will slow every call - they just allocated a CONS (think list) and assigned its head and tail to point to itself. There! - this way CAR and CDR will work and that address in memory won't be reused (because it is taken by the object devised as NIL)
ps. i just remembered that many-many years ago i read about some bug of Lattice-C that was related to NULL - must have been in the dark MS-DOS segmentation times, where you worked with separate code segment and data segment - so i remember there was an issue that it was possible for the first function from a linked library to have address 0, thus pointer to it will be considered invalid since ==NULL
But since modern operating systems can map the physical memory to logical memory addresses (or better: modern CPUs starting with the 386), not even a single byte is wasted.
As people already have pointed out, the bit representation of the NULL pointer has not to be the same as the bit represention of a 0 value. It is though in nearly all cases (the old dinosaur computers that had special addresses can be neglected) because a NULL pointer can also be used as a boolean and by using an integer (of suffisent size) to hold the pointer value it is easier to represent in the common ISAs of modern CPU. The code to handle it is then much more straight forward, thus less error prone.
You are correct in noting that the address space at 0 is not usable storate for your program. For a number of reasons a variety of systems do not consider this a valid address space for your program anyway.
Allowing any valid address to be used would require a null value flag for all pointers. This would exceed the overhead of the lost memory at address 0. It would also require additional code to check and see if the address were null or not, wasting memory and processor cycles.
Ideally, the address that NULL pointer is using (usually 0) should return an error on access. VAX/VMS never mapped a page to address 0 so following the NULL pointer would result in a failure.
The memory at that address is reserved for use by the operating system. 0 - 64k is reserved. 0 is used as a special value to indicate to developers "not a valid address".