C++: What are pointers' hex values relative to?

C++: What are pointers' hex values relative to? - c++

In C ++ (a 32 bit program) you can get the Hex value of a pointer like so:
Class myClass;
DWORD hexValue = (DWORD)&myClass;
A hex value might look like 0xFF225290.
What is this value relative to? The start of all of the memory, the start of the process (in the memory), the start of the class, or something else?
EDIT: I don't believe I was clear enough. By "relative" I meant, if I have a pointer to 0x0000, will this be at the beginning of the memory of the process, or the memory of the system?
I understand that the memory itself is scattered all about, my question is simply of pointer number values. When you get a pointer to something, is the number assigned to that pointer within the range of the entire system, or just your local process's local memory heap? As one answer mentioned, in "Modern OS's" every process has its own virtual memory. This would suggest that 0x0000 is at the beginning of the process's virtual heap, and not the beginning of the physical memory. This was my question.

It's not relative to anything. It's a value which represents an address in memory.
Most modern computers have a single contiguous address space; in which case you could say it's "relative" to the start of that space. But that's not necessarily the case - there have been (and still are, in embedded systems, and probably will be in the future) platforms with more complicated address spaces.
To answer the question you meant but didn't ask: on most modern computer platforms, each process has its own virtual address space, so pointers from one process will be meaningless in another.

You cannot say a pointer is relative to anything on modern systems: Pointers are nothing more or less than addresses in the virtual memory space of the process. This virtual memory address space is something entirely abstract conjured up by the OS with some help of the hardware.
As it is, ranges of addresses that are contiguous in the virtual memory address space do not even need to be contiguous in physical memory, the pieces of the virtual memory (called pages) may freely be scattered across the physical memory, the OS can distribute them however it pleases. This breaks any notion of a pointer being relative to anything.

Related

Des-initializing a region of memory

I have learn in the few past days the issue with memory overcommitment (when memory overcommit is activated, which is usually a default), which basically means that:
void* p = malloc(100);
the operative system gives you 100 contiguous (virtual) addresses taken from the (virtual) address space of your process, whose total range is OS-defined. Since that memory region has not been initialized yet, it doesn't count as ocuppied storage from a system-wide point of view, so it's a pure abstraction besides consuming your virtual addresses.
memset(p, 0, 5);
That uses the first 5 bytes, so from the point of view of the OS, your process ocuppies now 5 extra bytes, and so the system has 5 bytes less of free storage. You have still 95 bytes of uninitialized storage.
The system only crash or start killing processes when the combined ocuppied storage (initialized) of every process is beyond what the OS can hold.
If my understanding is right at this regard, is there a way to "des-"initialize a region of memory when you are done with it, in order to increase the system-wide free space, without loosing the address region requested by malloc or aligned_malloc (so you don't increase fragmentation over time)?
The purpose of this question is more theoretical than practical and not about actually "freeing memory", but about freeing memory while conserving already assigned virtual addresses.
Source about the difference between requesting virtual addresses and ocuppying storage: https://www.win.tue.nl/~aeb/linux/lk/lk-9.html#ss9.6
PD: With knowing it for Linux to fill my curiosity I'm ok.

No, there is no way.
On most systems, as soon as you allocate memory, it counts towards RAM or swap.
As your link shows, on Linux, you may need to access the memory once so that the memory actually gets allocated. But as soon as you do, the system must keep that memory available somewhere, in case you access it later.
The way to tell the system you are done with the memory is to actually free it.

Why location 0x00000000 is accessible if it is a flash memory

As I know the memory location 0x00000000 isn't accessible through a pointer in C, but recently in my project I was able to access the zeroth location using a pointer.
The the zeroth location is flash memory.
But according to the post Why is address zero used for the null pointer? , referencing zero through pointer is a reserved value. So, my question is as it is a reserved value why this doesn't hold true for memory mapped to flash at location zero?
Processor : TMS470Rxx
OS : Micro C OS-II
Thanks

There are many platforms where address zero behaves differently than on the typical desktop PC.
There are microcontrollers where hardware devices are memory-mapped at device zero.
There are operating systems that will happily let you specifically map something at address zero if you wish to. If you have writable flash mapped there, then you can write there.
Different platforms work in different ways.

The decision as to what happens when using what address is rather complex. It depends on the processor architecture, OS and sometimes "what some software does".
The processor may not have memory protection, or address zero may indeed be "used for something" (in x86 real-mode, that DOS runs in, address zero contains the vector table, where interrupts and such jump to, so it's incredibly sensitive to being overwritten - hence badly written programs could and would crash not just themselves but the entire machine in DOS).
Typically, modern OS's on processors that have "virtual memory mapping" (so the physical address that the processor actually uses is not (or may not be) the same as the virtual address that the program sees) will map address zero as "not accessible" for your typical applications.
The OS may allow access to address zero at times:
I had a bug in a Windows driver many years ago, that used a NULL-pointer, and that that particular situation, address 0 was not causing a crash. The crash happened much later when the content of address zero was being used for something - at which point the system blue-screened (at least until I attached the debugger and debugged the problem, and then fixed it so it didn't try to use a NULL pointer - I don't remember if I had to allocate some memory or just skipped that bit if the pointer was NULL).
The choice of 0 for NULL is based on a combination of "we have to choose some value (and there is not one value that can be 100% sure that nobody will ever need to use)" - particularly on machines with a 16 or 32-bit address range. Normally, however, address zero, if it is used, contains something "special" (vector table for interrupts, bootstrap code for the processor, or similar), so you are unlikely to meaningfully store data in there. C and C++ as languages do not require that NULL-pointers can't be accessed [but also does not "allow you" to freely access this location], just that this value can be used as "pointer that doesn't point at anything" - and the spec also provides for the case where your NULL value is not ACTUALLY zero. But the compiler has to deal with "ZERO means NULL for pointers, so replace it with X".
The value zero does, at least sometimes, have a benefit over other values - mostly that many processors have special instructions to compare with or identify zero.

How does a computer 'know' what memory is allocated?

When memory is allocated in a computer, how does it know which bytes are already occupied and can't be overwritten?
So if these are some bytes of memory that aren't being used:
[0|0|0|0]
How does the computer know whether they are or not? They could just be an integer that equals zero. Or it could be empty memory. How does it know?

That depends on the way the allocation is performed, but it generally involves manipulation of data belonging to the allocation mechanism.
When you allocate some variable in a function, the allocation is performed by decrementing the stack pointer. Via the stack pointer, your program knows that anything below the stack pointer is not allocated to the stack, while anything above the stack pointer is allocated.
When you allocate something via malloc() etc. on the heap, things are similar, but more complicated: all theses allocators have some internal data structures which they never expose to the calling application, but which allow them to select which memory addresses to return on an allocation request. Some malloc() implementation, for instance, use a number of memory pools for small objects of fixed size, and maintain linked lists of free objects for each fixed size which they track. That way, they can quickly pop one memory region of that list, only doing more expensive computations when they run out of regions to satisfy a certain request size.
In any case, each of the allocators have to request memory from the system kernel from time to time. This mechanism always works on complete memory pages (usually 4 kiB), and works via the syscalls brk() and mmap(). Again, the kernel keeps track of which pages are visible in which processes, and at which addresses they are mapped, so there is additional memory allocated inside the kernel for this.
These mappings are made available to the processor via the page tables, which uses them to resolve the virtual memory addresses to the physical addresses. So here, finally, you have some hardware involved in the process, but that is really far, far down in the guts of the mechanics, much below anything that a userspace process is ever able to see. Still, even the page tables are managed by the software of the kernel, not by the hardware, the hardware only interpretes what the software writes into the page tables.

First of all, I have the impression that you believe that there is some unoccupied memory that doesn't holds any value. That's wrong. You can imagine the memory as a very large array when each box contains a value whereas someone put something in it or not. If a memory was never written, then it contains a random value.
Now to answer your question, it's not the computer (meaning the hardware) but the operating system. It holds somewhere in its memory some tables recording which part of the memory are used. Also any byte of memory can be overwriten.

In general, you cannot tell by looking at content of memory at some location whether that portion of memory is used or not. Memory value '0' does not mean the memory is not used.
To tell what portions of memory are used you need some structure to tell you this. For example, you can divide memory into chunks and keep track of which chunks are used and which are not.

There are memory blocks, they have an occupied or not occupied. On the heap, there are very complex data structures which organise it. But the answer to your question is too broad.

Is memory address 0x0 usable?

I was wondering... what if when you do a new, the address where the reservation starts is 0x0? I guess it is not possible, but why?
is the new operator prepared for that? is that part of the first byte not usable? it is always reserved when the OS starts?
Thanks!

The null pointer is not necessarily address 0x0, so potentially an architecture could choose another address to represent the null pointer and you could get 0x0 from new as a valid address. (I don't think anyone does that, btw, it would break the logic behind tons of memset calls and its just harder to implement anyway).
Whether the null pointer is reserved by the Operative System or the C++ implementation is unspecified, but plain new will never return a null pointer, whatever its address is (nothrow new is a different beast). So, to answer your question:
Is memory address 0x0 usable?
Maybe, it depends on the particular implementation/architecture.

"Early" memory addresses are typically reserved for the operating system. The OS does not use early physical memory addresses to match to virtual memory addresses for use by user programs. Depending on the OS, many things can be there - the Interrupt Vector Table, Page table, etc.
Here is a non-specific graph of layout of physical and virtual memory in Linux; could vary sligthly from distro to distro and release to release:
http://etutorials.org/shared/images/tutorials/tutorial_101/bels_0206.gif
^Don't be confused by the graphic - the Bootloader IS NOT in physical memory... don't know why they included that... but otherwise it's accurate.

I think you're asking why virtual memory doesn't map all the way down to 0x0. One of the biggest reasons is so that it's painfully obvious when you failed to assign a pointer - if it's 0x0, it's pointing to "nothing" and always wrong.
Of course, it's possible for NULL to be any value (as it's implementation-dependent), but as an uninitialized int's value is 0, on every implementation I've seen they've chosen to keep NULL 0 for consistency's sake.
There are a whole number of other reasons, but this is a good one. Here is a Wikipedia article talking a little bit more about virtual addressing.

Many memory addresses are reserved by the system to help with debugging.
0x00000000 Returned by keyword "new" if memory allocation failed
0xCDCDCDCD Allocated in heap, but not initialized
0xDDDDDDDD Released heap memory.
0xFDFDFDFD "NoMansLand" fences automatically placed at boundary of heap memory. Should never be overwritten. If you do overwrite one, you're probably walking off the end of an array.
0xCCCCCCCC Allocated on stack, but not initialized
But like a few others have pointed out, there is a distinction between physical memory addresses which is what the OS uses, and logical memory addresses which are assigned to your application by the OS. Example image shown here.

Is 0x000001, 0x000002, etc. ever a valid memory address in application level programming?

Or are those things are reserved for the operation system and things like that?
Thanks.

While it's unlikely that 0x00000001, etc. will be valid pointers (especially if you use odd numbers on many processors) using a pointer to store an integer value will be highly system dependent.
Are you really that strapped for space?
Edit:
You could make it portable like this:
char *base = malloc(NUM_MAGIC_VALUES);
#define MAGIC_VALUE_1 (base + 0)
#define MAGIC_VALUE_2 (base + 1)
...

Well the OS is going to give each program it's own virtual memory space, so when the application references memory spaces 0x0000001 or 0x0000002, it's actually referencing some other physical memory address. I would take a look at paging and virtual memory. So a program will never have access to memory the operating system is using. However I would stay away from manually assigning a memory address for a pointer rather than using malloc() because those memory addresses might be text or reserved space.

This depends on operating system layout. For User space applications running in general purpose operating systems, these are inaccessible addresses.

This problem is related to a architecture's virtual address space. Have a loot at this http://web.cs.wpi.edu/~cs3013/c07/lectures/Section09.1-Intel.pdf

Of course, you can do this:
int* myPointer1 = 0x000001;
int* myPointer2 = 0x000032;
But do not try to dereference addresses, cause it will end in an Access Violation.
The OS gives you the memory, by the way these addresses are just virtual
the OS hides the details and shows it like a big, continous stripe.
Maybe the 0x000000-0x211501 part is on a webserver and you read/write it through net,
and remaining is on your hard disk. Physical memory is just an illusion from your current viewpoint.

You tagged your question C++. I believe that in C++ the address at 0 is reserved and is normally referred to as NULL. Other than that you cannot assume anything. If you want to ask about a particular implementation on a particular OS then that would be a different question.

It depends on the compiler/platform, but many older compilers actually have something like the string "(null)" at address 0x00000000. This is a debug feature because that string will show up if a NULL pointer is ever used by accident. On newer systems like Windows, a pointer to this area will most likely cause a processor exception.
I can pretty much guarantee that address 1 and 2 will either be in use or will raise a processor exception if they're ever used. You can store any value you like in a pointer. But if you try and dereference a pointer with a random value, you're definitely asking for problems.
How about a nice integer instead?

Although the standard requires that NULL is 0, a pointer that is NULL does not have to consist of all zero bits, although it will do in many implementations. That is also something you have to beware of if you memset a POD struct that contains some pointers, and then rely on the pointers holding "NULL" as their value.
If you want to use the same space as a pointer you could use a union, but I guess what you really want is something that doubles up as a pointer and something else, and you know it is not a pointer to a real address if it contains low-numbered values. (With a union you still need to know which type you have).
I'd be interested to know what the magic other value is really being used for. Is this some lazy-evaluation issue where the pointer gives an indication of how to load the data when it is not yet loaded and a genuine pointer when it is?

Yes, on some platforms address 0x00000001 and 0x00000002 are valid addresses. On other platforms they are not.
In the embedded systems world, the validity depends on what resides at those locations. Some platforms may put interrupt or reset vectors at those addresses. Other embedded platforms may place Position Independent executable code there.
There is no standard specification for the layout of addresses. One cannot assume anything. If you want your code to be portable then forget about accessing specific addresses and leave that to the OS.
Also, the structure of a pointer is platform dependent. So is the conversion of the value in a pointer to a physical address. Some systems may only decode a portion of the pointer, others use the entire pointer value. Some may use indirection (a.k.a. virtual addressing) to access real objects. Still no standardization here either.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js