I was wondering... what if when you do a new, the address where the reservation starts is 0x0? I guess it is not possible, but why?
is the new operator prepared for that? is that part of the first byte not usable? it is always reserved when the OS starts?
Thanks!
The null pointer is not necessarily address 0x0, so potentially an architecture could choose another address to represent the null pointer and you could get 0x0 from new as a valid address. (I don't think anyone does that, btw, it would break the logic behind tons of memset calls and its just harder to implement anyway).
Whether the null pointer is reserved by the Operative System or the C++ implementation is unspecified, but plain new will never return a null pointer, whatever its address is (nothrow new is a different beast). So, to answer your question:
Is memory address 0x0 usable?
Maybe, it depends on the particular implementation/architecture.
"Early" memory addresses are typically reserved for the operating system. The OS does not use early physical memory addresses to match to virtual memory addresses for use by user programs. Depending on the OS, many things can be there - the Interrupt Vector Table, Page table, etc.
Here is a non-specific graph of layout of physical and virtual memory in Linux; could vary sligthly from distro to distro and release to release:
http://etutorials.org/shared/images/tutorials/tutorial_101/bels_0206.gif
^Don't be confused by the graphic - the Bootloader IS NOT in physical memory... don't know why they included that... but otherwise it's accurate.
I think you're asking why virtual memory doesn't map all the way down to 0x0. One of the biggest reasons is so that it's painfully obvious when you failed to assign a pointer - if it's 0x0, it's pointing to "nothing" and always wrong.
Of course, it's possible for NULL to be any value (as it's implementation-dependent), but as an uninitialized int's value is 0, on every implementation I've seen they've chosen to keep NULL 0 for consistency's sake.
There are a whole number of other reasons, but this is a good one. Here is a Wikipedia article talking a little bit more about virtual addressing.
Many memory addresses are reserved by the system to help with debugging.
0x00000000 Returned by keyword "new" if memory allocation failed
0xCDCDCDCD Allocated in heap, but not initialized
0xDDDDDDDD Released heap memory.
0xFDFDFDFD "NoMansLand" fences automatically placed at boundary of heap memory. Should never be overwritten. If you do overwrite one, you're probably walking off the end of an array.
0xCCCCCCCC Allocated on stack, but not initialized
But like a few others have pointed out, there is a distinction between physical memory addresses which is what the OS uses, and logical memory addresses which are assigned to your application by the OS. Example image shown here.
Related
I am allocating space with ::operator new( sizeof(T) * count).
The 1st call returns an address 0x742f30 and the 2nd returns 0x7f2ef0000d60. I am now confused about the huge difference.
My question: Is this normal that the returned addresses can differ that much?
Update:
SLES 11 SP3 VM on XenServer
gcc 4.9.3
10 GB RAM
Update:
Because some people suspected a wrong output format. I display the returned address by the new command with the same printf format. I copied the pointer values to this question by copy and paste and check them twice. They match the output from my Memory Allocator.
A possible cause is that the first object was allocated in the process's initial data segment, but by the time you allocated the second object this filled up. Traditional memory allocators use sbrk() to extend the data segment, but some modern memory allocators make use of mmap() on /dev/zero to create new memory segments. This might allocate its virtual memory in a very distant part of the address space.
Assuming that there are no restrictions regarding the location of the required memory block, as long as the result is a valid memory pointer (i.e. not null), it should be considered fine. But I as a programmer would be surprised to see the response from the memory allocating function being formatted in a different way (in this case with a different number of digits).
Considering that it's your own library, I would at least make sure that it always output the address of the allocated memory in exactly the same format.
My answer is that it must be some strange virtualization of the memory by Linux. The output of the addresses are always in the same format. I think the answer from Barmar is very close to the real reason. May be I ask the SuSe IT and they have an answer for this.
In C ++ (a 32 bit program) you can get the Hex value of a pointer like so:
Class myClass;
DWORD hexValue = (DWORD)&myClass;
A hex value might look like 0xFF225290.
What is this value relative to? The start of all of the memory, the start of the process (in the memory), the start of the class, or something else?
EDIT: I don't believe I was clear enough. By "relative" I meant, if I have a pointer to 0x0000, will this be at the beginning of the memory of the process, or the memory of the system?
I understand that the memory itself is scattered all about, my question is simply of pointer number values. When you get a pointer to something, is the number assigned to that pointer within the range of the entire system, or just your local process's local memory heap? As one answer mentioned, in "Modern OS's" every process has its own virtual memory. This would suggest that 0x0000 is at the beginning of the process's virtual heap, and not the beginning of the physical memory. This was my question.
It's not relative to anything. It's a value which represents an address in memory.
Most modern computers have a single contiguous address space; in which case you could say it's "relative" to the start of that space. But that's not necessarily the case - there have been (and still are, in embedded systems, and probably will be in the future) platforms with more complicated address spaces.
To answer the question you meant but didn't ask: on most modern computer platforms, each process has its own virtual address space, so pointers from one process will be meaningless in another.
You cannot say a pointer is relative to anything on modern systems: Pointers are nothing more or less than addresses in the virtual memory space of the process. This virtual memory address space is something entirely abstract conjured up by the OS with some help of the hardware.
As it is, ranges of addresses that are contiguous in the virtual memory address space do not even need to be contiguous in physical memory, the pieces of the virtual memory (called pages) may freely be scattered across the physical memory, the OS can distribute them however it pleases. This breaks any notion of a pointer being relative to anything.
Suppose there is a variable a and a pointer p which points to address of a.
int a;
int *p=&a;
Now since I have a pointer pointing to the location of the variable, I know the exact memory location (or the chunk of memory).
My questions are:
Given an address, can we find which variable is using them? (I don't think this is possible).
Given an address, can we atleast find how big is the chunk of memory to which that memory address belongs. (I know this is stupid but still).
You can enumerate all your (suspect) variables and check if they point to the same location as your pointer (e.g. you can compare pointers for equality)
If your pointer is defined as int *p, you can assume it points to an integer. Your assumption can be proven wrong, of course, if for example the pointer value is null or you meddled with the value of the pointer.
You can think of memory as a big array of bytes:
now if you have a pointer to somewhere in middle of array, can you tell me how many other pointers point to same location as your pointer?? Or can you tell me how much information I stored in memory location that you point to it?? Or can you at least tell me what kind of object stored at location of your pointer?? Answer to all of this question is really impossible and the question look strange. Some languages add extra information to their memory management routines that they can track such information at a later time but in C++ we have the minimum overhead, so your answer is no it is not possible.
For your first question you may handle it using smart pointers, for example shared_ptr use a reference counter to know how many shared_ptr are pointing to a memory location and be able to control life time of the object(but current design of shared_ptr do not allow you to read that counter).
There is non-standard platform dependent solution to query size of dynamically allocated memory(for example _msize on Windows and memory_size on Unix) but that only work with dynamic memories that allocated using malloc and is not portable, in C++ the idea is you should care for this, if you need this feature implement a solution for it and if you don't need it, then you never pay extra cost of it
Given an address ,can we find which variable is using them ?
No, this isn't possible. variables point to memory, not the other way around. There isn't some way to get to variable-names from compiled code, except maybe via the symbol table, reading which in-turn would probably need messing around with assembly.
Given an address ,can we atleast find how big is the chunk of memory
to which that memory address belongs..
No. There isn't a way to do that given just the address. You could find the sizeof() after dereferencing the address but not from the address itself.
Question 1.
A: It cannot be done natively, but could be done by Valgrind memcheck tool. The VM tracks down all variables and allocated memory space/stack. However, it is not designed to answer such question, but with some modification, memcheck tool could answer this question. For example, it can correlate invalid memory access or memory leakage address to variables in the source code. So, given a valid and known memory address, it must be able to find the corresponding variable.
Question 2.
A: It can be done like above, but it can also be done natively with some PRELOADED libraries for malloc, calloc, strdup, free, etc. By manual instructed memory allocation functions, you can save allocated address and size. And also save the return address by __builtin_return_address() or backtrace() to know where the memory chunk is being allocated. You have to save all allocated address and size to a tree. Then you should be able to query the address belongs to which chunk and the chunk size, and what function allocated the chunk.
Or are those things are reserved for the operation system and things like that?
Thanks.
While it's unlikely that 0x00000001, etc. will be valid pointers (especially if you use odd numbers on many processors) using a pointer to store an integer value will be highly system dependent.
Are you really that strapped for space?
Edit:
You could make it portable like this:
char *base = malloc(NUM_MAGIC_VALUES);
#define MAGIC_VALUE_1 (base + 0)
#define MAGIC_VALUE_2 (base + 1)
...
Well the OS is going to give each program it's own virtual memory space, so when the application references memory spaces 0x0000001 or 0x0000002, it's actually referencing some other physical memory address. I would take a look at paging and virtual memory. So a program will never have access to memory the operating system is using. However I would stay away from manually assigning a memory address for a pointer rather than using malloc() because those memory addresses might be text or reserved space.
This depends on operating system layout. For User space applications running in general purpose operating systems, these are inaccessible addresses.
This problem is related to a architecture's virtual address space. Have a loot at this http://web.cs.wpi.edu/~cs3013/c07/lectures/Section09.1-Intel.pdf
Of course, you can do this:
int* myPointer1 = 0x000001;
int* myPointer2 = 0x000032;
But do not try to dereference addresses, cause it will end in an Access Violation.
The OS gives you the memory, by the way these addresses are just virtual
the OS hides the details and shows it like a big, continous stripe.
Maybe the 0x000000-0x211501 part is on a webserver and you read/write it through net,
and remaining is on your hard disk. Physical memory is just an illusion from your current viewpoint.
You tagged your question C++. I believe that in C++ the address at 0 is reserved and is normally referred to as NULL. Other than that you cannot assume anything. If you want to ask about a particular implementation on a particular OS then that would be a different question.
It depends on the compiler/platform, but many older compilers actually have something like the string "(null)" at address 0x00000000. This is a debug feature because that string will show up if a NULL pointer is ever used by accident. On newer systems like Windows, a pointer to this area will most likely cause a processor exception.
I can pretty much guarantee that address 1 and 2 will either be in use or will raise a processor exception if they're ever used. You can store any value you like in a pointer. But if you try and dereference a pointer with a random value, you're definitely asking for problems.
How about a nice integer instead?
Although the standard requires that NULL is 0, a pointer that is NULL does not have to consist of all zero bits, although it will do in many implementations. That is also something you have to beware of if you memset a POD struct that contains some pointers, and then rely on the pointers holding "NULL" as their value.
If you want to use the same space as a pointer you could use a union, but I guess what you really want is something that doubles up as a pointer and something else, and you know it is not a pointer to a real address if it contains low-numbered values. (With a union you still need to know which type you have).
I'd be interested to know what the magic other value is really being used for. Is this some lazy-evaluation issue where the pointer gives an indication of how to load the data when it is not yet loaded and a genuine pointer when it is?
Yes, on some platforms address 0x00000001 and 0x00000002 are valid addresses. On other platforms they are not.
In the embedded systems world, the validity depends on what resides at those locations. Some platforms may put interrupt or reset vectors at those addresses. Other embedded platforms may place Position Independent executable code there.
There is no standard specification for the layout of addresses. One cannot assume anything. If you want your code to be portable then forget about accessing specific addresses and leave that to the OS.
Also, the structure of a pointer is platform dependent. So is the conversion of the value in a pointer to a physical address. Some systems may only decode a portion of the pointer, others use the entire pointer value. Some may use indirection (a.k.a. virtual addressing) to access real objects. Still no standardization here either.
I am wondering , what exactly is stored in the memory when we say a particular variable pointer to be NULL. suppose I have a structure, say
typdef struct MEM_LIST MEM_INSTANCE;
struct MEM_LIST
{
char *start_addr;
int size;
MEM_INSTANCE *next;
};
MEM_INSTANCE *front;
front = (MEM_INSTANCE*)malloc(sizeof(MEM_INSTANCE*));
-1) If I make front=NULL. What will be the value which actually gets stored in the different fields of the front, say front->size ,front->start_addr. Is it 0 or something else. I have limited knowledge in this NULL thing.
-2) If I do a free(front); It frees the memory which is pointed out by front. So what exactly free means here, does it make it NULL or make it all 0.
-3) What can be a good strategy to deal with initialization of pointers and freeing them .
Thanks in advance
There are many good contributed answers which adequately address the questions. However, the coverage of NULL is light.
In a modern virtual memory architecture, NULL points to memory for which any reference (that is, an attempt to read from or write to memory at that address) causes a segfault exception—also called an access violation or memory fault. This is an intentional protective mechanism to detect and deal appropriately with invalid memory accesses:
char *p = 0;
for (int j = 0; j < 50000000; ++j)
*(p += 1000000) = 10;
This code writes a ten at every millionth memory byte. It won't run for many loops—probably not even once. Either it will attempt to access an unmapped address, or it will attempt to modify read-only memory—where constant data or program code reside. The CPU will interrupt the instruction midway and report the exception to the operating system. Since there's no exception handling specified, the default o/s handling is to terminate the program. Linux displays Segmentation fault (for historical reasons). MS Windows is inconsistent, but tends to say access violation. The same should happen with any program in protected virtual memory doing this:
char *p = NULL;
if (p [34] == 'Y')
printf ("A miracle has occurred!\n");
This segfaults. A memory location near the NULL address is being dereferenced.
At the risk of confusion, it is possible that a large offset from zero will be valid memory. Thirty-four certainly won't be okay, but 34,000 might be. Different operating systems and different program processing tools (linkers) reserve a fixed amount of the zero end of memory and arrange for it to be unmapped. It could be as little as 1K, though 8K was a popular choice in the 1990s. Systems with ample virtual address space (not memory, but potential memory) might leave an 8M or 16M memory hole. Some modern operating systems randomize the amount of space reserved, as well as randomly varying the locations for the code and data sections each time a program starts.
The other extreme is non-virtual memory architectures. Typically, these provide valid addresses beginning at address zero up to the limit of installed memory. Such is common in embedded processors, many DSPs, and pre-protected mode CPUs, 8 and 16-bit processors like the 8086, 68000, etc. Reading a NULL address does not cause any special CPU reaction—it simply reads whatever is there, which is usually interrupt vectors. Writes to low memory usually result in hard-to-diagnose dire consequences as interrupt vectors are used asynchronously and infrequently.
Even stranger is the segment model of the oddly named "real-mode" x86 using small or medium memory model conventions. Such data addresses are 16 bits using the DS register which is set to the program's initialized data area. Dereferencing NULL accesses the first bytes of this space, but MSDOS programs contain ancient runtime structures for compatibility with CP/M, an o/s Fred Flintstone used. No exceptions, and maybe no consequences for modifying the memory near NULL in this environment. These were challenging bugs to find without program source code.
Virtual memory protection was a huge leap forward in creating stable systems and protecting programmers from themselves. Properly used NULLs provide significant safety and rapid debugging of programming flaws.
Wow, a lot of answers effectively claim that assigning NULL to a pointer sets it to point to the address 0, confusing value and representation. It does not. Setting a pointer to the value NULL or 0 is an abstract conception that sets the pointer to an invalid value not pointing to any valid object. The binary representation actually stored in memory does not need to be all bits 0. This is usually not an architecture thing, it is up to the compiler. In fact I had an old DOS compiler (on x86) that used all bits 1 for a NULL pointer.
Additionally, any pointer type is allowed to have its own binary representation for NULL, as long as all these pointers compare as equal when compared.
Granted, most of the times all bits are 0 for a NULL pointer for practical reasons, but it is not required. This means that using calloc() or memset(0) is not a portable initialization of pointers.
NULL assigned to a pointer does not change the "fields pointed by it".
In your case if you make front = NULL, front will no longer point to the structure allocated by your malloc, but will contain zero (NULL is 0 according to the C standard). Nothing will point to your allocated struct - it's a memory leak.
Note the critical distinction here between the pointer (front) and what it points to (the structure) - it's a big difference.
To answer your specific questions:
If you run front=NULL, front will no longer point to a MEM_INSTANCE structure, and hence front->size will have no meaning (it will probably crash the program)
If you do free(front) the OS will free the memory allocated to you for the MEM_INSTANCE structure. front will now point to memory that's no longer yours - and you can't access it
It's a broad question - please ask a more specific one.
Assignment to a pointer is not the same as assigmment to the elements to which the pointer points. Assigning NULL to front will make it so that front points to nothing, but your allocated memory will be unaffected. It will not write any data into the fields formerly pointed to by front Moreover, that is a memory leak.
Invoking free(front) will deallocate the block of memory but will not affect the value of front; in other words, front will point to a memory region that you no longer own and which is no longer valid for you to access. This is also known as a "dangling pointer", and it is generally a good idea to follow free(front) immediately with front=NULL so that you know that front is no longer valid.
A good strategy for dealing with pointers is, at least in C++, to use smart pointer classes and to perform allocation only in constructors and to perform deallocation only in destructors. Another good strategy is to ensure that you always assign NULL to any pointer that you have just freed. In C, you really just have to make sure that your allocations are matched properly with deallocations. It can also help to use "name_of_object_create" and "name_of_object_destroy" functions that parallel C++ constructors/destructors; however, there is no way in C to ensure automatic destruction.
NULL is a sentinel value that denotes that a pointer does not point to a meaningful location. I.e. "Do not attempt to dereference this".
So what exactly free means here, does it make it NULL or make it all 0.
Neither. It merely frees the memory block that front points to. The value in front remains as it was.
In C/C++ NULL == 0.
int* a = NULL;
int* b = 0;
The value stored in variables 'a' & 'b' will both be 0. There is no special magic to "NULL". The day this was explained to me, pointers suddenly made sense.
The definition of NULL is usually
#define NULL 0
or
#define NULL (void*)0
Assigning it to a pointer merely makes that pointer stop pointing at whatever memory address it was pointing at and now point to memory address 0. This is usually done at initialization or after a pointer's memory has been free-d, though it isn't necessary. Setting a pointer equal to NULL does not deallocate memory or change any values of whatever it used to be pointing to.
Calling free() (in C) or delete (in C++) will deallocate the memory the pointer pointed to, but it will not set the pointer to NULL. Dereferencing the pointer after its been free-d is undefined behavior (ie. crashes normally). Therefore a common idiom is to set a pointer to NULL after it has been deallocated to more easily catch erroneous deferences later on.
It depends on what you mean. A null is tradionally means no value.
In C generally null mean 0. Therefore a pointer points to address 0. However if you actually have a a piece of memory then there coulkd be anything the memory from whatever used it last. If you clear the memory to (say) 0's then if you say that that memory contains pointers, those pointers will be null.
You have to think about what a pointer is: Its simply a value that holds an address of some memory. So if the address is 0x1 then this pointer with the value is pointing at the second byte in memory (remember addressing traditionally is 0 for first item, 1 for second etc). So if I char * p = 0x1; is say p points to the memory starting at address 0. Since I have declared it as char *, I've saying that I'm interested in a char sized value in the memory pointed at by 0. So *p is the value in the second byte in memory.
for example:
take the following
struct somestruct { char p } ;
// this means that I've got somestruct at location null (0x00000)
somestruct* ptrToSomeStruct = null;
so the ptrToSomeStruct->p says take the contents of where ptrToSomeStruct points (0x00000) and then what ever is there take tat to be a value of a char, so you are read the the first byte in memory
now if I declare it like so:
// this means that I've got somestruct on the stack and there for it's got some memory behind it.
somestruct ptrToSomeStruct;
so the ptrToSomeStruct->p says take the contents of where ptrToSomeStruct points (somewhere on the stack) and then what ever is there take that to be a value of a char, so you are read the the some byte from the stack.
Reflecting the comments below:
One of the key problems faced by C (and simmilar laguages) programmers is that sometimes a pointer will be pointing to the wrong part of memory, so when you read the value then you gone to the wrong part of memory to start with, hence what you find there is wrong anyway. In lots of cases the wrong address is actually set to 0. This like my examples means go to the start of memory and read stuff there. To help with programming errors where you actually have 0 in a pointer, many operating systems/architectures prevent you from reading or writing that memory and when you do, your program gets a address exception/fault.
Typically, in classic pointers, a null pointer points to address 0x0. It depends on architecture and specific language, but if it is a primitive type then the value 0 would be considered NULL.
In Intel architectures, the beginning of memory (address 0) contains reserved space, which cannot be allocated. It is also outside the boundary of any running application. So a pointer pointing there would quite safely mean NULL as well.
Speaking in C:
The preprocesor macro NULL is #defined (by stdio.h or stddef.h), with value 0 (or (void *)0)
-1) If I make front=NULL. What will be the value which actually gets stored in the different fields of the front, say front->size ,front->start_addr. Is it 0 or something else. I have limited knowledge in this NULL thing.
You will have front = 0x0. Doing font->size will raise a SIGSEG.
-2) If I do a free(front); It frees the memory which is pointed out by front. So what exactly free means here, does it make it NULL or make it all 0.
Free will mark the memory once held in front as free, so another malloc/realloc call may use it. Whether it sets your pointer to NULL or leaves its value unchanged its implementation dependant, but surely wont set all the struct to 0.
-3) What can be a good strategy to deal with initialization of pointers and freeing them .
I like to initialize my pointers to NULL and set them to NULL after being deallocated.
Your question suggests that you don't understand pointers at all.
If you put front = NULL the compiler will do front = 0 and as front contained an address of actual structure then you'll lose a possibility to free it.
Read "Kernighan & Ritchie" one again.