Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I personally believe I have a firm grasp of pointers, but let's say
int* pMyPointer;
int number = 1000;
pMyPointer = &number;
cout << pMyPointer << endl;
Printing pMyPointer might show a memory address such as 0037FBB0, but why does that matter? How can this be useful while programming?
Object identity.
If you have two pointers or references, how can you tell if using one could affect the other? Simply printing the current value of all the data members won't tell you if they are the same object or clones/copies.
So when you're debugging, you become very interested in whether the addresses stored in different pointers are the same, which requires you to inspect those address values.
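A minimal sketch of that distinction. The Point type and both helper names are made up for illustration: comparing addresses answers "is this the same object?", while comparing members only answers "do these hold the same values?".

```cpp
#include <cassert>

// Hypothetical type, used only for this illustration.
struct Point {
    int x;
    int y;
};

// Two pointers refer to the same object exactly when they hold the same address.
bool same_object(const Point* a, const Point* b) {
    assert(a != nullptr && b != nullptr);
    return a == b;
}

// Two objects can be equal in value and still be distinct objects (copies).
bool same_value(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}
```

A copy of a Point passes same_value but fails same_object, so writing through one pointer cannot affect the other.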
This is very important information in static memory management.
In userspace application development, this information might be irrelevant to most developers, but it is very important for the low-level developer. Remember, programming languages with a static memory manager will always be lower-level languages.
This is also the main point of how pointers work: they store the memory address of whatever they point to.
You can also create a pointer like this: int *mptr = (int*)0x13371234;. This creates a pointer which points to the int at 0x13371234 (dereferencing it is only meaningful if an int actually lives at that address).
It also tells you where your data is stored, which can be used to determine the location of the stack if, for any reason, inline assembly is not allowed. If you use malloc, you generally do not need this information.
A typical use is to create linked data structures like lists and trees.
For example, in a binary tree each node contains a pointer to its two children.
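As a rough sketch of such a node (the names Node, insert and contains are illustrative, and std::unique_ptr is used here instead of raw pointers only so that the nodes clean up after themselves):

```cpp
#include <cassert>
#include <memory>

// Each node owns its two children through pointers; a null pointer means
// "no child here".
struct Node {
    int value;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
    explicit Node(int v) : value(v) {}
};

// Insert into an (unbalanced) binary search tree.
void insert(std::unique_ptr<Node>& root, int value) {
    if (!root) {
        root = std::make_unique<Node>(value);
        return;
    }
    if (value < root->value) {
        insert(root->left, value);
    } else {
        insert(root->right, value);
    }
}

// Walk down the tree by following the stored child pointers.
bool contains(const Node* node, int value) {
    while (node != nullptr) {
        if (value == node->value) return true;
        node = (value < node->value) ? node->left.get() : node->right.get();
    }
    return false;
}
```

The code never looks at the numeric value of an address; it only follows it.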
As far as the pointer goes, the actual value only rarely means much in itself. It's usually used as a "magic cookie" -- an essentially "magic" value that gives you access to some particular variable. In a typical case you save an address into a pointer, then dereference the pointer to get to the item at that address, without ever examining (or even caring about) the value of the pointer itself.
There are a few exceptions to this general rule though. For one example, some memory allocators use the address of a block to track not only the location of the block, but also the block's size. By starting with a block aligned to a large boundary, and always splitting blocks by powers of 2, the whole address tells the location of the block, and the lower bits of the address tell the size of the block that must have been allocated to get to that address.
The latter are definitely the exceptions though. The typical case is that the value of the pointer means nothing beyond giving access to the item at that address.
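To make the allocator exception a little more concrete, here is a hedged sketch assuming a buddy-style scheme: a power-of-two arena that is only ever split in halves. The function name and parameters are hypothetical, and the result is only an upper bound on the block size, since a real allocator also has to record how far a block was split.

```cpp
#include <cassert>
#include <cstddef>

// Largest power-of-two block that can begin at `offset` (measured from the
// start of the arena). Assumes arena_size is a power of two and
// offset < arena_size.
std::size_t max_block_size_at(std::size_t offset, std::size_t arena_size) {
    assert(arena_size != 0 && (arena_size & (arena_size - 1)) == 0);
    assert(offset < arena_size);
    if (offset == 0) return arena_size;  // the unsplit arena starts here
    return offset & (~offset + 1);       // isolate the lowest set bit
}
```

For a 1024-byte arena, offset 512 can start a block of at most 512 bytes and offset 768 one of at most 256 bytes: the low bits of the address constrain the size.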
I want to clarify my mental map on memory allocation.
Let's suppose I have the following array:
int arr [] = {1,2,3};
Let's suppose each integer will occupy 4 bytes in memory.
Such that the memory addresses of the integers could be :
HHH01 HHH05 HHH09
Will the memory chunk of arr be a superset of the memory chunks of each integer?
Strictly speaking, IIRC, the standard only guarantees that the elements of an array are laid out contiguously and in order; everything else (the actual address values, how they map onto physical memory) is unspecified. That's important because dipping into undefined behavior leads to some of the hardest and most obscure bugs to track down. As long as the implementation can perform the necessary arithmetic to find and dereference the proper elements, anything beyond that should be considered safely abstracted away.
With that said... I think the answer to the question for most (if not all?) practical purposes is that your understanding is correct. If you were to cout << &(arr[0]); cout << &(arr[2]) you'd get the addresses you expect in any compiler that I've used, and the amount of memory allocated will be the amount you'd expect. Doing cout << &(arr[3]) will even give you a valid (one-past-the-end) address, though actually reading arr[3] is undefined behavior and would typically yield garbage. The only thing to be aware of is that different compilers and platforms can use different sizes and alignments for those elements: sizeof(int) might be 2 bytes on one platform and 4 on another, and the spacing between consecutive elements always equals sizeof(int).
In the end, while this can be interesting from an academic point of view... making actual use of it should be avoided at essentially all costs. If you start trying to manually access memory locations, it's likely to come back and bite you or whoever has to maintain your code down the line somewhere.
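For the academic point of view, a small sketch of what can safely be checked. The helper byte_distance is a made-up name; the C++ guarantee it relies on is that array elements are contiguous, so neighbouring elements sit sizeof(int) bytes apart and the array occupies exactly the bytes of its elements.

```cpp
#include <cassert>
#include <cstddef>

// Distance in bytes between two elements of the same int array.
std::ptrdiff_t byte_distance(const int* first, const int* second) {
    assert(first != nullptr && second != nullptr);
    return reinterpret_cast<const char*>(second) -
           reinterpret_cast<const char*>(first);
}
```

With int arr[] = {1,2,3};, byte_distance(&arr[0], &arr[1]) equals sizeof(int) and sizeof(arr) equals 3 * sizeof(int), whatever the concrete addresses turn out to be.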
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
How does a compiler know which node to point to next in a linked list, since each node's next or previous can be anywhere in the heap? Can I attach a specific memory address, say 0x00000001, to a pointer and bind it to that address?
Also, can you bind a specific address to a pointer?
Yes. In fact, that is precisely the purpose of a pointer. Memory is addressed by numbers. The first byte is in the address 0, the next is 1 and so on. A pointer is essentially an object that stores a memory address, which is internally simply a number.
In the following example, we store the address of the object i into a pointer:
int i;
int* ptr = &i;
How does a compiler string together addresses in a linked list
A linked list node is simply a structure with a pointer to the next (and previous in case of doubly linked list) node. The address of the next node is stored in the pointer.
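A minimal sketch of that structure (ListNode and sum are illustrative names). The compiler does not need to "know" where the next node is; the program stores that address in the node at run time and simply reads it back when traversing.

```cpp
#include <cassert>

// A singly linked node: the payload plus the address of the next node.
struct ListNode {
    int value;
    ListNode* next;  // nullptr marks the end of the list
};

// Traverse by repeatedly loading the address stored in `next`.
int sum(const ListNode* head) {
    int total = 0;
    for (const ListNode* node = head; node != nullptr; node = node->next) {
        total += node->value;
    }
    return total;
}
```

The nodes can live anywhere (stack, heap, static storage); only the stored addresses tie them together.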
how does the compiler allocate [...] these addresses during runtime?
In whatever way the language implementation chooses to allocate. The language does not specify "how" memory is allocated. The way will differ depending on the storage duration class of the object.
Typically in practice, the language implementation will ask the operating system to map some memory. Variables with automatic storage are stored in what is called the "call stack", and dynamic objects are stored in "free store". How either of these are implemented is outside the scope of my answer. I would suggest studying how operating systems and compilers are made. It's a wide topic.
how does the compiler [...] reference these addresses during runtime?
It stores the address. There are CPU instructions for accessing memory at given address.
Can I for example , bind 0x00000001 to node3?
I don't know what "node3" is, but you can point at the address 1 if you want to. There's not much you can do with the pointer unless there actually is an object at the address. There's no way of creating objects into unallocated memory, and there is no way of allocating memory from arbitrary address in standard C++.
I have a question about heap overflows.
I understand that if a stack variable overruns its buffer, it could overwrite the EIP and ESP values and, for example, make the program jump to a place where the coder did not expect it to jump.
This seems, as I understand it, to behave like this because of the backward little-endian storing (where, for example, the characters in an array are stored "backwards", from last to first).
If you on the other hand put that array into the heap, which grows contra the stack, and you would overflow it, would it just write random garbage into empty memory space then? (Unless you were on Solaris, which as far as I know is a big-endian system; side note.)
Would this basically be a danger, since it would just write into "empty space"?
So no aimed jumping to addresses and areas the code was not designed for?
Am I getting this wrong?
To specify my question:
I am writing a program where the user is meant to pass a string argument and a flag when executing it via command line, and I want to know if the user could perform a hack with this string argument when it is put on the heap with the malloc function.
If you on the other hand put that array into the heap, which grows contra the stack, and you would overflow it, would it just write random garbage into empty memory space then?
You are making a couple of assumptions:
You are assuming that the heap is at the end of the main memory segment. That ain't necessarily so.
You are assuming that the object in the heap is at the end of the heap. That ain't necessarily so. (In fact, it typically isn't so ...)
Here's an example that is likely to cause problems no matter how the heap is implemented:
char *a = malloc(100);
char *b = malloc(100);
char *c = malloc(100);
for (int i = 0; i < 200; i++) {
b[i] = 'Z';
}
Writing beyond the end of b is likely to trample either a or c ... or some other object in the heap, or the free list.
Depending on what objects you trample, you may overwrite function pointers, or you may do other damage that results in segmentation faults, unpredictable behaviour and so on. These things could be used for code injection, to cause the code to malfunction in other ways that are harmful from a security standpoint ... or just to implement a denial of service attack by crashing the target application / service.
There are various ways heap overflow could lead to code execution:
Most obvious - you overflow into another object that contains function pointers and get to overwrite one of them.
Slightly less obvious - the object you overflow into doesn't itself contain function pointers, but it contains pointers that will be used for writing, and you get to overwrite one of them to point to a function pointer so that a subsequent write overwrites a function pointer.
Exploiting heap bookkeeping structures - by overwriting the data that the heap allocator itself uses to track size and status of allocated/free blocks, you trick it into overwriting something valuable elsewhere in memory.
Etc.
For some advanced techniques, see:
http://packetstormsecurity.com/files/view/40638/MallocMaleficarum.txt
Even if you can't overwrite a return address, how do you feel about an attacker modifying the rest of your data? This shouldn't thrill you.
To answer your question generally: it is a very bad idea to let the user copy data anywhere without checking its size. You should absolutely never do that, especially on purpose.
If the user means no harm, they may crash your program, either by overwriting useful data, or by causing a page fault. If your user is malicious, you're potentially letting them hijack your system. Both are highly undesirable.
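One way to do that check, sketched under the assumption that the destination buffer's capacity is known at the call site (copy_checked is a hypothetical helper, not a standard function):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies `src` into `dest` without ever writing more than `capacity` bytes.
// The result is always null-terminated when capacity > 0.
// Returns true if the whole string fit, false if it had to be truncated.
bool copy_checked(char* dest, std::size_t capacity, const char* src) {
    assert(dest != nullptr && src != nullptr);
    if (capacity == 0) return false;
    const std::size_t length = std::strlen(src);
    const std::size_t to_copy = (length < capacity) ? length : capacity - 1;
    std::memcpy(dest, src, to_copy);
    dest[to_copy] = '\0';
    return length < capacity;
}
```

A caller that gets false back can reject the input instead of silently working with truncated data. This applies equally to stack and heap buffers.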
Endianness does not matter to buffer overflows. Big endian machines are just as vulnerable as little-endian machines. The only difference will be the byte order of the malicious data.
You may be thinking instead of the direction the stack grows in, which is independent of endianness. In the case where it grows up, you won't be able to hijack the return address of the function that declares the buffer. However, if you pass that buffer address to any other function, and this function overflows instead, an attacker may change this function's return address. This would be the case, for instance, if you called memcpy or scanf or any other function to modify your buffer (assuming that the compiler didn't inline them).
The stack usually grows downwards. In this case, an attacker can use an overflow to hijack the return address of the function that declares it.
In other words, neither the stack configuration nor endianness offer meaningful protection against stack buffer overflows.
As for the heap:
If you on the other hand put that array into the heap, which grows contra the stack, and you would overflow it, would it just write random garbage into empty memory space then?
The answer, as almost always, is "it depends", but probably not. The 32-bit implementation of malloc in glibc keeps bookkeeping structures at the end of the buffer (or at least, used to). By overflowing onto the bookkeeping structures with the correct incantations, you could cause free to write four arbitrary bytes at an arbitrary location when the allocation was freed. This is a lot of power. This kind of bug comes up regularly in capture-the-flag competitions and is very exploitable.
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I would like to know which pointer values are invalid so I would not have to allocate new memory just to mark special chunk states (memory consumption is critical). Then I could use them for special states like:
0x00000000 - would mean chunk is not loaded
0x00000001 - would mean chunk is empty
0x00000002 - would mean chunk is full. And when some real data needs to be saved to memory I would do new Chunk(...);
I would suggest just using a struct that contains a pointer and an enum. But if for some reason that's inconvenient, just allocate some small structures and use their addresses just to indicate magic pointer values. (Of course, don't ever free them.)
You can also use the address of static objects. Like this:
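// One dummy object per state; only their (unique) addresses matter.
static int chunk_not_loaded_i, chunk_empty_i, chunk_full_i;
void *chunk_not_loaded = &chunk_not_loaded_i;
void *chunk_empty = &chunk_empty_i;
void *chunk_full = &chunk_full_i;
// Compare against the sentinel address instead of a hard-coded number.
if (some_chunk == chunk_not_loaded)
...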
Assigning exact values to a pointer is quite unstable and error-prone. That way, your code would be tied to the exact hardware architecture(s). For example, some platforms have 0x00000000 as an absolutely valid address.
So whether an address is actually assigned or not has nothing to do with the numeric value of the pointer (in the general case).
0x00000000 is equivalent to NULL, which should be replaced by nullptr in C++11. The null pointer does not have a specified numerical representation; most implementations use 0 to make backwards compatibility easy, but that is not guaranteed.
It is the only "special" pointer value. All other values are treated as valid pointer values (meaning attempting to use them would attempt to dereference the pointer - and likely will have bad consequences for values like 0x00000001 or 0x00000002). It sounds like you need a container (e.g. pool) that has a state (which could be an enum or some other value you desire). Alternatively, you could use boost::optional<T> or std::pair<T*, bool> to mark pointers as valid or invalid.
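A rough sketch of the "pointer plus state" idea. Chunk, ChunkState and ChunkSlot are hypothetical names, and the extra Loaded state is an assumption added so the state alone says whether data is present. The cost is one small enum per slot instead of relying on magic address values.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type standing in for the real chunk data.
struct Chunk {
    int blocks[4];
};

enum class ChunkState { NotLoaded, Empty, Full, Loaded };

// The state is stored explicitly; the pointer is non-null only when Loaded.
struct ChunkSlot {
    ChunkState state = ChunkState::NotLoaded;
    std::unique_ptr<Chunk> data;
};

// Allocate real storage only when there is something to store.
void load(ChunkSlot& slot) {
    slot.data = std::make_unique<Chunk>();
    slot.state = ChunkState::Loaded;
}

// Drop the storage and record why there is none.
void mark_empty(ChunkSlot& slot) {
    slot.data.reset();
    slot.state = ChunkState::Empty;
}
```

No memory is allocated for the NotLoaded, Empty or Full states, and no code ever has to guess which numeric addresses are "safe" to abuse.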
Suppose there is a variable a and a pointer p which points to address of a.
int a;
int *p=&a;
Now since I have a pointer pointing to the location of the variable, I know the exact memory location (or the chunk of memory).
My questions are:
Given an address, can we find which variable is using them? (I don't think this is possible).
Given an address, can we at least find how big the chunk of memory is to which that address belongs? (I know this is stupid, but still.)
You can enumerate all your (suspect) variables and check if they point to the same location as your pointer (e.g. you can compare pointers for equality)
If your pointer is defined as int *p, you can assume it points to an integer. Your assumption can be proven wrong, of course, if for example the pointer value is null or you meddled with the value of the pointer.
You can think of memory as a big array of bytes:
Now, if you have a pointer to somewhere in the middle of that array, can you tell how many other pointers point to the same location? Can you tell how much information is stored at the location it points to? Can you at least tell what kind of object is stored there? None of these questions can be answered from the address alone. Some languages add extra information to their memory management routines so that they can track such things later, but C++ aims for minimum overhead, so the answer is no, it is not possible.
For your first question you may handle it using smart pointers. For example, shared_ptr uses a reference counter to know how many shared_ptr instances point to an object, so that it can control the lifetime of that object. The counter can be read with use_count(), but it only counts shared_ptr owners, not raw pointers or references to the same object.
There are non-standard, platform-dependent ways to query the size of dynamically allocated memory (for example _msize on Windows and malloc_usable_size with glibc), but they only work for memory allocated with malloc and are not portable. In C++ the idea is that you take care of this yourself: if you need this feature, implement a solution for it, and if you don't need it, you never pay the extra cost of it.
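A short sketch of the smart-pointer route. std::shared_ptr::use_count() is a standard member function; the wrapper owners_of is a made-up name for illustration. Note that the count covers only shared_ptr copies, and in multithreaded code the value is only approximate.

```cpp
#include <cassert>
#include <memory>

// Number of shared_ptr instances currently sharing ownership of *p.
long owners_of(const std::shared_ptr<int>& p) {
    return p.use_count();
}
```

After auto a = std::make_shared<int>(5); auto b = a;, owners_of(a) reports 2, and it drops back to 1 once b is reset or destroyed.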
Given an address ,can we find which variable is using them ?
No, this isn't possible. Variables refer to memory, not the other way around. There is no way to get variable names back from compiled code, except perhaps via the symbol table, and reading that would in turn probably mean messing around at the assembly level.
Given an address, can we at least find how big the chunk of memory is to which that memory address belongs?
No. There isn't a way to do that given just the address. You could apply sizeof to the dereferenced pointer, but that only reports the size of the pointed-to type, not the size of the allocation the address belongs to.
Question 1.
A: It cannot be done natively, but it could be done with the Valgrind memcheck tool. Its VM tracks all variables and all allocated memory (heap and stack). It is not designed to answer this kind of question, but with some modification it could. For example, it can already correlate the address of an invalid memory access or a leak with variables in the source code. So, given a valid and known memory address, it should be able to find the corresponding variable.
Question 2.
A: It can be done as above, but it can also be done natively with preloaded wrapper libraries for malloc, calloc, strdup, free, etc. By manually instrumenting the memory allocation functions, you can record each allocated address and its size. You can also record the return address with __builtin_return_address() or backtrace() to know where the chunk was allocated. If you store every allocated address and size in a tree, you can then query which chunk an address belongs to, how large that chunk is, and which function allocated it.
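The tree lookup described above could look roughly like this. It is a sketch only: AllocationRegistry and its members are hypothetical names, std::map (an ordered tree) stands in for "a tree", the code needs C++17 for std::optional, and the wrappers around malloc/free that would call record and forget are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>

class AllocationRegistry {
public:
    struct Chunk {
        const void* start;
        std::size_t size;
    };

    // Call from the allocation wrapper.
    void record(const void* start, std::size_t size) {
        chunks_[addr(start)] = size;
    }

    // Call from the deallocation wrapper.
    void forget(const void* start) {
        chunks_.erase(addr(start));
    }

    // Find the recorded chunk that contains address p, if any.
    std::optional<Chunk> find(const void* p) const {
        const std::uintptr_t a = addr(p);
        auto it = chunks_.upper_bound(a);  // first chunk starting after a
        if (it == chunks_.begin()) return std::nullopt;
        --it;                              // last chunk starting at or before a
        if (a - it->first < it->second) {
            return Chunk{reinterpret_cast<const void*>(it->first), it->second};
        }
        return std::nullopt;
    }

private:
    static std::uintptr_t addr(const void* p) {
        return reinterpret_cast<std::uintptr_t>(p);
    }

    std::map<std::uintptr_t, std::size_t> chunks_;  // start address -> size
};
```

Each lookup costs O(log n) in the number of live allocations. Recording the allocating function as well would mean storing one more field per entry.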