Special Pointer Value 0xFEEEFEF6 - c++

I came across an access violation while reading location 0xFEEEFEF6 in Visual Studio 2012 (using Nov 2012 CTP compiler) somewhere deep inside the C++11 concurrency implementation.
Does this value have any special meaning? Looking in Wikipedia I found similar entries (0xFEEEFEEE and 0xFEEDFACE).

0xFEEEFEF6 itself doesn't have a special meaning, but it's likely derived from one of the "guard byte" patterns that MSVC puts around heap allocations. As Jan Dvorak noted, it's probably 8 bytes past the end of something (0xFEEEFEEE + 8), quite likely two pointers past the end of an array.
The concept is that memory you're likely to accidentally access, but shouldn't, is marked with very obvious patterns. Some of the most common are 0xCDCDCDCD and 0xFDFDFDFD, although 0xDDDDDDDD and 0xFEEEFEEE are also easy to run into. Classic compilers (not sure if any still use it) liked 0xDEADBEEF. Here's a pretty good write-up of the cases and positions you'll see guard bytes in.
The two most common causes of seeing these in a segfault (access violation) are accessing memory that has already been freed and overrunning your bounds, especially in an array of pointers. Most of the values used for guard data aren't valid if they show up in the app otherwise: you're not going to get a block of memory at 0x00000000 or 0xCDCDCDCD, since those are well outside the virtual address space your heap lives in. Knowing the common ones off the top of your head can save a lot of time debugging.
Note that, with very few or no exceptions, these guard bytes only show up in debug builds. Writing memory with special patterns every time it's allocated or deallocated (in fact, writing significantly more memory than has been allocated, since most of the guard patterns occur on the boundaries of allocated chunks) is fairly expensive and isn't something you want in a release build. If you have issues like this in your debug builds, you're likely to get seemingly random (undefined) addresses from a release build. You may also get unlucky enough that a legitimate address ends up being grabbed by mistake, which can lead to all sorts of heap corruption.
Because guard bytes don't show up in release builds, you can't check for them the way you might check for NULL and use that as a condition in your code. Instead, smart pointers and containers can help you manage memory correctly and avoid the bad access in the first place. While occasionally annoying, smart pointers very much help avoid issues like this. Note that some types of access violation, particularly buffer overruns, constitute an entire class of security vulnerability because of how often they appear.
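As a minimal sketch of that advice (the Widget type and values are invented for illustration), letting std::unique_ptr and std::vector own the memory keeps lifetime and bounds in the type system rather than in your head:
#include <iostream>
#include <memory>
#include <vector>

// Illustrative only: ownership and bounds live in the types, so dangling
// pointers and overruns are much harder to write by accident.
struct Widget {
    int value = 42;
};

int main() {
    auto w = std::make_unique<Widget>();   // freed automatically at scope exit
    std::cout << w->value << '\n';

    std::vector<int> values(10, 0);        // knows its own size
    int next = 0;
    for (int &v : values) {                // range-for cannot run past the end
        v = next++;
    }
    std::cout << values.at(9) << '\n';     // at() throws instead of overrunning
    return 0;
}                                          // the Widget is destroyed here; no delete needed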
Without a VM/runtime forcing you to stay within certain memory constraints, it can be very easy to access memory you ought not to. For example:
int values[10];
int output = 0;
int i = 0;
int length = 10;
while (i < length) {
    output += values[++i];   // pre-increment: reads values[1] .. values[10]
}
The prefix-increment will cause you to run off the end of the array and access values[10], an invalid index. Sometimes that may immediately cause an access violation and halt your program, other times it may be memory you're allowed to access and the value will be added to output, which could cause an overflow on output and unexpected behavior through the rest of the app.
Guard bytes exist so that, while debugging, your segfaults and stray reads have repeatable values and are as obvious as possible.
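For contrast, here's a sketch of the same loop written so it stays in bounds:
int main() {
    int values[10] = {0};
    int output = 0;
    const int length = 10;
    for (int i = 0; i < length; ++i) {
        output += values[i];   // indexes 0 through 9 only, never past the end
    }
    return output;             // always 0 here; the point is the loop bounds
}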

Related

dangers of heap overflows?

I have a question about heap overflows.
I understand that if a stack variable overruns its buffer, it could overwrite the EIP and ESP values and, for example, make the program jump to a place where the coder did not expect it to jump.
As I understand it, this behaves like this because of little-endian storage (where, for example, the characters in an array are stored "backwards", from last to first).
If you, on the other hand, put that array on the heap, which grows in the opposite direction from the stack, and you overflowed it, would it just write random garbage into empty memory space? (Unless you were on Solaris, which as far as I know is a big-endian system; side note.)
Would this basically even be a danger, since it would just write into "empty space"?
So no targeted jumping to addresses and areas the code was not designed for?
Am I getting this wrong?
To specify my question:
I am writing a program where the user is meant to pass a string argument and a flag when executing it via command line, and I want to know if the user could perform a hack with this string argument when it is put on the heap with the malloc function.
If you, on the other hand, put that array on the heap, which grows in the opposite direction from the stack, and you overflowed it, would it just write random garbage into empty memory space?
You are making a couple of assumptions:
You are assuming that the heap is at the end of the main memory segment. That ain't necessarily so.
You are assuming that the object in the heap is at the end of the heap. That ain't necessarily so. (In fact, it typically isn't so ...)
Here's an example that is likely to cause problems no matter how the heap is implemented:
char *a = malloc(100);
char *b = malloc(100);
char *c = malloc(100);
for (int i = 0; i < 200; i++) {
    b[i] = 'Z';
}
Writing beyond the end of b is likely to trample either a or c ... or some other object in the heap, or the free list.
Depending on what objects you trample, you may overwrite function pointers, or you may do other damage that results in segmentation faults, unpredictable behaviour and so on. These things could be used for code injection, to cause the code to malfunction in other ways that are harmful from a security standpoint ... or just to implement a denial of service attack by crashing the target application / service.
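To make the function-pointer case concrete, here's a purely illustrative sketch (the Handler type and names are invented, the two allocations are not guaranteed to be adjacent under any particular allocator, and the overflow itself is undefined behaviour, so treat this as a thought experiment rather than a working exploit):
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Invented example: a heap object holding a callback, allocated near an
// undersized buffer. Overflowing the buffer *may* trample the callback.
void greet() { std::puts("hello"); }

struct Handler {
    void (*callback)();
};

int main() {
    char *buf = static_cast<char *>(std::malloc(16));
    Handler *h = static_cast<Handler *>(std::malloc(sizeof(Handler)));
    if (!buf || !h) return 1;
    h->callback = greet;

    // Undefined behaviour: writes far past the 16 bytes owned by buf. If h
    // happens to sit just after buf, its callback pointer is overwritten with
    // 'A' bytes, and the call below goes who knows where.
    std::memset(buf, 'A', 64);

    h->callback();   // may print "hello", may crash, may be hijacked

    std::free(h);
    std::free(buf);
    return 0;
}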
There are various ways heap overflow could lead to code execution:
Most obvious - you overflow into another object that contains function pointers and get to overwrite one of them.
Slightly less obvious - the object you overflow into doesn't itself contain function pointers, but it contains pointers that will be used for writing, and you get to overwrite one of them to point to a function pointer so that a subsequent write overwrites a function pointer.
Exploiting heap bookkeeping structures - by overwriting the data that the heap allocator itself uses to track size and status of allocated/free blocks, you trick it into overwriting something valuable elsewhere in memory.
Etc.
For some advanced techniques, see:
http://packetstormsecurity.com/files/view/40638/MallocMaleficarum.txt
Even if you can't overwrite a return address, how do you feel about an attacker modifying the rest of your data? This shouldn't thrill you.
To answer your question generally: it is a very bad idea to let the user copy data anywhere without checking its size. You should absolutely never do that, especially on purpose.
If the user means no harm, they may crash your program, either by overwriting useful data, or by causing a page fault. If your user is malicious, you're potentially letting them hijack your system. Both are highly undesirable.
Endianness does not matter to buffer overflows. Big endian machines are just as vulnerable as little-endian machines. The only difference will be the byte order of the malicious data.
You may be thinking instead of the direction the stack grows in, which is independent of endianness. In the case where it grows up, you won't be able to hijack the return address of the function that declares the buffer. However, if you pass that buffer's address to any other function, and this function overflows it instead, an attacker may change this function's return address. This would be the case, for instance, if you called memcpy or scanf or any other function to modify your buffer (assuming that the compiler didn't inline them).
The stack usually grows downwards. In this case, an attacker can use an overflow to hijack the return address of the function that declares it.
In other words, neither the stack configuration nor endianness offer meaningful protection against stack buffer overflows.
As for the heap:
If you, on the other hand, put that array on the heap, which grows in the opposite direction from the stack, and you overflowed it, would it just write random garbage into empty memory space?
The answer, as almost always, is: it depends, but probably not. The 32-bit implementation of malloc in glibc keeps (or at least used to keep) bookkeeping structures at the end of the buffer. By overflowing onto those bookkeeping structures with the correct incantations, you could cause free, when the allocation was freed, to write four arbitrary bytes at an arbitrary location. That is a lot of power, and this kind of exploit comes up regularly in capture-the-flag competitions.

Why does my dynamically allocated array get initialized to 0?

I have some code that creates a dynamically allocated array with
int *Array = new int[size];
From what I understand, Array should be a pointer to the first item of Array in memory. When using gdb, I can call x Array to examine the value at the first memory location, x Array+1 to examine the second, etc. I expect to have junk values left over from whatever application was using those spots in memory prior to mine. However, using x Array returns 0x00000000 for all those spots. What am I doing wrong? Is my code initializing all of the values of the Array to zero?
EDIT: For the record, I ask because my program is an attempt to implement this: http://eli.thegreenplace.net/2008/08/23/initializing-an-array-in-constant-time/. I want to make sure that my algorithm isn't incrementing through the array to initialize every element to 0.
In most modern OSes, the OS gives zeroed pages to applications, as opposed to letting information seep between unrelated processes. That's important for security reasons, for example. Back in the old DOS days, things were a bit more casual. Today, with memory protected OSes, the OS generally gives you zeros to start with.
So, if this new happens early in your program, you're likely to get zeros. You'd be crazy to rely on that though; it's undefined behavior if you do.
If you keep allocating, filling, and freeing memory, eventually new will return memory that isn't zeroed. Rather, it'll contain remnants of your process' own earlier scribblings.
And there's no guarantee that any particular call to new, even at the beginning of your program, will return memory filled with zeros. You're just likely to see that for calls to new early in your program. Don't let that mislead you.
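As a hedged illustration (none of this is guaranteed by the standard, and reading values you never wrote is undefined behaviour), this sketch shows why an early allocation often looks zeroed while a recycled block may not:
#include <iostream>

int main() {
    int *a = new int[100];
    std::cout << "first allocation:  a[0] = " << a[0] << '\n';   // often 0 (fresh pages), not guaranteed

    for (int i = 0; i < 100; ++i) {
        a[i] = 42;               // the process scribbles on its own memory
    }
    delete[] a;

    int *b = new int[100];       // may well reuse the block a occupied
    std::cout << "second allocation: b[0] = " << b[0] << '\n';   // may print 42, may print anything
    delete[] b;
    return 0;
}
If you actually want zero-filled storage, value-initialize it explicitly with new int[size]() rather than relying on any of this.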
I expect to have junk values left over from whatever application was using those spots
It's certainly possible but by no means guaranteed. Particularly in debug builds, you're just as likely to have the runtime zero out that memory (or fill it with some recognisable bit pattern) instead, to help you debug things if you use the memory incorrectly.
And, really, "those spots" is a rather loose term, given virtual addressing.
The important thing is that, no, your code is not setting all those values to zero.

Is it possible to force Linux to invalidate virtual memory after free

On Windows, I have noticed that trying to dereference a pointer to recently freed memory results in a crash, trapped by Visual Studio stating that the memory is invalid. This is as expected. However, executing the same application and code path leading to dereferencing a pointer to recently freed memory does not immediately cause a crash in Linux. This suggests to me that the Linux kernel (or GNU C++ run-time) does not invalidate freed memory very quickly, even on debug builds. The application takes much longer to crash. Is this the case? If so, can I force the memory to be unmapped more quickly? If not then what's going on?
Have you tried http://valgrind.org/? Its purpose is to help track down problems such as the one you've described.
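For example, a minimal use-after-free like the sketch below is exactly the kind of thing Valgrind's default Memcheck tool reports as an invalid read when you run the program under valgrind ./a.out:
#include <iostream>

int main() {
    int *p = new int(42);
    delete p;                    // p is now dangling
    std::cout << *p << '\n';     // use after free: may "work", may crash;
                                 // under Valgrind this shows up as an invalid read
    return 0;
}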
Most implementations of new/delete do not return memory immediately to the system, or at least don't return smaller blocks. I'm rather surprised that your code crashed under Windows simply by dereferencing a pointer to the memory; are you sure you didn't do more than that (e.g. use the value you read through the pointer)?
How big was the block? Many implementations use different strategies depending on the size of the block, and will free very large blocks immediately. (IIRC, Linux will mmap very large blocks immediately, and unmap them immediately when you free them. Of course, if you reallocate memory between the free and dereferencing the pointer, it's possible that the address is in the newly allocated space, and it won't crash.)
In the end: the granularity of mapping is the page, and you cannot expect the allocator to dedicate a complete page to each allocation just so that it can invalidate the memory immediately on deallocation. (A page is probably at least 4 KB, and you don't want to lose 4 KB of address space each time you allocate 16 bytes.) Short of tracking all memory accesses (like Valgrind or Purify), and running a lot slower, the only alternative is to use garbage collection, to ensure that the memory isn't reallocated as long as there is a pointer to it, and to overwrite it completely on deallocation (i.e. in delete or free) with values that are likely to cause problems if used (0xDEADBEEF, or something like that). And even then, you're not really guaranteed that you'll crash: 0xDEADBEEF could be a valid value for what you think you're reading. (But this does allow, e.g., setting a flag in the constructor, resetting it in the destructor, and testing it in each function, for code that has to be actively defensive.)
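A minimal sketch of that constructor/destructor flag idea (the class name and sentinel values are invented, and it only catches some misuse, mostly in builds where assert is active):
#include <cassert>
#include <cstdint>

// Defensive-programming sketch: stamp a magic value on construction, clobber
// it on destruction, and check it in every member function. Stale or trampled
// objects then often fail the assert instead of silently misbehaving.
class Resource {
public:
    Resource() : magic_(kAlive) {}
    ~Resource() { magic_ = kDead; }

    void doWork() {
        assert(magic_ == kAlive && "use of destroyed or corrupted Resource");
        // ... actual work would go here ...
    }

private:
    static const std::uint32_t kAlive = 0xFEEDC0DE;   // arbitrary sentinels
    static const std::uint32_t kDead  = 0xDEADBEEF;
    std::uint32_t magic_;
};

int main() {
    Resource *r = new Resource;
    r->doWork();      // fine
    delete r;
    // r->doWork();   // use after delete: undefined behaviour, but the assert
                      // would often fire because magic_ was overwritten
    return 0;
}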

Pointers to statically allocated objects

I'm trying to understand how pointers to statically allocated objects work and where they can go wrong.
I wrote this code:
int* pinf = NULL;
for (int i = 0; i < 1; i++) {
    int inf = 4;
    pinf = &inf;
}
cout << "inf" << (*pinf) << endl;
I was surprised that it worked because I thought that inf would disappear when the program left the block and the pointer would point to something that no longer exists. I expected a segmentation fault when trying to access *pinf. At what stage in the program would inf die?
Your understanding is correct. inf disappears when you leave the scope of the loop, and so accessing *pinf yields undefined behavior. Undefined behavior means the compiler and/or program can do anything, which may be to crash, or in this case may be to simply chug along.
This is because inf is on the stack. Even when it is out of scope pinf still points to a useable memory location on the stack. As far as the runtime is concerned the stack address is fine, and the compiler doesn't bother to insert code to verify that you're not accessing locations beyond the end of the stack. That would be prohibitively expensive in a language designed for speed.
For this reason you must be very careful to avoid undefined behavior. C and C++ are not nice the way Java or C# are where illegal operations pretty much always generate an immediate exception and crash your program. You the programmer have to be vigilant because the compiler will miss all kinds of elementary mistakes you make.
You are using a so-called dangling pointer. According to the C++ standard, this results in undefined behavior.
It probably will never die because pinf will point to something on the stack.
Stacks don't often shrink.
Modify it and you'll pretty much be guaranteed an overwrite though.
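Purely as an illustration (every dereference of pinf below is undefined behaviour, and an optimizing compiler may do something entirely different), here is a sketch of how that stack slot can get silently reused:
#include <iostream>

// Illustration only: dereferencing pinf after inf's scope ends is undefined
// behaviour. On many compilers at low optimization levels, the stale stack
// slot is later reused by another call's locals, so the value appears to
// change; nothing about this is guaranteed.
void clobber() {
    volatile int noise[8] = {99, 99, 99, 99, 99, 99, 99, 99};
    (void)noise;   // just occupy (and write to) the same stack region
}

int main() {
    int *pinf = nullptr;
    for (int i = 0; i < 1; ++i) {
        int inf = 4;
        pinf = &inf;
    }
    std::cout << "before: " << *pinf << '\n';   // might still print 4
    clobber();                                  // reuses that stack region
    std::cout << "after:  " << *pinf << '\n';   // might now print 99, or anything
    return 0;
}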
If you are asking about this:
int main() {
    int* pinf = NULL;
    for (int i = 0; i < 1; i++) {
        int inf = 4;
        pinf = &inf;
    }
    cout << "inf" << (*pinf) << endl;
}
Then what you have is undefined behaviour. The automatically allocated (not static) object inf has gone out of scope and has notionally been destroyed when you access it via the pointer. In this case, anything might happen, including it appearing to "work".
You won't necessarily get a SIGSEGV (segmentation fault). inf's memory is probably allocated on the stack, and the stack memory region is probably still allocated to your process at that point, so that's probably why you are not getting a seg fault.
The behaviour is undefined, but in practice, "destructing" an int is a noop, so most compilers will leave the number alone on the stack until something else comes along to reuse that particular slot.
Some compilers might set the int to 0xDEADBEEF (or some such garbage) when it goes out of scope in debug mode, but that won't make the cout << ... fail; it will simply print the nonsensical value.
The memory may or may not still contain a 4 when it gets to your cout line. It might contain a 4 strictly by accident. :)
First things first: your operating system can only detect memory accesses gone astray at page granularity, so only when you're off by 4k, 8k, 16k or more. (Check /proc/self/maps on a Linux system some day to see the memory layout of a process; any addresses in the listed ranges are allowed, and any outside them aren't. Every modern OS on a protected-memory CPU supports a similar mechanism, so it'll be instructive even if you're just not that interested in Linux; I just know it is easy on Linux.) So, the OS can't help you when your data is so small.
Also, your int inf = 4; might very well be stashed in the .rodata, .data or .text segments of your program. Static variables may be stuffed into any of these sections (I have no idea how the compiler/linker decides; I consider it magic) and will therefore be valid throughout the entire duration of the program. Check size /bin/sh next time you are on a Unix system for an idea of how much data gets put into which sections. (And check out readelf(1) for way too much information, or objdump(1) if you're on older systems.)
If you change inf = 4 to inf = i, then the storage will be allocated on the stack, and you stand a much better chance of having it get overwritten quickly.
A protection fault occurs when the memory page you point to is not valid anymore for the process.
Luckily most OS's don't create a separate page for each integer's worth of stack space.

Array index out of bound behavior

Why does C/C++ differentiate between these cases of out-of-bounds array indexing?
#include <stdio.h>

int main()
{
    int a[10];
    a[3] = 4;
    a[11] = 3;     // does not give segmentation fault
    a[25] = 4;     // does not give segmentation fault
    a[20000] = 3;  // gives segmentation fault
    return 0;
}
I understand that it's trying to access memory allocated to the process or thread in the case of a[11] or a[25], and that it's going out of stack bounds in the case of a[20000].
Why doesn't the compiler or linker give an error? Aren't they aware of the array size? If not, then how does sizeof(a) work correctly?
The problem is that C/C++ doesn't actually do any boundary checking with regards to arrays. It depends on the OS to ensure that you are accessing valid memory.
In this particular case, you are declaring a stack-based array. Depending upon the particular implementation, accessing outside the bounds of the array will simply access another part of the already allocated stack space (most OSes and threads reserve a certain portion of memory for the stack). As long as you just happen to be playing around in the pre-allocated stack space, everything will not crash (note I did not say work).
What's happening on the last line is that you have now accessed beyond the part of memory that is allocated for the stack. As a result you are indexing into a part of memory that is not allocated to your process or is allocated in a read only fashion. The OS sees this and sends a seg fault to the process.
This is one of the reasons that C/C++ is so dangerous when it comes to boundary checking.
The segfault is not an intended action of your C program that would tell you that an index is out of bounds. Rather, it is an unintended consequence of undefined behavior.
In C and C++, if you declare an array like
type name[size];
You are only allowed to access elements with indexes from 0 up to size-1. Anything outside of that range causes undefined behavior. If the index is near the valid range, most probably you read your own program's memory. If the index is far out of range, most probably your program will be killed by the operating system. But you can't know; anything can happen.
Why does C allow that? Well, the basic gist of C and C++ is to not provide features if they cost performance. C and C++ have been used for ages for highly performance-critical systems. C has been used as an implementation language for kernels and programs where out-of-bounds array access can be useful to get fast access to objects that lie adjacent in memory. Having the compiler forbid this would be for naught.
Why doesn't it warn about that? Well, you can turn the warning levels up and hope for the compiler's mercy. This is called quality of implementation (QoI). If some compiler uses open behavior (like undefined behavior) to do something useful, it has good quality of implementation in that regard.
[js#HOST2 cpp]$ gcc -Wall -O2 main.c
main.c: In function 'main':
main.c:3: warning: array subscript is above array bounds
[js#HOST2 cpp]$
If it instead formatted your hard disk upon seeing the array accessed out of bounds (which would be legal for it), the quality of implementation would be rather bad. I enjoyed reading about that stuff in the ANSI C Rationale document.
You generally only get a segmentation fault if you try to access memory your process doesn't own.
What you're seeing in the case of a[11] (and a[10], by the way) is memory that your process does own but that doesn't belong to the a[] array. a[20000] is so far from a[] that it's probably outside your memory altogether.
Changing a[11] is far more insidious as it silently affects a different variable (or the stack frame which may cause a different segmentation fault when your function returns).
C isn't doing this. The OS's virtual memory subsystem is.
In the case where you are only slightly out of bounds you are addressing memory that is allocated for your program (on the call stack in this case). In the case where you are far out of bounds you are addressing memory not given over to your program, and the OS is throwing a segmentation fault.
On some systems there is also an OS-enforced concept of "writeable" memory, and you might be trying to write to memory that you own but that is marked unwriteable.
Just to add to what other people are saying, you cannot rely on the program simply crashing in these cases; there is no guarantee of what will happen if you attempt to access a memory location beyond the "bounds of the array." It's just the same as if you did something like:
int *p;
p = (int *) 135;  // some arbitrary address
*p = 14;
That is just random; this might work. It might not. Don't do it. Code to prevent these sorts of problems.
As litb mentioned, some compilers can detect some out-of-bounds array accesses at compile time. But bounds checking at compile time won't catch everything:
int a[10];
int i = some_complicated_function();
printf("%d\n", a[i]);
To detect this, runtime checks would have to be used, and they're avoided in C because of their performance impact. Even though the compiler knows a's array size at compile time (that's what sizeof(a) reports), it can't protect against this without inserting a runtime check.
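Here's a hedged sketch of what such a hand-written runtime check looks like; note that sizeof(a)/sizeof(a[0]) only yields the element count while a is still a real array and hasn't decayed to a pointer:
#include <cstdio>
#include <cstddef>

int main() {
    int a[10] = {0};
    // sizeof(a) is the size of the whole array in bytes, so dividing by the
    // size of one element gives the element count. This only works while `a`
    // is a real array and hasn't decayed to a pointer.
    const std::size_t n = sizeof(a) / sizeof(a[0]);   // 10

    std::size_t i = 25;   // pretend this came from some_complicated_function()
    if (i < n) {
        a[i] = 3;         // in range: safe to write
    } else {
        std::printf("index %zu is out of range (size %zu)\n", i, n);
    }
    return 0;
}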
As I understand the question and comments, you understand why bad things can happen when you access memory out of bounds, but you're wondering why your particular compiler didn't warn you.
Compilers are allowed to warn you, and many do at the highest warning levels. However the standard is written to allow people to run compilers for all sorts of devices, and compilers with all sorts of features so the standard requires the least it can while guaranteeing people can do useful work.
There are a few times the standard requires that a certain coding style will generate a diagnostic. There are several other times where the standard does not require a diagnostic. Even when a diagnostic is required I'm not aware of any place where the standard says what the exact wording should be.
But you're not completely out in the cold here. If your compiler doesn't warn you, Lint may. Additionally, there are a number of tools to detect such problems (at run time) for arrays on the heap, one of the more famous being Electric Fence (or DUMA). But even Electric Fence doesn't guarantee it will catch all overrun errors.
That's not a C issue; it's an operating system issue. Your program has been granted a certain memory space, and anything you do inside of that is fine. The segmentation fault only happens when you access memory outside of your process space.
Not all operating systems have separate address spaces for each process, in which case you can corrupt the state of another process or of the operating system with no warning.
The C philosophy is to always trust the programmer. Also, not checking bounds allows the program to run faster.
As JaredPar said, C/C++ doesn't always perform range checking. If your program accesses a memory location outside your allocated array, your program may crash, or it may not because it is accessing some other variable on the stack.
To answer your question about sizeof operator in C:
You can reliably use sizeof(array)/sizeof(array[0]) to determine array size, but using it doesn't mean the compiler will perform any range checking.
My research showed that C/C++ developers believe that you shouldn't pay for something you don't use, and they trust the programmers to know what they are doing. (see accepted answer to this: Accessing an array out of bounds gives no error, why?)
If you can use C++ instead of C, maybe use vector? You can use vector[] when you need the performance (but no range checking) or, preferably, use vector.at() (which has range checking at the cost of some performance). Note that indexing doesn't grow a vector: to append safely, use push_back(), which increases capacity automatically if necessary.
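A small sketch of the difference; at() is guaranteed to throw std::out_of_range for a bad index, while an out-of-range operator[] is undefined behaviour:
#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
    std::vector<int> v(10, 0);

    v[3] = 4;             // fine, and as fast as a plain array access
    // v[25] = 4;         // undefined behaviour: operator[] does no bounds check

    try {
        v.at(25) = 4;     // at() checks the index and throws instead
    } catch (const std::out_of_range &e) {
        std::cout << "caught: " << e.what() << '\n';
    }

    v.push_back(7);       // appending grows the vector; indexing never does
    std::cout << "size is now " << v.size() << '\n';
    return 0;
}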
More information on vector: http://www.cplusplus.com/reference/vector/vector/