What does "the source of entropy" of a stack canary mean? - C++

I'm reading a document on safe coding that describes the terminator canary. As I understand it, a terminator canary is a piece of padding placed in a function's stack frame to prevent the return address on the stack from being maliciously overwritten. The canary acts as a safety band that makes it hard for an attacker to compute the size or to find where the return address is. That is my understanding, which may not be accurate.
After the document there is a test, in which one true-or-false question reads:
"The source of entropy for a terminator canary can be attacked"
My question is:
I have no idea what "the source of entropy for something" means, or what the question is asking.

The canary is typically a random value placed before the return address on the stack. A buffer overrun that reaches and overwrites the return address would also overwrite the canary. The compiler inserts a check, right before returning from the function, that the canary is unmodified, that is, that it still contains the original random value. If it has been modified, the check usually terminates the process (rather than jumping to a compromised return address and giving control to the attacker).
In information theory, "entropy" is, roughly, another term for "randomness". The source of entropy is basically the random number generator, in this context the one that is used to set up the canary. If the attacker can predict the random values produced by that generator, then they can arrange the buffer overrun to keep the canary intact, thus bypassing the safety check.
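The check and the attack described above can be sketched as a simulation. This is only an illustrative model, not what any particular compiler emits: the struct frame layout, the name copy_and_check and the clamping to the frame size are assumptions made here so that the demonstration stays within defined behaviour.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a protected frame: the buffer, then the canary. */
struct frame {
    char buf[16];
    uint32_t canary;
};

/*
 * Copies len bytes into the frame without checking the size of buf
 * (the simulated bug). The copy is clamped to the size of the whole
 * frame so that the simulation itself never writes out of bounds.
 * Returns 1 if the canary still holds the secret, 0 if it was clobbered.
 */
int copy_and_check(const char *src, size_t len, uint32_t secret)
{
    struct frame f;
    memset(&f, 0, sizeof f);
    f.canary = secret;                 /* "prologue": store the canary  */

    if (len > sizeof f)
        len = sizeof f;
    memcpy((char *)&f, src, len);      /* may run past buf into canary  */

    return f.canary == secret;         /* "epilogue": verify the canary */
}
```

A short input leaves the canary alone and the check passes. An overlong input of arbitrary bytes clobbers the canary and the check fails. An overlong input whose bytes at the canary's offset happen to equal the secret passes the check, which is why an attacker who can predict the source of entropy can bypass the protection.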

Related

What does HeapValidate in windows do?

I have been reading up on HeapValidate in some existing code and trying to figure out what it does. The documentation says that it checks whether the heap control structures are in a consistent state. What does that mean?
The heap is a data structure like any other, and in its metadata-state-variables there are certain conditions that should always be true. As a made-up example, the number of children in a heap-tree-node should always be a non-negative number; so if HeapValidate() reads a child-count variable and sees that it is negative, it knows something has gone badly wrong and can flag that block as broken.
You might wonder, assuming Microsoft's heap code does not have any bugs, how the heap's metadata might get into an invalid or "impossible" state like that in the first place. Since the heap's metadata structures live in the same address space that the user code has access to, it's usually the result of buggy user code writing some other data through an invalid pointer that happens to point at the memory location of a heap metadata field, silently corrupting the metadata.
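To make the idea of "conditions that should always be true" concrete, here is a toy consistency check. It is a made-up illustration and says nothing about how the Windows heap is actually laid out: struct block_header, BLOCK_MAGIC and toy_heap_validate are hypothetical names invented for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_MAGIC 0xB10CB10Cu   /* hypothetical marker stored in every header */

/* Hypothetical per-block metadata of a toy heap. */
struct block_header {
    uint32_t magic;               /* must always equal BLOCK_MAGIC       */
    size_t size;                  /* payload size in bytes, never zero   */
    struct block_header *next;    /* next block, or NULL at the end      */
};

/*
 * Walks the block list and checks invariants that can never be violated
 * by a correctly working heap. Returns 1 if all hold, 0 otherwise.
 */
int toy_heap_validate(const struct block_header *head, size_t arena_size)
{
    size_t total = 0;
    for (const struct block_header *b = head; b != NULL; b = b->next) {
        if (b->magic != BLOCK_MAGIC)
            return 0;             /* header was overwritten              */
        if (b->size == 0 || b->size > arena_size)
            return 0;             /* impossible block size               */
        total += b->size;
        if (total > arena_size)
            return 0;             /* blocks claim more than the arena    */
    }
    return 1;
}
```

If stray user code scribbles over a header, one of these checks is likely to fail, and that is the kind of inconsistency a validation routine can report.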

What memory will be overwritten by the overflow in this code?

I have the following code:
char *func(char * a)
{
    char b[1000];
    strcpy(b,a);
    return b;
}
(I know that the code is bad, because I return the address of an array that will be destroyed when I exit the function.) My question is: what will be deleted or overwritten if I pass in "a" an array of 2000 chars, while "b" is only a 1000-char array? I read this question somewhere, and they said that from this code I can tell what will be overwritten.
It seems you are not familiar with the idea of the stack. When program control enters a function, the function is given a stack pointer, and all of its local variables are allocated on this stack. When the program returns from the function, the stack pointer is restored to its original value. Therefore b is on the stack and 1000 bytes are allocated for it. When the program returns from func, nothing is actually deleted or overwritten until some other function uses that area of the stack. If you access 'b' just after you come out of the function it will often appear to work, although doing so is undefined behaviour and nothing guarantees it. But suppose that after calling 'func' you call another function 'func1' which has some local variables: updating those variables will overwrite the memory that b's address points to.
"What will be deleted or overwritten if I pass in "a" an array of 2000 chars, while "b" is only a 1000-char array?"
The behaviour is undefined. From the standard's perspective, nothing is guaranteed. Some memory could be overwritten, or it might not be. In practice, it's likely that some memory would be overwritten.
"Correct me if I'm wrong, but this overflow will rewrite the value of the pointer, right?"
It could. It might not. The behaviour is undefined.
What will happen depends on the compiler, the version of the compiler, the cpu architecture, the compilation options, how the rest of the program is defined and possibly other factors.
In a typical implementation, the return value (the address of b) fits in a register and is computed from the stack pointer rather than loaded from memory, so in practice it is unlikely to be stored on the stack, in which case it would not be affected by the overflow. There are much more dangerous potential side effects, such as the function returning to a completely different place than where it was called from.
If the C-string in a that you pass is longer than the space you allocated for b (1000 bytes), strcpy will happily write past the end of b and on through whatever sits next to b on the stack. A C-string carries no stored length: strcpy(b,a) keeps copying bytes into b until it finds a \0 in a. In func's stack frame, the compiler will typically reserve 1000 bytes for b and also keep the return address of whatever called func. If the copy runs past the end of b, you can write over the return address, and when func returns you'll jump to some random address with horrible results. Each compiler is free to put the return address wherever it likes, so it may not sit directly next to b. But even in that scenario you'll be writing over stuff that you should not be writing over. If you're lucky you'll get an access violation.
To protect against this you can use strncpy(b,a,sizeof(b)-1) and put a \0 at the end of b. Best to check strlen(a) and handle the error sanely if strlen(a) > sizeof(b)-1.
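A minimal sketch of the length check suggested above. The helper name copy_checked and its return convention are made up for this example. It rejects input that does not fit instead of silently truncating it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copies the C-string src into dst only if it fits, including the
 * terminating \0. Returns 0 on success, or -1 if src is too long,
 * in which case dst is left untouched.
 */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t n = strlen(src);
    if (dst_size == 0 || n > dst_size - 1)
        return -1;                /* would overflow dst: refuse */
    memcpy(dst, src, n + 1);      /* n characters plus the \0   */
    return 0;
}
```

Inside func this would be called as copy_checked(b, sizeof(b), a), with the caller deciding what to do when it returns -1.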
This is exactly the "buffer overrun" technique that hackers used to break Windows security restrictions. You call a Windows function that expects a C-string but does not check the length of the C-string that is passed in. The attacker keeps guessing the length of the input buffer until a guess is correct and the string overwrites the return address of the function so that it points into the string that was passed in. The remainder of the input string contains machine code instructions that then run under the security permissions of the Windows function. It can then do whatever it wants.
Microsoft closed this security loophole long ago, but it remains as a good lesson to check the length of C-strings that you accept as input parameters.

non-NULL reserved pointer value

How can I create a reserved pointer value?
The context is this: I have been thinking of how to implement a data structure for a dynamic scripting language (I am not planning on implementing this - just wondering how it would be done).
Strings may contain arbitrary bytes, including NUL. Thus, it is necessary to store the length separately. This requires a pointer (to point to the array) and a number. The first trick is that if the pointer is NULL, it cannot possibly be a valid string, so the number can be used for an actual integer.
If a second reserved pointer value could be created, this could be used to imply that the other field is now being used as a floating-point value. Can this be done?
One thought is to mmap() an address with no permissions, which could also be done to replace the usage of the NULL pointer.
On any modern system, you can just use the pointer values 1, 2, ... 4095 for such purposes. Another frequent choice is (uintptr_t)-1, which is technically inferior, but used more frequently than 1 nevertheless.
Why are these values "safe"?
Modern systems safeguard against NULL pointer accesses by making it impossible to map anything at virtual address zero. Almost any dereference of a NULL pointer will hit this nonexistent region, and the hardware will tell the OS that something bad happened, which triggers the OS to segfault the process.
Since virtual memory pages are page aligned (at least 4k on current hardware), and nothing is mapped to address zero, nothing can be mapped to the entire range 0, ..., 4095, protecting all these addresses in the same way, and you can use them as special purpose values.
How much virtual address space is reserved for this purpose is a system parameter. On Linux it is controlled by /proc/sys/vm/mmap_min_addr, and the root user can change it to zero, which would disable this protection (not a very smart idea). The default on Ubuntu is 64k (i.e. 16 pages).
This is also the reason why (uintptr_t)-1 is less safe than 1: even though any load of more than one byte will hit the zero page, the address (uintptr_t)-1 itself is not necessarily protected in this way. Consequently, doing string operations on (char*)-1 does not necessarily segfault.
Edit:
My original explanation with the special mapping seems to have been a bit stale; probably this was the way things were handled on the old Mac/PPC platform. Even though the effect is pretty much the same, I changed the details of the answer to reflect modern Linux. Anyway, the important point is not how the null page protection is achieved. The important point is that any sane, modern system will have some null page protection that encompasses at least the mentioned address range. Some more details can be found in this SO answer: https://stackoverflow.com/a/12645890/2445184
In standard C (and standard C++), the approach that's 100% valid and works is simple: declare a variable, use its address as a magic value.
char *ptr;
char magic;
if (ptr == &magic) { ... }
This guarantees that magic will never have any overlap with another object.
Magic pointer values such as (char *) 1 have their advantages too, but it's easy to get them wrong. Even if you disregard the theoretical implementations where (char *) 1 may be a valid object, if you use (int *) 1 as a magic pointer value and the optimiser assumes int * values are suitably aligned, it may remove checks that are no-ops only in 100% valid code, not in your code. So I'd recommend the standard approach, and optionally switching to magic pointer values temporarily only if you find they help you debug.
mmap()ing an address can fail if the address is already assigned. It would probably be better to use the address of some static variable or function, or to obtain a unique address via malloc(1).
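Applied to the tagged value from the question, the address-of-a-variable approach might look like the sketch below. All names here (struct value, float_tag, make_int and so on) are hypothetical, and the layout is just one possible choice: NULL marks an integer as in the question, and the address of a private static object marks a floating-point value.

```c
#include <stddef.h>

/* Only the address of this object matters; it is never read or written. */
static char float_tag;

enum kind { KIND_STRING, KIND_INT, KIND_FLOAT };

struct value {
    const char *ptr;                          /* string bytes, NULL, or &float_tag */
    union { size_t len; long i; double f; } u;
};

struct value make_string(const char *s, size_t len)
{
    struct value v;
    v.ptr = s;            /* s must be a real, non-NULL string pointer */
    v.u.len = len;
    return v;
}

struct value make_int(long i)
{
    struct value v;
    v.ptr = NULL;         /* first reserved value: integer */
    v.u.i = i;
    return v;
}

struct value make_float(double f)
{
    struct value v;
    v.ptr = &float_tag;   /* second reserved value: float  */
    v.u.f = f;
    return v;
}

enum kind value_kind(const struct value *v)
{
    if (v->ptr == NULL)
        return KIND_INT;
    if (v->ptr == &float_tag)
        return KIND_FLOAT;
    return KIND_STRING;
}
```

Because float_tag is a distinct object, no string buffer can ever share its address, so the comparison in value_kind stays reliable without assuming anything about the platform's memory map.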

What is security cookie in C++?

I have read from Google that it is used for controlling buffer overruns at application level and it is called by CRT.
It also says that
" Essentially, on entry to an overrun-protected function, the cookie is put on the stack, and on exit, the value on the stack is compared against the global cookie. Any difference between them indicates that a buffer overrun has occurred and results in immediate termination of the program."
But I could not really understand how it works. Please help.
The "cookie" is basically nothing more than an arbitrary value.
So, the basic idea is that you write the chosen value on the stack before calling a function. Although it's probably not a very good value, let's arbitrarily choose 0x12345678 as the value.
Then it calls the function.
When the function returns, it goes back to the correct spot on the stack, and compares that value to 0x12345678. If the value has changed, this indicates that the function that was called wrote outside the area of the stack where it was allowed to write, so it (and that process in general) are deemed untrustworthy, and shut down.
In this case, instead of choosing 0x12345678, the system chooses a different value on a regular basis, such as every time the system is started. This means it's less likely to hit the correct value by accident -- it might happen to do so once, but if it's writing a specific value there, when the correct/chosen value changes, it'll end up writing the wrong value, and the problem will be detected.
It's probably also worth noting that this basic idea isn't particularly new. Just for example, back in the MS-DOS days, both Borland's and Microsoft's compilers would write some known value at the very bottom of the stack before calling main in your program. After main returned, they'd re-check that value. It would then print out an error message (right as the program exited) if the value didn't match what was expected.
It's exactly what the explanation says, but you can replace "cookie" with "some value". When the function is called, it puts some value on the stack. When the function returns, it checks it again to see if it changed.
The normal behavior of the function is to not touch the memory location. If the value there changed, it means that function code somehow overwrote it, and this means there was a buffer overflow.

Interprocess Memory Editing - Finding changed addresses

I'm currently making one of those game trainers as a small project. I've already run into a problem: when you "go into a different level", the addresses of things such as fuel, cash and bullets change. This would also happen if, say, you were to restart the application.
How can I re-locate these addresses?
I feel like it's a fairly basic question, but it's one of those "it is or is not possible" questions to me. Should I just stop looking and forget the concept entirely? "Too hard?"
It's a bit hard to describe exactly how to do this, since it depends heavily on the program you're studying and on whether the author went out of their way to make your life difficult. Note that I've only done this once, but it worked reasonably well even though I only knew a little assembly.
What is probably happening is that the values are allocated on the heap using a call to malloc/new, and every time you change level they are cleaned up and re-allocated somewhere else. So the idea is to look at the assembly code of the program to find where the pointer returned by malloc is stored, and figure out a way to reliably read the content of the pointer and find the value you're looking for.
First thing you'll want is a debugger like OllyDbg and a basic knowledge of assembly. After that, start by setting a read and write breakpoint on the variable you want to examine. Since you said that you can't tell exactly where the variable is, you'll have to pause the process while it's running and search the program's memory for the value. Hopefully you'll end up with only a few results to sift through but be suspicious of anything that is on the stack since it might just be a copy for a function call or for local use.
Once the breakpoint is set just run the program until a break occurs. Now all you have to do is look at the code and examine how the variable is being accessed. If it's being passed as a parameter, go examine the call site of the function. If it's being accessed through a pointer, make a note of it and start examining the pointer. If it's being accessed as an offset of a pointer, that means it's part of a data structure so make a note of it and start examining the other variable. And so on.
Stay focused on your variable and just keep examining the code until you eventually find the root, which can be one of two things:
1. A global variable that has a static address. This is the easiest scenario, since you have a static address hardcoded straight into the code that you can use to reliably walk through the data structures.
2. A stack-allocated variable. This is trickier and I'm not entirely sure how to deal with this scenario reliably. It's possible that its address will have the same offset from the beginning of the stack most of the time, but it might not. You could also walk the stack to find the corresponding function and its parameters, but this is a bit tricky to get right.
Once you have an address all that's left to do is use ReadProcessMemory to locate your variable using the information you found. For example, if the address you have represents a pointer to a data structure where at offset 0x40 your fuel value is stored, then you'll have to read the value at the address, add 0x40 to it and do another read on the result.
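The "read the pointer, add the offset, read again" step can be sketched as follows. To keep the example portable, the actual memory access goes through a hypothetical callback type read_fn. In a real trainer that callback would wrap ReadProcessMemory, and the pointer width would have to match the target process. The read_local stand-in below simply reads the current process, for demonstration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical reader callback: copies n bytes found at address addr
 * in the target into out and returns 1 on success, 0 on failure.
 */
typedef int (*read_fn)(uintptr_t addr, void *out, size_t n);

/*
 * Reads the pointer stored at base_addr, adds offset to it, then reads
 * a 32-bit value from the resulting address. Returns 1 on success.
 */
int read_via_pointer(read_fn rd, uintptr_t base_addr, size_t offset,
                     int32_t *value)
{
    uintptr_t target = 0;
    if (!rd(base_addr, &target, sizeof target))
        return 0;                       /* could not read the pointer    */
    if (target == 0)
        return 0;                       /* structure not allocated (yet) */
    return rd(target + offset, value, sizeof *value);
}

/* Stand-in reader that reads from the current process. */
int read_local(uintptr_t addr, void *out, size_t n)
{
    memcpy(out, (const void *)addr, n);
    return 1;
}
```

Since the pointer is re-read on every call, the lookup keeps working after the game frees and re-allocates the structure on a level change, as long as the static address holding the pointer stays the same.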
Note that the address is only valid as long as the executable doesn't change in any way. If it's recompiled or patched then you have to start over. I believe you'll also have to be careful about Windows' ASLR which might change the address around every time you start the program.
Comment box was too small to fit this so I'll put it here.
If it's esp plus a constant then I believe that this is a parameter and not a local variable (do confirm by checking the layout of the calling convention). If that's the case, then you should step the program until it returns to its caller, figure out how the parameter is being set (look for push instructions before the call instruction) and continue exploring from there. When I did this I had to unwind the stack once or twice before I found the global pointer to the data structure.
Also the esi register is not related to the stack (I had to look it up) so I'd check how it's being set. It could be that it contains the address of the data structure and the constant is the offset to the variable. If you figure out how the register is set you'll be that much closer to the pointer.