During a lecture I learned about the importance of 8-byte alignment for x86 architectures. This was visualized using the following example:
// char 1 byte
int main()
{
    char a1;
    char a2;
    char b1[5];
    char b2[8];
    char b3[3];
    return 1;
}
...which during debugging would show the 8-byte alignment of the array variables:
0xffbfe6e8, 0xffbfe6e0, 0xffbfe6d8
(gdb) print &a1
$1 = 0xffbfe6f7 ""
(gdb) print &a2
$2 = 0xffbfe6f6 ""
(gdb) print &b1
$3 = (char (*)[5]) 0xffbfe6e8
(gdb) print &b2
$4 = (char (*)[8]) 0xffbfe6e0
(gdb) print &b3
$5 = (char (*)[3]) 0xffbfe6d8
I am trying to reproduce this example, but I can't get GDB to reveal the memory addresses.
g++ -g main.cpp (OK)
gdb a.out (OK)
(gdb) print &a1 ==> No symbol "a1" in current context.
Can somebody enlighten me as to what I am doing wrong?
Automatic variables like a1 will have an address only when the function containing them, main in this case, has begun executing. The first few instructions of the function, called its prologue, allocate this space by, for example, subtracting an appropriate amount from the stack pointer on systems where stacks grow from high addresses toward low addresses.
If you type b main and then run, this should run enough of main so that you can print the addresses of some or all of those variables. Note that, depending on optimizations done by the compiler, some variables might be placed in registers or might not get allocated at all, and will not have an address.
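A minimal session sketch (GDB places the breakpoint after main's prologue, so the locals already have addresses by the time it stops; the actual values printed will differ from the lecture's):
(gdb) break main
(gdb) run
(gdb) print &a1
(gdb) print &b1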
In the C++ code below, a segmentation fault occurs before the first line of main() is executed.
This happens even though there are no objects to be constructed before entering main(), and it does not happen if I remove a (large) variable definition at the second line of main().
I assume the segmentation fault occurs because of the size of the variable being defined. My question is: why does this occur before the prior line is executed?
It would seem this is not a result of instruction reordering by the optimizer; I say this based on the compilation options selected and on the debug output.
Is the size of the (array) variable being defined blowing the stack / causing the segfault?
It would seem so, since using a smaller array (e.g. 15 elements) does not result in a segmentation fault and the expected output appears on stdout.
#include <array>
#include <iostream>
#include <vector>

using namespace std;

namespace {

using indexes_t = vector<unsigned int>;
using my_uint_t = unsigned long long int;

constexpr my_uint_t ITEMS{ 52 };
constexpr my_uint_t CHOICES{ 5 };
static_assert(CHOICES <= ITEMS, "CHOICES must be <= ITEMS");

constexpr my_uint_t combinations(const my_uint_t n, my_uint_t r)
{
    if (r > n - r)
        r = n - r;
    my_uint_t rval{ 1 };
    for (my_uint_t i{ 1 }; i <= r; ++i) {
        rval *= n - r + i;
        rval /= i;
    }
    return rval;
}

using hand_map_t = array<indexes_t, combinations(ITEMS, CHOICES)>;

class dynamic_loop_functor_t {
private:
    // std::array of C(52,5) = 2,598,960 (initially) empty vector<unsigned int>
    hand_map_t hand_map;
};

}

int main()
{
    cout << "Starting main()..." << endl
         << std::flush;

    // "Starting main()..." is not printed if and only if the line below is included.
    dynamic_loop_functor_t dlf;

    // The same result occurs with either of these alternatives:
    // array<indexes_t, 2598960> hand_map;
    // indexes_t hand_map[2598960];
}
OS: CentOS Linux release 7.9.2009 (Core)
Compiler: g++ (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Compile command:
g++ -std=c++14 -Wall -Wpedantic -Og -g -o create_hand_map create_hand_map.cpp
No errors or warnings are generated at compile time.
Static analysis:
A static analysis via cppcheck produces no unexpected results.
Using --check-config as suggested in the command output below yields only: "Please note: Cppcheck does not need standard library headers to get proper results."
$ cppcheck --enable=all create_hand_map.cpp
create_hand_map.cpp:136:27: style: Unused variable: dlf [unusedVariable]
dynamic_loop_functor_t dlf;
^
nofile:0:0: information: Cppcheck cannot find all the include files (use --check-config for details) [missingIncludeSystem]
Attempted debug with GDB:
$ gdb ./create_hand_map
GNU gdb (GDB) Red Hat Enterprise Linux 8.0.1-36.el7
<snip>
This GDB was configured as "x86_64-redhat-linux-gnu".
<snip>
Reading symbols from ./create_hand_map...done.
(gdb) run
Starting program: ./create_hand_map
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400894 in std::operator<< <std::char_traits<char> > (__s=0x4009c0 "Starting main()...",
__out=...) at /opt/rh/devtoolset-7/root/usr/include/c++/7/ostream:561
561 __ostream_insert(__out, __s,
(gdb) bt
#0 0x0000000000400894 in std::operator<< <std::char_traits<char> > (
__s=0x4009c0 "Starting main()...", __out=...)
at /opt/rh/devtoolset-7/root/usr/include/c++/7/ostream:561
#1 main () at create_hand_map.cpp:133
(gdb)
This is definitely a stack overflow. sizeof(dynamic_loop_functor_t) is roughly 62 MB: the array holds C(52,5) = 2,598,960 vector<unsigned int> objects, each typically 24 bytes on a 64-bit implementation. The default stack size limit on most Linux distributions is only 8 MiB, so the crash is not surprising.
The remaining question is, why does the debugger identify the crash as coming from inside std::operator<<? The actual segfault results from the CPU exception raised by the first instruction to access an address beyond the stack limit. The debugger only gets the address of the faulting instruction, and has to use the debug information provided by the compiler to associate it with a particular line of source code.
The results of this process are not always intuitive. There is not always a clear correspondence between instructions and source lines, especially when the optimizer may reorder instructions or combine code coming from different lines. Also, there are many cases where a bug or problem with one source line can cause a fault in another section of code that is otherwise innocent. So the source line shown by the debugger should always be taken with a grain of salt.
In this case, what happened is as follows.
The compiler determines the total amount of stack space to be needed by all local variables, and allocates it by subtracting this number from the stack pointer at the very beginning of the function, in the prologue. This is more efficient than doing a separate allocation for each local variable at the point of its declaration. (Note that constructors, if any, are not called until the point in the code where the variable's declaration actually appears.)
The prologue code is typically not associated with any particular line of source code, or maybe with the line containing the function's opening {. But in any case, subtracting from the stack pointer is a pure register operation; it does not access memory and therefore cannot cause a segfault by itself. Nonetheless, the stack pointer is now pointing outside the area mapped for the stack, so the next attempt to access memory near the stack pointer will segfault.
The next few instructions of main execute the cout << "Starting main". This is conceptually a call to the overloaded operator<< from the standard library; but in GCC's libstdc++, the operator<< is a very short function that simply calls an internal helper function named __ostream_insert. Since it is so short, the compiler decides to inline operator<< into main, and so main actually contains a call to __ostream_insert. This is the instruction that faults: the x86 call instruction pushes a return address to the stack, and the stack pointer, as noted, is out of bounds.
Now the instructions that set up arguments and call __ostream_insert are marked by the debug info as corresponding to the source of operator<<, in the <ostream> header file - even though those instructions have been inlined into main. Hence your debugger shows the crash as having occurred "inside" operator<<.
Had the compiler not inlined operator<< (e.g. if you compile without optimization), then main would have contained an actual call to operator<<, and this call is what would have crashed. In that case the traceback would have pointed to the cout << "Starting main" line in main itself - misleading in a different way.
Note that you can have GCC warn you about functions that use a large amount of stack with the options -Wstack-usage=NNN or -Wframe-larger-than=NNN. These are not enabled by -Wall, but could be useful to add to your build, especially if you expect to use large local objects. Specifying either of them, with a reasonable number for NNN (say 4000000), I get a warning on your main function.
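For example, adding one of those flags to the question's own compile command (4000000 bytes is just an arbitrary threshold well below the ~62 MB frame):
g++ -std=c++14 -Wall -Wpedantic -Wstack-usage=4000000 -Og -g -o create_hand_map create_hand_map.cpp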
You must raise the stack size limit before putting the huge object on the stack.
On Linux you can achieve that by calling setrlimit() from main(); from then on you can invoke functions with huge stack objects. E.g.:
#include <cstdlib>        /* EXIT_SUCCESS */
#include <sys/resource.h> /* getrlimit, setrlimit */

struct huge_t { /* something really huge lives here */ };

int worker () {
    struct huge_t huge;
    /* do something with huge */
    return EXIT_SUCCESS;
}

int main () {
    struct rlimit rlim;
    /* fetch the current limits first, so that rlim_max stays valid */
    getrlimit (RLIMIT_STACK, &rlim);
    /* raise the soft limit enough to hold the huge object */
    rlim.rlim_cur = sizeof (huge_t) + 1048576;
    setrlimit (RLIMIT_STACK, &rlim);
    return worker ();
}
Because local objects are allocated on the stack before you have a chance to call setrlimit(), the huge object must live in worker().
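Alternatively, you can sidestep the stack limit entirely by putting the large object on the heap. A minimal sketch against the question's class (std::make_unique requires C++14, which the question's compile command already uses):

#include <memory>

int main()
{
    // only a pointer lives on the stack; the ~62 MB object lives on the heap
    auto dlf = std::make_unique<dynamic_loop_functor_t>();
}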
I have a C program compiled with gcc on Ubuntu x86. This is a function I am calling from main
void addme()
{
    long a = 5;
    char c = '3';
    long array[3];
    array[0] = 2;
    array[1] = 4;
    array[2] = 8;
}
If I break at the last line and inspect the variables in the debugger, this is what I get
(gdb) print &a
$5 = (long *) 0xbffff04c
(gdb) print &c
$6 = 0xbffff04b "3\005"
(gdb) print &array
$7 = (long (*)[3]) 0xbffff03c
(gdb) x 0xbffff03c
0xbffff03c: 0x00000002
(gdb) x 0xbffff040
0xbffff040: 0x00000004
(gdb) x 0xbffff044
0xbffff044: 0x00000008
(gdb) x 0xbffff04c
0xbffff04c: 0x00000005
Why are 0xbffff048, 0xbffff049, 0xbffff04a, and 0xbffff04b reserved for the char c when only 0xbffff04b is needed to store a char?
Also, what does the notation "3\005" mean?
On the other hand, if my function is as below, the character is not padded with three extra bytes of storage
void addme()
{
    long a = 5;
    char c = '3';
    char line[9];
    char d = '4';
}
This is what the memory allocation looks like for these variables (skipping the leading part of each address)
a - f04c
c - f04b
d - f04a
line - f041, f042, f043, f044, f045, f046, f047, f048, f049
I am also not sure why d was placed above line in memory. I assume that because it wasn't initialized, it goes to a different region of the stack than initialized variables?
This is called alignment. Objects are aligned to addresses that are multiples of specific integers (usually 4 or 8 in the case of long) for fast access.
In general, you don't need to worry too much about object placement in C++: the language specification deliberately leaves layout to the compiler so that it can choose the most efficient way to store objects.
Every object type has the property called alignment requirement, which is an integer value (of type std::size_t, always a power of 2) representing the number of bytes between successive addresses at which objects of this type can be allocated. The alignment requirement of a type can be queried with alignof or std::alignment_of. The pointer alignment function std::align can be used to obtain a suitably-aligned pointer within some buffer, and std::aligned_storage can be used to obtain suitably-aligned storage.
Each object type imposes its alignment requirement on every object of that type; stricter alignment (with larger alignment requirement) can be requested using alignas.
In order to satisfy alignment requirements of all non-static members of a class, padding may be inserted after some of its members.
(cppreference)
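To make the quoted rules concrete, here is a small runnable sketch; the values shown in the comments are typical for x86-64, but only the relations between them, not the exact numbers, are guaranteed:

#include <iostream>

struct with_pad {
    char c;  // 1 byte, then padding so that l starts at an aligned address
    long l;
};

struct no_pad {
    char c;  // char's alignment is 1...
    char d;  // ...so no padding is needed between the two
};

int main()
{
    std::cout << alignof(long) << '\n';     // e.g. 8 on x86-64, 4 on 32-bit x86
    std::cout << sizeof(with_pad) << '\n';  // e.g. 16, not sizeof(char) + sizeof(long) = 9
    std::cout << sizeof(no_pad) << '\n';    // 2: no padding inserted
}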
Regarding your second question, @prl gives the answer:
Because c is a char, &c is a char *, so gdb prints it as a string. The first character of the string is '3', the value of c. The next character is 5, the low byte of a, which gdb prints in octal escape notation. See "Escape sequences in C" on Wikipedia. – prl
Why did the padding disappear when you declared more chars after the char? Because char's alignment is 1, so no padding is needed between chars. long's alignment, on the other hand, appears to be 4 here, so there has to be a 4-byte slot, in which the char is placed.
I assume that because it wasn't initialized, it goes to a different region of the stack than initialized variables?
Not really. Whether a variable is initialized does not (in general) affect its placement, only whether its value is indeterminate. The compiler is free to place objects in memory however it likes; in practice, compilers favor layouts that are efficient in both memory and time.
I want to check how gcc's optimization options affect a program; the code is like this:
#include <iostream>

class A {
public:
    A() {
        a[0] = 10;
        a[1] = 20;
        empty;  // note: this expression has no effect; empty is never assigned
    }
    int a[5];
    bool empty;
};

int main(void) {
    A a;
    std::cout << a.empty << std::endl;
    return 0;
}
Since the empty member isn't assigned a value in the constructor, I expect it to have a random value. Compile and run it:
# g++ -g test.cpp
# ./a.out
254
# ./a.out
254
# ./a.out
253
# ./a.out
253
The result is as I expected. Then I use the -O2 compile option:
# g++ -g -O2 test.cpp
# ./a.out
0
# ./a.out
0
# ./a.out
0
# ./a.out
0
It seems to always be 0. I use gdb to debug the program:
14 int main(void){
(gdb) n
17 std::cout << a.empty << std::endl;
(gdb) p a
$1 = {
a = {[0] = 10, [1] = 20, [2] = <optimized out>, [3] = <optimized out>, [4] = <optimized out>},
empty = <optimized out>
}
I just want to confirm: since empty is optimized out, its value should also be random, right?
I understand why you might think that it should be random, but what you're asking for is a garbage value with certain characteristics. In other words, you want undefined behavior to be defined. Sorry, but you can't ask for that! It's random when you average over all computers in the world, not in your specific, small program on your tiny laptop. It's the same reason random errors are not easily discovered on a single computer: they are only random across many uses on many computers.
Uninitialized values are just that: uninitialized. They have whatever value happens to be in that location in memory. It may be the same for multiple runs, or it may be "random". It depends on where in memory the variable is located, what value resides in that memory and, quite plausibly, what other calls were made prior to the call to the current function and what data they operated on. In this case, your code is in main, so there is not much variation in the calls that go on before main (unless you change the system in general, swap compilers, etc.).
On some systems, for some data types, an uninitialized read could even lead to a trap: either because the uninitialized data has wrong parity bits, or because the value itself is invalid (e.g. loading a floating-point number that has an invalid combination of bits, or loading a "far" pointer on a protected-mode x86 system when the segment part of the address is not a valid segment descriptor index).
It is "non-deterministic" more than "random": there is no way to look at the code and figure out what the value will be. But thinking that it should be a good source of random numbers is incorrect. It is very often guessable based on what happened before; you have to know "how did we get here, and what happened on the way", rather than just reading the source code, to determine what its value is.
Can someone help me get a better understanding of creating variables in C++? I'll state my understanding and then you can correct me.
int x;
Not sure what that does besides declare that x is an integer on the stack.
int x = 5;
Creates a new variable x on the stack and sets it equal to 5. So empty space was found on the stack and then used to house that variable.
int* px = new int;
Creates an anonymous variable on the heap. px is the memory address of the variable. Its value is 0 because, well, the bits are all off at that memory address.
int* px = new int;
*px = 5;
Same thing as before, except that the value of the integer at memory address px is set to 5. (Does this happen in one step? Or does the program create an integer with value 0 on the heap and then set it to 5?)
I know that everything I wrote above probably sounds naive, but I really am trying to understand this stuff.
Others have answered this question from the point of view of how the C++ standard works. My only additional comment there would be with global or static variables. So if you have
int bar ()
{
    static int x;
    return x;
}
then x doesn't live on the stack. It will be initialised to zero at the "start of time" (this is done in a function called crt0, at least with GCC: look up "BSS" segments for more information) and bar will return zero.
I'd massively recommend looking at the assembled code to see how a compiler actually treats what you write. For example, consider this tiny snippet:
int foo (int a)
{
    int x, y;
    x = 3;
    y = a;
    return x + y;
}
I made sure to use the values of x and y (by returning their sum) to ensure the compiler didn't just elide them completely. If you stick that code in a file called tmp.cc and then compile it with
$ g++ -O2 -c -o tmp.o tmp.cc
then ask for the disassembled code with objdump, you get:
$ objdump -d tmp.o
tmp.o: file format elf32-i386
Disassembly of section .text:
00000000 <_Z3fooi>:
0: 8b 44 24 04 mov 0x4(%esp),%eax
4: 83 c0 03 add $0x3,%eax
7: c3 ret
Whoah! What happened to x and y? Well, the point is that the C and C++ standards merely require the compiler to generate code that has the same behaviour as what your program asks for. In fact, this program loads 32 bits from the stack (this is the contents of a, a fact dictated by the ABI on my particular platform) and sticks it in the eax register. Then it adds three and returns. Another important fact about the ABI on my laptop (and probably yours too) is that the return value of a function sits in eax. Notice, the function didn't allocate any memory on the stack at all!
In fact, I also put bar (from above) in my tmp.cc. Here's the resulting code:
00000010 <_Z3barv>:
10: 31 c0 xor %eax,%eax
12: c3 ret
"Huh, what happened to x?", I hear you say :-) Well, the compiler spotted that nothing in the code required x to actually exist, and it always had the value zero. So the function basically got transformed into
int bar ()
{
    return 0;
}
Magic!
When a new variable is created, it does not have a value; it can be anything, depending on what was in that piece of stack or heap before. int x; by itself just reserves storage, and the compiler will typically warn you if you use the value without setting it first. E.g. int y = x; will cause a warning unless you give x an explicit value first (with GCC, -Wall enables -Wuninitialized).
Creating an int on the heap works much the same way: int *p = new int; default-initializes the int, which for a built-in type means no initialization at all, leaving the value of *p up to chance until you set it to something explicit. If you want your heap value initialized, use int *p = new int(5); to specify the value to place in the allocated memory, or int *p = new int(); to zero-initialize it.
Unless you initialize an int variable to zero explicitly, it is pretty much never initialized for you unless it is a global, namespace, or class static.
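A short runnable sketch of the three forms (this behaviour is mandated by the standard, not compiler-specific):

#include <iostream>

int main()
{
    int *p1 = new int;     // default-initialized: *p1 is indeterminate
    int *p2 = new int(5);  // direct-initialized: *p2 == 5
    int *p3 = new int();   // value-initialized: *p3 == 0
    std::cout << *p2 << ' ' << *p3 << std::endl;  // prints "5 0"; reading *p1 would be undefined
    delete p1;
    delete p2;
    delete p3;
    return 0;
}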
In VS2010 specifically (other compilers may treat it differently), an int is not given a default value of 0. You can see this by trying to print out a non-initialized int: memory of sizeof(int) is allocated, but it is not initialized (just junk).
In both of your cases, the memory is allocated first, and then the value is set. If a value is not set, you have an uninitialized piece of memory holding "junk data", and you will get a compiler warning and possibly an error when running it.
Yes, it has an address in memory, but there is no valid (known) data inside it unless you specifically set it. It could very well be anything the compiler recognizes as available memory to be overwritten. Since it is unknown and unreliable, it is considered junk, which is why compilers warn you about it.
Compilers WILL set static int and global int to 0.
EDIT: Due to Peter Schneider's comment.
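To summarize which variables are zeroed for free, a small sketch (zero-initialization of objects with static storage duration is guaranteed by the standard):

int g;  // global: zero-initialized before main runs

int main()
{
    static int s;  // static local: also zero-initialized
    int x;         // automatic: indeterminate ("junk"); do not read before assigning
    return g + s;  // well-defined: returns 0
}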
As we all know, C++'s memory model can be divided into five blocks: the stack, the heap, the free store, the global/static block, and the const block. I understand the first three blocks, and I also know that variables like static int xx are stored in the 4th block, along with string constants like "hello world"; but what is stored in the 5th block, the const block? And for something like int a = 10, where is the "10" stored? Can someone explain this to me?
Thanks a lot.
There is a difference between string literals and primitive constants. String literals are usually stored with the code in a separate area (for historical reasons this block is often called the "text block"). Primitive constants, on the other hand, are somewhat special: they can be stored in the "text" block as well, but their values can also be "baked" into the code itself. For example, when you write
// Global integer constant
const int a = 10;

int add(int b) {
    return b + a;
}
the return expression could be translated into a piece of code that does not reference a at all. Instead of producing binary code that looks like this
LOAD R0, <stack>+offset(b)
LOAD R1, <address-of-a>
ADD R0, R1
RET
the compiler may produce something like this:
LOAD R0, <stack>+offset(b)
ADD R0, #10 ; <<== Here #10 means "integer number 10"
RET
Essentially, despite being stored with the rest of the constants, a is cut out of the compiled code.
As far as integer literals go, they have no address at all: they are always "baked" into the code. When you reference them, instructions that load explicit values are generated, in the same way as shown above.
And for something like int a = 10, where is the "10" stored?
It's an implementation detail. It will most likely become part of the generated code and be turned into something like
mov eax, 10
in assembly.
The same will happen to definitions like
const int myConst = 10;
unless you try to take the address of myConst, like this:
const int *ptr = &myConst;
in which case the compiler will have to put the value 10 into a dedicated block of memory (presumably the 5th in your numbering).