Can I get valgrind to tell me _which_ value is uninitialized? - c++

I ran valgrind on some code as follows:
valgrind --tool=memcheck --leak-check=full --track-origins=yes ./test
It returns the following error:
==24860== Conditional jump or move depends on uninitialised value(s)
==24860== at 0x4081AF: GG::fl(M const&, M const&) const (po.cpp:71)
==24860== by 0x405CDB: MO::fle(M const&, M const&) const (m.cpp:708)
==24860== by 0x404310: M::operator>=(M const&) const (m.cpp:384)
==24860== by 0x404336: M::operator<(M const&) const (m.cpp:386)
==24860== by 0x4021FD: main (test.cpp:62)
==24860== Uninitialised value was created by a heap allocation
==24860== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299)
==24860== by 0x40653F: GODA<unsigned int>::allocate_new_block() (goda.hpp:82)
==24860== by 0x406182: GODA<unsigned int>::GODA(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) (goda.hpp:103)
==24860== by 0x402A0E: M::init(unsigned long) (m.cpp:63)
==24860== by 0x403831: M::M(std::initializer_list<unsigned int>, MO const*) (m.cpp:248)
==24860== by 0x401B56: main (test.cpp:31)
So line 71 has an error. OK, great. Here are the lines leading up to line 71 of po.cpp (line 71 is last):
DEG_TYPE dtk = t.ord_deg();
DEG_TYPE duk = u.ord_deg();
bool searching = dtk == duk;
NVAR_TYPE n = t.nv();
NVAR_TYPE k = 0;
for (/* */; searching and k < n; ++k) { // this is line 71
OK, so which value of line 71 is uninitialized?
certainly not k;
I manually checked (= "stepping through gdb") that t's constructor initializes the value that is returned by t.nv(), so certainly not n (in fact n is set to 6, the correct value);
searching is determined by dtk and duk, but I also manually checked that t's and u's constructors initialize the values that are returned by .ord_deg() (in fact both dtk and duk are set to 3, the correct value).
I'm at a complete loss here. Is there some option that will tell valgrind to report which precise value it thinks is uninitialized?
Update
In answer to one question, here is line 61 of test.cpp:
M s { 1, 0, 5, 2, 0 };
So it constructs using an initializer list. Here's that constructor:
M::M(
initializer_list<EXP_TYPE> p, const MO * ord
) {
common_init(ord);
init_e(p.size());
NVAR_TYPE i = 0;
last = 0;
for (
auto pi = p.begin();
pi != p.end();
++pi
) {
if (*pi != 0) {
e[last] = i;
e[last + 1] = *pi;
last += 2;
}
++i;
}
ord->set_data(*this);
}
Here's the data in the class, adding comments showing where it's initialized:
NVAR_TYPE n; // init_e()
EXP_TYPE * e; // common_init()
NVAR_TYPE last; // common_init()
DEG_TYPE od; // common_init(), revised in ord->set_data()
const MO * o; // common_init()
MOD * o_data; // common_init(), revised in ord->set_data()

Is there some option that will tell valgrind to report which precise
value it thinks is uninitialized?
The best you can do is use --track-origins=yes (you are already using this option). Valgrind will tell you only the approximate location of an uninitialised value (its origin, in Valgrind's terms), not the exact variable name. See the Valgrind manual for --track-origins:
When set to yes, Memcheck keeps track of the origins of all
uninitialised values. Then, when an uninitialised value error is
reported, Memcheck will try to show the origin of the value. An origin
can be one of the following four places: a heap block, a stack
allocation, a client request, or miscellaneous other sources (eg, a
call to brk).
For uninitialised values originating from a heap block, Memcheck shows
where the block was allocated. For uninitialised values originating
from a stack allocation, Memcheck can tell you which function
allocated the value, but no more than that -- typically it shows you
the source location of the opening brace of the function. So you
should carefully check that all of the function's local variables are
initialised properly.

You can use gdb+vgdb+valgrind to debug your program under valgrind.
Then, when valgrind stops on the error reported above, you can examine the
definedness of the variables you are interested in with the monitor request
'xb' or 'get_vbits': ask gdb for the address of the variable, then examine
the vbits for the size of the variable.
For example:
p &searching
=> 0xabcdef
monitor xb 0xabcdef 1
=> will show you the value of searching and the related vbits.
For more details, see 'Debugging your program using Valgrind gdbserver and GDB' http://www.valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver
and 'Memcheck Monitor Commands' http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.monitor-commands

From the Valgrind documentation,
4.2.2. Use of uninitialised values
...
Sources of uninitialised data tend to be:
- Local variables in procedures which have not been initialised, as in the example above.
- The contents of heap blocks (allocated with malloc, new, or a similar function) before you (or a constructor) write something there.
You have:
==24860== Uninitialised value was created by a heap allocation
==24860== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299)
==24860== by 0x40653F: GODA<unsigned int>::allocate_new_block() (goda.hpp:82)
So it is very likely that the malloc used by GODA<unsigned int>::allocate_new_block() is causing this error.
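To make that concrete, here is a minimal sketch of the difference between a heap block that Memcheck treats as undefined and one it treats as defined. The helper names are hypothetical stand-ins: the real GODA<unsigned int>::allocate_new_block() is not shown in the question, so this only assumes that it calls malloc and does not write to the whole block afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an allocate_new_block()-style helper.
// malloc() leaves the contents indeterminate, so Memcheck marks every
// byte of this block as undefined until the program writes to it.
unsigned* allocate_block_raw(std::size_t count) {
    return static_cast<unsigned*>(std::malloc(count * sizeof(unsigned)));
}

// Same allocation, but calloc() zero-fills the block, so Memcheck
// considers every byte defined from the start.
unsigned* allocate_block_zeroed(std::size_t count) {
    return static_cast<unsigned*>(std::calloc(count, sizeof(unsigned)));
}

// Returns true when all `count` entries are zero. Calling this on a block
// from allocate_block_raw() branches on undefined data, which is exactly
// the kind of use Memcheck reports.
bool all_zero(const unsigned* block, std::size_t count) {
    assert(block != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        if (block[i] != 0) return false;
    }
    return true;
}
```

If switching the allocation to the zero-filling variant makes the report disappear, some entry of the block was being read before it was written, and the remaining work is to find which one.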

You could use clang-tidy as an alternative to find uninitialized variables.
QtCreator 4.7 has full integration of clang-tidy, select "Clang-Tidy and Clazy" in the debug pane, press run, select the file(s) you want to test.

You need to understand how memcheck works. In order to avoid generating excessive errors, uninitialized values don't get flagged until they have a possible impact on your code. The uninitialized information gets propagated by assignments.
// if ord_deg returns something that is uninitialized, dtk and/or duk will be
// flagged internally as uninitialized but no error issued
DEG_TYPE dtk = t.ord_deg();
DEG_TYPE duk = u.ord_deg();
// again transitively if either dtk or duk is flagged as uninitialized then
// searching will be flagged as uninitialized, and again no error issued
bool searching = dtk == duk;
// if nv() returns something that is uninitialized, n will be
// flagged internally as uninitialized
NVAR_TYPE n = t.nv();
// k is flagged as initialized
NVAR_TYPE k = 0;
// OK now the values of searching and n affect your code flow
// if either is uninitialized then memcheck will issue an error
for (/* */; searching and k < n; ++k) { // this is line 71
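The propagation described above can be imitated with a toy model. This is only a sketch: real Memcheck tracks definedness per bit in shadow memory, whereas the hypothetical Shadowed type below carries a single flag per value. It does show why a value can look correct in gdb (3 in both cases here) and still trigger a report at the first branch that depends on it.

```cpp
#include <cassert>
#include <cstdio>

// Toy model only: real Memcheck keeps definedness bits in shadow memory.
// Here one flag per value plays that role.
template <typename T>
struct Shadowed {
    T value;       // what gdb would print
    bool defined;  // what Memcheck tracks alongside it
};

int reports = 0;  // number of errors the toy checker has issued so far

// Comparing propagates undefinedness to the result; nothing is reported.
template <typename T>
Shadowed<bool> equal(const Shadowed<T>& a, const Shadowed<T>& b) {
    return Shadowed<bool>{a.value == b.value, a.defined && b.defined};
}

// A conditional jump is where the undefinedness finally matters.
bool branch_on(const Shadowed<bool>& cond, const char* where) {
    if (!cond.defined) {
        ++reports;
        std::fprintf(stderr,
                     "conditional jump depends on undefined value at %s\n",
                     where);
    }
    return cond.value;
}

// Mirrors the shape of the loop header in po.cpp; returns the report count.
int demo() {
    reports = 0;
    Shadowed<unsigned> dtk{3, false};  // looks right, but came from unwritten memory
    Shadowed<unsigned> duk{3, true};
    Shadowed<bool> searching = equal(dtk, duk);
    assert(reports == 0);              // copies and comparisons are silent
    branch_on(searching, "po.cpp:71");
    return reports;
}
```

Calling demo() produces exactly one report, at the branch, even though the undefined data entered the computation two statements earlier.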

You are looking at the wrong stack trace.
Valgrind tells you that uninitialized value was created by a heap allocation:
==24860== Uninitialised value was created by a heap allocation
==24860== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299)
==24860== by 0x40653F: GODA<unsigned int>::allocate_new_block() (goda.hpp:82)
==24860== by 0x406182: GODA<unsigned int>::GODA(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) (goda.hpp:103)
==24860== by 0x402A0E: M::init(unsigned long) (m.cpp:63)
==24860== by 0x403831: M::M(std::initializer_list<unsigned int>, MO const*) (m.cpp:248)
==24860== by 0x401B56: main (test.cpp:31)
You can skip the top few stack frames, which belong to third-party library code, because the error is less likely to be there. Look more closely at this stack frame, which appears to be your code:
==24860== by 0x402A0E: M::init(unsigned long) (m.cpp:63)
The allocation that is never fully initialized is most likely requested at line 63 of m.cpp.

Related

Segmentation Fault before even the first line of `main()` is executed and there are no non-local variables

In the C++ code below, a segmentation fault occurs before the first line of main() is executed.
This happens even though there are no objects to be constructed before entering main() and it does not happen if I remove a (large) variable definition at the second line of main().
I assume the segmentation fault occurs because of the size of the variable being defined. My question is why does this occur before the prior line is executed?
It would seem this shouldn't be occurring due to instruction reordering by the optimizer. I say this based on the compilation options selected and based on debug output.
Is the size of the (array) variable being defined blowing the stack / causing the segfault?
It would seem so since using a smaller array (e.g. 15 elements) does not result in a segmentation fault and since the expected output to stdout is seen.
#include <array>
#include <iostream>
#include <vector>
using namespace std;
namespace {
using indexes_t = vector<unsigned int>;
using my_uint_t = unsigned long long int;
constexpr my_uint_t ITEMS{ 52 };
constexpr my_uint_t CHOICES{ 5 };
static_assert(CHOICES <= ITEMS, "CHOICES must be <= ITEMS");
constexpr my_uint_t combinations(const my_uint_t n, my_uint_t r)
{
if (r > n - r)
r = n - r;
my_uint_t rval{ 1 };
for (my_uint_t i{ 1 }; i <= r; ++i) {
rval *= n - r + i;
rval /= i;
}
return rval;
}
using hand_map_t = array<indexes_t, combinations(ITEMS, CHOICES)>;
class dynamic_loop_functor_t {
private:
// std::array of C(52,5) = 2,598,960 (initially) empty vector<unsigned int>
hand_map_t hand_map;
};
}
int main()
{
cout << "Starting main()..." << endl
<< std::flush;
// "Starting main()..." is not printed if and only if the line below is included.
dynamic_loop_functor_t dlf;
// The same result occurs with either of these alternatives:
// array<indexes_t, 2598960> hand_map;
// indexes_t hand_map[2598960];
}
OS: CentOS Linux release 7.9.2009 (Core)
Compiler: g++ (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Compile command:
g++ -std=c++14 -Wall -Wpedantic -Og -g -o create_hand_map create_hand_map.cpp
No errors or warnings are generated at compile time.
Static analysis:
A static analysis via cppcheck produces no unexpected results.
Using check-config as suggested in the command output below yields only: Please note: Cppcheck does not need standard library headers to get proper results.
$ cppcheck --enable=all create_hand_map.cpp
create_hand_map.cpp:136:27: style: Unused variable: dlf [unusedVariable]
dynamic_loop_functor_t dlf;
^
nofile:0:0: information: Cppcheck cannot find all the include files (use --check-config for details) [missingIncludeSystem]
Attempted debug with GDB:
$ gdb ./create_hand_map
GNU gdb (GDB) Red Hat Enterprise Linux 8.0.1-36.el7
<snip>
This GDB was configured as "x86_64-redhat-linux-gnu".
<snip>
Reading symbols from ./create_hand_map...done.
(gdb) run
Starting program: ./create_hand_map
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400894 in std::operator<< <std::char_traits<char> > (__s=0x4009c0 "Starting main()...",
__out=...) at /opt/rh/devtoolset-7/root/usr/include/c++/7/ostream:561
561 __ostream_insert(__out, __s,
(gdb) bt
#0 0x0000000000400894 in std::operator<< <std::char_traits<char> > (
__s=0x4009c0 "Starting main()...", __out=...)
at /opt/rh/devtoolset-7/root/usr/include/c++/7/ostream:561
#1 main () at create_hand_map.cpp:133
(gdb)
This is definitely a stack overflow. sizeof(dynamic_loop_functor_t) is about 59.5 MiB (2,598,960 vectors of 24 bytes each with libstdc++ on x86-64), and the default stack size limit on most Linux distributions is only 8 MiB. So the crash is not surprising.
The remaining question is, why does the debugger identify the crash as coming from inside std::operator<<? The actual segfault results from the CPU exception raised by the first instruction to access an address beyond the stack limit. The debugger only gets the address of the faulting instruction, and has to use the debug information provided by the compiler to associate this with a particular line of source code.
The results of this process are not always intuitive. There is not always a clear correspondence between instructions and source lines, especially when the optimizer may reorder instructions or combine code coming from different lines. Also, there are many cases where a bug or problem with one source line can cause a fault in another section of code that is otherwise innocent. So the source line shown by the debugger should always be taken with a grain of salt.
In this case, what happened is as follows.
The compiler determines the total amount of stack space to be needed by all local variables, and allocates it by subtracting this number from the stack pointer at the very beginning of the function, in the prologue. This is more efficient than doing a separate allocation for each local variable at the point of its declaration. (Note that constructors, if any, are not called until the point in the code where the variable's declaration actually appears.)
The prologue code is typically not associated with any particular line of source code, or maybe with the line containing the function's opening {. But in any case, subtracting from the stack pointer is a pure register operation; it does not access memory and therefore cannot cause a segfault by itself. Nonetheless, the stack pointer is now pointing outside the area mapped for the stack, so the next attempt to access memory near the stack pointer will segfault.
The next few instructions of main execute the cout << "Starting main". This is conceptually a call to the overloaded operator<< from the standard library; but in GCC's libstdc++, the operator<< is a very short function that simply calls an internal helper function named __ostream_insert. Since it is so short, the compiler decides to inline operator<< into main, and so main actually contains a call to __ostream_insert. This is the instruction that faults: the x86 call instruction pushes a return address to the stack, and the stack pointer, as noted, is out of bounds.
Now the instructions that set up arguments and call __ostream_insert are marked by the debug info as corresponding to the source of operator<<, in the <ostream> header file - even though those instructions have been inlined into main. Hence your debugger shows the crash as having occurred "inside" operator<<.
Had the compiler not inlined operator<< (e.g. if you compile without optimization), then main would have contained an actual call to operator<<, and this call is what would have crashed. In that case the traceback would have pointed to the cout << "Starting main" line in main itself - misleading in a different way.
Note that you can have GCC warn you about functions that use a large amount of stack with the options -Wstack-usage=NNN or -Wframe-larger-than=NNN. These are not enabled by -Wall, but could be useful to add to your build, especially if you expect to use large local objects. Specifying either of them, with a reasonable number for NNN (say 4000000), I get a warning on your main function.
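One way to avoid the overflow altogether, sketched here as an assumption rather than something the question asked for, is to keep the table on the heap by replacing the std::array member with a std::vector of the same length. The class name and combinations() are taken from the question; the size() and array_bytes() helpers are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using indexes_t = std::vector<unsigned int>;
using my_uint_t = unsigned long long int;

constexpr my_uint_t ITEMS{52};
constexpr my_uint_t CHOICES{5};

constexpr my_uint_t combinations(const my_uint_t n, my_uint_t r) {
    if (r > n - r) r = n - r;
    my_uint_t rval{1};
    for (my_uint_t i{1}; i <= r; ++i) {
        rval *= n - r + i;
        rval /= i;
    }
    return rval;
}

class dynamic_loop_functor_t {
public:
    // The vector object itself is a few pointers; its elements live on the heap.
    dynamic_loop_functor_t() : hand_map(combinations(ITEMS, CHOICES)) {
        assert(hand_map.size() == combinations(ITEMS, CHOICES));
    }
    std::size_t size() const { return hand_map.size(); }
    // Bytes the original std::array member would have needed inside the object.
    static constexpr std::size_t array_bytes() {
        return combinations(ITEMS, CHOICES) * sizeof(indexes_t);
    }
private:
    std::vector<indexes_t> hand_map;
};
```

With this change, a local dynamic_loop_functor_t in main() occupies only a handful of bytes of stack, while array_bytes() still reports how large the original in-object array would have been.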
You must raise the stack size limit before putting the huge object on the stack.
On Linux you can do that by calling setrlimit() from main(). From then on you can invoke functions with huge stack objects. E.g.:
#include <cstdlib>
#include <sys/resource.h>
struct huge_t { /* something really huge lives here */ };
int worker ();
int main () {
    struct rlimit rlim;
    getrlimit (RLIMIT_STACK, &rlim); // start from the current limits so rlim_max is valid
    rlim.rlim_cur = sizeof (huge_t) + 1048576; // must not exceed rlim.rlim_max
    if (setrlimit (RLIMIT_STACK, &rlim) != 0)
        return EXIT_FAILURE;
    return worker ();
}
int worker () {
    huge_t huge;
    /* do something with huge */
    return EXIT_SUCCESS;
}
Because the stack space for local objects is reserved on function entry, before you have the chance to call setrlimit(), the huge object must live in worker() rather than in main() (and worker() must not be inlined into main()).

Why does gmp crash with "invalid next size" to realloc here?

I have a simple function using the gmp C++ bindings:
#include <inttypes.h>
#include <memory>
#include <gmpxx.h>
mpz_class f(uint64_t n){
std::unique_ptr<mpz_class[]> m = std::make_unique<mpz_class[]>(n + 1);
m[0] = 0;
m[1] = 1;
for(uint64_t i = 2; i <= n; ++i){
m[i] = m[i-1] + m[i-2];
}
return m[n];
}
int main(){
mpz_class fn;
for(uint64_t n = 0;; n += 1){
fn = f(n);
}
}
Presumably make_unique should allocate a fresh array and free it when the function returns since the unique pointer owning it has its lifetime end. Presumably the mpz_class object returned should be a copy and not affected by this array getting deleted. The program crashes with the error:
realloc(): invalid next size
and if I look at the core dump in gdb I get the stack trace:
#0 raise()
#1 abort()
#2 __libc_message()
#3 malloc_printerr()
#4 _int_realloc()
#5 realloc()
#6 __gmp_default_reallocate()
#7 __gmpz_realloc()
#8 __gmpz_add()
#9 __gmp_binary_plus::eval(v, w, z)
#10 __gmp_expr<...>::eval(this, this, p)
#11 __gmp_set_expr<...>(expr, z)
#12 __gmp_expr<...>::operator=<...>(expr, this)
#13 f(n)
#14 main(argc, argv)
This isn't helpful to me, except that it suggests maybe the problem is coming from gmpxx using expression templates (stack frames 9-12 indicate this, valgrind and stack frame 12 put the last line of my code executed before the error at m[1] = 1;). Valgrind says there is an invalid read of size 8 at this line but lists stack entries corresponding to the rest of the trace after it, and then says there is an invalid write at the next instruction. The invalid read is 8 bytes after "a block of size 24 alloc'd [by make_unique]" while the invalid write is to null. Obviously this line should not cause either though as it should only be reading a pointer and then writing to part of the buffer it points to which definitely does not have address 0x0. I decided to use the C++ bindings even though I always use gmp from C because I thought it would be faster to write but this error ensured that was not the case. Is this a problem with gmp or am I allocating the array wrong? I get similar errors if I used new and delete directly or if I manually inline the function call. I feel like the problem may have to do with mpz_class actually storing an expression template and not a proper concretized value.
I'm using GCC 9.2.0 with g++ -std=c++17 -O2 -g -Wall ... and GMP 6.1.2-3.
Neither Clang nor GCC report any errors.
If we run under Valgrind, we see:
==1948514== Invalid read of size 8
==1948514== at 0x489B0F0: __gmpz_set_si (in /usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2)
==1948514== by 0x10945E: __gmp_expr<__mpz_struct [1], __mpz_struct [1]>::assign_si(long) (gmpxx.h:1453)
==1948514== by 0x1094E3: __gmp_expr<__mpz_struct [1], __mpz_struct [1]>::operator=(int) (gmpxx.h:1538)
==1948514== by 0x109248: f(unsigned long) (59678712.cpp:8)
==1948514== by 0x109351: main (59678712.cpp:18)
==1948514== Address 0x4e08ca0 is 8 bytes after a block of size 24 alloc'd
==1948514== at 0x483650F: operator new[](unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1948514== by 0x10953F: std::_MakeUniq<__gmp_expr<__mpz_struct [1], __mpz_struct [1]> []>::__array std::make_unique<__gmp_expr<__mpz_struct [1], __mpz_struct [1]> []>(unsigned long) (unique_ptr.h:855)
==1948514== by 0x10920C: f(unsigned long) (59678712.cpp:6)
==1948514== by 0x109351: main (59678712.cpp:18)
This demonstrates that when we call f(0), we write to m[1], which is out of bounds. That's undefined behaviour, so anything could happen. Luckily you got a crash, rather than something more subtle.
Simple fix:
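mpz_class f(uint64_t n) {
    if (!n) return 0; // for n == 0 the array has a single element, so m[1] must not be written
    // ... rest of f() unchanged
}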
BTW, prefer <cstdint> to <inttypes.h>, and write as std::uint64_t etc.
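For a version that can be compiled without GMP, here is a sketch of the same function with the guard in place. It deliberately swaps mpz_class for std::uint64_t, so it is an analogue of the fix rather than the original code, and the name fib and the n <= 93 limit (the largest Fibonacci index that fits in 64 bits) are assumptions introduced for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Same table-based recurrence as f() in the question, with std::uint64_t
// standing in for mpz_class so the sketch builds without GMP.
std::uint64_t fib(std::uint64_t n) {
    assert(n <= 93);  // F(94) no longer fits in 64 bits
    if (n == 0) {
        return 0;     // the table would have one element, so m[1] must not be touched
    }
    std::vector<std::uint64_t> m(n + 1);
    m[0] = 0;
    m[1] = 1;
    for (std::uint64_t i = 2; i <= n; ++i) {
        m[i] = m[i - 1] + m[i - 2];
    }
    return m[n];
}
```

The early return is the whole fix: every other value of n gives a table with at least two elements, so the writes to m[0] and m[1] stay in bounds.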

Finding the "random pointer" with valgrind "possibly lost"

Valgrind is complaining of bytes being possibly lost in one part of our large C++ program. After reading this SO question and the manual on "possibly lost" I think I understand what the error is about, but I don't understand the error in this specific case.
mpConsDbKey is not modified anywhere in the class (or derived classes), so the pointer is not being advanced or reassigned. The valgrind manual says "possibly lost" could also be declared if a "random value in memory points into the block"
Is there any way to know, via Valgrind, why it thinks it is lost?
If it is a random block of memory, best to not waste time looking for it...
Valgrind Manual
"Possibly lost". This covers cases 5--8 (for the BBB blocks) above.
This means that a chain of one or more pointers to the block has been
found, but at least one of the pointers is an interior-pointer. This
could just be a random value in memory that happens to point into a
block, and so you shouldn't consider this ok unless you know you have
interior-pointers.
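As an illustration of where a legitimate interior pointer can come from in C++, consider multiple inheritance: a pointer to a second base class usually points into the middle of the allocated block. The types below are hypothetical and unrelated to DvDbConsolidator, and the nonzero offset is what common ABIs produce rather than something the language guarantees.

```cpp
#include <cassert>
#include <cstddef>

struct First  { int a = 0; virtual ~First() = default; };
struct Second { int b = 0; virtual ~Second() = default; };
struct Derived : First, Second {};

// Byte offset of the Second subobject inside a heap-allocated Derived.
std::ptrdiff_t second_base_offset() {
    Derived* d = new Derived;
    Second* s = d;  // the implicit upcast adjusts the address
    std::ptrdiff_t offset =
        reinterpret_cast<char*>(s) - reinterpret_cast<char*>(d);
    assert(static_cast<Derived*>(s) == d);  // the downcast recovers the block start
    delete s;       // fine: Second has a virtual destructor
    return offset;
}
```

If the only surviving pointer to such an object is the Second*, a leak checker sees a pointer into the block but none to its start, which is the situation the manual describes.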
Source Code & Error
DvDaTestBC::DvDaTestBC()
{
mpConsDbKey = new DvDbConsolidator; // <<== error here
}
DvDaTestBC::~DvDaTestBC()
{
delete mpConsDbKey;
}
Type Hierarchy:
Error Text:
==2745== 88 bytes in 1 blocks are possibly lost in loss record 5,750 of 8,333
==2745== at 0x4007D47: operator new(unsigned int) (vg_replace_malloc.c:292)
==2745== by 0x50B5ED8: DvDaTestBC::DvDaTestBC(unsigned long, unsigned short, unsigned long, std::string const&, short) (DvDaTestBC.cpp:68)
==2745== by 0x50BCCCE: DvDaTest::DvDaTest(unsigned long, unsigned short, unsigned long, std::string const&, DvDbAlarmConfigSet const&, Dv::ETestState, short) (DvDaTest.cpp:23)
==2745== by 0x52157F1: DvDaMipErrorReporter::createTests() (DvDaMipErrorReporter.cpp:177)
==2745== by 0x521427F: DvDaMipErrorReporter::DvDaMipErrorReporter(unsigned short, unsigned short) (DvDaMipErrorReporter.cpp:65)
==2745== by 0x554DE25: DvCfgTspModuleMgrBC::createDAs() (DvCfgTspModuleMgrBC.cpp:742)

Level 3 complex BLAS functions throwing out illegal value error

I am writing a function which calls Fortran functions for complex matrix-matrix multiplication. I am calling the CGEMM_ and ZGEMM_ functions for complex multiplication. Since all xGEMM_ functions are essentially the same, I copied the code from SGEMM_ to CGEMM_ and ZGEMM_. The only changes made were the respective data types. The SGEMM_ and DGEMM_ functions are working fine, but CGEMM_ throws the following error, even though all inputs are the same:
** On entry to CGEMM parameter number 13 had an illegal value
and zgemm_ throws
** On entry to ZGEMM parameter number 1 had an illegal value
I really have no idea what's going on. Is this some kind of bug in the liblapack package? I am using the liblapack-dev package. I made a smaller version of my big code and I am still getting the same error with CGEMM_.
I am running a 32-bit system and was wondering if that was the problem.
Code:
#include<iostream>
using namespace std;
#include<stdlib.h>
#include<string.h>
#include<complex>
typedef complex<float> c_float;
extern "C"
{c_float cgemm_(char*,char*,int*,int*,int*,c_float*, c_float[0],int*,c_float[0],int*,c_float*,c_float[0],int*);//Single Complex Matrix Multiplication
}
c_float** allocate(int rows, int columns)
{
c_float** data;
// Allocate Space
data = new c_float*[columns]; //Allocate memory for using multidimensional arrays in column major format.
data[0] = new c_float[rows*columns];
for (int i=0; i<columns; i++)
{
data[i] = data[0] + i*rows;
}
// Randomize input
for (int i=0; i<columns; i++)
{for (int j=0; j<rows; j++)
{
data[j][i] =complex<double>(drand48()*10 +1,drand48()*10 +1); //Randomly generated matrix with values in the range [1 11)
}
}
return(data);
}
// Destructor
void dest(c_float** data)
{
delete [] data[0];
delete [] data;
}
// Multiplication
void mult(int rowsA,int columnsA, int rowsB,int columnsB)
{
c_float **matA,**matB,**matC;
char transA, transB;
int m,n,k,LDA,LDB,LDC;
c_float *A,*B,*C;
c_float alpha(1.0,0.0);
c_float beta(0.0,0.0);
matA = allocate(rowsA,columnsA);
matB = allocate(rowsB,columnsB);
matC = allocate(rowsA,columnsB);
transA = 'N';
transB = 'N';
A = matA[0];
B = matB[0];
m = rowsA;
n = columnsB;
C = matC[0];
k = columnsA;
LDA = m;
LDB = k;
LDC = m;
cout<<"Matrix A"<<endl;
for (int i=0; i<rowsA; i++)
{for (int j=0; j<columnsA; j++)
{
cout<<matA[i][j];
cout<<" ";
}cout<<endl;
}
cout<<"Matrix B"<<endl;
for (int i=0; i<rowsB; i++)
{for (int j=0; j<columnsB; j++)
{
cout<<matB[i][j];
cout<<" ";
}cout<<endl;
}
cgemm_(&transA,&transB,&m,&n,&k,&alpha,A,&LDA,B,&LDB,&beta,C,&LDC);
cout<<"Matrix A*B"<<endl;
for (int i=0; i<rowsA; i++)
{for (int j=0; j<columnsB; j++)
{
cout<<matC[i][j];
cout<<"";
}
cout<<endl;
}
dest(matA);
dest(matB);
dest(matC);
}
main()
{
mult (2,2,2,2);
}
The output and valgrind report are as follows:
-----------------------------------------
Compilation using g++ -g -o matrix Matrix_multiplication.cpp -lblas -llapack -lgfortran
./matrix gives
Matrix A
(1.00985,1) (1.91331,4.64602)
(2.76643,1.41631) (5.87217,1.92298)
Matrix B
(5.54433,6.2675) (6.6806,10.3173)
(9.31292,3.33178) (1.50832,6.56094)
** On entry to CGEMM parameter number 1 had an illegal value
Valgrind output looks like
==4710== Memcheck, a memory error detector
==4710== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4710== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==4710== Command: ./o
==4710== Parent PID: 3337
==4710==
==4710== Conditional jump or move depends on uninitialised value(s)
==4710== at 0x46E5096: lsame_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x46DD683: cgemm_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x8048C7E: mult(int, int, int, int) (Matrix_multiplication.cpp:83)
==4710== by 0x8048D70: main (Matrix_multiplication.cpp:102)
==4710== Uninitialised value was created by a stack allocation
==4710== at 0x8048A18: mult(int, int, int, int) (Matrix_multiplication.cpp:43)
==4710==
==4710== Conditional jump or move depends on uninitialised value(s)
==4710== at 0x46DD686: cgemm_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x8048C7E: mult(int, int, int, int) (Matrix_multiplication.cpp:83)
==4710== by 0x8048D70: main (Matrix_multiplication.cpp:102)
==4710== Uninitialised value was created by a stack allocation
==4710== at 0x8048A18: mult(int, int, int, int) (Matrix_multiplication.cpp:43)
==4710==
==4710== Conditional jump or move depends on uninitialised value(s)
==4710== at 0x46E5096: lsame_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x46DD7B1: cgemm_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x8048C7E: mult(int, int, int, int) (Matrix_multiplication.cpp:83)
==4710== by 0x8048D70: main (Matrix_multiplication.cpp:102)
==4710== Uninitialised value was created by a stack allocation
==4710== at 0x8048A18: mult(int, int, int, int) (Matrix_multiplication.cpp:43)
==4710==
==4710== Conditional jump or move depends on uninitialised value(s)
==4710== at 0x46DD7B4: cgemm_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x8048C7E: mult(int, int, int, int) (Matrix_multiplication.cpp:83)
==4710== by 0x8048D70: main (Matrix_multiplication.cpp:102)
==4710== Uninitialised value was created by a stack allocation
==4710== at 0x8048A18: mult(int, int, int, int) (Matrix_multiplication.cpp:43)
==4710==
==4710== Conditional jump or move depends on uninitialised value(s)
==4710== at 0x46E5096: lsame_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x46DD859: cgemm_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x8048C7E: mult(int, int, int, int) (Matrix_multiplication.cpp:83)
==4710== by 0x8048D70: main (Matrix_multiplication.cpp:102)
==4710== Uninitialised value was created by a stack allocation
==4710== at 0x8048A18: mult(int, int, int, int) (Matrix_multiplication.cpp:43)
==4710==
==4710== Conditional jump or move depends on uninitialised value(s)
==4710== at 0x46DD85C: cgemm_ (in /usr/lib/atlas-base/atlas/libblas.so.3.0)
==4710== by 0x8048C7E: mult(int, int, int, int) (Matrix_multiplication.cpp:83)
==4710== by 0x8048D70: main (Matrix_multiplication.cpp:102)
==4710== Uninitialised value was created by a stack allocation
==4710== at 0x8048A18: mult(int, int, int, int) (Matrix_multiplication.cpp:43)
==4710==
==4710==
==4710== HEAP SUMMARY:
==4710== in use at exit: 120 bytes in 6 blocks
==4710== total heap usage: 43 allocs, 37 frees, 13,897 bytes allocated
==4710==
==4710== LEAK SUMMARY:
==4710== definitely lost: 0 bytes in 0 blocks
==4710== indirectly lost: 0 bytes in 0 blocks
==4710== possibly lost: 0 bytes in 0 blocks
==4710== still reachable: 120 bytes in 6 blocks
==4710== suppressed: 0 bytes in 0 blocks
==4710== Rerun with --leak-check=full to see details of leaked memory
==4710==
==4710== For counts of detected and suppressed errors, rerun with: -v
==4710== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 0 from 0)
EDIT: The question was modified with a code that can be run. The problem remains the same and the nature of the question has not changed.
The answer about the length of character variables in Fortran is essentially correct, but that is not your problem here. Fixed length character variables of functions inside the blas library will probably not read the length from a function argument. I checked this for a function and even at -O0 the length was a compile-time constant.
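The cause of your particular problem is the declaration c_float cgemm_(..., which tells the compiler that cgemm_ returns a c_float. Normally, return values are put in a register, but when they are too large they are returned through memory instead. On your 32-bit system this seems to be the case for the 8-byte c_float. With the usual 32-bit x86 calling convention the caller then passes a hidden pointer to the return slot ahead of the real arguments, so every argument CGEMM sees is shifted by one; that would explain the complaint about parameter number 1. Declaring the function as void cgemm_ (as it should be) or even int cgemm_ (which uses a register) solves the issue.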
The take-home message is "don't do this", as this is a hackish way of calling and will cause headaches when dealing with different platforms/compilers. It is much better to use the cblas interface, or a C++ library for blas operations.
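Whichever interface you end up using, it helps to have an independent result to compare against. The sketch below is a naive reference product for the column-major layout that xGEMM expects; cm_index and reference_gemm are hypothetical helper names, and the code assumes contiguous storage with leading dimensions equal to the row counts (LDA = m, LDB = k, LDC = m, as in the question). It also makes the indexing explicit, which matters because the question's allocate() writes data[j][i] with the row index first, which only stays in bounds for square matrices.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using c_float = std::complex<float>;

// Position of element (row, col) in a column-major array with leading dimension ld.
inline std::size_t cm_index(std::size_t row, std::size_t col, std::size_t ld) {
    return row + col * ld;
}

// Naive C = A * B for column-major A (m x k) and B (k x n), stored contiguously.
std::vector<c_float> reference_gemm(const std::vector<c_float>& a,
                                    const std::vector<c_float>& b,
                                    std::size_t m, std::size_t n, std::size_t k) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<c_float> c(m * n);  // value-initialized to (0,0)
    for (std::size_t col = 0; col < n; ++col) {
        for (std::size_t p = 0; p < k; ++p) {
            for (std::size_t row = 0; row < m; ++row) {
                c[cm_index(row, col, m)] +=
                    a[cm_index(row, p, m)] * b[cm_index(p, col, k)];
            }
        }
    }
    return c;
}
```

For the 2x2 test in the question, comparing the BLAS output against reference_gemm(A, B, 2, 2, 2) element by element (with a small tolerance) is enough to tell a calling-convention problem from a data-layout problem.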
I don't see the lengths of the transA or transB strings being passed to the xgemm_ call.
Character dummies in Fortran are accompanied by a 'hidden' length argument. The convention used by GCC 4.9.0, for example, for this is described more here:
https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gfortran/Argument-passing-conventions.html.
The positioning of these hidden arguments in the argument list is platform dependent. On Linux they are placed after all the explicit arguments.
Consider s.f90
Subroutine s(c, r)
Character (*), Intent (In) :: c
Real, Intent (In) :: r
Print '(3A,I0,A)', 'c = "', c, '", (len(c)=', len(c), ')'
Print *, 'r = ', r
End Subroutine
and main.c
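#include <string.h>
/* Prototype for the Fortran subroutine; the trailing int is the hidden length of c
   (GCC 8 and later use size_t for it instead of int). */
void s_(const char *c, const float *r, int c_len);
int main(void)
{
    char c[1+1];
    float r=4.2;
    strcpy(c,"A");
    s_(c,&r,1);
    return 0;
}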
For running on Linux I am passing 1 as the (hidden in Fortran) third argument for s, representing the length of my string.
Compiling and running with gfortran gives me
> gfortran -g main.c s.f90 && ./a.out
c = "A", (len(c)=1)
r = 4.19999981
So probably your xgemm_ calls should be ...,&LDC,1,1);?

C/C++ - Why is the heap so big when I'm allocating space for a single int?

I'm currently using gdb to see the effects of low level code. Right now I'm doing the following:
int* pointer = (int*)calloc(1, sizeof(int));
yet when I examine the memory using info proc mappings in gdb, I see the following after what I presume is the .text section (since Objfile shows the name of the binary I'm debugging):
...
Start Addr End Addr Size Offset Objfile
0x602000 0x623000 0x21000 0x0 [heap]
How come the heap is that big when all I did was allocating space for a single int?
The weirdest thing is, even when I'm doing calloc(1000, sizeof(int)) the size of the heap remains the same.
PS: I'm running Ubuntu 14.04 on an x86_64 machine. I'm compiling the source using g++ (yes, I know I shouldn't use calloc in C++, this is just a test).
How come the heap is that big when all I did was allocating space for a single int?
I did a simple test on Linux. When one calls calloc glibc calls at some point sbrk() to get memory from OS:
(gdb) bt
#0 0x0000003a1d8e0a0a in brk () from /lib64/libc.so.6
#1 0x0000003a1d8e0ad7 in sbrk () from /lib64/libc.so.6
#2 0x0000003a1d87da49 in __default_morecore () from /lib64/libc.so.6
#3 0x0000003a1d87a0aa in _int_malloc () from /lib64/libc.so.6
#4 0x0000003a1d87a991 in malloc () from /lib64/libc.so.6
#5 0x0000003a1d87a89a in calloc () from /lib64/libc.so.6
#6 0x000000000040053a in main () at main.c:6
But glibc does not ask the OS for exactly the 4 bytes you requested; it calculates its own size. This is how it is done in glibc:
/* Request enough space for nb + pad + overhead */
size = nb + mp_.top_pad + MINSIZE;
mp_.top_pad is by default 128*1024 bytes so it is the main reason why when you ask for 4 bytes the system allocates 0x21000 bytes.
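The arithmetic can be checked with a small model of that formula. This is a simplified sketch, not glibc's code: it ignores alignment corrections, and the default values min_size = 32 and page = 4096, as well as the chunk sizes used in the note below (32 bytes for a 4-byte request, 4016 bytes for a 4000-byte request), are assumptions about x86-64 glibc rather than figures from the answer.

```cpp
#include <cassert>
#include <cstddef>

// Round size up to the next multiple of page.
std::size_t round_up_to_page(std::size_t size, std::size_t page) {
    assert(page != 0);
    return (size + page - 1) / page * page;
}

// Simplified model of the first heap extension: nb + top_pad + MINSIZE,
// rounded up to a whole page. min_size = 32 and page = 4096 are assumed
// values for x86-64 glibc, not taken from the answer.
std::size_t first_heap_size(std::size_t nb, std::size_t top_pad,
                            std::size_t min_size = 32,
                            std::size_t page = 4096) {
    return round_up_to_page(nb + top_pad + min_size, page);
}
```

Under these assumptions first_heap_size(32, 128 * 1024) and first_heap_size(4016, 128 * 1024) both give 0x21000, which matches the mapping in the question and explains why calloc(1000, sizeof(int)) does not change the heap size, while first_heap_size(32, 1) gives 0x1000.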
You can adjust mp_.top_pad with call to mallopt. This is from mallopt's doc:
M_TOP_PAD
This parameter defines the amount of padding to employ when
calling sbrk(2) to modify the program break. (The measurement
unit for this parameter is bytes.) This parameter has an
effect in the following circumstances:
* When the program break is increased, then M_TOP_PAD bytes
are added to the sbrk(2) request.
In either case, the amount of padding is always rounded to a
system page boundary.
So I changed your program and added a call to mallopt:
#include <stdlib.h>
#include <malloc.h>
int main()
{
mallopt(M_TOP_PAD, 1);
int* pointer = (int*)calloc(1, sizeof(int));
return 0;
}
I set a padding of 1 byte; according to the documentation, the request is always rounded up to a system page boundary.
So this is what gdb tells me for my program:
Start Addr End Addr Size Offset objfile
0x601000 0x602000 0x1000 0x0 [heap]
So now the heap is 4096 bytes. Exactly the size of my page:
(gdb) !getconf PAGE_SIZE
4096
Useful links:
http://man7.org/linux/man-pages/man3/mallopt.3.html
Since you mentioned C/C++: in C++, it is better to use the following construct:
int* pointer = new int(1); // one int holding the value 1; write new int() for a zeroed int, as calloc gives