Can compiler combine multiple malloc calls into one? - c++

Let's say we have the following two pieces of code:
int *a = (int *)malloc(sizeof(*a));
int *b = (int *)malloc(sizeof(*b));
And
int *a = (int *)malloc(2 * sizeof(*a));
int *b = a + 1;
Both of them allocate two integers on the heap and (assuming normal usage) they should be equivalent. The first seems to be slower, as it calls malloc twice, while the second can result in more cache-friendly code. The second, however, is possibly insecure, as we can accidentally overwrite the value b points to just by incrementing a and writing to the resulting pointer (or someone malicious can change the value b points to just by knowing where a is).
It's possible that the above claims are not true (for example the speed is questioned here: Minimizing the amount of malloc() calls improves performance?) but my question is just: Can the compiler do this type of transformation or is there something fundamentally different between the two according to the standard? If it is possible, what compiler flags (let's say gcc) can allow it?

In reality, no, the compiler will never combine the 2 malloc() calls into a single malloc() call automatically. Each call to malloc() returns the address of a new memory block, there is no guarantee that the allocated blocks will be located anywhere close to each other, and each allocated block must be free()'d individually. So no compiler will ever assume anything about the relationship between multiple allocated blocks and try to optimize their allocations for you.
Now, it is possible that in a very simplified use case, where the allocation and deallocation are in the same scope and the transformation can be proven safe, a compiler vendor might decide to try to optimize, i.e.:
void doIt()
{
int *a = (int *)malloc(sizeof(*a));
int *b = (int *)malloc(sizeof(*b));
...
free(a);
free(b);
}
Could become:
void doIt()
{
void *ptr = malloc(sizeof(int) * 2);
int *a = (int *)ptr;
int *b = a + 1;
...
free(ptr);
}
But in reality, no compiler vendor will actually attempt to do this. It is not worth the effort, or the risk, for such little gain. And it would not work in more complex scenarios anyway, e.g.:
void doIt()
{
int *a = (int *)malloc(sizeof(*a));
int *b = (int *)malloc(sizeof(*b));
...
UseAndFree(a, b);
}
void UseAndFree(int *a, int *b)
{
...
free(a);
free(b);
}

No, it can't, because the compiler (in general) doesn't know when a and b might get free()'d, and if it allocates them both as part of a single allocation, then it would need to free() them both at the same time also.
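The lifetime argument can be made concrete with a minimal sketch. The function names below are made up for illustration and are not from the question: with separate allocations one block can be released while the other stays usable, whereas a combined block has to be released as a whole.

```cpp
#include <cassert>
#include <cstdlib>

// Separate allocations: each block has its own lifetime.
int separate_lifetimes()
{
    int *a = static_cast<int *>(std::malloc(sizeof *a));
    int *b = static_cast<int *>(std::malloc(sizeof *b));
    if (a == nullptr || b == nullptr) {
        std::free(a);   // free(nullptr) is a no-op
        std::free(b);
        return -1;
    }
    assert(a != b);     // two live blocks never share an address
    *a = 1;
    *b = 2;
    std::free(a);       // a is released here...
    int result = *b;    // ...while b is still valid
    std::free(b);
    return result;
}

// Combined allocation: both ints share one block and one lifetime.
int combined_lifetime()
{
    int *block = static_cast<int *>(std::malloc(2 * sizeof *block));
    if (block == nullptr)
        return -1;
    int *a = block;
    int *b = block + 1;
    *a = 1;
    *b = 2;
    int result = *a + *b;
    // std::free(b) would be undefined behavior: b was not returned by malloc.
    std::free(block);   // releases both ints at once
    return result;
}
```

A compiler that merged the two calls in separate_lifetimes() would have to prove that nothing ever frees a on its own, which is the hard part in general.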

There are a number of reasons why this will likely never happen, but the most important is lifetimes: allocations made independently can be freed independently, whereas allocations made together are locked to the same lifetime.
This sort of nuance is best expressed by the developer rather than determined by the compiler.
Is the second "insecure" in that you can overwrite values? In C, and by extension C++, the language does not protect you from bad programming. You are free to shoot yourself in the foot at any time, using any means necessary:
int a;
int b;
int* p = &a;
p[1] = 9; // Bullet, meet foot
(&b)[-1] = 9; // Why not?
If you want to allocate N of something by all means use calloc() to express it, or an appropriately sized malloc(). Doing individual allocations is pointless unless there's a good reason.
Normally you wouldn't allocate a single int, that's kind of useless, but there are cases where that might be the only reasonable option. Typically it's larger blocks of things, like a full struct or a character buffer.
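As a rough sketch of allocating one meaningful block instead of several tiny ones (the Pair struct and the function name are invented for illustration), two ints that always live and die together can simply share a struct:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical aggregate: two ints with a common lifetime.
struct Pair {
    int a;
    int b;
};

// Stores x and y in a single heap block and returns their sum.
// Returns 0 if the allocation fails.
int sum_pair(int x, int y)
{
    Pair *p = static_cast<Pair *>(std::malloc(sizeof *p));
    if (p == nullptr)
        return 0;       // nothing was allocated, nothing to free
    p->a = x;
    p->b = y;
    assert(p->a == x && p->b == y);   // both fields live in the same block
    int sum = p->a + p->b;
    std::free(p);       // one free() for the one malloc()
    return sum;
}
```

Here the programmer, not the compiler, states that the two values belong together.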

First of all:
int *a = (int *)malloc(8);
int *b = a + 4;
Is not what you think. You want:
int *a = malloc(sizeof(*a) * 2);
int *b = a + 1;
It shows that pointer arithmetic is something you need to learn.
Secondly: the compiler does not change anything in your code, and it will not combine any function calls into one. What you are trying to achieve is a micro-optimization. If you want to use a larger chunk of memory, simply use arrays.
int *a = malloc(sizeof(*a) * 2);
a[0] = 5;
a[1] = 6;
/* some other code */
free(a);
Do not use "magic" numbers in malloc, only the sizeof of the objects. Do not cast the result of malloc in C (C++ requires the cast).

I've done exactly that with a bignum library, but you only free the one pointer.
//initialization every time program runs
extern bignum_t *scratch00; //these are useful for taylor series, etc.
extern bignum_t *scratch01;
extern bignum_t *scratch02;
.
.
.
bignum_t *bn_malloc(int bignums)
{
return(malloc(bignums * bn_numbytes));
}
.
.
.
//bignums specific to the program being written at the moment
bignum_t *numerator;
bignum_t *denom;
bignum_t *denom_add;
bignum_t *accum;
bignum_t *term;
.
.
.
numerator = bn_malloc(1);
denom = bn_malloc(1);
denom_add = bn_malloc(1);
accum = bn_malloc(1);
term = bn_malloc(1);
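The snippet above still calls bn_malloc once per bignum. A minimal sketch of the "one allocation, one free" idea might look like the following. ScratchPool and its functions are hypothetical names, not part of the library in this answer, and the sketch assumes every bignum occupies the same fixed number of bytes (a multiple of the required alignment).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical pool: one malloc'd block carved into equally sized slots.
struct ScratchPool {
    unsigned char *base;      // the only pointer ever passed to free()
    std::size_t slot_bytes;   // size of each slot in bytes
    std::size_t slots;        // number of slots
};

bool pool_init(ScratchPool &pool, std::size_t slots, std::size_t slot_bytes)
{
    pool.base = static_cast<unsigned char *>(std::malloc(slots * slot_bytes));
    pool.slot_bytes = slot_bytes;
    pool.slots = slots;
    return pool.base != nullptr;
}

// Returns the start of slot `index`; the slot must not be freed on its own.
unsigned char *pool_slot(const ScratchPool &pool, std::size_t index)
{
    assert(index < pool.slots);
    return pool.base + index * pool.slot_bytes;
}

void pool_destroy(ScratchPool &pool)
{
    std::free(pool.base);     // releases every slot at once
    pool.base = nullptr;
}

// Returns the byte distance between slot 0 and slot 2, or -1 on failure.
std::ptrdiff_t pool_demo()
{
    ScratchPool pool;
    if (!pool_init(pool, 5, 64))
        return -1;
    unsigned char *numerator = pool_slot(pool, 0);
    unsigned char *accum = pool_slot(pool, 2);
    std::ptrdiff_t distance = accum - numerator;
    pool_destroy(pool);
    return distance;
}
```

The trade-off is the one discussed in the other answers: all slots share one lifetime, so none of them can be released early.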

Related

How to release the memory outside of a function if I forgot to release the memory before the function returns

my question can be briefly shown as the following example.
void func(int n){
char *p = (char*)malloc(n);
// some codes
memset(p,0,n); // zero the n bytes that were just allocated
// free(p); // Commenting this line represents that I forget to release the allocated memory.
}
int main(){
// some codes
for (int i = 0; i < Nl; i++){
func(100);
// How can I release the allocated memory of p outside of the func?
}
}
I wish to release the memory that was allocated inside a function from outside of that function.
Thank you.
The pointer in question is not returned from the function in any way, so if you don't free it in the function then the memory is leaked. You would need to modify the function to either assign the pointer to a global, return it from the function, or assign it to a dereferenced pointer passed to the function.
If your goal is to find and fix memory leaks in your program, there are tools such as valgrind which can help you with that.
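Of the three options listed above, the "dereferenced pointer passed to the function" variant is probably the least familiar, so here is a small sketch of it. The names make_buffer and use_buffer are hypothetical, not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical variant of func(): the buffer is handed back through `out`,
// so the caller can free it later.
bool make_buffer(std::size_t n, char **out)
{
    assert(out != nullptr);
    char *p = static_cast<char *>(std::malloc(n));
    if (p == nullptr) {
        *out = nullptr;
        return false;
    }
    std::memset(p, 0, n);
    *out = p;               // the caller now owns the memory
    return true;
}

// Returns true if the buffer was allocated, zero-filled and released.
bool use_buffer()
{
    char *buf = nullptr;
    if (!make_buffer(100, &buf))
        return false;
    bool zeroed = (buf[0] == 0 && buf[99] == 0);
    std::free(buf);         // released outside of the allocating function
    return zeroed;
}
```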
To solve the problem, it's better to use std::unique_ptr. If you use a smart pointer, the memory is released automatically as soon as the pointer goes out of scope.
For example:
void my_func()
{
std::unique_ptr<int> valuePtr(new int(15));
int x = 45;
// ...
if (x == 45)
return; // no memory leak anymore!
// ...
}
See this link for more:
https://en.cppreference.com/book/intro/smart_pointers
Two solutions in C
a) use VLA
void foo(int n) {
char p[n];
// use p
// no need to free, p automatically releases its memory
}
b) return the pointer to the caller
char *foo(int n) {
char *p = malloc(n);
// use p
return p;
}
int main(void) {
char *bar = foo(100);
free(bar);
}
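Variable-length arrays are a C feature and are not part of standard C++. If the code is compiled as C++, a rough counterpart of option (a) could use std::vector instead. The function name below is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C++ counterpart of foo(): the vector owns the buffer.
std::size_t foo_vector(std::size_t n)
{
    std::vector<char> p(n);     // n zero-initialized chars
    assert(p.size() == n);
    if (!p.empty())
        p[0] = 'x';             // use p like an array
    return p.size();
    // no free needed: p's destructor releases the storage on return
}
```

Unlike the VLA, the vector's storage typically comes from the heap, so large n does not risk overflowing the stack.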
If I understand you correctly, you are struggling with the basic concept of memory management.
Looking at your code, you are using malloc, which is a core part of C memory management, although you tagged the question as both C and C++. I could go into depth on why there is no such thing as C/C++, but it is better explained here: https://cor3ntin.github.io/posts/c/
One of the things C and C++ programmers disagree on is the use of malloc, which is standard in C and only used in exceptional cases in C++.
If we look from a C++ standpoint, I'd argue you should be learning it with a recent version. Here the answer is simple: use std::make_unique:
auto p = std::make_unique<char[]>(n);
Or in this case as you are trying to do something with strings, just use std::string. Trust me, doing so will prevent a lot of grief. Let me also remark that you often don't need memory allocations, more about that can be found here: https://stackoverflow.com/a/53898150/2466431
If, however, you are not programming in C++, you can use malloc. Here it is important to understand that every pointer returned by malloc ends up as an argument to free. (The exceptions to this are not for beginners.)
After you call free, you can use neither what the pointer points to nor the value stored in the pointer. Calling free twice on the same pointer is also an error.
Hence, unlike in the C++ code, where the memory gets freed when it is no longer used, you need to pay close attention to this and call free yourself.
In your function, uncommenting the free is the correct solution.
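One common way to make the "free exactly once" rule harder to break is to reset the pointer right after freeing it. This is only a sketch: release is a hypothetical helper, it is written in C++ (in C it would take a char ** instead of a reference), and it protects only this one pointer variable, not other copies of the same address.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: frees the block and resets the caller's pointer.
void release(char *&p)
{
    std::free(p);   // free(nullptr) is defined to do nothing
    p = nullptr;
}

// Returns true if releasing the same variable twice is harmless.
bool release_twice_is_safe()
{
    char *p = static_cast<char *>(std::malloc(100));
    release(p);
    assert(p == nullptr);
    release(p);     // second call sees nullptr, so there is no double free
    return p == nullptr;
}
```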
If you intend to let the data outlive the function call, you should return the pointer to the caller, who is then responsible for the ownership:
char * func(int n){
char *p = (char*)malloc(n);
// some codes
memset(p,0,n); // zero the n bytes that were just allocated
return p;
}
int main(){
// some codes
for (int i = 0; i < Nl; i++){
char *s = func(100);
free(s);
}
}
Let me show the same with the previously mentioned C++:
#include <memory>
std::unique_ptr<char[]> func(int n){
auto p = std::make_unique<char[]>(n);
// some codes
return p;
}
int main(){
// some codes
for (int i = 0; i < Nl; i++){
auto s = func(100);
}
}
Or using std::string
#include <string>
std::string func(int n){
auto p = std::string(n, '\0');
// some codes
return p;
}
int main(){
// some codes
for (int i = 0; i < Nl; i++){
auto s = func(100);
}
}

Difference between creating a pointer with 'new' and without 'new' apart from memory allocation?

What is the difference between these pointers?
I know that this one is going to be stored on the heap, even though a pointer is only 8 bytes anyways, so the memory is not important for me.
int* aa = new int;
aa = nullptr;
and this one is going to be stored on the stack.
int* bb = nullptr;
They both seem to work the same in my program. Is there any difference apart from memory allocation? I have a feeling that the second one is bad for some reason.
2) Another question which is somewhat related:
Does creating a pointer like that actually take more memory? If we take a look at the first snippet, it creates an int somewhere (4 bytes) and then creates a pointer to it (8 bytes), so is it 12 bytes in total? If yes are they both in the heap then? I can do this, so it means an int exists:
*aa = 20;
Pointers are values that hold a memory address, and they carry a type (so they can only point to objects of that type).
So in your examples, all pointers are stored in the stack (unless they are global variables, but that is another question). What they are pointing to is in the heap, as in the next example.
void foo()
{
int * ptr = new int(42);
// more things...
delete ptr;
}
You can have a pointer pointing into the stack, for example, this way:
void foo()
{
int x = 5;
int * ptr = &x;
// more things...
}
The '&' operator obtains the memory position of the variable x in the example above.
nullptr is the typed equivalent of the old NULL. Both are a way to initialize a pointer to a known, safe value, meaning that it does not point to anything, and that you can test whether it is null.
The program will accept pointers pointing to the stack or the heap: it does not matter.
void addFive(int * x)
{
*x += 5;
}
void foo()
{
int x = 5;
int * ptr1 = &x;
int * ptr2 = new int(42);
addFive( ptr1 );
addFive( ptr2 );
addFive( &x );
printf( "%d\n", *ptr1 );
printf( "%d\n", *ptr2 );
// more things...
delete ptr2;
}
The only difference is that the C runtime will keep structures telling how much memory has been spent in the heap, and therefore storing variables in the heap comes at a cost in performance. On the other hand, the stack is always limited to a fixed amount of memory (relatively small), while the heap is much larger, allowing you to store big arrays, for example.
You could take a look at C-Sim, which simulates memory in C (disclaimer: I wrote it).
Hope this helps.
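One detail of the question's first snippet is worth spelling out: assigning nullptr to aa without a delete drops the only pointer to the heap int, so that int is leaked. A minimal sketch of both snippets, with the leak avoided, could look like this (the function names are made up for illustration):

```cpp
#include <cassert>

// First snippet from the question, with the int released before the
// pointer is reset.
int heap_int_demo()
{
    int *aa = new int;      // aa is a local variable; the int lives on the heap
    assert(aa != nullptr);  // plain new throws std::bad_alloc instead of returning null
    *aa = 20;
    int value = *aa;
    delete aa;              // without this, `aa = nullptr` would leak the int
    aa = nullptr;
    return value;
}

// Second snippet: only a pointer exists, no int is ever created.
bool null_pointer_demo()
{
    int *bb = nullptr;
    return bb == nullptr;   // writing `*bb = 20` here would be undefined behavior
}
```

So the first form costs one pointer plus one heap int, while the second costs only the pointer.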

Why pointer to pointer?

A very general question: I was wondering why we use pointer to pointer?
A pointer to pointer will hold the address of a pointer which in turn will point to another pointer. But, this could be achieved even by using a single pointer.
Consider the following example:
int main()
{
int number = 10;
int *a = NULL;
a = &number;
int *b = a;
int *pointer1 = NULL;
pointer1 = b; //pointer1 points to the address of number which has value 10
int **pointer2 = NULL;
pointer2 = &b; //pointer2 points to the address of b which in turn points to the address of number which has value 10. Why **pointer2??
return 0;
}
I think you answered your own question, the code is correct, what you commented isn't.
int number = 10; is the value
int *pointer1 = b; points to the address where int number is kept
int **pointer2 = &b; points to the address where address of int number is kept
Do you see the pattern here??
address = * (single indirection)
address of address = ** (double indirection)
The following expressions are true:
*pointer2 == b
**pointer2 == 10
The following is not!
*pointer2 == 10
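The same relations can be checked mechanically. The sketch below reuses the variable names from the question inside a hypothetical function:

```cpp
#include <cassert>

// Returns true if a write through both levels of indirection reaches number.
bool check_indirection()
{
    int number = 10;
    int *b = &number;
    int **pointer2 = &b;

    assert(*pointer2 == b);     // one dereference yields the pointer b
    assert(**pointer2 == 10);   // two dereferences yield the int
    // `*pointer2 == 10` is ill-formed in C++: it compares an int* with an int.

    **pointer2 = 11;            // writes through both levels into number
    return number == 11;
}
```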
Pointer to pointer can be useful when you want to change to what a pointer points to outside of a function. For example
#include <iostream>
void func(int** ptr)
{
*ptr = new int;
**ptr = 1337;
}
int main()
{
int* p = NULL;
func(&p);
std::cout << *p << std::endl; // writes 1337 to console
delete p;
}
A stupid example to show what can be achieved :) With just a pointer this can not be done.
First of all, a pointer doesn't point to a value. It points to a memory location (that is, it contains a memory address), which in turn contains a value. So when you write
pointer1 = b;
pointer1 points to the same memory location as b, which is the variable number. If you then execute
pointer2 = &b;
then pointer2 points to the memory location of b, which doesn't contain 10 but the address of the variable number.
Your assumption is incorrect. pointer2 does not point to the value 10, but to the (address of the) pointer b. Dereferencing pointer2 with the * operator produces an int *, not an int.
You need pointers to pointers for the same reasons you need pointers in the first place: to implement pass-by-reference parameters in function calls, to effect sharing of data between data structures, and so on.
In C such a construction made sense with bigger data structures. When doing OOP in C, structures cannot have methods, so the equivalent of the C++ this parameter was passed explicitly. Also, some structures were identified by a pointer to one specially selected element, which was held in a scope global to the methods.
So when you wanted to pass a whole structure, e.g. a tree or a list, and needed to change the root or the first element, you passed a pointer-to-a-pointer to this special root/head element, so you could change it.
Note: This is a C-style implementation using C++ syntax for convenience.
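void add_element_to_list(List** list, Data element){
    // Assumes List is a node type such as: struct List { Data value; List* next; };
    List* new_el = new List{element, *list}; // in C this would be malloc plus a struct copy
    *list = new_el; // move the head of the list so that it begins at the new element
}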
In C++ there is the reference mechanism, and you can generally implement nearly anything with it. It makes the use of pointers obsolete in C++, at least in many, many cases. You also design objects and work on them, and everything is hidden under the hood of those two.
There was also a nice question lately "Why do we use pointers in c++?" or something like that.
A simple example is an implementation of a matrix (it's an example, it's not the best way to implement matrices in C++).
int nrows = 10;
int ncols = 15;
double** M = new double*[nrows];
for(int i = 0; i < nrows; ++i)
M[i] = new double[ncols];
M[3][7] = 3.1416;
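Since the answer notes that this is not the best way to implement a matrix, here is a sketch of one common alternative: a single contiguous buffer indexed in row-major order. The Matrix class is hypothetical and deliberately minimal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix: one contiguous allocation, row-major layout.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // Element (r, c) is stored at offset r * cols_ + c.
    double &at(std::size_t r, std::size_t c)
    {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;   // freed automatically by the destructor
};

// Writes one element and reads it back.
double matrix_demo()
{
    Matrix m(10, 15);
    m.at(3, 7) = 3.1416;
    return m.at(3, 7);
}
```

This needs one allocation instead of nrows + 1, and no manual delete[] at all.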
You'll rarely see this construct in normal C++ code, since C++ has references. It's useful in C for "passing by reference:"
int allocate_something(void **p)
{
*p = malloc(whatever);
if (*p)
return 1;
else
return 0;
}
The equivalent C++ code would use void *&p for the parameter.
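A sketch of that reference-to-pointer variant follows. The bytes parameter is an assumption standing in for the "whatever" placeholder above, and allocate_demo is a made-up caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// C++ flavor of allocate_something(): the pointer is passed by reference,
// so the function can change what the caller's pointer points to.
bool allocate_something(void *&p, std::size_t bytes)
{
    assert(bytes > 0);
    p = std::malloc(bytes);
    return p != nullptr;
}

// Returns true if the caller's pointer was set by the call.
bool allocate_demo()
{
    void *buffer = nullptr;
    bool ok = allocate_something(buffer, 64);   // no & needed at the call site
    bool changed = (buffer != nullptr);
    std::free(buffer);
    return ok && changed;
}
```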
Still, you could imagine e.g. a resource monitor like this:
struct Resource;
struct Holder
{
Resource *res;
};
struct Monitor
{
    Resource *const *res; // h is const below, so &h.res is a Resource *const *
    void monitor(const Holder &h) { res = &h.res; }
    Resource& getResource() const { return **res; }
};
Yes, it's contrived, but the idea's there - it will keep a pointer to the pointer stored in a holder, and correctly return that resource even when the holder's res pointer changes.
Of course, it's a dangling dereference waiting to happen - normally, you'd avoid code like this.

How to avoid dynamic allocation of memory C++

[edit] Outside of this get method (see below), i'd like to have a pointer double * result; and then call the get method, i.e.
// Pull results out
int story = 3;
double * data;
int len;
m_Scene->GetSectionStoryGrid_m(story, data, len);
With that said, I want a get method that simply sets the result (*&data) by reference and does not dynamically allocate memory.
The results I am looking for already exist in memory, but they are within C-structs and are not in one continuous block of memory. Fyi, &len is just the length of the array. I want one big array that holds all of the results.
Since the actual results that I am looking for are stored within the native C-struct pointer story_ptr->int_hv[i].ab.center.x;. How would I avoid dynamically allocating memory like I am doing above? I’d like to point the data* to the results, but I just don’t know how to do it. It’s probably something simple I am overlooking… The code is below.
Is this even possible? From what I've read, it is not, but as my username implies, I'm not a software developer. Thanks to all who have replied so far by the way!
Here is a snippet of code:
void GetSectionStoryGrid_m( int story_number, double *&data, int &len )
{
std::stringstream LogMessage;
if (!ValidateStoryNumber(story_number))
{
data = NULL;
len = -1;
}
else
{
// Check to see if we already retrieved this result
if ( m_dStoryNum_To_GridMap_m.find(story_number) == m_dStoryNum_To_GridMap_m.end() )
{
data = new double[GetSectionNumInternalHazardVolumes()*3];
len = GetSectionNumInternalHazardVolumes()*3;
Story * story_ptr = m_StoriesInSection.at(story_number-1);
int counter = 0; // counts the current int hv number we are on
for ( int i = 0; i < GetSectionNumInternalHazardVolumes() && story_ptr->int_hv != NULL; i++ )
{
data[0 + counter] = story_ptr->int_hv[i].ab.center.x;
data[1 + counter] = story_ptr->int_hv[i].ab.center.y;
data[2 + counter] = story_ptr->int_hv[i].ab.center.z;
m_dStoryNum_To_GridMap_m.insert( std::pair<int, double*>(story_number,data));
counter += 3;
}
}
else
{
data = m_dStoryNum_To_GridMap_m.find(story_number)->second;
len = GetSectionNumInternalHazardVolumes()*3;
}
}
}
Consider returning a custom accessor class instead of the "double *&data". Depending on your needs that class would look something like this:
class StoryGrid {
public:
    StoryGrid(int story_index) : m_storyIndex(story_index) {
        m_storyPtr = m_StoriesInSection.at(story_index-1);
    }
    inline int length() { return GetSectionNumInternalHazardVolumes()*3; }
    double &operator[](int index) {
        int i = index / 3;      // which hazard volume
        int axis = index % 3;   // 0 = x, 1 = y, 2 = z
        switch(axis){
        case 0: return m_storyPtr->int_hv[i].ab.center.x;
        case 1: return m_storyPtr->int_hv[i].ab.center.y;
        default: return m_storyPtr->int_hv[i].ab.center.z; // axis == 2
        }
    }
private:
    int m_storyIndex;   // 1-based story number
    Story *m_storyPtr;  // points into the existing C structs; nothing is copied
};
Sorry for any syntax problems, but you get the idea. Return a reference to this and record this in your map. If done correctly, the map will then manage all of the dynamic allocation required.
So you want the allocated array to go "down" in the call stack. You can only achieve this by allocating it on the heap, using dynamic allocation, or by creating a static variable, since the lifetime of static variables is not controlled by the call stack.
void GetSectionStoryGrid_m( int story_number, double *&data, int &len )
{
static double g_data[DATA_SIZE];
data = g_data;
// continues ...
If you want to "avoid any allocation", the solution by @Speed8ump is your first choice! But then you will not have your double * result; anymore. You will be turning your "offline" solution (calculate the whole array first, then use the array elsewhere) into an "online" solution (calculate values as they are needed). This is a good refactoring to avoid memory allocation.
The answer to this question depends on the lifetime of the doubles you want pointers to. Consider:
// "pointless" because it takes no input and throws away all its work
void pointless_function()
{
double foo = 3.14159;
int j = 0;
for (int i = 0; i < 10; ++i) {
j += i;
}
}
foo exists and has a value inside pointless_function, but ceases to exist as soon as the function exits. Even if you could get a pointer to it, that pointer would be useless outside of pointless_function. It would be a dangling pointer, and dereferencing it would trigger undefined behavior.
On the other hand, you are correct that if you have data in memory (and you can guarantee it will live long enough for whatever you want to do with it), it can be a great idea to get pointers to that data instead of paying the cost to copy it. However, the main way for data to outlive the function that creates it is to call new, new[], or malloc. You really can't get out of that.
Looking at the code you posted, I don't see how you can avoid new[]-ing up the doubles when you create story. But you can then get pointers to those doubles later without needing to call new or new[] again.
I should mention that pointers to data can be used to modify the original data. Often that can lead to hard-to-track-down bugs. So there are times that it's better to pay the price of copying the data (which you're then free to muck with however you want), or to get a pointer-to-const (in this case const double* or double const*, they are equivalent; a pointer-to-const will give you a compiler error if you try to change the data being pointed to). In fact, that's so often the case that the advice should be inverted: "there are a few times when you don't want to copy or get a pointer-to-const; in those cases you must be very careful."
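To illustrate the pointer-to-const point, here is a minimal sketch (the function and array names are hypothetical): the function can read the caller's data without copying it, and any attempt to modify it is rejected at compile time.

```cpp
#include <cassert>
#include <cstddef>

// Reads `count` doubles through a pointer-to-const; the data is not copied.
double sum(const double *values, std::size_t count)
{
    assert(values != nullptr || count == 0);
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        total += values[i];
    // values[0] = 0.0;   // would not compile: values points to const double
    return total;
}

// Passes existing data without allocating or copying anything.
double sum_demo()
{
    double centers[3] = {1.0, 2.0, 3.0};
    return sum(centers, 3);
}
```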

Initialization of c++ heap objects

I am wondering whether built-in types in objects created on the heap with new will be initialized to zero. Is it mandated by the standard, or is it compiler specific?
Given the following code:
#include <iostream>
using namespace std;
struct test
{
int _tab[1024];
};
int main()
{
test *p(new test);
for (int i = 0; i < 1024; i++)
{
cout << p->_tab[i] << endl;
}
delete p;
return 0;
}
When run, it prints all zeros.
You can choose whether you want default-initialisation, which leaves fundamental types (and POD types in general) uninitialised, or value-initialisation, which zero-initialises fundamental (and POD) types.
int * garbage = new int[10]; // No initialisation
int * zero = new int[10](); // Initialised to zero.
This is defined by the standard.
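Applied to the struct from the question, value-initialization is just a pair of parentheses away. The sketch below uses a hypothetical Table struct that mirrors the question's test struct, and it only inspects the value-initialized objects, because reading the default-initialized ones would be undefined behavior.

```cpp
#include <cassert>

// Mirrors the struct from the question (names changed for illustration).
struct Table {
    int tab[1024];
};

// Returns true after checking that value-initialized objects are all zero.
bool value_init_is_zero()
{
    Table *p = new Table();         // value-initialization: every element is 0
    for (int i = 0; i < 1024; ++i)
        assert(p->tab[i] == 0);
    delete p;

    int *zero = new int[10]();      // same idea for a plain array
    for (int i = 0; i < 10; ++i)
        assert(zero[i] == 0);
    delete[] zero;
    return true;
}
```

With `new Table` (no parentheses) the zeros printed in the question are not guaranteed; they may simply be what happened to be in that memory.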
No, if you do something like this:
int *p = new int;
or
char *p = new char[20]; // array of 20 bytes
or
struct Point { int x; int y; };
Point *p = new Point;
then the memory pointed to by p will have indeterminate/uninitialized values.
However, if you do something like this:
std::string *pstring = new std::string();
Then you can be assured that the string will have been initialized as an empty string, but that is because of how class constructors work, not because of any guarantees about heap allocation.
It's not mandated by the standard. The memory for the primitive type members may contain any value that was last left in memory.
Some compilers I guess may choose to initialize the bytes. Many do in debug builds of code. They assign some known byte sequence to give you a hint when debugging that the memory wasn't initialized by your program code.
Using calloc will return bytes initialized to 0, but that is a property of calloc itself, not of new. calloc has been around since C, along with malloc. However, you pay some run-time overhead for the zeroing.
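A small sketch of the calloc behavior described above (the function name is made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Returns true if every int in a calloc'd array of n elements reads as 0.
bool calloc_is_zeroed(std::size_t n)
{
    assert(n > 0);
    int *v = static_cast<int *>(std::calloc(n, sizeof *v));
    if (v == nullptr)
        return false;           // allocation failed
    bool all_zero = true;
    for (std::size_t i = 0; i < n; ++i)
        if (v[i] != 0)
            all_zero = false;
    std::free(v);               // calloc'd memory is released with free()
    return all_zero;
}
```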
The advice given previously about using the std::string is quite sound, because after all, you're using the std, and getting the benefits of class construction/destruction behaviour. In other words, the less you have to worry about, like initialization of data, the less that can go wrong.