Using the std::realloc function:
If the new size is smaller, is it guaranteed that the memory block will stay at the same position and only be shrunk, or can the whole block sometimes be moved?
The reason for asking is that we are writing a large and fairly complex codebase, and it is useful to make every variable that must stay unchanged read-only (const), so the compiler reports an error when we try to modify the wrong variable.
#include <cstdlib>
#include <iostream>
using namespace std;

int main(){
    // From 10,000,000 unsigned ints to 10 unsigned ints
    unsigned int * const array = new unsigned int[10000000];
    cout << array << endl;
    realloc(array, 10 * sizeof(unsigned int));
    cout << array << endl;
    delete array;
    return 0;
}
Although I agree with the other answers that you should not depend on it, there is an answer to be found in the glibc source. (I am assuming you are using glibc, as you have not (yet) answered my comment asking which C library you are using.)
EDIT: Using realloc on memory allocated by new is indeed disallowed, as other answers have mentioned.
Memory allocated without internally using mmap
If a block of memory is not allocated with mmap, __libc_realloc calls the _int_realloc function, which contains the following snippet of code:
if ((unsigned long) (oldsize) >= (unsigned long) (nb))
{
/* already big enough; split below */
newp = oldp;
newsize = oldsize;
}
This makes the pointer to the new memory equal the pointer to the old memory and sets the size accordingly. Note the split below comment; the old memory block may be resized to the requested size, but is not moved.
Memory allocated internally using mmap
If the memory was internally allocated using mmap, there are two ways of resizing the memory area: mremap_chunk, or a series of calls to malloc, memcpy and free. If the mremap_chunk function is available, it is used in preference to the latter option.
Memory reallocated using mremap_chunk
The function mremap_chunk contains this snippet of code
/* No need to remap if the number of pages does not change. */
if (size + offset == new_size)
return p;
If the number of pages does not change from the old size to the new size, there is no need to remap and the old pointer is returned.
Memory reallocated using malloc, memcpy and free
If mremap_chunk is not available, the __libc_realloc source continues with the following:
/* Note the extra SIZE_SZ overhead. */
if (oldsize - SIZE_SZ >= nb)
return oldmem; /* do nothing */
If the old size minus the bookkeeping overhead (SIZE_SZ) is greater than or equal to the new size, the old memory is simply returned.
So there we are: in all of these cases, glibc returns a pointer to the old memory and does not move it (though it may shrink the block). If you are using glibc (and can somehow guarantee that glibc is the only C library your program will ever be built against, and that this will not change at some point in the future), then you can rely on realloc not moving a block of memory when the requested size is less than or equal to the old size.
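If you want to see what your particular implementation does, a small experiment along these lines (a sketch, using malloc so that realloc is actually permitted on the block) prints the addresses before and after shrinking; it only tells you what happened on that one run, not what is guaranteed:

#include <cstdio>
#include <cstdlib>

int main()
{
    // Allocate with malloc, so that realloc on this block is legal.
    void *big = std::malloc(10000000 * sizeof(unsigned int));
    if (big == nullptr)
        return 1;
    std::printf("before shrink: %p\n", big);

    // Shrink the block; always capture realloc's return value.
    void *smaller = std::realloc(big, 10 * sizeof(unsigned int));
    if (smaller == nullptr) {
        std::free(big);     // realloc failed, so the old block is still ours
        return 1;
    }
    std::printf("after shrink:  %p\n", smaller);  // often equal on glibc, never guaranteed
    std::free(smaller);
    return 0;
}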
No! If realloc succeeds, the old pointer value (unless it was a null pointer) is indeterminate.
Also, do not mix incompatible memory-management functions (assume incompatibility unless guaranteed otherwise).
realloc only has the guarantees explicitly given in the standard:
If the return value is non-null: the new pointer points to a block of at least size bytes, the first min(oldsize, newsize) bytes being equal to the contents of the passed block.
Otherwise, if size is non-zero, nothing happened to the passed block.
Otherwise (size was zero), the old block may or may not have been deallocated.
Moral: never pass a size of 0 to realloc, and only use the old pointer for anything (including comparing it to the new pointer) if realloc failed (or if you passed a null pointer).
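A minimal sketch of the usage pattern these guarantees force on you (assuming the block came from malloc; the function name here is made up for illustration). The old pointer value is only relied upon on the failure path:

#include <cstdlib>

// Resizes a malloc'd buffer in a standard-conforming way.
// Returns false and leaves buf untouched if realloc fails.
bool resize_buffer(void *&buf, std::size_t new_size)
{
    if (new_size == 0)
        return false;           // never pass 0 to realloc

    void *tmp = std::realloc(buf, new_size);
    if (tmp == nullptr)
        return false;           // buf is still valid and unchanged

    buf = tmp;                  // from here on, the old pointer value must not be used
    return true;
}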
7.22.3.5 The realloc function
#include <stdlib.h>
void *realloc(void *ptr, size_t size);
2 The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.
3 If ptr is a null pointer, the realloc function behaves like the malloc function for the specified size. Otherwise, if ptr does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to the free or realloc function, the behavior is undefined. If memory for the new object cannot be allocated, the old object is not deallocated and its value is unchanged.
Returns
4 The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated.
C99 draft 7.20.3.4 says:
[#4] The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated.
You should not assume it.
And also: don't mix new and realloc as πάντα already wrote in the comments.
Nothing's guaranteed about realloc. It might shrink the block in place, or it might allocate a new one and copy the data. It might also fail.
An important point: realloc is only for reallocating memory that was previously allocated with malloc (or calloc/realloc). In your code above you are using new, which has no equivalent for reallocation.
Also, realloc returns the address of the (possibly moved) memory block; since your code discards that return value, you are a) leaking the new block and b) referencing/freeing memory that may already have been deallocated.
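For comparison, here is roughly what the snippet from the question would look like if it used the malloc family throughout, so that realloc is legal and its return value is not discarded (a sketch, not a drop-in replacement):

#include <cstdio>
#include <cstdlib>

int main()
{
    // From 10,000,000 unsigned ints to 10 unsigned ints, malloc-style.
    unsigned int *array =
        static_cast<unsigned int *>(std::malloc(10000000 * sizeof(unsigned int)));
    if (array == nullptr)
        return 1;
    std::printf("%p\n", static_cast<void *>(array));

    // Shrink to 10 elements; the block may or may not move.
    unsigned int *smaller =
        static_cast<unsigned int *>(std::realloc(array, 10 * sizeof(unsigned int)));
    if (smaller == nullptr) {
        std::free(array);       // shrinking failed; the original block is still valid
        return 1;
    }
    array = smaller;            // keep the returned pointer, never the old one

    std::printf("%p\n", static_cast<void *>(array));
    std::free(array);           // free, not delete, for malloc'd memory
    return 0;
}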
Related
I have the following piece of code.
char* p = malloc(10);
p = p + 1;
free(p);
In the above code,
How does malloc return the memory address when I call malloc(10)?
How many bytes will be deallocated by free(p)?
How does free() know how many bytes to deallocate?
As the man page for free will tell you, any argument except a pointer returned from malloc has undefined behaviour:
The free() function frees the memory space pointed to by ptr, which must have been returned by a previous call to malloc(), calloc() or realloc(). Otherwise, or if free(ptr) has already been called before, undefined behavior occurs. If ptr is NULL, no operation is performed
Regarding how free knows the size of the block: a typical memory allocator implementation has a header for each block (containing size, freelist pointers, etc.) and free knows the size of this header and the offset from the pointer returned by malloc.
This also answers your first question: malloc allocates such a block and returns a pointer to the start of the actual object.
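The details vary between allocators, but the "header in front of the block" idea looks roughly like the following toy sketch (a hypothetical layout, not glibc's actual one, and ignoring alignment subtleties; the names are made up):

#include <cstddef>
#include <cstdlib>

// Toy illustration of a size header stored just before the user's block.
struct BlockHeader {
    std::size_t size;       // usable size of the block that follows
};

void *toy_malloc(std::size_t n)
{
    BlockHeader *h =
        static_cast<BlockHeader *>(std::malloc(sizeof(BlockHeader) + n));
    if (h == nullptr)
        return nullptr;
    h->size = n;
    return h + 1;           // the caller gets the memory just past the header
}

void toy_free(void *p)
{
    if (p == nullptr)
        return;
    BlockHeader *h = static_cast<BlockHeader *>(p) - 1;  // step back to the header
    // h->size is how this toy allocator "knows" how large the block is.
    std::free(h);
}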
This question was asked as part of Does delete[] deallocate memory in one shot after invoking destructors? but moved out as a separate question.
It seems (correct me if I am wrong) that the only difference between delete and delete[] is that delete[] will get the array size information and invoke the destructors on all of the elements, while delete will destruct only the first one. In particular, delete also has access to the information about how much total memory was allocated by new[].
If one doesn't care about destructing the dynamically allocated array elements, and only cares that the memory allocated either by new or new[] be deallocated, delete seems to be able to do the same job.
The accepted answer to the question How does delete[] "know" the size of the operand array? has a comment from @AnT, which I quote:
Also note that the array element counter is only needed for types with non-trivial destructor. For types with trivial destructor the counter is not stored by new[] and, of course, not retrieved by delete[]
This comment suggests that in general the delete expression knows the size of the entire allocation and therefore knows how much memory to deallocate in one shot at the end, even if that memory holds an array of elements. So if one writes
auto pi = new int[10];
...
delete pi;
Even though the standard deems this UB, on most implementations this should not leak memory (albeit it is not portable), right?
Under the C++ standard, calling delete on something allocated with new[] is simply undefined behavior, as is calling delete[] on something allocated with new.
In practice, new[] will allocate the memory through something like malloc, as will new. delete will destroy the pointed-to object, then hand the memory to something like free. delete[] will destroy all of the objects in the array, then hand the memory to something like free. new[] may or may not allocate some extra memory in order to pass the number of elements to be destroyed on to delete[].
If actual malloc/free is used, then some implementations will happen to tolerate a pointer to anywhere in the malloc'd block; others won't. For the behaviour to be defined, the exact same value must be passed to free that you got from malloc. The issue here is that if new[] malloc'd some extra room for the array size/element stride and stuck it before the block, then delete is handed the pointer to the first element, and delete will pass free a different pointer than the one new[] got back from malloc. (I think there is an architecture where something like this happens.)
As with most undefined behavior, you can no longer rely on auditing the code you write; instead you are committed to auditing the produced assembly and the C/C++ standard library you interact with before you can determine whether the behavior you want is correct. In practice that burden will never be met, so your code ends up having negative value, even if things worked the way you expected the one time you actually checked. How will you ensure that an identical check (of the resulting binary and its behavior) occurs every time the compiler version, standard library version, OS version, or system libraries change?
This is correct. The difference between delete and delete[] is that the latter knows the number of items allocated in the array and calls the destructor on each of them. To be 100% correct, both actually 'know' it: the number of items allocated for an array is equal to the allocated memory size (which both know) divided by the size of the object.
One might ask why we then need both delete[] and delete; why can't delete perform the same calculation? The answer is polymorphism. When deletion is done through a pointer to the base class, the size of the allocated memory will not be equal to the sizeof of the static object.
On the other hand, delete[] does not take into account the possibility that the objects are polymorphic, which is why dynamically allocated arrays should never be treated as polymorphic objects (i.e. allocated and stored through a pointer to the base class).
As for leaking memory, delete will not leak memory when used on arrays of POD types.
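To make the polymorphism point above concrete, here is a sketch (with made-up types, purely for illustration) of the situation that warning is about:

struct Base {
    virtual ~Base() {}
    int b;
};

struct Derived : Base {
    int extra;                       // Derived is bigger than Base
};

int main()
{
    Base *single = new Derived;      // fine: the virtual destructor handles this
    delete single;

    Base *many = new Derived[10];    // compiles, but the array is now seen through Base*
    // delete[] many;                // undefined behaviour: wrong element stride,
                                     // wrong destructors, wrong pointer for the allocator
    delete[] static_cast<Derived *>(many);  // only correct once the real type is restored
    return 0;
}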
A concrete reason to avoid all constructs provoking undefined behavior, even if you cannot see how they could possibly go wrong, is that the compiler is entitled to assume that undefined behavior never happens. For instance, given this program...
#include <iostream>
#include <cstring>

int main(int argc, char **argv)
{
    if (argc > 0) {
        size_t *x = new size_t[argc];
        for (int i = 0; i < argc; i++)
            x[i] = std::strlen(argv[i]);
        std::cout << x[0] << '\n';
        delete x;
    }
    return 0;
}
... the compiler might emit the same machine code as it would for ...
int main(void) { return 0; }
... because the undefined behavior on the argc > 0 control path means the compiler may assume that path is never taken.
I am working in C++ and have been using pointers a lot lately. I found that there are a few ways to initialize the chunks of memory that I need to use.
#include <cstdlib>

void functioncall(int* i)
{
    *i = *i + 1;
}

int main(){
    int* a = (int*)malloc(sizeof(int));
    int az = 0;
    functioncall(a);
    functioncall(&az);
}
Notice that the first variable, int* a, is declared as a pointer and I then malloc the memory for it. az, on the other hand, is not a pointer; when calling the function I take its address instead.
So, my question is: is there a preferred way, or are there any penalties of one over the other?
int* a = (int*)malloc(sizeof(int));
This allocates memory on the heap. You have to deallocate it on your own, or you'll run into memory leaks. You deallocate it by calling free(a);. This option is definitely slower (since the memory has to be requested and some other bookkeeping has to be done), but the memory remains available until you call free.
int az = 0;
This "allocates" memory on the stack, which means it gets automatically destroyed when you leave the function it is declared (unless for some really rare exceptions). You do not have to tidy up the memory. This option is faster, but you do not have control over when the object gets destroyed.
a is put on the heap, az on the stack. With the heap, you are responsible for freeing the memory; with the stack, it is freed automatically when it goes out of scope. So the answer depends on where you want the data to be placed and whether you still need it after the end of the scope.
PS: You should use new in C++.
In general you should avoid dynamic memory allocation (malloc, calloc, new) when it's reasonably easy to do so: it is slower than stack allocation, but, more importantly, you must remember to manually free (free, delete) the memory obtained by dynamic allocation, otherwise you have memory leaks (as happens in your code).
I'm not sure what you're trying to do, but there is almost never a reason for allocating a single int (nor an array of int, for that matter). And there are at least two errors in your functioncall: first, it fails to check for a null pointer (if the pointer can't be null, pass by reference), and second, it doesn't do anything: it increments the copy of the pointer passed as an argument, and then dereferences the initial value and throws out the value read.
Allocating small variables directly on the stack is generally faster since you don't have to do any heap operations. There's also less chance of pointer-related screwups (e.g., double frees). Finally, you're using less space. Heap overheads aside, you're still moving a pointer and an int around.
The first line (int* a = ...) declares a dynamically allocated variable; this is usually used when you don't know before run time how many variables you need, or whether you need them at all.
The second line (int az = 0) declares an automatic variable; it is what you use most of the time.
int az = 0;
functioncall(&az);
This is okay, as far as behavior is concerned.
int* a = (int*)malloc(sizeof(int));
functioncall(a);
This invokes undefined behaviour (UB) inside the function when you do *i++, because malloc only allocates the memory, it does not initialize it. That means *i is still uninitialized, and reading uninitialized memory invokes UB; that explains why *i++ is UB. And UB is the most dangerous thing in C++, for it means anything can happen.
As for the original question of what to prefer: prefer the automatic variable over the pointer (whether allocated with malloc or new).
Automatic means Fast, Clean and Safe.
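In other words, either initialize the malloc'd int before it is read, or just use the automatic variable; a sketch based on the question's code:

#include <cstdlib>

void functioncall(int *i)
{
    *i = *i + 1;
}

int main()
{
    // Option 1: dynamic allocation, but initialized before it is read.
    int *a = static_cast<int *>(std::malloc(sizeof(int)));
    if (a != nullptr) {
        *a = 0;                 // removes the uninitialized read
        functioncall(a);
        std::free(a);           // pair malloc with free
    }

    // Option 2 (preferred): an automatic variable, no cleanup required.
    int az = 0;
    functioncall(&az);
    return 0;
}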
func(typename* p)
The pointer is passed by value. *p++ is *p combined with p++: it increments the local copy of the pointer, not the pointed-to value. If you change this pointer inside the function, the original outside is not changed.
I am using void *realloc(void *pointer, size_t size); to increase the size of my pointer. How does realloc work?
Does it create a new address space, copy the old value to the new address space and return a pointer to this address? Or does it just allocate more memory and bind it to the old one?
@Iraklis has the right answer: it does the second (if it can get away with it) or the first (but only if it has to).
However, sometimes it can do neither, and will fail. Be careful: If it can't resize your data, it will return NULL, but the memory will NOT be freed. Code like this is wrong:
ptr = realloc(ptr, size);
If realloc returns NULL, the old ptr will never get freed because you've overwritten it with NULL. To do this properly you must do:
void *tmp = realloc(ptr, size);
if(tmp) ptr = tmp;
else /* handle error, often with: */ free(ptr);
On BSD systems, the above is turned into a library function called reallocf, which can be implemented as follows:
void *reallocf(void *p, size_t s)
{
void *tmp = realloc(p, s);
if(tmp) return tmp;
free(p);
return NULL;
}
Allowing you to safely use:
ptr = reallocf(ptr, size);
Note that if realloc has to allocate new space and copy the old data, it frees the old block itself. Only when it fails to resize does it leave your data intact, so that a resize failure can be treated as a recoverable error.
It depends! If it is unable to resize the memory region in place, then it allocates a new memory region, copies the old data and frees the old memory.
You're misusing the term "address space". All of the memory of your process exists within a single address space. The memory not used by your program's code, its global variables, or its stack is known as the "heap". malloc and realloc (and calloc, which is just malloc plus clearing) allocate memory from the heap. Most implementations of realloc will check whether there is enough free space (size bytes) starting at pointer (which must point to a block previously allocated by malloc or realloc; realloc knows how large that block is) and, if so, just increase the size of the block allocated at the location given by pointer and return, with no copying. If there isn't enough space, it will do the equivalent of newptr = malloc(size); memcpy(newptr, pointer, size_of_old_block); free(pointer); return newptr; ... that is, it will allocate a block big enough to hold size bytes, copy the data at pointer to that block, free the old block, and return the address of the new block.
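When it cannot resize in place, the effect is roughly equivalent to the following sketch; a real realloc knows the old block's size from its own bookkeeping, whereas here (made-up helper, for illustration) it has to be passed in explicitly:

#include <cstdlib>
#include <cstring>

// Rough equivalent of realloc's "cannot grow in place" path.
void *grow_by_copy(void *old_block, std::size_t old_size, std::size_t new_size)
{
    void *new_block = std::malloc(new_size);
    if (new_block == nullptr)
        return nullptr;                      // like realloc, the old block stays valid
    std::memcpy(new_block, old_block,
                old_size < new_size ? old_size : new_size);
    std::free(old_block);
    return new_block;
}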
I think the answer is that the behaviour depends on the requested size and the available heap.
From the programmer's perspective, I think all we are guaranteed is that the returned pointer is non-null if the new allocation is successful. The pointer may therefore remain unchanged even though it now points to a larger block of memory of the requested size.
Realloc does not change the size of your pointer; the size of a pointer is always the same on a given architecture. It changes the size of the allocated memory to which your pointer points. The way it works is described here: http://msdn.microsoft.com/en-us/library/xbebcx7d.aspx. In short, yes, it allocates more memory, leaving your content unchanged; if the memory must be moved, it copies the content. Of course, you can also specify a smaller size, in which case it trims the allocated memory, again leaving the content untouched.
I have a function which grows an array when trying to add an element if it is full. Which of the execution blocks is better or faster?
I think my second block (commented out) may be wrong, because after doubling my array I then go back and point to the original.
When creating arrays, does the compiler look for a contiguous block of memory into which the array entirely fits? (On the stack or the heap? I don't fully understand which; though it is important for me to learn, it is irrelevant to the actual question.)
If so, would this mean using the second block could potentially overwrite other information by overwriting adjacent memory? (Since the original would use 20 adjacent blocks of memory, and the latter 40.)
Or would it just mean the location of elements in my array would be split, causing poor performance?
void Grow()
{
    length *= 2;                            // double the size of our stack

    // create a temp pointer to this double-sized array
    int* tempStack = new int[length];

    // loop the same number of times as the original size
    for(int i = 0; i < (length / 2); i++)
    {
        // copy the elements from the original array to the temp one
        tempStack[i] = myStack[i];
    }

    delete[] myStack;      // delete the original pointer and free the memory
    myStack = tempStack;   // make the original point to the new stack

    // Could do the following - but may not get a contiguous memory block,
    // causing overwritten data
#if 0
    int* tempStack = myStack;       // create temp pointer to our current stack
    delete[] myStack;               // delete the original pointer and free memory
    myStack = new int[length *= 2]; // delete not required due to new?
    myStack = tempStack;
#endif
}
The second block wouldn't accomplish what you want at all.
When you do
myStack = new int[length *= 2];
then the system will return a pointer to wherever it happens to allocate the new, larger array.
You then reassign myStack to the old location (which you've already de-allocated!), which means you're pointing at memory that's not allocated (bad!) and you've lost the pointer to the new memory you just allocated (also bad!).
Edit: To clarify, your array will be allocated on the heap. Additionally, the (new) pointer returned by your larger array allocation (new int[foo]) will be a contiguous block of memory, like the old one, just probably in a different location. Unless you go out of bounds, don't worry about "overwriting" memory.
Your second block is incorrect because of this sequence:
int* tempStack = myStack; //create temp pointer to our current stack
delete[] myStack; //delete the original pointer and free memory
tempStack and myStack are both simply pointers to the same block of memory. When you delete[] the pointer in the second line, you no longer have access to that memory via either pointer.
Using C++ memory management, if you want to grow an array, you need to create a new array before you delete the old one and copy over the values yourself.
That said, since you are working with POD, you could use C style memory management which supports directly growing an array via realloc. That can be a bit more efficient if the memory manager realizes it can grow the buffer without moving it (although if it can't grow the buffer in place, it will fall back on the way you grow your array in your first block).
C style memory management is only okay for arrays of POD. For non-POD, you must do the create new array/copy/delete old array technique.
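For a POD element type such as int, that realloc-based variant of Grow could look roughly like this (a sketch with made-up names that assumes the buffer was obtained from malloc/realloc in the first place, never from new[]):

#include <cstdlib>

struct IntStack {
    int        *data;     // must have come from malloc/realloc, never from new[]
    std::size_t length;
};

// Doubles the capacity. Returns false and leaves the stack untouched on failure.
bool Grow(IntStack &s)
{
    std::size_t newLength = s.length * 2;
    int *tmp = static_cast<int *>(std::realloc(s.data, newLength * sizeof(int)));
    if (tmp == nullptr)
        return false;         // s.data is still valid
    s.data   = tmp;           // may be the same address (grown in place) or a new one
    s.length = newLength;
    return true;
}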
This doesn't exactly answer your question, but you shouldn't be doing either one. Generally new[] or delete[] should be avoided in favor of using std::vector. new[] is hard to use because it requires explicit memory management (if an exception is thrown as the elements are being copied, you will need to catch the exception and delete the array to avoid a memory leak). std::vector takes care of this for you, automatically grows itself, and is likely to have an efficient implementation tuned by the vendor.
One argument for using explicit arrays is to have a contiguous block of memory that can be passed to C functions, but that also can be done with std::vector for any non-pathological implementation (and the next version of the C++ standard will require all conforming implementations to support that). (For reference, see http://www.gotw.ca/publications/mill10.htm by Herb Sutter, former convener of the ISO C++ standards committee.)
Another argument against std::vector is the weirdness with std::vector<bool>, but if you need that you can simply use std::vector<char> or std::vector<int> instead. (See: http://www.gotw.ca/publications/mill09.htm)
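For illustration, here is a sketch of how std::vector replaces the hand-rolled array and Grow function entirely, including handing a contiguous buffer to a C-style function (the function here is made up):

#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical C-style API that expects a raw pointer and a length.
void c_function(const int *buf, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        std::printf("%d\n", buf[i]);
}

int main()
{
    std::vector<int> myStack;     // replaces the hand-rolled array + Grow()
    myStack.push_back(42);        // grows automatically, no manual copy/delete[]
    myStack.push_back(7);

    if (!myStack.empty())
        c_function(&myStack[0], myStack.size());   // contiguous storage
    return 0;
}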