I am new to programming and I am trying to understand the difference between
A = (char * ) malloc(sizeof(char)*n);
and
A = (char * ) malloc(sizeof(char));
or
A = new char [n];
and
A = new char;
How much memory does the compiler allocate for this pointer by default when I do not specify the number of objects of a particular data type?
Also when I declare
A = new char [n];
cout << A[n+1];
it does not give me a segmentation fault.
Shouldn't it give a segmentation fault, since I am trying to access memory beyond what has been allocated for the array?
Memory is not "allocated to this pointer", it's allocated and then you get a pointer to the memory.
This:
char *a = malloc(sizeof(char) * n);
is the same as
char *a = malloc(n);
since sizeof(char) is always 1. They both allocate space for n characters worth of data, and return a pointer to the location where the first character can be accessed (or NULL on failure).
Also, the casts are not needed in C; you should not have any.
Since sizeof(char) is 1, the second call is equivalent to:
char *a = malloc(1);
which means it allocates a memory block of size 1. This is of course distinct from the pointer to that memory block (the value that gets stored in the pointer variable a). The pointer is most likely larger than 1 char, but that doesn't affect the size of the block.
The argument to malloc() specifies how many chars to allocate space for.
I ignored the new usage, since that is C++ and the question is tagged C.
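A minimal sketch of the point about sizeof(char), compiled as C++ (hence the cast, which plain C would not need). The helper name make_filled is made up for illustration and is not part of the question or the answer.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: allocates room for n chars, fills the first n - 1
// with 'x' and terminates the string. Returns nullptr on failure or n == 0.
char* make_filled(std::size_t n) {
    static_assert(sizeof(char) == 1, "guaranteed by the language");
    if (n == 0) return nullptr;
    // malloc(n) requests exactly as many bytes as malloc(sizeof(char) * n)
    char* a = static_cast<char*>(std::malloc(n));
    if (a == nullptr) return nullptr;
    std::memset(a, 'x', n - 1);
    a[n - 1] = '\0';
    return a;  // the caller must std::free() the result
}
```

The pointer returned is a separate thing from the block it points to: the block here is n bytes, while the pointer variable has whatever size pointers have on the platform.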
A = (char * ) malloc(sizeof(char)*n);
This allocates space for n characters.
A = (char * ) malloc(sizeof(char));
This allocates memory for 1 character.
Every call to malloc allocates memory in the heap.
The other code is C++, and it does the same thing: new char[n] allocates n characters and new char allocates one, both from dynamic memory. Only the pointer variable A itself lives on the stack if it is a local variable. Accessing A[n+1] may or may not yield a segfault, because A[n+1] can refer to a memory address that your process is allowed to use. A segfault happens only when you go outside the regions of memory your process can access, and that protection works on whole pages rather than on individual allocations. It may be the case that A[n+1] just isn't "invalid enough" to trigger a segfault.
allocate space for N characters (N should be some positive integer value here)
char *ma = (char * ) malloc(N);
char *na = new char [N];
don't forget to release this memory ...
delete [] na;
free(ma);
allocate space for a single character
char *mc = (char * ) malloc(sizeof(char));
char *nc = new char;
Now, as the others have pointed out, you tagged this C, but half your code is C++. If you were writing C, you couldn't use new/delete, and wouldn't need to cast the result of malloc.
Oh, and the reason you don't get a segmentation fault when you read off the end of your array is that this is undefined behaviour. It certainly could cause a SEGV, but it isn't required to check, so may appear to work, at least some of the time, or fail in a completely different way.
Well, the compiler doesn't allocate memory for the data, only for the pointer, which is typically 4 or 8 bytes depending on your architecture.
There is no difference between the first two and the last two in terms of functionality. Most C++ libraries I've seen use malloc internally for new.
When you run the code to allocate n characters and you print out the n + 1th character, you aren't getting a segmentation fault most likely because n isn't a multiple of some number, usually 8 or 16. Here's some code on how it might do that:
void* malloc(size_t size) {
    // round the request up to the next multiple of 8
    size = (size + 7) & ~(size_t)7;
    return _malloc(size); // _malloc stands in for the real internal allocator
}
So, if you requested, say, 5 bytes, malloc would actually allocate, with that code, 8 bytes. So, if you request the 6th byte (n + 1), you would get garbage, but it is still valid memory that your program can access.
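The rounding described above can be checked in isolation. This is only a sketch: round_up_to_8 is a made-up name, and the granularity of 8 is an assumption taken from the answer, since real allocators choose their own.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: rounds a requested size up to the next multiple of 8.
// Adding 7 and clearing the low three bits leaves exact multiples unchanged.
std::size_t round_up_to_8(std::size_t size) {
    return (size + 7) & ~static_cast<std::size_t>(7);
}
```

With this rule a request for 5 bytes becomes 8, which is why reading the 6th byte may land in memory the allocator really did hand out. It is still undefined behaviour as far as the language is concerned.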
Because a segmentation fault related to malloc/free occurs, I would like to convert malloc/free to new/delete.
An error occurred when I converted the malloc/free code as shown below.
Please let me know how to solve it.
(original)
char *only_valid_data = static_cast<char*> (malloc (data_size));
(converted)
char *only_valid_data = new static_cast<char*> [data_size];
Just do
char* only_valid_data = new char[data_size];
to allocate.
To free you do
delete[] only_valid_data;
Important note: When you allocate memory with new[] it will allocate data_size elements, not data_size bytes (like malloc does). The size of an element is the size of the non-pointer base type, in your case e.g. sizeof(*only_valid_data). In this case the element size and the byte size are the same (as sizeof(char) is specified to always be 1), but if the base type is something else it will make a big difference.
For example, if you do
int* int_ptr = new int[10];
then ten integers will be allocated, not ten bytes. This is equivalent to
int* int_ptr = static_cast<int*>(malloc(10 * sizeof(*int_ptr)));
Also note that for complex types, like structures and classes, allocating with new (or new[]) does more than just allocating memory, it will also make sure that the allocated object(s) constructor is called. The malloc function only allocates memory, it doesn't call the constructor.
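To make the constructor point concrete, here is a small sketch. The type Counted and both function names are invented for this illustration; they only count how many constructors run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical type used only to observe constructor calls.
struct Counted {
    static int constructed;
    int value;
    Counted() : value(42) { ++constructed; }
};
int Counted::constructed = 0;

// Returns how many constructors ran for an array of n created with new[].
int constructions_with_new(std::size_t n) {
    Counted::constructed = 0;
    Counted* p = new Counted[n];   // allocates storage and constructs n objects
    int count = Counted::constructed;
    delete[] p;
    return count;
}

// Returns how many constructors ran for raw storage obtained from malloc.
int constructions_with_malloc(std::size_t n) {
    Counted::constructed = 0;
    void* raw = std::malloc(n * sizeof(Counted));  // storage only, no objects
    int count = Counted::constructed;
    std::free(raw);
    return count;
}
```

For char and other built-in types the difference does not show, which is why swapping malloc for new char[] changes nothing about the crash.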
Final note: The problem you have with the segmentation fault is probably not caused by your allocation, no matter how you allocate the memory. The problem is more likely because of something else, something you do not show in your question, like writing out of bounds of the allocated memory or dereferencing a null-pointer.
You need to run your program in a debugger to catch the crash in action. It will allow you to examine the function call stack, and if the crash doesn't happen in your code, you can walk up the call stack until you reach your code. There you can examine the values of variables to help you understand why the problem occurred.
The malloc family (malloc, realloc, calloc, free) is used almost always in C code, as C++ provides the new and delete operators which are a lot more reliable to use.
A problem with malloc for allocation is that you must specify the size of the type in bytes that you want to allocate. For example:
int* ptr = malloc(5);
Will not allocate space for 5 integers in memory; it will allocate 5 bytes of memory (an integer is typically 4 bytes, so this would obviously cause problems when assigning).
To do it properly, it must be written as
int* ptr = malloc(5 * sizeof(int));
So that 20 bytes are allocated (assuming 4-byte integers).
However, there are some exceptions to the case. char, for example only requires one byte of memory, so doing
char* ptr = malloc(5);
Will allocate enough memory to hold 5 characters, and in a way is more valid than writing:
char* ptr = malloc(5 * sizeof(char)); //5 * sizeof(char) == 5 * 1 == 5
However, the free function does not need to know the size of the pointer to be deallocated; a void* is only needed.
Note that in C++, the return value of malloc must be cast to the wanted type; malloc returns a void*, and unlike C, C++ does not allow a void* to be implicitly converted to another pointer type:
int* ptr = malloc(5 * sizeof(int)); //valid C code, invalid C++
int* ptr2 = (int*)malloc(5 * sizeof(int)); //valid C code, valid C++
In C++, the new[] operator resolves the issue of remembering to add the sizeof operator.
int* ptr = new int[5];//allocates 5 integers
int* ptr2 = new int(5);//be careful: this allocates a single integer with value of 5
Note that if the new[] operator has been used, the delete[] operator must be used. Otherwise the delete operator must be used:
int* ptr = new int[5];//allocates 5 integers
delete[] ptr;//deallocate the 5 integers
int* ptr2 = new int(5);//be careful: this allocates a single integer with value of 5
delete ptr2;//deallocate the integer
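The pairing rule can be seen side by side in a short sketch. The function names are made up for illustration.

```cpp
#include <cassert>

// Returns the sum of an n-element array allocated with new[] and filled with 1s.
int sum_of_ones(int n) {
    int* arr = new int[n];       // n ints, initially uninitialized
    for (int i = 0; i < n; ++i) arr[i] = 1;
    int sum = 0;
    for (int i = 0; i < n; ++i) sum += arr[i];
    delete[] arr;                // new[] pairs with delete[]
    return sum;
}

// Returns the value stored in a single int created with new int(v).
int single_value(int v) {
    int* one = new int(v);       // one int, initialized to v
    int result = *one;
    delete one;                  // new pairs with delete
    return result;
}
```

Square brackets mean "this many elements", parentheses mean "one object with this initial value".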
The problem with your code is that it does not fit the syntax of the new[] operator
The syntax could be described as:
T* p = new T[size];
Thus your code:
char *only_valid_data = new static_cast<char*> [data_size];
Should be corrected to:
char *only_valid_data = new char[data_size];
As static_cast<char*> is not a type.
Hope this helps :)
I have a piece of code:
#include<iostream>
using namespace std;
int main()
{
char * str = new char;
cin >> str;
cout << str;
delete str;
}
vs.
#include<iostream>
using namespace std;
int main()
{
char * str = new char[30];
cin >> str;
cout << str;
delete []str;
}
When I give input to either program using STDIN, which program ensures that there are no memory leaks?
This doubt arose as our professor told us that a char * is basically equivalent to an array of chars only. So if I allocate heap memory as in the first case and let str 'hold' an array of chars, if I then delete str, does it delete the array completely? I know that the second case manages to do so.
I have already been through ->
delete vs delete[] operators in C++
delete vs delete[]
And thus I know that delete[] deletes memory allocated by new[] and delete does the same for new. But what if new itself allocates contiguous memory locations??
Your first code example is wrong.
char * str = new char;
cin >> str;
You've only allocated memory for a single character. If you read anything other than an empty string, you'll write into unallocated memory and will have undefined behaviour.
if I then delete str, does it delete the array completely?
It will only delete the one character that you allocated. The rest of the string that you wrote in unallocated memory won't be directly affected by the delete. It's not a memory leak, it's a memory corruption.
vs.
char * str = new char[];
This is not legal c++. Array size must be specified.
EDIT: After your fix, the second code is correct as long as you read a string of 29 characters or shorter. If you read a longer string, you'll get undefined behaviour again.
But what if new itself allocates contiguous memory locations?
It doesn't. new (as opposed to new[]) allocates and constructs exactly one object. And delete destroys and deallocates exactly one object.
TLDR Neither program has memory leaks but the first one has undefined behaviour due to memory corruption.
I believe that you are misunderstanding what the difference is between a memory leak and a buffer overflow.
What is a buffer overflow?
A buffer overflow occurs when we have some piece of memory that we are going to store some data in. And when we store that data, we put too much data there. For example:
int x[4];
x[0] = 7;
x[1] = 8;
x[2] = 9;
x[3] = 10;
x[4] = 11; // <-- Buffer Overflow!
Your code exhibits a potential buffer overflow because cin doesn't know how much memory you've allocated. Unless you set a field width on the stream (for example with std::setw), it has no way to learn that from a char * argument. So in your first example, if you were to write any string longer than the empty string, you would cause a buffer overflow. Likewise, if you were to write more than 30 characters (including the null character) to the second example, you would also cause a buffer overflow.
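As a sketch of two ways to keep the read inside the buffer: the first passes the buffer size to the stream with std::setw, the second sidesteps the problem with std::string. The function names and the 8-byte buffer are assumptions for the example, and an std::istringstream stands in for cin so that the input is fixed.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Reads one word into a fixed 8-byte buffer and returns what was stored.
// std::setw tells the stream the buffer size, so at most 7 characters
// plus the terminating '\0' are written.
std::string read_bounded(const std::string& input) {
    std::istringstream in(input);
    char buf[8];
    in >> std::setw(static_cast<int>(sizeof buf)) >> buf;
    return std::string(buf);
}

// Reads one word into a std::string, which grows as needed.
std::string read_word(const std::string& input) {
    std::istringstream in(input);
    std::string word;
    in >> word;
    return word;
}
```

With std::string there is no size to get wrong, which is usually the simpler choice when the input length is not known in advance.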
What is a memory leak?
A memory leak is traditionally represented this way:
char *x = new char[30];
x[0] = 'a';
x[1] = '\0';
x = new char[10]; // <-- Memory Leak!
At this point in the code, you have no ability to call delete[] on the first allocation. You have no variable that points to that memory. That is a memory leak.
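One way to avoid the leak shown above is to release the old block before the pointer is overwritten. A minimal sketch, with replace_buffer being a name invented here:

```cpp
#include <cassert>
#include <cstddef>

// Replaces the buffer that x points to with a new one of new_size chars.
// The old buffer is released first, so no allocation is left unreachable.
void replace_buffer(char*& x, std::size_t new_size) {
    delete[] x;                // safe even when x is a null pointer
    x = new char[new_size];
}
```

After the call, the only live allocation is the one x points to, so a single delete[] at the end cleans everything up.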
What does delete[] do?
Let's consider that there is some bucket somewhere that can give us chunks of memory. We can grab chunks of memory from that bucket via new and new[]. When we use delete and delete[], we return those chunks of memory back to the bucket.
The agreement that we make with new and delete is that once we call delete on a piece of memory, we don't continue to use it. We don't do this because the system may reuse that piece of memory, or it may have removed all ability to access that memory altogether.
How could this possibly work?
You have this piece of code:
char *x = new char;
cin >> x;
I'd like to tell you that it's basically the same as this piece of code:
char y;
cin >> &y;
In both cases, you've allocated space for only one char. So when we call delete on x, we're only deleting one char. The part of the code there that will likely break is that cin will think that there is enough memory allocated for whatever string it is going to try and write to that pointer.
The fact is, there probably isn't enough space. There's only space for one char. And even the string "a" takes up 2 chars.
You can't do such things.
char *ptr = new char;
means that you have allocated sizeof(char) bytes.
If you execute cin >> ptr and enter even a single character, the terminating null is written past the allocation. That is undefined behaviour and may cause a segfault, because that memory isn't allocated.
To allocate an array of chars, do it like this:
char *ptr = new char[ size ];
It will allocate size * sizeof(char) bytes.
And then you can use cin >> ptr to fill it with data, as long as the input is shorter than size characters.
int main()
{
char* p = new char('a');
*reinterpret_cast<int*>(p) = 43523;
return 0;
}
This code runs fine but how safe is it? It should have only allocated one byte of memory but it doesn't seem to be having any problem filling 4 bytes with data. What can happen with the other 3 bytes that are not allocated?
char* p = new char('a');
ASCII code of 'a' is 97, so this is equivalent to:
char* p = new char(97);
Single character (1 byte) space is allocated with *p='a'
Now you are trying to store a value larger than 1 byte there. That is certainly risky even if it appears to work, that is, runs without a segmentation fault. You are overwriting other parts of memory that you don't own, or that, even if you do own them, serve some other purpose.
The code is not safe. Your program has undefined behaviour. Anything at all could happen.
The code causes a heap buffer overflow. You're overwriting memory which you don't own and which can be in use in other parts of the program.
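For comparison, a sketch of a version that stays inside its allocation: it asks for sizeof(int) bytes and copies the value with std::memcpy instead of casting the pointer. The function name store_and_load is made up for this example.

```cpp
#include <cassert>
#include <cstring>

// Stores value in a heap buffer that is large enough for an int,
// then reads it back out of that buffer.
int store_and_load(int value) {
    char* p = new char[sizeof(int)];       // room for a whole int, not one char
    std::memcpy(p, &value, sizeof(int));   // copy the bytes into the buffer
    int out = 0;
    std::memcpy(&out, p, sizeof(int));     // copy them back into an int
    delete[] p;
    return out;
}
```

Nothing outside the allocated bytes is touched, so there are no "other 3 bytes" to worry about.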
Consider the following code
struct foo
{
const int txt_len;
const int num_len;
char * txt;
int * num;
foo(int tl, int nl): txt_len(tl), num_len(nl)
{
char * tmp = new char[txt_len * sizeof(char) + num_len * sizeof(int)];
txt = new (tmp) char [txt_len * sizeof(char)];
num = new (tmp + txt_len * sizeof(char)) int[num_len * sizeof(int)];
// is this the same as above?
// txt = tmp;
// num = (int *) (tmp + txt_len * sizeof(char));
}
~foo()
{
delete[] txt; // is this the right way to free the memory?
}
};
I want *txt and *num to be contiguous, is that the best way to do it?
also is there any difference between placement new and pointer arithmetic? which one should I use?
If you want a contiguous block of memory, you have to allocate it whole with a single call to operator new[] or malloc() or similar. Multiple calls to these functions do not guarantee any contiguity of allocated blocks whatsoever. You may allocate a big block and then carve parts from it as needed.
And you should delete and free() all blocks previously allocated with new and malloc(), otherwise you'll leak memory and probably make your program unstable (it will fail to allocate more memory at some point) and exert unnecessary pressure on memory in the OS, possibly slowing down other programs or making them unstable as well.
Placement new, however, does not actually allocate any memory. It simply constructs an object at the specified location and so you don't need to free that memory twice.
One problem that I see in your code is that it doesn't align ints. On some platforms reading or writing integers bigger than 1 byte from/to the memory must be aligned and if it's not, you can either read/write values from/to wrong locations or get CPU exceptions leading to termination of your program. The x86 is very permissive in this regard and won't mind, though may tax you with degraded performance.
You'll need to put the int data first because of the alignment issues. But then we can't do delete[] num, as the type is wrong: the pointer must be cast to char* before deleting.
char * tmp = new char[num_len * sizeof(int) + txt_len * sizeof(char)];
num = new (tmp) int[num_len];
txt = new (tmp + num_len * sizeof(int)) char [txt_len];
(This makes liberal use of the fact that sizeof(char)==1)
You might be tempted to do delete[] num, but num is of type int*, and the memory was new'ed as a char array. So you need to do:
delete[] (char*) num;
This is the same as long as you use POD types. And your delete is fine.
However, as David's comment states, you need to consider alignment problems.
Placement new is mostly used when you want to call the constructor of a class/struct on some preallocated memory block.
But for native types it makes no difference whether you use placement new or pointer arithmetic.
Please correct me if I am wrong.
If txt and num always point to int and char, other built in types or other types not requiring construction, then no. You don't need placement new.
If on the other hand you were to change one of them to a class which requires construction, i.e. changes txt to type std::string, then using placement new is necessary.
Placement new allows you to call the constructor, building, if you like, the object at that address. Built-in types have default constructors that do nothing if you're not initializing.
In both cases you need to do pointer arithmetic, just one way you store the answer in a pointer, the other you pass the answer to placement new which gives it back to you for storage in the pointer, and then calls the constructor.
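Pulling the answers in this thread together, here is one possible sketch of the struct with a single allocation and the ints placed first. The name Packed and the extra raw member are inventions for this example. It also assumes that placement new[] for these built-in types adds no bookkeeping in front of the array, which is how mainstream compilers behave.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical rework of the question's struct: one allocation, ints first.
struct Packed {
    std::size_t num_len, txt_len;
    char* raw;   // owns the whole block, used for delete[]
    int*  num;
    char* txt;

    Packed(std::size_t nl, std::size_t tl) : num_len(nl), txt_len(tl) {
        // A char array from new[] is suitably aligned for the ints at its start.
        raw = new char[num_len * sizeof(int) + txt_len];
        num = new (raw) int[num_len]();                           // zero-initialized
        txt = new (raw + num_len * sizeof(int)) char[txt_len]();  // zero-initialized
    }
    ~Packed() { delete[] raw; }   // delete through the char* that new[] returned

    Packed(const Packed&) = delete;
    Packed& operator=(const Packed&) = delete;
};
```

Keeping the original char* around avoids the cast in the destructor, and deleting the copy operations prevents two objects from freeing the same block.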
I have a program in C++ that has a BYTE array that stores some values. I need to find the length of that array i.e. number of bytes in that array. Please help me in this regard.
This is the code:
BYTE *res;
res = (BYTE *)realloc(res, (byte_len(res)+2));
byte_len is a fictitious function that returns the length of the BYTE array and I would like to know how to implement it.
Given your code:
BYTE *res;
res = (BYTE *)realloc(res, (byte_len(res)+2));
res is a pointer to type BYTE. It points to a contiguous sequence of n BYTEs only because you made it do so. The information about the length is not a part of the pointer. In other words, res points to only one BYTE, and if you point it to the right location, one you have access to, you can use it to get BYTE values before or after it.
BYTE data[10];
BYTE *res = &data[2];
/* Now you can access res[-2] to res[7] */
So, to answer your question: you definitely know how many BYTEs you allocated when you called malloc() or realloc(), so you should keep track of the number.
Finally, your use of realloc() is wrong, because if realloc() fails, you leak memory. The standard way to use realloc() is to use a temporary:
BYTE *tmp;
tmp = (BYTE *)realloc(res, n*2);
if (tmp == NULL) {
/* realloc failed, res is still valid */
} else {
/* You can't use res now, but tmp is valid. Reassign */
res = tmp;
}
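A sketch that combines both points, keeping the length next to the pointer and going through a temporary when calling realloc(). The struct ByteBuffer, the function grow and the BYTE typedef are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdlib>

typedef unsigned char BYTE;   // assumption: BYTE is an unsigned 8-bit type

// Hypothetical holder that stores the length alongside the pointer.
struct ByteBuffer {
    BYTE* data;
    std::size_t len;
};

// Grows buf by extra bytes. On failure it returns false and leaves buf
// untouched, so the old block is neither lost nor leaked.
bool grow(ByteBuffer& buf, std::size_t extra) {
    BYTE* tmp = static_cast<BYTE*>(std::realloc(buf.data, buf.len + extra));
    if (tmp == nullptr) return false;   // buf.data is still valid
    buf.data = tmp;
    buf.len += extra;
    return true;
}
```

Starting from a null pointer and length 0 works too, because realloc() with a null pointer behaves like malloc().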
If the array is a fixed size array, such as:
BYTE Data[200];
You can find the length (in elements) with the commonly used macro:
#define ARRAY_LENGTH(array) (sizeof(array)/sizeof((array)[0]))
However, in C++ I prefer to use the following where possible:
template<typename T, size_t N>
inline size_t array_length(T (&data)[N]) // takes the array by reference so N can be deduced
{
return N;
}
Because it prevents this from occurring:
// array is now dynamically allocated
BYTE* data = new BYTE[200];
// oops! this is now 4 (or 8 on 64bit)!
size_t length = ARRAY_LENGTH(data);
// this on the other hand becomes a compile error
length = array_length(data);
If the array is not a fixed size array:
In C++, raw pointers (like BYTE*) carry no bounds information. If you need the length, which you always do when working with arrays, you have to keep track of the length separately. Classes like std::vector help with this because they store the length of the array along with the data.
In the C way of doing things (which is also relevant to C++) you generally need to keep a record of how long your array is:
BYTE *res = NULL;
int len = 100;
res = (BYTE *)realloc(res, len);
len += 2;
res = (BYTE *)realloc(res, len);
An alternative in the C++ way of doing things is to use the std::vector container class; a vector has the ability to manage the length of the array by itself, and also deals with the issues of memory management.
EDIT: as others have pointed out the use of realloc here is incorrect as it will lead to memory leaks, this just deals with keeping track of the length. You should probably accept one of the other replies as the best answer
Given the information you seem to have available, there is no way to do what you want. When you are working with arrays allocated on the heap, you need to save the size somewhere if you need to work with it again. Neither new nor malloc will do this for you.
Now, if you have the number of items in the array saved somewhere, you can do this to get the total size in characters, which is the unit that realloc works with. The code would look like this:
size_t array_memsize = elems_in_array * sizeof(BYTE);
If you are really working with C++ and not C I would strongly suggest that you use the vector template for this instead of going to malloc and realloc. The vector template is fast and not anywhere near as error prone as rolling your own memory management. In addition, it tracks the size for you.
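A short sketch of the vector approach, assuming BYTE is a typedef for unsigned char. The function name grow_by_two is made up and mirrors the "+2" from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char BYTE;   // assumption: BYTE is an unsigned 8-bit type

// Appends two zero bytes to res and returns the new length.
// The vector remembers its own size, so no byte_len() function is needed.
std::size_t grow_by_two(std::vector<BYTE>& res) {
    res.resize(res.size() + 2);   // new elements are zero-initialized
    return res.size();
}
```

The existing contents are preserved across the resize, and the memory is released automatically when the vector goes out of scope.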
When you allocate the pointer initially you also need to keep track of the length:
size_t bufSize = 100;
BYTE* buf = malloc(sizeof(BYTE ) * bufSize);
When you re-allocate you should be careful with realloc:
BYTE* temp = realloc(buf,sizeof(BYTE ) * (bufSize+2));
if (temp != NULL)
{
bufSize += 2;
buf = temp;
}
If it is a local variable allocated on the stack you can calculate it like this:
BYTE array[] = { 10, 20, 30, ... };
size_t length = sizeof(array) / sizeof(BYTE);
If you receive a pointer to the beginning of the array, or you allocate it dynamically (on the heap), you need to keep the length as well as the pointer.
EDIT: I also advise you use STL vector for such needs because it already implements dynamic array semantics.