Why are we able to access unallocated memory in a class? - c++

I am sorry if I may not have phrased the question correctly, but in the following code:
int main() {
char* a=new char[5];
a="2222";
a[7]='f'; //Error thrown here
cout<<a;
}
If we try to access a[7] in the program, we get an error because we haven't been assigned a[7].
But if I do the same thing in a class :
class str
{
public:
char* a;
str(char *s) {
a=new char[5];
strcpy(a,s);
}
};
int main()
{
str s("ssss");
s.a[4]='f';s.a[5]='f';s.a[6]='f';s.a[7]='f';
cout<<s.a<<endl;
return 0;
}
The code works, printing the characters "abcdfff".
How are we able to access a[7], etc in the code when we have only allocated char[5] to a while we were not able to do so in the first program?

In your first case, you have an error:
int main()
{
char* a=new char[5]; // declare a dynamic char array of size 5
a="2222"; // assign the pointer to a string literal "2222" - MEMORY LEAK HERE
a[7]='f'; // accessing array out of bounds!
// ...
}
You are creating a memory leak and then asking why undefined behavior is undefined.
Your second example is asking, again, why undefined behavior is undefined.
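A minimal sketch of what the first program was presumably meant to do, assuming the intent was to fill the allocated buffer rather than repoint a at a string literal. The helper name make_copy is hypothetical and not part of the question.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a heap-allocated copy of src.
// The caller owns the result and must release it with delete[].
char* make_copy(const char* src)
{
    const std::size_t len = std::strlen(src);
    char* a = new char[len + 1];   // +1 for the terminating '\0'
    std::memcpy(a, src, len + 1);  // copy the characters, do not repoint a
    return a;
}
```

With this, a still points to the memory obtained from new[], so writing to indices 0 to len is valid and delete[] a; releases the buffer without a leak.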

As others have said, it's undefined behavior. When you write to memory out of bounds of the allocated memory for the pointer, several things can happen:
You overwrite an allocated, but unused and so far unimportant location
You overwrite a memory location that stores something important for your program, which will lead to errors because you've corrupted your own memory at that point
You overwrite a memory location that you aren't allowed to access (something out of your program's memory space) and the OS freaks out, causing an error like "AccessViolation" or something
For your specific examples, where the memory is allocated is based on how the variable is defined and what other memory has to be allocated for your program to run. This may impact the probability of getting one error or another, or not getting an error at all. BUT, whether or not you see an error, you shouldn't access memory locations out of your allocated memory space because like others have said, it's undefined and you will get non-deterministic behavior mixed with errors.

int main() {
char* a=new char[5];
a="2222";
a[7]='f'; //Error thrown here
cout<<a;
}
If we try to access a[7] in the program, we get an error because we
haven't been assigned a[7].
No, you get a memory error from accessing memory that is write-protected, because a is pointing to the read-only memory of "2222", and by chance two bytes after the end of that string is ALSO write-protected. If you used the same strcpy as you use in the class str, the memory access would overwrite some "random" data after the allocated memory, which is quite possibly NOT going to fail in the same way.
It is indeed invalid (undefined behaviour) to access memory outside of the memory you have allocated. The compiler, the C and C++ runtime library and the OS that your code is produced with and runs on top of are not guaranteed to detect all such things (because it can be quite time-consuming to check every single operation that accesses memory). But it's guaranteed to be "wrong" to access memory outside of what has been allocated - it just isn't always detected.
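If you want out-of-bounds accesses to be detected reliably, you have to opt in to checking. A minimal sketch, assuming you can switch from a raw array to std::vector, whose at() member checks the index and throws std::out_of_range; the function name try_write is made up for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if writing index i succeeded, false if at() rejected it.
bool try_write(std::vector<char>& buf, std::size_t i, char c)
{
    try {
        buf.at(i) = c;      // at() checks i against buf.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;       // the out-of-bounds access was detected
    }
}
```

For a vector of 5 elements, try_write(buf, 4, 'f') succeeds and try_write(buf, 7, 'f') is rejected, whereas buf[7] = 'f' would be undefined behaviour just like with the raw array.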

As mentioned in other answers, accessing memory past the end of an array is undefined behavior, i.e. you don't know what will happen. If you are lucky, the program crashes; if not, the program continues as if nothing was wrong.
C and C++ do not perform bounds checks on (simple) arrays for performance reasons.
The syntax a[7] simply means: go to memory position X + 7 * sizeof(a[0]), where X is the address where a starts, and read/write there. If you read/write within the memory that you have reserved, everything is fine; if outside, nobody knows what happens (see the answer from @reblace).
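The address computation can be checked directly: a[i] is defined as *(a + i), so the element sits i * sizeof(a[0]) bytes past the start of the array. A small sketch (the helper name byte_offset is hypothetical) that measures this distance without reading or writing the element:

```cpp
#include <cstddef>

// Byte distance between a[0] and a[i], computed from the two addresses.
// Only forms the addresses; i must not exceed the array length.
template <typename T>
std::ptrdiff_t byte_offset(const T* a, std::size_t i)
{
    const char* base = reinterpret_cast<const char*>(a);
    const char* elem = reinterpret_cast<const char*>(a + i);
    return elem - base;
}
```

For an int array, byte_offset(arr, 7) equals 7 * sizeof(int); for a char array it is simply 7. No bounds check is involved anywhere in this computation, which is why a[7] on a 5-element array compiles without complaint.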

Related

Accessing freed pointers can cause data corruption if malloc() allocates memory in the same spot unless the freed pointer is set to NULL

This question arose from a statement made by user Georg Schölly (116K reputation) in his question "Should one really set pointers to `NULL` after freeing them?"
If that statement is true,
how does the data get corrupted? I don't understand.
Code
#include<iostream>
#include<cstdlib> // malloc, free
int main()
{
int count_1=1, count_2=11, i;
int *p=(int*)malloc(4*sizeof(int));
std::cout<<p<<"\n";
for(i=0;i<=3;i++)
{
*(p+i)=count_1++;
}
for(i=0;i<=3;i++)
{
std::cout<<*(p+i)<<" ";
}
std::cout<<"\n";
free(p);
p=(int*)malloc(6*sizeof(int));
std::cout<<p<<"\n";
for(i=0;i<=5;i++)
{
*(p+i)=count_2++;
}
for(i=0;i<=3;i++)
{
std::cout<<*(p+i)<<" ";
}
}
Output
0xb91a50
1 2 3 4
0xb91a50
11 12 13 14
Again it is allocating same memory location after freeing (0xb91a50), but it is working fine, isn't it ?
You do not reuse the old pointer in your code. After p=(int*)malloc(6*sizeof(int));, p points to a newly allocated array and you can use it without any problem. The data corruption problem quoted by Georg would occur in code like this:
int *p=(int*)malloc(4*sizeof(int));
...
free(p);
// use a different pointer but will get same address because of previous free
int *pp=(int*)malloc(6*sizeof(int));
std::cout<<p<<"\n";
for(i=0;i<=5;i++)
{
*(pp+i)=count_2++;
}
p[2] = 23; // erroneously using the old pointer will corrupt the new array
for(i=0;i<=3;i++)
{
std::cout<<*(pp+i)<<" ";
}
Setting the pointer to NULL after you free a block of memory is a precaution with the following advantages:
it is a simple way to indicate that the block has been freed, or has not been allocated.
the pointer can be tested, thus preventing access attempts or erroneous calls to free the same block again. Note that calling free(p) or delete p; with p a null pointer is fine: both do nothing.
it may help detect bugs: if the program tries to access the freed object, a crash is certain on most targets if the pointer has been set to NULL whereas if the pointer has not been cleared, modifying the freed object may succeed and result in corrupting the heap or another object that would happen to have been allocated at the same address.
Yet this is not a perfect solution:
the pointer may have been copied and these copies still point to the freed object.
In your example, you reuse the pointer immediately so setting it to NULL after the first call to free is not very useful. As a matter of fact, if you wrote p = NULL; the compiler would probably optimize this assignment out and not generate code for it.
Note also that using malloc() and free() in C++ code is frowned upon. You should use new and delete, or std::vector.
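As a rough sketch of that last suggestion, assuming C++11 or later: std::vector frees its storage automatically, and std::unique_ptr sets itself to null when it releases its object, so the "set to NULL after free" step cannot be forgotten. The function names below are made up for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Builds n consecutive ints starting at start; the vector owns the
// storage and releases it when it goes out of scope.
std::vector<int> make_counts(std::size_t n, int start)
{
    std::vector<int> v(n);
    for (std::size_t i = 0; i < n; ++i)
        v[i] = start + static_cast<int>(i);
    return v;
}

// Releases the owned int and reports whether the pointer is now null.
bool release_and_check(std::unique_ptr<int>& p)
{
    p.reset();              // deletes the int and sets p to nullptr
    return p == nullptr;
}
```

This does not solve the problem of copies: a raw pointer obtained earlier from p.get() or v.data() still dangles after the release.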

Memory leak on deallocating char * set by strcpy?

I have a memory leak detector tool which tells me below code is leaking 100 bytes
#include <cstring> // strcpy
#include <string>
#include <iostream>
void setStr(char ** strToSet)
{
strcpy(*strToSet, "something!");
}
void str(std::string& s)
{
char* a = new char[100]();
setStr(&a);
s = a;
delete[] a;
}
int main()
{
std::string s1;
str(s1);
std::cout << s1 << "\n";
return 0;
}
According to this point number 3 it is leaking the amount I allocated (100) minus length of "something!" (10) and I should be leaking 90 bytes.
Am I missing something here, or is it safe to assume the tool is reporting wrong?
EDIT: setStr() is in a library and I cannot see the code, so I guessed it is doing that. It could be that it is allocating "something!" on the heap, what about that scenario? Would we have a 90 bytes leak or 100?
This code does not leak, and it is not the same as point number 3, because you never overwrite the variable that stores the pointer to the allocated memory. The potential problems with this code are that it is vulnerable to buffer overflow if setStr writes more than 99 characters, and that it is not exception-safe: if s = a; throws, delete[] a; is never called and the memory leaks.
Updated: If setStr allocates new string and overwrites initial pointer value then the pointer to the 100 byte buffer that you've allocated is lost and those 100 bytes leak. You should initialize a with nullptr prior to passing it to setStr and check that it is not null after setStr returns so assignment s = a; won't cause null pointer dereference.
Summing up all the comments, it is clear what the problem is. The library you are using is requesting a char **. This is a common interface pattern for C functions that allocate memory and return a pointer to that memory, or that return a pointer to memory they own.
The memory you are leaking is allocated in the line char* a = new char[100]();. Because setStr is changing the value of a, you can no longer deallocate that memory.
Unfortunately, without the documentation, we cannot deduce what you are supposed to do with the pointer.
If it is from a call to new[] you need to call delete[].
If it is from a call to malloc you need to call std::free.
If it is a pointer to memory owned by the library, you should do nothing.
You need to find the documentation for this. However, if it is not available, you can try using your memory leak detection tool after removing the new statement and see if it detects a leak. I'm not sure if it is going to be reliable with memory allocated from a library function but it is worth a try.
Finally, regarding the question in your edit, if you leak memory you leak the whole amount, unless you do something that is undefined behavior, which is pointless to discuss anyway. If you new 100 chars and then write some data on them, that doesn't change the amount of memory leaked. It will still be 100 * sizeof(char).
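One way to find out which scenario applies is to check whether the pointer still refers to your buffer after the call. The sketch below is only an illustration: this setStr is a stand-in written for the example (the real library function is not visible), and read_str is a hypothetical wrapper. Using std::vector for the buffer also removes the exception-safety problem, since the storage is freed automatically.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the library call; ASSUMED here to write into the
// caller's buffer. The real function may behave differently.
void setStr(char** strToSet)
{
    std::strcpy(*strToSet, "something!");
}

// Returns the string produced by setStr. Sets replaced to true if
// setStr pointed p at memory other than our buffer.
std::string read_str(bool& replaced)
{
    std::vector<char> buf(100);   // zero-initialized, freed automatically
    char* p = buf.data();
    setStr(&p);
    replaced = (p != buf.data());
    return p ? std::string(p) : std::string();
}
```

If replaced comes back true, the library handed you a different pointer, and only its documentation can say whether you must release it and how.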

delete[] operator causes segmentation fault in very simple case

I have a very strange segmentation fault that occurs when I call delete[] on an allocated dynamic array (created with the new keyword). At first it occurred when I deleted a global pointer, but it also happens in the following very simple case, where I delete[] arr
int main(int argc, char * argv [])
{
double * arr = new double [5];
delete[] arr;
}
I get the following message:
*** Error in `./energy_out': free(): invalid next size (fast): 0x0000000001741470 ***
Aborted (core dumped)
Apart from the main function, I define some fairly standard functions, as well as the following (defined before the main function)
vector<double> cos_vector()
{
vector<double> cos_vec_temp = vector<double>(int(2*pi()/trig_incr));
double curr_val = 0;
int curr_idx = 0;
while (curr_val < 2*pi())
{
cos_vec_temp[curr_idx] = cos(curr_val);
curr_idx++;
curr_val += trig_incr;
}
return cos_vec_temp;
}
const vector<double> cos_vec = cos_vector();
Note that the return value of cos_vector, cos_vec_temp, gets assigned to the global variable cos_vec before the main function is called.
The thing is, I know what causes the error: cos_vec_temp should be one element bigger, as cos_vec_temp[curr_idx] ends up accessing one element past the end of the vector cos_vec_temp. When I make cos_vec_temp one element larger at its creation, the error does not occur. But I do not understand why it occurs at the delete[] of arr. When I run gdb, after setting a breakpoint at the start of the main function, just after the creation of arr, I get the following output when examining contents of the variables:
(gdb) p &cos_vec[6283]
$11 = (__gnu_cxx::__alloc_traits<std::allocator<double> >::value_type *) 0x610468
(gdb) p arr
$12 = (double *) 0x610470
In the first gdb command, I show the memory location of the element just past the end of the cos_vec vector, which is 0x610468. The second gdb command shows the memory location of the arr pointer, which is 0x610470. Since I assigned a double to the invalid memory location 0x610468, I understand it must have written partly over the location that starts at 0x610470, but this was done before arr was even created (the function is called before main). So why does this affect arr? I would have thought that when arr is created, it does not "care" what was previously done to the memory location there, since it is not registered as being in use.
Any clarification would be appreciated.
NOTE:
cos_vec_temp was previously declared as a dynamic double array of size int(2*pi()/trig_incr) (same size as the one in the code, but created with new). In that case, I also had the invalid access as above, and it also did not give any errors when I accessed the element at that location. But when I tried to call delete[] on the cos_vec global variable (which was of type double * then) it also gave a segmentation fault, but it did not give the message that I got for the case above.
NOTE 2:
Before you downvote me for using a dynamic array, I am just curious as to why this occurs. I normally use STL containers and all their conveniences (I almost NEVER use dynamic arrays).
Many heap allocators store metadata next to the memory they allocate for you, before or after it (or both). If you write out of bounds of some heap-allocated memory (and remember that std::vector allocates dynamically from the heap), you might overwrite some of this metadata, corrupting the heap.
None of this is actually specified by the C++ standard. All it says is that going out of bounds leads to undefined behavior. What the allocators do or store, and where they store metadata, is up to the implementation.
As for a solution, most people will tell you to use push_back instead of direct indexing, and that will solve the problem. Unfortunately it also means that the vector may need to be reallocated and copied a few times. That can be avoided by reserving an approximate amount of memory beforehand and letting the extra stray element trigger one reallocation and copy.
Or, of course, make a better prediction of the number of elements the vector will actually contain.
It looks like you are writing past the end of the vector allocated in the function executing before main, causing undefined behavior later on.
You should be able to fix the problem by rounding the number up when allocating the vector (casting to int rounds the number down), or using push_back instead of indexing:
cos_vec_temp.push_back(cos(curr_val));
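Putting both suggestions together, a sketch of the whole function might look like this. The question does not show how pi() and trig_incr are defined, so kPi and kTrigIncr below are assumed values (an increment of 0.001 is consistent with the index 6283 in the gdb output), and the function name is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// ASSUMED values; the original definitions are not shown in the question.
const double kPi = 3.14159265358979323846;
const double kTrigIncr = 0.001;

std::vector<double> cos_vector_sketch()
{
    std::vector<double> result;
    // Reserve one more than the truncated quotient so that the last
    // push_back normally does not trigger a reallocation.
    result.reserve(static_cast<std::size_t>(2 * kPi / kTrigIncr) + 1);
    for (double x = 0.0; x < 2 * kPi; x += kTrigIncr)
        result.push_back(std::cos(x));   // grows safely, never out of bounds
    return result;
}
```

Because push_back grows the vector as needed, an off-by-one in the size estimate costs at most one reallocation instead of corrupting the heap.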

Unable to spot Memory leak issue in below code

I am very new to C++. I am facing memory leak issue in my c++ code. Please see the below mentioned piece of code, which is causing the issue.
void a()
{
char buffer[10000];
char error_msg[10000];
char log_file[FILENAME_L] = "error_log.xml";
FILE *f;
f = fopen(log_file,"r");
while (fgets(buffer, 1000, f) != NULL)
{
if (strstr(buffer, " Description: ") != NULL)
{
strcpy(error_msg, buffer);
}
}
fclose(f);
actual_error_msg = trimwhitespace(error_msg);
}
Can anyone please suggest on this. Do I need to use malloc instead of hardcoded size of array?
It seems that there is undefined behaviour if the variable actual_error_msg is a global variable and the function trimwhitespace does not dynamically allocate memory for a copy of error_msg:
actual_error_msg = trimwhitespace(error_msg);
So when the function finishes its execution pointer actual_error_msg will be invalid.
Can anyone please suggest on this
I suggest dynamically allocating memory for a copy of error_msg within the function trimwhitespace. Or, if you already do that, check whether the memory is freed in time. :)
Also note that it looks strange that buffer is declared with a size of 10000 while fgets is called with the magic number 1000.
char buffer[10000];
// ...
while (fgets(buffer, 1000, f) != NULL)
TL;DR - In the code snippet shown above, there is no memory leak.
Do I need to use malloc instead of hardcoded size of array?
I think you got confused by the possible underuse of char buffer[10000]; and char error_msg[10000];. These arrays are not allocated dynamically. Even if the arrays are not used to their full capacity, there is no memory leak here.
Also, as @Vlad rightly mentioned, there is another likely issue in your case: with actual_error_msg being a global, if trimwhitespace() does not return a value that stays valid after a() has finished executing, using it may lead to undefined behaviour.
To avoid that, make sure trimwhitespace() returns one of the following (assuming the return type is char *):
a pointer with dynamic memory allocation (Preferred)
base address of a static array. (bad practice, but will work)
To elaborate, from the Wikipedia article about "memory leak"
In computer science, a "memory leak" is a type of resource leak that occurs when a computer program incorrectly manages memory allocations in such a way that memory which is no longer needed is not released. ...
and
.. Typically, a memory leak occurs because dynamically allocated memory has become unreachable. ...
When memory is allocated by the compiler (automatic or static storage), a memory leak cannot occur, because the compiler manages the allocation and deallocation.
OTOH, with dynamic memory allocation, memory is allocated at runtime. The compiler has no information about the allocation; the memory is allocated programmatically and hence needs to be freed programmatically as well. Failing to do so leads to a "memory leak".
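For the dangling-pointer concern about trimwhitespace() raised in the answers above, a third option in C++ is to return a std::string by value, which owns its own copy of the characters and needs no manual free. This is only a sketch under that assumption: trim_copy is a hypothetical replacement, not the question's actual trimwhitespace, and actual_error_msg would have to become a std::string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical replacement for trimwhitespace(): returns a trimmed copy
// that stays valid after the caller's local buffer is gone.
std::string trim_copy(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();            // empty or all whitespace
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```

Inside a(), the assignment would then read actual_error_msg = trim_copy(error_msg); and the global keeps its own copy once error_msg goes out of scope.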

C++ stack and heap corruption

I was recently reading about stack & heap corruption in C & C++. The author of the website demonstrates stack corruption using below example.
#include<stdio.h>
int main(void)
{
int b = 10;
int a[3];
a[0] = 1;
a[1] = 2;
a[2] = 3;
printf(" b = %d \n",b);
a[3] = 12; // oops it is invalid, behaviour is undefined
printf(" b = %d \n",b);
printf("address of b= %p\n",(void*)&b);
printf("address of a[3]= %p\n",(void*)&a[3]);
return 0;
}
I tested above program on visual studio 2010 compiler (VC++) & it gives me runtime error that says:
stack around variable a gets corrupted
Now my question: is stack corrupted for lifetime or it is only for the time during when above erroneous program was being executed?
Same way, I know that deleting same pointer twice might do really bad things like heap corruption.
The following code:
int* p=new int();
delete p;
delete p; // oops disaster here, undefined behaviour
When the above code fragment executes the VC++ shows heap corruption error at runtime.
It is Undefined Behaviour. You cannot know what will happen if you do 'forbidden' things. You have no guarantee that your program will work well.
You have to be careful with terminology here. Will the stack be "corrupted" for the remainder of your program's life? It may be; it may not be. In this instance you've only corrupted data within the current stack frame, so once you're out of that function call, in practice your "corruption" will have gone.
But that's not quite the whole story. Since you've overwritten a variable with bytes that aren't supposed to be there, what knock-on effects might that have on your program? The consequences of this memory corruption could feasibly be logically passed on to other function scopes, or even other computers if you're sending this data over a network connection and the data is no longer in the expected form. (Typically, your data protocol will have safety features built into it to detect and discard unexpected forms of data; but, that's up to you.)
The same is true of heap corruption. Any time you overwrite the bytes of something that is not supposed to be overwritten, and any time you do so with arbitrary or unknowable data, you run the risk of potentially catastrophic consequences that may logically last well beyond the lifetime of your program.
Within the scope of C++ as a language, this condition is summed up in a specific phrase: undefined behaviour. It states that you can't really rely on anything at all after you've corrupted your memory. Once you've invoked UB, all bets are off.
The one guarantee that you usually have in practice is that your OS will not allow you to directly overwrite any memory that does not belong to your program. That is, corrupting the memory of other processes or of the OS itself is very difficult. The memory model of modern OSs is deliberately designed that way in order to keep programs isolated and prevent this kind of damage from broken programs and/or viruses.
Neither C++ nor C checks array bounds for overflow or underflow. However, you can abstract this away by defining an array class with an overloaded index operator (operator[]) that checks whether the index is out of bounds and acts accordingly. When you delete a pointer using delete ptr (where ptr was allocated through new), the space is returned to the heap, but the value of the pointer stays the same as before. So it's good programming practice to set ptr to NULL after delete, e.g.
int* p=new int();
...
if (p) {
delete p;
p = (int *) NULL;
}
// double deletion is prevented, and p is not dangling any more
if (p) {
delete p;
p = (int *) NULL;
}
However, stack or heap corruption, if any, stays confined within the program's address space, and when the program terminates, normally or abnormally, all the memory it occupied is released back to the operating system.
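The array with an overloaded index operator mentioned above could be sketched as follows, assuming C++11 and that throwing std::out_of_range is the desired reaction. The class name CheckedArray is made up for this example; std::array::at() and std::vector::at() already provide the same kind of check.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical fixed-size array whose operator[] validates the index.
template <typename T, std::size_t N>
class CheckedArray
{
public:
    T& operator[](std::size_t i)
    {
        if (i >= N)
            throw std::out_of_range("CheckedArray index out of bounds");
        return data_[i];
    }

    const T& operator[](std::size_t i) const
    {
        if (i >= N)
            throw std::out_of_range("CheckedArray index out of bounds");
        return data_[i];
    }

    std::size_t size() const { return N; }

private:
    T data_[N] = {};   // value-initialized elements
};
```

With CheckedArray<int, 3> a;, the write a[3] = 12; from the stack corruption example throws an exception instead of silently overwriting a neighbouring variable.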