Expected behavior for using pointer beyond malloc allocated memory - c++

Just wondering, because I can't figure out a way to test this. Imagine the scenario where I have 10 bytes of memory:
I malloc varA with 5 bytes.
I assign varA a string of 7 characters (which uses up 8 bytes, including the terminating '\0').
I malloc varB with 5 bytes.
Will the program run into an error, or just end up with gibberish memory?
Does the behaviour vary between a C/C++ program and a CUDA program?

That's not a memory leak, it's a buffer overflow. And that leads to undefined behavior, which will most likely give you weird problems (or even crashes) at run time.
Unless you mean point 2 literally, as in
    char *str = malloc(5);
    str = "foobar";
in which case you have a memory leak, and not a buffer overflow: the pointer is reassigned to a string literal, so the original 5-byte allocation can never be freed, but nothing is written out of bounds.

It is undefined behavior to write beyond allocated memory.


I don't understand memory allocation and strcpy [duplicate]

Here's a sample of my code:
char chipid[13];

void initChipID(){
    write_to_logs("Called initChipID");
    strcpy(chipid, string2char(twelve_char_string));
    write_to_logs("Chip ID: " + String(chipid));
}
Here's what I don't understand: even if I define chipid as char[2], I still get the expected result printed to the logs.
Why is that? Shouldn't the allocated memory space for chipid be overflown by the strcpy, and only the first 2 char of the string be printed?
Here's what I don't understand: even if I define chipid as char[2], I
still get the expected result printed to the logs.
Then you are (un)lucky. You are especially lucky if the undefined behavior produced by the overflow does not manifest as corruption of other data, yet also does not crash the program. The behavior is undefined, so you should not interpret whatever manifestation it takes as something you should rely upon, or that is specified by the language.
Why is that?
The language does not specify that it will happen, and it certainly doesn't specify why it does happen in your case.
In practice, the manifestation you observe is as if the strcpy writes the full data into memory at the location starting at the beginning of your array and extending past its end, overwriting anything else your program may have stored in that space, and the program subsequently reads it back via a corresponding overflowing read.
Shouldn't the allocated memory space for chipid be
overflown by the strcpy,
Yes.
and only the first 2 char of the string be
printed?
No, the language does not specify what happens once the program exercises UB by performing a buffer overflow (or by other means). But also no, C arrays are represented in memory simply as a flat sequence of contiguous elements, with no explicit boundary. This is why C strings need to be terminated. String functions do not see the declared size of the array containing a string's elements, they see only the element sequence.
You have it partly correct: the allocated memory space for chipid is indeed overflowed by the strcpy. And that is why you get the full result (well, the result of an overflow is undefined, and could just as easily have been a crash or some other outcome).
C and C++ give you a lot of power when it comes to memory, and with great power comes great responsibility. What you are doing gives undefined behaviour, meaning it may appear to work now, but it can cause problems later on, because strcpy is writing to memory it is not supposed to write to.
You will find that you can get away with a lot in C and C++. However, things like this will give you headaches later, when your program unexpectedly crashes, possibly in an entirely different part of the program, which makes it difficult to debug.
Anyway, if you are using C++, you should use std::string, which makes things like this a lot easier.

Is it safe to call memcpy with count greater than memory allocated for src?

std::string str = "hello world!";
char dest[16];
memcpy(dest, str.c_str(), 16);
str.c_str() will return a null-terminated char array. However, if I call memcpy with a count greater than 13, what would happen? Is dest going to be a null-terminated char array? Is there anything I need to be cautious about?
Your code has undefined behaviour. When using memcpy, you need to make sure that the number of bytes being copied is not greater than min(size_of_recipient, size_of_source).
In your case, the size of your source is 13 bytes (12 characters plus the terminating '\0'), so copying more than that is not OK.
Is dest going to be a null-terminated char array? Is there anything I
need to be cautious about?
The first 13 bytes of dest will form a null-terminated character array, since str.c_str() returns a null-terminated character array... but the remaining bytes of dest will contain whatever garbage was read from beyond the end of the source.
You are accessing memory that you know nothing about... Your code could, in principle, do anything (the classic joke is that it could reformat your PC). It's called undefined behavior.
If you give it a count, that's exactly how many sequential bytes will be copied. This means that bytes are read from beyond your string if you pass in a count larger than your actual source data. If that memory does not belong to your program, this might cause a crash due to a segmentation fault.
strcpy works differently in that sense since it checks for '\0'. Also if you want to copy bytes from a std::string to a char[], you could simply not rely on dangerous C functions and use std::copy.
It's not safe. You should use strncpy, which was made for this. What's after the end of the "hello world!" std::string is undefined. It could be an undefined part of your heap, in which case you'll be copying what might be called garbage, or it could be venturing into an unallocated memory page, in which case you'll get a segfault and your program dies. (Unless it has some clever magic to handle segfaults, which, judging by your question, it probably does not have.)

This code seems to append characters outside allocated range

I'm playing with some of the basics of C++. I'm new to this language... so I'm warning you that my question may not be correctly formulated. I appreciate any help.
The thing is that after seeing the example at www.cplusplus.com/reference/cstdlib/malloc/ I found myself with this code:
#include <stdio.h>
#include <stdlib.h> /* needed for malloc */

int main (void) {
    char *str;
    str = (char*) malloc(2);
    str[0] = '8';
    str[1] = '8';
    str[2] = '6';   /* already past the end of the allocation */
    str[3] = '\0';
    printf("%s\n", str);
}
And compiling with:
gcc -O0 -pedantic -Wall test2.cpp
(gcc version 4.7.2)
I get no errors and the output is 886. Why do I get no errors? Have I not passed the boundary of the allocated space?
In the case that the code is OK... why does the example in the reference do it differently?
In the other (more probable) case... what are the risks?
Thanks!
You don't get any errors because C and C++ don't do bounds checking. You overwrote sections of memory that you weren't using, but you got lucky and it wasn't anything important. Compare it to putting a row of nails into a wall where you know there's a stud. If you miss the stud, most of the time, you just put a hole in the plaster, but it's dangerous to keep doing it because eventually, you're going to hit one of the live wires instead.
You have passed over the boundary of the allocated memory.
However, printf does not care what size of memory you have allocated. All it cares about is that it starts at the beginning and continues until it finds a '\0'.
What you created is undefined behaviour. There can be other data right after your allocated region (maybe another variable), in which case it will get corrupted. If the next part is unallocated memory, you might escape without a visible problem. And if the memory right after your allocation falls in a page that is not mapped for your process, you will see the nice and tidy segmentation fault. The consequences can be even worse, so better not to try this anywhere.
the following can be found in the comments in malloc.c of glibc:

    Minimum overhead per allocated chunk: 4 or 8 bytes
    Each malloced chunk has a hidden word of overhead holding size
    and status information.

    Minimum allocated size: 4-byte ptrs: 16 bytes    (including 4 overhead)
                            8-byte ptrs: 24/32 bytes (including 4/8 overhead)

    When a chunk is freed, 12 (for 4-byte ptrs) or 20 (for 8-byte ptrs
    but 4-byte size) or 24 (for 8/8) additional bytes are needed;
    4 (8) for a trailing size field and 8 (16) bytes for free list
    pointers. Thus, the minimum allocatable size is 16/24/32 bytes.
Since the minimum allocated size is 16/24/32 bytes, which is greater than the 4 bytes your program actually wrote, it ran without errors. This is one possible explanation for why your program appears to execute correctly.

C++ Dereference the Non-allocated Memory but Without Segmentation Fault

I have encountered a problem which I don't understand, the following is my code:
#include <iostream>
#include <stdio.h>
#include <string.h>
#include <cstdlib>
using namespace std;
int main(int argc, char **argv)
{
    const char *format = "The sum of the two numbers is: %d";
    char *presult;
    int sum = 10;
    presult = (char *)calloc(sizeof(format) + 20, 1); // allocate 24 (or 28) bytes
    sprintf(presult, format, sum); // after this operation,
                                   // the length of presult is 33
    cout << presult << endl;
    presult[40] = 'g'; // still no segfault here...
    free(presult);     // calloc'd memory is released with free, not delete
}
I compiled this code on different machines. On one machine the sizeof(format) is 4 bytes and on another, the sizeof(format) is 8 bytes; (On both machines, the char only takes one byte, which means sizeof(*format) equals 1)
However, no matter which machine, the result is still confusing to me. Because even on the second machine, the allocated memory is just 20 + 8 = 28 bytes, and the string obviously has a length of 33, meaning that at least 34 bytes (including the terminator) are needed. But there is NO segmentation fault when I run this program. As you can see, even when I dereference presult at position 40, the program doesn't crash or show any segfault information.
Could anyone help to explain why? Thank you so much.
Accessing unallocated memory is undefined behavior, meaning you might get a segfault (if you're lucky) or you might not.
Or your program is free to display kittens on the screen.
Speculating on why something happens or doesn't happen in undefined behavior land is usually counter-productive, but I'd imagine what's happening to you is that the OS is actually assigning your application a larger block of memory than it's asking for. Since your application isn't trying to dereference anything outside that larger block, the OS doesn't detect the problem, and therefore doesn't kill your program with a segmentation fault.
Because undefined behavior is undefined. It's not "defined to crash".
There is no segfault because there is no reason for there to be one. You are very likely still writing into the heap, since you got memory from the heap, so the memory isn't read-only. Also, the memory there is likely to exist and be mapped for your program, so it's not an access violation. Normally you would get a segfault because you tried to access memory that was not given to you, or because you tried to write to read-only memory. Neither appears to be the case here, so nothing visibly goes wrong.
In fact, writing past the end of a buffer is a common security problem, known as the buffer overflow. It was the most common security vulnerability for some time. Nowadays people are using higher level languages which check for out of index bounds, so this is not as big of a problem anymore.
To respond to this: "the result is still confusing to me. Because even for the second machine, the allocated memory for use is just 20 + 8 which is 28 bytes and obviously the string has a length of 33 meaning that at least 33 bytes are needed."
sizeof(some_pointer) == sizeof(size_t) on typical platforms. You were testing on a 32-bit machine (4 bytes) and on a 64-bit machine (8 bytes).
You have to give malloc/calloc the number of bytes to allocate; sizeof(ptr_to_char) will not give you the length of the string (the number of chars up to the '\0').
Btw, strlen does what you want: http://www.cplusplus.com/reference/cstring/strlen/

Was I just lucky that malloc returned a zero-filled buffer?

I tried running the code below:
#include <iostream>
#include <cstdlib>

int main(){
    char *ptr = (char*)malloc(sizeof(char)*20);
    ptr[5] = 'W';
    ptr[0] = 'H';
    std::cout << ptr << std::endl;
    return 0;
}
When I use GCC 4.3.4, I get "H" as output. The book The C Programming Language by Kernighan & Ritchie says malloc() returns uninitialized space, in which case the output should have been some undefined value. Is my output just a coincidence of ptr[1] being '\0'?
Yes, it's just a coincidence. Don't rely on it.
It may be coincidence, or it may be a feature of your particular system's memory allocator; it is certainly not a language or library requirement to initialise dynamically allocated memory.
Don't confuse non-deterministic with random: even if the memory is uninitialised, the chances of that byte being zero are probably far greater than 1/256, due to typical memory usage patterns. For example, open up a memory window in your debugger on ptr and scroll through the adjacent memory in either direction; you are likely to see zero rather a lot.
It's platform (and compile-mode) dependent: e.g. in a debug build the memory might get filled with some pattern, like 0x55 0xaa 0x55; in a release build you might get 0x00s, or any memory garbage.
So yes, you might call it a coincidence.
You might also easily get a bus error (or invalid memory access), as your string is not always terminated.
Yes. But be careful with terminology: ptr[1] is a char with the value 0 (i.e. '\0'), not a null pointer.