cout prints char[] containing more characters than set length? [duplicate] - c++

This question already has answers here:
What is a buffer overflow and how do I cause one?
(12 answers)
Closed 5 years ago.
#include<iostream>
using namespace std;
int main(void)
{
    char name[5];
    cout << "Name: ";
    cin.getline(name, 20);
    cout << name;
}
Output:
Name: HelloWorld
HelloWorld
Shouldn't this give an error or something?
Also when I write an even longer string,
Name: HelloWorld Goodbye
HelloWorld Goodbye
cmd exits with an error.
How is this possible?
Compiler: G++ (GCC 7), Nuwen
OS: Windows 10

It's called a buffer overflow and is a common source of bugs and exploits. It's the developer's responsibility to ensure it doesn't happen. Character strings will be printed until the first '\0' character is reached.

The code produces "undefined behavior". This means anything might happen. In your case, the program unexpectedly appears to work. It might, however, do something completely different with different compiler flags or on a different system.
Shouldn't this give an error or something?
No. The compiler cannot know that you will input a long string, so there cannot be any compiler error. No runtime exception is thrown here either. It is up to you to make sure the program can handle long strings.
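If you do want the program to handle long input safely, here is a minimal sketch (keeping the question's name buffer, everything else is just illustrative): either pass the real buffer size to getline, or read into a std::string, which grows as needed.
#include <iostream>
#include <limits>
#include <string>
using namespace std;

int main()
{
    // Option 1: tell getline the real buffer size so it stops in time.
    char name[5];
    cout << "Name: ";
    cin.getline(name, sizeof(name));    // stores at most 4 characters plus '\0'
    cout << name << '\n';

    if (cin.fail()) {                   // input was longer than the buffer
        cin.clear();                    // clear the error state
        cin.ignore(numeric_limits<streamsize>::max(), '\n'); // discard the rest of the line
    }

    // Option 2: read into a std::string, which has no fixed capacity to overflow.
    string fullName;
    cout << "Name again: ";
    getline(cin, fullName);
    cout << fullName << '\n';
}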

Your code has encountered UB, also known as undefined behaviour, which, as Wikipedia defines it, is "the result of executing computer code whose behavior is not prescribed by the language specification to which the code adheres". It usually occurs when you do not define variables properly, in this case a char array that is too small.

Even the -Wall flag will not give any warning here, so you can use tools like valgrind and gdb to detect memory leaks and buffer overflows.

You can check those questions:
Array index out of bound in C
No out of bounds error
They have competent answers.
My short answer, based on those already given in the questions I posted:
Your code exhibits undefined behavior (a buffer overflow), so it doesn't give an error when you run it this time, but some other time it may. It's a matter of chance.
When you enter a longer string, you actually corrupt the program's memory (the stack), i.e. you overwrite memory that should contain program-related data with your data, so the return code of your program ends up being different from 0, which is interpreted as an error. The longer the string, the higher the chance of screwing things up (sometimes even short strings do).
You can read more here: https://en.wikipedia.org/wiki/Buffer_overflow

Related

I don't understand memory allocation and strcpy [duplicate]

This question already has answers here:
No out of bounds error
(7 answers)
Accessing an array out of bounds gives no error, why?
(18 answers)
Closed 3 years ago.
Here's a sample of my code:
char chipid[13];
void initChipID(){
write_to_logs("Called initChipID");
strcpy(chipid, string2char(twelve_char_string));
write_to_logs("Chip ID: " + String(chipid));
}
Here's what I don't understand: even if I define chipid as char[2], I still get the expected result printed to the logs.
Why is that? Shouldn't the allocated memory space for chipid be overflown by the strcpy, and only the first 2 char of the string be printed?
Here's what I don't understand: even if I define chipid as char[2], I still get the expected result printed to the logs.
Then you are (un)lucky. You are especially lucky if the undefined behavior produced by the overflow does not manifest as corruption of other data, yet also does not crash the program. The behavior is undefined, so you should not interpret whatever manifestation it takes as something you should rely upon, or that is specified by the language.
Why is that?
The language does not specify that it will happen, and it certainly doesn't specify why it does happen in your case.
In practice, the manifestation you observe is as if the strcpy writes the full data into memory at the location starting at the beginning of your array and extending past its end, overwriting anything else your program may have stored in that space, and as if the program subsequently reads it back via a corresponding overflowing read.
Shouldn't the allocated memory space for chipid be overflown by the strcpy,
Yes.
and only the first 2 char of the string be printed?
No, the language does not specify what happens once the program exercises UB by performing a buffer overflow (or by other means). But also no, C arrays are represented in memory simply as a flat sequence of contiguous elements, with no explicit boundary. This is why C strings need to be terminated. String functions do not see the declared size of the array containing a string's elements; they see only the element sequence.
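As a quick illustration of that last point (a sketch added here, not part of the answer above): strlen, like the other string functions, stops at the first '\0' and never sees the declared array size.
#include <cstring>
#include <iostream>

int main()
{
    // 8 elements, but the string stored in it ends at the first '\0'.
    char buf[8] = {'h', 'i', '\0', 'x', 'y', 'z', '\0', '!'};

    std::cout << std::strlen(buf) << '\n'; // prints 2: strlen stops at the first '\0'
    std::cout << sizeof(buf) << '\n';      // prints 8: the declared size, known only to the compiler
}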
You have it partly correct: "the allocated memory space for chipid [will] be overflown by the strcpy" -- this is true. And this is why you get the full result (though the result of an overflow is undefined and could just as well be a crash or something else entirely).
C and C++ give you a lot of power when it comes to memory, and with great power comes great responsibility. What you are doing gives undefined behaviour, meaning it may work, but it will definitely give you problems later on when strcpy writes to memory it is not supposed to write to.
You will find that you can get away with A LOT of things in C/C++. However, things like these will give you headaches later on when your program unexpectedly crashes, possibly in an entirely different part of your program, which makes it difficult to debug.
Anyway, if you are using C++, you should use std::string, which makes things like this a lot easier.
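A minimal, self-contained sketch of that suggestion follows; write_to_logs and twelve_char_string are hypothetical stand-ins for the question's helpers, which are not shown.
#include <iostream>
#include <string>

// Stand-ins so the sketch compiles on its own; the real ones come from the question's code.
static const std::string twelve_char_string = "0123456789AB";
static void write_to_logs(const std::string& msg) { std::cout << msg << '\n'; }

// std::string manages its own storage, so there is no fixed buffer to overflow.
std::string chipid;

void initChipID()
{
    write_to_logs("Called initChipID");
    chipid = twelve_char_string;          // copies however many characters there are
    write_to_logs("Chip ID: " + chipid);
}

int main()
{
    initChipID();
}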

Why does this code with a character array, which was given a variable as size, compile? [duplicate]

This question already has answers here:
Variable Length Array (VLA) in C++ compilers
(2 answers)
Closed 3 years ago.
Playing around with templates, I ran into an interesting phenomenon w.r.t. arrays and defining their size, which I thought was not allowed in C++.
I used a global variable to define the size of an array inside main(), and it worked for some reason (see the code below).
1) Why does this even compile? I thought only constexpr may be used for array size
2) Suppose the above is valid, it still does not explain why it works even when sz = 8 which is clearly less than the size of the character string
3) Why are we getting that weird '#' and '?'? I tried different combinations of strings, for example, only number characters ("123456789"), and it did not appear.
Appreciate any help. Thanks.
Here is my code
#include <iostream>
#include <cstring>
using namespace std;
int sz = 8;
int main()
{
    const char temp[sz] = "123456abc"; //9 characters + 1 null?
    cout << temp << endl;
    return 0;
}
Output:
123456ab�
#
1) Why does this even compile? I thought only constexpr may be used for array size
The asker is correct that Variable Length Arrays (VLAs) are not Standard C++. The g++ compiler includes support for VLAs as an extension. How can this be? A compiler developer can add pretty much anything they want so long as the behaviour required by the Standard is met. It is up to the developer to document the behaviour.
2) Suppose the above is valid, it still does not explain why it works even when sz = 8 which is clearly less than the size of the character string
Normally g++ emits an error if an array is initialized with values that exceed the size of the array. I have to confirm with documentation to see whether the C++ Standard requires an error in this case. It seems like a good place for an error, but I can't remember if an error is required.
In this case it appears that the VLA extension has a side effect that eliminates the error g++ normally spits out for trying to overfill an array. This makes a lot of sense since the compiler doesn't know the size of the array, in the general case, at compile time and cannot perform the test. No test, no error.
None of this is covered by the C++ Standard because VLA is not supported.
A quick look through the C standard, which does permit VLAs, didn't turn up any guidance for this case. In C, the rules for initializing VLAs are pretty simple: you can't. The compiler doesn't know how big the array will be at compile time, so it can't initialize it. There may be an exception for string literals, but I haven't found one.
I also have not found a good description of the behaviour in GCC documentation.
clang produces the error I expect based on my read of the C standard: error: variable-sized object may not be initialized
Addendum: Probably should have checked Stack Overflow first and saved myself a lot of time: Initializing variable length array .
3) Why are we getting that weird '#' and '?'? I tried different combinations of strings, for example, only number characters ("123456789"), and it did not appear.
What appears to be happening, and since none of this is standard I have no quotes to back it up, is the string literal is copied into the VLA up to the size of the VLA. The portions of the literal past the end of the VLA are silently discarded. This includes the null terminator, and the behaviour is undefined when printing an unterminated char array.
Solution:
Stick to Standardized behaviour where possible. The major compilers have options to warn you of non-Standard code. Use -pedantic with compilers using the gcc options and /permissive- with recent versions of Visual Studio. When forced to step outside the Standard, consult the compiler documentation or the implementers themselves for the sticky-or-undocumented bits.
If you don't get good answers, try to find another path.
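For this particular snippet, a standard-conforming sketch would make the size a compile-time constant (or let the compiler deduce it), so the compiler can actually diagnose an initializer that is too long:
#include <iostream>

int main()
{
    constexpr int sz = 10;               // compile-time constant: a normal array, not a VLA
    const char temp[sz] = "123456abc";   // 9 characters + '\0' fit exactly; one more would be an error

    const char temp2[] = "123456abc";    // or let the compiler size the array from the literal

    std::cout << temp << '\n' << temp2 << '\n';
    return 0;
}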

What do the memory operations malloc and free exactly do?

Recently I ran into a memory release problem. First, below is the C code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int *p = (int*) malloc(5 * sizeof(int));
    int i;
    for (i = 0; i < 5; i++)
        p[i] = i;
    p[i] = i;
    for (i = 0; i < 6; i++)
        printf("[%p]:%d\n", p + i, p[i]);
    free(p);
    printf("The memory has been released.\n");
}
Apparently, there is a memory out-of-range problem. And when I use the VS2008 compiler, it gives the following output and some errors about memory release:
[00453E80]:0
[00453E84]:1
[00453E88]:2
[00453E8C]:3
[00453E90]:4
[00453E94]:5
However, when I use the gcc 4.7.3 compiler under Cygwin, I get the following output:
[0x80028258]:0
[0x8002825c]:1
[0x80028260]:2
[0x80028264]:3
[0x80028268]:4
[0x8002826c]:51
The memory has been released.
Apparently, the code runs normally, but 5 is not written to the memory.
So maybe there are some differences between VS2008 and gcc in handling these problems.
Could you guys give me a professional explanation of this? Thanks in advance.
This is normal, as you have never allocated any data into the memory space of p[5]. The program will just print whatever data was stored in that space.
There's no deterministic "explanation on this". Writing data into the uncharted territory past the allocated memory limit causes undefined behavior. The behavior is unpredictable. That's all there is to it.
It is still strange though to see that 51 printed there. Typically GCC will also print 5 but fail with a memory corruption message at free. How you managed to make this code print 51 is not exactly clear. I strongly suspect that the code you posted is not the code you ran.
It seems that you have multiple questions, so, let me try to answer them separately:
As pointed out by others above, you write past the end of the array, so once you have done that, you are in "undefined behavior" territory; this means that anything could happen, including printing 5, 6 or 0xdeadbeef, or blowing up your PC.
In the first case (VS2008), free appears to report an error message on standard output. It is not obvious to me what this error message is, so it is hard to explain what is going on, but you ask later in a comment how VS2008 could know the size of the memory you release. Typically, if you allocate memory and store it in pointer p, many memory allocators (the malloc/free implementation) store the size of the allocated memory at p[-1]. In practice, it is also common to store a special value (say, 0xdeadbeef) at address p[p[-1]]. This "canary" is checked upon free to see if you have written past the end of the array. To summarize, your 5*sizeof(int) array is probably at least 5*sizeof(int) + 2*sizeof(char*) bytes long, and the memory allocator used by code compiled with VS2008 has quite a few checks built in.
In the case of gcc, I find it surprising that you get 51 printed. If you wanted to investigate why that is exactly, I would recommend getting an asm dump of the generated code as well as running this under a debugger to check whether 5 is actually written past the end of the array (gcc could well have decided not to generate that code because it is "undefined") and, if it is, to put a watchpoint on that memory location to see who overwrites it, when, and why.
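For completeness, here is a corrected sketch of the question's program that stays inside the allocation; the allocation is simply made one int larger so that p[5] is valid (dropping the extra write would work just as well):
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *p = (int*) malloc(6 * sizeof(int));   /* room for p[0]..p[5] */
    if (p == NULL)
        return 1;

    int i;
    for (i = 0; i < 6; i++)
        p[i] = i;

    for (i = 0; i < 6; i++)
        printf("[%p]:%d\n", (void*)(p + i), p[i]);

    free(p);
    printf("The memory has been released.\n");
    return 0;
}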

Weird behavior of C++ in this simple 3 line program, regarding arrays? [duplicate]

This question already has answers here:
Accessing an array out of bounds gives no error, why?
(18 answers)
Closed 9 years ago.
Why does this work without any errors?
int array[2];
array[5] = 21;
cout << array[5];
It printed out 21 just fine. But check this out! I changed 5 to 46 and it still worked. But when I put 47, it didn't print anything. And showed no errors anywhere. What's up with that!?!?
Because it's simply undefined behaviour (there are no bounds checks for arrays in C++). Anything can happen.
Simply put, array[5] is equivalent to *(&array[0] + 5); you are trying to write to/read from memory that you did not allocate.
In the C and C++ language there are very few runtime errors.
When you make a mistake, what happens instead is that you get "undefined behavior", and that means that anything may happen. You can get a crash (i.e. the OS will stop the process because it's doing something nasty) or you can just corrupt memory in your program and things seem to work anyway until someone needs to use that memory. Unfortunately, the second case is by far the most common, so when a program writes outside an array, it normally crashes only a million instructions later, in a perfectly innocent and correct part of the code.
The main philosophical assumption of C and C++ is that a programmer never makes mistakes like accessing an array with an out-of-bounds index, deallocating the same pointer twice, generating a signed integer overflow during a computation, dereferencing a null pointer, and so on.
This is also the reason why trying to learn C/C++ just by using a compiler and experimenting with code is a terrible idea: you will not be notified of this pretty common kind of error.
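If you do want a real runtime error on an out-of-bounds access, the checked accessors of the standard containers provide one; a short sketch (an addition to the answer above, not part of it):
#include <iostream>
#include <stdexcept>
#include <vector>

int main()
{
    std::vector<int> v(2);      // two elements, valid indices are 0 and 1

    // v[5] = 21;               // would compile: operator[] does no bounds check (undefined behaviour)

    try {
        v.at(5) = 21;           // .at() checks the index and throws instead of corrupting memory
    } catch (const std::out_of_range& e) {
        std::cout << "caught: " << e.what() << '\n';
    }
}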
The array has 2 elements, but you are assigning array[5] = 21;, which means 21 is written to memory outside the array. Depending on your system and environment, array[46] happens to land on memory that can hold a number, but array[47] doesn't.
You should do this instead (an index of 47 needs an array of at least 48 elements):
int array[48];
array[47] = 21;
cout << array[47];

Related to strings

//SECTION I:
void main()
{
char str[5] = "12345"; //---a)
char str[5] = "1234"; //---b)
cout<<"String is: "<<str<<endl;
}
Output: a) Error: Array bounds Overflow.
b) 1234
//SECTION II:
void main()
{
    char str[5];
    cout << "Enter String: ";
    cin >> str;
    cout << "String is: " << str << endl;
}
I tried with many different input strings, and to my surprise, I got a strange result:
Case I: Input String: 1234, Output: 1234 (No issue, as this is expected behavior)
Case II: Input String: 12345, Output: 12345 (No error reported by the compiler, but I was expecting an Error: Array bounds Overflow.)
Case III: Input String: 123456, Output: 123456 (No error reported by the compiler, but I was expecting an Error: Array bounds Overflow.)
.................................................
.................................................
Case VI: Input String: 123456789, Output: 123456789 (Error: unhandled exception. Access Violation.)
My doubt is: when I assigned more characters than its capacity in SECTION I, the compiler reported ERROR: Array bounds Overflow.
But when I try the same thing in SECTION II, I am not getting any errors. WHY is this so? Please note: I executed this on Visual Studio.
char str[5] = "12345";
This is a compile-time error. You assign a string that needs 6 bytes of storage (5 characters plus the appended null terminator) to an array of size 5.
char str[5];
cin>>str;
This may yield a runtime error. Depending on how long the string you enter is, the buffer you provide (size 5) may be too small (again, mind the null terminator).
The compiler, of course, can't check your user input at runtime. If you're lucky, an access violation like this is flagged by a segmentation fault. Truly, anything can happen.
Throwing exceptions on access violations is not mandatory. To address this, you can implement array boundary checking yourself, or alternatively (probably better) use container classes that adapt their size as necessary (std::string):
std::string str;
cin >> str;
What you are seeing is undefined behavior. You are writing to the array out of bounds; anything might happen in that case (including seeing the output you expect).
I tried with many different input strings, and to my surprise, I got a strange result:
This phenomenon is called undefined behavior (UB).
As soon as you enter more characters than a char array can hold, you invite UB.
Sometimes it may work, sometimes it may not, and sometimes it may crash. In short, there is no definite pattern.
[side note: If a compiler allows void main() to get compiled then it's not standard compliant.]
It's because in the second case, the compiler can't know; the bug only exists at runtime. C/C++ offer no runtime bounds checking by default, so the error is not recognized but "breaks" your program. However, this breakage doesn't have to show up immediately: while str points to a place in memory with only a fixed number of bytes reserved, you just write more bytes anyway.
Your program's behavior will be undefined. Usually, and also in your particular case, it may continue working and you'll simply have overwritten some other component's memory. Once you write so much that you access forbidden memory (or free the same memory twice), your program crashes.
char str[5] = "12345" - in this case you didn't leave room to the null terminator. So when the application tries to print str, it goes on and on as long as a null is not encountered, eventually stepping on foridden memory and crashing.
In the cin case, the cin operation stuffed a 0 at the end of the string, which stops the program from going too far in memory.
However, in my experience, when you break the rules with a memory overrun, things go crazy, and looking for a reason why it works in one case but not in another doesn't lead anywhere. More often than not, the same application (with a memory overrun) can work on one PC and crash on another, due to their different memory states.
Because the compiler does a static check. In your SECTION II, the size is unknown in advance; it depends on the length of your input.
May I suggest you use the STL string (std::string)?