Buffer overflow - The changes of variables - c++

void go()
{
//{1}
char buffer[2];
gets(buffer);
//{2}
cout << allow;
}
I tried to run the procedure above in 2 cases:
-1st: I declare "int allow;' at position 1
-2nd: I declare "int allow;' at position 2
In both cases, when i tried to enter the string "123" (without the quotation marks), the allow's value was 51.
However, as I read about the memory layout, only in the first case, the position of "allow" in the stack is before buffer, which means that when the string is longer than the buffer, the value of "allow" is changed.
Then, I tried to declare "char sth[10]" in both position. This time, only when I declared sth in first position, the value of it was changed.
Can anyone explain what happened?

Since changing allow via overflow is Undefined Behavior, the compiler might even not have a variable allow at all and change your code to cout << 0 instead when compiling with optimization. This is not a valid way to check for overflow, regardless of where you put allow.
To emphasize: All changes of allow you observe are the result of UB. There are no guarantees on this in the standard what so ever. You can go ahead and speculate on why you see this output today, on you system, with this very toolchain, but the outcome might change to anything (like your program moving your lawn or stealing the crown jewels) for any reason.
Indeed, there is no way to use gets safely. This is why it is removed in both the current C++ and C standard.
You can use std::string and std::getline instead:
string buffer;
std::getline(std::cin, buffer);

Related

String input/output

I am kinda new to programming, so pardon me for any stupid questions :P, i noticed that when i write this
int main()
{
string s;
s[0]='a';
cout<<s.size();
cout<<s<<" ";
cout<<s[0];
return 0;
}
The output is 0 a , firstly why is the size 0? and why didn't anything get printed when i write cout<<s; but it gives a for cout<<s[0]. if i use push_back it gives normal out put.
int main()
{
string s;
s.push_back('a');
cout<<s.size();
cout<<s<<" ";
cout<<s[0];
return 0;
}
output :- 1a a
I might have overlooked something but i would be really appreciate if some could point out the cause.
Thank you.
EDIT: Such fast replies! thanks for your answers, i couldn't figues out how to reply to comments so edited the question body(first time using stackoverflow),
(any help on this would be appreciated as well), one more thing so when i use cout<<s[0] does it give a because a was stored on the next address of string
s?
and once again thanks for clearing that up!
What you've overlooked is that in C++ strings don't automatically grow when you assign characters to them. So
string s;
s[0]='a';
is an error because the s has size zero so there is no 'room' for the character 'a'. The correct way to add a character to a string is to use push_back which is why your second example works.
Because of the error your first example has what's called undefined behaviour (UB for short). This means the output of your program is not predictable at all, and it's more or less a waste of time asking why it outputs what it does. It could just as easily crash.
This:
string s;
creates a string containing no characters. Then this:
s[0]='a';
attempts to make a change to one of those non-existent characters. The result of this is undefined behaviour - your program goes into an unknown state.
If you would like to make your compiler warn you about this problem, you can use the at() member function of string:
s.at(0) = 'a';
Now your program will throw an std::out_of_range exception when you try to change that non-existant character.
Containers don't automatically allocate storage, so you are writing outside the allocated storage. In other words, that's a bug in your program. One advise: Many C++ implementations have a way to activate diagnostics for debugging programs, those would have caught this error.
when I write this
string s;
s[0]='a';
the output is 0, firstly why is the size 0?
The output is zero because operator[i] assumes that the string has sufficient space to store i+1 characters. In your case, string's size is zero, so accessing element at index 0 is undefined behavior.
and why didn't anything get printed when I write to cout
The same thing reason applies: after undefined behavior the program could output anything, but it happens to output nothing.
int main()
{
string s;
s='a';
cout<<s.size();
cout<<s<<" ";
cout<<s[0];
return 0;
}
Just take off [0] after s in initialisation because s is of type string not type char.
Just write s and it will work.

strange behavior of std::string assign,clear and operator[]

I am observing some strange behavior of string operation.
Ex :
int main()
{
std::string name("ABCDEFGHIJ");
std::cout << "Hello, " << name << "!\n";
name.clear();
std::cout << "Hello, " << name << "!\n";
name.assign("ABCDEF",6);
std::cout << "Hello, " << name << "!\n";
std::cout << "Hello, " << name[8] << "!\n";
}
Output:
Hello, ABCDEFGHIJ!
Hello, !
Hello, ABCDEF!
Hello, I!
string::clear is actually not clearing because I am able to access the data even after clear. As per documentation when we are accessing something out of bound the result is undefined. But here I am getting the same result every time.
Can somebody explains how it works at memory level when we call clear or opeartor[].
Welcome to C++'s amazing attraction called "undefined behavior".
When name contains a six-character string, "ABCDEF", name[8] attempts to access a nonexistent member of the string, which is undefined behavior.
Which means that the result of this operation are completely meaningless.
The C++ standard does not define the result of accessing a nonexistent member character of the string; hence the undefined behavior. The potential results of this operation can be:
Some previous value that was in the string, at the given position.
Some garbage, random character.
Your program crashes.
Anything else.
A result that's different every time you execute the program, selected from options 1 through 4.
name.assign("ABCDEF",6);
Now the string has length 6. So you may legally only access elements 0 through 5.
std::cout << "Hello, " << name[8] << "!\n";
Therefore this is Undefined Behaviour. The compiler is free to do whatever the hell it pleases. Not just with the statement, but with the whole program, even the preceding lines!
At this time, it returned the character that used to be at that position earlier. It could have returned anything else, it could have crashed, it could have skipped that statement altogether, it could have skipped the assignment and many other funny things (up to and including making daemons fly out of your nose!).
And I am saying that because all that behaviour (except the daemons) can be actually observed in the wild in various circumstances.
As others said, accessing an std::string outside it's logical boundaries (i.e. [0, size()], notice that size() is included) is undefined behavior, so the compiler can make anything happen.
Now, the particular flavor of UB you are seeing is nothing particularly unexpected.
clear() just zeroes the logical length of the string, but the memory that it used is retained (it's actually required by the standard, and quite some code would work way slower without this behavior).
Given that there's no good reason to waste time in zeroing out the old data, by accessing the string out of bounds you are seeing what was at that index previously.
This may change if you e.g. call the shrink_to_fit() method after clear(), which asks to the string to free all the extra memory it's keeping.
I'd like to add to the other answers that you can use std::string::at instead of using the operator[].
std::string::at does boundary checking and will throw a std::out_of_range when you try to access an element that is out of range.
[I ran your code through a debugger. Take note of the capacity of the string. It is still 15. "assign" did not change the capacity. SO you won't get "garbage" value as everyone is saying. You're getting the exact same data which is stored in the same location. As stated the string is just a pointer to a memory address. It will go over x bytes to access the element. name[8] is a constant value it will go to the exact same memory location.
Here is a picture of the string in debugger

Overflowing char array overwrites the exact same string every time - why?

I have the following code that shows the dangers of using char arrays over strings:
int main(){
char password[] = "SECRET";
char msg[10], ch;
int i = 0;
cout << "Please enter your name:";
while((ch = getchar()) != '\n'){
msg[i++] = ch;
}
msg[i] = '\0';
cout << "\n\nHello " << msg << endl;
cout << "The password is " << password;
}
When I enter a name (stored in char msg[10]) that is longer than 16 characters, everything after those 16 characters replaces the value stored in char password[] ("SECRET").
Why is this the case? (a general curiosity)
Why 16 characters and not 10 - the size of the array?
Why is it always password that gets overwritten and not some other variable or some other part of the memory where I wouldn't notice immediately?
What's the benefit of using char[] over strings then?
EDIT: Updated with follow up questions:
5. In response to the argument that password and msg are declared next to each other, I shuffled the declaration block as follows:
char password[] = "SECRET";
char ch;
int i = 0;
char msg[10];
However, no change.
6. In response to the argument that it was chance that caused the gap between msg and password to be 6 (bytes?) long, I have recompiled the code many times, including the reshuffling above. Still, no change.
Any suggestions as to why?
The answer for your first three questions is the same: because that's how your compiler chose to lay out these variables on the stack. Nothing in the standard guarantees that - in fact, what you're doing is undefined behavior - anything could happen.
Change compilers, or even compiler settings, and other things might happen. Or not. There's no telling.
As for 4, except for interoperability with C code, or other APIs that require C-style strings, essentially none.
1 . In your case, memory is stored like that:
msg | |i |password
| | | | | | | | | | |1|2|3|4|5|6|S|E|C|R|E|T|\0
Then you write on msg progressively:
msg | |password
|A|Z|E|R|T|Y|U|I|O|P|1|2|3|4|5|6|S|E|C|R|E|T|\0
But if you continue:
msg | |password
|A|Z|E|R|T|Y|U|I|O|P|1|2|3|4|5|6|Q|W|E|R|T|Y|
Because char array doesn t check for length. (Search for overflow).
2 .You write on memory, you erase everything in between, maybe i or something that doesn t belong to your program.
3 .So it take 6 char before you overwrite password. It could have been 0char as well as millions.
4 .Unless you store a defined array of byte... Nothing, that is the point that code prove.
UPDATE:
Changing the place of code won t change padding, add variable, array, or better: use a different compiler, so that even after optimisation, the binary change.
Recompiling will not change the binary produced, because the compiler wil do the exact same thing.
Your two arrays msg and password are static, and therefore have been placed on the stack, meaning they're near each other.
The specifics are implementation dependent and are likely to change between compilers and optimisation levels. It's possible that the compiler has padded the stack a bit when allocating memory and there is a 16 byte gap between msg[0] and password[0].
password gets overwritten everytime because it just happens to be above msg on your stack. If you used a different compiler, or swapped their positions around in code, it might not be. How things are allocated on the stack isn't going to change between executions; it's determined at compile time (it's static), not runtime.
Note that, in principle, the compiler is free to do anything it wants! We can only make educated guesses about what'll happen given typical compiler behaviour.
If you really want to know what's going on, you have to look at the ouput assembly.
std::string (for C++) is usually preferable to char[] - it's far safer as it implements bound checking and manages its own memory.
1) writing outside an array will access something else.
2) alignment probably.
3) chance. anything can happen.
4) nothing!

Pointer increment and decrement

I was solving a question my teacher gave me and hit a little snag.
I am supposed to give the output of the following code:(It's written in Turbo C++)
#include<iostream.h>
void main()
{
char *p = "School";
char c;
c=++(*(p++));
cout<<c<<","<<p<<endl;
cout<<p<<","<<++(*(p--))<<","<<++(*(p++))<<endl;
}
The output the program gives is:
T,chool
ijool,j,i
I got the part where the pointer itself increments and then increments the value which the pointer points to. But i don't get the part where the string prints out ijool
Can someone help me out?
The program you showed is non-standard and ill-formed (and should not compile).
"Small" problems:
The proper header for input/output streams in C++ is <iostream>, not <iostream.h>
main() returns an int, not a void.
cout and endl cannot be used without a using namespace std; at the beginning of the file, or better: use std::cout and std::endl.
"Core" problems:
char* p = "School"; is a pointer to string litteral. This conversion is valid in C++03 and deprecated in C++11. Aside from that, normally string litterals are read only, and attempts to modify them often result in segfaults (and modifying a string litteral is undefined behvior by the standard). So, you have undefined behavior everytime you use p, because you modify what it points to, which is the string litteral.
More subtle (and the practical explanation): you are modifying p several times in the line std::cout<<p<<","<<++(*(p--))<<","<<++(*(p++))<<std::endl;. It is undefined behavior. The order used for the operations on p is not defined, here it seems the compiler starts from the right. You can see sequence points, sequence before/after for a better explanation.
You might be interested with the live code here, which is more like what you seemed to expect from your program.
Let's assume you correct:
the header to <iostream> - there is no iostream.h header
your uses of cout and endl with std::cout and std::endl respectively
the return type of main to int
Okay,
char *p = "School";
The string literal "School" is of type "array of 7 const char." The conversion to char* was deprecated in C++03. In C++11, this is invalid.
c=++(*(p++));
Here we hit undefined behaviour. As I said before, the chars in a string literal are const. You simply can't modify them. The prefix ++ here will attempt to modify the S character in the string literal.
So from this point onwards, there's no use making conjectures about what should happen. You have undefined behaviour. Anything can happen.
Even if the preceding lines were legal, this line is also undefined behavior, which means that you cannot accurately predict what the output will be:
cout<<p<<","<<++(*(p--))<<","<<++(*(p++))<<endl;
Notice how it modifies the value of p multiple times on that line (really between sequence points)? That's not allowed. At best you can say "on this compiler with this run-time library and this environment at this moment of execution I observed the following behavior", but because it is undefined behavior you can't count on it to do the same thing every time you run the program, or even if the same code is encountered multiple times within the same run of the program.
There are at least three problems with this code (and maybe more; I'm not a C++ expert).
The first problem is that string constants like should not be modified as they can be placed in read-only parts of the program memory that the OS maps directly to the exe file on disk (the OS may share them between several running instances of that same program for example, or avoid those parts of memory needing to be written to the swap file when RAM is low, as it knows it can get the original from the exe). The example crashes on my compiler, for example. To modify the string you should allocate a modifiable duplicate of the string, such as with strdup.
The second problem is it's using cout and endl from the std namespace without declaring that. You should prefix their accesses with std:: or add a using namespace std; declaration.
The third problem is that the order in which the operations on the second cout line happen is undefined behavior, leading to the apparently mysterious change of the string between the time it was displayed at the end of the first cout line and the next line.
Since this code is not intended to do anything in particular, there are different, valid ways you could fix it. This will probably run:
#include <iostream>
#include <string.h>
#include <stdlib.h>
using namespace std;
int main()
{
char *string = strdup("School");
char *p = string;
char c;
c=++(*(p++));
cout<<c<<","<<p<<endl;
cout<<p<<","<<++(*(p--))<<","<<++(*(p++))<<endl;
free(string);
}
(On my compiler this outputs: T,chool, diool,i,d.)
It still has undefined behavior though. To fix that, rework the second cout line as follows:
cout << p << ",";
cout << ++(*(p--)) << ",";
cout << ++(*(p++)) << endl;
That should give T,chool, chool,d,U (assuming a character set that has A to Z in order).
p++ moves the position of p from "School" to "chool". Before that, since it is p++, not ++p, it increments the value of the char. Now c = "T" from "S"
When you output p, you output the remainder of p, which we identified before as "chool".
Since it is best to learn from trial and error, run this code with a debugger. That is a great tool which will follow you forever. That will help for the second set of cout statements. If you need help with gdb or VS debugger, we can walk through it.

How to prevent copying a wild pointer string

My program is crash intermittently when it tries to copy a character array which is not ended by a NULL terminator('\0').
class CMenuButton {
TCHAR m_szNode[32];
CMenuButton() {
memset(m_szNode, '\0', sizeof(m_szNode));
}
};
int main() {
....
CString szTemp = ((CMenuButton*)pButton)->m_szNode; // sometime it crashes here
...
return 0;
}
I suspected someone had not copied the character well ended by '\0', and it ended like:
Stack
m_szNode $%#^&!&!&!*#*#&!(*#(!*##&#&*&##!^&*&#(*!#*((*&*SDFKJSHDF*(&(*&(()(**
Can you tell me what is happening and what should i do to prevent the copying of wild pointer? Help will be very much appreciated!
I guess I'm unable to check if the character array is NULL before copying...
I suspect that your real problem could be that pButton is a bad pointer, so check that out first.
The only way to be 100% sure that a pointer is correct, and points to a correctly sized/allocated object is to never use pointers you didn't create, and never accept/return pointers. You would use cookies, instead, and look up your pointer in some sort of cookie -> pointer lookup (such as a hash table). Basically, don't trust user input.
If you are more concerned with finding bugs, and less about 100% safety against things like buffer overrun attacks, etc. then you can take a less aggressive approach. In your function signatures, where you currently take pointers to arrays, add a size parameter. E.g.:
void someFunction(char* someString);
Becomes
void someFunction(char* someString, size_t size_of_buffer);
Also, force the termination of arrays/strings in your functions. If you hit the end, and it isn't null-terminated, truncate it.
Make it so you can provide the size of the buffer when you call these, rather than calling strlen (or equivalent) on all your arrays before you call them.
This is similar to the approach taken by the "safe string functions" that were created by Microsoft (some of which were proposed for standardization). Not sure if this is the perfect link, but you can google for additional links:
http://msdn.microsoft.com/en-us/library/ff565508(VS.85).aspx
There are two possibilities:
pButton doesn't point to a CMenuButton like you think it does, and the cast is causing undefined behavior.
The code that sets m_szNode is incorrect, overflowing the given size of 32 characters.
Since you haven't shown us either piece of code, it's difficult to see what's wrong. Your initialization of m_szNode looks OK.
Is there any reason that you didn't choose a CString for m_szNode?
My approach would be to make m_szNode a private member in CMenuButton, and explicitly NULL-terminate it in the mutator method.
class CMenuButton {
private:
TCHAR m_szNode[32];
public:
void set_szNode( TCHAR x ) {
// set m_szNode appropriately
m_szNode[ 31 ] = 0;
}
};