Reading char* from stream - another buffer overflow fiasco? - c++

Today I've discovered that the following compiles and prints 42:
#include <iostream>
#include <sstream>
int main()
{
std::stringstream s;
s << 42;
char c[8];
s >> c;
std::cout << c;
}
But this is a potential buffer overflow attack, right? If we are reading from the user-supplied stream, we can't easily know the size of the data and therefore can't allocate enough storage. std::gets was removed, maybe this should be too?

Well, you can prevent the buffer overflow in this case by writing (with <iomanip> included):
s >> std::setw(sizeof c) >> c;
So I think it is more akin to the case of fgets, which can be used to shoot yourself in the foot, but can also be used correctly and it is a perfectly viable option when used correctly.
I expect there is still enough live code that uses this overload of operator>> that it's not really viable to deprecate it, e.g.:
void func(char *buf, size_t buf_len)
{
std::cin >> std::setw(buf_len) >> buf;
}
But for writing new code, my advice would be to avoid C-style arrays entirely. Instead use std::string, std::array, or other such containers, with which it is harder to cause a buffer overflow.
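A minimal sketch of the bounded read described above, for the case where a fixed-size array is unavoidable. The helper name read_word is hypothetical. Taking the array by reference keeps its size attached to the buffer, and the same code also compiles under C++20, where the char* overload of operator>> was replaced by one that accepts only arrays of known bound. The second overload shows the std::string alternative for comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iomanip>
#include <istream>
#include <sstream>   // only needed for the usage example below
#include <string>

// Hypothetical helper: reads one whitespace-delimited word into buf,
// storing at most N - 1 characters plus the terminating '\0'.
template <std::size_t N>
void read_word(std::istream& in, char (&buf)[N])
{
    static_assert(N > 1, "buffer needs room for one character and the terminator");
    buf[0] = '\0';                                    // defined content even if nothing is read
    in >> std::setw(static_cast<int>(N)) >> buf;      // setw caps the extraction at N - 1 characters
    assert(std::strlen(buf) < N);                     // postcondition: the word always fits
}

// The same read with std::string, which grows as needed.
std::string read_word(std::istream& in)
{
    std::string word;
    in >> word;
    return word;
}
```

With std::istringstream in("abcdefghijkl xyz") and char buf[8], calling read_word(in, buf) stores only the first seven characters; the rest of the word stays in the stream for the next read.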

Related

scanf function for strings

The problem is simple: the code below does not work. It exits with Process finished with exit code -1073740940 (0xC0000374). Removing the ampersand does not change anything.
int main(){
string x;
scanf("%s",&x);
cout << x;
}
scanf() with the %s format specifier reads bytes into a preallocated character array (char[]), to which you pass a pointer.
Your x is not a character array. It is a std::string, a complex object.
A std::string* is not in any way the same as a char*. Your code overwrites the memory of parts of a complex object in unpredictable ways, so you end up with a crash.
Your compiler should have warned about this, since it knows that a char* is not a std::string*, and because compilers are clever and can detect mistakes like this despite the type-unsafe nature of C library functions.
Even if this were valid via some magic compatibility layer, the string is empty.
Use I/O streams instead.
You cannot pass complex objects through the ... (variadic) parameter of printf/scanf. Many compilers print a warning for that.
scanf requires a pointer of type char* pointing to sufficient storage for an argument of %s. std::string is something completely different.
In C++ the iostream operators are intended for text input and output.
cin >> x;
will do the job.
You should not use scanf in C++. There are many pitfalls, you found one of them.
Another pitfall: a bare %s in scanf can overflow the buffer (undefined behavior) unless you can guarantee that the source stream contains only strings of limited size. In that case a buffer of char buffer[size]; is the right target.
In any other case you should at least restrict the size of the string to scan. E.g. use %20s and of course a matching char buffer, char buffer[21]; in this case. Note the size + 1 for the terminating null character.
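The width rule above can be sketched as follows. This uses std::sscanf on a string instead of scanf on stdin so that it runs without console input; the format rules are the same. The helper name first_token and the 20-character limit are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: extracts the first whitespace-delimited token of
// `input`, keeping at most 20 characters of it.
std::string first_token(const char* input)
{
    assert(input != nullptr);
    char buffer[21] = "";                       // 20 characters + terminating '\0'
    // The width in "%20s" must be one less than the array size.
    if (std::sscanf(input, "%20s", buffer) != 1) {
        return std::string();                   // empty or whitespace-only input
    }
    return std::string(buffer);
}
```

A token longer than 20 characters is cut off at the limit rather than written past the end of the array.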
You should use cin. But if you want to use scanf() for whatever reason and still manipulate your strings with std::string, then you can read the C-string and use it to initialize your C++ string.
#include <iostream>
#include <cstdio>
#include <string>
using std::cout;
using std::string;
int main()
{
char c_str[80];
scanf("%79s", c_str); // width = array size - 1, so the read cannot overflow
string str(c_str);
cout << str << "\n";
}
If you want to use strings, use cin (or getline).
string s;
cin>>s; //s is now read
If you want to use scanf, you want to have a char array (and don't use &):
char text[30];
scanf("%29s", text); //text is now read (at most 29 characters)
You can use char[] instead of string:
#include <cstdio>
#include <iostream>
using namespace std;
int main()
{
char tmp[101];
scanf("%100s", tmp);
cout << tmp;
}

Using a character array as a string stream buffer

I'm looking for a clean STL way to use an existing C buffer (char* and size_t) as a string stream. I would prefer to use STL classes as a basis because it has built-in safeguards and error handling.
note: I cannot use additional libraries (otherwise I would use QTextStream)
You can try std::stringbuf::pubsetbuf. It calls setbuf, but it's implementation-defined whether that has any effect. If it does, it replaces the underlying string buffer with the char array, without copying the contents as it normally would. Worth a try, IMO.
Test it with this code:
std::istringstream strm;
char arr[] = "1234567890";
strm.rdbuf()->pubsetbuf(arr, sizeof(arr));
int i;
strm >> i;
std::cout << i;
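Because the effect of pubsetbuf is implementation-defined, a commonly used alternative is to derive from std::streambuf and point its get area directly at the existing buffer. This is a different technique from the one above, shown as a minimal read-only sketch; the names membuf and first_int are hypothetical, and seeking is not supported unless seekoff and seekpos are also overridden.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>

// Hypothetical read-only stream buffer over memory owned by the caller.
// Nothing is copied; the caller must keep the memory alive while it is in use.
class membuf : public std::streambuf {
public:
    membuf(char* data, std::size_t size)
    {
        assert(data != nullptr || size == 0);
        setg(data, data, data + size);   // get area: begin, current, end
    }
};

// Example use: parse the first integer stored in an existing buffer.
// Returns `fallback` if the buffer does not start with an integer.
int first_int(char* data, std::size_t size, int fallback)
{
    membuf buf(data, size);
    std::istream in(&buf);
    int value = fallback;
    if (!(in >> value)) {
        return fallback;
    }
    return value;
}
```

Any std::istream operation (operator>>, std::getline, read) can then be used on the stream constructed from the membuf.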

Read in data of a dynamic size into char*?

I was wondering how the following code works.
#include <iostream>
using namespace std;
int main()
{
char* buffer = new char(NULL);
while(true)
{
cin >> buffer;
cout << buffer;
cout << endl;
}
return 0;
}
I can input any amount of text of any size and it will print it back out to me. How does this work? Is it dynamically allocating space for me?
Also, if I enter in a space, it will print the next section of text on a new line.
This however, is fixed by using gets(buffer); (unsafe).
Also, is this code 'legal'?
It's not safe at all. It's rewriting whatever memory happens to lie after the buffer, and then reading it. The fact that this is working is coincidental. This is because your cin/cout operations don't say "oh, a pointer to one char, I should just write one char" but "oh, you have enough space allocated for me."
Improvement #1:
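char* buffer = new char[10000]; or simply char buffer[10000];
(Note the square brackets: new char(10000) allocates a single char, not an array.) Now you can write long-ish paragraphs without overrunning the buffer, although input longer than the buffer will still overflow it.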
Improvement #2:
std::string buffer;
To answer your question in the comment, C++ is all for letting you make big memory mistakes. As noted in comment this is because it's a "don't pay for what you don't need" language. There are some people who really need this level of optimization in their code although you are probably not one of them.
However, it also gives you plenty of ways to do it where you don't have to think about memory at all. I will say firmly: if you are using new and delete or char[], and it is not because you are using a design pattern you are familiar with that requires them, or third-party or C libraries that require them, there is a safer way to do it.
Some guidelines that will cover you 80% of the time:
-Don't use char[]. Use string.
-Don't use pointers to pass or return arguments. Pass by reference, return by value.
-Don't use arrays (e.g. int[]). Use vectors. You still have to check your own bounds.
With just those three you'll be writing "pretty safe", non-C-like code.
This is what std::string is for:
std::string s;
while (std::cin >> s) // stops at end of input instead of looping forever
{
std::cout << s << std::endl;
}
std::string WILL dynamically allocate space for you, so you don't have to worry about overwriting memory elsewhere.
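The question also notes that input containing a space is split across several reads. A minimal sketch of reading whole lines with std::getline and std::string follows; the helper name read_lines is hypothetical, and it takes any std::istream so the same code works for std::cin or a file.

```cpp
#include <istream>
#include <sstream>   // only needed for the usage example below
#include <string>
#include <vector>

// Hypothetical helper: collects every line of the stream, spaces included.
// Each std::string grows to fit its line, so no buffer size has to be guessed.
std::vector<std::string> read_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {   // false at end of input or on error
        lines.push_back(line);
    }
    return lines;
}
```

For interactive use, read_lines(std::cin) returns once the user signals end of input; a std::istringstream can stand in for std::cin when experimenting.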

Segmentation fault while using "ifstream"

I'm trying to get a part of text in a file.
I used "ifstream":
#include <fstream>
void func(){
char *myString;
ifstream infile;
infile.open("/proc/cpuinfo");
while (infile >> myString){
if (!strcmp(myString,"MHz"))
break;
}
}
and I get a "Segmentation fault". does anyone know why?
You have not allocated memory for myString. Use std::string. Or, better, use another language such as Python or Perl, or Unix utilities such as grep, awk, or sed.
Because the target value should be:
std::string myString;
and not char*. It's possible to use char*, but you have to ensure that it points to something big enough first. (In your case, it doesn't point anywhere—you forgot to initialize it.) And defining “big enough” is non-trivial, given that you don't know the size of the next word until you've read it.
There's a reason why C++ has a string class, you know. It's because using char pointers is cumbersome and error-prone.
infile >> myString
will read from the file into wherever myString points. Since it is an uninitialized pointer, it points to some arbitrary address.
If you absolutely do want to use char pointers instead of strings, you'll have to allocate a buffer you can read data into.
But the sane solution is to replace it entirely by std::string.
Because you did not allocate memory for myString. The quick solution to this is to use std::string instead of the C-style char* strings, which does memory management so you don't have to.
Here's why your error occurs:
When you declare char *myString you are creating a pointer to a character. However you do not initialize this pointer to anything, so it can point absolutely anywhere in memory. When you do infile >> myString you are going to write a bunch of characters at an unspecified location in memory. It just so happens that this location was a vital part of your program, which caused it to crash.
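A minimal sketch of the std::string version suggested above. The helper name contains_word is hypothetical; it takes a std::istream so it can be tried on a string stream as well as on a file.

```cpp
#include <istream>
#include <sstream>   // only needed for the usage example below
#include <string>

// Hypothetical helper: returns true if `target` appears as a
// whitespace-delimited word anywhere in the stream.
bool contains_word(std::istream& in, const std::string& target)
{
    std::string word;            // grows to fit each word; no manual allocation
    while (in >> word) {         // the loop ends at end of input or on error
        if (word == target) {
            return true;
        }
    }
    return false;
}
```

For the file in the question, this would be called as std::ifstream infile("/proc/cpuinfo"); contains_word(infile, "MHz"); with <fstream> included.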
Using char myString[256] works as well, provided each read is limited to the buffer size:
#include <cstring>
#include <fstream>
#include <iomanip>
#include <iostream>
using namespace std;
void func()
{
    char myString[256];
    ifstream infile("/proc/cpuinfo");
    // Test the extraction itself rather than eof(), and cap each read at the buffer size.
    while (infile >> setw(sizeof myString) >> myString)
    {
        cout << myString << " \n";
        if (!strcmp(myString, "MHz"))
            break;
    }
    cout << " \n";
    // infile is closed automatically when it goes out of scope.
}
int main()
{
    func();
    return 0;
}

Why does this C++ char array seem to be able to hold more than its size?

#include <iostream>
using namespace std;
typedef struct
{
char streetName[5];
} RECORD;
int main()
{
RECORD r;
cin >> r.streetName;
cout << r.streetName << endl;
}
When I run this program, if I enter in more than 5 characters, the output will show the whole string I entered. It does not truncate at 5 characters. Why is that?
How can I get this to work correctly?
You are overflowing the buffer. Put another char array after streetName and you will likely find that it gets the rest of the characters. Right now you are just corrupting some memory on your stack.
In order to limit the input to the size of the receiving array, you need to use the length-limiting facilities provided by your input method. In your case you are using cin, which means that you can specify the limit with its width method. Note that the width includes the terminating null character, so a width of 5 stores at most 4 characters:
cin.width(5);
cin >> r.streetName;
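One detail worth knowing about width: it applies only to the next extraction, because operator>> resets it to 0 afterwards. The following sketch illustrates this with a two-field struct; the Record layout with a second field and the helper name read_record are assumptions made for the example.

```cpp
#include <istream>
#include <sstream>   // only needed for the usage example below
#include <string>

struct Record {
    char streetName[5];
    char city[5];
};

// Hypothetical helper: fills both fields, setting the width before each read.
// Returns the stream width observed right after the first extraction.
std::streamsize read_record(std::istream& in, Record& r)
{
    r.streetName[0] = '\0';
    r.city[0] = '\0';
    in.width(static_cast<std::streamsize>(sizeof r.streetName));  // at most 4 characters + '\0'
    in >> r.streetName;
    const std::streamsize width_after = in.width();               // reset to 0 by operator>>
    in.width(static_cast<std::streamsize>(sizeof r.city));        // so set it again for the next array
    in >> r.city;
    return width_after;
}
```

Characters that did not fit into the first field are not discarded. They stay in the stream and are picked up by the next read.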
Because cin sees streetName as a char * and writes to memory, and there is nothing to stop it from writing to *(streetName + 5) and beyond. This is a form of buffer overrun.
The best fix in this case is to define streetName as a std::string,
i.e.
typedef struct
{
std::string streetName;
} RECORD;
Because you're overrunning the end of your buffer and in this particular case you're getting away with it. C and C++ make it very easy to "shoot yourself in the foot", but that doesn't mean that you should.
It's a buffer overrun.
C++ does not perform bounds checking on array accesses, and memory does not simply stop at the end of the array. You are writing data to memory that is not part of the array, the consequences of which are non-deterministic, and may sometimes even appear to work.
It is quite likely that if you placed that code into a function, the program would crash when you tried to return from the function, because one likely possibility is that you will have dumped on the function return address on the stack. You may also have corrupted data belonging to the calling function.
The way to do this correctly in C++ is to use a std::string.
#include<iostream>
#include<string>
....
std::string r;
std::getline(std::cin, r);
std::cout << r <<std::endl;
For truncated input (with arr, i, and len suitably defined and initialized):
while(cin.peek() != EOF && i < len)
{
cin >> arr[i];
++i;
}
You will want to do something after this to flush the buffer and not leave the rest of the line sitting on the input stream if you plan on doing other things with it.
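The cleanup step mentioned above can be sketched with istream::ignore, which discards characters up to and including the next newline. The helper name read_truncated_line is hypothetical, and it reads into a std::string rather than a raw array to keep the example short.

```cpp
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>   // only needed for the usage example below
#include <string>

// Hypothetical helper: returns at most max_chars characters of the current
// line and discards whatever remains of that line.
std::string read_truncated_line(std::istream& in, std::size_t max_chars)
{
    std::string result;
    char ch;
    bool hit_newline = false;
    while (result.size() < max_chars && in.get(ch)) {
        if (ch == '\n') {
            hit_newline = true;      // the line was shorter than the limit
            break;
        }
        result.push_back(ch);
    }
    if (!hit_newline) {
        // Drop the rest of the line, including its newline, so the next
        // read starts at the beginning of the following line.
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return result;
}
```

Without the ignore call, the leftover characters of an over-long line would be returned by the next read as if they were new input.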