stack around the variable...was corrupted - c++

I have a simple function that writes some data to a new file. It works, and the file is written, but I get the above mentioned error while debugging in MSVS Express 2013.
void writeSpecToFile(const char *fname); //in header file.
char myChar [20];
sprintf(myChar, "aa%03daa%daa", i1, i2);
const char* new_char = myChar;
writeSpecToFile(myChar);
As seen, I simply insert some variables into a string using sprintf (works fine). Now whether I pass myChar or new_char, it still gives me the corruption error.
What went wrong?

Why did you declare you character buffer a size of 20? More than likely the sprintf placed more characters than that can fit in myChar.
Instead, use
safer constructs such as std::ostringstream or
at the very least, declare you char arrays much bigger than you would expect (not the best way, but would at least not have had the error occur).
If you're going along the "guess the biggest size for my array"
route, the last thing you want to do is attempt to count, right down
to the last character, how big to make the buffer. If you're off by a single byte, that can cause a crash.

Assuming 32-bit int, printing one with %d will yield a maximum of 8 visible characters.
Your format-string also contains 6 literal a-characters, and we should not forget the 0-terminator.
All in all: 2*8+6+1 = 23 > 20 !!
Your buffer must be at least 23 byte big, unless there are other undisclosed input-restrictions.
Personally, I would give it a round 32.
Also, better use snprintf and optionally verify the full string did actually fit (if it does not fit you get a shortened string, so no catastrophe).
char myChar [32];
snprintf(myChar, sizeof myChar, "aa%03daa%daa", i1, i2);
Beware that the Microsoft implementation is non-conforming and does not guarantee 0-termination.

Related

C++ Can programs run out of stack memory even when plenty of memory is available?

I'm writing a parser for a large file and one of my functions responsible for reading from the input file has a char buffer called peek. Basically, as main repeatedly calls this function, peek is eventually getting over-written with some odd values. Here's the function that's being called by main. bufferAsInt:
void bufferAsInt(ifstream &inf, int &i)
{
char peek[3];
inf.read(peek, 3);
i = atoi(peek);
//I'm not using the >> operator to read an int because the int is just
//3 chars long in the input file and two consecutive integer values can
//be written like this: 123456 for 123 and 456.
}
I found that as I wrote these values to an output file, when reading an int value that was only two digits long, the third digit (or some other number) would be left over in the char buffer peek and the value would be written incorrectly to the output file (this only happened after reading a very very large amount of data from the input file.) So after tens of thousands of iterations, when reading a number like 15, the value that would get written to my output file might have been something like 156.
To solve the problem I changed my implementation of bufferAsInt to this:
void bufferAsInt(ifstream &inf, int &i)
{
char *peek = new char[3];
inf.read(peek, 3);
i = atoi(peek);
delete [] peek;
}
(Of course I was guessing at what the issue was). What I'd like to know is if the fact that my problem was solved is some sort of weird consequence of declaring this char buffer on the heap or if the issue actually was that my program was running out of stack memory.
I have 6GB of RAM in my computer and at the time of running, no other programs would have been using enough memory to cause this issue to the best of my knowledge.
You're off by one.
atoi expects a null-terminated string. So a three-digit number needs a char[4] to be properly stored. Also, read doesn't put a null on the end.
Try this:
void bufferAsInt(ifstream &inf, int &i)
{
char peek[4];
inf.read(peek, 3);
peek[3] = 0;
i = atoi(peek);
}
atoi() expects a C 'NUL terminated string' as input, that is, ASCII characters followed by an ASCII zero byte. That's the only way the function knows where to stop converting.
In your first code listing, you read three bytes into a three byte buffer, but you have no control over the byte that follows in memory. I believe that's undefined behavior in C++, so literally anything can happen. Typically, though, if the following byte happens to be a zero or a non-digit, the string will convert properly; if it happens to be a digit, you get a different number when you were expecting.
The proper fix is to use your first example, but:
char peek[4]; // 4 char buffer instead of 3
inf.read(peek, 3);
peek[3] = '\0'; // ensure the 4th char is zero
i = atoi(peek);
Most likely the only thing that changed was that the new, with your compiler and options, zeroes the array.
To guarantee that you could write
char *peek = new char[3]();
But the dynamic allocation serves no purpose, so instead do it like this:
char peek[3] = {};
Note: if the file contains 3 digits, then you should instead use four digits array, in order to have room for terminating zero.

Mistake using scanf

could you say me what is the mistake in my following code?
char* line="";
printf("Write the line.\n");
scanf("%s",line);
printf(line,"\n");
I'm trying to get a line as an input from the console.But everytime while using "scanf" the program crashes. I don't want to use any std, I totally want to avoid using cin or cout. I'm just trying to learn how to tak a full line as an input using scanf().
Thank you.
You need to allocate the space for the input string as sscanf() cannot do that itself:
char line[1024];
printf("Write the line.\n");
scanf("%s",line);
printf(line,"\n");
However this is dangerous as it's possible to overflow the buffer and is therefore a security concern. Use std::string instead:
std::string line;
std::cout << "Write the line." << std::endl;
std::cin >> line;
std::cout << line << std::endl;
or:
std::getline (std::cin, line);
Space not allocated for line You need to do something like
char *line = malloc();
or
Char line[SOME_VALUE];
Currently line is a poor pointer pointing at a string literal. And overwriting a string literal can result in undefined behaviour.
scanf() doesn't match lines.
%s matches a single word.
#include <stdio.h>
int main() {
char word[101];
scanf("%100s", word);
printf("word <%s>\n", word);
return 0;
}
input:
this is a test
output:
word <this>
to match the line use %100[^\n"] which means 100 char's that aren't newline.
#include <stdio.h>
int main() {
char word[101];
scanf("%100[^\n]", word);
printf("word <%s>\n", word);
return 0;
}
You are trying to change a string literal, which in C results in Undefined behavior, and in C++ is trying to write into a const memory.
To overcome it, you might want to allocate a char[] and assign it to line - or if it is C++ - use std::string and avoid a lot of pain.
You should allocate enough memory for line:
char line[100];
for example.
The %s conversion specifier in a scanf call expects its corresponding argument to point to a writable buffer of type char [N] where N is large enough to hold the input.
You've initialized line to point to the string literal "". There are two problems with this. First is that attempting to modify the contents of a string literal results in undefined behavior. The language definition doesn't specify how string literals are stored; it only specifies their lifetime and visibility, and some platforms stick them in a read-only memory segment while others put them in a writable data segment. Therefore, attempting to modify the contents of a string literal on one platform may crash outright due to an access violation, while the same thing on another platform may work fine. The language definition doesn't mandate what should happen when you try to modify a string literal; in fact, it explicitly leaves that behavior undefined, so that the compiler is free to handle the situation any way it wants to. In general, it's best to always assume that string literals are unwritable.
The other problem is that the array containing the string literal is only sized to hold 1 character, the 0 terminator. Remember that C-style strings are stored as simple arrays of char, and arrays don't automatically grow when you add more characters.
You will need to either declared line as an array of char or allocate the memory dynamically:
char line[MAX_INPUT_LEN];
or
char *line = malloc(INITIAL_INPUT_LEN);
The virtue of allocating the memory dynamically is that you can resize the buffer as necessary.
For safety's sake, you should specify the maximum number of characters to read; if your buffer is sized to hold 21 characters, then write your scanf call as
scanf("%20s", line);
If there are more characters in the input stream than what line can hold, scanf will write those extra characters to the memory following line, potentially clobbering something important. Buffer overflows are a common malware exploit and should be avoided.
Also, %s won't get you the full line; it'll read up to the next whitespace character, even with the field width specifier. You'll either need to use a different conversion specifier like %[^\n] or use fgets() instead.
The pointer line which is supposed to point to the start of the character array that will hold the string read is actually pointing to a string literal (empty string) whose contents are not modifiable. This leads to an undefined behaviour manifested as a crash in your case.
To fix this change the definition to:
char line[MAX]; // set suitable value for MAX
and read atmost MAX-1 number of characters into line.
Change:
char* line="";
to
char line[max_length_of_line_you_expect];
scanf is trying to write more characters than the reserved by line. Try reserving more characters than the line you expect, as been pointed out by the answers above.

Why doesn't memcpy work when copying a char array into a struct?

#define buffer 128
int main(){
char buf[buffer]="";
ifstream infile("/home/kevin/Music/test.mp3",ios::binary);
infile.seekg(-buffer,ios::end);
if(!infile || !infile.read(buf,buffer)){
cout<<"fail!"<<endl;
}
ID3v1 id3;
cout<<sizeof(id3)<<endl;
memcpy(&id3,buf,128);
cout<<id3.header<<endl;
}
struct ID3v1{
char header[3];
char title[30];
char artist[30];
char album[30];
char year[4];
char comment[28];
bool zerobyte;
bool track;
bool genre;
};
When I do the memcpy, it seems to be pushing too much data into the header field. Do I need to go through each of the structs members and copy the data in? I'm also using c++, but this seems more of a "C" strategy. Is there a better way for c++?
As noted in all the comments (you are missing the '\0' character, or when printing C-Strings the operator<< is expecting the sequence of characters to be '\0' terminated).
Try:
std::cout << std::string(id3.header, id3.header+3) << std::endl;
This will print the three characters in the header field.
The problem is most likely in that the memcpy does what it does.
It copies the 128bytes into your structure.
Then you try to print out the header. It prints the 1st character, 2nd, 3rd.. and continues to print until it finds the '\0' (string termination character).
Basically, when printing things out, copy the header to another char array and append the termination character (or copy to an c++ string).
other problems you may encounter when using memcpy:
your struct elements may be align to word boundaries by the compiler.
most compilers have some pragma or commandline switch to specify the alignment to use.
some cpu's require shorts or longs to be stored on word boundaries, in that case modifying the alignment will not help you, as you will not be able to read from the unaligned addresses.
if you copy integers larger than char ( like short, or long ) you have to make sure to correct the byte order depending on your cpu architecture.

understanding the dangers of sprintf(...)

OWASP says:
"C library functions such as strcpy
(), strcat (), sprintf () and vsprintf
() operate on null terminated strings
and perform no bounds checking."
sprintf writes formatted data to string
int sprintf ( char * str, const char * format, ... );
Example:
sprintf(str, "%s", message); // assume declaration and
// initialization of variables
If I understand OWASP's comment, then the dangers of using sprintf are that
1) if message's length > str's length, there's a buffer overflow
and
2) if message does not null-terminate with \0, then message could get copied into str beyond the memory address of message, causing a buffer overflow
Please confirm/deny. Thanks
You're correct on both problems, though they're really both the same problem (which is accessing data beyond the boundaries of an array).
A solution to your first problem is to instead use std::snprintf, which accepts a buffer size as an argument.
A solution to your second problem is to give a maximum length argument to snprintf. For example:
char buffer[128];
std::snprintf(buffer, sizeof(buffer), "This is a %.4s\n", "testGARBAGE DATA");
// std::strcmp(buffer, "This is a test\n") == 0
If you want to store the entire string (e.g. in the case sizeof(buffer) is too small), run snprintf twice:
int length = std::snprintf(nullptr, 0, "This is a %.4s\n", "testGARBAGE DATA");
++length; // +1 for null terminator
char *buffer = new char[length];
std::snprintf(buffer, length, "This is a %.4s\n", "testGARBAGE DATA");
(You can probably fit this into a function using va or variadic templates.)
Both of your assertions are correct.
There's an additional problem not mentioned. There is no type checking on the parameters. If you mismatch the format string and the parameters, undefined and undesirable behavior could result. For example:
char buf[1024] = {0};
float f = 42.0f;
sprintf(buf, "%s", f); // `f` isn't a string. the sun may explode here
This can be particularly nasty to debug.
All of the above lead many C++ developers to the conclusion that you should never use sprintf and its brethren. Indeed, there are facilities you can use to avoid all of the above problems. One, streams, is built right in to the language:
#include <sstream>
#include <string>
// ...
float f = 42.0f;
stringstream ss;
ss << f;
string s = ss.str();
...and another popular choice for those who, like me, still prefer to use sprintf comes from the boost Format libraries:
#include <string>
#include <boost\format.hpp>
// ...
float f = 42.0f;
string s = (boost::format("%1%") %f).str();
Should you adopt the "never use sprintf" mantra? Decide for yourself. There's usually a best tool for the job and depending on what you're doing, sprintf just might be it.
Yes, it is mostly a matter of buffer overflows. However, those are quite serious business nowdays, since buffer overflows are the prime attack vector used by system crackers to circumvent software or system security. If you expose something like this to user input, there's a very good chance you are handing the keys to your program (or even your computer itself) to the crackers.
From OWASP's perspective, let's pretend we are writing a web server, and we use sprintf to parse the input that a browser passes us.
Now let's suppose someone malicious out there passes our web browser a string far larger than will fit in the buffer we chose. His extra data will instead overwrite nearby data. If he makes it large enough, some of his data will get copied over the webserver's instructions rather than its data. Now he can get our webserver to execute his code.
Your 2 numbered conclusions are correct, but incomplete.
There is an additional risk:
char* format = 0;
char buf[128];
sprintf(buf, format, "hello");
Here, format is not NULL-terminated. sprintf() doesn't check that either.
Your interpretation seems to be correct. However, your case #2 isn't really a buffer overflow. It's more of a memory access violation. That's just terminology though, it's still a major problem.
The sprintf function, when used with certain format specifiers, poses two types of security risk: (1) writing memory it shouldn't; (2) reading memory it shouldn't. If snprintf is used with a size parameter that matches the buffer, it won't write anything it shouldn't. Depending upon the parameters, it may still read stuff it shouldn't. Depending upon the operating environment and what else a program is doing, the danger from improper reads may or may not be less severe than that from improper writes.
It is very important to remember that sprintf() adds the ASCII 0 character as string terminator at the end of each string. Therefore, the destination buffer must have at least n+1 bytes (To print the word "HELLO", a 6-byte buffer is required, NOT 5)
In the example below, it may not be obvious, but in the 2-byte destination buffer, the second byte will be overwritten by ASCII 0 character. If only 1 byte was allocated for the buffer, this would cause buffer overrun.
char buf[3] = {'1', '2'};
int n = sprintf(buf, "A");
Also note that the return value of sprintf() does NOT include the null-terminating character. In the example above, 2 bytes were written, but the function returns '1'.
In the example below, the first byte of class member variable 'i' would be partially overwritten by sprintf() (on a 32-bit system).
struct S
{
char buf[4];
int i;
};
int main()
{
struct S s = { };
s.i = 12345;
int num = sprintf(s.buf, "ABCD");
// The value of s.i is NOT 12345 anymore !
return 0;
}
I pretty much have stated a small example how you could get rid of the buffer size declaration for the sprintf (if you intended to, of course!) and no snprintf envolved ....
Note: This is an APPEND/CONCATENATION example, take a look at here

Is there any way to determine how many characters will be written by sprintf?

I'm working in C++.
I want to write a potentially very long formatted string using sprintf (specifically a secure counted version like _snprintf_s, but the idea is the same). The approximate length is unknown at compile time so I'll have to use some dynamically allocated memory rather than relying on a big static buffer. Is there any way to determine how many characters will be needed for a particular sprintf call so I can always be sure I've got a big enough buffer?
My fallback is I'll just take the length of the format string, double it, and try that. If it works, great, if it doesn't I'll just double the size of the buffer and try again. Repeat until it fits. Not exactly the cleverest solution.
It looks like C99 supports passing NULL to snprintf to get the length. I suppose I could create a module to wrap that functionality if nothing else, but I'm not crazy about that idea.
Maybe an fprintf to "/dev/null"/"nul" might work instead? Any other ideas?
EDIT: Alternatively, is there any way to "chunk" the sprintf so it picks up mid-write? If that's possible it could fill the buffer, process it, then start refilling from where it left off.
The man page for snprintf says:
Return value
Upon successful return, these functions return the number of
characters printed (not including the trailing '\0' used to end
output to strings). The functions snprintf and vsnprintf do not
write more than size bytes (including the trailing '\0'). If
the output was truncated due to this limit then the return value
is the number of characters (not including the trailing '\0')
which would have been written to the final string if enough
space had been available. Thus, a return value of size or more
means that the output was truncated. (See also below under
NOTES.) If an output error is encountered, a negative value is
returned.
What this means is that you can call snprintf with a size of 0. Nothing will get written, and the return value will tell you how much space you need to allocate to your string:
int how_much_space = snprintf(NULL, 0, fmt_string, param0, param1, ...);
As others have mentioned, snprintf() will return the number of characters required in a buffer to prevent the output from being truncated. You can simply call it with a 0 buffer length parameter to get the required size then use an appropriately sized buffer.
For a slight improvement in efficiency, you can call it with a buffer that's large enough for the normal case and only do a second call to snprintf() if the output is truncated. In order to make sure the buffer(s) are properly freed in that case, I'll often use an auto_buffer<> object that handles the dynamic memory for me (and has the default buffer on the stack to avoid a heap allocation in the normal case).
If you're using a Microsoft compiler, MS has a non-standard _snprintf() that has serious limitations of not always null terminating the buffer and not indicating how big the buffer should be.
To work around Microsoft's non-support, I use a nearly public domain snprintf() from Holger Weiss.
Of course if your non-MS C or C++ compiler is missing snprintf(), the code from the above link should work just as well.
I would use a two-stage approach. Generally, a large percentage of output strings will be under a certain threshold and only a few will be larger.
Stage 1, use a reasonable sized static buffer such as 4K. Since snprintf() can restrict how many characters are written, you won't get a buffer overflow. What you will get returned from snprintf() is the number of characters it would have written if your buffer had been big enough.
If your call to snprintf() returns less than 4K, then use the buffer and exit. As stated, the vast majority of calls should just do that.
Some will not and that's when you enter stage 2. If the call to snprintf() won't fit in the 4K buffer, you at least now know how big a buffer you need.
Allocate, with malloc(), a buffer big enough to hold it then snprintf() it again to that new buffer. When you're done with the buffer, free it.
We worked on a system in the days before snprintf() and we acheived the same result by having a file handle connected to /dev/null and using fprintf() with that. /dev/null was always guaranteed to take as much data as you give it so we would actually get the size from that, then allocate a buffer if necessary.
Keep in kind that not all systems have snprintf() (for example, I understand it's _snprintf() in Microsoft C) so you may have to find the function that does the same job, or revert to the fprintf /dev/null solution.
Also be careful if the data can be changed between the size-checking snprintf() and the actual snprintf() to the buffer (i.e., wathch out for threads). If the sizes increase, you'll get buffer overflow corruption.
If you follow the rule that data, once handed to a function, belongs to that function exclusively until handed back, this won't be a problem.
For what it's worth, asprintf is a GNU extension that manages this functionality. It accepts a pointer as an output argument, along with a format string and a variable number of arguments, and writes back to the pointer the address of a properly-allocated buffer containing the result.
You can use it like so:
#define _GNU_SOURCE
#include <stdio.h>
int main(int argc, char const *argv[])
{
char *hi = "hello"; // these could be really long
char *everyone = "world";
char *message;
asprintf(&message, "%s %s", hi, everyone);
puts(message);
free(message);
return 0;
}
Hope this helps someone!
Take a look at CodeProject: CString-clone Using Standard C++. It uses solution you suggested with enlarging buffer size.
// -------------------------------------------------------------------------
// FUNCTION: FormatV
// void FormatV(PCSTR szFormat, va_list, argList);
//
// DESCRIPTION:
// This function formats the string with sprintf style format-specs.
// It makes a general guess at required buffer size and then tries
// successively larger buffers until it finds one big enough or a
// threshold (MAX_FMT_TRIES) is exceeded.
//
// PARAMETERS:
// szFormat - a PCSTR holding the format of the output
// argList - a Microsoft specific va_list for variable argument lists
//
// RETURN VALUE:
// -------------------------------------------------------------------------
void FormatV(const CT* szFormat, va_list argList)
{
#ifdef SS_ANSI
int nLen = sslen(szFormat) + STD_BUF_SIZE;
ssvsprintf(GetBuffer(nLen), nLen-1, szFormat, argList);
ReleaseBuffer();
#else
CT* pBuf = NULL;
int nChars = 1;
int nUsed = 0;
size_type nActual = 0;
int nTry = 0;
do
{
// Grow more than linearly (e.g. 512, 1536, 3072, etc)
nChars += ((nTry+1) * FMT_BLOCK_SIZE);
pBuf = reinterpret_cast<CT*>(_alloca(sizeof(CT)*nChars));
nUsed = ssnprintf(pBuf, nChars-1, szFormat, argList);
// Ensure proper NULL termination.
nActual = nUsed == -1 ? nChars-1 : SSMIN(nUsed, nChars-1);
pBuf[nActual+1]= '\0';
} while ( nUsed < 0 && nTry++ < MAX_FMT_TRIES );
// assign whatever we managed to format
this->assign(pBuf, nActual);
#endif
}
I've looked for the same functionality you're talking about, but as far as I know, something as simple as the C99 method is not available in C++, because C++ does not currently incorporate the features added in C99 (such as snprintf).
Your best bet is probably to use a stringstream object. It's a bit more cumbersome than a clearly written sprintf call, but it will work.
Since you're using C++, there's really no need to use any version of sprintf. The simplest thing to do is use a std::ostringstream.
std::ostringstream oss;
oss << a << " " << b << std::endl;
oss.str() returns a std::string with the contents of what you've written to oss. Use oss.str().c_str() to get a const char *. It's going to be a lot easier to deal with in the long run and eliminates memory leaks or buffer overruns. Generally, if you're worrying about memory issues like that in C++, you're not using the language to its full potential, and you should rethink your design.