memset() causing data abort - c++

I'm getting some strange, intermittent data aborts (< 5% of the time) in some of my code when calling memset(). The problem is that it usually doesn't happen unless the code has been running for a couple of days, so it's hard to catch it in the act.
I'm using the following code:
char *msg = (char*)malloc(sizeof(char)*2048);
char *temp = (char*)malloc(sizeof(char)*1024);
memset(msg, 0, 2048);
memset(temp, 0, 1024);
char *tempstr = (char*)malloc(sizeof(char)*128);
sprintf(temp, "%s %s/%s %s%s", EZMPPOST, EZMPTAG, EZMPVER, TYPETXT, EOL);
strcat(msg, temp);
//Add Data
memset(tempstr, '\0', 128);
wcstombs(tempstr, gdevID, wcslen(gdevID));
sprintf(temp, "%s: %s%s", "DeviceID", tempstr, EOL);
strcat(msg, temp);
As you can see, I'm not trying to use memset with a size larger than what was originally allocated with malloc().
Anyone see what might be wrong with this?

malloc can return NULL if no memory is available. You're not checking for that.

There's a couple of things. You're using sprintf which is inherently unsafe; unless you're 100% positive that you're not going to exceed the size of the buffer, you should almost always prefer snprintf. The same applies to strcat; prefer the safer alternative strncat.
Obviously this may not fix anything, but it goes a long way in helping spot what might otherwise be very annoying to spot bugs.
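A minimal sketch of the bounded append suggested above. The helper name append_bounded is made up for illustration. One detail that is easy to get wrong: strncat's third argument is the maximum number of characters to append, not the total size of the destination, so the caller has to work out the remaining room.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: append src to the NUL-terminated string in dest
// without writing past dest_size bytes. Returns false if src was truncated.
bool append_bounded(char *dest, std::size_t dest_size, const char *src)
{
    assert(dest_size > 0);
    std::size_t used = std::strlen(dest);
    assert(used < dest_size);
    std::size_t room = dest_size - used - 1;   // keep one byte for the terminator
    std::strncat(dest, src, room);             // appends at most room chars, then '\0'
    return std::strlen(src) <= room;
}
```

With an 8-byte buffer holding "abc", appending "de" succeeds, while appending "fghij" afterwards is cut off at "abcdefg" and reported as truncated.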

"malloc can return NULL if no memory is available. You're not checking for that."
Right you are... I didn't think about that, as I was monitoring the memory and there was enough free. Is there any way for there to be available memory on the system but for malloc to fail?
Yes, if memory is fragmented. Also, when you say "monitoring memory," there may be something on the system which occasionally consumes a lot of memory and then releases it before you notice. If your call to malloc occurs then, there won't be any memory available. -- Joel
Either way...I will add that check :)

wcstombs doesn't get the size of the destination, so it can, in theory, buffer overflow.
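A sketch of a bounded conversion, assuming the destination size is known at the call site. The helper name narrow_copy is hypothetical. The third argument of wcstombs is the maximum number of bytes to write, and no terminator is written when that limit is reached, so the sketch reserves one byte and terminates explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: convert a wide string into dest, always terminating it.
// Returns false on a conversion error, or when the buffer was filled completely
// (which is treated here as possible truncation).
bool narrow_copy(char *dest, std::size_t dest_size, const wchar_t *src)
{
    assert(dest_size > 0);
    // Leave one byte spare: wcstombs writes no terminator when it hits the limit.
    std::size_t n = std::wcstombs(dest, src, dest_size - 1);
    if (n == static_cast<std::size_t>(-1)) {
        dest[0] = '\0';          // a character had no multibyte representation
        return false;
    }
    dest[n] = '\0';
    return n < dest_size - 1;
}
```

In the question's code this would be called as narrow_copy(tempstr, 128, gdevID), with the return value checked before tempstr is used.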
And why are you using sprintf with what I assume are constants? Just use:
EZMPPOST" " EZMPTAG "/" EZMPVER " " TYPETXT EOL
C and C++ both combine adjacent string literals into a single string at compile time.
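A small sketch of that concatenation. The macro values below are placeholders, since the real definitions are not shown in the question; only the technique is the point.

```cpp
// Placeholder values: the real definitions are not shown in the question.
#define EZMPPOST "POST"
#define EZMPTAG  "EZMP"
#define EZMPVER  "1.0"
#define TYPETXT  "TXT"
#define EOL      "\r\n"

// Adjacent string literals are joined at compile time, so no sprintf is needed.
static const char kHeader[] = EZMPPOST " " EZMPTAG "/" EZMPVER " " TYPETXT EOL;

// sizeof counts the terminator, so the length is also known at compile time.
static_assert(sizeof(kHeader) - 1 == 19, "header length is fixed at compile time");
```

Keep a space between a literal and a following macro name: in C++11 and later, a literal immediately followed by an identifier is parsed as a user-defined literal suffix.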

Have you tried using Valgrind? That is usually the fastest and easiest way to debug these sorts of errors. If you are reading or writing outside the bounds of allocated memory, it will flag it for you.

"You're using sprintf which is inherently unsafe; unless you're 100% positive that you're not going to exceed the size of the buffer, you should almost always prefer snprintf. The same applies to strcat; prefer the safer alternative strncat."
Yeah..... I mostly do .NET lately and old habits die hard. I likely pulled that code out of something else that was written before my time...
But I'll try not to use those in the future ;)

You know it might not even be your code... Are there any other programs running that could have a memory leak?

It could be your processor. Some CPUs can't address single bytes, and require you to work in words or chunk sizes, or have instructions that can only be used on word or chunk aligned data.
Usually the compiler is made aware of these and works around them, but sometimes you can malloc a region as bytes, and then try to address it as a structure or wider-than-a-byte field, and the compiler won't catch it, but the processor will throw a data exception later.
It wouldn't happen unless you're using an unusual CPU. ARM9 will do that, for example, but i686 won't. I see it's tagged windows mobile, so maybe you do have this CPU issue.
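A sketch of how to avoid the alignment trap when a wider value has to be read out of a byte buffer. The helper name read_u32_at is hypothetical. Copying the bytes with memcpy has no alignment requirement, whereas casting the byte pointer to uint32_t* and dereferencing it may fault on CPUs that require aligned access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 32-bit value that starts at an arbitrary byte
// offset inside a byte buffer, without assuming the address is aligned.
std::uint32_t read_u32_at(const unsigned char *bytes, std::size_t offset)
{
    assert(bytes != nullptr);
    std::uint32_t value;
    std::memcpy(&value, bytes + offset, sizeof value);
    return value;
}
```

Compilers typically turn the memcpy into a single load where the target allows it, so the safe form usually costs nothing on CPUs that tolerate unaligned access.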

Instead of doing malloc followed by memset, you should be using calloc which will clear the newly allocated memory for you. Other than that, do what Joel said.
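For illustration, a minimal sketch of that replacement. The helper name make_zeroed_buffer is made up; the NULL check from the earlier answer still applies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocate a zero-filled char buffer, or return nullptr.
// calloc replaces the malloc + memset pair; the caller must still check the
// result and release the buffer with free().
char *make_zeroed_buffer(std::size_t size)
{
    assert(size > 0);
    return static_cast<char *>(std::calloc(size, sizeof(char)));
}
```

In the question this would stand in for each malloc/memset pair, for example make_zeroed_buffer(2048) for msg.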

NB borrowed some comments from other answers and integrated into a whole. The code is all mine...
Check your error codes. E.g. malloc can return NULL if no memory is available. This could be causing your data abort.
sizeof(char) is 1 by definition
Use snprintf not sprintf to avoid buffer overruns
If EZMPPOST etc. are constants, then you don't need a format string; you can just combine several string literals as STRING1 " " STRING2 " " STRING3 and strcat the whole lot.
You are using much more memory than you need to.
With one minor change, you don't need to call memset in the first place. Nothing really requires zero initialisation here.
This code does the same thing, safely, runs faster, and uses less memory.
// sizeof(char) is 1 by definition. This memory does not require zero
// initialisation. If it did, I'd use calloc.
const int max_msg = 2048;
char *msg = (char*)malloc(max_msg);
if(!msg)
{
// Allocation failure
return;
}
// Use snprintf instead of sprintf to avoid buffer overruns
// we write directly to msg, instead of using a temporary buffer and then calling
// strcat. This saves CPU time, saves the temporary buffer, and removes the need
// to zero initialise msg.
snprintf(msg, max_msg, "%s %s/%s %s%s", EZMPPOST, EZMPTAG, EZMPVER, TYPETXT, EOL);
//Add Data
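size_t len = wcslen(gdevID);
// Room for up to MB_CUR_MAX bytes per wide character, plus the terminator.
// No need to zero init this
size_t temp_size = len * MB_CUR_MAX + 1;
char* temp = (char*)malloc(temp_size);
if(!temp)
{
free(msg);
return;
}
// wcstombs returns (size_t)-1 on a conversion error and does not write a
// terminator when it fills the buffer, so terminate explicitly.
size_t converted = wcstombs(temp, gdevID, temp_size - 1);
temp[converted == (size_t)-1 ? 0 : converted] = '\0';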
// No need to use a temporary buffer - just append directly to the msg, protecting
// against buffer overruns.
snprintf(msg + strlen(msg),
max_msg - strlen(msg), "%s: %s%s", "DeviceID", temp, EOL);
free(temp);

Related

String error-checking

I am using a lot of string functions like strncpy, strncat, sprintf etc. in my code. I know there are better alternatives to these, but I was handed an old project where these functions were used, so I have to stick with them for compatibility and consistency. My supervisor is very fussy about error checking and robustness, and insists that I check for buffer-overflow violations every time I use these functions. This has created a lot of if-else statements in my code, which do not look pretty. My question is: is it really necessary to check for overflow every time I call one of these functions, even if I know that a buffer overflow can't possibly occur, e.g. when storing an integer in a string using the sprintf function?
sprintf(buf,"%d",someInteger);
I know that the maximum length of an unsigned integer on a 64-bit system can be 20 digits. buf on the other hand is well over 20 characters long. Should I still check for buffer overflow in this case?
I think the way to go is using exceptions. Exceptions are very useful when you must decouple the normal control-flow of a program and error-checking.
What you can do is create a wrapper for every string function in which you perform the error checking and throw an exception if a buffer overflow would occur.
Then, in your client code, you can simply call your wrappers inside a try block, and then check for exceptions and return error codes inside the catch block.
Sample code (not tested):
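int sprintf_wrapper( char *buffer, int buffer_size, const char *format, ... )
{
va_list arg_ptr;
va_start( arg_ptr, format );
// vsnprintf writes at most buffer_size bytes and returns the length it needed
int ret = vsnprintf( buffer, buffer_size, format, arg_ptr );
va_end(arg_ptr);
// a negative or too-large result means the output did not fit
if( ret < 0 || ret >= buffer_size )
throw my_buffer_exception();
return ret;
}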
Error foo()
{
//...
try{
sprintf_wrapper(buf1, 100, "%d", i1);
sprintf_wrapper(buf2, 100, "%d", i2);
sprintf_wrapper(buf3, 100, "%d", i3);
}
catch( my_buffer_exception& )
{
return err_code;
}
}
Maybe write a test case that you can invoke to simply test the buffer to reduce code duplication and ugliness.
You could abstract the if/else statements into a method of another class, and then pass in the buffer and length expected.
By nature, these buffers are VERY susceptible to overwrites, so be careful ANYTIME you take input from a user or outside source. You could also try getting the string length (using strlen), or checking for the \0 end-of-string character yourself, and comparing that to the buffer size. If you loop looking for the \0 character and it's not there, you will run off the end of the buffer unless you constrain the loop to the expected buffer size, so check for this too.
Another option, is to refactor code, such that every time those methods are used, you replace them with a length safe version you write, where it calls a method with those checks already in place (but have to pass the buffer size to it). This may not be possible for some projects, as the complexity may be very hard to unit test.
Let me address your last paragraph first: you write code once, which is a short time compared with how long it will be maintained and used. Guess how long you think your code will be in use, and then multiply that by 10-20 to figure out how long it will actually be in use. By the end of that window it's entirely possible that an integer could be much bigger and overflow your buffer, so yes, you must do buffer checking.
Given that you have a few options:
Use the "n" series of functions like snprintf to prevent buffer overflows and tell your users that it's undefined what will happen if the buffers overflow.
Consider it fatal and either abort() or throw an uncaught exception when a length violation occurs.
Try to notify the user there's a problem and either abort the operation or attempt to let the user modify input and retry.
The first two approaches are definitely going to be easier to implement and maintain because you don't have to worry about getting the right information back to the user in a reasonable way. In any of these cases the check could most likely be factored into a function, as suggested in other answers.
Finally let me say since you've tagged this question C++ and not C, think long and hard about slowly migrating your code base to C++ (because your code base is C right now) and utilize the C++ facilities which then totally remove the need for these buffer checks, as it will happen automatically for you.
You can use gcc's -D_FORTIFY_SOURCE=1 or -D_FORTIFY_SOURCE=2 (with optimization enabled, e.g. -O2) for buffer overflow detection.
https://securityblog.redhat.com/2014/03/26/fortify-and-you/

Does buffer overflow happen in C++ strings?

This is concerning strings in C++. I have not touched C/C++ for a very long time; in fact I programmed in those languages only during my first year in college, about 7 years ago.
In C to hold strings I had to create character arrays(whether static or dynamic, it is not of concern). So that would mean that I need to guess well in advance the size of the string that an array would contain. Well I applied the same approach in C++. I was aware that there was a std::string class but I never got around to use it.
My question is: since we never declare the size of an array/string with the std::string class, can a buffer overflow occur when writing to it? I mean, in C, if the array's size was 10 and I typed more than 10 characters on the console, then the extra data would be written into some other object's memory, adjacent to the array. Can a similar thing happen with std::string when using the cin object?
Do I have to guess the size of the string before hand in C++ when using std::string?
Well! Thanks to you all. There is no one right answer on this page (a lot of different explanations provided), so I am not selecting any single one as such. I am satisfied with the first 5. Take Care!
Depending on the member(s) you are using to access the string object, yes. So, for example, if you use reference operator[](size_type pos) where pos > size(), yes, you would.
Assuming no bugs in the standard library implementation, no. A std::string always manages its own memory.
Unless, of course, you subvert the accessor methods that std::string provides, and do something like:
std::string str = "foo";
char *p = (char *)str.c_str();
strcpy(p, "blah");
You have no protection here, and are invoking undefined behaviour.
The std::string generally protects against buffer overflow, but there are still situations in which programming errors can lead to buffer overflows. While C++ generally throws an out_of_range exception when an operation references memory outside the bounds of the string, the subscript operator [] (which does not perform bounds checking) does not.
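A short sketch of the difference between the two accessors. The helper name at_rejects is made up for illustration. at() checks the index and throws std::out_of_range, while operator[] with an index beyond size() is undefined behaviour and is therefore not exercised here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true if s.at(pos) rejects the index by throwing std::out_of_range.
// s[pos] with pos > s.size() would be undefined behaviour instead.
bool at_rejects(const std::string &s, std::size_t pos)
{
    try {
        (void)s.at(pos);
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}
```

For the string "foo", at(2) succeeds, while at(3) and at(100) both throw.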
Another problem occurs when converting std::string objects to C-style strings. If you use string::c_str() to do the conversion, you get a properly null-terminated C-style string. Before C++11, however, string::data() returned a pointer to a character array that was not guaranteed to be null terminated; that was the only difference between c_str() and data(). Since C++11, both return the same null-terminated buffer.
Finally, many existing C++ programs and libraries have their own string classes. To use these libraries, you may have to use these string types or constantly convert back and forth. Such libraries are of varying quality when it comes to security. It is generally best to use the standard library (when possible) or to understand the semantics of the selected library. Generally speaking, libraries should be evaluated based on how easy or complex they are to use, the type of errors that can be made, how easy these errors are to make, and what the potential consequences may be.
refer https://buildsecurityin.us-cert.gov/bsi/articles/knowledge/coding/295-BSI.html
In C, the cause is explained as follows:
void function (char *str) {
char buffer[16];
strcpy (buffer, str);
}
int main () {
char *str = "I am greater than 16 bytes"; // length of str = 27 bytes
function (str);
}
This program is guaranteed to cause unexpected behavior, because a string (str) of 27 bytes has been copied to a location (buffer) that has been allocated for only 16 bytes. The extra bytes run past the buffer and overwrites the space allocated for the FP, return address and so on. This, in turn, corrupts the process stack. The function used to copy the string is strcpy, which completes no checking of bounds. Using strncpy would have prevented this corruption of the stack. However, this classic example shows that a buffer overflow can overwrite a function's return address, which in turn can alter the program's execution path. Recall that a function's return address is the address of the next instruction in memory, which is executed immediately after the function returns.
Here is a good tutorial that should answer your question satisfactorily.
In C++, the std::string class starts out with a minimum size (or you can specify a starting size). If that size is exceeded, std::string allocates more dynamic memory.
Assuming the library providing std::string is correctly written, you cannot cause a buffer overflow by adding characters to a std::string object.
Of course, bugs in the library are not impossible.
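A minimal sketch of that growth behaviour. The helper name repeat_char is hypothetical. Characters are appended one at a time and the string reallocates whenever its capacity runs out, so no size has to be guessed in advance.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends count copies of ch one at a time. std::string reallocates as needed,
// so the caller never has to choose a buffer size up front.
std::string repeat_char(char ch, std::size_t count)
{
    std::string out;
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(ch);
    }
    assert(out.capacity() >= out.size());
    return out;
}
```

Reading with std::cin >> s or std::getline(std::cin, s) grows the string in the same way, which is why console input cannot overrun a std::string.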
"Do buffer overflows occur in C++ code?"
To the extent that C programs are legal C++ code (they almost all are), and C programs have buffer overflows, C++ programs can have buffer overflows.
Being richer than C, I'm sure C++ can have buffer overflows in ways that C cannot :-}

Allocate room for null terminating character when copying strings in C?

const char* src = "hello";
Calling strlen(src); returns size 5...
Now say I do this:
char* dest = new char[strlen(src)];
strcpy(dest, src);
That doesn't seem like it should work, but when I output everything it looks right. It seems like I'm not allocating space for the null terminator on the end... is this right? Thanks
You are correct that you are not allocating space for the terminator, however the failure to do this will not necessarily cause your program to fail. You may be overwriting following information on the heap, or your heap manager will be rounding up allocation size to a multiple of 16 bytes or something, so you won't necessarily see any visible effect of this bug.
If you run your program under Valgrind or other heap debugger, you may be able to detect this problem sooner.
Yes, you should allocate at least strlen(src)+1 characters.
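A sketch of the corrected allocation. The helper name copy_string is made up for illustration. strlen does not count the terminator, so the buffer needs one extra byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a heap copy of src.
// The caller releases it with delete[].
char *copy_string(const char *src)
{
    assert(src != nullptr);
    std::size_t len = std::strlen(src);      // does not count the '\0'
    char *dest = new char[len + 1];          // +1 for the terminator
    std::memcpy(dest, src, len + 1);         // copies the terminator too
    return dest;
}
```

For "hello" this allocates 6 bytes: five characters plus the terminating '\0'.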
That doesn't seem like it should work, but when I output everything it looks right.
Welcome to the world of Undefined Behavior. When you do this, anything can happen. Your program can crash, your computer can crash, your computer can explode, demons can fly out of your nose.
And worst of all, your program could run just fine, inconspicuously looking like it's working correctly until one day it starts spitting out garbage because it's overwriting sensitive data somewhere due to the fact that somewhere, someone allocated one too few characters for their arrays, and now you've corrupted the heap and you get a segfault at some point a million miles away, or even worse your program happily chugs along with a corrupted heap and your functions are operating on corrupted credit card numbers and you get in huge trouble.
Even if it looks like it works, it doesn't. That's Undefined Behavior. Avoid it, because you can never be sure what it will do, and even when what it does when you try it is okay, it may not be okay on another platform.
The best description I have read was on Stack Overflow and went like this:
If the speed limit is 50 and you drive at 60, you may get lucky and not get a ticket. But one day, maybe not today, maybe not tomorrow, that cop will be waiting for you. On that day you will pay, and you will pay dearly.
If somebody can find the original, I would much rather point to that; it was much more eloquent than my explanation.
strcpy will copy the null terminated char as well as all of the other chars.
So you are copying the length of hello + 1 which is 6 into a buffer size which is 5.
You have a buffer overflow here though, and overwriting memory that is not your own will have undefined results.
Alternatively, you could also use dest = strdup(src) which will allocate enough memory for the string + 1 for the null terminator (+1 for Juliano's answer).
This is why you should always, always, always run valgrind on any C program that appears to work.
Yeah, everyone has covered the major point; you are not guaranteed to fail. The fact is that the null terminator is usually 0 and 0 is a pretty common value to be sitting in any particular memory address. So it just happens to work. You could test this by taking a set of memory, writing a bunch of garbage to it and then writing that string there and trying to work with it.
Anyway, the major issue I see here is that you are talking about C but you have this line of code:
char* dest = new char[strlen(src)];
This won't compile in any standard C compiler. There's no new keyword in C. That is C++. In C, you would use one of the memory allocation functions, usually malloc. I know it seems nitpicky, but really, it's not.

c++ what happens if you print more characters with sprintf, than the char pointer has allocated?

I assume this is a common way to use sprintf:
char pText[x];
sprintf(pText, "helloworld %d", Count );
But what exactly happens if the char buffer has less memory allocated than will be printed to it?
I.e. what if x is smaller than the length of the text that sprintf produces?
I am asking since I get some strange behaviour in the code that follows the sprintf statement.
It's not possible to answer in general "exactly" what will happen. Doing this invokes what is called Undefined behavior, which basically means that anything might happen.
It's a good idea to simply avoid such cases, and use safe functions where available:
char pText[12];
snprintf(pText, sizeof pText, "helloworld %d", count);
Note how snprintf() takes an additional argument that is the buffer size, and won't write more than there is room for.
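snprintf also reports whether the text fitted: it returns the length the full output would need, not counting the terminator. A sketch of using that return value follows; the helper name format_count is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: format "helloworld <count>" into buf.
// Returns true only if the whole text fitted.
bool format_count(char *buf, std::size_t buf_size, int count)
{
    assert(buf_size > 0);
    int needed = std::snprintf(buf, buf_size, "helloworld %d", count);
    // needed is the length the full text requires, excluding the terminator.
    return needed >= 0 && static_cast<std::size_t>(needed) < buf_size;
}
```

With a 12-byte buffer and a count of 123, the text needs 14 characters, so the buffer ends up holding only "helloworld " and the function returns false.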
This is a common error and leads to memory after the char array being overwritten. So, for example, there could be some ints or another array in the memory after the char array and those would get overwritten with the text.
See a nice detailed description about the whole problem (buffer overflows) here. There's also a comment that some platforms provide an snprintf routine that takes an extra parameter defining the maximum length (in your case x). If your compiler doesn't know it, you can also write it yourself to make sure you can't get such errors (or just check that you always have enough space allocated).
Note that the behaviour after such an error is undefined and can lead to very strange errors. Variables are usually aligned at memory locations divisible by 4, so you often won't notice the error when you have written only one or two bytes too many (e.g. forgot to make room for the NUL), but you will get strange errors in other cases. These errors are hard to debug because other variables get changed and errors will often occur in a completely different part of the code.
This is called a buffer overrun.
sprintf will overwrite the memory that happens to follow pText address-wise. Since pText is on the stack, sprintf can overwrite local variables, function arguments and the return address, leading to all sorts of bugs. Many security vulnerabilities result from this kind of code — e.g. an attacker uses the buffer overrun to write a new return address pointing to his own code.
The behaviour in this situation is undefined. Normally, you will crash, but you might also see no ill effects, strange values appearing in unrelated variables and that kind of thing. Your code might also call into the wrong functions, format your hard-drive and kill other running programs. It is best to resolve this by allocating more memory for your buffer.
I have done this many times, and you will get a memory corruption error. As far as I remember, I did something like this:
vector<char> vecMyObj(10);
vecMyObj.resize(10);
sprintf(&vecMyObj[0],"helloworld %d", count);
But when the vector's destructor is called, my program gets a memory corruption error. If the formatted text is shorter than 10 characters, it works.
Can you spell Buffer Overflow ? One possible result will be stack corruption, and make your app vulnerable to Stack-based exploitation.

Using sprintf without a manually allocated buffer

In the application that I am working on, the logging facility makes use of sprintf to format the text that gets written to file. So, something like:
char buffer[512];
sprintf(buffer, ... );
This sometimes causes problems when the message that gets sent in becomes too big for the manually allocated buffer.
Is there a way to get sprintf behaviour without having to manually allocate memory like this?
EDIT: while sprintf is a C operation, I'm looking for C++ type solutions (if there are any!) for me to get this sort of behaviour...
You can use asprintf(3) (note: non-standard) which allocates the buffer for you so you don't need to pre-allocate it.
No you can't use sprintf() to allocate enough memory. Alternatives include:
use snprintf() to truncate the message - does not fully resolve your problem, but prevents the buffer overflow issue
double (or triple or ...) the buffer - unless you're in a constrained environment
use C++ std::string and ostringstream - but you'll lose the printf format, you'll have to use the << operator
use Boost Format that comes with a printf-like % operator
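As an illustration of the std::string and ostringstream option from the list above, here is a minimal sketch. The function name format_log_line and the message layout are assumptions, not part of the original logging facility.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical logging helper: builds the message in a stream that grows as
// needed, so there is no fixed-size buffer to overflow.
std::string format_log_line(const std::string &tag, int code)
{
    std::ostringstream out;
    out << '[' << tag << "] code=" << code;
    return out.str();
}
```

The trade-off is the one already mentioned: the printf-style format string is replaced by a chain of << operators.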
I also don't know of a version that avoids allocation, but C99 snprintf accepts a NULL pointer as the buffer (with a size of 0) and returns the length the output would need. Not very efficient, but this gives you the complete string (as long as enough memory is available) without risking overflow:
length = snprintf(NULL, ...);
str = malloc(length+1);
snprintf(str, ...);
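A fuller sketch of that two-pass pattern, written as a printf-style function that returns a std::string. It assumes a C99-conforming vsnprintf; the name format_string is made up for illustration.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: printf-style formatting into a std::string.
// The first pass measures, the second writes into a buffer of exactly that size.
std::string format_string(const char *format, ...)
{
    va_list args;
    va_start(args, format);
    va_list args_copy;
    va_copy(args_copy, args);            // a va_list cannot be reused after a pass
    int length = std::vsnprintf(nullptr, 0, format, args);
    va_end(args);
    if (length < 0) {
        va_end(args_copy);
        return std::string();            // formatting error
    }
    std::string result(static_cast<std::size_t>(length) + 1, '\0');
    std::vsnprintf(&result[0], result.size(), format, args_copy);
    va_end(args_copy);
    result.resize(static_cast<std::size_t>(length));   // drop the terminator
    return result;
}
```

The arguments are formatted twice, which is the inefficiency mentioned above, but the result is never truncated and the caller does not manage the buffer.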
"the logging facility makes use of sprintf to format the text that gets written to file"
fprintf() does not impose any size limit. If you can write the text directly to file, do so!
I assume there is some intermediate processing step, however. If you know how much space you need, you can use malloc() to allocate that much space.
One technique at times like these is to allocate a reasonable-size buffer (that will be large enough 99% of the time) and if it's not big enough, break the data into chunks that you process one by one.
With the vanilla version of sprintf, there is no way to prevent the data from overwriting the passed-in buffer. This is true regardless of whether the memory was manually allocated or allocated on the stack.
In order to prevent the buffer from being overwritten you'll need to use one of the more secure versions of sprintf like sprintf_s (windows only)
http://msdn.microsoft.com/en-us/library/ybk95axf.aspx