Using sprintf without a manually allocated buffer - c++

In the application that I am working on, the logging facility makes use of sprintf to format the text that gets written to file. So, something like:
char buffer[512];
sprintf(buffer, ... );
This sometimes causes problems when the message that gets sent in becomes too big for the manually allocated buffer.
Is there a way to get sprintf behaviour without having to manually allocate memory like this?
EDIT: while sprintf is a C operation, I'm looking for C++ type solutions (if there are any!) for me to get this sort of behaviour...

You can use asprintf(3) (note: non-standard) which allocates the buffer for you so you don't need to pre-allocate it.

No you can't use sprintf() to allocate enough memory. Alternatives include:
use snprintf() to truncate the message - does not fully resolve your problem, but prevent the buffer overflow issue
double (or triple or ...) the buffer - unless you're in a constrained environment
use C++ std::string and ostringstream - but you'll lose the printf format, you'll have to use the << operator
use Boost Format that comes with a printf-like % operator

I dont also know a version wich avoids allocation, but if C99 sprintfs allows as string the NULL pointer. Not very efficient, but this would give you the complete string (as long as enough memory is available) without risking overflow:
length = snprintf(NULL, ...);
str = malloc(length+1);
snprintf(str, ...);

"the logging facility makes use of sprintf to format the text that gets written to file"
fprintf() does not impose any size limit. If you can write the text directly to file, do so!
I assume there is some intermediate processing step, however. If you know how much space you need, you can use malloc() to allocate that much space.
One technique at times like these is to allocate a reasonable-size buffer (that will be large enough 99% of the time) and if it's not big enough, break the data into chunks that you process one by one.

With the vanilla version of sprintf, there is no way to prevent the data from overwriting the passed in buffer. This is true regardless of wether the memory was manually allocated or allocated on the stack.
In order to prevent the buffer from being overwritten you'll need to use one of the more secure versions of sprintf like sprintf_s (windows only)
http://msdn.microsoft.com/en-us/library/ybk95axf.aspx

Related

How do I avoid using a constant for filename size?

It seems like standard programming practice and the POSIX standard are at odds with each other. I'm working with a program and I noticed that I see a lot of stuff like:
char buf[NAME_MAX + 1]
And I'm also seeing that a lot of operating systems don't define NAME_MAX and say that that they technically don't have to according to POSIX because you're supposed to use pathconf to get the value it's configured to at runtime rather than hard-coding it as a constant anyway.
The problem is that the compiler won't let me use pathconf this way with arrays. Even if I try storing the result of pathconf in a const int, it still throws a fit and says it has to be a constant. So it looks like in order to actually use pathconf, I would have to avoid using an array of chars for the buffer here because that apparently isn't good enough. So I'm caught between a rock and a hard place, because the C++ standard seemingly won't allow me to do what POSIX says I must do, that is determine the size of a character buffer for a filename at runtime rather than compile time.
The only information I've been able to find on this suggests that I would need to replace the array with a vector, but it's not clear how I would do it. When I test using a simple program, I can get this to work:
std::vector<char> buf((pathconf("/", _PC_NAME_MAX) + 1));
And then I can figure out the size by calling buf.size() or something. But I'm not sure if this is the right approach at all. Does anyone have any experience with trying to get a program to stop depending on constants like NAME_MAX or MAXNAMLEN being defined in the system headers and getting the implementation to use pathconf at runtime instead?
Halfway measures do tend to result in conflicts of some sort.
const usigned NAME_MAX = /* get the value at runtime */;
char buf[NAME_MAX + 1];
The second line declares a C-style array (presumably) intended to hold a C-style string. In C, this is fine. In C++, there is an issue because the value of NAME_MAX is not known at compile time. That's why I called this a halfway measure—there is a mix of C-style code and C++ compiling. (Some compilers will allow this in C++. Apparently yours does not.)
The C++ approach would use C++-style strings, as in:
std::string buf;
That's it. The size does not need to be specified since memory will be allocated as needed, provided you avoid C-style interfaces. Use streaming (>>) when reasonable. If the buffer is being filled by user or file input, this should be all you need.
If you need to use C-style strings (perhaps this buffer is being filled by a system call written for C?), there are a few options for allocating the needed space. The simplest is probably a vector, much like you were thinking.
std::vector<char> buf{NAME_MAX + 1};
system_call(buf.data()); // Send a char* to the system call.
Alternatively, you could use a C++-style string, which could make manipulating the data more convenient.
std::string buf{NAME_MAX + 1, '\0'};
system_call(buf.data()); // Send a char* to the system call.
There is also a smart pointer option, but the vector approach might play nicer with existing code written for a C-style array.

Safe Use of strcpy

Plain old strcpy is prohibited in its use in our company's coding standard because of its potential for buffer overflows. I was looking the source for some 3rd Party Library that we link against in our code. The library source code has a use of strcpy like this:
for (int i = 0; i < newArgc; i++)
{
newArgv[i] = new char[strlen(argv[i]) + 1];
strcpy(newArgv[i], argv[i]);
}
Since strlen is used while allocating memory for the buffer to be copied to, this looks fine. Is there any possible way someone could exploit this normal strcpy, or is this safe as I think it looks to be?
I have seen naïve uses of strcpy that lead to buffer overflow situations, but this does not seem to have that since it is always allocating the right amount of space for the buffer using strlen and then copying to that buffer using the argv[] as the source which should always be null terminated.
I am honestly curious if someone running this code with a debugger could exploit this or if there are any other tactics someone that was trying to hack our binary (with this library source that we link against in its compiled version) could use to exploit this use of strcpy. Thank you for your input and expertise.
It is possible to use strcpy safely - it's just quite hard work (which is why your coding standards forbid it).
However, the code you have posted is not a vulnerability. There is no way to overwrite bits of memory with it; I would not bother rewriting it. (If you do decide to rewrite it, use std::string instead.
Well, there are multiple problems with that code:
If an allocation throws, you get a memory-leak.
Using strcpy() instead of reusing the length is sub-optimal. Use std::copy_n() or memcpy() instead.
Presumably, there are no data-races, not that we can tell.
Anyway, that slight drop in performance is the only thing "wrong" with using strcpy() there. At least if you insist on manually managing your strings yourself.
Deviating from a coding standard should always be possible, but then document well why you decided to do so.
The main problem with strcpy is that it has no length limitation. When taking care this is no problem, but it means that strcpy always is to be accompanied with some safeguarding code. Many less experienced coders have fallen into this pitfall, hence the coding guideline came into practice.
Possible ways to handle string copy safely are:
Check the string length
Use a safe variant like strlcpy, or on older Microsoft compilers, strncpy_s.
As a general strcpy replacement idiom, assuming you're okay with the slight overhead of the print formatting functions, use snprintf:
snprintf(dest, dest_total_buffer_length, "%s", source);
e.g.
snprintf(newArgv[i], strlen(argv[i]) + 1, "%s", argv[i]);
It's safe, simple, and you don't need to think about the +1/-1 size adjustment.

vswprintf() without buffer size crashes on small buffer instead of EOF. How to pass buffer size

Using Borland C++ Builder 2009
I use vswprintf per RAD Studio's Help (F1) :
int vswprintf(wchar_t *buffer, const wchar_t *format, va_list arglist);
Up to now I always provided a big buffer wchar_t OutputStr[1000] and never had any issues. As a test and wanting to do an improvement action, I tried a small buffer wchar_t OutputStr[12] and noticed that the program crashes entirely. Even try{}catch(...){} doesn't catch it. Codeguard reports that a memcpy() fails, which seems to be the internal implementation. I had expected an EOF as return value instead.
When searching online for vswprintf I find the c++ variant takes a buffer size as input, but I can't seem to convince my compiler to use that variant ?
Any idea how to force it using BCB2009 ?
The whole point of the exercise was to implement a fall back scenario for when the buffer is too small in possibly one or two freak situations, so that I can allocate more memory for the function and try again. But this mechanism doesn't seem to work at all.
Not sure how to best test for the exact amount of bytes/characters needed either ?
You can use vswprintf_s. It returns negative value on failure

Safe counterparts of itoa()?

I am converting some old c program to a more secure version. The following functions are used heavily, could anyone tell me their secure counterparts? Either windows functions or C runtime library functions. Thanks.
itoa()
getchar()
strcat()
memset()
itoa() is safe as long as the destination buffer is big enough to receive the largest possible representation (i.e. of INT_MIN with trailing NUL). So, you can simply check the buffer size. Still, it's not a very good function to use because if you change your data type to a larger integral type, you need to change to atol, atoll, atoq etc.. If you want a dynamic buffer that handles whatever type you throw at it with less maintenance issues, consider an std::ostringstream (from the <sstream> header).
getchar() has no "secure counterpart" - it's not insecure to begin with and has no buffer overrun potential.
Re memset(): it's dangerous in that it accepts the programmers judgement that memory should be overwritten without any confirmation of the content/address/length, but when used properly it leaves no issue, and sometimes it's the best tool for the job even in modern C++ programming. To check security issues with this, you need to inspect the code and ensure it's aimed at a suitable buffer or object to be 0ed, and that the length is computed properly (hint: use sizeof where possible).
strcat() can be dangerous if the strings being concatenated aren't known to fit into the destination buffer. For example: char buf[16]; strcpy(buf, "one,"); strcat(buf, "two"); is all totally safe (but fragile, as further operations or changing either string might require more than 16 chars and the compiler won't warn you), whereas strcat(buf, argv[0]) is not. The best replacement tends to be a std::ostringstream, although that can require significant reworking of the code. You may get away using strncat(), or even - if you have it - asprintf("%s%s", first, second), which will allocate the required amount of memory on the heap (do remember to free() it). You could also consider std::string and use operator+ to concatenate strings.
None of these functions are "insecure" provided you understand the behaviour and limitations. itoa is not standard C and should be replaced with sprintf("%d",...) if that's a concern to you.
The others are all fine to the experienced practitioner. If you have specific cases which you think may be unsafe, you should post them.
I'd change itoa(), because it's not standard, with sprintf or, better, snprintf if your goal is code security. I'd also change strcat() with strncat() but, since you specified C++ language too, a really better idea would be to use std::string class.
As for the other two functions, I can't see how you could make the code more secure without seeing your code.

memset() causing data abort

I'm getting some strange, intermittent, data aborts (< 5% of the time) in some of my code, when calling memset(). The problem is that is usually doesn't happen unless the code is running for a couple days, so it's hard to catch it in the act.
I'm using the following code:
char *msg = (char*)malloc(sizeof(char)*2048);
char *temp = (char*)malloc(sizeof(char)*1024);
memset(msg, 0, 2048);
memset(temp, 0, 1024);
char *tempstr = (char*)malloc(sizeof(char)*128);
sprintf(temp, "%s %s/%s %s%s", EZMPPOST, EZMPTAG, EZMPVER, TYPETXT, EOL);
strcat(msg, temp);
//Add Data
memset(tempstr, '\0', 128);
wcstombs(tempstr, gdevID, wcslen(gdevID));
sprintf(temp, "%s: %s%s", "DeviceID", tempstr, EOL);
strcat(msg, temp);
As you can see, I'm not trying to use memset with a size larger that what's originally allocated with malloc()
Anyone see what might be wrong with this?
malloc can return NULL if no memory is available. You're not checking for that.
There's a couple of things. You're using sprintf which is inherently unsafe; unless you're 100% positive that you're not going to exceed the size of the buffer, you should almost always prefer snprintf. The same applies to strcat; prefer the safer alternative strncat.
Obviously this may not fix anything, but it goes a long way in helping spot what might otherwise be very annoying to spot bugs.
malloc can return NULL if no memory is
available. You're not checking for
that.
Right you are... I didn't think about that as I was monitoring the memory and it there was enough free. Is there any way for there to be available memory on the system but for malloc to fail?
Yes, if memory is fragmented. Also, when you say "monitoring memory," there may be something on the system which occasionally consumes a lot of memory and then releases it before you notice. If your call to malloc occurs then, there won't be any memory available. -- Joel
Either way...I will add that check :)
wcstombs doesn't get the size of the destination, so it can, in theory, buffer overflow.
And why are you using sprintf with what I assume are constants? Just use:
EZMPPOST" " EZMPTAG "/" EZMPVER " " TYPETXT EOL
C and C++ combines string literal declarations into a single string.
Have you tried using Valgrind? That is usually the fastest and easiest way to debug these sorts of errors. If you are reading or writing outside the bounds of allocated memory, it will flag it for you.
You're using sprintf which is
inherently unsafe; unless you're 100%
positive that you're not going to
exceed the size of the buffer, you
should almost always prefer snprintf.
The same applies to strcat; prefer the
safer alternative strncat.
Yeah..... I mostly do .NET lately and old habits die hard. I likely pulled that code out of something else that was written before my time...
But I'll try not to use those in the future ;)
You know it might not even be your code... Are there any other programs running that could have a memory leak?
It could be your processor. Some CPUs can't address single bytes, and require you to work in words or chunk sizes, or have instructions that can only be used on word or chunk aligned data.
Usually the compiler is made aware of these and works around them, but sometimes you can malloc a region as bytes, and then try to address it as a structure or wider-than-a-byte field, and the compiler won't catch it, but the processor will throw a data exception later.
It wouldn't happen unless you're using an unusual CPU. ARM9 will do that, for example, but i686 won't. I see it's tagged windows mobile, so maybe you do have this CPU issue.
Instead of doing malloc followed by memset, you should be using calloc which will clear the newly allocated memory for you. Other than that, do what Joel said.
NB borrowed some comments from other answers and integrated into a whole. The code is all mine...
Check your error codes. E.g. malloc can return NULL if no memory is available. This could be causing your data abort.
sizeof(char) is 1 by definition
Use snprintf not sprintf to avoid buffer overruns
If EZMPPOST etc are constants, then you don't need a format string, you can just combined several string literals as STRING1 " " STRING2 " " STRING3 and strcat the whole lot.
You are using much more memory than you need to.
With one minor change, you don't need to call memset in the first place. Nothing
really requires zero initialisation here.
This code does the same thing, safely, runs faster, and uses less memory.
// sizeof(char) is 1 by definition. This memory does not require zero
// initialisation. If it did, I'd use calloc.
const int max_msg = 2048;
char *msg = (char*)malloc(max_msg);
if(!msg)
{
// Allocaton failure
return;
}
// Use snprintf instead of sprintf to avoid buffer overruns
// we write directly to msg, instead of using a temporary buffer and then calling
// strcat. This saves CPU time, saves the temporary buffer, and removes the need
// to zero initialise msg.
snprintf(msg, max_msg, "%s %s/%s %s%s", EZMPPOST, EZMPTAG, EZMPVER, TYPETXT, EOL);
//Add Data
size_t len = wcslen(gdevID);
// No need to zero init this
char* temp = (char*)malloc(len);
if(!temp)
{
free(msg);
return;
}
wcstombs(temp, gdevID, len);
// No need to use a temporary buffer - just append directly to the msg, protecting
// against buffer overruns.
snprintf(msg + strlen(msg),
max_msg - strlen(msg), "%s: %s%s", "DeviceID", temp, EOL);
free(temp);