How do I avoid using a constant for filename size? - c++

It seems like standard programming practice and the POSIX standard are at odds with each other. I'm working with a program and I noticed that I see a lot of stuff like:
char buf[NAME_MAX + 1]
And I'm also seeing that a lot of operating systems don't define NAME_MAX, and say that they technically don't have to according to POSIX, because you're supposed to use pathconf to get the configured value at runtime rather than hard-coding it as a constant anyway.
The problem is that the compiler won't let me use pathconf this way with arrays. Even if I try storing the result of pathconf in a const int, it still throws a fit and says it has to be a constant. So it looks like in order to actually use pathconf, I would have to avoid using an array of chars for the buffer here because that apparently isn't good enough. So I'm caught between a rock and a hard place, because the C++ standard seemingly won't allow me to do what POSIX says I must do, that is determine the size of a character buffer for a filename at runtime rather than compile time.
The only information I've been able to find on this suggests that I would need to replace the array with a vector, but it's not clear how I would do it. When I test using a simple program, I can get this to work:
std::vector<char> buf((pathconf("/", _PC_NAME_MAX) + 1));
And then I can figure out the size by calling buf.size() or something. But I'm not sure if this is the right approach at all. Does anyone have any experience with trying to get a program to stop depending on constants like NAME_MAX or MAXNAMLEN being defined in the system headers and getting the implementation to use pathconf at runtime instead?

Halfway measures do tend to result in conflicts of some sort.
const unsigned NAME_MAX = /* get the value at runtime */;
char buf[NAME_MAX + 1];
The second line declares a C-style array (presumably) intended to hold a C-style string. In C, this is fine. In C++, there is an issue because the value of NAME_MAX is not known at compile time. That's why I called this a halfway measure—there is a mix of C-style code and C++ compiling. (Some compilers will allow this in C++. Apparently yours does not.)
The C++ approach would use C++-style strings, as in:
std::string buf;
That's it. The size does not need to be specified since memory will be allocated as needed, provided you avoid C-style interfaces. Use streaming (>>) when reasonable. If the buffer is being filled by user or file input, this should be all you need.
If you need to use C-style strings (perhaps this buffer is being filled by a system call written for C?), there are a few options for allocating the needed space. The simplest is probably a vector, much like you were thinking.
std::vector<char> buf(NAME_MAX + 1); // Parentheses: braces would create a one-element vector.
system_call(buf.data()); // Send a char* to the system call.
Alternatively, you could use a C++-style string, which could make manipulating the data more convenient.
std::string buf(NAME_MAX + 1, '\0'); // Parentheses: braces would be read as a list of characters.
system_call(&buf[0]); // Send a char* to the system call; buf.data() also works since C++17.
There is also a smart pointer option, but the vector approach might play nicer with existing code written for a C-style array.
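To tie this back to the question, here is a minimal sketch of sizing the vector from pathconf at runtime. The helper name make_name_buffer and the fallback value of 255 are assumptions for illustration and are not part of POSIX. pathconf returns -1 both on error and when there is no fixed limit, so some fallback is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <unistd.h>  // pathconf, _PC_NAME_MAX (POSIX)
#include <vector>

// Hypothetical helper: returns a zero-filled buffer large enough for one
// file name in `dir`, plus the terminating NUL.
std::vector<char> make_name_buffer(const char* dir, std::size_t fallback = 255) {
    assert(dir != nullptr);
    long limit = pathconf(dir, _PC_NAME_MAX);
    // pathconf returns -1 on error or when the limit is indeterminate,
    // so fall back to an assumed size in that case.
    std::size_t name_max =
        (limit > 0) ? static_cast<std::size_t>(limit) : fallback;
    return std::vector<char>(name_max + 1, '\0');
}
```

A caller would then pass buf.data() and buf.size() to whatever C-style interface used to receive the fixed-size array.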

Related

Safe Use of strcpy

Plain old strcpy is prohibited in its use in our company's coding standard because of its potential for buffer overflows. I was looking the source for some 3rd Party Library that we link against in our code. The library source code has a use of strcpy like this:
for (int i = 0; i < newArgc; i++)
{
newArgv[i] = new char[strlen(argv[i]) + 1];
strcpy(newArgv[i], argv[i]);
}
Since strlen is used while allocating memory for the buffer to be copied to, this looks fine. Is there any possible way someone could exploit this normal strcpy, or is this safe as I think it looks to be?
I have seen naïve uses of strcpy that lead to buffer overflow situations, but this does not seem to have that since it is always allocating the right amount of space for the buffer using strlen and then copying to that buffer using the argv[] as the source which should always be null terminated.
I am honestly curious if someone running this code with a debugger could exploit this or if there are any other tactics someone that was trying to hack our binary (with this library source that we link against in its compiled version) could use to exploit this use of strcpy. Thank you for your input and expertise.
It is possible to use strcpy safely - it's just quite hard work (which is why your coding standards forbid it).
However, the code you have posted is not a vulnerability. There is no way to overwrite bits of memory with it; I would not bother rewriting it. (If you do decide to rewrite it, use std::string instead.)
Well, there are multiple problems with that code:
If an allocation throws, you get a memory-leak.
Using strcpy() instead of reusing the length is sub-optimal. Use std::copy_n() or memcpy() instead.
Presumably, there are no data-races, not that we can tell.
Anyway, that slight drop in performance is the only thing "wrong" with using strcpy() there. At least if you insist on manually managing your strings yourself.
Deviating from a coding standard should always be possible, but then document well why you decided to do so.
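If a rewrite with std::string is on the table, a rough sketch could look like the following. The name copy_args is hypothetical and not taken from the library in question. Because the vector owns every string, an allocation that throws partway through does not leak the earlier copies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: copies argc C strings into owning std::string objects.
std::vector<std::string> copy_args(int argc, const char* const* argv) {
    assert(argc >= 0 && argv != nullptr);
    std::vector<std::string> result;
    result.reserve(static_cast<std::size_t>(argc));
    for (int i = 0; i < argc; ++i) {
        // std::string measures the source and allocates exactly what it needs.
        result.emplace_back(argv[i]);
    }
    return result;
}
```

The trade-off is that code expecting a char** would need adapting, for example by building a temporary array of c_str() pointers.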
The main problem with strcpy is that it has no length limitation. With care this is not a problem, but it means that strcpy always has to be accompanied by some safeguarding code. Many less experienced coders have fallen into this pitfall, which is why the coding guideline exists.
Possible ways to handle string copy safely are:
Check the string length
Use a safe variant like strlcpy, or on older Microsoft compilers, strncpy_s.
As a general strcpy replacement idiom, assuming you're okay with the slight overhead of the print formatting functions, use snprintf:
snprintf(dest, dest_total_buffer_length, "%s", source);
e.g.
snprintf(newArgv[i], strlen(argv[i]) + 1, "%s", argv[i]);
It's safe, simple, and you don't need to think about the +1/-1 size adjustment.
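One thing the snprintf idiom does not do by itself is tell you whether the text was cut off. A small sketch of checking the return value for that follows; copy_checked is a made-up name for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: copies src into dest, which holds dest_size bytes.
// Returns true if the whole string fit, false if it was truncated.
bool copy_checked(char* dest, std::size_t dest_size, const char* src) {
    assert(dest != nullptr && src != nullptr && dest_size > 0);
    // snprintf returns the length the full output would have had,
    // not counting the terminating NUL.
    int needed = std::snprintf(dest, dest_size, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < dest_size;
}
```

Either way the destination ends up NUL-terminated, so a false result means "truncated", not "overflowed".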

What can be the Alternative of strlen()?

I found people suggesting not to use strlen(), giving the reason that its time complexity is not O(1). Then what should I use on a character array to find its length with good time complexity, other than checking
s[i]!='\0'
If you're going to use a C-style array of char, then I don't think there's any way to work out the length other than by looping through until you get to the end. You can do this yourself or call strlen.
However, as you have tagged your question with c++, you should be using a std::string instead, then you can use the length (or equivalently, size) method instead.
If you are using C++, don't use C-style strings (i.e. char*). Use std::string instead. The C++ standard requires std::string::size to be of O(1) complexity. It can be achieved by computing the length of the string once and storing it in a data member (and updating if the string length changes).
However, if we are talking about C, then there is no other way. C uses null-terminated strings.
Interestingly, some languages (like Pascal) implement strings differently. Instead of using null-terminated strings (like C does), they opt for length-prefixed strings. It means that each string has its length recorded at the beginning. This has some problems, for example, now you're spending additional bytes to store the length counter, you can only store a finite amount of characters in the string (for example, if you use 2 bytes for length your string may contain at most 2^16 characters), and working with substrings becomes a chore.
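If you're using a char[] whose size was deduced from a string literal, then you can use sizeof(str) - 1 instead (this measures the array, not the string, so it is wrong for larger buffers and for pointers). Note std::strlen(str) on a literal is likely to be optimized out anyway. For example, using GCC 4.9.2 with -O2, the following code: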
std::cout << sizeof("hello") - 1;
std::cout << std::strlen("hello");
result in this assembly:
[...]
movl $5, %esi
[...]
You should be using std::string anyway. std::string::size and std::string::length both have constant complexity, because it might be implemented like this (libstdc++):
size_type
size() const _GLIBCXX_NOEXCEPT
{ return _M_rep()->_M_length; }
As others have suggested, you can store the length in a variable and minimize the calls to std::strlen. In general though, this smells like a case of premature optimization. The C++ standard library is the superior choice, generally.
As someone already mentioned in a comment, you can easily have a structure that holds the Length and the Data like:
struct string {
    char * Data;
    int Length;
};
You could also have resizeable strings, for which you would also want to hold the capacity.
Either way, you could have both the efficiency of knowing how long it is, and library compatibility. All you need to do is make sure that when you create the string, you make the buffer one byte larger than the length, and set that last byte to null. You have to be careful that this always happens, or else it will cause huge issues.
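As a rough sketch of that idea, the version below keeps the length next to a NUL-terminated buffer. CountedString and make_counted are illustrative names, and std::unique_ptr is used here only so the example does not leak; the struct above uses a raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Illustrative length-plus-data string; not a replacement for std::string.
struct CountedString {
    std::unique_ptr<char[]> data;  // always length + 1 bytes, NUL-terminated
    std::size_t length;            // number of characters before the NUL
};

CountedString make_counted(const char* src) {
    assert(src != nullptr);
    std::size_t n = std::strlen(src);  // the only O(n) scan
    CountedString s{std::unique_ptr<char[]>(new char[n + 1]), n};
    std::memcpy(s.data.get(), src, n);
    s.data[n] = '\0';  // keep the buffer usable by C library functions
    return s;
}
```

After construction, s.length is available in constant time and s.data.get() can still be handed to C functions.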

Does buffer overflow happen in C++ strings?

This is concerning strings in C++. I have not touched C/C++ for a very long time; in fact, I programmed in those languages only during my first year of college, about 7 years ago.
In C, to hold strings I had to create character arrays (whether static or dynamic is not the concern here). That meant I needed to guess well in advance the size of the string that an array would contain. I applied the same approach in C++. I was aware that there was a std::string class, but I never got around to using it.
My question is: since we never declare the size of an array/string with the std::string class, can a buffer overflow occur when writing to it? I mean, in C, if the array’s size was 10 and I typed more than 10 characters on the console, the extra data would be written into some other object’s memory, adjacent to the array. Can a similar thing happen with std::string when using the cin object?
Do I have to guess the size of the string before hand in C++ when using std::string?
Well! Thanks to you all. There is no one right answer on this page (a lot of different explanations provided), so I am not selecting any single one as such. I am satisfied with the first 5. Take Care!
Depending on the member(s) you are using to access the string object, yes. So, for example, if you use reference operator[](size_type pos) where pos > size(), yes, you would.
Assuming no bugs in the standard library implementation, no. A std::string always manages its own memory.
Unless, of course, you subvert the accessor methods that std::string provides, and do something like:
std::string str = "foo";
char *p = (char *)str.c_str();
strcpy(p, "blah");
You have no protection here, and are invoking undefined behaviour.
The std::string generally protects against buffer overflow, but there are still situations in which programming errors can lead to buffer overflows. While C++ generally throws an out_of_range exception when an operation references memory outside the bounds of the string, the subscript operator [] (which does not perform bounds checking) does not.
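To make the difference between at() and operator[] concrete, here is a small sketch; try_get and demo_try_get are made-up names. at() throws std::out_of_range for a bad index, while operator[] with the same index would be undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: stores s[pos] in `out` and returns true when pos is
// valid, otherwise returns false instead of touching out-of-bounds memory.
bool try_get(const std::string& s, std::size_t pos, char& out) {
    try {
        out = s.at(pos);  // bounds-checked access
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

void demo_try_get() {
    std::string s = "foo";
    char c = '\0';
    assert(try_get(s, 1, c) && c == 'o');
    assert(!try_get(s, 3, c));  // index 3 is past the last character
}
```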
Another problem occurs when converting std::string objects to C-style strings. If you use string::c_str() to do the conversion, you get a properly null-terminated C-style string. Before C++11, string::data() was not required to return a null-terminated buffer, so passing its result to a function expecting a C string could read past the end. Since C++11, data() and c_str() are equivalent and both return a null-terminated array.
Finally, many existing C++ programs and libraries have their own string classes. To use these libraries, you may have to use these string types or constantly convert back and forth. Such libraries are of varying quality when it comes to security. It is generally best to use the standard library (when possible) or to understand the semantics of the selected library. Generally speaking, libraries should be evaluated based on how easy or complex they are to use, the type of errors that can be made, how easy these errors are to make, and what the potential consequences may be.
refer https://buildsecurityin.us-cert.gov/bsi/articles/knowledge/coding/295-BSI.html
In C, the cause is explained as follows:
void function (char *str) {
char buffer[16];
strcpy (buffer, str);
}
int main () {
char *str = "I am greater than 16 bytes"; // length of str = 27 bytes
function (str);
}
This program is guaranteed to cause unexpected behavior, because a string (str) of 27 bytes has been copied to a location (buffer) that has been allocated for only 16 bytes. The extra bytes run past the buffer and overwrite the space allocated for the FP, return address and so on. This, in turn, corrupts the process stack. The function used to copy the string is strcpy, which performs no bounds checking. Using strncpy with the destination size would have prevented this corruption of the stack (though strncpy does not null-terminate the result when the source is too long). However, this classic example shows that a buffer overflow can overwrite a function's return address, which in turn can alter the program's execution path. Recall that a function's return address is the address of the next instruction in memory, which is executed immediately after the function returns.
Here is a good tutorial that should answer your question satisfactorily.
In C++, the std::string class starts out with a minimum size (or you can specify a starting size). If that size is exceeded, std::string allocates more dynamic memory.
Assuming the library providing std::string is correctly written, you cannot cause a buffer overflow by adding characters to a std::string object.
Of course, bugs in the library are not impossible.
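For the console case in the question, a short sketch of reading input of unknown length into a std::string follows. read_line and demo_read_line are illustrative names, and a std::istringstream stands in for std::cin so the example is repeatable.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line of any length from `in`.
std::string read_line(std::istream& in) {
    std::string line;
    std::getline(in, line);  // the string grows as needed; no size to guess
    return line;
}

void demo_read_line() {
    // 100000 characters, far more than any fixed array someone would guess.
    std::istringstream input(std::string(100000, 'x') + "\nrest");
    std::string line = read_line(input);
    assert(line.size() == 100000);
}
```

With std::cin the call would simply be read_line(std::cin).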
"Do buffer overflows occur in C++ code?"
To the extent that C programs are legal C++ code (they almost all are), and C programs have buffer overflows, C++ programs can have buffer overflows.
Being richer than C, I'm sure C++ can have buffer overflows in ways that C cannot :-}

Safe counterparts of itoa()?

I am converting some old c program to a more secure version. The following functions are used heavily, could anyone tell me their secure counterparts? Either windows functions or C runtime library functions. Thanks.
itoa()
getchar()
strcat()
memset()
itoa() is safe as long as the destination buffer is big enough to receive the largest possible representation (i.e. of INT_MIN with trailing NUL). So, you can simply check the buffer size. Still, it's not a very good function to use because if you change your data type to a larger integral type, you need to switch to a different, equally non-standard function (ltoa and similar). If you want a dynamic buffer that handles whatever type you throw at it with fewer maintenance issues, consider a std::ostringstream (from the <sstream> header).
getchar() has no "secure counterpart" - it's not insecure to begin with and has no buffer overrun potential.
Re memset(): it's dangerous in that it accepts the programmer's judgement that memory should be overwritten without any confirmation of the content/address/length, but when used properly it leaves no issue, and sometimes it's the best tool for the job even in modern C++ programming. To check security issues with this, you need to inspect the code and ensure it's aimed at a suitable buffer or object to be 0ed, and that the length is computed properly (hint: use sizeof where possible).
strcat() can be dangerous if the strings being concatenated aren't known to fit into the destination buffer. For example: char buf[16]; strcpy(buf, "one,"); strcat(buf, "two"); is all totally safe (but fragile, as further operations or changing either string might require more than 16 chars and the compiler won't warn you), whereas strcat(buf, argv[0]) is not. The best replacement tends to be a std::ostringstream, although that can require significant reworking of the code. You may get away with using strncat(), or even - if you have it - asprintf(&result, "%s%s", first, second), where result is a char* that receives the required amount of memory allocated on the heap (do remember to free() it). You could also consider std::string and use operator+ to concatenate strings.
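If the C-style buffers have to stay, one possible way to make the "known to fit" check explicit is sketched below. append_if_fits is a hypothetical helper, not a standard or Windows function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends src to the NUL-terminated string in dest only
// if the result, including its NUL, fits in dest_size bytes.
// Returns false and leaves dest unchanged otherwise.
bool append_if_fits(char* dest, std::size_t dest_size, const char* src) {
    assert(dest != nullptr && src != nullptr);
    std::size_t used = std::strlen(dest);
    std::size_t extra = std::strlen(src);
    // Need used + extra + 1 <= dest_size, written to avoid unsigned wraparound.
    if (used >= dest_size || extra >= dest_size - used) {
        return false;
    }
    std::memcpy(dest + used, src, extra + 1);  // +1 copies the terminating NUL
    return true;
}
```

Unlike strncat(), the caller passes the total buffer size and gets a clear signal when the text did not fit.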
None of these functions are "insecure" provided you understand the behaviour and limitations. itoa is not standard C and should be replaced with sprintf(buf, "%d", ...) if that's a concern to you.
The others are all fine to the experienced practitioner. If you have specific cases which you think may be unsafe, you should post them.
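As a sketch of the itoa replacement, the function below formats an int with snprintf into a buffer sized from the type's limits. int_to_string is an illustrative name; in C++11 and later, std::to_string does the same job.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

// Hypothetical helper: decimal text for an int, without guessing buffer sizes.
std::string int_to_string(int value) {
    // An int has at most digits10 + 1 decimal digits; add a sign and a NUL.
    char buf[std::numeric_limits<int>::digits10 + 3];
    int written = std::snprintf(buf, sizeof buf, "%d", value);
    assert(written > 0 && static_cast<std::size_t>(written) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(written));
}
```

snprintf is used rather than sprintf so that a wrong size calculation would truncate instead of overflowing.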
I'd replace itoa(), because it's not standard, with sprintf or, better, snprintf if your goal is code security. I'd also replace strcat() with strncat() but, since you mentioned C++ too, a much better idea would be to use the std::string class.
As for the other two functions, I can't see how you could make the code more secure without seeing your code.

Using sprintf without a manually allocated buffer

In the application that I am working on, the logging facility makes use of sprintf to format the text that gets written to file. So, something like:
char buffer[512];
sprintf(buffer, ... );
This sometimes causes problems when the message that gets sent in becomes too big for the manually allocated buffer.
Is there a way to get sprintf behaviour without having to manually allocate memory like this?
EDIT: while sprintf is a C operation, I'm looking for C++ type solutions (if there are any!) for me to get this sort of behaviour...
You can use asprintf(3) (note: non-standard) which allocates the buffer for you so you don't need to pre-allocate it.
No, you can't make sprintf() allocate enough memory. Alternatives include:
use snprintf() to truncate the message - does not fully resolve your problem, but prevents the buffer overflow issue
double (or triple or ...) the buffer - unless you're in a constrained environment
use C++ std::string and ostringstream - but you'll lose the printf format, you'll have to use the << operator
use Boost Format that comes with a printf-like % operator
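For the ostringstream option, a minimal sketch of building a log line without any fixed-size buffer is shown here. make_log_line, demo_make_log_line and the message layout are invented for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical log-line builder: the stream grows to fit whatever is written.
std::string make_log_line(const std::string& component, int code,
                          const std::string& detail) {
    std::ostringstream out;
    out << '[' << component << "] code=" << code << ": " << detail;
    return out.str();
}

void demo_make_log_line() {
    assert(make_log_line("net", 7, "timeout") == "[net] code=7: timeout");
}
```

The cost, as noted in the list, is giving up printf-style format strings.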
I don't know of a version that avoids allocation either, but C99 snprintf accepts a NULL pointer as the destination (with a size of 0) and returns the length the output would need. Not very efficient, but this gives you the complete string (as long as enough memory is available) without risking overflow:
length = snprintf(NULL, 0, ...);
str = malloc(length + 1);
snprintf(str, length + 1, ...);
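Since the question asks for something C++-flavoured, the same two-pass trick can be wrapped so that it returns a std::string. This is only a sketch; format_string is a made-up name, and it relies on vsnprintf behaving as specified in C99/C++11.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: printf-style formatting into a std::string of exactly
// the required size. Returns an empty string if formatting fails.
std::string format_string(const char* fmt, ...) {
    assert(fmt != nullptr);
    va_list args;
    va_start(args, fmt);

    // First pass: measure. A va_list cannot be reused, so work on a copy.
    va_list args_copy;
    va_copy(args_copy, args);
    int length = std::vsnprintf(nullptr, 0, fmt, args_copy);
    va_end(args_copy);

    if (length < 0) {
        va_end(args);
        return std::string();
    }

    // Second pass: format into a buffer with room for the terminating NUL.
    std::vector<char> buf(static_cast<std::size_t>(length) + 1);
    std::vsnprintf(buf.data(), buf.size(), fmt, args);
    va_end(args);
    return std::string(buf.data(), static_cast<std::size_t>(length));
}
```

A logging call could then be written as format_string("%s failed with code %d", name, code), with no 512-byte limit to outgrow.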
"the logging facility makes use of sprintf to format the text that gets written to file"
fprintf() does not impose any size limit. If you can write the text directly to file, do so!
I assume there is some intermediate processing step, however. If you know how much space you need, you can use malloc() to allocate that much space.
One technique at times like these is to allocate a reasonable-size buffer (that will be large enough 99% of the time) and if it's not big enough, break the data into chunks that you process one by one.
With the vanilla version of sprintf, there is no way to prevent the data from overwriting the passed-in buffer. This is true regardless of whether the memory was manually allocated or allocated on the stack.
In order to prevent the buffer from being overwritten you'll need to use one of the more secure versions of sprintf, like sprintf_s (Windows only)
http://msdn.microsoft.com/en-us/library/ybk95axf.aspx