gcc-8 -Wstringop-truncation what is the good practice?

GCC 8 added a -Wstringop-truncation warning. From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82944 :
The -Wstringop-truncation warning added in GCC 8.0 via r254630 for bug 81117 is specifically intended to highlight likely unintended uses of the strncpy function that truncate the terminating NUL character from the source string. An example of such a misuse given in the request is the following:
char buf[2];
void test (const char* str)
{
strncpy (buf, str, strlen (str));
}
I get the same warning with this code.
strncpy(this->name, name, 32);
warning: 'char* strncpy(char*, const char*, size_t)' specified bound 32 equals destination size [-Wstringop-truncation]
Here this->name is char name[32] and name is a char* whose length may exceed 32. I would like to copy name into this->name and truncate it if it is longer than 32 characters. Should the size argument be 31 instead of 32? I'm confused. this->name is not required to be NUL-terminated.

This message is trying to warn you that you're doing exactly what you're doing. A lot of the time, that's not what the programmer intended. If it is what you intended (meaning, your code will correctly handle the case where the character array will not end up containing any null character), turn off the warning.
If you do not want to or cannot turn it off globally, you can turn it off locally, as pointed out by @doron:
#include <string.h>
char d[32];
void f(const char *s) {
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-truncation"
strncpy(d, s, 32);
#pragma GCC diagnostic pop
}

This new GCC warning renders strncpy() mostly unusable in many projects: code review will not accept code that produces warnings. But if strncpy() is only used with strings short enough that it can write the terminating zero byte, then zeroing out the destination buffer first and calling plain strcpy() does the same job.
Actually, strncpy() is one of the functions that would have been better left out of the C library. There are legitimate use cases for it, sure. But the library designers forgot to put counterparts that understand fixed-size strings into the standard as well. The most important such functions, strnlen() and strndup(), were only added to POSIX.1 in 2008, decades after strncpy() was created! And there is still no function that copies a fixed-length string produced by strncpy() into a preallocated buffer with correct C semantics, i.e. always writing the terminating 0 byte. One such function could be:
// Copy string "in" with at most "insz" chars to buffer "out", which
// is "outsz" bytes long. The output is always 0-terminated. Unlike
// strncpy(), strncpy_t() does not zero fill remaining space in the
// output buffer:
char* strncpy_t(char* out, size_t outsz, const char* in, size_t insz){
assert(outsz > 0);
while(--outsz > 0 && insz > 0 && *in) { *out++ = *in++; insz--; }
*out = 0;
return out;
}
I recommend using two length inputs for strncpy_t() to avoid confusion: with only a single size argument, it would be unclear whether it is the size of the output buffer or the maximum length of the input string (which is usually one less).
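To make the two size arguments concrete, here is a small usage sketch. The strncpy_t() body is repeated from above so the snippet compiles on its own; the Record struct and record_name() are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Same logic as strncpy_t() above, repeated so this sketch is self-contained.
char* strncpy_t(char* out, std::size_t outsz, const char* in, std::size_t insz) {
    assert(outsz > 0);
    while (--outsz > 0 && insz > 0 && *in) { *out++ = *in++; insz--; }
    *out = 0;
    return out;  // points at the terminating 0 byte
}

// Hypothetical record with a fixed-width field that may lack a terminator.
struct Record {
    char name[8];
};

// Copies rec.name into a caller-supplied buffer as a proper C string and
// returns the length of the result. insz is the field width, outsz the
// capacity of the output buffer.
std::size_t record_name(const Record& rec, char* out, std::size_t outsz) {
    char* end = strncpy_t(out, outsz, rec.name, sizeof rec.name);
    return static_cast<std::size_t>(end - out);
}
```

With an 8-byte field that is completely full, a 16-byte output buffer receives all 8 characters plus the terminator, while a 4-byte output buffer receives the first 3 characters plus the terminator.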

There are very few justified cases for using strncpy. It is quite a dangerous function. If the source string length (without the null character) is greater than or equal to the destination buffer size, then strncpy will not add the null character at the end of the destination buffer. So the destination buffer will not be null-terminated.
We should write this kind of code on Linux:
size_t lenSrc = strnlen(pSrc, destSize);
if (lenSrc < destSize)
memcpy(pDest, pSrc, lenSrc + 1);
else {
/* Handle error... */
}
In your case, if you want to truncate the source on copy, but still want a null terminated destination buffer, then you could write this kind of code:
size_t destSize = 32;
size_t sizeCp = strnlen(pSrc, destSize - 1);
memcpy(pDest, pSrc, sizeCp);
pDest[sizeCp] = '\0';
Edit: Oh... If it is not mandatory for the destination to be NUL-terminated, strncpy is the right function to use. And yes, you need to call it with 32 and not 31.
I think you need to ignore this warning by disabling it... Honestly I do not have a good answer for that...
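For a field that is deliberately allowed to stay unterminated, GCC 8 also documents a nonstring variable attribute meant for such arrays, and its documentation for -Wstringop-truncation mentions it as a way to avoid the warning. A minimal sketch, assuming GCC 8 or later (the macro expands to nothing elsewhere); Entry, set_name, name_length and NONSTRING are illustrative names, not from the question:

```cpp
#include <cassert>
#include <cstddef>
#include <string.h>

// Assumption: the attribute is only applied on GCC 8+, where it is documented.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 8
#define NONSTRING __attribute__((nonstring))
#else
#define NONSTRING
#endif

struct Entry {
    // Fixed-width field: NUL-padded when short, unterminated when full.
    char name[32] NONSTRING;
};

// Copies up to 32 bytes; longer names are silently truncated.
void set_name(Entry& e, const char* name) {
    assert(name != nullptr);
    strncpy(e.name, name, sizeof e.name);
}

// Readers must bound every access by the field width, never rely on a NUL.
std::size_t name_length(const Entry& e) {
    return strnlen(e.name, sizeof e.name);
}
```

The price is that every consumer of the field has to treat it as a byte array of known width (strnlen, memcpy, "%.*s") rather than as a C string.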
Edit2: To mimic the strncpy function (apart from its zero padding of the remaining bytes), you could write this code:
size_t destSize = 32;
size_t sizeCp = strnlen(pSrc, destSize - 1);
memcpy(pDest, pSrc, sizeCp + 1);

TL;DR: handle the truncation case and the warning will disappear.
This warning happened to be really useful for me, as it uncovered an issue in my code. Consider this listing:
#include <string.h>
#include <stdio.h>
int main() {
const char long_string[] = "It is a very long string";
char short_string[8];
strncpy(short_string, long_string, sizeof(short_string));
/* This line is extremely important, it handles string truncation */
short_string[7] = '\0';
printf("short_string = \"%s\"\n", short_string);
return 0;
}
As the comment says, short_string[7] = '\0'; is necessary here. From the strncpy man page:
Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.
If we remove this line, the printf call invokes undefined behavior (UB), because short_string is then not null-terminated. For example, for me, the program starts printing:
short_string = "It is a It is a very long string"
Basically, GCC wants you to fix the UB. I added such handling to my code and the warning is gone.
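If you also want to know whether truncation happened, one option is to let snprintf do the copy, since it always terminates the destination and returns the length it would have needed. This is a swapped-in technique rather than what the listing above does, and copy_truncating is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Copies src into dst (capacity dst_size) and always NUL-terminates.
// Returns true if the whole string fit, false if it was truncated.
bool copy_truncating(char* dst, std::size_t dst_size, const char* src) {
    assert(dst_size > 0);
    // snprintf returns the number of characters the full output needs,
    // not counting the terminator, even when the output was cut short.
    int needed = std::snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < dst_size;
}
```

The caller can then decide whether a truncated name is acceptable or should be treated as an error.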

The responses from others led me to just write a simple version of strncpy.
#include <string.h>
char* mystrncpy(char* dest, const char* src, size_t n) {
memset(dest, 0, n);
memcpy(dest, src, strnlen(src, n-1));
return dest;
}
It avoids the warnings and guarantees dest is null terminated. I'm using the g++ compiler and wanted to avoid pragma entries.
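One edge case worth keeping in mind: with n == 0, the expression n-1 wraps around to a huge value. A guarded variant could look like the sketch below; mystrncpy_checked is a hypothetical name, and it pads only the unused tail instead of clearing the whole buffer first.

```cpp
#include <cassert>
#include <cstddef>
#include <string.h>

// Like mystrncpy() above, but safe for n == 0.
// For n > 0 the result is always NUL-terminated and NUL-padded.
char* mystrncpy_checked(char* dest, const char* src, std::size_t n) {
    assert(dest != nullptr && src != nullptr);
    if (n == 0) {
        return dest;  // no room for anything, not even a terminator
    }
    std::size_t len = strnlen(src, n - 1);  // at most n-1 characters
    memcpy(dest, src, len);
    memset(dest + len, 0, n - len);         // n - len >= 1, so this terminates
    return dest;
}
```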

I found this while looking for a near-perfect solution to this problem. Most of the answers here describe ways to handle the truncation rather than ways to suppress the warning. The accepted answer suggests the following wrapper, which for me resulted in another set of warnings, which is frustrating and undesirable.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-truncation"
...
#pragma GCC diagnostic pop
Instead, I found this working solution. I can't say whether there are any pitfalls, but it does the job nicely.
_Pragma("GCC diagnostic push")
_Pragma("GCC diagnostic ignored \"-Wstringop-truncation\"")
strncpy(d, s, 32);
_Pragma("GCC diagnostic pop")

I found the best way to suppress the warning is to put the expression in parentheses like this gRPC patch:
(strncpy(req->initial_request.name, lb_service_name,
GRPC_GRPCLB_SERVICE_NAME_MAX_LENGTH));
The problem with #pragma diagnostics suppression solution is that the #pragma itself will cause a warning when the compiler does not recognize either the pragma or the particular warning; also it is too verbose.
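If the concern is that other compilers do not know the pragma or the warning name, the suppression can be hidden behind macros that expand to nothing elsewhere. A sketch, assuming the warning only exists in GCC 8 and later; TRUNCATION_OK_BEGIN and TRUNCATION_OK_END are made-up names, and whether _Pragma inside a macro reliably suppresses diagnostics can depend on the compiler version:

```cpp
#include <cassert>
#include <cstddef>
#include <string.h>

#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 8
// Only GCC 8+ sees the pragmas, so other compilers have nothing to complain about.
#define TRUNCATION_OK_BEGIN \
    _Pragma("GCC diagnostic push") \
    _Pragma("GCC diagnostic ignored \"-Wstringop-truncation\"")
#define TRUNCATION_OK_END _Pragma("GCC diagnostic pop")
#else
#define TRUNCATION_OK_BEGIN
#define TRUNCATION_OK_END
#endif

char d[32];

// d is a fixed-width field here; it is NOT guaranteed to be NUL-terminated.
void f(const char* s) {
    assert(s != nullptr);
    TRUNCATION_OK_BEGIN
    strncpy(d, s, sizeof d);
    TRUNCATION_OK_END
}
```

This keeps the verbosity in one place and documents at the call site that the truncation is intentional.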

What the warning says is that we can only copy len - 1 characters, because the last one should be '\0'. Copying only len - 1 characters and then writing the terminator yourself seems to clear the warning.
For example:
strncpy(this->name, name, 31);
this->name[31] = '\0';
or
#include <string.h>
char d[32];
void f(const char *s) {
    strncpy(d, s, 31);
    d[31] = '\0';
}

Related

std::string constructor with lvalue throws with clang

I am making use of Matei David's handy C++ wrapper for zlib, but I get an error when compiling on macOS (clang-1100.0.33):
include/strict_fstream.hpp:39:37: error: cannot initialize a parameter of type 'const char *' with an lvalue of type 'int'
The problem is here:
/// Overload of error-reporting function, to enable use with VS.
/// Ref: http://stackoverflow.com/a/901316/717706
static std::string strerror()
{
std::string buff(80, '\0');
// options for other environments omitted, where error message is set
// if not Win32 or _POSIX_C_SOURCE >= 200112L, error message is left empty.
auto p = strerror_r(errno, &buff[0], buff.size());
// error next line
std::string tmp(p, std::strlen(p));
std::swap(buff, tmp);
buff.resize(buff.find('\0'));
return buff;
}
(Which IIUC has nothing to do with zlib, just trying to report errors in a thread safe manner).
If I change to this:
static std::string strerror()
{
std::string buff(80, '\0');
auto p = strerror_r(errno, &buff[0], buff.size());
// "fix" below
size_t length = buff.size();
std::string tmp(p, length);
std::swap(buff, tmp);
buff.resize(buff.find('\0'));
return buff;
}
My program compiles and runs fine.
I have two questions:
Why does clang not like the constructor std::string tmp(p, std::strlen(p));?
The buffer was declared at the beginning of the function as length 80. Why are we even bothering to look up the length?
The answer to 2 may answer this, but is there something wrong with my version?
Thanks.

If you use int strerror_r(int errnum, char *buf, size_t buflen);, then there is no appropriate string constructor and the program is ill formed.
If you use char *strerror_r(int errnum, char *buf, size_t buflen);, then the program is well-formed.
The C/POSIX standard library implementation determines which function you get. The compiler is involved only insofar as it influences which system library is used by default.
The former function is an extension to POSIX specified in XSI (which is essentially an optional part of POSIX), and the latter is a GNU extension.
In case you use glibc (I don't know if that is an option on MacOS), you can control which version you get with macros, although the XSI compliant version is not available in older versions. Its documentation says:
The XSI-compliant version of strerror_r() is provided if:
(_POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600) && ! _GNU_SOURCE
The buffer was declared at the beginning of the function as length 80. Why are we even bothering to look up the length?
In the construction std::string tmp(p, std::strlen(p));, strlen seems entirely unnecessary to me. std::string tmp(p); is equivalent.
If you don't need thread safety, then the most portable solution is to use std::strerror which is in standard C++:
return std::strerror(errno); // yes, it's this simple
If you do need thread safety, then you could wrap this in a critical section using a mutex.
Note that strerror, the name of your function, is reserved to the language implementation in the global namespace when the standard library is used. The function should be in a namespace, or be renamed.

There are two different versions of the strerror_r that you'll commonly see:
A POSIX-compliant version that always stores the error message in the provided buffer (if it succeeds) and returns an int (0 for success, non-zero for error)
A GNU version that may store the error message in the provided buffer, or maybe not. It returns a char* pointing to the error message, which may point to the user-supplied buffer or may point to some other global static storage.
That strerror function is clearly written to work with the GNU version of strerror_r.
As for your second question, you need the strlen. buff is 80 characters long, but the actual error message may be shorter and only partially fill the buffer. That strlen is being used to trim off any extra nul characters from the end.
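One way to cope with both variants without feature-test macros is to let overload resolution pick a handler based on what strerror_r returns. This is only a sketch under the assumption that one of the two variants described above is available; the errtext namespace and the function names in it are hypothetical.

```cpp
#include <cerrno>
#include <string>
#include <string.h>

namespace errtext {

// Selected when strerror_r is the int-returning variant:
// 0 means the message was stored in buf.
inline const char* result(int rc, const char* buf) {
    return rc == 0 ? buf : "unknown error";
}

// Selected when strerror_r is the GNU variant: the returned pointer is the
// message, which may or may not point into buf.
inline const char* result(const char* msg, const char* /*buf*/) {
    return msg;
}

// Thread-safe lookup of the message for errnum.
inline std::string message(int errnum) {
    char buf[128] = {0};
    // std::string stops at the first NUL, so no strlen or resize is needed.
    return std::string(result(strerror_r(errnum, buf, sizeof buf), buf));
}

}  // namespace errtext
```

Usage would be something like errtext::message(errno) right after the failing call. Putting it in a namespace also sidesteps the reserved global name strerror.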

Right way of defining a buffer when using sprintf

In my Arduino sketch I use sprintf to format a reply. The formatted data will be sent using SPI.
I have to set the buffer length, but I don't know what size would be appropriate. Here is a piece of the code:
uint8_t myValue = 4;
// Set buffer
unsigned char data[1000];
// Reset this buffer
memset(data, 0, sizeof(data));
// Format the reply
sprintf((char *)data, "This is the int value: %d", \
myValue);
I can also set the buffer size to 0 (unsigned char data[0];). The code compiles and the reply is the same as when using a large buffer. I can't explain this.
It seems malloc() and free() are pretty rare in the Arduino world...
What is the right way of using a buffer in Arduino?

I would recommend using snprintf instead, where you specify how large the buffer is. With the current approach sprintf assumes that the buffer is large enough, so if you pass it a small buffer it will stomp over some other memory. Not good (and may explain the weird results you sometimes get)!
You can use a fixed buffer as in your current code. Allocating also works, but it has more overhead.
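A sketch of what that could look like with a fixed buffer; format_reply is a hypothetical helper, and the C headers are used on the assumption that the Arduino toolchain provides them:

```cpp
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Formats the reply into a caller-supplied buffer.
// Returns the number of characters written (excluding the terminator),
// or -1 if the buffer was too small or formatting failed.
int format_reply(char* out, size_t out_size, uint8_t value) {
    // snprintf never writes more than out_size bytes, terminator included.
    int n = snprintf(out, out_size, "This is the int value: %d", value);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return n;
}
```

A caller would declare something like char data[32]; and pass sizeof data, so the buffer size only has to be stated once. The return value is also the number of bytes to send.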

I can also set the buffer to 0 (unsigned char data[0];),
If this is true, you're just getting lucky (or unlucky, depending on how you look at it). The behaviour of your code doing this is undefined, in both C and C++. Which, among other things, means it is not guaranteed to work (e.g. it might break if your compiler is ever updated).
As to better options .... Your question is tagged C++ and C, and different approaches are preferable in different languages.
Assuming you are using C, and your implementation complies with the 1999 standard or later, you can use snprintf().
#include <stdio.h>
#include <stdlib.h>
char *buffer;
int length = snprintf(NULL, 0, "This is the int value: %d", myValue);
buffer = malloc(length + 1); /* +1 allows for terminator */
snprintf(buffer, length + 1, "This is the int value: %d", myValue);
/* use buffer */
free(buffer);
The first call of snprintf() is used to obtain the required length. The second actually writes to the allocated buffer; its size argument must include the terminator, otherwise the last character is cut off. Remember to release the dynamically allocated buffer when done. (The above also does not check for errors: snprintf() returns a negative value if an error occurs, and malloc() may return NULL.)
If your C library does not include snprintf() then (AFAIK) there is no standard way to work out the length automatically - you will need to estimate an upper bound on length by hand (e.g. work out the length of output of the largest negative or positive int, and add that to the length of other content in your format string).
In C++, use a string stream
#include <sstream>
#include <string>
std::ostringstream oss;
// The cast matters: a uint8_t would otherwise be streamed as a character.
oss << "This is the int value: " << static_cast<int>(myValue);
std::string str = oss.str();
const char *buffer = str.c_str();
// readonly access to buffer here
// the objects above will be cleaned up when they pass out of scope.
If you insist on using the C approach in C++, the approach above using snprintf() can be used in C++ if your implementation complies with the 2011 standard (or later). However, in general terms, I would NOT recommend this.

Crash C++ application using sprintf_s without specifying string length

I have a C++ application for which a customer reported a crash, but the crash is not easily reproducible.
After analysing some logs, I found that the crash may occur somewhere in the following code. Is there any chance that these statements could crash the application?
//Test
std::string strAppName = "App1\0";
int nSize = 10;
sprintf_s(szBuff, "The appname %s have %d dependancies ", strAppName.c_str(), nSize);
//Then use the szBuff to log to a text file
//Test end

The problem is that you've not provided the correct arguments to sprintf_s:
int sprintf_s(
char *buffer,
size_t sizeOfBuffer,
const char *format [,
argument] ...
);
sprintf_s takes a size_t as its second argument (the size of szBuff), but you've not provided that. Instead, you've given it a const char * where that parameter should be. The only way to have compiled this is for you to have ignored compiler warnings.
So what sprintf_s is seeing is:
buffer to print into
large number of characters allowed to go into buffer
strAppName.c_str() as the format string
In other words, this isn't doing anything like what you want. Provide the size of szBuff as the second parameter, and I'll bet your problems go away.
And yes, given what you've done I'd expect crashes all over the place.

C++ std::string alternative to strcpy?

I know there is a similarly titled question already on SO but I want to know my options for this specific case.
MSVC compiler gives a warning about strcpy:
1>c:\something\mycontrol.cpp(65): warning C4996: 'strcpy': This function or
variable may be unsafe. Consider using strcpy_s instead. To disable
deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
Here's my code:
void MyControl::SetFontFace(const char *faceName)
{
LOGFONT lf;
CFont *currentFont = GetFont();
currentFont->GetLogFont(&lf);
strcpy(lf.lfFaceName, faceName); // <--- offending line
font_.DeleteObject();
// Create the font.
font_.CreateFontIndirect(&lf);
// Use the font to paint a control.
SetFont(&font_);
}
Note font_ is an instance variable. LOGFONT is a windows structure where lfFaceName is defined as TCHAR lfFaceName[LF_FACESIZE].
What I'm wondering is can I do something like the following (and if not why not):
void MyControl::SetFontFace(const std::string& faceName)
...
lf.lfFaceName = faceName.c_str();
...
Or if there is a different alternative altogether then let me know.

The reason you're getting the security warning is, your faceName argument could point to a string that is longer than LF_FACESIZE characters, and then strcpy would blindly overwrite whatever comes after lfFaceName in the LOGFONT structure. You do have a bug.
You should not blindly fix the bug by changing strcpy to strcpy_s, because:
The *_s functions are unportable Microsoft inventions almost all of which duplicate the functionality of other C library functions that are portable. They should never be used, even in a program not intended to be portable (as this appears to be).
Blind changes tend to not actually fix this class of bug. For instance, the "safe" variants of strcpy (strncpy, strlcpy, strcpy_s) simply truncate the string if it's too long, which in this case would make you try to load the wrong font. Worse, strncpy omits the NUL terminator when it does that, so you'd probably just move the crash inside CreateFontIndirect if you used that one. The correct fix is to check the length up front and fail the entire operation if it's too long. At which point strcpy becomes safe (because you know it's not too long), although I prefer memcpy because it makes it obvious to future readers of the code that I've thought about this.
TCHAR and char are not the same thing; copying either a C-style const char * string or a C++ std::string into an array of TCHAR without a proper encoding conversion may produce complete nonsense. (Using TCHAR is, in my experience, always a mistake, and the biggest problem with it is that code like this will appear to work correctly in an ASCII build, and will still compile in UNICODE mode, but will then fail catastrophically at runtime.)
You certainly can use std::string to help with this problem, but it won't get you out of needing to check the length and manually copy the string. I'd probably do it like this. Note that I am using LOGFONTW and CreateFontIndirectW and an explicit conversion from UTF-8 in the std::string. Note also that chunks of this were cargo-culted out of MSDN and none of it has been tested. Sorry.
void MyControl::SetFontFace(const std::string& faceName)
{
LOGFONTW lf;
this->font_.GetLogFontW(&lf);
int count = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
faceName.data(), faceName.length(),
lf.lfFaceName, LF_FACESIZE - 1);
if (count <= 0)
throw GetLastError(); // FIXME: use a real exception
lf.lfFaceName[count] = L'\0'; // MultiByteToWideChar does not NUL-terminate.
this->font_.DeleteObject();
if (!this->font_.CreateFontIndirectW(&lf))
throw GetLastError(); // FIXME: use a real exception
// ...
}
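Leaving the encoding question aside, the "check the length up front and fail" idea can be shown in isolation. The sketch below uses a stand-in struct instead of the real LOGFONT so that it compiles anywhere; FontRecord, kFaceSize and set_face_name are hypothetical names, and 32 is used as a stand-in for LF_FACESIZE.

```cpp
#include <cstddef>
#include <cstring>

const std::size_t kFaceSize = 32;  // stand-in for LF_FACESIZE

// Stand-in for the part of LOGFONT that matters here.
struct FontRecord {
    char faceName[kFaceSize];
};

// Copies faceName into rec only if it fits together with its terminator.
// On failure rec is left untouched and false is returned, so the caller can
// abort the whole operation instead of loading a wrongly named font.
bool set_face_name(FontRecord& rec, const char* faceName) {
    std::size_t len = std::strlen(faceName);
    if (len >= sizeof rec.faceName) {
        return false;
    }
    // The length is known to fit, so a plain memcpy (including the NUL) is safe.
    std::memcpy(rec.faceName, faceName, len + 1);
    return true;
}
```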

lf.lfFaceName = faceName.c_str();
No, you shouldn't do that, because you are making a local copy of the pointer to the data held inside the std::string. If the C++ string changes, or is deleted, the pointer is no longer valid, and if code using lfFaceName decides to change the data, this will almost certainly break the std::string.
Since you need to copy a C string, you need a C function, and strcpy_s (or its equivalent) is the safe alternative.

Have you tried? Given the information in your post, the assignment should generate a compiler error because you're trying to assign a pointer to an array, which does not work in C(++).
#include <cstdio>
#include <string>
using namespace std;
struct LOGFONT {
char lfFaceName[3];
};
int main() {
struct LOGFONT f;
string foo="bar";
f.lfFaceName = foo.c_str();
return 0;
}
leads to
x.c:13: error: incompatible types in assignment of `const char*' to `char[3]'
I'd recommend using a secure strcpy alternative like the warning says, given that you know the size of the destination space anyway.

#include <algorithm>
#include <iostream>
#include <string>
enum { LF_FACESIZE = 256 }; // = 3 // test too-long input
struct LOGFONT
{
char lfFaceName[LF_FACESIZE];
};
int main()
{
LOGFONT f;
std::string foo("Sans-Serif");
std::copy_n(foo.c_str(), foo.size()+1 > LF_FACESIZE ? LF_FACESIZE : foo.size()+1,
f.lfFaceName);
std::cout << f.lfFaceName << std::endl;
return 0;
}

lf.lfFaceName = faceName.c_str(); won't work for two reasons (assuming you change faceName to a std::string):
The lifetime of the pointer returned by c_str() is temporary. It's only valid as long as the faceName object doesn't change and is alive.
The line won't compile. .c_str() returns a pointer to a char, and lfFaceName is a character array and can't be assigned to. You need to do something to fill in the string array, to fill in the bytes at lfFaceName, and pointer assignment doesn't do that.
There isn't anything in C++ that can help here, since lfFaceName is a C "string". You need to use a C string function, like strcpy or strcpy_s. You can change your code to:
strcpy_s(lf.lfFaceName, LF_FACESIZE, faceName.c_str());

understanding the dangers of sprintf(...)

OWASP says:
"C library functions such as strcpy(), strcat(), sprintf() and vsprintf() operate on null terminated strings and perform no bounds checking."
sprintf writes formatted data to string
int sprintf ( char * str, const char * format, ... );
Example:
sprintf(str, "%s", message); // assume declaration and
// initialization of variables
If I understand OWASP's comment, then the dangers of using sprintf are that
1) if message's length > str's length, there's a buffer overflow
and
2) if message does not null-terminate with \0, then message could get copied into str beyond the memory address of message, causing a buffer overflow
Please confirm/deny. Thanks

You're correct on both problems, though they're really both the same problem (which is accessing data beyond the boundaries of an array).
A solution to your first problem is to instead use std::snprintf, which accepts a buffer size as an argument.
A solution to your second problem is to give a maximum length argument to snprintf. For example:
char buffer[128];
std::snprintf(buffer, sizeof(buffer), "This is a %.4s\n", "testGARBAGE DATA");
// std::strcmp(buffer, "This is a test\n") == 0
If you want to store the entire string (e.g. in the case sizeof(buffer) is too small), run snprintf twice:
int length = std::snprintf(nullptr, 0, "This is a %.4s\n", "testGARBAGE DATA");
++length; // +1 for null terminator
char *buffer = new char[length];
std::snprintf(buffer, length, "This is a %.4s\n", "testGARBAGE DATA");
(You can probably fit this into a function using va or variadic templates.)
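A minimal sketch of the variadic-template route, returning a std::string so that nobody has to remember delete[]. format_string is a hypothetical name, and the arguments are still not type-checked against the format string, exactly as with snprintf itself:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Runs snprintf twice: once to measure, once to write.
// Returns an empty string if formatting fails.
template <typename... Args>
std::string format_string(const char* fmt, Args... args) {
    int length = std::snprintf(nullptr, 0, fmt, args...);
    if (length < 0) {
        return std::string();
    }
    // One extra byte so snprintf has room for its terminator.
    std::string result(static_cast<std::size_t>(length) + 1, '\0');
    std::snprintf(&result[0], result.size(), fmt, args...);
    result.resize(static_cast<std::size_t>(length));  // drop the terminator
    return result;
}
```

For example, format_string("This is a %.4s\n", "testGARBAGE DATA") produces the same "This is a test\n" as the fixed-buffer version above.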

Both of your assertions are correct.
There's an additional problem not mentioned. There is no type checking on the parameters. If you mismatch the format string and the parameters, undefined and undesirable behavior could result. For example:
char buf[1024] = {0};
float f = 42.0f;
sprintf(buf, "%s", f); // `f` isn't a string. the sun may explode here
This can be particularly nasty to debug.
All of the above lead many C++ developers to the conclusion that you should never use sprintf and its brethren. Indeed, there are facilities you can use to avoid all of the above problems. One, streams, is built right in to the language:
#include <sstream>
#include <string>
// ...
float f = 42.0f;
std::stringstream ss;
ss << f;
std::string s = ss.str();
...and another popular choice for those who, like me, still prefer to use sprintf comes from the Boost Format library:
#include <string>
#include <boost/format.hpp>
// ...
float f = 42.0f;
std::string s = (boost::format("%1%") % f).str();
Should you adopt the "never use sprintf" mantra? Decide for yourself. There's usually a best tool for the job and depending on what you're doing, sprintf just might be it.

Yes, it is mostly a matter of buffer overflows. However, those are quite serious business nowadays, since buffer overflows are the prime attack vector used by system crackers to circumvent software or system security. If you expose something like this to user input, there's a very good chance you are handing the keys to your program (or even your computer itself) to the crackers.
From OWASP's perspective, let's pretend we are writing a web server, and we use sprintf to parse the input that a browser passes us.
Now let's suppose someone malicious out there passes our web server a string far larger than will fit in the buffer we chose. His extra data will instead overwrite nearby data. If he makes it large enough, some of his data will get copied over the web server's instructions rather than its data. Now he can get our web server to execute his code.

Your 2 numbered conclusions are correct, but incomplete.
There is an additional risk:
char* format = 0;
char buf[128];
sprintf(buf, format, "hello");
Here, format is a null pointer rather than a pointer to a NUL-terminated string. sprintf() doesn't check that either.

Your interpretation seems to be correct. However, your case #2 isn't really a buffer overflow. It's more of a memory access violation. That's just terminology though, it's still a major problem.

The sprintf function, when used with certain format specifiers, poses two types of security risk: (1) writing memory it shouldn't; (2) reading memory it shouldn't. If snprintf is used with a size parameter that matches the buffer, it won't write anything it shouldn't. Depending upon the parameters, it may still read stuff it shouldn't. Depending upon the operating environment and what else a program is doing, the danger from improper reads may or may not be less severe than that from improper writes.
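The read side can be bounded as well: a precision on %s limits how many bytes are read from the argument, so a field without a terminator can still be printed safely. A small sketch, with format_field as a hypothetical helper name:

```cpp
#include <cstddef>
#include <cstdio>

// Formats a fixed-width field that may not contain a terminating NUL.
// The "%.*s" precision caps the number of bytes read from field, and the
// size argument caps the number of bytes written to out.
int format_field(char* out, std::size_t out_size,
                 const char* field, std::size_t field_size) {
    return std::snprintf(out, out_size, "name=%.*s",
                         static_cast<int>(field_size), field);
}
```

With both bounds in place, neither the improper write nor the improper read described above can occur for this call.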

It is very important to remember that sprintf() adds the ASCII 0 character as string terminator at the end of each string. Therefore, the destination buffer must have at least n+1 bytes (To print the word "HELLO", a 6-byte buffer is required, NOT 5)
In the example below, it may not be obvious, but in the 3-byte destination buffer, the second byte will be overwritten by the ASCII 0 character. If only 1 byte had been allocated for the buffer, this would cause a buffer overrun.
char buf[3] = {'1', '2'};
int n = sprintf(buf, "A");
Also note that the return value of sprintf() does NOT include the terminating null character. In the example above, 2 bytes were written, but the function returns 1.
In the example below, the first byte of the struct member i is overwritten by the terminating null character that sprintf() writes past the end of buf (assuming i immediately follows buf in memory; formally this is a buffer overflow and the behavior is undefined).
struct S
{
char buf[4];
int i;
};
int main()
{
struct S s = { };
s.i = 12345;
int num = sprintf(s.buf, "ABCD");
// The value of s.i is NOT 12345 anymore !
return 0;
}

I have put together a small example of how you could get rid of the buffer size declaration for sprintf (if you intend to, of course!) with no snprintf involved.
Note: This is an APPEND/CONCATENATION example.
