Cppcheck 1.67 raised a portability issue in my source code at this line:
sscanf(s, "%d%*[,;.]%d", &f, &a);
This is the message I got from it:
scanf without field width limits can crash with huge input data on some versions of libc.
The original intention of the format string was to accept one of three possible delimiter chars between two integers, and today - thanks to Cppcheck[1] - I see that %*[,;.] even accepts whole runs of delimiter chars. However, I doubt that my format string can cause a crash, because the unlimited part is ignored.
Is there possibly an issue with a buffer overrun? ...maybe behind the scenes?
[1]
How to get lost between farsightedness and blindness:
I tried to fix it with %1*[,;.] (following some API documentation), but Cppcheck still reported the issue, so I also tried %*1[,;.] with the same "success". It seems that I have to suppress it for now...
Congratulations on finding a bug in Cppcheck 1.67 (the current version).
You have basically three workarounds:
Just ignore the false positive.
Rework your format (assign that field, which is possible because you only want to match one character).
char tmp;
if (3 != sscanf(s, "%d %c%d", &f, &tmp, &a) || (tmp != ',' && tmp != ';' && tmp != '.'))
goto error;
Suppress the warning directly (preferably inline-suppressions):
//cppcheck-suppress invalidscanf_libc
if (2 != sscanf(s, "%d%*1[,;.]%d", &f, &a))
goto error;
Don't forget to report the error as "defect / false positive", so that you can retire and forget that workaround as soon as possible.
When to quantify ignored pattern match in the C sscanf function?
Probably it's a good idea to always quantify (see below), but over-quantification may also distract from your intentions. In the above case, where a single separator char has to be skipped, the quantification would definitely be useful.
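To illustrate what quantifying the ignored match changes, here is a minimal sketch. The helper names parse_one_sep and parse_any_seps are made up for this example and are not part of the question's code. Note that in a standard conversion specification the assignment-suppressing * comes before the field width, so the limited form is spelled %*1[,;.].

```cpp
#include <cstdio>

// Hypothetical helper: accepts exactly one separator out of ",;." between the integers.
bool parse_one_sep(const char* s, int& f, int& a)
{
    // '*' suppresses assignment, '1' limits the scanset match to a single character.
    return std::sscanf(s, "%d%*1[,;.]%d", &f, &a) == 2;
}

// Hypothetical helper: the unquantified format accepts a whole run of separators.
bool parse_any_seps(const char* s, int& f, int& a)
{
    return std::sscanf(s, "%d%*[,;.]%d", &f, &a) == 2;
}
```

With these helpers, "12;34" is accepted by both, while "12;;34" is accepted only by parse_any_seps, because the second ';' makes the final %d fail in the quantified version.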
Is there possibly an issue with a buffer overrun? ...maybe behind the scenes?
There will be no crashes caused by your code. To deal with the "behind the scenes" question, I experimented with large input strings. I tried the C library that's shipped with Borland C++ 5.6.4 and could not trigger an internal buffer overrun, even with large inputs (more than 400 million chars).
Surprisingly, Cppcheck was not totally wrong - there is a portability issue, but a different one:
#include <stdio.h>
#include <assert.h>
#include <sstream>
#include <string>
int traced_sscanf_set(const int count, const bool limited)
{
const char sep = '.';
printf("\n");
std::stringstream ss;
ss << "123" << std::string(count, sep) << "456";
std::string s = ss.str();
printf("string of size %d with %d '%c's in it\n", (int)s.size(), count, sep);
std::stringstream fs;
fs << "%d%";
if (limited) {
fs << count;
}
fs << "*["<< sep << "]%d";
std::string fmt = fs.str();
printf("fmt: \"%s\"\n", fmt.c_str());
int a = 0;
int b = 0;
const int sscanfResult = sscanf(s.c_str(), fmt.c_str(), &a, &b);
printf("sscanfResult=%d, a=%d, b=%d\n", sscanfResult, a, b);
return sscanfResult;
}
void test_sscanf()
{
assert(traced_sscanf_set(0x7fff, true)==2);
assert(traced_sscanf_set(0x7fff, false)==2);
assert(traced_sscanf_set(0x8000, true)==2);
assert(traced_sscanf_set(0x8000, false)==1);
}
The library I checked internally limits the input consumed (and skipped) to 32767 (2^15 - 1) chars if there is no explicitly specified limit in the format parameter.
For those who are interested, here is the trace output:
string of size 32773 with 32767 '.'s in it
fmt: "%d%32767*[.]%d"
sscanfResult=2, a=123, b=456
string of size 32773 with 32767 '.'s in it
fmt: "%d%*[.]%d"
sscanfResult=2, a=123, b=456
string of size 32774 with 32768 '.'s in it
fmt: "%d%32768*[.]%d"
sscanfResult=2, a=123, b=456
string of size 32774 with 32768 '.'s in it
fmt: "%d%*[.]%d"
sscanfResult=1, a=123, b=0
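If the length of the separator run must not depend on a library-internal limit, one option is to skip sscanf for this step altogether. The following is only a sketch under that assumption. The function name parse_pair is hypothetical, and it uses strtol and strspn instead of the scanset conversion.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: parses "<int><one or more of ,;.><int>" without sscanf.
bool parse_pair(const std::string& s, int& a, int& b)
{
    const char* start = s.c_str();
    char* end = nullptr;

    errno = 0;
    const long first = std::strtol(start, &end, 10);
    if (end == start || errno == ERANGE || first < INT_MIN || first > INT_MAX)
        return false;               // no digits, or the value does not fit an int

    // strspn counts the leading characters that belong to the separator set.
    const std::size_t seps = std::strspn(end, ",;.");
    if (seps == 0)
        return false;               // at least one separator is required
    const char* rest = end + seps;

    errno = 0;
    const long second = std::strtol(rest, &end, 10);
    if (end == rest || errno == ERANGE || second < INT_MIN || second > INT_MAX)
        return false;

    a = static_cast<int>(first);
    b = static_cast<int>(second);
    return true;
}
```

Like %d, strtol skips leading whitespace, so this sketch is slightly more permissive than a strict grammar would be.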
Related
I have a char array defined like this
char buffer[100];
When I run a Flawfinder scan, one of the hits I get says:
(buffer) char:
Statically-sized arrays can be improperly restricted, leading to potential
overflows or other issues (CWE-119!/CWE-120). Perform bounds checking, use
functions that limit length, or ensure that the size is larger than the
maximum possible length.
I know I have to do the checks where needed to make sure my code is free of exceptions, but is there any way to solve this (define the char array in some other way) so that Flawfinder reports no hits?
UPDATE
Here's the full code of the function in case it would help
std::string MyClass::randomGenerator(odb::nullable<int> maxLength) {
struct timeval tmnow;
struct tm *tm;
char buf[100];
gettimeofday(&tmnow, NULL);
tm = localtime(&tmnow.tv_sec);
strftime(buf, 100, "%m%d%H%M%S", tm);
string micro = std::to_string(((int)tmnow.tv_usec / 10000));
strlcat(buf, micro.c_str(), sizeof(buf));
std::stringstream stream;
stream << std::hex << stoll(buf);
std::string result(stream.str());
Utilities::find_and_replace(result, "0", "h");
Utilities::find_and_replace(result, "1", "k");
std::transform(result.begin(), result.end(),result.begin(), ::toupper);
if (maxLength) {
return result.substr(result.size() - maxLength.get(), result.size() - 1);
} else {
return result ;
}
}
Flawfinder is really a slightly glorified grep - it's not a true static-analysis tool that does data flow analysis, so I have always taken its output with a healthy dose of salt!
The way you should really write this code is to write true C++ code rather than glorified-C using C runtime functions, which are absolutely subject to memory corruption issues.
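As a rough sketch of what that could look like for the timestamp part of the question's function, the fixed-size buffer can be replaced by a stream. The helper name timestamp_digits and its parameters are assumptions made for this example. The caller would still obtain the std::tm value (for instance from std::localtime) and the hundredths of a second from the clock.

```cpp
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds the digit string that the question assembled in char buf[100].
std::string timestamp_digits(const std::tm& when, int hundredths)
{
    std::ostringstream out;
    // std::put_time formats like strftime, but writes into the stream instead of a char array.
    out << std::put_time(&when, "%m%d%H%M%S") << hundredths;
    return out.str();
}
```

Since there is no statically-sized char array left, a pattern-matching tool should have nothing to flag in this part, and the string grows as needed.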
I have some numbers of different length (like 1, 999, 76492, so on) and I want to convert them all to strings with a common length (for example, if the length is 6, then those strings will be: '000001', '000999', '076492').
In other words, I need to add correct amount of leading zeros to the number.
int n = 999;
string str = some_function(n,6);
//str = '000999'
Is there a function like this in C++?
or using the stringstreams:
#include <sstream>
#include <iomanip>
std::stringstream ss;
ss << std::setw(10) << std::setfill('0') << n;
std::string s = ss.str();
I compiled the information I found on arachnoid.com because I like the type-safe way of iostreams more. Besides, you can equally use this code on any other output stream.
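Wrapped into a function with the signature the question asks for, the stream approach could look like the sketch below. The name zero_pad is made up here. std::internal is an addition to the snippet above so that a minus sign stays in front of the padding.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper wrapping the stream manipulators in a function.
std::string zero_pad(int n, int width)
{
    std::ostringstream ss;
    // std::internal puts the fill characters between the sign and the digits.
    ss << std::internal << std::setfill('0') << std::setw(width) << n;
    return ss.str();
}
```

Numbers that need more than width characters are returned unpadded and untruncated, because setw only sets a minimum field width.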
char str[7];
snprintf (str, 7, "%06d", n);
See snprintf
One thing that you may want to be aware of is the potential locking that may go on when you use the stringstream approach. In the STL that ships with Visual Studio 2008, at least, there are many locks taken out and released as various locale information is used during formatting. This may, or may not, be an issue for you depending on how many threads you have that might be concurrently converting numbers to strings...
The sprintf version doesn't take any locks (at least according to the lock monitoring tool that I'm developing at the moment...) and so might be 'better' for use in concurrent situations.
I only noticed this because my tool recently spat out the 'locale' locks as being amongst the most contended for locks in my server system; it came as a bit of a surprise and may cause me to revise the approach that I've been taking (i.e. move back towards sprintf from stringstream)...
There are many ways of doing this. The simplest would be:
int n = 999;
char buffer[256]; sprintf(buffer, "%06d", n);
string str(buffer);
This method uses neither streams nor sprintf. Besides having locking problems, streams incur a performance overhead and are really overkill. For streams, the overhead comes from the need to construct the stream and stream buffer. For sprintf, the overhead comes from needing to interpret the format string. The function stays within its buffer even when n is negative or when the string representation of n is longer than len, although in the latter case the most significant digits (and a minus sign that no longer fits) are dropped. The one exception is INT_MIN, for which -n overflows. This is the FASTEST solution.
inline string some_function(int n, int len)
{
string result(len--, '0');
for (int val=(n<0)?-n:n; len>=0&&val!=0; --len,val/=10)
result[len]='0'+val%10;
if (len>=0&&n<0) result[0]='-';
return result;
}
stringstream will do (as xtofl pointed out). Boost format is a more convenient replacement for snprintf.
This is an old thread, but as fmt might make it into the standard, here is an additional solution:
#include <fmt/format.h>
int n = 999;
const auto str = fmt::format("{:0>{}}", n, 6);
Note that fmt::format("{:0>6}", n) works equally well when the desired width is known at compile time. Another option is abseil:
#include <absl/strings/str_format.h>
int n = 999;
const auto str = absl::StrFormat("%0*d", 6, n);
Again, absl::StrFormat("%06d", n) is possible. Boost format is another tool for this problem:
#include <boost/format.hpp>
int n = 999;
const auto str = boost::str(boost::format("%06d") % n);
Unfortunately, a variable width specifier passed as an argument chained with the % operator is unsupported; this requires building the format string first (e.g. const std::string fmt = "%0" + std::to_string(6) + "d";).
In terms of performance, abseil and fmt claim to be very attractive and faster than boost. In any case, all three solutions should be more efficient than std::stringstream approaches, and unlike the std::*printf family, they do not sacrifice type safety.
sprintf is the C-like way of doing this, which also works in C++.
In C++, a combination of a stringstream and stream output formatting (see http://www.arachnoid.com/cpptutor/student3.html ) will do the job.
From C++ 11, you can do:
string to_string(unsigned int number, int length) {
string num_str = std::to_string(number);
if(num_str.length() >= length) return num_str;
string leading_zeros(length - num_str.length(), '0');
return leading_zeros + num_str;
}
If you also need to handle negative numbers, you can rewrite the function as below:
string to_string(int number, int length) {
string num_str = std::to_string(number);
if(num_str.length() >= length) return num_str;
string leading_zeros(length - num_str.length(), '0');
//for negative numbers swap the leading zero with the leading negative sign
if(num_str[0] == '-') {
num_str[0] = '0';
leading_zeros[0] = '-';
}
return leading_zeros + num_str;
}
What is the proper size of a char array (buffer) when I want to use the sprintf function?
I don't know why this code works if the buffer can hold only 1 char. I put a lot more than 1 char into it.
/* sprintf example */
#include <stdio.h>
int main ()
{
char buffer[1];
int n, a=5, b=3;
n = sprintf (buffer, "%d plus %d is %d", a, b, a+b);
printf ("[%s] is a string %d chars long\n", buffer, n);
return 0;
}
Results:
[5 plus 3 is 8] is a string 13 chars long
What is the proper size of a char array (buffer) when I want to use the sprintf function?
There isn't one.
If you can work out an upper bound from the format string and types of input, then you might use that. For example, a 32-bit int won't take up more than 11 characters to represent in decimal with an optional sign, so your particular example won't need more than 44 characters (unless I miscounted).
Otherwise, use something safer: std::stringstream in C++, or snprintf and care in C.
I don't know why this part of code is working if buffer can hold only 1 char?
It isn't. It's writing past the end of the buffer into some other memory.
Maybe that won't cause any visible errors; maybe it will corrupt some other variables; maybe it will cause a protection fault and end the program; maybe it will corrupt the stack frame and cause all kinds of havoc when the function tries to return; or maybe it will cause some other kind of undefined behaviour. But it's certainly not behaving correctly.
A buffer overflow occurred in your code. There were no apparent consequences, but that doesn't mean it worked correctly. Try a memory debugger like valgrind and you will see what I mean.
You can't ensure that sprintf() will not overflow the buffer; that's why there is a snprintf() function, to which you pass the size of the buffer.
Sample usage
char buffer[100];
int result;
result = snprintf(buffer, sizeof(buffer), "%d plus %d is %d", a, b, a + b);
if (result < 0 || (size_t)result >= sizeof(buffer))
{
fprintf(stderr, "The string does not fit `buffer'.\n");
}
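When no sensible upper bound is known in advance, snprintf can also be asked for the required size first. The sketch below assumes a C99-conforming snprintf (older Microsoft runtimes did not provide one) and uses a made-up helper name, format_sum. It is one possible way to size the buffer, not the only one.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats "a plus b is a+b" into a buffer of exactly the needed size.
std::string format_sum(int a, int b)
{
    static const char fmt[] = "%d plus %d is %d";

    // First pass: a null buffer with size 0 only reports the required length.
    const int needed = std::snprintf(nullptr, 0, fmt, a, b, a + b);
    if (needed < 0)
        return std::string();       // snprintf reported an error

    // Second pass: write into a buffer with room for the terminating null.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::snprintf(buffer.data(), buffer.size(), fmt, a, b, a + b);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

The price is that the format string is interpreted twice, which is usually acceptable when correctness matters more than speed.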
Assuming code must use sprintf() and not some other function:
Pre-determine the worst-case output size and add a margin.
Unless there are major memory concerns, I suggest a 2x buffer. Various locales can do interesting things like add ',' to integer output as in "123,456,789".
#include <stdio.h>
#include <limits.h>
#define INT_DECIMAL_SIZE(i) (sizeof(i)*CHAR_BIT/3 + 3)
#define format1 "%d plus %d is %d"
char buffer[(sizeof format1 + 3 * INT_DECIMAL_SIZE(int)) * 2];
int n = sprintf(buffer, format1, a, b, a + b);
A challenging example is when code tries sprintf(buf, "%Lf", some_long_double), as the output could be thousands of characters when some_long_double == LDBL_MAX: about 5000 characters with binary128 as long double.
// - 123.............456 . 000000 \0
#define LDBL_DECIMAL_SIZE(i) (1 + 1 + LDBL_MAX_10_EXP + 1 + 6 + 1)
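The bound can be checked against what the library actually produces. This is a small sketch, assuming a C99-conforming snprintf and the default "C" locale. The names kLdblDecimalSize and printed_length are hypothetical.

```cpp
#include <cfloat>
#include <cstdio>

// Same terms as the macro: sign, integer digits, '.', 6 fraction digits, '\0'.
constexpr int kLdblDecimalSize = 1 + 1 + LDBL_MAX_10_EXP + 1 + 6 + 1;

// Hypothetical helper: number of characters "%Lf" needs for value, excluding the '\0'.
int printed_length(long double value)
{
    // A null buffer with size 0 makes snprintf report the length without writing.
    return std::snprintf(nullptr, 0, "%Lf", value);
}
```

Comparing printed_length(-LDBL_MAX) + 1 with kLdblDecimalSize shows whether the estimate holds on a given platform.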
My question is exactly the same as this one. That is, I'm trying to use scanf() to receive a string of indeterminate length, and I want scanf() to dynamically allocate memory for it.
However, in my situation, I am using VS2010. As far as I can see, MS's scanf() doesn't have an a or m modifier for when scanning for strings. Is there any way to do this (other than receiving input one character at a time)?
Standard versions of scanf() do not allocate memory for any of the variables they read into.
If you've been hoodwinked into using a non-standard extension in some version of scanf(), you've just had your first lesson in how to write portable code - do not use non-standard extensions. You can nuance that to say "Do not use extensions that are not available on all the platforms of interest to you", but realize that the set of platforms may change over time.
Must you absolutely use scanf ? Aren't std::string s; std::cin >> s; or getline( std::cin, s ); an option for you?
If you want to use scanf, you could allocate a buffer large enough to hold any expected value, say 1024 bytes, and then use a maximum field width of 1023, which leaves one byte for the terminating null character.
The a modifier is a GNU extension and m comes from POSIX.1-2008; neither is part of standard C, so that's why Microsoft's compiler does not support them. One could wish that Visual Studio did.
Here is an example using scanf to read settings, and just print them back out:
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
int
main( int argc, char **argv )
{ // usage ./a.out < settings.conf
char *varname;
int value, r, run = 1;
varname = malloc( 1024 );
// clear errno
errno = 0;
while( run )
{ // match any number of "variable = #number" and do some "processing"
// the 1023 here is the maximum field width; it leaves room for the
// terminating null in the 1024-byte buffer.
r = scanf ( "%1023s = %d", varname, &value );
if( r == 2 )
{ // matched both string and number
printf( " Variable %s is set to %d \n", varname, value );
} else {
// it did not, either there was an error in which case errno was
// set or we are out of variables to match
if( errno != 0 )
{ // an error has occurred.
perror("scanf");
}
run = 0;
}
}
free( varname );
return 0;
}
Here is an example settings.conf
cake = 5
three = 3
answertolifeuniverseandeverything = 42
charcoal = -12
You can read more about scanf on the manpages.
And you can of course use getline(), and after that parse character after character.
If you explained in a little more detail what you are trying to achieve, you might get a better answer.
I think that, in the real world, one needs to have some maximum limit on the length of user input.
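For the getline() route in C++, a sketch of reading the same kind of settings file could look as follows. The function name read_settings is made up for this example. Unlike the scanf loop, it skips lines that do not match instead of stopping at the first one.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: reads "name = value" lines of any length from in.
std::vector<std::pair<std::string, int>> read_settings(std::istream& in)
{
    std::vector<std::pair<std::string, int>> settings;
    std::string line;
    while (std::getline(in, line)) {        // the string grows as needed
        std::istringstream fields(line);
        std::string name, equals;
        int value = 0;
        if (fields >> name >> equals >> value && equals == "=")
            settings.emplace_back(name, value);
        // lines that do not match are skipped
    }
    return settings;
}
```

Calling read_settings(std::cin) would process a file redirected to standard input, with no fixed buffer size to choose.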
Then you may read the whole line with something like getline(). See http://www.cplusplus.com/reference/iostream/istream/getline/
Note that, if you want multiple input from user, you don't need to have separate char arrays for each of them. You can have one big buffer, e.g. char buffer[2048], for using with getline(), and copy the contents to a suitably allocated (and named) variable, e.g. something like char * name = strdup( buffer ).
Don't use scanf for reading strings. It probably doesn't even do what you think it does; %s reads only up until the next whitespace.
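A short sketch makes the whitespace behaviour visible. The helper first_token and the 64-byte buffer are arbitrary choices for this illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns what "%s" extracts from text (at most 63 characters).
std::string first_token(const char* text)
{
    char token[64] = "";
    // Width 63 leaves room for the terminating null in the 64-byte array.
    if (std::sscanf(text, "%63s", token) != 1)
        return std::string();
    return token;
}
```

For the input "hello world" this yields only "hello": %s skips leading whitespace and then stops at the next whitespace character.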