Float to Char Array to std::string - c++

I'm having a bit of a problem with this piece of code:
string StringServices::ToStringf(float value)
{
char buffer[10];
sprintf (buffer, "%f", value);
return (string) buffer; // signal SIGABRT
}
Its been working previously and continues to work for other calls, but I currently get a SIGABRT on the return, when the function is being passed -211.0
The buffer loads up fine, and I'm really not sure why this isn't working. Can someone who understands std::string and c strings a lot better then me help me out here?

You probably overflowed your buffer because you're not using snprintf. You have this tagged C++ so do it that way:
std::string buffer = boost::lexical_cast<std::string>(value);
Or without boost use a string stream:
std::ostringstream os;
os << value;
// os.str() has the string representation now.

The main problem with C and its string functions is that you have to do too much work manually. You also have to make too many decisions when you write in C. One of the trivial issues is buffer overflow. Consider this code:
char buf[5]; // 5 chars, ok
sprintf(buf, "qwert"); // 5 letters, ok
You're going to have problems with this code, since when talking about strings, 5 chars means 4 letters + '\0'. So, you may try:
printf("'%s'\n", buf); // you'll probably get 'qwertIOUYOIY*&T^*&TYDGKUYTU&*#*#T^&#$T67...'
What you do with your code is a trivial buffer overflow :-)
sprintf() just has no way to check the size of buf, so a piece of memory that goes right after buf can get corrupted.

Related

Right way of defining a buffer when using sprintf

In my Arduino sketch I use sprintf to format a reply. The formatted data will be send using SPI.
I have to set the buffer length, but I don't know how large would be good. Here is a piece of the code:
uint8_t myValue = 4;
// Set buffer
unsigned char data[1000];
// Reset this buffer
memset(data, 0, sizeof(data));
// Format the reply
sprintf((char *)data, "This is the int value: %d", \
myValue);
I can also set the buffer to 0 (unsigned char data[0];), The code compiles and the reply is the same as using a large buffer. I can't explain this?
It seems malloc() and free() are pretty rare in the Arduino world...
What is the right way of using a buffer in Arduino?
I would recommend using snprintf instead, where you specify how large the buffer is. With the current approach sprintf assumes that the buffer is large enough, so if you pass it a small buffer it will stomp over some other memory. Not good (and may explain the weird results you sometimes get)!
You can use a fixed buffer as in your current code. Allocating also works, but it has more overhead.
I can also set the buffer to 0 (unsigned char data[0];),
If this is true, you're just getting lucky (or unlucky, depending on how you look at it). The behaviour of your code doing this is undefined, in both C and C++. Which, among other things, means it is not guaranteed to work (e.g. it might break if your compiler is ever updated).
As to better options .... Your question is tagged C++ and C, and different approaches are preferable in different languages.
Assuming you are using C, and your implementation complies with the 1999 standard or later, you can use snprintf().
char *buffer;
int length = snprintf((char *)NULL, 0, "This is the int value: %d", myValue);
buffer = malloc(length + 1); /* +1 allows for terminator */
snprintf(buffer, length, "This is the int value: %d", myValue);
/* use buffer */
free(buffer);
The first call of sprintf() is used to recover the buffer length. The second to actually write to the allocated buffer. Remember to release the dynamically allocated buffer when done. (The above also does not check for errors - snprintf() returns a negative value if an error occurs).
If your C library does not include snprintf() then (AFAIK) there is no standard way to work out the length automatically - you will need to estimate an upper bound on length by hand (e.g. work out the length of output of the largest negative or positive int, and add that to the length of other content in your format string).
In C++, use a string stream
#include <sstream>
#include <string>
std::ostringstream oss;
oss << "This is the int value: " << myValue);
std::string str = oss.str();
const char *buffer = str.data();
// readonly access to buffer here
// the objects above will be cleaned up when they pass out of scope.
If you insist on using the C approach in C++, the approach above using snprintf() can be used in C++ if your implementation complies with the 2011 standard (or later). However, in general terms, I would NOT recommend this.

Reading contents of file into dynamically allocated char* array- can I read into std::string instead?

I have found myself writing code which looks like this
// Treat the following as pseudocode - just an example
iofile.seekg(0, std::ios::end); // iofile is a file opened for read/write
uint64_t f_len = iofile.tellg();
if(f_len >= some_min_length)
{
// Focus on the following code here
char *buf = new char[7];
char buf2[]{"MYFILET"}; // just some random string
// if we see this it's a good indication
// the rest of the file will be in the
// expected format (unlikely to see this
// sequence in a "random file", but don't
// worry too much about this)
iofile.read(buf, 7);
if(memcmp(buf, buf2, 7) == 0) // I am confident this works
{
// carry on processing file ...
// ...
// ...
}
}
else
cout << "invalid file format" << endl;
This code is probably an okay sketch of what we might want to do when opening a file, which has some specified format (which I've dictated). We do some initial check to make sure the string "MYFILET" is at the start of the file - because I've decided all my files for the job I'm doing are going to start with this sequence of characters.
I think this code would be better if we didn't have to play around with "c-style" character arrays, but used strings everywhere instead. This would be advantageous because we could do things like if(buf == buf2) if buf and buf2 where std::strings.
A possible alternative could be,
// Focus on the following code here
std::string buf;
std::string buf2("MYFILET"); // very nice
buf.resize(7); // okay, but not great
iofile.read(buf.data(), 7); // pretty awful - error prone if wrong length argument given
// also we have to resize buf to 7 in the previous step
// lots of potential for mistakes here,
// and the length was used twice which is never good
if(buf == buf2) then do something
What are the problems with this?
We had to use the length variable 7 (or constant in this case) twice. Which is somewhere between "not ideal" and "potentially error prone".
We had to access the contents of buf using .data() which I shall assume here is implemented to return a raw pointer of some sort. I don't personally mind this too much, but others may prefer a more memory-safe solution, perhaps hinting we should use an iterator of some sort? I think in Visual Studio (for Windows users which I am not) then this may return an iterator anyway, which will give [?] warnings/errors [?] - not sure on this.
We had to have an additional resize statement for buf. It would be better if the size of buf could be automatically set somehow.
It is undefined behavior to write into the const char* returned by std::string::data(). However, you are free to use std::vector::data() in this way.
If you want to use std::string, and dislike setting the size yourself, you may consider whether you can use std::getline(). This is the free function, not std::istream::getline(). The std::string version will read up to a specified delimiter, so if you have a text format you can tell it to read until '\0' or some other character which will never occur, and it will automatically resize the given string to hold the contents.
If your file is binary in nature, rather than text, I think most people would find std::vector<char> to be a more natural fit than std::string anyway.
We had to use the length variable 7 (or constant in this case) twice.
Which is somewhere between "not ideal" and "potentially error prone".
The second time you can use buf.size()
iofile.read(buf.data(), buf.size());
We had to access the contents of buf using .data() which I shall
assume here is implemented to return a raw pointer of some sort.
And pointed by John Zwinck, .data() return a pointer to const.
I suppose you could define buf as std::vector<char>; for vector (if I'm not wrong) .data() return a pointer to char (in this case), not to const char.
size() and resize() are working in the same way.
We had to have an additional resize statement for buf. It would be
better if the size of buf could be automatically set somehow.
I don't think read() permit this.
p.s.: sorry for my bad English.
We can validate a signature without double buffering (rdbuf and a string) and allocating from the heap...
// terminating null not included
constexpr char sig[] = { 'M', 'Y', 'F', 'I', 'L', 'E', 'T' };
auto ok = all_of(begin(sig), end(sig), [&fs](char c) { return fs.get() == (int)c; });
if (ok) {}
template<class Src>
std::string read_string( Src& src, std::size_t count){
std::string buf;
buf.resize(count);
src.read(&buf.front(), 7); // in C++17 make it buf.data()
return buf;
}
Now auto read = read_string( iofile, 7 ); is clean at point of use.
buf2 is a bad plan. I'd do:
if(read=="MYFILET")
directly, or use a const char myfile_magic[] = "MYFILET";.
I liked many of the ideas from the examples above, however I wasn't completely satisfied that there was an answer which would produce undefined-behaviour-free code for C++11 and C++17. I currently write most of my code in C++11 - because I don't anticipate using it on a machine in the future which doesn't have a C++11 compiler.
If one doesn't, then I add a new compiler or change machines.
However it does seem to me to be a bad idea to write code which I know may not work under C++17... That's just my personal opinion. I don't anticipate using this code again, but I don't want to create a potential problem for myself in the future.
Therefore I have come up with the following code. I hope other users will give feedback to help improve this. (For example there is no error checking yet.)
std::string
fstream_read_string(std::fstream& src, std::size_t n)
{
char *const buffer = new char[n + 1];
src.read(buffer, n);
buffer[n] = '\0';
std::string ret(buffer);
delete [] buffer;
return ret;
}
This seems like a basic, probably fool-proof method... It's a shame there seems to be no way to get std::string to use the same memory as allocated by the call to new.
Note we had to add an extra trailing null character in the C-style string, which is sliced off in the C++-style std::string.

understanding the dangers of sprintf(...)

OWASP says:
"C library functions such as strcpy
(), strcat (), sprintf () and vsprintf
() operate on null terminated strings
and perform no bounds checking."
sprintf writes formatted data to string
int sprintf ( char * str, const char * format, ... );
Example:
sprintf(str, "%s", message); // assume declaration and
// initialization of variables
If I understand OWASP's comment, then the dangers of using sprintf are that
1) if message's length > str's length, there's a buffer overflow
and
2) if message does not null-terminate with \0, then message could get copied into str beyond the memory address of message, causing a buffer overflow
Please confirm/deny. Thanks
You're correct on both problems, though they're really both the same problem (which is accessing data beyond the boundaries of an array).
A solution to your first problem is to instead use std::snprintf, which accepts a buffer size as an argument.
A solution to your second problem is to give a maximum length argument to snprintf. For example:
char buffer[128];
std::snprintf(buffer, sizeof(buffer), "This is a %.4s\n", "testGARBAGE DATA");
// std::strcmp(buffer, "This is a test\n") == 0
If you want to store the entire string (e.g. in the case sizeof(buffer) is too small), run snprintf twice:
int length = std::snprintf(nullptr, 0, "This is a %.4s\n", "testGARBAGE DATA");
++length; // +1 for null terminator
char *buffer = new char[length];
std::snprintf(buffer, length, "This is a %.4s\n", "testGARBAGE DATA");
(You can probably fit this into a function using va or variadic templates.)
Both of your assertions are correct.
There's an additional problem not mentioned. There is no type checking on the parameters. If you mismatch the format string and the parameters, undefined and undesirable behavior could result. For example:
char buf[1024] = {0};
float f = 42.0f;
sprintf(buf, "%s", f); // `f` isn't a string. the sun may explode here
This can be particularly nasty to debug.
All of the above lead many C++ developers to the conclusion that you should never use sprintf and its brethren. Indeed, there are facilities you can use to avoid all of the above problems. One, streams, is built right in to the language:
#include <sstream>
#include <string>
// ...
float f = 42.0f;
stringstream ss;
ss << f;
string s = ss.str();
...and another popular choice for those who, like me, still prefer to use sprintf comes from the boost Format libraries:
#include <string>
#include <boost\format.hpp>
// ...
float f = 42.0f;
string s = (boost::format("%1%") %f).str();
Should you adopt the "never use sprintf" mantra? Decide for yourself. There's usually a best tool for the job and depending on what you're doing, sprintf just might be it.
Yes, it is mostly a matter of buffer overflows. However, those are quite serious business nowdays, since buffer overflows are the prime attack vector used by system crackers to circumvent software or system security. If you expose something like this to user input, there's a very good chance you are handing the keys to your program (or even your computer itself) to the crackers.
From OWASP's perspective, let's pretend we are writing a web server, and we use sprintf to parse the input that a browser passes us.
Now let's suppose someone malicious out there passes our web browser a string far larger than will fit in the buffer we chose. His extra data will instead overwrite nearby data. If he makes it large enough, some of his data will get copied over the webserver's instructions rather than its data. Now he can get our webserver to execute his code.
Your 2 numbered conclusions are correct, but incomplete.
There is an additional risk:
char* format = 0;
char buf[128];
sprintf(buf, format, "hello");
Here, format is not NULL-terminated. sprintf() doesn't check that either.
Your interpretation seems to be correct. However, your case #2 isn't really a buffer overflow. It's more of a memory access violation. That's just terminology though, it's still a major problem.
The sprintf function, when used with certain format specifiers, poses two types of security risk: (1) writing memory it shouldn't; (2) reading memory it shouldn't. If snprintf is used with a size parameter that matches the buffer, it won't write anything it shouldn't. Depending upon the parameters, it may still read stuff it shouldn't. Depending upon the operating environment and what else a program is doing, the danger from improper reads may or may not be less severe than that from improper writes.
It is very important to remember that sprintf() adds the ASCII 0 character as string terminator at the end of each string. Therefore, the destination buffer must have at least n+1 bytes (To print the word "HELLO", a 6-byte buffer is required, NOT 5)
In the example below, it may not be obvious, but in the 2-byte destination buffer, the second byte will be overwritten by ASCII 0 character. If only 1 byte was allocated for the buffer, this would cause buffer overrun.
char buf[3] = {'1', '2'};
int n = sprintf(buf, "A");
Also note that the return value of sprintf() does NOT include the null-terminating character. In the example above, 2 bytes were written, but the function returns '1'.
In the example below, the first byte of class member variable 'i' would be partially overwritten by sprintf() (on a 32-bit system).
struct S
{
char buf[4];
int i;
};
int main()
{
struct S s = { };
s.i = 12345;
int num = sprintf(s.buf, "ABCD");
// The value of s.i is NOT 12345 anymore !
return 0;
}
I pretty much have stated a small example how you could get rid of the buffer size declaration for the sprintf (if you intended to, of course!) and no snprintf envolved ....
Note: This is an APPEND/CONCATENATION example, take a look at here

Best way to safely printf to a string? [duplicate]

This question already has answers here:
C++: how to get fprintf results as a std::string w/o sprintf
(8 answers)
Closed 5 years ago.
Does anyone know a good safe way to redirect the output of a printf-style function to a string? The obvious ways result in buffer overflows.
Something like:
string s;
output.beginRedirect( s ); // redirect output to s
... output.print( "%s%d", foo, bar );
output.endRedirect();
I think the problem is the same as asking, "how many characters will print produce?"
Ideas?
You can use:
std::snprintf if you are working with a char*
std::stringstream if you want to use strings (not same as printf but will allow you to easily manipulate the string using the normal stream functions).
boost::format if you want a function similar to printf that will work with streams. (as per jalf in comments)
fmt::format which is has been standardized since c++20 std::format
The snprintf() function prints to a string, but only as much as the length given to it.
Might be what you're looking for...
The fmt library provides fmt::sprintf function that performs printf-compatible formatting (including positional arguments according to POSIX specification) and returns the result as an std::string:
std::string s = fmt::sprintf( "%s%d", foo, bar );
Disclaimer: I'm the author of this library.
Since you've tagged this as C++ (rather than just C), I'll point out that the typical way to do this sort of thing in C++ is to use stringstream, not the printf family. No need to worry about buffer overflows with stringstreams.
The Boost Format library is also available if you like printf-style format strings but want something safer.
snprintf() returns the number of bytes needed to write the whole string.
So, as a tiny example:
#include <strings.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv)
{
char* buf = 0;
size_t bufsize = 0;
size_t sz;
const char* format="%s and %s.";
const char* str_1 ="string 1";
const char* str_2 ="string 2";
sz = snprintf(buf, bufsize, format, str_1, str_2);
printf("The buffer needs to be %d bytes.\n", sz);
buf=malloc(sz+1);
if(!buf) {
printf("Can't allocate buffer!\n");
return 1;
}
bufsize = sz+1;
buf[bufsize-1] = '\0';
sz = snprintf(buf, bufsize, format, str_1, str_2);
printf("Filled buffer with %d bytes.\n", sz);
printf("The buffer contains:\n'%s'\n", buf);
return 0;
}
output:
The buffer needs to be 22 bytes.
Filled buffer with 22 bytes.
The buffer contains:
'string 1 and string 2.'
This StackOverflow question has a similar discussion. Also in that question I present my favorite solution, a "format" function that takes identical arguments to printf and returns a std::string.
Old school:
snprintf()
allows you to put a limit on the number written, and return the actual written size, and
asprintf()
allocate (with malloc()) a sufficient buffer which then becomes your problem to free(). `asprintf is a GNU libc function now reimplemented in the BSD libc.
With C99 you have the snprintf-function which takes the size of the buffer as a parameter. The GNU C-library has asprintf which allocates a buffer for you. For c++ though, you might be better of using iostream.
Wikipedia has more info.
I find the printf formatting to be very helpful and easier to use than streams. On the other hand, I do like std::string a lot too. The solution is to use sprintf, but that cannot handle arbitrary buffer size.
I've found that I need to handle common case (say, buffer limited to 256 chars) w/o
overhead, and yet handle the large buffer safely. To do that, I have a buffer of 256 chars alocated in my class as a member, and I use snprinf, passing that buffer and its size. If snprintf succeeds, I can immediately retunr the formatted string. If it fails, I allocate the buffer and call snprinf again. The buffer is deallocated in the class' destructor.
On Windows:
StringCchPrintf
StringCbPrintf
from strsafe.h/lib.
Microsoft introduce the 'safe' crt functions for this.
You could use printf_s()

Is there any way to determine how many characters will be written by sprintf?

I'm working in C++.
I want to write a potentially very long formatted string using sprintf (specifically a secure counted version like _snprintf_s, but the idea is the same). The approximate length is unknown at compile time so I'll have to use some dynamically allocated memory rather than relying on a big static buffer. Is there any way to determine how many characters will be needed for a particular sprintf call so I can always be sure I've got a big enough buffer?
My fallback is I'll just take the length of the format string, double it, and try that. If it works, great, if it doesn't I'll just double the size of the buffer and try again. Repeat until it fits. Not exactly the cleverest solution.
It looks like C99 supports passing NULL to snprintf to get the length. I suppose I could create a module to wrap that functionality if nothing else, but I'm not crazy about that idea.
Maybe an fprintf to "/dev/null"/"nul" might work instead? Any other ideas?
EDIT: Alternatively, is there any way to "chunk" the sprintf so it picks up mid-write? If that's possible it could fill the buffer, process it, then start refilling from where it left off.
The man page for snprintf says:
Return value
Upon successful return, these functions return the number of
characters printed (not including the trailing '\0' used to end
output to strings). The functions snprintf and vsnprintf do not
write more than size bytes (including the trailing '\0'). If
the output was truncated due to this limit then the return value
is the number of characters (not including the trailing '\0')
which would have been written to the final string if enough
space had been available. Thus, a return value of size or more
means that the output was truncated. (See also below under
NOTES.) If an output error is encountered, a negative value is
returned.
What this means is that you can call snprintf with a size of 0. Nothing will get written, and the return value will tell you how much space you need to allocate to your string:
int how_much_space = snprintf(NULL, 0, fmt_string, param0, param1, ...);
As others have mentioned, snprintf() will return the number of characters required in a buffer to prevent the output from being truncated. You can simply call it with a 0 buffer length parameter to get the required size then use an appropriately sized buffer.
For a slight improvement in efficiency, you can call it with a buffer that's large enough for the normal case and only do a second call to snprintf() if the output is truncated. In order to make sure the buffer(s) are properly freed in that case, I'll often use an auto_buffer<> object that handles the dynamic memory for me (and has the default buffer on the stack to avoid a heap allocation in the normal case).
If you're using a Microsoft compiler, MS has a non-standard _snprintf() that has serious limitations of not always null terminating the buffer and not indicating how big the buffer should be.
To work around Microsoft's non-support, I use a nearly public domain snprintf() from Holger Weiss.
Of course if your non-MS C or C++ compiler is missing snprintf(), the code from the above link should work just as well.
I would use a two-stage approach. Generally, a large percentage of output strings will be under a certain threshold and only a few will be larger.
Stage 1, use a reasonable sized static buffer such as 4K. Since snprintf() can restrict how many characters are written, you won't get a buffer overflow. What you will get returned from snprintf() is the number of characters it would have written if your buffer had been big enough.
If your call to snprintf() returns less than 4K, then use the buffer and exit. As stated, the vast majority of calls should just do that.
Some will not and that's when you enter stage 2. If the call to snprintf() won't fit in the 4K buffer, you at least now know how big a buffer you need.
Allocate, with malloc(), a buffer big enough to hold it then snprintf() it again to that new buffer. When you're done with the buffer, free it.
We worked on a system in the days before snprintf() and we acheived the same result by having a file handle connected to /dev/null and using fprintf() with that. /dev/null was always guaranteed to take as much data as you give it so we would actually get the size from that, then allocate a buffer if necessary.
Keep in kind that not all systems have snprintf() (for example, I understand it's _snprintf() in Microsoft C) so you may have to find the function that does the same job, or revert to the fprintf /dev/null solution.
Also be careful if the data can be changed between the size-checking snprintf() and the actual snprintf() to the buffer (i.e., wathch out for threads). If the sizes increase, you'll get buffer overflow corruption.
If you follow the rule that data, once handed to a function, belongs to that function exclusively until handed back, this won't be a problem.
For what it's worth, asprintf is a GNU extension that manages this functionality. It accepts a pointer as an output argument, along with a format string and a variable number of arguments, and writes back to the pointer the address of a properly-allocated buffer containing the result.
You can use it like so:
#define _GNU_SOURCE
#include <stdio.h>
int main(int argc, char const *argv[])
{
char *hi = "hello"; // these could be really long
char *everyone = "world";
char *message;
asprintf(&message, "%s %s", hi, everyone);
puts(message);
free(message);
return 0;
}
Hope this helps someone!
Take a look at CodeProject: CString-clone Using Standard C++. It uses solution you suggested with enlarging buffer size.
// -------------------------------------------------------------------------
// FUNCTION: FormatV
// void FormatV(PCSTR szFormat, va_list, argList);
//
// DESCRIPTION:
// This function formats the string with sprintf style format-specs.
// It makes a general guess at required buffer size and then tries
// successively larger buffers until it finds one big enough or a
// threshold (MAX_FMT_TRIES) is exceeded.
//
// PARAMETERS:
// szFormat - a PCSTR holding the format of the output
// argList - a Microsoft specific va_list for variable argument lists
//
// RETURN VALUE:
// -------------------------------------------------------------------------
void FormatV(const CT* szFormat, va_list argList)
{
#ifdef SS_ANSI
int nLen = sslen(szFormat) + STD_BUF_SIZE;
ssvsprintf(GetBuffer(nLen), nLen-1, szFormat, argList);
ReleaseBuffer();
#else
CT* pBuf = NULL;
int nChars = 1;
int nUsed = 0;
size_type nActual = 0;
int nTry = 0;
do
{
// Grow more than linearly (e.g. 512, 1536, 3072, etc)
nChars += ((nTry+1) * FMT_BLOCK_SIZE);
pBuf = reinterpret_cast<CT*>(_alloca(sizeof(CT)*nChars));
nUsed = ssnprintf(pBuf, nChars-1, szFormat, argList);
// Ensure proper NULL termination.
nActual = nUsed == -1 ? nChars-1 : SSMIN(nUsed, nChars-1);
pBuf[nActual+1]= '\0';
} while ( nUsed < 0 && nTry++ < MAX_FMT_TRIES );
// assign whatever we managed to format
this->assign(pBuf, nActual);
#endif
}
I've looked for the same functionality you're talking about, but as far as I know, something as simple as the C99 method is not available in C++, because C++ does not currently incorporate the features added in C99 (such as snprintf).
Your best bet is probably to use a stringstream object. It's a bit more cumbersome than a clearly written sprintf call, but it will work.
Since you're using C++, there's really no need to use any version of sprintf. The simplest thing to do is use a std::ostringstream.
std::ostringstream oss;
oss << a << " " << b << std::endl;
oss.str() returns a std::string with the contents of what you've written to oss. Use oss.str().c_str() to get a const char *. It's going to be a lot easier to deal with in the long run and eliminates memory leaks or buffer overruns. Generally, if you're worrying about memory issues like that in C++, you're not using the language to its full potential, and you should rethink your design.