std::string and stdarg.h - c++

I have written a function which tries to do an auto-allocating sprintf by returning a std::string instead of writing into a user-supplied char*. (Please, no answers recommending iostreams or Boost.Format or friends -- I know they exist, I do use them in other contexts, but there is a requirement for this particular case.)
std::string FormatString(const std::string& format, va_list argList)
{
char smallBuffer[500], *text = smallBuffer;
int length = _countof(smallBuffer);
// MSVC is not C99 conformant, so its vsnprintf returns -1
// on insufficient buffer space
int outputSize = _vsnprintf(text, length, format.c_str(), argList);
while (outputSize < 0 && errno == ERANGE && length > 0)
{
length <<= 1;
if (text != smallBuffer) { delete[] text; }
text = new char[length];
outputSize = _vsnprintf(text, length, format.c_str(), argList);
}
if (outputSize < 0)
{
throw std::runtime_error("Failed to format string.");
}
std::string ret(text);
if (text != smallBuffer)
{
delete[] text;
}
return ret;
}
std::string FormatString(const std::string& format, ...)
{
va_list argList;
va_start(argList, format);
std::string result;
try
{
result = FormatString(format, argList);
}
catch(...)
{
va_end(argList);
throw;
}
va_end(argList);
return result;
}
int _tmain(int argc, _TCHAR* argv[])
{
int foo = 1234;
std::string bar = "BlaBla";
std::cout << FormatString("%i (%s)", foo, bar.c_str()) << std::endl;
return 0;
}
(And yes, I see the irony of piping a C-formatted string to a C++ iostream. This is just test code.)
Unfortunately, using VS2008, it's crashing deep within the bowels of the printf internals, apparently because it's reading the wrong arguments out of the va_list (according to the debugger, after the va_start it's pointing at a four-byte null sequence immediately prior to the "real" first parameter).
Of particular note is that if in the variadic function I change the const std::string& format to just std::string format (ie. pass by value), it works properly; it also does so if I change it to a const char *, of course.
Is this some sort of compiler bug, or is it not legal to use a va_list with reference parameters?

I think you are out of luck if you want to pass a reference. Here is what the C++2011 standard has to say about the subject in 18.10 [support.runtime] paragraph 3:
The restrictions that ISO C places on the second parameter to the va_start() macro in header <stdarg.h> are different in this International Standard. The parameter parmN is the identifier of the rightmost parameter in the variable parameter list of the function definition (the one just before the ...). If the parameter parmN is declared with a function, array, or reference type, or with a type that is not compatible with the type that results when passing an argument for which there is no parameter, the behavior is undefined.
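For what it's worth, here is a minimal sketch of how the function from the question could look once the last named parameter is a plain const char* instead of a reference. It assumes a C99-conforming vsnprintf (one that returns the required length, so not the VS2008 runtime) and a compiler that provides va_copy; FormatStringV is a name made up for this sketch.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: formats from a va_list that the caller has already started.
std::string FormatStringV(const char* format, va_list argList)
{
    assert(format != nullptr);

    // Measure on a copy, because a va_list must not be reused after a v*printf call.
    va_list probe;
    va_copy(probe, argList);
    const int needed = std::vsnprintf(nullptr, 0, format, probe);
    va_end(probe);
    if (needed < 0)
    {
        throw std::runtime_error("Failed to format string.");
    }

    // Second pass writes into a buffer of the measured size (+1 for the terminator).
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(buffer.data(), buffer.size(), format, argList);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}

// The last named parameter is a pointer, not a reference, so va_start is well-defined.
std::string FormatString(const char* format, ...)
{
    va_list argList;
    va_start(argList, format);
    std::string result;
    try
    {
        result = FormatStringV(format, argList);
    }
    catch (...)
    {
        va_end(argList);
        throw;
    }
    va_end(argList);
    return result;
}
```

Callers that hold a std::string would pass fmt.c_str() as the first argument.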

Related

How to properly replace sprintf_s by sprintf in C++03?

sprintf_s is a Microsoft implementation of sprintf that patches a flaw by adding an argument that bounds how much the function may write.
An equivalent was introduced in C++11: snprintf. But here we are talking about C++03 syntax.
Signatures:
count_char_written sprintf(char* string_out, const char* output_template, VARIADIC_ARGS);
// and
count_char_written sprintf_s(char* string_out, size_t buffer_max_size, const char* output_template, VARIADIC_ARGS);
Functionally, sprintf_s is more advanced than sprintf, because it avoids overflows.
But sprintf_s is Microsoft only!
What should you do if you want to port C++03 code written with sprintf_s back to POSIX-compatible syntax?
Today both snprintf and vsnprintf should be available everywhere with the exception of Windows with MSVC12 and older. The simplest way for you is to provide snprintf/vsnprintf on Windows where it is not available.
Windows provides the function _vsnprintf_s, which is similar to vsnprintf but differs in important ways when the provided buffer is too small:
The buffer content depends on the additional count argument, which does not exist in vsnprintf. To get vsnprintf behavior, pass _TRUNCATE here.
-1 is returned instead of the number of characters required. This can be fixed by calling _vscprintf, which is only needed if the previous call to _vsnprintf_s failed.
Additionally, these functions do not support format specifiers added in C99, such as %zd. This cannot be easily resolved; you will have to avoid using them.
Code below:
int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
{
int r = -1;
if (size != 0)
{
va_list args_copy;
va_copy(args_copy, args);
r = _vsnprintf_s(buf, size, _TRUNCATE, fmt, args_copy);
va_end(args_copy);
}
if (r == -1)
{
r = _vscprintf(fmt, args);
}
return r;
}
int snprintf(char *buf, size_t size, const char *fmt, ...)
{
va_list args;
va_start(args, fmt);
int r = vsnprintf(buf, size, fmt, args);
va_end(args);
return r;
}
Note: Windows also provides _vsnprintf which looks better suited for this implementation, but it does not terminate the resulting string. If you want to use it, you should be careful.
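Once a conforming snprintf is available, the return value is what lets the caller detect truncation. Here is a small sketch of that check; FormatPair and its "name=value" layout are invented for illustration, and it uses std::snprintf from C++11's <cstdio> (with the shim above you would call the unqualified snprintf instead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes "name=value" into buf and reports whether it fit.
bool FormatPair(char* buf, std::size_t size, const char* name, int value)
{
    assert(buf != nullptr && size > 0);
    const int needed = std::snprintf(buf, size, "%s=%d", name, value);
    // needed excludes the terminating NUL; a negative value signals an error.
    return needed >= 0 && static_cast<std::size_t>(needed) < size;
}
```

When the function returns false because of truncation, buf still holds a terminated prefix of the output, which is the behavior the wrapper above tries to reproduce on Windows.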

Best practice to use execvp in C++

At the beginning, I wrote something like this
char* argv[] = { "ls", "-al", ..., (char*)NULL };
execvp("ls", argv);
However, GCC popped up this warning, "C++ forbids converting a string constant to char*."
Then, I changed my code into
const char* argv[] = { "ls", "-al", ..., (char*)NULL };
execvp("ls", argv);
As a result, GCC popped up this error, "invalid conversion from const char** to char* const*."
Then, I changed my code into
const char* argv[] = { "ls", "-al", ..., (char*)NULL };
execvp("ls", (char* const*)argv);
It finally compiles and works without any warning or error, but I think this is a bit cumbersome, and I cannot find anyone who has written something like this on the Internet.
Is there any better way to use execvp in C++?
You have hit a real problem, because we are facing two incompatible constraints:
One from the C++ standard, which requires you to use const char*:
In C, string literals are of type char[], and can be assigned directly
to a (non-const) char*. C++03 allowed it as well (but deprecated it,
as literals are const in C++). C++11 no longer allows such assignments
without a cast.
The other from the legacy C function prototype that requires an array of (non-const) char*:
int execv(const char *path, char *const argv[]);
Consequently, there must be a const_cast<> somewhere, and the only solution I found is to wrap the execvp function.
Here is a complete running C++ demonstration of this solution. The inconvenience is that you have some glue code to write once, but the advantage is that you get safer and cleaner C++11 code (the final nullptr is checked).
#include <cassert>
#include <cstddef> // std::size_t
#include <unistd.h>
template <std::size_t N>
int execvp(const char* file, const char* const (&argv)[N])
{
assert((N > 0) && (argv[N - 1] == nullptr));
return execvp(file, const_cast<char* const*>(argv));
}
int main()
{
const char* const argv[] = {"ls", "-al", nullptr}; // argv[0] is conventionally the program name
execvp("ls", argv);
}
You can compile this demo with:
g++ -std=c++11 demo.cpp
You can see a similar approach in the CPP Reference example for std::experimental::to_array.
This is a conflict between the declaration of execvp() (which can't promise not to modify its arguments, for backwards compatibility) and the C++ interpretation of string literals as arrays of constant char.
If the cast concerns you, your remaining option is to copy the argument list, like this:
#include <unistd.h>
#include <cstring>
#include <memory>
int execvp(const char *file, const char *const argv[])
{
std::size_t argc = 0;
std::size_t len = 0;
/* measure the inputs */
for (auto *p = argv; *p; ++p) {
++argc;
len += std::strlen(*p) + 1;
}
/* allocate copies */
auto const arg_string = std::make_unique<char[]>(len);
auto const args = std::make_unique<char*[]>(argc+1);
/* copy the inputs */
len = 0; // re-use for position in arg_string
for (auto i = 0u; i < argc; ++i) {
len += std::strlen(args[i] = std::strcpy(&arg_string[len], argv[i]))
+ 1; /* advance to one AFTER the nul */
}
args[argc] = nullptr;
return execvp(file, args.get());
}
(You may consider std::unique_ptr to be overkill, but this function does correctly clean up if execvp() fails, and the function returns).
Demo:
int main()
{
const char *argv[] = { "printf", "%s\n", "one", "two", "three", nullptr };
return execvp("printf", argv);
}
one
two
three
execvp requires a char *const argv[] as its second argument. That is, it requires a list of const pointers to non-const data. String literals in C++ are const, hence the problem with warnings, and casting your argv to char* const* is a hack, as execvp is now allowed to write to the strings in your argv. The solution I see is to either allocate a writable buffer for each item or just use execlp instead, which works with const char* args and allows for passing string literals.
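A rough sketch of the "writable buffer for each item" idea, assuming C++11 and that you are happy to keep the arguments in a std::vector<std::string>. MakeArgv is a made-up helper name; it only builds the pointer array and does not call exec itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds a null-terminated array of writable char* that
// point into the strings owned by args. The result is only valid while args
// is alive and unmodified.
std::vector<char*> MakeArgv(std::vector<std::string>& args)
{
    assert(!args.empty()); // argv[0] is conventionally the program name
    std::vector<char*> argv;
    argv.reserve(args.size() + 1);
    for (std::string& arg : args)
    {
        argv.push_back(&arg[0]); // non-const pointer into the string's own buffer
    }
    argv.push_back(nullptr);     // the exec family expects a terminating null pointer
    return argv;
}
```

With that in place, the call becomes execvp(argv[0], argv.data()) and no cast is needed, because argv.data() is already a char**.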

Checking variable arguments for type

I have the function below, which I use for string concatenation; it takes a variable-length set of arguments. I want to check to make sure each element is
a char*. I was looking into using dynamic_cast, but it cannot be used for char*.
How should I go about casting the arg?
char* Concatenate(int numStrings, ...)
{
vector<char*> stringVectorArray;
va_list vargList;
if (numStrings > 0 && numStrings < MAX_STRING_BUFFER_SIZE)
{
//Store each of the arguments so we can iterate through them later.
va_start(vargList, numStrings);
for (int currIndex = 0; currIndex < numStrings; currIndex++)
{
char* item = (char*)(va_arg(vargList, char*));
if (item == NULL)
{
//Error: One of the parameters is not char*.
va_end(vargList);
return NULL;
}
else
{
stringVectorArray.push_back(item);
}
}
va_end(vargList);
}
return ConcatenateStrings(stringVectorArray);
}
You simply don't know. There is no well-defined way of knowing what the argument types are for a variable argument list.
You have to trust the caller to get it right: in C, use the (char*) notation, in C++ use reinterpret_cast.
The variadic templates of C++11 introduce type safety into variable argument lists.
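To illustrate the variadic-template route, here is a sketch of a type-checked Concatenate, assuming C++11 and that returning a std::string is acceptable (the original returns char*). AppendAll is a helper invented for this example.

```cpp
#include <cassert>
#include <string>

// Base case: nothing left to append.
inline void AppendAll(std::string&) {}

// Recursive case: every argument must be convertible to const char*,
// otherwise overload resolution fails at compile time.
template <typename... Rest>
void AppendAll(std::string& out, const char* first, Rest... rest)
{
    assert(first != nullptr); // a null pointer can still only be caught at run time
    out += first;
    AppendAll(out, rest...);
}

template <typename... Args>
std::string Concatenate(Args... args)
{
    std::string out;
    AppendAll(out, args...);
    return out;
}
```

With this version a call such as Concatenate("a", 42) is rejected by the compiler instead of misbehaving at run time, and the numStrings count is no longer needed.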
"cannot be used for char*"
Don't use char* then, use an object which CAN be used with dynamic cast, like std::string or your own class.

How to pass variable number of arguments to printf/sprintf

I have a class that holds an "error" function that will format some text. I want to accept a variable number of arguments and then format them using printf.
Example:
class MyClass
{
public:
void Error(const char* format, ...);
};
The Error method should take in the parameters, call printf/sprintf to format it and then do something with it. I don't want to write all the formatting myself so it makes sense to try and figure out how to use the existing formatting.
Use vfprintf, like so:
void Error(const char* format, ...)
{
va_list argptr;
va_start(argptr, format);
vfprintf(stderr, format, argptr);
va_end(argptr);
}
This outputs the results to stderr. If you want to save the output in a string instead of displaying it use vsnprintf. (Avoid using vsprintf: it is susceptible to buffer overflows as it doesn't know the size of the output buffer.)
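A sketch of the vsnprintf variant as a member function, in case you want to keep the text rather than print it. The bool return value, the lastError_ member and the LastError() accessor are additions made up for this example; the class in the question only declares void Error(const char*, ...).

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

class MyClass
{
public:
    // Formats printf-style and stores the text; returns false if it was truncated
    // or could not be formatted.
    bool Error(const char* format, ...)
    {
        assert(format != nullptr);
        char buffer[256];
        va_list args;
        va_start(args, format);
        const int needed = std::vsnprintf(buffer, sizeof buffer, format, args);
        va_end(args);
        if (needed < 0)
        {
            lastError_.clear();
            return false;
        }
        lastError_ = buffer; // vsnprintf always terminates when size > 0
        return static_cast<std::size_t>(needed) < sizeof buffer;
    }

    const std::string& LastError() const { return lastError_; }

private:
    std::string lastError_; // hypothetical member, not part of the original class
};
```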
Have a look at vsnprintf, as it will do what you want: http://www.cplusplus.com/reference/clibrary/cstdio/vsprintf/
You have to initialise the va_list first, then call it.
Example adapted from that link:
/* vsnprintf example */
#include <stdio.h>
#include <stdarg.h>
void Error (const char * format, ...)
{
char buffer[256];
va_list args;
va_start (args, format);
vsnprintf (buffer, sizeof buffer, format, args); /* always terminates the string */
//do something with the error
va_end (args);
}
I should have read more on existing questions in stack overflow.
C++ Passing Variable Number of Arguments is a similar question. Mike F has the following explanation:
There's no way of calling (eg) printf
without knowing how many arguments
you're passing to it, unless you want
to get into naughty and non-portable
tricks.
The generally used solution is to
always provide an alternate form of
vararg functions, so printf has
vprintf which takes a va_list in place
of the .... The ... versions are just
wrappers around the va_list versions.
This is exactly what I was looking for. I performed a test implementation like this:
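void Error(const char* format, ...)
{
char dest[1024 * 16];
va_list argptr;
va_start(argptr, format);
vsnprintf(dest, sizeof dest, format, argptr); // bounded, unlike vsprintf
va_end(argptr);
printf("%s", dest); // never pass already-formatted text as the format string
}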
You are looking for variadic functions. printf() and sprintf() are variadic functions - they can accept a variable number of arguments.
This entails basically these steps:
The first parameter must give some indication of the number of parameters that follow. So in printf(), the "format" parameter gives this indication: if you have 5 format specifiers, then it will look for 5 more arguments (for a total of 6 arguments). The first argument could also be an integer (e.g. "myfunction(3, a, b, c)", where "3" signifies "3 arguments").
Then loop through and retrieve each successive argument, using the va_start() etc. functions.
There are plenty of tutorials on how to do this - good luck!
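The two steps above can be sketched with the count-first convention. Sum is a made-up example function, and it assumes every variadic argument really is an int, since nothing can verify that at run time.

```cpp
#include <cassert>
#include <cstdarg>

// Adds up `count` int arguments that follow the count itself.
int Sum(int count, ...)
{
    assert(count >= 0);
    va_list args;
    va_start(args, count);           // start after the last named parameter
    int total = 0;
    for (int i = 0; i < count; ++i)
    {
        total += va_arg(args, int);  // the caller must really have passed ints
    }
    va_end(args);
    return total;
}
```

For example, Sum(3, 1, 2, 3) reads exactly three ints after the count and returns 6.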
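Simple example below. Note that you should pass in a larger buffer, and test whether the buffer was large enough.
void Log(LPCTSTR pFormat, ...)
{
va_list pArg;
va_start(pArg, pFormat);
TCHAR buf[1000]; // TCHAR matches the character type of LPCTSTR and _vsntprintf
int len = _vsntprintf(buf, _countof(buf) - 1, pFormat, pArg);
va_end(pArg);
buf[_countof(buf) - 1] = 0; // _vsntprintf does not terminate the string on truncation
if (len < 0) { /* output was truncated or formatting failed */ }
//do something with buf
}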
Using functions with ellipses is not very safe. If performance is not critical for the log function, consider using operator overloading as in boost::format. You could write something like this:
#include <sstream>
#include <boost/format.hpp>
#include <iostream>
using namespace std;
class formatted_log_t {
public:
formatted_log_t(const char* msg ) : fmt(msg) {}
~formatted_log_t() { cout << fmt << endl; }
template <typename T>
formatted_log_t& operator %(T value) {
fmt % value;
return *this;
}
protected:
boost::format fmt;
};
formatted_log_t log(const char* msg) { return formatted_log_t( msg ); }
// use
int main ()
{
log("hello %s in %d-th time") % "world" % 10000000;
return 0;
}
The following sample demonstrates possible errors with ellipses:
int x = SOME_VALUE;
double y = SOME_MORE_VALUE;
printf( "some var = %f, other one %f", y, x ); // compiles without error but fails at runtime: the compiler does not know which types you intended
log( "some var = %f, other one %f" ) % y % x; // no errors. %f is kept only for compatibility; you could write %1% instead.
Have a look at the example at http://www.cplusplus.com/reference/clibrary/cstdarg/va_arg/. They pass the number of arguments to the method, but you can omit that and modify the code appropriately (see the example).

Are there gotchas using varargs with reference parameters

I have this piece of code (summarized)...
AnsiString working(AnsiString format,...)
{
va_list argptr;
AnsiString buff;
va_start(argptr, format);
buff.vprintf(format.c_str(), argptr);
va_end(argptr);
return buff;
}
And, on the basis that pass by reference is preferred where possible, I changed it thusly.
AnsiString broken(const AnsiString &format,...)
{
... the rest, totally identical ...
}
My calling code is like this:-
AnsiString s1, s2;
s1 = working("Hello %s", "World");
s2 = broken("Hello %s", "World");
But, s1 contains "Hello World", while s2 has "Hello (null)". I think this is due to the way va_start works, but I'm not exactly sure what's going on.
If you look at what va_start expands out to, you'll see what's happening:
va_start(argptr, format);
becomes (roughly)
argptr = (va_list) (&format+1);
If format is a value-type, it gets placed on the stack right before all the variadic arguments. If format is a reference type, only the address gets placed on the stack. When you take the address of the reference variable, you get the address of the original variable (in this case, of a temporary AnsiString created before calling broken), not the address of the argument.
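The key point (taking the address of a reference yields the address of the referent) can be checked without any varargs at all. A tiny sketch; RefersToOriginal and Demo are names made up for the illustration.

```cpp
#include <cassert>

// Returns true when &ref is the address of the caller's object rather than
// the address of a slot in this function's own argument list.
bool RefersToOriginal(const int& ref, const int* original)
{
    return &ref == original;
}

void Demo()
{
    int value = 7;
    // &ref inside the callee is &value in the caller, so "&format + 1"
    // would point just past the caller's object, not at the next argument.
    assert(RefersToOriginal(value, &value));
}
```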
If you don't want to pass around full classes, your options are to either pass by pointer, or put in a dummy argument:
AnsiString working_ptr(const AnsiString *format,...)
{
ASSERT(format != NULL);
va_list argptr;
AnsiString buff;
va_start(argptr, format);
buff.vprintf(format->c_str(), argptr);
va_end(argptr);
return buff;
}
...
AnsiString format = "Hello %s";
s1 = working_ptr(&format, "World");
or
AnsiString working_dummy(const AnsiString &format, int dummy, ...)
{
va_list argptr;
AnsiString buff;
va_start(argptr, dummy);
buff.vprintf(format.c_str(), argptr);
va_end(argptr);
return buff;
}
...
s1 = working_dummy("Hello %s", 0, "World");
Here's what the C++ standard (18.7 - Other runtime support) says about va_start() (emphasis mine) :
The restrictions that ISO C places on
the second parameter to the
va_start() macro in header
<stdarg.h> are different in this
International Standard. The parameter
parmN is the identifier of the
rightmost parameter in the variable
parameter list of the function
definition (the one just before the
...).
If the parameter parmN is declared with a function, array, or reference
type, or with a type that is not
compatible with the type that results
when passing an argument for which
there is no parameter, the behavior
is undefined.
As others have mentioned, using varargs in C++ is dangerous if you use it with non-straight-C items (and possibly even in other ways).
That said - I still use printf() all the time...
A good analysis why you don't want this is found in N0695
According to C++ Coding Standards (Sutter, Alexandrescu):
varargs should never be used with C++:
They are not type safe and have UNDEFINED behavior for objects of class type, which is likely causing your problem.
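If the goal is mainly to keep a const reference for the format parameter, one option that sidesteps va_start entirely is a C++11 variadic template forwarding to snprintf. This is only a sketch: Format is a made-up name, it assumes a C99-conforming snprintf, and the arguments are still not checked against the format string, so passing class types remains just as undefined as before.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

template <typename... Args>
std::string Format(const std::string& format, Args... args)
{
    // Measure first: with a null buffer and size 0, snprintf only reports the length.
    const int needed = std::snprintf(nullptr, 0, format.c_str(), args...);
    if (needed < 0)
    {
        throw std::runtime_error("Failed to format string.");
    }

    // Expanding the pack a second time is fine; unlike a va_list it is not consumed.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::snprintf(buffer.data(), buffer.size(), format.c_str(), args...);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

Because there is no va_list involved, the restriction on reference parameters quoted above does not apply here.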
Here's my easy workaround (compiled with Visual C++ 2010):
void not_broken(const string& format,...)
{
va_list argptr;
_asm {
lea eax, [format];
add eax, 4;
mov [argptr], eax;
}
vprintf(format.c_str(), argptr);
}
Side note:
The behavior for class types as varargs arguments may be undefined, but it's consistent in my experience. The compiler pushes sizeof(class) of the class's memory onto the stack. Ie, in pseudo-code:
alloca(sizeof(class));
memcpy(stack, &instance, sizeof(class));
For a really interesting example of this being utilized in a very creative way, notice that you can pass a CString instance in place of a LPCTSTR to a varargs function directly, and it works, and there's no casting involved. I leave it as an exercise to the reader to figure out how they made that work.