var arg list to tempfile, why is it needed? - c++

I have this code inside a constructor of a class (not written by me) and it writes a variable arg list to a tmp file.
I wondered why this would be needed? The tmpfile is removed after this ctor goes out of scope and the var arg list sits inside the m_str vector.
Can someone suggest a better way of doing this without the use of a tmpfile?
DString(const char *fmt, ...)
{
DLog::Instance()->Log("Inside DString with ellipses");
va_list varptr;
va_start(varptr, fmt);
FILE *f = tmpfile();
if (f != NULL)
{
int n = ::vfprintf(f, fmt, varptr) + 1;
m_str.resize(n + 1);
::vsprintf(&m_str[0], fmt, varptr);
va_end(varptr);
}
else
DLog::Instance()->Log("[ERROR TMPFILE:] Unable to create TmpFile for request!");
}

This is C++ code: I think you may be trying to solve the wrong problem here.
The need for a temp file would go away completely if you consider using a C++-esque design instead of continuing to use the varargs. It may seem like a lot of work to convert all the calling sites to use a new mechanism, but varargs provide a wide variety of possibilities to mis-pass parameters leaving you open to insidious bugs, not to mention you can't pass non-POD types at all. I believe in the long (or even medium) term it will pay off in reliability, clarity, and ease of debugging.
Instead try to implement a C++-style streams interface that provides type safety and even the ability to disallow certain operations if needed.

It's just using the temporary file as a place that it can write the contents that won't overflow, so it can measure the length, then allocate sufficient space for the string, and finally deposit the real output in the string.
I'd at least consider how difficult it would be to replace the current printf-style interface that's leading to this with an iostreams-style interface, which will make it easy to avoid and give all the usual benefits of iostreams (type-safe, extensible, etc.)
Edit: if changing the function's signature is really too difficult to contemplate, then you probably want to replace vfprintf with vsnprintf. vsnprintf allows you to specify a buffer length (so it won't overrun the buffer) and it returns the number of characters that would have been generated if there had been sufficient space. As such, usage would be almost like you have now, but avoid generating the temporary file. You'd call it once specifying a buffer length of 0, use the return value (+1 for the NUL terminator) to resize your buffer, then call it again specifying the correct buffer size.

It appears to be using the temp file as an output place for the ::vfprintf() call. It does that to get the length of the formatted string (plus 1 for the NULL). Then resizes m_str big enough to hold the formatted string, which gets filled in from the ::vsprintf() call.
The var arg list is not in the file or in m_str. The formatted output from printf() (and its variants) is in the file and in m_str.

I have a queasy feeling showing this but you could try:
FILE *fp=freopen("nul","w", stderr)
int n = ::vfprintf( fp , fmt, varptr );
fclose(fp);
(windows)

Related

C++ - Using GetPrivateProfileString without buffer

I'm using GetPrivateProfileStringA to read some things from a .ini file. I have some other class where I save things along with a string array. I have to use it like this to get a proper string into the ControlAlt array:
char buffer[24];
GetPrivateProfileStringA("CONTROLS",
"ShiftUpAlt",
"LeftThumb",
buffer,
(DWORD)24,
"./Gears.ini");
scriptControl->ControlAlt[ScriptControls::ControlType::ShiftUp] = buffer;
I've tried putting it in directly, like so:
GetPrivateProfileStringA("CONTROLS",
"ShiftUpAlt",
"LeftThumb",
(LPSTR)scriptControl->ControlAlt[ScriptControls::ControlType::ShiftUp],
(DWORD)24,
"./Gears.ini");
But then the value in ControlAlt is an LPSTR, which gives complications later when comparing it against a proper string. Is there a way to not use a buffer for this?
ControlAlt is defined as std::string ControlAlt[SIZEOF_ControlType];
GetPrivateProfileStringA requires a buffer to write a classic C-style '\0'-terminated string into, and a std::string is not such a buffer, although as you observe, a C-style string can be converted to a std::string.
More specifically, GetPrivateProfileStringA expects a char * (LPSTR in Windows API terms) pointing to a writable buffer and that buffer's length. std::string does not provide this - at best, it provides the c_str() accessor which returns const char * (LPCSTR in Windows API terms) - a pointer to a read-only buffer. The const-ness of the buffer data is a pretty good indication that modifying it is a bad idea and will more than likely lead to undefined behavior.
C++ '98 says: "A program shall not alter any of the characters in this sequence." However, implementations conforming to newer standards may well be more willing to put up with monkey business: resize() to make the buffer large enough, then use &foo[0] to get a char * that isn't const (or just const_cast away the protection on data()), let GetPrivateProfileStringA write to the buffer, then truncate the std::string at the '\0' wherever it landed. This still doesn't let you pass in a std::string directly to a function expecting a buffer pointer, though, because they are not the same thing - it just gives you a chance to avoid copying the string one extra time from the buffer.

Overlapping strings

I have a problem with overlapping char*.
I'm working in a low-memory environment, namely Arduino and I would like to use the least memory possible. I want to be able to prepend a string with another and to do it without any copying of variables which wastes memory.
This is standard C or C++.
char* bigPacket = (char*)malloc(25); //Makes a big string of length 25
char* payload = bigPacket + 2; //This is part of the big string, 2 chars in.
bigPacket[0] = 72; // Letter 'H'
bigPacket[1] = 72; //I'm expecting the final bigPacket to read "HHHello, world"
payload = "Hello, World";
print(bigPacket);
But the problem is that it does not print "HHHello, world" as it should. Instead, it just prints "HH". Is there a proper way to make it be able to overlap these strings to print "HHHello, world"?
You changed where payload points. What you needed to do was leave payload alone and change the data it points to.
strcpy(payload, "Hello World");
Edit: If you really want to avoid copies you'd end up with something like the SGI Rope class. But you'd pay a lot in code complexity.
If you want to do this without either very complicated code or multiple copies of data, destroying the benefit, you need to have the complete string as one literal in your program: "HHHelloWorld". You can then play with pointers and lengths to access various parts of it, but remember there is only one null byte, at the end of the string.
However, I suspect that this is an over-optimization. Arduino programming rarely involves a lot of very long string. It is important to keep the code simple and direct.
You should not mess with pointers for something like that. Instead you should store string literals in flash instead of sram memory. This is usually done with the help of progmem macros. Often the "F" macro is sufficient though. Then you can copy your strings - as needed - and if needed - into a suitable buffer.
Simplest example:
Serial.println(F("this is text from flash memory"));
You just assign the payload pointer to point to the constant string, you do not copy the string to what it currently points to.
In order to copy the string you need to use strcpy or memcpy:
char *bigPacket = malloc(25);
bigPacket[0] = bigpacket[1] = 72;
strcpy( bigpacket+2, "Hello, World");
print( bigPacket );
Note that this is rather unlikely to save memory, since "Hello, world" will exist as a constant string in your code, to save memory it is probably most efficient to call print multiple times.
However, I guess that is not possible in this case.

Why does the VS2008 std::string.erase() move its buffer?

I want to read a file line by line and capture one particular line of input. For maximum performance I could do this in a low level way by reading the entire file in and just iterating over its contents using pointers, but this code is not performance critical so therefore I wish to use a more readable and typesafe std library style implementation.
So what I have is this:
std::string line;
line.reserve(1024);
std::ifstream file(filePath);
while(file)
{
std::getline(file, line);
if(line.substr(0, 8) == "Whatever")
{
// Do something ...
}
}
While this isn't performance critical code I've called line.reserve(1024) before the parsing operation to preclude multiple reallocations of the string as larger lines are read in.
Inside std::getline the string is erased before having the characters from each line added to it. I stepped through this code to satisfy myself that the memory wasn't being reallocated each iteration, what I found fried my brain.
Deep inside string::erase rather than just resetting its size variable to zero what it's actually doing is calling memmove_s with pointer values that would overwrite the used part of the buffer with the unused part of the buffer immediately following it, except that memmove_s is being called with a count argument of zero, i.e. requesting a move of zero bytes.
Questions:
Why would I want the overhead of a library function call in the middle of my lovely loop, especially one that is being called to do nothing at all?
I haven't picked it apart myself yet but under what circumstances would this call not actually do nothing but would in fact start moving chunks of buffer around?
And why is it doing this at all?
Bonus question: What the C++ standard library tag?
This is a known issue I reported a year ago, to take advantage of the fix you'll have to upgrade to a future version of the compiler.
Connect Bug: "std::string::erase is stupidly slow when erasing to the end, which impacts std::string::resize"
The standard doesn't say anything about the complexity of any std::string functions, except swap.
std::string::clear() is defined in terms of std::string::erase(),
and std::string::erase() does have to move all of the characters after
the block which was erased. So why shouldn't it call a standard
function to do so? If you've got some profiler output which proves that
this is a bottleneck, then perhaps you can complain about it, but
otherwise, frankly, I can't see it making a difference. (The logic
necessary to avoid the call could end up costing more than the call.)
Also, you're not checking the results of the call to getline before
using them. Your loop should be something like:
while ( std::getline( file, line ) ) {
// ...
}
And if you're so worried about performance, creating a substring (a new
std::string) just in order to do a comparison is far more expensive
than a call to memmove_s. What's wrong with something like:
static std::string const target( "Whatever" );
if ( line.size() >= target.size()
&& std::equal( target.begin(), target().end(), line.being() ) ) {
// ...
}
I'ld consider this the most idiomatic way of determining whether a
string starts with a specific value.
(I might add that from experience, the reserve here doesn't buy you
much either. After you've read a couple of lines in the file, your
string isn't going to grow much anyway, so there'll be very few
reallocations after the first couple of lines. Another case of
premature optimization?)
In this case, I think the idea you mention of reading the entire file and iterating over the result may actually give about as simple of code. You're simply changing: "read line, check for prefix, process" to "read file, scan for prefix, process":
size_t not_found = std::string::npos;
std::istringstream buffer;
buffer << file.rdbuf();
std::string &data = buffer.str();
char const target[] = "\nWhatever";
size_t len = sizeof(target)-1;
for (size_t pos=0; not_found!=(pos=data.find(target, pos)); pos+=len)
{
// process relevant line starting at contents[pos+1]
}

C++: Is it safe to move a jpeg around in memory using a std::string?

I have an external_jpeg_func() that takes jpeg data in a char array to do stuff with it. I am unable to modify this function. In order to provide it the char array, I do something like the following:
//what the funcs take as inputs
std::string my_get_jpeg();
void external_jpeg_func(const char* buf, unsigned int size);
int main ()
{
std::string myString = my_get_jpeg();
external_jpeg_func(myString.data(), myString.length() );
}
My question is: Is it safe to use a string to transport the char array around? Does jpeg (or perhaps any binary file format) be at risk of running into characters like '\0' and cause data loss?
My recommendation would be to use std::vector<char>, instead of std::string, in this case; the danger with std::string is that it provides a c_str() function and most developers assume that the contents of a std::string are NUL-terminated, even though std::string provides a size() function that can return a different value than what you would get by stopping at NUL. That said, as long as you are careful to always use the constructor that takes a size parameter, and you are careful not to pass the .c_str() to anything, then there is no problem with using a string here.
While there is no technical advantage to using a std::vector<char> over a std::string, I feel that it does a better job of communicating to other developers that the content is to be interpreted as an arbitrary byte sequence rather than NUL-terminated textual content. Therefore, I would choose the former for this added readability. That said, I have worked with plenty of code that uses std::string for storing arbitrary bytes. In fact, the C++ proto compiler generates such code (though, I should add, that I don't think this was a good choice for the readability reasons that I mentioned).
std::string does not treat null characters specially, unless you don't give it an explicit string length. So your code will work fine.
Although, in C++03, strings are technically not required to be stored in contiguous memory. Just about every std::string implementation you will find will in fact store them that way, but it is not technically required. C++11 rectifies this.
So, I would suggest you use a std::vector<char> in this case. std::string doesn't buy you anything over a std::vector<char>, and it's more explicit that this is an array of characters and not a possibly printable string.
I think it is better to use char array char[] or std::vector<char>. This is standard way to keep images. Of course, binary file may contain 0 characters.

Why does MSVC++ consider "std::strcat" to be "unsafe"? (C++)

When I try to do things like this:
char* prefix = "Sector_Data\\sector";
char* s_num = "0";
std::strcat(prefix, s_num);
std::strcat(prefix, "\\");
and so on and so forth, I get a warning
warning C4996: 'strcat': This function or variable may be unsafe. Consider using strcat_s instead.
Why is strcat considered unsafe, and is there a way to get rid of this warning without using strcat_s?
Also, if the only way to get rid of the warning is to use strcat_s, how does it work (syntax-wise: apparently it does not take two arguments).
If you are using c++, why not avoid the whole mess and use std::string. The same example without any errors would look like this:
std::string prefix = "Sector_Data\\sector";
prefix += "0";
prefix += "\\"
no need to worry about buffer sizes and all that stuff. And if you have an API which takes a const char *, you can just use the .c_str() member;
some_c_api(prefix.c_str());
Because the buffer, prefix, could have less space than you are copying into it, causing a buffer overrun.
Therefore, a hacker could pass in a specially crafted string which overwrites the return address or other critical memory and start executing code in the context of your program.
strcat_s solves this by forcing you to pass in the length of the buffer into which you are copying the string; it will truncate the string if necessary to make sure that the buffer is not overrun.
google strcat_s to see precisely how to use it.
You can get rid of these warning by adding:
_CRT_SECURE_NO_WARNINGS
and
_SCL_SECURE_NO_WARNINGS
to your project's preprocessor definitions.
That's one of the string-manipulation functions in C/C++ that can lead to buffer overrun errors.
The problem is that the function doesn't know what the size of the buffers are. From the MSDN documentation:
The first argument, strDestination,
must be large enough to hold the
current strDestination and strSource
combined and a closing '\0';
otherwise, a buffer overrun can occur.
strcat_s takes an extra argument telling it the size of the buffer. This allows it to validate the sizes before doing the concat, and will prevent overruns. See http://msdn.microsoft.com/en-us/library/d45bbxx4.aspx
Because it has no means of checking to see if the destination string (prefix) in your case will be written past its bounds. strcat essentially works by looping, copying byte-by-byte the source string into the destination. Its stops when it sees a value "0" (notated by '\0') called a null terminal. Since C has no built in bounds checking, and the dest str is just a place in memory, strcat will continue going ad-infinidium even if it blows past the source str or the dest. str doesn't have a null terminal.
The solutions above are platform-specific to your windows environment. If you want something platform independent, you have to wrangle with strncat:
strncat(char* dest, const char* src, size_t count)
This is another option when used intelligently. You can use count to specify the max number of characters to copy. To do this, you have to figure out how much space is available in dest (how much you allocated - strlen(dest)) and pass that as count.
To turn the warning off, you can do this.
#pragma warning(disable:4996)
btw, I strongly recommend that you use strcat_s().
There are two problems with strcat. First, you have to do all your validation outside the function, doing work that is almost the same as the function:
if(pDest+strlen(pDest)+strlen(pScr) < destSize)
You have to walk down the entire length of both strings just to make sure it will fit, before walking down their entire length AGAIN to do the copy. Because of this, many programmers will simply assume that it will fit and skip the test. Even worse, it may be that when the code is first written it is GUARANTEED to fit, but when someone adds another strcat, or changes a buffer size or constant somewhere else in the program, you now have issues.
The other problem is if pSrc and pDst overlap. Depending on your compiler, strcat may very well be simple loop that checks a character at a time for a 0 in pSrc. If pDst overwrites that 0, then you will get into a loop that will run until your program crashes.