libpq error message deallocation - c++

Here comes a stupid question. libpq's PQerrorMessage function returns a char const*
char const* msg = PQerrorMessage(conn);
Now since it's const, I don't think I should be deallocating it and I've never seen that done in any examples. But then, when and how does it get freed?
How could it know when I'm finished using my msg pointer?
At first I thought it gets deallocated once another error message is requested but that's not the case.
// cause some error
char const* msg1 = PQerrorMessage(pgconn);
// cause another error
char const* msg2 = PQerrorMessage(pgconn);
// still works
std::cout << msg1 << msg2 << std::endl;
Can someone shed some light on this for me?
Edit: credits to Dmitriy Igrishin
I asked this on the postgresql mailing list and it turns out that my initial assumption was correct.
The msg1 pointer should not have been valid and I got lucky somehow.
Edit: from the postgresql docs
PQerrorMessage
Returns the error message most recently generated by an operation on the connection.
char *PQerrorMessage(const PGconn *conn);
Nearly all libpq functions will set a message for PQerrorMessage if they fail. Note that by libpq convention, a nonempty PQerrorMessage result can consist of multiple lines, and will include a trailing newline. The caller should not free the result directly. It will be freed when the associated PGconn handle is passed to PQfinish. The result string should not be expected to remain the same across operations on the PGconn structure.

Do as the docs say: don't expect its contents to remain constant. Just save them away in a std::string rather than storing the pointer.
// cause some error
std::string msg1 = PQerrorMessage(pgconn);
// cause another error
std::string msg2 = PQerrorMessage(pgconn);
// works all the time
std::cout << msg1 << msg2 << std::endl;
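The same idea can be tried without a live database. The following is only a sketch: FakeConn, setError and errorMessage are hypothetical stand-ins, not libpq API. They mimic a connection that keeps one internal message buffer and overwrites it on every failure, which is the behaviour the quoted docs describe.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a connection object; NOT part of libpq.
// It owns a single message buffer that is reused for every error.
class FakeConn {
public:
    void setError(const std::string& text) { m_error = text; }
    // Returns a pointer into the internal buffer, as the docs say
    // PQerrorMessage does.
    const char* errorMessage() const { return m_error.c_str(); }
private:
    std::string m_error;
};

// Copies each message right away, so later errors cannot affect it.
std::string collectTwoErrors(FakeConn& conn) {
    conn.setError("first failure\n");
    std::string msg1 = conn.errorMessage();   // deep copy of the text
    conn.setError("second failure\n");
    std::string msg2 = conn.errorMessage();   // deep copy of the text
    assert(msg1 == "first failure\n");        // the copy is unaffected
    return msg1 + msg2;
}
```

Had msg1 been kept as a raw const char*, it would have referred to storage that the second setError call is free to overwrite or reallocate.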

A library function that returns a plain-old-pointer to allocated memory is very old-school and C-ish, but there are still a lot of them around. There's no way other than documentation to know if the intent of the library designer was to transfer ownership of the allocated storage to your code. The modern library designer can return a shared_ptr<> to make their intention about storage lifetime completely clear, or wrap the string up as an std::string, which also handles allocation and deletion under the covers.
The const char* declaration doesn't really say anything about the storage lifetime. Instead, it says don't modify the storage. For an old-school function that returns allocated storage, you just have to know that deleting the storage isn't the same as modifying it. The old-school function might want to return a const char* to let you know that only so many storage positions are allocated, and if you write off the end, chaos will ensue.
Of course this function might be returning data from a static table, in which case you should neither write into it nor delete it. Again, when you use plain-old-pointers, there's no way to know.
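A small sketch of how the return type can carry the ownership information. All function names here are hypothetical, and std::unique_ptr is used instead of the shared_ptr mentioned above because a single owner is enough for the illustration.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Static storage: the caller must neither free nor modify it.
const char* staticMessage() { return "from a static table"; }

// Ownership is handed to the caller; the unique_ptr frees the array
// automatically when it goes out of scope.
std::unique_ptr<char[]> ownedMessage() {
    const char* text = "owned by the caller";
    std::unique_ptr<char[]> buf(new char[std::strlen(text) + 1]);
    std::strcpy(buf.get(), text);
    return buf;
}

// Value semantics: the std::string manages its own storage.
std::string valueMessage() { return "managed by std::string"; }

// Usage: no delete or free appears anywhere in the calling code.
void ownershipDemo() {
    assert(std::strcmp(staticMessage(), "from a static table") == 0);
    std::unique_ptr<char[]> owned = ownedMessage();
    assert(std::strcmp(owned.get(), "owned by the caller") == 0);
    assert(valueMessage() == "managed by std::string");
}
```

With a plain const char* return, as in staticMessage, the caller still has to read the documentation; the other two signatures answer the question by themselves.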

Related

What unexpected behaviour can returning a pointer to a char array member cause?

Okay, so. I've been working on a class project (we haven't covered std::string and std::vector yet though obviously I know about them) to construct a time clock of sorts. The main portion of the program expects time and date values as formatted c-strings (e.g. "12:45:45", "12/12/12" etc.), and I probably could have kept things simple by storing them the same way in my basic class. But, I didn't.
Instead I did this:
class UsageEntry {
public:
....
typedef time_t TimeType;
typedef int IDType;
...
// none of these getters are thread safe
// furthermore, the char* the getters return should be used immediately
// and then discarded: its contents will be modified on the next call
// to any of these functions.
const char* getUserID();
const char* getDate();
const char* getTimeIn();
const char* getTimeOut();
private:
IDType m_id;
TimeType m_timeIn;
TimeType m_timeOut;
char m_buf[LEN_MAX];
};
And one of the getters (they all do basically the same thing):
const char* UsageEntry::getDate()
{
strftime(m_buf, LEN_OF_DATE, "%D", localtime(&m_timeIn));
return m_buf;
}
And here is a function that uses this pointer:
// ==== TDataSet::writeOut ====================================================
// writes an entry to the output file
void TDataSet::writeOut(int index, FILE* outFile)
{
// because of the m_buf kludge, this cannot be a single
// call to fprintf
fprintf(outFile, "%s,", m_data[index].getUserID());
fprintf(outFile, "%s,", m_data[index].getDate());
fprintf(outFile, "%s,", m_data[index].getTimeIn());
fprintf(outFile, "%s\n", m_data[index].getTimeOut());
fflush(outFile);
} // end of TDataSet::writeOut
How much trouble will this cause? Or to look at it from another angle, what other sorts of interesting and !!FUN!! behaviour can this cause? And, finally, what can be done to fix it (besides the obvious solution of using strings/vectors instead)?
Somewhat related: How do the C++ library functions that do similar things handle this? e.g. localtime() returns a pointer to a struct tm object, which somehow survives the end of that function call at least long enough to be used by strftime.
There is not enough information to determine if it will cause trouble because you do not show how you use it. As long as you document the caveats and keep them in mind when using your class, there won't be issues.
There are some common gotchas to watch out for, but hopefully these are common sense:
Deleting the UsageEntry will invalidate the pointers returned by your getters, since those buffers will be deleted too. (This is especially easy to run into if using locally declared UsageEntrys, as in MadScienceDream's example.) If this is a risk, callers should create their own copy of the string. Document this.
It does not look like m_timeIn is const, and therefore it may change. Calling the getter will modify the internal buffer and these changes will be visible to anything that has that pointer. If this is a risk, callers should create their own copy of the string. Document this.
Your getters are neither reentrant nor thread-safe. Document this.
It would be safer to have the caller supply a destination buffer and length as a parameter. The function can return a pointer to that buffer for convenience. This is how e.g. read works.
A strong API can avoid issues. Failing that, good documentation and common sense can also reduce the chance of issues. Behavior is only unexpected if nobody expects it, which is why documenting the behavior is important: it generally eliminates unexpected behavior.
Think of it like the "CAUTION: HOT SURFACE" warning on top of a toaster oven. You could design the toaster oven with insulation on top so that an accident can't happen. Failing that, the least you can do is put a warning label on it and there probably won't be an accident. If there's neither insulation nor a warning, eventually somebody will burn themselves.
Now that you've edited your question to show some documentation in the header, many of the initial risks have been reduced. This was a good change to make.
Here is an example of how your usage would change if user-supplied buffers were used (and a pointer to that buffer returned):
// ==== TDataSet::writeOut ====================================================
// writes an entry to the output file
void TDataSet::writeOut(int index, FILE* outFile)
{
char userId[LEN_MAX], date[LEN_MAX], timeIn[LEN_MAX], timeOut[LEN_MAX];
fprintf(outFile, "%s,%s,%s,%s\n",
m_data[index].getUserID(userId, sizeof(userId)),
m_data[index].getDate(date, sizeof(date)),
m_data[index].getTimeIn(timeIn, sizeof(timeIn)),
m_data[index].getTimeOut(timeOut, sizeof(timeOut))
);
fflush(outFile);
} // end of TDataSet::writeOut
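For completeness, here is a sketch of what one such getter might look like. It is a simplified, hypothetical version of the class from the question: only getDate is shown, the timestamp is passed to the constructor, and std::gmtime replaces localtime so that the result does not depend on the time zone. "%m/%d/%y" is the expansion of the "%D" format used in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <ctime>

// Simplified, hypothetical version of the question's class.
class UsageEntry {
public:
    explicit UsageEntry(std::time_t timeIn) : m_timeIn(timeIn) {}

    // Formats the date into the caller's buffer and returns that buffer.
    // If the buffer is too small, an empty string is stored instead.
    const char* getDate(char* buf, std::size_t len) const {
        if (len == 0) return buf;                       // nothing can be written
        const std::tm* t = std::gmtime(&m_timeIn);      // library-owned static storage
        if (t == nullptr || std::strftime(buf, len, "%m/%d/%y", t) == 0) {
            buf[0] = '\0';
        }
        return buf;
    }
private:
    std::time_t m_timeIn;
};

// Usage: two results can now be held at the same time.
void getterDemo() {
    char first[16], second[16];
    UsageEntry a(0), b(86400);                          // epoch and one day later
    a.getDate(first, sizeof first);
    b.getDate(second, sizeof second);
    assert(std::strcmp(first, second) != 0);            // neither overwrote the other
}
```

The class no longer needs m_buf at all, and the getter can be const. It is still not thread-safe, because std::gmtime itself uses shared static storage.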
How much trouble will this cause? Or to look at it from another angle,
what other sorts of interesting and !!FUN!! behaviour can this cause?
And, finally, what can be done to fix it (besides the obvious solution
of using strings/vectors instead)?
Well there is nothing very FUN here, it just means that the results of your getter cannot outlive the corresponding instance of UsageEntry or you have a dangling pointer.
How do the C++ library functions that do similar things handle this?
e.g. localtime() returns a pointer to a struct tm object, which
somehow survives the end of that function call at least long enough to
be used by strftime.
The documentation of localtime says:
Return value
pointer to a static internal std::tm object on success, or NULL otherwise. The structure may be shared between
std::gmtime, std::localtime, and std::ctime, and may be overwritten on
each invocation.
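Because that static object may be overwritten by the next call, a caller who needs the value later can copy the struct. A minimal sketch, using std::gmtime rather than localtime so the result does not depend on the local time zone; utcSnapshot is a hypothetical helper name:

```cpp
#include <cassert>
#include <ctime>

// Copies the shared static std::tm by value, so a later call to
// std::gmtime or std::localtime cannot change the copy.
std::tm utcSnapshot(std::time_t t) {
    const std::tm* shared = std::gmtime(&t);   // points at library-owned storage
    assert(shared != nullptr);
    return *shared;                            // plain struct copy
}
```

Two snapshots taken one after the other stay independent, whereas two raw pointers from std::gmtime may refer to the same object.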
The main problem here, as the main problem with most pointer based code, is the issue of ownership. The problem is the following:
const char* val;
{
UsageEntry ue;
val = ue.getDate();
}//ue goes out of scope
std::cout << val << std::endl;//SEGFAULT (maybe, really nasal demons)
Because val is actually owned by ue, you shoot yourself in the foot if they exist in different scopes. You COULD document this, but it is oh-so-much simpler to pass the buffer in as an argument (just like the strftime function does).
(Thanks to odedsh below for pointing this one out)
Another issue is that subsequent calls will blow away the info gained. The example odedsh used was
fprintf(outFile, "%s\n%s",ue.getUserID(), ue.getDate());
but the problem is more pervasive:
const char* id = ue.getUserID();
const char* date = ue.getDate();//Changes id!
This violates the "Principle of Least Astonishment" because... well, it's weird.
This design also breaks the rule of thumb that each class should do exactly one thing. In this case, UsageEntry both provides accessors to get the formatted time as a string AND manages that string's buffer.

Why does this work: returning C string literal from std::string function and calling c_str()

We recently had a lecture in college where our professor told us about different things to be careful about when programming in different languages.
The following is an example in C++:
std::string myFunction()
{
return "it's me!!";
}
int main(int argc, const char * argv[])
{
const char* tempString = myFunction().c_str();
char myNewString[100] = "Who is it?? - ";
strcat(myNewString, tempString);
printf("The string: %s", myNewString);
return 0;
}
The idea why this would fail is that return "it's me!!" implicitly calls the std::string constructor with a char[]. This string gets returned from the function and the function c_str() returns a pointer to the data from the std::string.
As the string returned from the function is not referenced anywhere, it should be deallocated immediately. That was the theory.
However, letting this code run works without problems.
Would be curious to hear what you think.
Thanks!
Your analysis is correct. What you have is undefined behaviour. This means pretty much anything can happen. It seems in your case the memory used for the string, although de-allocated, still holds the original contents when you access it. This often happens because the OS does not clear out de-allocated memory. It just marks it as available for future use. This is not something the C++ language has to deal with: it is really an OS implementation detail. As far as C++ is concerned, the catch-all "undefined behaviour" applies.
I guess deallocation does not imply memory clean-up or zeroing. And obviously this could lead to a segfault in other circumstances.
I think the reason is that the stack memory has not been rewritten, so it can still get the original data. I created a test function and called it before the strcat.
std::string myFunction()
{
return "it's me!!";
}
void test()
{
std::string str = "this is my class";
std::string hi = "hahahahahaha";
return;
}
int main(int argc, const char * argv[])
{
const char* tempString = myFunction().c_str();
test();
char myNewString[100] = "Who is it?? - ";
strcat(myNewString, tempString);
printf("The string: %s\n", myNewString);
return 0;
}
And get the result:
The string: Who is it?? - hahahahahaha
This proved my idea.
As others have mentioned, according to the C++ standard this is undefined behavior.
The reason why this "works" is because the memory has been given back to the heap manager which holds on to it for later reuse. The memory has not been given back to the OS and thus still belongs to the process. That's why accessing freed memory does not cause a segmentation fault. The problem remains however that now two parts of your program (your code and the heap manager or new owner) are accessing memory that they think uniquely belongs to them. This will destroy things sooner or later.
The fact that the string is deallocated does not necessarily mean that the memory is no longer accessible. As long as you do nothing that could overwrite it, the memory is still usable.
As said above, it's undefined behaviour. It doesn't work for me (in Debug configuration).
The std::string destructor is called immediately after the assignment to tempString, when the full expression using the temporary string object finishes.
That leaves tempString pointing to released memory (which in your case still happens to contain the "it's me!!" characters).
You cannot conclude there are no problems from a result you got by coincidence.
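The lifetime rule itself can be observed without touching freed memory. In this sketch, Tracer is a hypothetical helper type that records its own destruction in an external flag; it is not related to std::string, it only makes the moment of destruction visible.

```cpp
#include <cassert>

// Hypothetical helper: sets an external flag when it is destroyed.
class Tracer {
public:
    explicit Tracer(bool& destroyed) : m_destroyed(destroyed) { m_destroyed = false; }
    ~Tracer() { m_destroyed = true; }
    bool alive() const { return !m_destroyed; }
private:
    bool& m_destroyed;
};

// Returns true if the temporary was already destroyed when the
// statement after its full expression runs.
bool temporaryDiesAtEndOfExpression() {
    bool destroyed = false;
    // The temporary Tracer exists only until the semicolon below.
    const bool aliveDuring = Tracer(destroyed).alive();
    assert(aliveDuring);      // it was alive while alive() ran
    return destroyed;         // and is gone by the next statement
}
```

The temporary std::string returned by myFunction() follows the same rule, so the pointer from c_str() is already dangling on the line after the assignment.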
There are other means to detect 'problems' :
Static analysis.
Valgrind would catch the error, showing you both the offending action (strcat trying to read from the freed block) and the deallocation that freed it.
Invalid read of size 1
at 0x40265BD: strcat (mc_replace_strmem.c:262)
by 0x80A5BDB: main() (valgrind_sample_for_so.cpp:20)
[...]
Address 0x5be236d is 13 bytes inside a block of size 55 free'd
at 0x4024B46: operator delete(void*) (vg_replace_malloc.c:480)
by 0x563E6BC: std::string::_Rep::_M_destroy(std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.13)
by 0x80A5C18: main() (basic_string.h:236)
[...]
The one true way would be to prove the program correct. But that is really hard for procedural languages, and C++ makes it harder.
Actually, string literals have static storage duration. They are packed inside the executable itself. They are not on the stack, nor dynamically allocated. In the usual case, it is correct that this would be pointing to invalid memory and be undefined behavior, however for strings, the memory is in static storage, so it will always be valid.
Unless I'm missing something, I think this is an issue of scope. myFunction() returns a std::string. The string object is not directly assigned to a variable. But it remains in scope until the end of main(). So, tempString will point to perfectly valid and available space in memory until the end of the main() code block, at which time tempString will also fall out of scope.

how to pass LPTSTR to a function and modify its contents

I need a function that is supplied a LPTSTR and an enumerated value, constructs a string based on the value and puts it in the LPTSTR.
I've written the following function which uses an array of names indexed by an enumerated value:
bool GetWinClassName(const int &WinType, LPTSTR *className, const int bufSize)
{
bool allOk = true;
LPTSTR tempName = new TCHAR[bufSize];
_stprintf_s(tempName, bufSize, TEXT("Win%sClass"), g_WinNames[WinType]);
std::cout << (char*)tempName << std::endl;
if (FAILED(StringCchCopy(*className, (bufSize+1)*sizeof(TCHAR), tempName)))
{
allOk = false;
}
delete[] tempName;
return allOk;
}
(Originally I just had the _stprintf_s line using className instead of tempName, this has been broken up to find where the error lies.)
The above code compiles in VC2010 Express but gives an unhandled exception: "Access violation writing" to (presumably) *className when it tries to execute the StringCchCopy line.
I can get this to work by doing
className = new TCHAR[bufSize];
before calling the function (with a matching delete[] after it) but do I really need to do that each time I want to call the function?
I understand where the problem lies but not why, which is hampering my efforts to come up with a workable solution. The problem appears to me to be that I can't put something in the LPTSTR (via _stprintf_s or StringCchCopy) unless I allocate it some memory by using new TCHAR[bufSize]. I've tried assigning it an initial value of exactly the same size, but with the same results, which is leading me to think that the memory allocation actually has nothing to do with it. Is it then somehow casting my LPTSTR into a TCHAR[]? I don't see how that's possible but at this stage, I'd believe anything.
Can someone please explain what I'm doing wrong? (Or at least where my understanding is wrong.) And a probably related question is why is my std::cout only showing the first character of the string?
wstring winClassName( int const winType )
{
return wstring( L"Win" ) + g_WinNames[winType] + L"Class";
}
But I'm just completely baffled why you have that global array of names etc.: it's probably a design level error.
do I really need to do that each time I want to call the function?
An LPTSTR value is not a string object, it is simply a pointer-to-TCHAR. If you do not allocate a buffer, where do you think the characters will go? You must make sure that the className pointer argument points to a memory buffer that you can write to. Whether you allocate a new buffer each time is up to you.
As Alf implies, a better alternative is to avoid the direct use of pointers and dynamically allocated arrays altogether, and return a string object.
why is my std::cout only showing the first character of the string?
Use std::wcout instead if UNICODE is defined.
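If the caller-supplied buffer interface has to stay, a portable sketch of it could look like this. It uses plain wchar_t and std::swprintf instead of TCHAR and the Windows string functions, formatClassName is a hypothetical name, and the name parameter stands in for an entry of the question's g_WinNames array. Note that the size is a count of characters, not bytes; the "Cch" functions such as StringCchCopy expect a character count as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Writes "Win<name>Class" into a buffer owned by the caller.
// outCount is the buffer size in characters (not bytes).
// Returns false if the buffer is missing, too small, or formatting fails.
bool formatClassName(wchar_t* out, std::size_t outCount, const wchar_t* name) {
    if (out == nullptr || outCount == 0) return false;
    const int written = std::swprintf(out, outCount, L"Win%lsClass", name);
    if (written < 0) {
        out[0] = L'\0';       // leave a valid empty string on failure
        return false;
    }
    return true;
}

// Usage: the buffer lives on the caller's stack, so no new/delete is needed.
void classNameDemo() {
    wchar_t className[64];
    const bool ok = formatClassName(className, 64, L"Main");
    assert(ok && std::wstring(className) == L"WinMainClass");
}
```

The function takes the buffer pointer by value (wchar_t*, not wchar_t**), since it only writes through the pointer and never changes where it points.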

Confusing std::string::c_str() behavior in VS2010

I'm sure I've done something wrong, but for the life of me I can't figure out what! Please consider the following code:
cerr<<el.getText()<<endl;
cerr<<el.getText().c_str()<<endl;
cerr<<"---"<<endl;
const char *value = el.getText().c_str();
cerr<<"\""<<value<<"\""<<endl;
field.cdata = el.getText().c_str();
cerr<<"\""<<field.cdata<<"\""<<endl;
el is an XML element and getText returns a std::string. As expected, el.getText() and el.getText().c_str() print the same value. However, value is set to "" (that is, the empty string) when it is assigned the result of c_str(). This code had been written to set field.cdata=value, and so was clearing it out. After changing it to the supposedly identical expression that value is set from, it works fine and the final line prints the expected value.
Since el is on the stack, I thought I might have been clobbering it - but even after value is set, the underlying value in el is still correct.
My next thought was that there was some weird compiler-specific issue with assigning things to const pointers, so I wrote the following:
std::string thing = "test";
std::cout << thing << std::endl;
std::cout << thing.c_str() << std::endl;
const char* value = thing.c_str();
std::cout << value << std::endl;
As expected, I get 'test' three times.
So now I have no clue what is going on. It would seem obvious that there is something strange going on in my program that's not happening in the sample, but I don't know what it is and I'm out of ideas about how to keep looking. Can somebody enlighten me, or at least point me in the right direction?
I assume that el.getText() is returning a temporary string object. When that object is destroyed, the pointer returned by c_str() is no longer valid (keep in mind that there are other ways the pointer returned by c_str() can be invalidated, too).
The temporary object will be destroyed at the end of the full expression it's created in (which is generally at the semi-colon in your example above).
You may be able to solve your problem with something like the following:
const char *value = strdup(el.getText().c_str());
which creates a copy of the string as a raw char array in dynamically allocated memory. You then become responsible for calling free() on that pointer at some point when that data is no longer needed.
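An alternative that avoids manual free() is to keep the returned std::string in a named variable and take the pointer from that. A minimal sketch, where Element is a hypothetical stand-in for the XML element type in the question:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the XML element in the question.
struct Element {
    std::string text;
    std::string getText() const { return text; }   // returns by value
};

std::size_t lengthViaPointer(const Element& el) {
    const std::string kept = el.getText();   // named copy, lives until the end of the function
    const char* value = kept.c_str();        // valid while kept is alive and unmodified
    assert(value != nullptr);
    return std::strlen(value);
}
```

This only helps while kept is in scope. If the pointer has to outlive the function (as field.cdata might), the std::string itself should be stored alongside it.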

How to prevent copying a wild pointer string

My program crashes intermittently when it tries to copy a character array that is not terminated by a null character ('\0').
class CMenuButton {
TCHAR m_szNode[32];
CMenuButton() {
memset(m_szNode, '\0', sizeof(m_szNode));
}
};
int main() {
....
CString szTemp = ((CMenuButton*)pButton)->m_szNode; // sometime it crashes here
...
return 0;
}
I suspect that someone copied characters into the array without the terminating '\0', so it ended up like:
Stack
m_szNode $%#^&!&!&!*#*#&!(*#(!*##&#&*&##!^&*&#(*!#*((*&*SDFKJSHDF*(&(*&(()(**
Can you tell me what is happening and what I should do to prevent copying from a wild pointer? Help will be very much appreciated!
I guess I'm unable to check if the character array is NULL before copying...
I suspect that your real problem could be that pButton is a bad pointer, so check that out first.
The only way to be 100% sure that a pointer is correct, and points to a correctly sized/allocated object is to never use pointers you didn't create, and never accept/return pointers. You would use cookies, instead, and look up your pointer in some sort of cookie -> pointer lookup (such as a hash table). Basically, don't trust user input.
If you are more concerned with finding bugs, and less about 100% safety against things like buffer overrun attacks, etc. then you can take a less aggressive approach. In your function signatures, where you currently take pointers to arrays, add a size parameter. E.g.:
void someFunction(char* someString);
Becomes
void someFunction(char* someString, size_t size_of_buffer);
Also, force the termination of arrays/strings in your functions. If you hit the end, and it isn't null-terminated, truncate it.
Make it so you can provide the size of the buffer when you call these, rather than calling strlen (or equivalent) on all your arrays before you call them.
This is similar to the approach taken by the "safe string functions" that were created by Microsoft (some of which were proposed for standardization). Not sure if this is the perfect link, but you can google for additional links:
http://msdn.microsoft.com/en-us/library/ff565508(VS.85).aspx
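The size parameter and forced termination described above might be sketched like this. copyBounded is a hypothetical helper, and it works on plain char rather than TCHAR to stay portable. It cannot detect a bad pointer such as an invalid pButton; it only guards against a missing terminator and an undersized destination.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies a possibly unterminated source array into dst, reading at most
// srcSize bytes and always terminating dst. Returns the number of
// characters copied, not counting the terminator.
std::size_t copyBounded(char* dst, std::size_t dstSize,
                        const char* src, std::size_t srcSize) {
    if (dst == nullptr || dstSize == 0) return 0;
    std::size_t len = 0;
    if (src != nullptr) {
        // Look for a terminator without reading past srcSize bytes.
        const void* end = std::memchr(src, '\0', srcSize);
        len = end ? static_cast<std::size_t>(static_cast<const char*>(end) - src)
                  : srcSize;
    }
    if (len > dstSize - 1) len = dstSize - 1;   // truncate to fit
    if (len > 0) std::memcpy(dst, src, len);
    dst[len] = '\0';                            // force termination
    return len;
}

// Usage: the source has no terminator, yet the copy is a valid C string.
void copyDemo() {
    const char raw[4] = {'a', 'b', 'c', 'd'};
    char out[8];
    copyBounded(out, sizeof out, raw, sizeof raw);
    assert(std::strcmp(out, "abcd") == 0);
}
```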
There are two possibilities:
pButton doesn't point to a CMenuButton like you think it does, and the cast is causing undefined behavior.
The code that sets m_szNode is incorrect, overflowing the given size of 32 characters.
Since you haven't shown us either piece of code, it's difficult to see what's wrong. Your initialization of m_szNode looks OK.
Is there any reason that you didn't choose a CString for m_szNode?
My approach would be to make m_szNode a private member in CMenuButton, and explicitly NULL-terminate it in the mutator method.
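class CMenuButton {
private:
TCHAR m_szNode[32];
public:
// takes a pointer to the source string (the original took a single TCHAR)
void set_szNode( const TCHAR* x ) {
// copy at most 31 characters, then force termination
_tcsncpy( m_szNode, x, 31 );
m_szNode[ 31 ] = 0;
}
};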