I will use a pseudo example here, though I have noticed this behaviour in several APIs, like sqlite3 or windows.
Say a function is declared like so:
void fu(some_identifier **ppBar);
and I do this in my code:
some_identifier **ppFubar;
fu(ppFubar);
It is my understanding that this would work and indeed it does in my own functions. Yet my program crashes after a buffer overflow when I do this with some APIs.
If I do this:
some_identifier *pFubar;
fu(&pFubar);
everything is fine.
Do ppFubar and &pFubar not evaluate to the exact same thing?
EDIT:
A concrete example would be (fourth argument):
int sqlite3_prepare(
sqlite3 *db, /* Database handle */
const char *zSql, /* SQL statement, UTF-8 encoded */
int nByte, /* Maximum length of zSql in bytes. */
sqlite3_stmt **ppStmt, /* OUT: Statement handle */
const char **pzTail /* OUT: Pointer to unused portion of zSql */
);
Your understanding is wrong.
If a function takes a some_identifier ** parameter, it is probably going to dereference that parameter to reach (or store) a some_identifier * somewhere inside its body.
If you call it with an uninitialized some_identifier **ppFubar; you are giving it a pointer to garbage. If the function does anything with it (for example, dereferences it, either once or twice), you invoke undefined behaviour (most likely, a crash).
Pass a correctly initialized pointer to the function.
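A minimal sketch of the difference, assuming a hypothetical make_widget function standing in for a real API such as sqlite3_prepare (Widget and make_widget are made up for illustration). The function stores its result through the pointer it receives, so the caller has to pass the address of a real Widget * variable:

```cpp
#include <cassert>

// Hypothetical stand-in for an API object; not part of any real library.
struct Widget { int value; };

// Out-parameter style: the function writes a Widget* into *ppOut,
// so ppOut must point at an existing Widget* variable.
void make_widget(Widget **ppOut) {
    static Widget w{42};
    *ppOut = &w;                 // dereferences ppOut once
}

// The working pattern: pWidget is real storage, and its address is passed.
void correctCall() {
    Widget *pWidget = nullptr;
    make_widget(&pWidget);
    assert(pWidget != nullptr && pWidget->value == 42);
}

// The failing pattern from the question, left commented out on purpose:
//   Widget **ppBad;             // uninitialized, points nowhere in particular
//   make_widget(ppBad);         // undefined behaviour: writes through garbage
```

So ppFubar and &pFubar have the same type, but only &pFubar holds a valid address that the function can write to.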
If I pass a FILE pointer to a function, is it updated?
Can I do something like the following?
FILE* fp;
size_t read, len;
char *key;
fp=fopen((tmpDir+"/"+filename).c_str(),"r");
while((read=getline(&key,&len,fp))!=-1){
if (header_section){
processHeader(fp);
}else{
processBody(fp);
}
}
fclose(fp);
void processHeader(FILE* fp){
size_t read, len;
char *key;
while((read=getline(&key,&len,fp))!=-1){
... do header processing ...
if(strcmp(key,"end_of_header")==0){
return;
}
}
}
void processBody(FILE* fp){
size_t read, len;
char *key;
while((read=getline(&key,&len,fp))!=-1){
... process body data ...
}
}
The above code doesn't work (I get a Segmentation Fault). Is there a way to process parts of a text file in different functions according to the section of the file?
Yes, it is possible to pass a FILE * to a function. After all, various standard C I/O functions accept a FILE * argument.
However, FILE is an opaque type. Whether the FILE * points at something (e.g. a data structure) which is updated is implementation-defined. But if your code is doing things that are valid on a FILE * (e.g. passing it to C I/O functions) then that would not explain a segmentation fault.
The partial code you have supplied is not sufficient to identify the cause of your "segmentation fault". Odds are, if the program is crashing, some code in your program is exhibiting undefined behaviour. But the simple act of passing a FILE *, obtained as a return value from fopen(), as an argument to a function would not be the cause. You need to look at other code in your program.
And, in C++, you would be better off using C++ streams than C I/O functions. But, at most, that will only change the symptom. If other code is the cause of your undefined behaviour, changing the method of I/O (assuming you do it correctly) won't fix that.
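As a rough sketch of the point that sharing a FILE * between functions works: the two functions below read from the same stream, and the second one picks up where the first stopped. The sample data, the return values, and the demoSections helper are assumptions made for illustration, and fgets is swapped in for getline to stay within standard C I/O. (If you do use POSIX getline, the buffer pointer must be NULL or malloc'd and the length 0 before the first call; the question's uninitialized key is one thing worth checking.)

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Reads lines until the "end_of_header" marker and returns how many header
// lines were consumed (marker excluded). The caller sees the advanced stream
// position, because both sides use the same stream object.
int processHeader(std::FILE *fp) {
    char line[256];
    int count = 0;
    while (std::fgets(line, sizeof line, fp) != nullptr) {
        line[std::strcspn(line, "\n")] = '\0';   // strip the trailing newline
        if (std::strcmp(line, "end_of_header") == 0) break;
        ++count;
    }
    return count;
}

// Reads whatever is left in the stream and returns the number of lines.
int processBody(std::FILE *fp) {
    char line[256];
    int count = 0;
    while (std::fgets(line, sizeof line, fp) != nullptr) ++count;
    return count;
}

// Writes sample data to a temporary stream and splits the work between the
// two functions. Returns headerLines * 10 + bodyLines, or -1 on failure.
int demoSections() {
    std::FILE *fp = std::tmpfile();
    if (fp == nullptr) return -1;
    std::fputs("h1\nh2\nend_of_header\nb1\nb2\nb3\n", fp);
    std::rewind(fp);
    int header = processHeader(fp);
    int body = processBody(fp);
    std::fclose(fp);
    assert(header >= 0 && body >= 0);
    return header * 10 + body;
}
```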
Okay, so. I've been working on a class project (we haven't covered std::string and std::vector yet though obviously I know about them) to construct a time clock of sorts. The main portion of the program expects time and date values as formatted c-strings (e.g. "12:45:45", "12/12/12" etc.), and I probably could have kept things simple by storing them the same way in my basic class. But, I didn't.
Instead I did this:
class UsageEntry {
public:
....
typedef time_t TimeType;
typedef int IDType;
...
// none of these getters are thread safe
// furthermore, the char* the getters return should be used immediately
// and then discarded: its contents will be modified on the next call
// to any of these functions.
const char* getUserID();
const char* getDate();
const char* getTimeIn();
const char* getTimeOut();
private:
IDType m_id;
TimeType m_timeIn;
TimeType m_timeOut;
char m_buf[LEN_MAX];
};
And one of the getters (they all do basically the same thing):
const char* UsageEntry::getDate()
{
strftime(m_buf, LEN_OF_DATE, "%D", localtime(&m_timeIn));
return m_buf;
}
And here is a function that uses this pointer:
// ==== TDataSet::writeOut ====================================================
// writes an entry to the output file
void TDataSet::writeOut(int index, FILE* outFile)
{
// because of the m_buf kludge, this cannot be a single
// call to fprintf
fprintf(outFile, "%s,", m_data[index].getUserID());
fprintf(outFile, "%s,", m_data[index].getDate());
fprintf(outFile, "%s,", m_data[index].getTimeIn());
fprintf(outFile, "%s\n", m_data[index].getTimeOut());
fflush(outFile);
} // end of TDataSet::writeOut
How much trouble will this cause? Or to look at it from another angle, what other sorts of interesting and !!FUN!! behaviour can this cause? And, finally, what can be done to fix it (besides the obvious solution of using strings/vectors instead)?
Somewhat related: How do the C++ library functions that do similar things handle this? e.g. localtime() returns a pointer to a struct tm object, which somehow survives the end of that function call at least long enough to be used by strftime.
There is not enough information to determine if it will cause trouble because you do not show how you use it. As long as you document the caveats and keep them in mind when using your class, there won't be issues.
There are some common gotchas to watch out for, but hopefully these are common sense:
Deleting the UsageEntry will invalidate the pointers returned by your getters, since those buffers will be deleted too. (This is especially easy to run into if using locally declared UsageEntrys, as in MadScienceDream's example.) If this is a risk, callers should create their own copy of the string. Document this.
It does not look like m_timeIn is const, and therefore it may change. Calling the getter will modify the internal buffer and these changes will be visible to anything that has that pointer. If this is a risk, callers should create their own copy of the string. Document this.
Your getters are neither reentrant nor thread-safe. Document this.
It would be safer to have the caller supply a destination buffer and length as a parameter. The function can return a pointer to that buffer for convenience. This is how e.g. read works.
A strong API can avoid issues. Failing that, good documentation and common sense can also reduce the chance of issues. Behavior is only unexpected if nobody expects it, which is why documenting the behavior is important: it generally eliminates the surprise.
Think of it like the "CAUTION: HOT SURFACE" warning on top of a toaster oven. You could design the toaster oven with insulation on top so that an accident can't happen. Failing that, the least you can do is put a warning label on it and there probably won't be an accident. If there's neither insulation nor a warning, eventually somebody will burn themselves.
Now that you've edited your question to show some documentation in the header, many of the initial risks have been reduced. This was a good change to make.
Here is an example of how your usage would change if user-supplied buffers were used (and a pointer to that buffer returned):
// ==== TDataSet::writeOut ====================================================
// writes an entry to the output file
void TDataSet::writeOut(int index, FILE* outFile)
{
char userId[LEN_MAX], date[LEN_MAX], timeIn[LEN_MAX], timeOut[LEN_MAX];
fprintf(outFile, "%s,%s,%s,%s\n",
m_data[index].getUserID(userId, sizeof(userId)),
m_data[index].getDate(date, sizeof(date)),
m_data[index].getTimeIn(timeIn, sizeof(timeIn)),
m_data[index].getTimeOut(timeOut, sizeof(timeOut))
);
fflush(outFile);
} // end of TDataSet::writeOut
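For completeness, here is a sketch of what one such getter might look like. It is a trimmed-down, hypothetical UsageEntry and not the asker's real class: the constructor is invented, "%m/%d/%y" is the spelled-out equivalent of "%D", and std::gmtime replaces localtime only so that the result does not depend on the time zone.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>

// Hypothetical, trimmed-down UsageEntry: only the date getter is shown.
class UsageEntry {
public:
    explicit UsageEntry(std::time_t timeIn) : m_timeIn(timeIn) {}

    // Formats the date into the caller's buffer and returns that buffer.
    // If formatting fails (e.g. the buffer is too small) the buffer is
    // set to an empty string.
    const char* getDate(char* buf, std::size_t len) const {
        if (buf == nullptr || len == 0) return buf;
        const std::tm* t = std::gmtime(&m_timeIn);
        if (t == nullptr || std::strftime(buf, len, "%m/%d/%y", t) == 0) {
            buf[0] = '\0';
        }
        return buf;
    }

private:
    std::time_t m_timeIn;
};

// Usage: every call writes into its own buffer, so nothing is overwritten.
void exampleUsage() {
    char date[16];
    UsageEntry entry(0);   // time 0 is chosen only to keep the demo predictable
    const char* p = entry.getDate(date, sizeof date);
    assert(p == date);     // the returned pointer is the caller's buffer
}
```

Because the class no longer owns the storage, the getter can be const and the results stay valid for as long as the caller's buffers do.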
How much trouble will this cause? Or to look at it from another angle,
what other sorts of interesting and !!FUN!! behaviour can this cause?
And, finally, what can be done to fix it (besides the obvious solution
of using strings/vectors instead)?
Well, there is nothing very FUN here. It just means that the results of your getters cannot outlive the corresponding instance of UsageEntry, or you have a dangling pointer.
How do the C++ library functions that do similar things handle this?
e.g. localtime() returns a pointer to a struct tm object, which
somehow survives the end of that function call at least long enough to
be used by strftime.
The documentation of localtime says:
Return value
pointer to a static internal std::tm object on success, or NULL otherwise. The structure may be shared between
std::gmtime, std::localtime, and std::ctime, and may be overwritten on
each invocation.
The main problem here, as with most pointer-based code, is the issue of ownership. The problem is the following:
const char* val;
{
UsageEntry ue;
val = ue.getDate();
}//ue goes out of scope
std::cout << val << std::endl;//SEGFAULT (maybe, really nasal demons)
Because val is actually owned by ue, you shoot yourself in the foot if they exist in different scopes. You COULD document this, but it is oh-so-much simpler to pass the buffer in as an argument (just like the strftime function does).
(Thanks to odedsh below for pointing this one out)
Another issue is that subsequent calls will blow away the info gained. The example odedsh used was
fprintf(outFile, "%s\n%s",ue.getUserID(), ue.getDate());
but the problem is more pervasive:
const char* id = ue.getUserID();
const char* date = ue.getDate();//Changes id!
This violates the "Principle of Least Astonishment" because...well, it's weird.
This design also breaks the rule of thumb that each class should do exactly one thing. In this case, UsageEntry both provides accessors to get the formatted time as a string AND manages that string's buffer.
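To make the overwriting concrete, here is a small sketch with a hypothetical SharedBufferEntry class that mimics the single-buffer design (the hard-coded ID and date are placeholders, not the asker's data):

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical class mimicking the shared-buffer design from the question.
class SharedBufferEntry {
public:
    const char* getUserID() {
        std::snprintf(m_buf, sizeof m_buf, "%d", 1234);
        return m_buf;
    }
    const char* getDate() {
        std::snprintf(m_buf, sizeof m_buf, "%s", "12/12/12");
        return m_buf;
    }
private:
    char m_buf[32];
};

// Returns true if both getters hand back the same storage, i.e. the
// second call has silently replaced what the first pointer refers to.
bool gettersAlias() {
    SharedBufferEntry ue;
    const char* id = ue.getUserID();
    assert(std::strcmp(id, "1234") == 0);   // correct at this point
    const char* date = ue.getDate();        // overwrites what id points at
    return id == date && std::strcmp(id, "12/12/12") == 0;
}
```

Here gettersAlias() returns true: after the second call, id no longer reads as the user ID.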
Alright, so here is a really weird one. I am reading raw data into a buffer, nothing fancy, my code went like so:
typedef unsigned char Byte;
/* ... */
static Byte SerializeBuffer[2048];
/* ... */
std::streamsize readInBuffer =
data.read((char*)SerializeBuffer, sizeof(SerializeBuffer));
But I kept getting the compile error 'error: invalid cast from type ‘void *’ to type ‘std::streamsize’'. I have no idea why the compiler thought that sizeof was a void pointer. I tried casting it in several ways, but the same error kept happening. I ended up with this:
std::streamsize dummy = sizeof(SerializeBuffer);
std::streamsize readInBuffer =
data.read((char*)SerializeBuffer, reinterpret_cast<std::streamsize>(dummy));
Which pops up the following: error: invalid cast from type ‘std::streamsize’ to type ‘std::streamsize’
I am at a complete loss. Any other ideas?
Compiler: gcc 4.4.5
OS: Linux 2.6.35
edit:
Same thing on Visual Studio 2010
If data is an istream, keep in mind that the member read returns a reference to data (the stream itself), not the number of characters read.
The void * stuff is probably because the compiler, to initialize the std::streamsize variable, tries the stream's implicit conversion to void * (the one that is used when you do if(data) ...), but void * is still not convertible to std::streamsize.
By the way, the information about the number of characters read can be obtained, after the call to read, using the gcount method.
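A short sketch of the read-then-gcount pattern. The readBlock helper and the istringstream input are assumptions for illustration; any std::istream should behave the same way:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>

// Reads up to `size` bytes into `buffer` and returns how many bytes were
// actually extracted, as reported by gcount().
std::streamsize readBlock(std::istream& data, char* buffer, std::size_t size) {
    data.read(buffer, static_cast<std::streamsize>(size));  // returns the stream, not a count
    return data.gcount();
}

// Usage: a short read still reports the number of bytes that arrived.
std::streamsize exampleShortRead() {
    std::istringstream data("hello");
    char buf[2048];
    std::streamsize n = readBlock(data, buf, sizeof buf);
    assert(n == 5);   // fewer than requested; eofbit and failbit are now set
    return n;
}
```

Note that when read() hits end of file before filling the buffer, it sets failbit as well as eofbit, so check gcount() before discarding the data.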
You should check the documentation. Read returns a reference to the stream. So what's happening is:
You call read, which returns an istream&.
You try to assign that istream to a std::streamsize.
Since the compiler does not find a suitable way to do this, it tries to assign the result of the stream's operator void* to your std::streamsize.
Since you can't assign these types, an error is produced.
It must be the std::streamsize readInBuffer = data.read(... part. read doesn't return size, but the stream itself.
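If you want a call that returns the number of bytes read, readsome() does that, but it only extracts characters that are immediately available in the stream buffer, so it may return fewer than you asked for. Calling gcount() after read() is usually the more reliable option.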
I am trying to do this:
#include <atlstr.h>
CHAR Filename; // [sp+26Ch] [bp-110h]#1
char v31; // [sp+36Ch] [bp-10h]#1
int v32; // [sp+378h] [bp-4h]#1
GetModuleFileNameA(0, &Filename, 0x100u);
CString::CString(&v31, &Filename);
But I am getting the compiler error C2039:'CString': is not a member of 'ATL::CStringT'
This is a non-MFC DLL, but according to the docs you should be able to use CString functionality by including atlstr.h. How do I make it work?
Thanks
That's not how constructors are invoked in C++.
CString s(Filename); // with Filename declared as a CHAR array
Note that GetModuleFileNameA expects a pointer to an array of characters (which it fills), not a pointer to a single character. Your code therefore overflows the buffer and is likely to crash at runtime.
You have several problems in this code snippet:
1) CHAR Filename; declares a variable that is only a single character. However, GetModuleFileNameA expects to be given a pointer to an array of characters. When you pass the parameters &Filename and 0x100u, you make it think that &Filename points to memory with room for up to 256 characters. However, as written in your snippet, it's only a single character. Thus you would have a bad buffer overflow.
Filename should most likely be declared as CHAR Filename[0x100]; in this case. That would also mean you don't need to take the address of Filename when passing it to that function. So the call would then be written as GetModuleFileNameA(0, Filename, 0x100u);
2) When writing code for a constructor, you define it by writing something similar to CString::CString (using whatever your class's name is) and then filling out the function. However, when using a constructor you don't use that syntax at all. You don't call CString::CString() to create a CString object.
You would have to choose a name for the CString object, such as "FilenameStr". So in the context of your code you would write something like CString FilenameStr(Filename);
3) As implied at the end of the last point, the parameters you are trying to pass to the constructor are wrong. &v31 and &Filename would each be pointers to characters in your original code. However, as far as I know, CString does not have any constructor that takes two character pointers.
I can't even tell how v31 is supposed to be involved there, but it doesn't seem necessary at all. If you want to fill a CString with the contents of a character array, then you can just pass that array to the constructor and it will take care of everything. So, something like CString FilenameStr(Filename);
My program crashes intermittently when it tries to copy a character array which is not ended by a NULL terminator ('\0').
class CMenuButton {
TCHAR m_szNode[32];
CMenuButton() {
memset(m_szNode, '\0', sizeof(m_szNode));
}
};
int main() {
....
CString szTemp = ((CMenuButton*)pButton)->m_szNode; // sometime it crashes here
...
return 0;
}
I suspect some code copied characters into the array without a terminating '\0', so it ended up like this:
Stack
m_szNode $%#^&!&!&!*#*#&!(*#(!*##&#&*&##!^&*&#(*!#*((*&*SDFKJSHDF*(&(*&(()(**
Can you tell me what is happening and what I should do to prevent the copying of a wild pointer? Help will be very much appreciated!
I guess I'm unable to check if the character array is NULL before copying...
I suspect that your real problem could be that pButton is a bad pointer, so check that out first.
The only way to be 100% sure that a pointer is correct, and points to a correctly sized/allocated object is to never use pointers you didn't create, and never accept/return pointers. You would use cookies, instead, and look up your pointer in some sort of cookie -> pointer lookup (such as a hash table). Basically, don't trust user input.
If you are more concerned with finding bugs, and less about 100% safety against things like buffer overrun attacks, etc. then you can take a less aggressive approach. In your function signatures, where you currently take pointers to arrays, add a size parameter. E.g.:
void someFunction(char* someString);
Becomes
void someFunction(char* someString, size_t size_of_buffer);
Also, force the termination of arrays/strings in your functions. If you hit the end, and it isn't null-terminated, truncate it.
Make it so you can provide the size of the buffer when you call these, rather than calling strlen (or equivalent) on all your arrays before you call them.
This is similar to the approach taken by the "safe string functions" that were created by Microsoft (some of which were proposed for standardization). Not sure if this is the perfect link, but you can google for additional links:
http://msdn.microsoft.com/en-us/library/ff565508(VS.85).aspx
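A minimal sketch of the size-parameter idea, using plain char rather than TCHAR to keep it portable. The body and the size_t return value are assumptions added for illustration; the answer above only gives the signature:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Ensures the buffer holds a terminated string. If no '\0' is found within
// size_of_buffer bytes, the last byte is overwritten, truncating the data.
// Returns the resulting string length.
std::size_t someFunction(char* someString, std::size_t size_of_buffer) {
    if (someString == nullptr || size_of_buffer == 0) return 0;
    const void* end = std::memchr(someString, '\0', size_of_buffer);
    if (end == nullptr) {
        someString[size_of_buffer - 1] = '\0';   // force termination
        return size_of_buffer - 1;
    }
    return static_cast<std::size_t>(static_cast<const char*>(end) - someString);
}

// Usage: an unterminated 4-byte array is truncated to 3 characters.
void exampleTruncation() {
    char node[4] = {'A', 'B', 'C', 'D'};         // deliberately unterminated
    assert(someFunction(node, sizeof node) == 3);
    assert(std::strcmp(node, "ABC") == 0);
}
```

The search never reads past size_of_buffer bytes, which is the point of passing the size instead of calling strlen on an array that may not be terminated.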
There are two possibilities:
pButton doesn't point to a CMenuButton like you think it does, and the cast is causing undefined behavior.
The code that sets m_szNode is incorrect, overflowing the given size of 32 characters.
Since you haven't shown us either piece of code, it's difficult to see what's wrong. Your initialization of m_szNode looks OK.
Is there any reason that you didn't choose a CString for m_szNode?
My approach would be to make m_szNode a private member of CMenuButton and explicitly null-terminate it in the mutator method.
class CMenuButton {
private:
TCHAR m_szNode[32];
public:
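void set_szNode( const TCHAR* x ) {
// copy at most 31 characters (requires <tchar.h>), leaving room for the terminator
_tcsncpy( m_szNode, x, 31 );
m_szNode[ 31 ] = 0; // always terminate, even if x was too long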
}
};