I'm trying to do some classic C development in Visual C++ 2008 that will modify the characters of a string like so:
void ModifyString(char *input)
{
// Change first character to 'a'
*input = 'a';
}
I'm getting an unhandled exception when I try to change a character. It seems like I could do this in Visual Studio 6 or using gcc, but maybe I'm just forgetting something. Does Visual Studio somehow pass char* by value (managing memory)? If so, how do I turn this off?
You're probably passing a string literal somewhere:
ModifyString("oops"); // ERROR!
C and C++ allow an implicit conversion from a string literal to char* (in C++ the literal has type const char[N]; in C it has type char[N] but still must not be modified), and in C++ this conversion is deprecated. String literals may be placed in read-only memory (and they usually are), so if you attempt to modify one, you'll typically get an access violation (aka segmentation fault or bus error). If the compiler doesn't put string literals in read-only memory, the program may appear to work, but it is still undefined behavior.
The correct way to do this is to copy the string into a writeable buffer:
// one way:
char mystring[] = "test";
ModifyString(mystring); // ok
// another way:
char mystring[64]; // make sure this is big enough!!
strcpy(mystring, "test");
ModifyString(mystring); // ok
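The "make sure this is big enough" caveat can also be enforced in code. Here is a minimal sketch, assuming a C++11 compiler; CopyToBuffer is a hypothetical helper name, not something from the question. It uses std::snprintf so the copy is truncated instead of overflowing the buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Same function as in the question: overwrites the first character.
void ModifyString(char *input)
{
    *input = 'a';
}

// Hypothetical helper: copies src into dst, truncating if necessary, and
// always null-terminates. Returns true only when the whole string fit.
bool CopyToBuffer(char *dst, std::size_t dstSize, const char *src)
{
    assert(dst != nullptr && src != nullptr && dstSize > 0);
    const int needed = std::snprintf(dst, dstSize, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < dstSize;
}
```

A caller would copy the literal with CopyToBuffer(buffer, sizeof buffer, "test") and only then pass buffer to ModifyString; a false return value signals that the buffer was too small and the text was truncated.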
Is the input a string literal? That's probably the problem. Otherwise you'll need to post more code, as the pointer has somehow ended up pointing at a readonly location in memory.
It's impossible to answer this question without seeing how ModifyString is called. The function itself is correct, assuming its contract is to be passed a non-NULL value.
However, the call site can fail in any number of ways, for example:
Passing NULL
Passing a const char* by means of an evil cast
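Both failure modes can be made harder to hit. The sketch below is one possible approach, assuming a C++11 compiler: it states the non-NULL contract as an assertion and uses a static_assert to show that a const char* never converts to char* without an explicit cast, so keeping read-only strings in const pointers turns the bug into a compile error.

```cpp
#include <cassert>
#include <type_traits>

// The question's function with its contract written down as an assertion.
void ModifyString(char *input)
{
    assert(input != nullptr && "contract: input must be non-NULL");
    *input = 'a';
}

// const char* does not convert to char* implicitly, so a call such as
// ModifyString(someConstPointer) is rejected unless someone adds a cast.
static_assert(!std::is_convertible<const char *, char *>::value,
              "const char* must not convert to char* without a cast");
```

The assertion only fires in builds where NDEBUG is not defined, which is usually what you want for catching a programming error during development.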
I can't say exactly why this doesn't work, but the problem is in your code, not Visual Studio. For some reason, you are passing an invalid pointer to the function. It is either a null pointer, or it points to some address you don't have write access to.
If you post some more of the code (where is the function called from, and how is it called?), we may be able to point out the exact problem.
The reason it worked in GCC or VC6 is quite simply that it is undefined behavior. The C++ standard doesn't say that "this should work", or "this should cause a crash". Anything can happen if you write to memory you don't have access to. And depending on the compiler, and the system you're running the application on, the address you end up accessing will vary. By sheer luck, you hit an address that caused an access violation when compiled with VC2008. Under GCC and VC6, you weren't as lucky, and got code which appeared to work, and simply wrote to some garbage address.
I've adapted Matt Gallagher's "Testing if an arbitrary pointer is a valid object pointer" in an iOS project which uses Objective-C++. It's working fine with Objective-C objects but it always tells me that my C++-Pointers are invalid regardless of whether it works or not. Sometimes the Code crashes at the pointer. Sometimes the code works fine. But the test-method always tells me the pointer is wrong.
Does anybody know how to adapt this code to C++ classes and objects too? I could imagine that the code only works with Objective-C because of its use of "Class".
A pointer variable contains one of the following: a null pointer, a valid pointer to an object, a valid pointer to an array element or one past the last element of an array, or some invalid pointer.
If it is an invalid pointer, then any attempt to use it invokes undefined behaviour. That includes any attempt to check that it is an invalid pointer. And there you are stuck. All you can do is check whether it is a null pointer, or whether it is equal to some other valid pointer.
You should go with the Objective-C philosophy: Trying to use an invalid pointer is a programming error. You don't try to detect and handle this at runtime. You let it crash and fix the bug in your code.
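If you genuinely need to ask "is this object still there?", the question has to be answered by something that tracks ownership, not by inspecting the raw pointer. One alternative (a different technique from the pointer test in the question) is std::shared_ptr with std::weak_ptr. A minimal sketch, where Widget, ValueOr and DemoObserver are hypothetical names:

```cpp
#include <cassert>
#include <memory>

struct Widget {  // hypothetical example type
    int value = 42;
};

// Returns the widget's value, or fallback if the object no longer exists.
// lock() yields an empty shared_ptr once the last owner has released it.
int ValueOr(const std::weak_ptr<Widget> &observer, int fallback)
{
    if (std::shared_ptr<Widget> alive = observer.lock()) {
        return alive->value;
    }
    return fallback;
}

// Usage: the observer notices when the owner lets go of the object.
void DemoObserver()
{
    std::shared_ptr<Widget> owner = std::make_shared<Widget>();
    std::weak_ptr<Widget> observer = owner;
    assert(ValueOr(observer, -1) == 42);
    owner.reset();  // destroys the Widget
    assert(ValueOr(observer, -1) == -1);
}
```

This only works for objects that are owned by a shared_ptr from the start; it cannot retroactively validate an arbitrary raw pointer.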
C++ pointers simply reference an address in memory. You could look at what's there in memory using a memory viewer tool, but that wouldn't even guarantee that the memory is still valid. For example:
char* test = new char[13];
strcpy(test, "Hello World!");
delete[] test;
.
.
.
printf("%s", test);
In some cases this will print successfully. Sometimes it will print a garbage string. And sometimes it will segfault. There is nothing there to speak to the pointer's validity.
If you're looking at a program that has just segfaulted and you're trying to see what happened there are a few options available to you:
You can look at the memory through a memory viewer, that in combination with the line you faulted on can provide insight.
You can seed your memory with a recognizable pattern before running to make this clearer; use 0xBADF00D5 or something similar.
Run under Valgrind, which is a great tool if you can deal with the overhead.
The best option is to do error checking in your code. It sounds like you don't have that or you wouldn't be here. Preconditions and postconditions are great and will save you a ton of time in the long run (like now.) However as a silver lining you should exploit this to exact better coding standards in your organization for the future.
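As a rough illustration of what precondition and postcondition checks can look like, here is a minimal sketch using assert. CapitalizeFirst is a hypothetical function, and the capacity parameter is an assumption about how the caller describes its buffer:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper: upper-cases the first character of a C string in place.
void CapitalizeFirst(char *text, std::size_t capacity)
{
    // Preconditions: the pointer is non-null and the string is terminated
    // somewhere inside the first `capacity` bytes.
    assert(text != nullptr);
    assert(std::memchr(text, '\0', capacity) != nullptr);

    const std::size_t lengthBefore = std::strlen(text);
    text[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(text[0])));

    // Postcondition: the operation never changes the string's length.
    assert(std::strlen(text) == lengthBefore);
    (void)lengthBefore;  // silences "unused variable" when NDEBUG is defined
}
```

The checks cost nothing in release builds that define NDEBUG, and in debug builds they fail at the point where the contract is broken instead of somewhere downstream.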
In our C++ code, we have our own string class (for legacy reasons). It supports a method c_str() much like std::string. What I noticed is that many developers are using it incorrectly. I have reduced the problem to the following line:
const char* x = std::string("abc").c_str();
This seemingly innocent code is quite dangerous: the temporary std::string is destroyed at the end of the full expression, right after c_str() returns and the pointer has been stored. As a result, you are holding a pointer to a de-allocated memory location.
Here is another example:
std::string x("abc");
const char* y = x.substr(0,1).c_str();
Here too, we are using a pointer to de-allocated location.
These problems are not easy to find during testing as the memory location still contains valid data (although the memory location itself is invalid).
I am wondering if you have any suggestions on how I can modify class/method definition such that developers can never make such a mistake.
The modern part of the code should not deal with raw pointers like that.
Call c_str only when providing an argument to a legacy function that takes const char*. Like:
legacy_print(x.substr(0,1).c_str());
Why would you want to create a local variable of type const char*? Even if you write a copying version c_str_copy() you will just get more headache because now the client code is responsible for deleting the resulting pointer.
And if you need to keep the data around for a longer time (e.g. because you want to pass the data to multiple legacy functions) then just keep the data wrapped in a string instance the whole time.
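The two safe patterns described above might look like this. It is a sketch only: legacy_length is a hypothetical stand-in for a legacy function taking const char*, and std::string stands in for the in-house string class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a legacy API that takes const char*.
std::size_t legacy_length(const char *text)
{
    assert(text != nullptr);
    return std::strlen(text);
}

std::size_t MeasureFirstChar(const std::string &x)
{
    // Pattern 1: keep the data in a named string for as long as the pointer
    // is needed, and call c_str() only at the point of use.
    const std::string first = x.substr(0, 1);
    std::size_t total = legacy_length(first.c_str());
    total += legacy_length(first.c_str());

    // Pattern 2: pass the temporary's c_str() directly as an argument. The
    // temporary lives until the end of the full expression, so the pointer
    // is valid for the duration of the call.
    total += legacy_length(x.substr(0, 1).c_str());
    return total;
}
```

What stays off the table is storing the result of c_str() in a const char* variable that outlives the string it came from.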
For the basic case, you can add a ref qualifier on the "this" object, to make sure that .c_str() is never immediately called on a temporary. Of course, this can't stop them from storing in a variable that leaves scope before the pointer does.
const char *c_str() & { return ...; }
But the bigger-picture solution is to replace every function in your codebase that takes a "const char *" with one that takes one of your string classes (at the very least, you need two: an owning string and a borrowed slice), and to make sure that none of your string classes can be implicitly constructed from a "const char *".
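A slightly fuller sketch of the ref-qualifier idea, assuming C++11 and a hypothetical MyString standing in for the in-house class. It differs from the one-liner above in two ways: the lvalue overload is const-qualified so it also works on const strings, and the rvalue overload is explicitly deleted, because a plain "const &" overload would otherwise still accept temporaries.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the in-house string class.
class MyString {
public:
    explicit MyString(const char *text)
    {
        assert(text != nullptr);
        data_ = text;
    }

    // Allowed on lvalues, const or not.
    const char *c_str() const & { return data_.c_str(); }
    // Rejected at compile time on temporaries (const or not).
    const char *c_str() const && = delete;

private:
    std::string data_;
};

// Detects whether `std::declval<T>().c_str()` would compile.
template <typename T, typename = void>
struct CanCallCStr : std::false_type {};
template <typename T>
struct CanCallCStr<T, decltype(void(std::declval<T>().c_str()))> : std::true_type {};

static_assert(CanCallCStr<MyString &>::value, "lvalue call is allowed");
static_assert(CanCallCStr<const MyString &>::value, "const lvalue call is allowed");
static_assert(!CanCallCStr<MyString>::value, "call on a temporary is rejected");
```

The trade-off is that this also rejects the harmless case of passing a temporary's c_str() straight into a function call, so callers have to name the string first.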
The simplest solution would be to change your destructor to write a null at the beginning of the string at destruction time. (Alternatively, fill the entire string with an error message or 0's; you can have a flag to disable this for release code.)
While it doesn't directly prevent programmers from making the mistake of using invalid pointers, it will definitely draw attention to the problem when the code doesn't do what it should do. This should help you flush out the problem in your code.
(As you mentioned, at the moment the errors go unnoticed because for the most part the code will happily run with the invalid memory.)
Consider using Valgrind or Electric Fence to test your code. Either of these tools should trivially and immediately find these errors.
I am not sure that there is much you can do about people using your library incorrectly beyond warning them about it. Consider the actual STL string library. If I do this:
const char * lala = std::string("lala").c_str();
std::cout << lala << std::endl;
const char * lala2 = std::string("lalb").c_str();
std::cout << lala << std::endl;
std::cout << lala2 << std::endl;
I am basically invoking undefined behavior. When I run it on ideone.com I get the following output:
lala
lalb
lalb
So clearly the memory of the original lala has been overwritten. I would just make it very clear to the user in the documentation that this sort of coding is bad practice.
You could remove the c_str() function and instead provide a function that accepts a reference to an already created empty smart pointer that resets the value of the smart pointer to a new copy of the string. This would force the user to create a non temporary object which they could then use to get the raw c string and it would be destructed and free the memory when exiting the method scope.
This assumes though that your library and its users would be sharing the same heap.
EDIT
Even better, create your own smart pointer class for this purpose whose destructor calls a library function in your library to free the memory so it can be used across DLL boundaries.
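Rather than writing a smart pointer class from scratch, one way to get this effect is std::unique_ptr with a custom deleter that calls back into the library. In the sketch below, lib_copy_string and lib_free_string are hypothetical library exports; the point is only that allocation and release both happen on the library's side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical library-side functions: the copy is allocated and released
// inside the library, so the caller's heap is never involved.
char *lib_copy_string(const char *source)
{
    assert(source != nullptr);
    const std::size_t size = std::strlen(source) + 1;
    char *copy = new char[size];
    std::memcpy(copy, source, size);
    return copy;
}

void lib_free_string(char *text)
{
    delete[] text;
}

// Owning handle whose deleter is the library's own free function.
using LibStringPtr = std::unique_ptr<char, void (*)(char *)>;

LibStringPtr MakeLibString(const char *source)
{
    return LibStringPtr(lib_copy_string(source), &lib_free_string);
}
```

The caller holds a LibStringPtr in a local variable, passes get() to legacy functions, and the copy is freed by the library when the handle goes out of scope.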
I have some code I wrote a few years ago. It has been working fine, but after a recent rebuild with some new, unrelated code elsewhere, it is no longer working. This is the code:
//myobject.h
...
inline CMapStringToOb* GetMap(void) {return (m_lpcMap);};
...
The above is accessed from the main app like so:
//otherclass.cpp
...
CMapStringToOb* lpcMap = static_cast<CMyObject*>(m_lpcBaseClass)->GetMap();
...
Like I said, this WAS working for a long time, but it's just decided to start failing as of our most recent build. I have debugged into this, and I am able to see that, in the code where the pointer is set, it is correctly setting the memory address to an actual value. I have even been able to step into the set function, write down the memory address, then move to this function, let it get 0xfdfdfdfd, and then manually get the memory address in the debugger. This causes the code to work. Now, from what I've read, 0xfdfdfdfd means guarding bytes or "no man's land", but I don't really understand what the implications of that are. Supposedly it also means an off by one error, but I don't understand how that could happen, if the code was working before.
I'm assuming from the Hungarian notation that you're using Visual Studio. Since you do know the address that holds the map pointer, start your program in the debugger and set a data breakpoint when that map pointer changes (the memory holding the map pointer, not the map pointed to). Then you'll find out exactly when it's getting overwritten.
0xfdfdfdfd typically implies that you have accessed memory that you weren't supposed to.
There is a good chance the memory was allocated and subsequently freed. So you're using freed memory.
static_cast can modify a pointer and you have an explicit cast to CMyObject and an implicit cast to CMapStringToOb. Check the validity of the pointer directly returned from GetMap().
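To see what "static_cast can modify a pointer" means in practice, here is a small sketch with hypothetical types (First, Second, Derived) using multiple inheritance. On mainstream ABIs the Second subobject sits at a non-zero offset inside Derived, so converting between the two pointer types changes the address. The layout is not mandated by the standard, so treat the returned value as typical rather than guaranteed.

```cpp
#include <cassert>

struct First  { int a = 1; virtual ~First() = default; };
struct Second { int b = 2; virtual ~Second() = default; };
struct Derived : First, Second { int c = 3; };

// Returns true when converting Second* -> Derived* changed the address.
bool CastChangesAddress()
{
    Derived object;
    Second *asSecond = &object;                        // implicit upcast adjusts the address
    Derived *back = static_cast<Derived *>(asSecond);  // downcast adjusts it back
    assert(back == &object);                           // round trip is exact
    return static_cast<void *>(asSecond) != static_cast<void *>(back);
}
```

The adjustment is only correct when the base pointer really points into an object of the target type. If m_lpcBaseClass does not actually refer to a CMyObject, the static_cast still compiles and the result is garbage.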
Scenarios where "magic" happens almost always come back to memory corruption. I suspect that somewhere else in your code you've modified memory incorrectly, and it's resulting in this peculiar behavior. Try testing some different ways of entering this part of the code. Is the behavior consistent?
This could also be caused by an incorrectly built binary. Try cleaning and rebuilding your project.
As far as I know the following code is bad. But, Visual Studio 2010 doesn't give me any warning.
char* CEmployee::GetEmployeeName()
{
char* szEmployeeName = "";
CEmployeeModel* model = GetSwitchMod();
if (model != NULL)
{
szEmployeeName = model->GetName();
}
return szEmployeeName;
}
It's not the compiler's job to debug your code.
lint or a similar static checker might find this. Try running Code Analysis if you have one of the premium VS versions that includes it. Make sure you build with /W4 and fix all warnings.
You're not returning a reference to a local variable, as you're returning by value, so the local variable — the pointer — is copied.
Don't confuse the pointer with its pointee.
If anything, you'd be returning a dangling pointer (though in practice the string literal buffer is likely to be in static memory somewhere). Dangling pointers don't tend to be diagnosed at compile-time.
If model->GetName() returns a dynamically-allocated buffer, making the pointer no longer point to the string literal, then your code is fine.
TRWTF is that you didn't write char const* szEmployeeName = "". Leaving out the const has been deprecated for over a decade, and is illegal in C++0x. It's a concern that so many people are still doing this.
It's even worse that there are still people using char* for strings, instead of std::string.
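For comparison, a std::string version might look like the sketch below. CEmployeeModel here is a hypothetical stand-in (the real class is not shown in the question), and GetEmployeeName is written as a free function taking the model to keep the example self-contained.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the question's model class.
class CEmployeeModel {
public:
    explicit CEmployeeModel(const std::string &name) : m_name(name) {}
    const std::string &GetName() const { return m_name; }

private:
    std::string m_name;
};

// Returns a copy, so the result stays valid however long the model lives.
std::string GetEmployeeName(const CEmployeeModel *model)
{
    if (model != nullptr) {
        return model->GetName();
    }
    return std::string();  // empty name when there is no model
}

// Usage: no pointers to track, no const-correctness trap.
void ExampleUsage()
{
    const CEmployeeModel model("Ada");
    assert(GetEmployeeName(&model) == "Ada");
    assert(GetEmployeeName(nullptr).empty());
}
```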
Returning szEmployeeName here is actually not an error - the string is allocated statically in read-only memory (the .rodata section in ELF executables). Quoting the (C++03) Standard:
2.13.4.1
An ordinary string literal has type “array of n const char” and static storage duration (3.7), where n is the size of the string as defined below, and is initialized with the given characters.
3.7.1
All objects which neither have dynamic storage duration nor are local have static storage duration. The storage for these objects shall last for the duration of the program
On the other hand, trying to modify this string results in undefined behaviour - in this particular case, you'll most likely get a crash at runtime. szEmployeeName should be really declared as const char* (and there are historical reasons why the standard allows initializing a plain char * with a string literal). Again, quoting the Standard:
2.13.4.2
The effect of attempting to modify a string literal is undefined.
You're returning a pointer to a char at the end. Are you sure the memory that the pointer refers to is still alive when the code leaves the function*? (What is the lifetime of the buffer returned by model->GetName()?)
*EDIT: "loop" is wrong.
This code isn't necessarily "wrong" in all cases. If the thing pointed to by the pointer returned from GetName is still alive, and the pointer returned from GetEmployeeName is not written to then the code appears to be well-formed. The compiler can't reasonably be expected to do a full analysis of all your code to tell you if there's an actual problem with your pointer manipulation.
You should be using std::string as @Tomalak Geret'kal noted in his answer. That then resolves all these lifetime issues.
There's a certain point at which you should be able to say "Why am I writing code this way???" and the compiler isn't going to go to extraordinary lengths to warn you about every possible undefined behavior in your program (it's undefined for a reason).
This code is fine. There's nothing going on here that could possibly cause the target of szEmployeeName to be freed.
If model is NULL, then you return a pointer to "". Using a non-const pointer certainly is questionable, but the string literal "" survives for the lifetime of your program, it's not an error to return it.
If model is non-null, you return the pointer returned by model->GetName(). Since CEmployee::GetEmployeeName() doesn't free any memory, the pointer is just as valid when returned as it was when you got it from model->GetName(). Specifically, either the pointer is valid, or it is a dangling pointer, indicating a bug in CEmployeeModel::GetName().
There are no circumstances where CEmployeeModel::GetName() is correct but CEmployee::GetEmployeeName returns a bad pointer.
Possible Duplicate:
Why does simple C code receive segmentation fault?
Why doesn't code snippet 2 behave like snippet 1?
//Code snippet 1
char pstr[] = "helloworld";
char *p = pstr;
p[2] = 'd';
//Code snippet 2
char *p = "helloworld";
p[2] = 'd'; //error: access violation
P.S Forgive my ignorance.
The first snippet creates an array of char and initializes its contents to "helloworld". Then you change its third element.
The second one simply creates a pointer to char that points to a string literal. Then you attempt to change the third character of that literal. String literals are not writable in code produced by many modern compilers.
EDIT:
GCC used to have an -fwritable-strings option that enabled string literals to be writable, since there is legacy code around that depends on this behaviour. That option was removed in the GCC 4.0 release series.
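The difference between the two snippets can be put into one function. This is a sketch with a hypothetical function name; the line that would write to the literal is left commented out, because with a const pointer it does not compile.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Modifies a private copy of the literal and returns it; the literal itself
// is never written to.
std::string ModifiedCopy()
{
    char pstr[] = "helloworld";  // array: an 11-byte writable copy of the literal
    char *p = pstr;
    p[2] = 'd';                  // fine, changes the third character of the copy

    const char *literal = "helloworld";  // pointer to the literal itself
    // literal[2] = 'd';         // would not compile: the pointee is const
    assert(std::strcmp(literal, "helloworld") == 0);

    return std::string(pstr);
}
```

Declaring the pointer as const char* is what moves the failure from an access violation at run time to an error at compile time.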
"helloworld" is an array of const char. There's a hole in the type system which allows you to point to it with a char*, because a lot of code exists which uses a char * to point to readonly data and this is safe.
But const_cast rules apply: you can't actually write to the const data even if you make a non-const pointer to it.
It would help if you could tell us in what way they're behaving differently.
But as a guess, I think your problem is that the second form has 'p' pointing to a string in read-only memory. Attempts to write through the pointer 'p' would result in a program failure.
I can tell you that the GNU C++ compiler will warn you about this.
I'm guessing when you say "doesn't behave like" you mean one throws an illegal access exception (or something similar) while the other gives a compile time warning or error?
The answer is that in the first case, you're creating an array in your own memory and copying the contents into it. At that point the compiler forgets the data came from static memory; the run time system, however, doesn't forget.
In the other case, the compiler "knows" p is a pointer to static memory, and so has the chance to say "whoa, dude, can't do that".
But this is a bit of a guess without knowing exactly what it does differently. It's also going to be compiler and implementation dependent.