Why did this work with Visual C++, but not with gcc? - c++

I've been working on a senior project for the last several months now, and a major sticking point in our team's development process has been dealing wtih rifts between Visual-C++ and gcc. (Yes, I know we all should have had the same development environment.) Things are about finished up at this point, but I ran into a moderate bug just today that had me wondering whether Visual-C++ is easier on newbies (like me) by design.
In one of my headers, there is a function that relies on strtok to chop up a string, do some comparisons and return a string with a similar format. It works a little something like the following:
int main()
{
string a, b, c;
//Do stuff with a and b.
c = get_string(a,b);
}
string get_string(string a, string b)
{
const char * a_ch, b_ch;
a_ch = strtok(a.c_str(),",");
b_ch = strtok(b.c_str(),",");
}
strtok is infamous for being great at tokenizing, but equally great at destroying the original string to be tokenized. Thus, when I compiled this with gcc and tried to do anything with a or b, I got unexpected behavior, since the separator used was completely removed in the string. Here's an example in case I'm unclear; if I set a = "Jim,Bob,Mary" and b="Grace,Soo,Hyun", they would be defined as a="JimBobMary" and b="GraceSooHyun" instead of staying the same like I wanted.
However, when I compiled this under Visual C++, I got back the original strings and the program executed fine.
I tried dynamically allocating memory to the strings and copying them the "standard" way, but the only way that worked was using malloc() and free(), which I hear is discouraged in C++. While I'm curious about that, the real question I have is this: Why did the program work when compiled in VC++, but not with gcc?
(This is one of many conflicts that I experienced while trying to make the code cross-platform.)
Thanks in advance!
-Carlos Nunez

This is an example of undefined behavior. You're passing the result of string::c_str(), a const char*, to strtok, which takes a char*. By modifying the contents of the std::string data, you're invoking undefined behavior (you should be getting warnings for this unless you're casting).
When are you checking the value of a and b? In get_string, or in main? get_string is passed copies of a and b, so strtok will most likely not alter the originals in main. However, it could, as you are invoking undefined behavior.
The "right way" to do this is to use malloc/free or new[]/delete[]. You're using a C function, so you're already guilty of the same crime as you would be using malloc/free. A relatively elegant yet safe way to approach this is:
char *ap = strdup(a.c_str());
const char *a_ch = strtok(ap, ",");
/* do whatever it is you do */
free(ap);
Also bear in mind that strtok uses global state, so it won't play well with threads.

Tokens will be automatically replaced by a null-character by function strtok. That is not what you can do with constant data.
To make your code safe and cross-platform consider using boost::tokenizer.

I think the code is working because of differences in string implementation. VC++ string implementation must be making copies when you pass them to a function that could potentially modify the string.

Related

Are the methods in the <cstring> applicable for string class too?

I've tried out using memcpy() method to strings but was getting a "no matching function call" although it works perfectly when I use an array of char[].
Can someone explain why?
www.cplusplus.com/reference/cstring/memcpy/
std::string is an object, not a contiguous array of bytes (which is what memcpy expects). std::string is not char*; std::string contains char* (somewhere really deep).
Although you can pull out the std::string inner byte array by using &str[0] (see note), I strongly encourage you not to. Almost anything you need to do already is implemented as a std::string method. Including appending, subtracting, transforming and anything that makes sense with a text object.
So yes, you can do something as stupid as:
std::string str (100,0);
memcpy(&str[0],"hello world", 11);
but you shouldn't.
Even if you do need memcpy behaviuor, try to use std::copy instead.
Note: this is often done with C functions that expects some buffer, while the developer wants to maintain a RAII style in his code. So he or she produces std::string object but passes it as C string. But if you do clean C++ code you don't need to.
Because there's no matching function call. You're trying to use C library functions with C++ types.

Pass CString to fprintf

I have ran the code analyzer in visual studio on a large code base and i got about a billion of this error:
warning C6284: Object passed as parameter '3' when string is required in call to 'fprintf'
According to http://msdn.microsoft.com/en-us/library/ta308ywy.aspx "This defect might produce incorrect output or crashes." My colleague however states that we can just ignore all these errors without any problems. So one of my questions is do we need to do anything about this or can we just leave it as is?
If these errors need to be solved what is the nicest approach to solve it?
Would it work to do like this:
static_cast<const char*>(someCString)
Is there a better or more correct approach for this?
The following lines generate this warning:
CString str;
fprintf(pFile, "text %s", str);
I'm assuming that you're passing a Microsoft "CString" object to a printf()-family function where the corresponding format specifier is %s. If I'm right, then your answer is here: How can CString be passed to format string %s? (in short, your code is OK).
It seems that originally an implementation detail allowed CString to be passed directly to printf(), and later it was made part of the contract. So you're good to go as far as your program being correct, but if you want to avoid the static analysis warning, you may indeed need to use the static_cast to a char pointer. I'm not sure it's worth it here...maybe there's some other way to make these tools place nice together, since they're all from Microsoft.
Following the MSDN suggestions in C6284, you may cast the warnings away. Using C++ casts will be the most maintainable option to do this. Your example above would change to
fprintf(pFile, "text %s", static_cast<const TCHAR*>(str));
or, just another spelling of the same, to
fprintf(pFile, "text %s", static_cast<LPCTSTR>(str));
The most convincing option (100% cast-free, see Edits section) is
fprintf(pFile, "text %s", str.GetString());
Of course, following any of these change patterns will be a first porting step, and if nothing indicates a need for it, this may be harmful (not only for your team atmosphere).
Edits: (according to the comment of xMRi)
1) I added const because the argument is read-only for fprintf
2) notes to the cast-free solution CSimpleStringT::GetString: the CSimpleStringT class template is used for the definition of CStringT which again is used to typedef the class CString used in the original question
3) reworked answer to remove noise.
4) reduced the intro about the casting option
Technically speaking it is ok because the c-string is stored in such a way in CString that you can use it as stated but it is not good rely on how CString is implemented to do a shortcut. printf is a C-runtime function and knows nothing about C++ objects but here one is relying on an that the string is stored first in the CString - an implementation detail.
If I recall correctly originally CString could not be used that way and one had to cast the CString to a c-string to print it out but in later versions MS changed the implementation to allow for it to be treated as a c-string.
Another breaking issue is UNICODE, it will definitely not work if you one day decide to compile the program with UNICODE character set since even if you changed all string formatters to %ld, embedded 0s will sometimes prevent the string from being printed.
The actual problem is rather why are you using printf instead of C++ to print/write files?

VC++2010 seems to only allocate a single std::string in an fixed-array

This seems simple to me but I'm having trouble actually finding anything explicitly stating this.
Does std::string stringArray[3] not create an array of std::string objects, the way that SomeType typeArray[3] would? The actual number is irrelevant; I just picked 3 arbitrarily.
In the Visual Studio 2010 debugger, it appears to create a single string as opposed to an array of strings. Why?
Wild guess: is it calling the default std::string constructor, and then invoking an unused access of index 3? If so, why doesn't that cause an out of bounds exception on an empty string?
Does this have something to do with overloading the [] operator?
There are plenty of ways to code the same things without specifically using an array of std::string without issue, but what is the explanation/justification for this behavior? It seems counterintuitive to me.
Edit: I found this thread std::string Array Element Access in which the comments on the answer appear to observe the same behavior.
Visual studio's debugger will often show you only the first thing an array or pointer points to. You will have to use stringArray[1] stringArray[2] etc to see farther than that.
The OP is NOT losing his mind. But Visual Studio's integrated debugger certainly is. I believe this is a bust in the detection of operator [] for std::string, which is sad However, it does work correctly for std::vector<std::string>
An excellent reference to how you can get around this using the watch-window (no way to do it using the Auto or Locals window afaik) can be found at this previously posted question. This is also an extremely helpful method for viewing pointer-based dynamic arrays (arrays allocated with Type *p = new Type[n];). Hint: use "varname,n" where varname is the variable (pointer or fixed array), and 'n' is the number of elements to expand.
A demonstration is in order to show a declared C-array of std::string and what the OP was observing, and a std::vector<std::string> to show what things should look like:
int main(int argc, char *argv[])
{
std::string stringArray[3];
std::vector<std::string> vecStrings(3);
return 0; // set bp *here*
}
It creates an array. Either you compiled something different or you misinterpreted the results.
It doesn't create an array in these contexts:
void function(std::string stringArray[3]) { }
This function parameter is a pointer to std::string, you can't have function parameters of array type.
extern std::string strnigArray[3];
This declares but doesn't define an array. It tells the compiler there is an array of three strings somewhere in the program call stringArray but doesn't actually create it.
Otherwise, it creates an array of three strings.
You can check that with:
assert( sizeof(stringArray) == 3*sizeof(std::string) );
or in C++11
static_assert( sizeof(stringArray) == 3*sizeof(std::string), "three strings" );
Yeah the debugger in VS 2010 doesn't do a great job here, but as others have said, it's only a display issue in the debugger, it does work as you'd expect.
In VS 2012 they've done a much better job :-

Issue with using std::copy

I am getting warning when using the std copy function.
I have a byte array that I declare.
byte *tstArray = new byte[length];
Then I have a couple other byte arrays that are declared and initialized with some hex values that i would like to use depending on some initial user input.
I have a series of if statements that I use to basically parse out the original input, and based on some string, I choose which byte array to use and in doing so copy the results to the original tstArray.
For example:
if(substr1 == "15")
{
std::cout<<"Using byte array rated 15"<<std::endl;
std::copy(ratedArray15,ratedArray15+length,tstArray);
}
The warning i get is
warning C4996: 'std::copy': Function call with parameters
that may be unsafe
- this call relies on the caller to check that the passed
values are correct.
A possible solution is to to disable this warning is by useing -D_SCL_SECURE_NO_WARNINGS, I think. Well, that is what I am researching.
But, I am not sure if this means that my code is really unsafe and I actually needed to do some checking?
C4996 means you're using a function that was marked as __declspec(deprecated). Probably using D_SCL_SECURE_NO_WARNINGS will just #ifdef out the deprecation. You could go read the header file to know for sure.
But the question is why is it deprecated? MSDN doesn't seem to say anything about it on the std::copy() page, but I may be looking at the wrong one. Typically this was done for all "unsafe string manipulation functions" during the great security push of XPSP2. Since you aren't passing the length of your destination buffer to std::copy, if you try to write too much data to it it will happily write past the end of the buffer.
To say whether or not your usage is unsafe would require us to review your entire code. Usually there is a safer version they recommend when they deprecate a function in this manner. You could just copy the strings in some other way. This article seems to go in depth. They seem to imply you should be using a std::checked_array_iterator instead of a regular OutputIterator.
Something like:
stdext::checked_array_iterator<char *> chkd_test_array(tstArray, length);
std::copy(ratedArray15, ratedArray15+length, chkd_test_array);
(If I understand your code right.)
Basically, what this warning tells you is that you have to be absolutely sure that tstArray points to an array that is large enough to hold "length" elements, as std::copy does not check that.
Well, I assume Microsoft's unilateral deprecation of the stdlib also includes passing char* to std::copy. (They've messed with a whole range of functions actually.)
I suppose parts of it has some merit (fopen() touches global ERRNO, so it's not thread-safe) but other decisions do not seem very rational. (I'd say they took a too big swathe at the whole thing. There should be levels, such as non-threadsafe, non-checkable, etc)
I'd recommend reading the MS-doc on each function if you want to know the issues about each case though, it's pretty well documented why each function has that warning, and the cause is usually different in each case.
At least it seems that VC++ 2010 RC does not emit that warning at the default warning level.

use of char * vs std::string in different environments

I have been using std::string in my code. I was going to make a std::string and pass it by reference. However, someone suggested using a char * instead. Something about std::string is not reliable when porting code. Is that true? I have avoided using char * as I would need to do some memory management for it. Instead I find using the std::string much easier to use.
Basically I have a 10 digit output that I am storing in this string. Atm, I am not sure which would be better to use.
std::string is part of the C++ Standard, and has been since 1998. It is available in all the current C++ compilers. There really is no portability reason not to use it. If you have an API that needs to use a C-style string, you can use the std::string's c_str() member to get one from a string:
std::string s = "foo";
int n = strlen( s.c_str() );
In C++, almost every string should be std::string unless another library requires a cstring, in which case you should still be using an std::string and passing string.c_str(), unless you're using functions that work with buffers.
However, if you're writing a library and exporting functions, it's better to use const char* parameters rather than std::string parameters for portability.
Using a char * you are sure that you will not get portability issues among libraries.
If a library exports a function that uses an std::string, it might have problems communicating with another library that has been linked against a different version of the standard library.
I think that there is nothing to worry about unless you are going to provide some API to 3rd party.
Just use std::string
There's nothing unportable about std::string that isn't also an issue with char *. std::string actually uses a char * internally...
string is better. There is nothing unreliable about it on any platform. If you're worried about passing large classes, you can pass const references of your strings into functions. Makes coding faster and less bug prone.
In addition to the fact thata it's easier, std::string will probably be more efficient. Its small string optimization can keep the 10 digits in the std::string object itself, instead of putting them in another memory block off the heap.