I was looking over the following code:
string toUpper(string s) {
string result;
int (*toupperp)(int) = &toupper; // toupper is overloaded
transform(begin(s), end(s), back_inserter(result), toupperp);
return result;
}
I am confused by this line:
int (*toupperp)(int) = &toupper; // toupper is overloaded
1. Why is this line necessary?
2. I believe that & retrieves a pointer to something in memory. But isn't toupper, the name of the function, already a pointer? Why can't we do this:
int (*toupperp)(int) = toupper;
3. Why does the function take an int if it's used on a string?
1) It's not strictly necessary. If a using namespace std directive is in effect, the name toupper refers to more than one overload, so you need a cast to the desired type to tell the compiler which one you want. So, you might also say
transform(begin(s), end(s), back_inserter(result), static_cast<int(*)(int)>(&toupper));
Otherwise the following should be enough:
transform(begin(s), end(s), back_inserter(result), ::toupper);
2) Identifiers that are function names decay into pointers, yes, but they aren't exactly the same thing. That being said, in this case it should be fine to say
int (*toupperp)(int) = toupper;
or even (if you haven't used a using namespace std directive):
auto toupperp = toupper;
3) It's for compatibility with the C standard library. toupper is applied to every element of s, which for a string is a char.
What you are passing to transform is a pointer to the function toupper (see function pointers). You store this pointer into the local variable toupperp. The type of toupperp is a pointer to a function taking an int as argument and returning an int.
Unless toupper is defined in strange ways, transform uses the function to change each input character to uppercase. Each character is passed as an int (with an implicit conversion where needed).
Regarding your question 2, using the & operator makes it more explicit that you are taking the address of the function, but you can indeed omit it. See here (I learnt something today).
If toupper is overloaded, using the intermediate variable is a safe way to get exactly the desired overload. If the desired overload goes away, this method will catch the problem at compile time. See here. (And that's something else I learnt today).
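A minimal sketch of the points above, assuming plain ASCII input. The function names toUpperViaPointer and toUpperViaLambda are made up for this example. The lambda variant does not come from the answers; it is a common alternative that calls toupper directly, so ordinary overload resolution applies and no pointer type has to be spelled out.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Initializing a pointer of type int(*)(int) selects the matching overload.
// Writing &std::toupper instead of std::toupper yields the same pointer.
std::string toUpperViaPointer(const std::string& s) {
    int (*toupperp)(int) = std::toupper;
    std::string result;
    std::transform(s.begin(), s.end(), std::back_inserter(result), toupperp);
    return result;
}

// The lambda calls toupper with one argument, so the int(int) version is chosen.
// Converting to unsigned char first avoids passing a negative value to toupper.
std::string toUpperViaLambda(const std::string& s) {
    std::string result;
    std::transform(s.begin(), s.end(), std::back_inserter(result),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return result;
}
```

Either function can replace the toUpper from the question; the pointer version stays closest to the original code.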
I have a class SpecialString. It has an operator overload / conversion function it uses any time it's passed off as a const char*. It then returns a normal c-string.
class SpecialString
{
...
operator char* () const { return mCStr; }
...
};
This used to work a long time ago (literally 19 years ago) when I passed these directly into printf(). The compiler was smart enough to know that the argument was meant to be a char* and it used the conversion function, but now g++ complains.
SpecialString str1("Hello"), str2("World");
printf("%s %s\n", str1, str2);
error: cannot pass object of non-POD type 'SPECIALSTRING' (aka 'SpecialString') through variadic method; call will abort at runtime [-Wnon-pod-varargs]
Is there any way to get this to work again without changing the code? I can add a deref operator overload function that returns the c-string and pass the SpecialString objects around like this.
class SpecialString
{
...
operator char* () const { return mCStr; }
char* operator * () const { return mCStr; }
...
};
SpecialString str1("Hello"), str2("World");
printf("%s %s\n", *str1, *str2);
But I'd prefer not to because this requires manually changing thousands of lines of code.
You could disable the warning, if you don't want to be informed about it... but that's a bad idea.
The behaviour of the program is undefined. You should fix it, and that requires changing the code. You can use the existing conversion operator with static_cast, or you can use your unary * operator idea, which is going to be terser.
Even less change would be required if you used unary + instead which doesn't require introducing an overload, since it will invoke the implicit conversion instead. That may add some confusion to the reader of the code though.
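A small sketch of the unary + idea. The SpecialString below is a hypothetical stand-in that stores its text in a std::string and converts to const char* (the question's class converts to char*). joinWithUnaryPlus is a made-up helper that uses snprintf so the result can be inspected.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the question's class; only the conversion matters.
class SpecialString {
public:
    explicit SpecialString(const char* s) : text_(s) {}
    operator const char*() const { return text_.c_str(); }
private:
    std::string text_;
};

// +a uses the built-in unary plus for pointers, which triggers the implicit
// conversion first, so a plain pointer is what reaches the variadic call.
std::string joinWithUnaryPlus(const SpecialString& a, const SpecialString& b) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%s %s", +a, +b);
    return buffer;
}
```

The call sites still have to change, but only by one character per argument.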
Since you don't want to modify the existing code, you can write a "hack" instead. More specifically, a bunch of overloads to printf() that patch the existing code.
For example:
int printf(const char* f, const SpecialString& a, const SpecialString& b)
{
return printf(f, (const char*)a, (const char*)b);
}
With this function declared in your header, every call to printf() with those specific parameters will use this function instead of the "real" printf() you're familiar with, and perform the needed conversions.
I presume you have quite a few combinations of printf() calls involving SpecialString in your code, so you may have to write a number of different overloads. This is ugly, to say the least, but it does fit your requirement.
As mentioned in another comment, this has always been undefined behavior that happened to work in your case.
With Microsoft's CString class, it seems this undefined behavior was relied on so widely (because it happened to work) that the layout is now defined so that it keeps working. See How can CString be passed to format string %s?.
In our code base, whenever I modify a file I try to fix the code so that it does the conversion explicitly by calling GetString().
There are a few things you could do:
Fix existing code everywhere you get the warning.
In that case, a named function like c_str or GetString is preferable to a conversion operator because it avoids explicit casting (for example static_cast or, even worse, a C-style cast such as (const char *)). The deref operator might be an acceptable compromise.
Use some formatting library
<iostream>
fmt: https://github.com/fmtlib/fmt
many other choices (search C++ formatting library or something similar)
Use a variadic template function so that the conversion can be done for each argument.
If you only use a few types (int, double, string) and rarely more than 2 or 3 parameters, defining overloads might also be a possibility.
Not recommended: hack your class so that it works again.
Have you made any change to your class definition that caused it to break, or did you only upgrade the compiler version or change compiler options?
Such a hack relies on undefined behavior, so you must figure out how your compiler works, and the code won't be portable.
For it to work, the class must have the size of a pointer and the data itself must be compatible with a pointer. Essentially, the data must consist of a single pointer (no v-table or other members).
Side note: I think one should avoid defining one's own string class. In most cases, the standard C++ string (or string view) should be used. If you need additional functions, I would recommend writing stand-alone functions in a namespace such as StringUtilities. That way, you avoid converting back and forth between your own string and the standard string (or some library string like MFC's or Qt's).
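The variadic template option from the list above could look roughly like this. It is only a sketch: SpecialString here is a hypothetical stand-in with a c_str() member, and toPrintfArg and safePrintf are names invented for the example, not part of any library.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the question's class.
class SpecialString {
public:
    explicit SpecialString(const char* s) : text_(s) {}
    const char* c_str() const { return text_.c_str(); }
private:
    std::string text_;
};

// Most arguments are passed through unchanged...
template <typename T>
const T& toPrintfArg(const T& value) { return value; }

// ...but a SpecialString is replaced by its character pointer.
inline const char* toPrintfArg(const SpecialString& value) { return value.c_str(); }

// Variadic wrapper: converts each argument, then calls the real printf.
template <typename... Args>
int safePrintf(const char* format, const Args&... args) {
    return std::printf(format, toPrintfArg(args)...);
}
```

A call such as safePrintf("%s %s\n", str1, str2) then never hands a class object to the variadic function. Existing calls would still need to be renamed, which a search and replace can usually handle.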
This is a simple operator overloading program I learned and I can't understand exactly why the parameterized constructor is using '*st' and not just 'st' or '&st'.
Now, I know the difference between passing by reference, address and value.
In the given example, I passed a string. If it was passing by reference, the argument in the parameterized constructor would have been '&st'. I'm not sure if it is passing the string by address.
Basically, I don't know how the code is working there. Please explain that and also why using '&st' in place of '*st' isn't working.
class tyst
{
int l;
char *p;
public:
tyst(){l=0;p=0;}
tyst(char *st)
{
l=strlen(st);
p = new char[l + 1]; // +1 for the null terminator that strcpy writes
strcpy(p,st);
}
void lay();
tyst operator + (tyst b);
};
void tyst::lay()
{
cout<<p<<endl;
}
tyst tyst::operator+(tyst b)
{
tyst temp;
temp.l=l+b.l;
temp.p = new char[temp.l + 1]; // +1 for the null terminator
strcpy(temp.p,p);
strcat(temp.p,b.p);
return temp;
}
int main()
{
tyst s1("Markus Pear");
tyst s2("son Notch");
tyst s3;
s3=s1+s2;
s3.lay();
return 0;
}
So, I'd really appreciate if anyone can clear this up for me.
st is a C-style string. Passed by value
tyst(char st)
you would merely get a single character.
Passed by reference
tyst(char & st)
You would also get only a single character, but you could modify it. Not so useful in this case. You could also pass in a reference to a pointer, but I don't see much use to that, either.
How this is working
tyst(char * st)
says that the function will take a pointer, and that pointer may be pointing to a single character, the first of an unknown number of characters, absolutely nothing, or complete garbage. The last two possibilities are why use of references is preferred over pointers, but you can't use a reference for this. You could however use a std::string. In C++, this would often be the preferred approach.
Inside tyst, the assumption is that st points at an unknown number of characters ending in a null terminator, and this is almost what is being provided. Technically, what you are doing here with
tyst s1("Markus Pear");
is illegal. "Markus Pear" is a string literal. It may be stored in non-writable memory, so its type is const char[12], which converts to a const char *, not a char *.
The constructor is expecting a pointer to a c-string (null terminated character array). Applying pointer arithmetic to it will allow access to the entire string. Note that new[] returns a pointer to the first element; it is (one way) how pointers are used.
Aside from the syntax errors, it makes no sense to pass a single character by reference to the class. It isn't interested in a single character, it is interested in where the beginning of the array is.
It would be like asking somebody what their home address is, and they give you a rock from their lawn.
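To make the pointer explanation concrete, here is a hedged sketch. countChars shows how a single pointer to the first character gives access to the whole array, and Text is a hypothetical rewrite of the class on top of std::string, as suggested above. Neither name comes from the question.

```cpp
#include <cstddef>
#include <string>

// st points at the first character; indexing walks the rest of the array
// until the null terminator is reached.
std::size_t countChars(const char* st) {
    std::size_t n = 0;
    while (st[n] != '\0') ++n;
    return n;
}

// Hypothetical rewrite of the class: std::string owns the memory.
class Text {
    std::string data_;
public:
    Text() = default;
    Text(const char* st) : data_(st) {}   // const char* accepts string literals
    Text operator+(const Text& other) const {
        Text joined;
        joined.data_ = data_ + other.data_;
        return joined;
    }
    const std::string& str() const { return data_; }
};
```

Because std::string manages its own buffer, the manual new[], strcpy and strcat calls are no longer needed.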
I was looking for a way to uppercase a standard string. The answer that I found included the following code:
int main()
{
// explicit cast needed to resolve ambiguity
std::transform(myString.begin(), myString.end(), myString.begin(),
(int(*)(int)) std::toupper);
}
Can someone explain the casting expression “(int(*) (int))”? All of the other casting examples and descriptions that I’ve found only use simple type casting expressions.
It's actually a simple typecast - but to a function-pointer type.
std::toupper comes in two flavours. One, from <cctype>, takes int and returns int; the other, from <locale>, is a template that takes a character and a const locale& and returns a character. In this case, it's the first one that's wanted, but the compiler wouldn't normally have any way of knowing that.
(int(*)(int)) is a cast to a function pointer that takes int (right-hand portion) and returns int (left-hand portion). Only the first version of toupper can be cast like that, so it disambiguates for the compiler.
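int(*)(int) is the name of a function pointer type (the outer parentheses make it a cast). Reading from the inside out: (*) says it is a pointer, the trailing (int) says the pointee is a function taking an int, and the leading int is the return type.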
As others already mentioned, int (*)(int) is the type pointer to a function which takes and returns int. However what is missing here is what this cast expression does: Unlike other cast expressions it does not really cast (i.e. it does not convert a value into a different type), but it selects from the overloaded set of functions named std::toupper the one which has the signature int(int).
Note, however, that this method is somewhat fragile: If for some reason there's no matching function (for example because the corresponding header was not included) but only one non-matching function (so no ambiguity arises), then this cast expression will indeed turn into a cast, more exactly a reinterpret_cast, with undesired effects. To make sure that no unintended cast happens, the C++ style cast syntax should be used instead of the C style cast syntax: static_cast<int(*)(int)>(std::toupper) (actually, in the case of std::toupper this case cannot occur because the only alternative function is templated and therefore ambiguous, however it could happen with other overloaded functions).
Coincidentally, the new-style cast syntax is more readable in that case, too.
Another possibility, which works without any cast expression, is the following:
int (*ptoupper)(int) = &std::toupper; // here the context provides the required type information
std::transform(myString.begin(), myString.end(), myString.begin(), ptoupper);
Note that the context of the original call cannot provide the necessary information because std::transform is a template whose last parameter type is deduced from the argument, so the compiler has nothing to choose the overload by.
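Both variants side by side, as a sketch that assumes ASCII input. <locale> is included on purpose so that both std::toupper overloads are visible. upperWithCast and upperWithPointer are names invented for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <locale>
#include <string>

// static_cast names the target type, which selects the int(int) overload.
std::string upperWithCast(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   static_cast<int (*)(int)>(std::toupper));
    return text;
}

// The declared type of ptoupper provides the same information without a cast.
std::string upperWithPointer(std::string text) {
    int (*ptoupper)(int) = &std::toupper;
    std::transform(text.begin(), text.end(), text.begin(), ptoupper);
    return text;
}
```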
int function(int);
A function taking int and returning int.
int (*function_pointer)(int);
A pointer to a function taking int and returning int.
int (*)(int)
The type of a pointer to a function taking int and returning int.
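The three declarations can be checked by the compiler. In this sketch, twice, IntFn and apply are made-up names; the static_assert lines only restate the types described above.

```cpp
#include <type_traits>

int twice(int x) { return 2 * x; }   // a function taking int and returning int

using IntFn = int (*)(int);          // alias for the pointer-to-function type

// The function itself has type int(int); its address has type int(*)(int).
static_assert(std::is_same<decltype(twice), int(int)>::value, "function type");
static_assert(std::is_same<decltype(&twice), IntFn>::value, "pointer type");

// A parameter of the pointer type accepts both twice and &twice.
int apply(IntFn f, int value) { return f(value); }
```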
std::toupper from <cctype> already has type int (*)(int), but the one in <locale> is templatized on charT, which I assume is the reason for the cast. But ptr_fun would be clearer (note that std::ptr_fun was deprecated in C++11 and removed in C++17).
I have a class called FileProc that runs File IO operations. In one instance I have declared two functions (which are sub-functions to operator= functions), both decisively different:
const bool WriteAmount(const std::string &Arr, const long S)
{
/* Do some string conversion */
return true;
}
const bool WriteAmount(const char Arr[], unsigned int S)
{
/* Do some string conversion */
return true;
}
If I make a call with a 'char string' to WriteAmount, it reports an ambiguity error, saying it is confused between WriteAmount for char string and WriteAmount for std::string. I know what is occurring under the hood: it's attempting to use the std::string constructor to implicitly convert the char string into a std::string. But I don't want this to occur in the case of WriteAmount (i.e. I don't want any implicit conversion occurring within the functions, given each one is optimised for its role).
My question is: for consistency, without changing the function format (i.e. not changing the number of arguments or the order they appear in) and without altering the standard library, is there any way to prevent implicit conversion in the functions in question?
I forgot to add, preferably without typecasting, as this will be tedious on function calls and not user friendly.
You get the ambiguity because your second parameter is different. Trying to call it with long x = ...; WriteAmount("foo", x) will raise an ambiguity because it matches the second argument better with the first overload, but the first argument is matched better with the second overload.
Make the second parameter have the same type in both overloads and you will get rid of the ambiguity: the second argument then matches both overloads equally well (or equally badly), and the first argument matches the second overload better.
Can't you change the second argument and cast it to unsigned int ? It should not be able to use the first function call. I have not coded in C++ for ages..
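A sketch of the fix described above, with both overloads taking long as the second parameter. The Picked enum is an assumption added only so the example can report which overload was chosen; the original functions return bool.

```cpp
#include <string>

// Hypothetical tag type so the chosen overload is observable.
enum class Picked { StdString, CharArray };

// Same second parameter type in both overloads, so only the first
// argument decides which one is the better match.
Picked WriteAmount(const std::string&, long) { return Picked::StdString; }
Picked WriteAmount(const char[], long) { return Picked::CharArray; }
```

With these signatures a string literal selects the const char[] overload without any cast, because it needs no user-defined conversion there.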
I have just done what appears to be a common newbie mistake:
First we read one of many tutorials that goes like this:
#include <fstream>
int main() {
using namespace std;
ifstream inf("file.txt");
// (...)
}
Secondly, we try to use something similar in our code, which goes something like this:
#include <fstream>
int main() {
using namespace std;
std::string file = "file.txt"; // Or get the name of the file
// from a function that returns std::string.
ifstream inf(file);
// (...)
}
Thirdly, the newbie developer is perplexed by some cryptic compiler error message.
The problem is that ifstream takes a const char * as a constructor argument.
The solution is to convert the std::string to a const char *.
Now, the real problem is that, for a newbie, "file.txt" or similar examples given in almost all the tutorials very much looks like a std::string.
So, is "my text" a std::string, a c-string or a char*, or does it depend on the context?
Can you provide examples on how "my text" would be interpreted differently according to context?
[Edit: I thought the example above would have made it obvious, but I should have been more explicit nonetheless: what I mean is the type of any string enclosed within double quotes, i.e. "myfilename.txt", not the meaning of the word 'string'.]
Thanks.
So, is "string" a std::string, a c-string or a char*, or does it depend on the context?
Neither C nor C++ has a built-in string data type, so any double-quoted string in your code is essentially a const char * (or a const char [] array, to be exact). "C string" usually refers to this, specifically a character array with a null terminator.
In C++, std::string is a convenience class that wraps a raw string into an object. By using this, you can avoid having to do (messy) pointer arithmetic and memory reallocations by yourself.
Most standard library functions still take only char * (or const char *) parameters.
You can implicitly convert a char * into std::string because the latter has a constructor to do that.
You must explicitly convert a std::string into a const char * by using the c_str() method.
Thanks to Clark Gaebel for pointing out constness, and jalf and GMan for mentioning that it is actually an array.
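A short sketch of the explicit conversion with c_str(). canOpen and cLength are names made up for the example. As a side note, since C++11 std::ifstream also has a constructor that takes a const std::string&, so the code from the question compiles unchanged with a C++11 or later compiler.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// c_str() yields a const char* that the pre-C++11 constructor accepts.
bool canOpen(const std::string& file) {
    std::ifstream inf(file.c_str());
    return inf.is_open();
}

// The pointer returned by c_str() is null-terminated, so C functions work on it.
std::size_t cLength(const std::string& s) {
    return std::strlen(s.c_str());
}
```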
"myString" is a string literal, and has the type const char[9], an array of 9 constant char. Note that it has enough space for the null terminator. So "Hi" is a const char[3], and so forth.
This is pretty much always true, with no ambiguity. However, whenever necessary, a const char[9] will decay into a const char* that points to its first element. And std::string has an implicit constructor that accepts a const char*. So while it always starts as an array of char, it can become the other types if you need it to.
Note that string literals have the unique property that const char[N] can also decay into char*, but this behavior was deprecated in C++03 and removed in C++11. If you try to modify the underlying string this way, you end up with undefined behavior. It's just not a good idea.
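These type claims can be verified at compile time. The sketch below only restates them; decayed and converted are illustrative variable names.

```cpp
#include <string>
#include <type_traits>

// A string literal is an lvalue of array type, so decltype yields a reference
// to const char[9]: eight characters plus the null terminator.
static_assert(std::is_same<decltype("myString"), const char (&)[9]>::value,
              "literal is an array of 9 const char");
static_assert(sizeof("Hi") == 3, "two characters plus the terminator");

const char* decayed = "myString";     // array-to-pointer decay
std::string converted = "myString";   // implicit std::string constructor
```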
std::string file = "file.txt";
The right hand side of the = contains a (raw) string literal (i.e. a null-terminated byte string). Its effective type is array of const char.
The = is a tricky pony here: No assignment happens. The std::string class has a constructor that takes a pointer to char as an argument and this is called to create a temporary std::string and this is used to copy-construct (using the copy ctor of std::string) the object file of type std::string.
The compiler is free to elide the copy ctor and directly instantiate file though.
However, note that std::string is not the same thing as a C-style null-terminated string. It is not even required to be null-terminated.
ifstream inf("file.txt");
The std::ifstream class has a ctor that takes a const char * and the string literal passed to it decays to a pointer to the first element of the string.
The thing to remember is this: std::string provides (almost seamless) conversion from C-style strings. You have to look up the signature of the function to see if you are passing in a const char * or a std::string (the latter because of implicit conversions).
So, is "string" a std::string, a c-string or a char*, or does it depend on the context?
It depends entirely on the context. :-) Welcome to C++.
A C string is a null-terminated string, which is almost always the same thing as a char*.
Depending on the platforms and frameworks you are using, there might be even more meanings of the word "string" (for example, it is also used to refer to QString in Qt or CString in MFC).
The C++ standard library provides a std::string class to manage and represent character sequences. It encapsulates the memory management and is most of the time implemented as a C-string; but that is an implementation detail. It also provides manipulation routines for common tasks.
The std::string type will always be that (it doesn't have a conversion operator to char* for example, that's why you have the c_str() method), but it can be initialized or assigned to by a C-string (char*).
On the other hand, if you have a function that takes a std::string or a const std::string& as a parameter, you can pass a c-string (char*) to that function and the compiler will construct a std::string in-place for you. That would be a differing interpretation according to context as you put it.
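A minimal illustration of the two directions, using made-up helper names. Passing a C string where a const std::string& is expected constructs a temporary std::string; going the other way requires an explicit c_str() call.

```cpp
#include <cstddef>
#include <string>

// Accepts std::string, string literals and const char* alike, because the
// compiler builds a temporary std::string when needed.
std::size_t nameLength(const std::string& name) { return name.size(); }

// The reverse direction is explicit: the caller must supply a const char*.
bool sameText(const std::string& a, const char* b) { return a == b; }
```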
Neither C nor C++ have a built-in string data type.
When the compiler finds a double-quoted string that is referred to implicitly (see the code below), it stores the string itself in the program's code/text and generates a character array for it:
The array is created in static storage because it must persist to be referred to later.
The array is made constant because it must always contain the original data (Hello).
So in the end, what you have is a const char * to this constant static character array.
const char* v()
{
char* text = "Hello";
return text;
// Above code can be reduced to:
// return "Hello";
}
At run time, when control enters the function, the pointer text is created on the stack. The constant array of 6 elements (including the null terminator '\0' at the end) already exists in static memory for the whole life of the program. When control reaches the declaration of text, the starting address of that 6-element array is assigned to text. The next line (return text;) returns that address. At the closing bracket text disappears from the stack, but the array is still in static memory, so the returned pointer stays valid.
Older compilers do not require the return type to be const (since C++11 a string literal no longer converts to a non-const char*). But if you try to change the static array through a non-const char*, the behavior is undefined and the program will typically fail at run time, because the array is constant. So it is always good to return a pointer to const, which ensures the array cannot be modified through the returned pointer.
But if the compiler finds that a double-quoted string is explicitly used to initialize an array, it assumes that the programmer is going to handle the array's lifetime correctly. See the following wrong example:
const char* v()
{
char text[] = "Hello";
return text;
}
During compilation, the compiler stores the quoted text in the program so that it can fill the array at run time. It also calculates the array size, in this case again 6.
At run time, when control enters the function, the 6-element array text is created on the stack, and it is initialized from the stored text when control reaches (char text[] = "Hello";). The statement (return text;) returns the starting address of the array. At the closing bracket the array disappears from the stack, so the returned pointer no longer refers to a valid object.
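Two safe alternatives to the wrong example, as a sketch with invented function names. The first returns a pointer to the literal itself, which lives in static storage; the second returns a std::string, which copies the local array before it is destroyed.

```cpp
#include <string>

// The literal has static storage duration, so the pointer stays valid.
const char* literalPointer() {
    return "Hello";
}

// The local array is copied into the returned std::string, so nothing
// refers to the stack array after the function returns.
std::string ownedCopy() {
    char text[] = "Hello";
    return text;
}
```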
Most standard library functions still take only char * (or const char *) parameters.
The Standard C++ library has a powerful class called string for manipulating text. The internal data structure for string is character arrays. The Standard C++ string class is designed to take care of (and hide) all the low-level manipulations of character arrays that were previously required of the C programmer. Note that std::string is a class:
You can implicitly convert a char * into std::string because the latter has a constructor to do that.
You can explicitly convert a std::string into a const char * by using the c_str() method.
As often as possible it should mean std::string (or an alternative such as wxString, QString, etc., if you're using a framework that supplies one). Sometimes you have no real choice but to use a NUL-terminated byte sequence, but you generally want to avoid it when possible.
Ultimately, there simply is no clear, unambiguous terminology. Such is life.
To use the proper wording (as found in the C++ language standard), a string is one of the varieties of std::basic_string (including std::string) from chapter 21.3 "String classes" (as in C++0x N3092), while the argument of ifstream's constructor is an NTBS (null-terminated byte string).
To quote, C++0x N3092 27.9.1.4/2.
basic_filebuf* open(const char* s, ios_base::openmode mode);
...
opens a file, if possible, whose name is the NTBS s