Why don't C++03 file streams accept string constructor parameters? - c++

Why does the following code compile in C++11 but not in C++03? (both gcc and cl)
#include <string>
#include <iostream>
#include <fstream>
int main(int argc, char* argv[]) {
    const std::string t("Hello");
    std::ofstream out(t);
}
Why don't the C++03 streams accept std::string as the constructor parameter? Was this decision based on something or did it happen accidentally?

The code fails when compiled with a strictly conforming C++03 compiler because the constructor that takes a std::string was only added in C++11.
As to the question, "was it based on something smart", as the interface was added, it can be inferred that there was no technical reason for it to be omitted.
It's an addition of convenience: if you have a std::string, you can always call .c_str() to get a C string suitable for use with the old interface. (As the C++11 documentation says, the constructors that take a std::string have exactly the same effect as calling the corresponding constructor that takes a const char* with the result of calling .c_str() on the string.)
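A minimal sketch of the .c_str() workaround described above, which should compile in both C++03 and C++11. The helper names and the scratch file name are made up for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: writes `text` to the file named by `path`.
// Passing path.c_str() uses the const char* constructor, which exists
// in C++03 as well as in C++11.
bool write_text(const std::string& path, const std::string& text) {
    std::ofstream out(path.c_str());
    if (!out) return false;
    out << text;
    return out.good();
}

// Hypothetical helper: reads the first line of the file back.
std::string read_first_line(const std::string& path) {
    std::ifstream in(path.c_str());
    std::string line;
    std::getline(in, line);
    return line;
}

// Usage sketch: round-trips one line through a scratch file, then deletes it.
void demo_roundtrip() {
    const std::string path = "cstr_demo_scratch.txt";  // assumed writable
    assert(write_text(path, "Hello"));
    assert(read_first_line(path) == "Hello");
    std::remove(path.c_str());
}
```

In C++11 the calls to .c_str() can simply be dropped; the behaviour is specified to be the same.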

As I recall, this was discussed on c.l.c++.m some years ago, and Andrew Koenig (I think it was Andrew, anyway) said it was actually brought up during some meetings, but the idea of accepting a string was quickly conflated with the idea of accepting a wstring as well, and from there turned into a discussion about support for internationalized character sets in file names, and ... shortly after that the whole idea was dropped because it had opened a big can of worms nobody was prepared to deal with right then.

They had simply forgotten about adding the string constructor in C++03. Now that's fixed. This time round other things were forgotten, like make_unique. There's always something more that one could have done. C++03 also forgot to allow default template arguments for function templates, which are now included.
Edit: As @Charles says, it may not be a literal "forgetting", but rather, it's something that clearly should be there, but just hadn't been specified for some reason or another. Further examples are given by std::next/std::prev, which are a great relief, and std::to_string and std::stoi/d/ul/ull, which again make perfect sense, but nobody had gotten around to specifying them until this time round. There isn't necessarily a deep reason for their previous absence.

Related

C++ Split definition and assignment of variables

So I recently installed CLion and after some setup I started messing with some older code I'd written. Now a great thing about CLion is that it helps you with coding style etc. but one thing I found strange is that it recommended to me to split the definition and assignment of variables. I remember a specific case where I defined strings as follows:
string path = "mypath";
But the IDE recommended to write it like:
string path;
path = "mypath";
Now I started looking for this online to find pros and cons of both methods. Apparently the former is faster but the latter is more secure because it calls the copy constructor or something (I didn't quite understand that part). My question is basically: Since CLion recommends doing it the latter way, does that mean that it is always better? Is there one way that is preferred over the other or is it situational? Do they each have their pros/cons and if so, what are they?
Any help or reliable resources are greatly appreciated.
Thanks in advance,
Xentro
Your IDE needs to be consigned to the bin.
string path = "mypath"; is an infinitely better pattern. This is because, in general, there is no potential hazard of an indeterminate value of path. (Granted, in this case, that is not an issue since a std::string has a well-defined default constructor but the preferred way is a good habit to get into).
The compiler is also spared the headache of constructing then assigning a value to an object.
But the IDE recommended to write it like:
string path;
path = "mypath";
Time to uninstall the IDE. The recommendation is utter nonsense.
Perhaps the worst thing about it is how it prevents const correctness. If path never changes, then you should make it const, and you can only do that if you initialise the variable with the correct value right away:
std::string const path = "mypath";
Note that depending on the context of the code, you may also want to use auto (which will turn the variable into a const char pointer, because the type of "mypath" is an array of const char, which decays to a pointer; but perhaps it turns out you don't even need a full-fledged std::string?) - and that also only works if you do not follow the stupid IDE recommendation:
auto const path = "mypath";
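To make the deduction concrete, here is a small sketch (C++11 or later) that checks the deduced types at compile time. The variable names are illustrative only.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

std::string const path_as_string = "mypath";
auto const path_as_pointer = "mypath";  // auto deduces const char*, not std::string

static_assert(std::is_same<decltype(path_as_pointer), const char* const>::value,
              "auto const with a string literal gives a const pointer to const char");
static_assert(std::is_same<decltype(path_as_string), const std::string>::value,
              "the explicit declaration gives a const std::string");

// Usage sketch: both hold the same characters, only the types differ.
void demo_paths() {
    assert(path_as_string == path_as_pointer);
}
```

Neither declaration can be split into "declare, then assign", because a const object must be initialised where it is defined.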
Apparently the former is faster
Nonsense. This is not about speed but about correctness and code maintainability.
but the latter is more secure because it calls the copy constructor or something (I didn't quite understand that part).
That's nonsense, too. It calls one of std::string's overloaded assignment operators, not the copy constructor. And it's not more "secure" in any usual sense of the word.
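A rough way to see which member functions each form calls is to count them. The Traced type below is a made-up stand-in for std::string, not the real class, but the overload resolution works the same way.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for std::string that counts the calls of interest.
struct Traced {
    static int default_ctor, from_cstr_ctor, cstr_assign;
    std::string value;

    Traced() { ++default_ctor; }
    Traced(const char* s) : value(s) { ++from_cstr_ctor; }
    Traced& operator=(const char* s) { value = s; ++cstr_assign; return *this; }

    static void reset() { default_ctor = from_cstr_ctor = cstr_assign = 0; }
};
int Traced::default_ctor = 0;
int Traced::from_cstr_ctor = 0;
int Traced::cstr_assign = 0;

// Initialisation: one constructor call, no assignment.
void demo_initialize() {
    Traced::reset();
    Traced path = "mypath";
    assert(Traced::default_ctor == 0);
    assert(Traced::from_cstr_ctor == 1);
    assert(Traced::cstr_assign == 0);
    assert(path.value == "mypath");
}

// Declaration followed by assignment: default constructor, then operator=.
void demo_declare_then_assign() {
    Traced::reset();
    Traced path;
    path = "mypath";
    assert(Traced::default_ctor == 1);
    assert(Traced::from_cstr_ctor == 0);
    assert(Traced::cstr_assign == 1);
    assert(path.value == "mypath");
}
```

The split form does strictly more work (construct, then overwrite), and at no point is a copy constructor the thing being "added for safety".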
The advice is nonsense and other answers here tell that already.
I want to add a source for this: Straight from the creator of C++ Bjarne Stroustrup and the chairman of the C++ committee (hope I don't get his position wrong) Herb Sutter:
C++ core guidelines:
ES.20: Always initialize an object
Reason Avoid used-before-set errors
and their associated undefined behavior. Avoid problems with
comprehension of complex initialization. Simplify refactoring. Example
void use(int arg)
{
    int i; // bad: uninitialized variable
    // ...
    i = 7; // initialize i
}
No, i = 7 does not initialize i; it assigns to it. Also, i can be read
in the ... part. Better:
void use(int arg) // OK
{
    int i = 7; // OK: initialized
    string s; // OK: default initialized
    // ...
}
Note The "always initialize" rule is deliberately stronger than the "an object must be set before used" language rule. The latter, more
relaxed rule, catches the technical bugs, but:
It leads to less readable code
It encourages people to declare names in greater than necessary scopes
It leads to harder to read code
It leads to logic bugs by encouraging complex code
It hampers refactoring
The always initialize rule is a style rule aimed to improve
maintainability as well as a rule protecting against used-before-set
errors.
string path = "mypath";
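The code above is copy-initialization: path is constructed from the literal by the std::string constructor that takes a const char* (any temporary copy is elided in practice).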
string path;
path = "mypath";
And the other one uses the assignment operator.
In theory, the pros and cons depend on how the compiler handles the code, so they vary from compiler to compiler. In general, the first form builds the object directly from the given value, while the second first creates a new empty object and then uses the assignment operator (=) to give the variable its value.
You can see more discussions in the following links:
Why separate variable definition and initialization in C++?
and What's the difference between assignment operator and copy constructor?

Why all function in <cstring> must not have constexpr?

I just noticed that D0202R2 proposes that all <cstring> functions must not have constexpr. I would like to understand why, during the Jacksonville meeting, a solution like this was decided on.
Take a function like std::strchr. I really do not see any reason for not being constexpr. Indeed, a compiler can easily optimize some dummy code like this at compile-time (even if I remove builtins, as you can see from the parameters). However, at the same time, it is not possible to rely on these functions within constexpr contexts or using static assertions.
We could obviously re-implement some of <cstring> functions to be constexpr (as I did in this other dummy code), but I do not understand why they must not have constexpr in the standard library.
What am I missing?
PS: Builtins!
At the beginning I was confused because constexpr functions using some <cstring> capabilities just worked, then I understood it was only thanks to GCC builtins. Indeed, if you add the -fno-builtin parameter, you can just use std::strlen instead of the custom version of the function.
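For reference, a hand-written constexpr replacement along the lines the question describes might look like the sketch below. It assumes C++14 (loops in constexpr functions); the names ce_strlen and ce_strchr are invented here and are not the code the question linked to.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constexpr counterpart of std::strlen.
constexpr std::size_t ce_strlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// Hypothetical constexpr counterpart of std::strchr (const version only).
// Returns a pointer to the first occurrence of ch in s, or nullptr.
constexpr const char* ce_strchr(const char* s, char ch) {
    for (;; ++s) {
        if (*s == ch) return s;          // also matches the terminator if ch == '\0'
        if (*s == '\0') return nullptr;  // reached the end without a match
    }
}

constexpr char greeting[] = "hello";

// These are evaluated by the compiler, with no reliance on builtins.
static_assert(ce_strlen(greeting) == 5, "length is known at compile time");
static_assert(ce_strchr(greeting, 'l') == greeting + 2, "first 'l' is at index 2");
static_assert(ce_strchr(greeting, 'z') == nullptr, "'z' does not occur");

// Usage sketch: the same functions also work on run-time data.
void demo_runtime_use() {
    const char* text = "world";
    assert(ce_strlen(text) == 5);
    assert(ce_strchr(text, 'r') == text + 2);
}
```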
Upon reviewing this more, and thinking more about the implications of the C++14 relaxation of rules surrounding constexpr, I have a different answer.
The <cstring> header is a wrapper around a bunch of C functions. C has no constexpr concept, and while it might be useful for it to have one, it's not likely to grow one anytime soon. So marking up those functions in that way would be cumbersome and require a lot of #ifdefs.
Also (and I think this is the most important reason) when those functions are not compiler intrinsics they are implemented in C and stored in a library file as object code. Object code in a library is not a form accessible to the C++ compiler to evaluate at compile time. They are not 'inline' like template code is.
Lastly, most of the really useful things they do can easily be implemented in terms of the C++ <algorithm> library. strlen(s) = (::std::string_view(s)).length(), memcpy(a, b, len) = ::std::copy(b, b + len, a) and so on. And D0202R2 proposes to make those algorithms constexpr. And, as you pointed out, it also proposes to make functions in ::std::string_view constexpr and these also give equivalent functionality. So, given the previously mentioned headaches, it seems that implementing constexpr for the <cstring> functions would be of dubious benefit.
As a side note, there's ::std::copy, ::std::move, ::std::copy_backward, and ::std::move_backward and it's up to you to figure out which you need to call. It would be nice if there was a function that could figure out whether or not x or x_backward was needed in that particular case like memmove does. But, because of the way iterators are defined, taking one iterator and comparing it to another iterator that may not be iterating over the same object at all just isn't possible to do in C++, even if they're random access iterators.

Why do streams still convert to pointers in C++11?

The canonical way to read lines from a text file is:
std::fstream fs("/tmp/myfile.txt");
std::string line;
while (std::getline(fs, line)) {
    doThingsWith(line);
}
(no, it is not while (!fs.eof()) { getline(fs, line); doThingsWith(line); }!)
This works because std::getline returns the stream argument by reference, and because:
in C++03, streams convert to void*, via an operator void*() const in std::basic_ios, evaluating to the null pointer value when the fail error flag is set;
see [C++03: 27.4.4] & [C++03: 27.4.4.3/1]
in C++11, streams convert to bool, via an explicit operator bool() const in std::basic_ios, evaluating to false when the fail error flag is set
see [C++11: 27.5.5.1] & [C++11: 27.5.5.4/1]
In C++03 this mechanism means the following is possible:
std::cout << std::cout;
It correctly results in some arbitrary pointer value being output to the standard out stream.
However, despite operator void*() const having been removed in C++11, this also compiles and runs for me in GCC 4.7.0 in C++11 mode.
How is this still possible in C++11? Is there some other mechanism at work that I'm unaware of? Or is it simply an implementation "oddity"?
I'm reasonably certain this is not allowed/can't happen in a conforming implementation of C++11.
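One way to check what a given standard library actually provides is to ask the type traits. The sketch below assumes a C++11 compiler; on a conforming library the static_asserts pass, while on a library that still ships the C++03 operator void* (as the GCC 4.7 behaviour above suggests) the last one would be expected to fail.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Direct-initialisation may use the explicit operator bool...
static_assert(std::is_constructible<bool, std::istream&>::value,
              "a stream can be converted to bool explicitly");
// ...but a conforming C++11 library offers no implicit conversion to bool
// or to a pointer, so std::cout << std::cout should not compile.
static_assert(!std::is_convertible<std::istream&, bool>::value,
              "no implicit conversion to bool");
static_assert(!std::is_convertible<std::ostream&, const void*>::value,
              "no implicit conversion to a pointer");

// A loop condition is a contextual conversion to bool, so the explicit
// operator is used and the usual reading idiom keeps working.
int count_lines(std::istream& in) {
    std::string line;
    int n = 0;
    while (std::getline(in, line)) ++n;
    return n;
}

// Usage sketch with an in-memory stream instead of a file.
void demo_count_lines() {
    std::istringstream in("one\ntwo\nthree\n");
    assert(count_lines(in) == 3);
}
```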
The problem, of course, is that right now, most implementations are working on conforming, but aren't there completely yet. At a guess, for many vendors, this particular update is a fairly low priority. It improves error checking, but does little (or nothing) to enable new techniques, add new features, improve run-time efficiency, etc. This lets the compiler catch the error you've cited (some_stream << some_other_stream) but doesn't really make a whole lot of difference otherwise.
If I were in charge of updating a standard library for C++11, I think this would be a fairly low priority. There are other changes that are probably as easy (if not easier) to incorporate, and likely to make a much bigger difference to most programmers.
To use one of the examples you gave, if I were in charge of updating the VC++ standard library to take advantage of the compiler features added in the November CTP, my top priority would probably be to add constructors to the standard container types to accept initialization_lists. These are fairly easy to add (I'd guess one person could probably add and test them in under a week) and make quite an obvious, visible difference in what a programmer can do.
As late as GCC 4.6.2, the libstdc++ code for basic_ios is evidently still C++03-like.
I'd simply put this down to "they haven't gotten around to it yet".
By contrast, the libc++ (LLVM's stdlib implementation) trunk already uses operator bool().
This was a missed mini-feature buried in a pre-existing header. There are probably lots of errors of omission and commission in pre-2011 components.
Really, if anyone comes up with things like this in gcc then it would do a world of good to go to Bugzilla and make a bug report. It may be a low priority bug but if you start a paper trail
I'll go out on a limb and extend this idea to all the other C++ compilers: clang, Visual Studio, etc.
This will make C++ a better place.
P.S. I entered a bug in Bugzilla.

No instance of function template remove_if matches argument list

I am trying to remove whitespaces from a string
line.erase(remove_if(line.begin(), line.end(), isspace), line.end());
But Visual Studio 2010 (C++ Express) tells me
1 IntelliSense: no instance of function template "std::remove_if" matches the argument list d:\parsertry\parsertry\calc.cpp 18
Full Source
Why is that? A simple piece of code
int main() {
    string line = "hello world 111 222";
    line.erase(remove_if(line.begin(), line.end(), isspace), line.end());
    cout << line << endl;
    getchar();
    return 0;
}
Verifies the function works?
Funny thing is despite that, it runs giving correct result.
Don't question Intellisense, sometimes it's better to just ignore it. The parser or the database got screwed up somehow, so it doesn't work correctly anymore. Usually, a restart will fix the problem.
If you really want to know if the code is ill-formed, well, just hit F7 to compile.
Your source code compiles without even a warning with Visual C++ 11.0 (the compiler that ships with Visual Studio 2012).
Intellisense uses its own rules and isn't always reliable.
That said, your use of isspace is Undefined Behavior for all character sets except original 7-bit ASCII. Which means the heavily upvoted answer that you took it from, is just balderdash (which should not surprise). You need to cast the argument to (the C library's) isspace to unsigned char to avoid negative values and UB.
C99 §7.4/1 (from the N869 draft):
The header <ctype.h> declares several functions useful for testing and mapping
characters.
In all cases the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of the macro EOF. If the
argument has any other value, the behavior is undefined.
A simple way to wrap the C function is
bool isSpace( char const c )
{
    typedef unsigned char UChar;
    return !!::isspace( UChar( c ) );
}
Why the typedef?
It makes the code easier to adapt when
you already have such a typedef, which is not uncommon;
it makes the code more clear; and
it avoids a C syntax cast, thereby avoiding a false positive when searching for such via a regular expression or other pattern matching.
But, why the !! (double application of the negation operator)? Considering there’s an automatic implicit conversion from int to bool? And, if one absolutely feels that the conversion should be explicit, shouldn’t it be a static_cast, and not !!?
Well, the !! avoids a silly-warning from the Visual C++ compiler,
“warning C4800: 'int' : forcing value to bool 'true' or 'false' (performance warning)”
and a static_cast doesn’t stop that warning. It’s good practice to quench that warning, and since Visual C++ is the main C++ compiler on the most used system, namely Windows, better do this in all code meant to be portable.
Oh, OK, but, since the function must be wrapped anyway, then … why use the old C library isspace (single argument) function, when the <locale> header provides a far more flexible C++ (two arguments) isspace function?
Well, first and foremost, the old C isspace function is the one used in the question, so that’s the one discussed in this answer. I have focused on discussing just how to not do this incorrectly, that is, how to avoid Undefined Behavior. Discussing how to do it right brings it to a whole different level.
But regarding the in-practice, the C++ level function of the same name can be considered to be broken, since with g++ compilers until recently (and perhaps even with g++ 4.7.2, I haven't checked lately) only the C locale mechanism worked, and the C++ level one didn't, in Windows. It may have been fixed since g++ now supports wide streams, I don’t know. Anyway, the C library isspace function, in addition to being in-practice more portable and generally working in Windows, is also simpler and, I believe, more efficient (although for efficiency one should always MEASURE if it is deemed important!).
Thanks to James Kanze for asking (essentially) the questions above, in the comments.
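Putting the pieces of this answer together, the whitespace-stripping call from the question could be sketched as follows. The names is_space and strip_spaces are made up here; the point is only that the predicate is a single non-overloaded function that casts to unsigned char before calling the C library.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Wrapper around the C library isspace: the cast keeps the argument in the
// range of unsigned char, which avoids the undefined behaviour for negative
// char values.
bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Erase-remove idiom with the wrapper as the predicate. Because is_space is
// a single non-overloaded function, template argument deduction for
// std::remove_if has nothing ambiguous to resolve.
std::string strip_spaces(std::string s) {
    s.erase(std::remove_if(s.begin(), s.end(), is_space), s.end());
    return s;
}

// Usage sketch with the string from the question.
void demo_strip_spaces() {
    assert(strip_spaces("hello world 111 222") == "helloworld111222");
}
```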
What is isspace? Depending on the included headers and the compiler
you are using, it's likely that your code won't even compile. (I don't
know about IntelliSense, but it's possible that it's looking at all of
the standard headers, and sees the ambiguity.)
There are two isspace functions in the standard, and one is a
template. Passing a function template to a template argument of another
function template does not give the compiler nearly enough information
to be able to do template argument deduction: in order to resolve the
overload of isspace, it has to know the type expected by the
remove_if, which it only knows after template argument deduction has
succeeded. And to do template argument deduction on remove_if, it has
to know the types of the arguments, which means the type of isspace,
which it will only know once it has been able to resolve the overload on
it.
(I'm actually surprised that your little bit of code compiles: you
obviously include <iostream>, and typically, <iostream> will include
<locale>, which will bring in the function template isspace.)
Of course, the function template isspace must be called with two
arguments, so if it were ever chosen, the instantiation of remove_if
wouldn't compile (but the compiler does not try to instantiate
remove_if until it has chosen a function). And the isspace in
<ctype.h> will result in undefined behavior if passed a char, so you
can't use it. The usual solution is to create a set of predicate
objects for your tool box, and use them. Something like the following
should work if you're only concerned with char:
template <std::ctype<char>::mask m>
class Is : public std::unary_function<char, bool>
{
    std::locale myLocale; // To ensure lifetime of following...
    std::ctype<char> const* myCType;
public:
    Is( std::locale const& loc = std::locale() )
        : myLocale( loc )
        , myCType( &std::use_facet<std::ctype<char> >( myLocale ) )
    {
    }
    bool operator()( char ch ) const
    {
        return myCType->is( m, ch );
    }
};
typedef Is<std::ctype_base::space> IsSpace;
It's trivial to add the additional typedef's so you get the complete
set, and I've found it useful to add an IsNot template as well. It's
simple, and it avoids all of the surrounding issues.
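For comparison, in C++11 a lambda can play the same role as a predicate object, because the lambda has a single concrete type and the call to the two-argument isspace template is resolved inside its body. This is only a sketch of an alternative, not the approach of the answer above; the function name is invented.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>

// Removes whitespace as classified by the given locale (the global locale
// by default), using the two-argument std::isspace from <locale>.
std::string strip_spaces_locale(std::string s,
                                const std::locale& loc = std::locale()) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           [&loc](char ch) { return std::isspace(ch, loc); }),
            s.end());
    return s;
}

// Usage sketch.
void demo_strip_spaces_locale() {
    assert(strip_spaces_locale("hello world 111 222") == "helloworld111222");
    assert(strip_spaces_locale("a b\tc\n", std::locale::classic()) == "abc");
}
```

A predicate class that looks up the ctype facet once, as above, avoids repeating that lookup for every character, which the lambda version does not.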

Why don't the std::fstream classes take a std::string?

This isn't a design question, really, though it may seem like it. (Well, okay, it's kind of a design question). What I'm wondering is why the C++ std::fstream classes don't take a std::string in their constructor or open methods. Everyone loves code examples so:
#include <iostream>
#include <fstream>
#include <string>
int main()
{
    std::string filename = "testfile";
    std::ifstream fin;
    fin.open(filename.c_str()); // Works just fine.
    fin.close();
    //fin.open(filename); // Error: no such method.
    //fin.close();
}
This gets me all the time when working with files. Surely the C++ library would use std::string wherever possible?
By taking a C string the C++03 std::fstream class reduced dependency on the std::string class. In C++11, however, the std::fstream class does allow passing a std::string for its constructor parameter.
Now, you may wonder why there isn't a transparent conversion from a std::string to a C string, so a class that expects a C string could still take a std::string just like a class that expects a std::string can take a C string.
The reason is that this would cause a conversion cycle, which in turn may lead to problems. For example, suppose std::string were convertible to a C string so that you could use std::strings with fstreams. Suppose also that C strings are convertible to std::strings, as is the case in the current standard. Now, consider the following:
void f(std::string str1, std::string str2);
void f(const char* cstr1, const char* cstr2);
void g()
{
    const char* cstr = "abc";
    std::string str = "def";
    f(cstr, str); // ERROR: ambiguous
}
Because you can convert either way between a std::string and a C string the call to f() could resolve to either of the two f() alternatives, and is thus ambiguous. The solution is to break the conversion cycle by making one conversion direction explicit, which is what the STL chose to do with c_str().
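Since std::string has no such implicit conversion, the ambiguity can only be reproduced with a stand-in type. LooseString below is hypothetical and exists only to show the cycle; the explicit call mirrors what c_str() does for std::string.

```cpp
#include <cassert>
#include <string>

// Hypothetical string type that converts implicitly in BOTH directions.
struct LooseString {
    std::string data;
    LooseString(const char* s) : data(s) {}                 // C string -> LooseString
    operator const char*() const { return data.c_str(); }   // LooseString -> C string
    const char* c_str() const { return data.c_str(); }      // explicit spelling
};

int f(const LooseString&, const LooseString&) { return 1; }
int f(const char*, const char*) { return 2; }

int call_with_explicit_conversion() {
    const char* cstr = "abc";
    LooseString str("def");
    // f(cstr, str);  // would not compile: each overload needs exactly one
    //                // user-defined conversion, so neither is better
    return f(cstr, str.c_str());  // explicit conversion picks the C string overload
}

int call_with_two_strings() {
    LooseString a("a"), b("b");
    return f(a, b);  // exact match on the LooseString overload
}

// Usage sketch.
void demo_conversion_cycle() {
    assert(call_with_explicit_conversion() == 2);
    assert(call_with_two_strings() == 1);
}
```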
There are several places where the C++ standard committee did not really optimize the interaction between facilities in the standard library.
std::string and its use in the library is one of these.
One other example is std::swap. Many containers have a swap member function, but no overload of std::swap is supplied. The same goes for std::sort.
I hope all these small things will be fixed in the upcoming standard.
Maybe it's a consolation: all fstream's have gotten an open(string const &, ...) next to the open(char const *, ...) in the working draft of the C++0x standard.
(see e.g. 27.8.1.6 for the basic_ifstream declaration)
So when it gets finalised and implemented, it won't get you anymore :)
The stream IO library was added to the standard C++ library before the STL. In order not to break backward compatibility, it was decided to avoid modifying the IO library when the STL was added, even if that meant some issues like the one you raise.
@Bernard:
Monoliths "Unstrung." "All for one, and one for all" may work for Musketeers, but it doesn't work nearly as well for class designers. Here's an example that is not altogether exemplary, and it illustrates just how badly you can go wrong when design turns into overdesign. The example is, unfortunately, taken from a standard library near you...
~ http://www.gotw.ca/gotw/084.htm
It is inconsequential, that is true. What do you mean by std::string's interface being large? What does large mean, in this context - lots of method calls? I'm not being facetious, I am actually interested.
It has more methods than it really needs, and its behaviour of using integral offsets rather than iterators is a bit iffy (as it's contrary to the way the rest of the library works).
The real issue I think is that the C++ library has three parts; it has the old C library, it has the STL, and it has strings-and-iostreams. Though some efforts were made to bridge the different parts (e.g. the addition of overloads to the C library, because C++ supports overloading; the addition of iterators to basic_string; the addition of the iostream iterator adaptors), there are a lot of inconsistencies when you look at the detail.
For example, basic_string includes methods that are unnecessary duplicates of standard algorithms; the various find methods, could probably be safely removed. Another example: locales use raw pointers instead of iterators.
C++ grew up on smaller machines than the monsters we write code for today. Back when iostream was new many developers really cared about code size (they had to fit their entire program and data into several hundred KB). Therefore, many didn't want to pull in the "big" C++ string library. Many didn't even use the iostream library for the same reasons, code size.
We didn't have thousands of megabytes of RAM to throw around like we do today. We usually didn't have function level linking so we were at the mercy of the developer of the library to use a lot of separate object files or else pull in tons of uncalled code. All of this FUD made developers steer away from std::string.
Back then I avoided std::string too. "Too bloated", "called malloc too often", etc. Foolishly using stack-based buffers for strings, then adding all kinds of tedious code to make sure it doesn't overrun.
Is there any class in the STL that takes a string? I don't think so (I couldn't find any in my quick search). So it's probably a design decision that no class in the STL should depend on any other STL class (that is not directly needed for functionality).
I believe that this has been thought about and was done to avoid the dependency; i.e. #include <fstream> should not force one to #include <string>.
To be honest, this seems like quite an inconsequential issue. A better question would be, why is std::string's interface so large?
Nowadays you can solve this problem very easily: add -std=c++11 to your CXXFLAGS.