I don't see any logical reason. I mean, you can easily work around the restriction by using a structure containing an array member, like this:
#include <cstddef>

template <std::size_t n>
struct arr { int d[n]; };

auto fnReturningArray()
{
    return arr<3>{{0, 1, 2}};
}
This will behave exactly the same way as if the array were returned directly, with the small difference that you first have to access the structure member 'd' to use it. Also, the standard itself has added similar functionality with the 'std::array' type. So it seems possible to implement. Why then has ISO C++ forbidden this? Maybe legacy-code compatibility (but I can hardly believe that is the case, as other additions have long since broken with the past, for example the new meaning of the 'auto' keyword).
Besides the fact that the standard doesn't allow it, and the historical reasons that could explain it, the issue is syntactic:
Imagine it were permitted: how would you distinguish naming the whole array vs. the array's address vs. a single element?
auto fnReturningArray()
{
    int a[3] = {0, 1, 2};
    return a; // what is meant here? the address of the array? or the whole array?
}
If you changed the meaning of existing rules (such as stating that a would denote the whole array), you would have huge problems with legacy code.
The answer as I see it, is twofold:
Compatibility with C. C doesn't work this way. Why? No idea. C was never very logical to begin with, in various aspects.
C++ prefers library features over language features. C++98 was the first standard, and it mostly copied the basics from C (see point 1); this was corrected in the first major revision, C++11, which introduced the library type std::array. As a pure and simple library feature, it solves all the abhorrent quirks that C-style arrays entail.
To summarise: even though it might make sense to have proper value semantics for arrays, it will never happen, because the apparent deficiency can be solved without making the language more complex than it already is. It is extremely difficult to get rid of legacy and backwards compatibility, so the current option of std::array is really what you want. Use it. It's simple. You'll like it. A lot.
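For illustration (a minimal sketch of mine, not from the original answer), std::array gives you exactly the value semantics the question asks for:

#include <array>

std::array<int, 3> fnReturningArray()
{
    return {0, 1, 2}; // the whole array is returned by value
}

int main()
{
    auto a = fnReturningArray(); // a is a copy; no decay to a pointer
    return a[0] + a[1] + a[2];
}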
For example, the loop:
std::vector<int> vec;
...
for (auto& c : vec) { ... }
will iterate over vec, binding c to each element by reference (no copies are made).
Would there ever be a reason to do this?
for (int& c : vec) { ... }
The two code snippets will result in the same code being generated: with auto, the compiler will figure out that the underlying type is int, and do exactly the same thing.
However, the option with auto is more "future-proof": if at some later date you decide that int should be replaced with, say, uint8_t to save space, you wouldn't need to go through your code looking for references to the underlying type that may need to be changed, because the compiler will do it for you automatically.
Use auto wherever it makes the code better, but nowhere else. Understand the impact using auto overly-liberally has on maintainability.
The question here really is if there is any reason why you shouldn't use auto for anything you can.
Well, let's get one thing out of the way first of all. The fundamental reason why auto was introduced in the first place was two-fold.
First, it makes declarations of variables with complex types simpler and easier to read and understand. This is especially true when declaring an iterator in a for loop. Consider the C++03 pseudocode:
for (std::vector <Foo>::const_iterator it = myFoos.begin(); it != myFoos.end(); ++it)
This declaration only gets worse as myFoos's type becomes more complex. Moreover, if the type of myFoos is changed in a subtle way that is insignificant to the loop just written, the complex declaration must be revisited. This is a maintenance problem, made simpler in C++11:
for (auto it = myFoos.begin(); it != myFoos.end(); ++it)
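And when the loop body doesn't need the iterator itself, C++11's range-based for (a sketch of mine, not part of the original answer) removes the declaration entirely:

for (const auto& foo : myFoos) { /* use foo */ }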
Second, there are situations that are impossible to deal with without the facilities provided by auto and its sibling, decltype. This comes up in templates. Consider (source):
template<typename T, typename S>
void foo(T lhs, S rhs) {
auto prod = lhs * rhs;
//...
}
In C++03, the type of prod cannot always be inferred if the types of lhs and rhs are not the same. In C++11 this is possible using auto. It is also possible using decltype, but decltype too was only added in C++11, along with auto.
Many of the C++ elite suggest that you should use auto anywhere possible. The reason for this was stated by Herb Sutter at a recent conference:
It's shorter. It's more convenient. It's more future-proof, because
if you change your function's return type, auto just works.
However they also acknowledge the "sharp edges." There are many cases where auto doesn't do what you want or expect. These sharp edges can cut you when you want a type conversion.
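For instance (a hypothetical sketch of mine, not from Sutter's talk), auto can deduce a proxy type rather than the value you probably wanted:

#include <vector>

int main()
{
    std::vector<bool> flags{true, false, true};
    auto f = flags[0];   // f is std::vector<bool>::reference (a proxy), not bool
    bool b = flags[0];   // the explicit type forces the conversion you wanted
    flags[0] = false;    // f now reflects the change; b does not
    return b && f;
}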
So on the one hand we have a highly respected camp telling us "use auto everywhere possible, but nowhere else." This doesn't feel right to me, however. Most of the benefits that auto provides come at what I'm going to call "first-write time": the time when you are first writing a piece of code. As you're writing a big chunk of brand-new code, you can go ahead and use auto virtually everywhere and get exactly the behavior you expect. As you're writing the code, you know exactly what's going on with your types and variables. I don't know about you, but as I write code there is a constant stream of thoughts going through my head. How do I want to create this loop? What kind of thing do I want that function to return so I can use it here? How can I write this so that it is fast, correct, and easy to understand 6 months from now? Most of this is never expressed directly in the code that I write, except that the code I write is a direct result of these thoughts.
At that time, using auto makes writing this code simpler and easier. I don't have to burden my mind with all the little minutiae of signed versus unsigned, reference versus value, 32-bit versus 64-bit, etc. I just write auto and everything works.
But here's my problem with auto. 6 months later, when revisiting this code to add some major new functionality, the buzz of thought that was going through my mind when I first wrote the code has long been extinguished. My buffers were flushed long ago, and none of those thoughts are with me anymore. If I have done my job well, then the essence of those thoughts is expressed directly in the code I wrote. I can reverse-engineer what I was thinking just by looking at the structure of my functions and data types.
If auto is sprinkled everywhere, a big part of that cognizance is lost. I don't know what I was thinking would happen with this datatype because now it's inferred. If there was a subtle relationship taking place with an operator, that relationship is no longer expressed by the datatypes -- it's all just auto.
Maintenance becomes more difficult because now I have to re-create much of that thought. Subtle relationships become more clouded, and everything is just harder to understand.
So I'm not a fan of using auto everywhere possible. I think that makes maintenance harder than it has to be. That's not to say I believe auto should only be used where it's required. Rather, it's a balancing act. The four criteria I use to judge the quality of my (or anyone's) code are: efficiency, correctness, robustness, and maintainability, in no particular order. If we imagine a spectrum of auto use where one end is "purely optional" and the other is "strictly required", I feel that, in general, the closer to "purely optional" we get, the more maintainability suffers.
All this to say, finally, that my philosophy can be nutshelled as:
Use auto wherever it makes the code better, but nowhere else.
Understand the impact using auto overly-liberally has on
maintainability.
That depends on what you want to do with c:
You want to work with copies? Use auto c
You want to work with original items and possibly modify them? Use auto& c
You want to work with original items and not modify them? Use const auto& c
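A minimal sketch of the three options (my example, assuming a std::vector<int> named vec):

#include <vector>

std::vector<int> vec{1, 2, 3};

for (auto c : vec)        { c += 1; }  // modifies a copy; vec is unchanged
for (auto& c : vec)       { c += 1; }  // modifies the elements of vec in place
for (const auto& c : vec) { /* read c, cannot modify it */ }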
There are two conflicting interests here. On one side, it is just simpler to type auto& everywhere than to figure out the exact type in each loop. Some people will claim that it is also more future-proof if the type stored in the container changes in the future; I don't particularly agree: I'd rather have the compiler complain* and let me figure out whether the assumptions the loop makes about the type still hold.
On the other side, by using auto& (instead of the more explicit int&) you are hiding the type from the reader. The maintainer reading the code will have to think what auto means in this particular context, while int clearly has a single meaning. The same people, or at least a subset of them, will claim that you don't need to think, that the IDE will tell you the type if you hover the mouse over the variable... but I don't use an IDE nor particularly like the mouse...
Overall, this is mainly a matter of style. I prefer to see the types in the code rather than have the compiler infer them for me, as I find it easier to understand when I come back to the code some time later. But for quick coding, when I don't envision having to maintain the code a week from now, auto is more than sufficient.
* This assumes that you use a standards-compliant compiler, of which Visual Studio is not an example. The assumption is that if the type is wrong, a non-const reference won't bind to the value returned by dereferencing the iterator. In VS, the compiler will gladly convert types and bind a non-const reference to the rvalue! Maybe this is why Sutter, coming from the VS world, suggests using auto everywhere!
I agree with dasblinkenlight's answer, but since you are asking when int& is better than auto&, I can paraphrase it this way:
Use int& when you would like your code to break if/when someone decides to change the type of your vector.
For example: your vectors usually contain int16_t, but this particular one requires greater precision (assuming int has 32-bit or greater precision). You don't want someone to change the type from int to int16_t and get a program that contains a silent overflow in calculations.
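A sketch of that protection (my example, with hypothetical values): after someone changes the element type, the explicit int& loop fails to compile instead of silently narrowing:

#include <cstdint>
#include <vector>

int main()
{
    std::vector<std::int16_t> vec{30000, 2, 3};  // was std::vector<int>

    // for (int& c : vec) { c += c; }            // error: int& cannot bind to int16_t
    for (auto& c : vec) { c += c; }              // compiles silently; 30000 + 30000 overflows int16_t
    return 0;
}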
Another example: your code looks like this:
namespace joes_lib
{
int frobnicate(int);
}
for (int& c : vec) { c = frobnicate(c); }
Here, if someone changes vec to something like vector<int16_t> or, worse, vector<unsigned>, auto will silently succeed and lead to loss of precision in joe's library function. The compiler may or may not generate warnings about this.
These examples look clumsy and obscure, so you may want to comment such usage of non-auto types in loops.
I think that unions can be perfect for what I have in mind, especially when I consider that my code should run on a really heterogeneous family of machines, in particular low-powered ones. What is bugging me is that the people who create compilers don't seem to care much about introducing and offering good union support; for example, this table is practically empty when it comes to Unrestricted Unions support, which is a real problem for my projects.
Are there alternatives to union that can at least mimic the same properties?
Union is well supported on most compilers; what's not well supported are unions that contain members with non-trivial constructors (unrestricted unions). In practice, you'd almost always want a custom constructor when creating such unions, so not having unrestricted unions is more a matter of inconvenience.
Alternatively, you can always use a void pointer pointing to malloc-ed memory of sufficient size for the largest member. The drawback is that you'd need explicit type casting.
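A minimal sketch of that idea (my example, glossing over alignment and object-lifetime subtleties):

#include <cstdlib>

struct Big { double d[4]; };   // hypothetical largest "member"

int main()
{
    // malloc returns memory suitably aligned for any fundamental type,
    // sized here for the largest "member"
    void* storage = std::malloc(sizeof(Big));

    *static_cast<int*>(storage) = 42;      // use the storage as an int
    *static_cast<Big*>(storage) = Big();   // reuse the same storage as a Big

    std::free(storage);
    return 0;
}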
One popular alternative to union is Boost.Variant.
The types you use in it have to be copy-constructible.
Update:
C++17 introduced std::variant. Its requirements are based on Boost.Variant, modernized to take into account the features of C++17, and it does not carry the overhead of being compatible with C++98 compilers (as Boost does). It comes for free with the standard library. If it is available, it is better to use it as an alternative to unions.
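A minimal std::variant sketch (my example; requires C++17):

#include <string>
#include <variant>

int main()
{
    std::variant<int, std::string> v = 42;   // currently holds an int
    v = std::string("hello");                // now holds a string; no manual destructor calls

    // std::get checks the active member (throws std::bad_variant_access on a mismatch)
    return std::get<std::string>(v).size() == 5 ? 0 : 1;
}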
You can always do essentially the same thing using explicit casts:
#include <new>

struct A { double payload; };   // hypothetical placeholder for your largest type

struct YourUnion {
    // storage for the largest member, aligned for all of them
    alignas(A) char x[sizeof(A) > sizeof(int) ? sizeof(A) : sizeof(int)];
    A   &field1() { return *reinterpret_cast<A*>(&x[0]); }
    int &field2() { return *reinterpret_cast<int*>(&x[0]); }
};
YourUnion y;
new(&y.field1()) A(); // construct A
y.field1().~A(); // destruct A
y.field2() = 1;
// ...
(This is zero overhead compared to, e.g., Boost.Variant.)
EDIT: more union-like and without templates.
EDIT, 11 years after I asked this question: I feel vindicated for asking! C++20 finally did something close enough.
The original question follows below.
--
I have been using yield in many of my Python programs, and it really clears up the code in many cases. I blogged about it and it is one of my site's popular pages.
C# also offers yield – it is implemented via state-keeping in the caller side, done through an automatically generated class that keeps the state, local variables of the function, etc.
I am currently reading about C++0x and its additions, and while reading about the implementation of lambdas in C++0x, I found out that they were done via automatically generated classes too, equipped with an operator() storing the lambda's code. The natural question formed in my mind: they did it for lambdas, so why didn't they consider it for supporting "yield", too?
Surely they can see the value of coroutines... so I can only guess that they consider macro-based implementations (such as Simon Tatham's) an adequate substitute. They are not, however, for many reasons: callee-kept state, non-reentrancy, being macro-based (that alone is reason enough), etc.
Edit: yield doesn't depend on garbage collection, threads, or fibers. You can read Simon's article to see that I am talking about the compiler doing a simple transformation, such as:
int fibonacci() {
    int a = 0, b = 1;
    while (true) {
        yield a;
        int c = a + b;
        a = b;
        b = c;
    }
}
Into:
struct GeneratedFibonacci {
    int state;
    int a, b;

    GeneratedFibonacci() : state (0), a (0), b (1) {}

    int operator()() {
        switch (state) {
        case 0:
            state = 1;
            while (true) {
                return a;
        case 1:
                int c = a + b;
                a = b;
                b = c;
            }
        }
    }
};
Garbage collection? No. Threads? No. Fibers? No. Simple transformation? Arguably, yes.
I can't say why they didn't add something like this, but in the case of lambdas, they weren't simply added to the language either.
They started life as a library implementation in Boost, which proved both that
lambdas are widely useful (a lot of people will use them when they're available), and that
a library implementation in C++03 suffers a number of shortcomings.
Based on this, the committee decided to adopt some kind of lambdas into C++0x, and I believe they initially experimented with adding more general language features that would allow a better library implementation than Boost's.
Eventually, they made lambdas a core language feature because they had no other choice: it wasn't possible to make a good enough library implementation.
New core language features aren't simply added to the language because they seem like a good idea. The committee is very reluctant to add them, and the feature in question really needs to prove itself. It must be shown that the feature is:
possible to implement in the compiler,
going to solve a real need, and
that a library implementation wouldn't be good enough.
In the case of a yield keyword, we know that the first point can be solved. As you've shown, it is a fairly simple transformation that can be done mechanically.
The second point is tricky. How much of a need for this is there? How widely used are the library implementations that exist? How many people have asked for this, or submitted proposals for it?
The last point seems to pass too. At least in C++03, a library implementation suffers some flaws, as you pointed out, which could justify a core language implementation. Could a better library implementation be made in C++0x though?
So I suspect the main problem is really a lack of interest. C++ is already a huge language, and no one wants it to grow bigger unless the features being added are really worth it. I suspect that this just isn't useful enough.
Adding a keyword is always tricky, because it invalidates previously valid code. You try to avoid that in a language with a code base as large as C++.
The evolution of C++ is a public process. If you feel yield should be in there, formulate an appropriate request to the C++ standard committee.
You will get your answer, directly from the people who made the decision.
They did it for lambdas, why didn't they consider it for supporting yield, too?
Check the papers. Did anyone propose it?
...I can only guess that they consider macro-based implementations to be an adequate substitute.
Not necessarily. I'm sure they know such macro solutions exist, but replacing them isn't enough motivation, on its own, to get new features passed.
Even though there are various issues around a new keyword, those could be overcome with new syntax, such as was done for lambdas and using auto as a function return type.
Radically new features need strong drivers (i.e. people) to fully analyze and push features through the committee, as they will always have plenty of people skeptical of a radical change. So even absent what you would view as a strong technical reason against a yield construct, there may still not have been enough support.
But fundamentally, the C++ standard library has embraced a different concept of iterators than you'd see with yield. Compare to Python's iterators, which only require two operations:
an_iter.next() returns the next item or raises StopIteration (next() builtin included in 2.6 instead of using a method)
iter(an_iter) returns an_iter (so you can treat iterables and iterators identically in functions)
C++'s iterators are used in pairs (which must be of the same type) and are divided into categories; it would be a semantic shift to transition to something more amenable to a yield construct, and that shift wouldn't fit well with concepts (which have since been dropped, but came relatively late). For example, see the rationale for (justifiably, if disappointingly) rejecting my comment on changing range-based for loops to a form that would make writing this different kind of iterator much easier.
To concretely clarify what I mean about different iterator forms: your generated-code example needs another type to serve as the iterator, plus the associated machinery for obtaining and maintaining those iterators. Not that it couldn't be handled, but it's not as simple as you may at first imagine. The real complexity is making the "simple transformation" respect exceptions for "local" variables (including during construction), controlling the lifetime of "local" variables in local scopes within the generator (most would need to be saved across calls), and so forth.
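To give a feel for that machinery (a hypothetical sketch of mine, reusing the GeneratedFibonacci functor from the question and ignoring the exception and lifetime issues just mentioned), even a bare-bones iterator wrapper needs this much scaffolding:

struct FibIterator {
    GeneratedFibonacci* gen;   // nullptr would play the role of an "end" iterator
    int current;

    int operator*() const { return current; }
    FibIterator& operator++() { current = (*gen)(); return *this; }
    bool operator!=(const FibIterator& other) const { return gen != other.gen; }
};

GeneratedFibonacci fib;
FibIterator it{&fib, fib()};   // prime the iterator with the first value: 0
++it;                          // subsequent values: 1, 1, 2, 3, 5, ...

And that still leaves out the iterator category, the traits typedefs, and a range type to produce begin/end pairs.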
So it looks like it didn't make it into C++11 or C++14, but it might be on its way to C++17. Take a look at the lecture C++ Coroutines, a negative overhead abstraction from CppCon2015 and the paper here.
To summarize, they are working to extend C++ functions to have yield and await as features. It looks like there is an initial implementation in Visual Studio 2015; I'm not sure whether Clang has one yet. Also, it seems there may be some issues with using yield and await as the keywords.
The presentation is interesting because he speaks about how much it simplified networking code, where you are waiting for data to come in to continue the sequence of processing. Surprisingly, it looks like using these new coroutines results in faster/less code than what one would do today. It's a great presentation.
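For reference, a sketch of mine (not from the talk) of where this eventually landed: C++20 shipped coroutines with the co_yield and co_await keywords (sidestepping the keyword issue), and C++23 added std::generator, so the fibonacci example from the question becomes roughly:

#include <generator>   // C++23

std::generator<int> fibonacci() {
    int a = 0, b = 1;
    while (true) {
        co_yield a;    // suspends the coroutine and hands 'a' to the caller
        int c = a + b;
        a = b;
        b = c;
    }
}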
The resumable functions proposal for C++ can be found here.
In general, you can track what's going on by the committee papers, although it's better for keeping track rather than looking up a specific issue.
One thing to remember about the C++ committee is that it is a volunteer committee, and can't accomplish everything it wants to. For example, there was no hash-type map in the original standard, because they couldn't manage to make it in time. It could be that there was nobody on the committee who cared enough about yield and what it does to make sure the work got done.
The best way to find out would be to ask an active committee member.
Well, for such a trivial example as that, the only problem I see is that std::type_info::hash_code() is not specified constexpr. I believe a conforming implementation could still make it so and support this. Anyway the real problem is obtaining unique identifiers, so there might be another solution. (Obviously I borrowed your "master switch" construct, thanks.)
#define YIELD(X) do { \
    constexpr size_t local_state = typeid([](){}).hash_code(); \
    state = local_state; return (X); \
    case local_state: ; } \
while (0)
Usage:
struct GeneratedFibonacci {
    size_t state;
    int a, b;

    GeneratedFibonacci() : state (0), a (0), b (1) {}

    int operator()() {
        switch (state) {
        case 0:
            while (true) {
                YIELD( a );
                int c = a + b;
                a = b;
                b = c;
            }
        }
    }
};
Hmm, they would also need to guarantee that the hash isn't 0. No biggie there either. And a DONE macro is easy to implement.
The real problem is what happens when you return from a scope with local objects. There is no hope of saving off a stack frame in a C-based language. The solution is to use a real coroutine, and C++0x does directly address that with threads and futures.
Consider this generator/coroutine:
void ReadWords() {
    ifstream f( "input.txt" );
    while ( f ) {
        string s;
        f >> s;
        yield s;
    }
}
If a similar trick were used for yield, f would be destroyed at the first yield, and it would be illegal to continue the loop after it, because you can't goto or switch past a non-POD object definition.
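A minimal illustration of that rule (my sketch): the compiler rejects a jump that skips the initialization of a non-POD local:

#include <string>

void f(int state) {
    switch (state) {
    case 0:
        std::string s = "hello";   // non-POD local with an initializer
    case 1:                        // error: jump to case 1 crosses initialization of 's'
        s += "!";
    }
}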
There have been several implementations of coroutines as user-space libraries. However, and here is the deal, those implementations rely on non-standard details. For example, nowhere in the C++ standard is it specified how stack frames are kept. Most implementations just copy the stack, because that is how most C++ implementations work.
Regarding standards, C++ could have helped coroutine support by improving the specification of stack frames.
Actually, 'adding' it to the language doesn't sound like a good idea to me, because that would stick you with a 'good enough' implementation for most cases that is entirely compiler-dependent. For the cases where using a coroutine matters, this is not acceptable anyway.
I agree with @Potatoswatter first.
Supporting coroutines is not the same thing as supporting lambdas, and it is not a simple transformation like the one played with Duff's device.
You need full asymmetric (stackful) coroutines to work like generators in Python. Simon Tatham's implementation and Chris' are both stackless, while Boost.Coroutine is a stackful one, though it's heavy.
Unfortunately, C++11 still does not have yield for coroutines; maybe C++1y ;)
PS: If you really like Python-style generators, have a look at this.
I'm learning C++0x, at least the parts supported by the Visual C++ Express 2010 Beta.
This is a question about style rather than about how it works. Perhaps it's too early for style and good practice to have evolved for a standard that isn't even released yet...
In c++0x you can define the return type of a method using -> type at the end of the function instead of putting the type at the start. I believe this change in syntax is required due to lambdas and some use cases of the new decltype keyword, but you can use it anywhere as far as I know.
// Old style
int add1(int a, int b)
{
    return a + b;
}

// New style return type
auto add2(int a, int b) -> int
{
    return a + b;
}
My question really then, is given that some functions will need to be defined in the new way is it considered good style to define all functions in this way for consistency? Or should I stick to only using it when necessary?
Do not be style-consistent just for being consistent. Code should be readable, i.e. understandable, that's the only real measure. Adding clutter to 95% of the methods to be consistent with the other 5%, well, that just does not sound right to me.
There is a huge codebase that uses the 'old'/current rules, and I would bet that will be so for a long time. The problem of consistency is two-fold: who are you going to be consistent with, the little code that will require the new syntax, or all existing code?
I will stick with the old syntax where the new one is not required, but then again, only time will tell what becomes common usage.
Also note that the new syntax is still a little weird: you declare the return type as auto and then define what auto means at the end of the signature declaration... It does not feel natural (even setting aside one's own prior experience).
Personally, I would use it when it is necessary. Just like this-> is only necessary when accessing members of a base class template (or when they are otherwise hidden), so auto fn() -> type is only necessary when the return type can't be determined before the rest of the function signature is visible.
Using this rule of thumb will probably help the majority of code readers, who might think "why did the author think we need to write the declaration this way?" otherwise.
I don't think it is necessary to use it for regular functions. It has special uses, allowing you to do easily what might have been quite awkward before. For example:
template <class Container, class T>
auto find(Container& c, const T& t) -> decltype(c.begin());
Here we don't know whether Container is const or not, and hence whether the return type should be Container::iterator or Container::const_iterator (which can be determined from what begin() would return).
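A usage sketch (my example) of why the trailing form is needed here: the parameter c must already be in scope when the return type refers to it:

#include <vector>

template <class Container, class T>
auto find(Container& c, const T& t) -> decltype(c.begin())
{
    auto it = c.begin();
    for (; it != c.end(); ++it)
        if (*it == t) break;
    return it;
}

int main()
{
    std::vector<int> v{1, 2, 3};
    const std::vector<int>& cv = v;
    auto i  = find(v, 2);    // deduced return type: std::vector<int>::iterator
    auto ci = find(cv, 2);   // deduced return type: std::vector<int>::const_iterator
    return (*i == *ci) ? 0 : 1;
}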
Seems to me like it would be changing the habit of a lifetime for a lot of C++ (and other C like) programmers.
If you used that style for every single function then you might be the only one doing it :-)
I am going to guess that the current style will win out, as it has so far with every other proposed change to the definition. It has been extended, for sure, but the essential semantics of C++ are so ingrained that I don't think they are worth changing. They have influenced so many languages and style guides it's ridiculous.
As to your question, I would try and separate the code into modules to make it clear where you are using old style vs new style. Where the two mix I would make sure and delineate it as much as possible. Group them together, etc.
[personal opinion]I find it really jarring to surf through files and watch the style morph back and forth, or change radically. It just makes me wonder what else is lurking in there [/personal opinion]
Good style changes; if you don't believe me, look at what was good style in '98 and what is now. It is difficult to know what will be considered good style and why. IMHO, currently everything related to C++0x is experimental, and the qualification of good or bad style just doesn't apply yet.