I absolutely love the way Xcode offers insight into the available member functions of the language, and I would prefer to use it over, say, TextMate, if not for an oddity I noticed today.
When string s = "Test string"; the only available substr signature is as shown
From what I understand, however, and from what I see online, the signature should be
string substr ( size_t pos = 0, size_t n = npos ) const;
Indeed s.substr(1,2); is both understood and works in Xcode.
Why does it not show when I try to method-complete (Ctrl-Space)?
Xcode is performing the completion correctly, but it's not what you expect. You've actually answered the question yourself unknowingly. The function signature for string's substr() method, just as you said, is:
string substr ( size_t pos = 0, size_t n = npos ) const;
All arguments to substr() have default assignments, therefore to Xcode, s.substr() (with no arguments) is the valid code completion to insert because it's really s.substr(0, s.npos). You can confirm this with any number of standard C++ functions with default arguments. The easiest place to see this is with any STL container constructor.
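A minimal sketch of the point about default arguments, assuming a hosted C++ compiler. The helper names (substr_no_args and so on) are hypothetical and exist only for illustration; every call goes through the single substr signature quoted above.

```cpp
#include <cassert>
#include <string>

// Each helper calls the same member function:
//   string substr(size_t pos = 0, size_t n = npos) const;
// The only difference is how many arguments are left to their defaults.
std::string substr_no_args(const std::string& s)  { return s.substr(); }      // pos = 0, n = npos
std::string substr_one_arg(const std::string& s)  { return s.substr(1); }     // n = npos
std::string substr_two_args(const std::string& s) { return s.substr(1, 2); }  // nothing defaulted

// Hypothetical check function showing what each call yields.
void check_substr_defaults() {
    const std::string s = "Test string";
    assert(substr_no_args(s) == s);              // whole string
    assert(substr_no_args(s) == s.substr(0, std::string::npos));
    assert(substr_one_arg(s) == "est string");   // from index 1 to the end
    assert(substr_two_args(s) == "es");          // two characters from index 1
}
```

So the argument-free completion is just the shortest valid spelling of the one overload.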
Take for instance a vector. We all know that vectors can take an Allocator, but the default argument assigned Allocator is "good enough" for most casual uses. Sure enough, two of the signatures for vector constructors are:
explicit vector ( const Allocator& = Allocator() );
explicit vector ( size_type n, const T& value= T(), const Allocator& = Allocator() );
In both cases, the Allocator argument has a default value, and in the second, the T value argument has one as well. Now, take a look at what Xcode suggests when constructing a vector:
The suggestion with no argument list is actually the constructor that takes just an Allocator. The suggestion that takes just a size_type is actually the constructor that takes a size_type, T, and Allocator.
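As a rough illustration, the calls below leave the trailing arguments to their defaults. The function names are hypothetical. Which exact constructor declaration is selected depends on the standard version the library implements, but the call syntax is the same either way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// No arguments: the Allocator parameter is defaulted.
std::vector<int> make_default() { return std::vector<int>(); }

// Count only: the fill value and the Allocator are defaulted.
std::vector<int> make_count(std::size_t n) { return std::vector<int>(n); }

// Count and value: only the Allocator is defaulted.
std::vector<int> make_count_value(std::size_t n, int v) { return std::vector<int>(n, v); }

// Hypothetical check function showing the resulting containers.
void check_vector_constructors() {
    assert(make_default().empty());
    assert(make_count(3).size() == 3);
    assert(make_count(3)[0] == 0);           // value-initialized ints are zero
    assert(make_count_value(3, 7)[2] == 7);
}
```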
Depending on how you think about this, it may or may not be an Xcode bug. Ideally, you want to see completions with default arguments for simpler functions like substr(), but for STL container constructors, you probably almost never want to see them. Perhaps it could be an option, but I wouldn't expect to see this corrected. I'd happily dup a radar with you though.
Consider code like this:
std::string str = "abcdef";
const size_t num = 50;
const size_t baselen = str.length();
while (str.length() < num)
    str.append(str, 0, baselen);
Is it safe to call std::basic_string<T>::append() on the string itself like this? Couldn't the source memory be invalidated when the string is enlarged, before the copy takes place?
I could not find anything in the standard specific to that method. It says the above is equivalent to str.append(str.data(), baselen), which I think might not be entirely safe unless there is another detection of such cases inside append(const char*, size_t).
I checked a few implementations and they seemed safe one way or another, but my question is if this behavior is guaranteed. E.g. "Appending std::vector to itself, undefined behavior?" says it's not for std::vector.
According to §21.4.6.2/§21.4.6.3:
The function [basic_string& append(const charT* s, size_type n);] replaces the string controlled by *this with a string of length size() + n whose first size() elements are a copy of the original string controlled by *this and whose remaining elements are a copy of the initial n elements of s.
Note: This applies to every append call, as every append can be implemented in terms of append(const charT*, size_type), as defined by the standard (§21.4.6.2/§21.4.6.3).
So basically, append makes a copy of str (let's call the copy strtemp), appends n characters of str2 to strtemp, and then replaces str with strtemp.
For the case that str2 is str, nothing changes, as the string is enlarged when the temporary copy is assigned, not before.
Even though it is not explicitly stated in the standard, it is guaranteed (if the implementation is exactly as stated in the standard) by the definition of std::basic_string<T>::append.
Thus, this is not undefined behavior.
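The "copy, append, replace" reading described above can be sketched as a free function. This is only a model of the wording, under the assumption that the description is taken literally; append_via_temporary is a hypothetical name, and real implementations are free to do something more efficient.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of append(const charT* s, size_type n):
// build the result in a temporary, then replace the original.
void append_via_temporary(std::string& str, const char* s, std::size_t n) {
    std::string temp(str);  // copy of the original string
    temp.append(s, n);      // s may point into str, which is still untouched here
    str.swap(temp);         // only now is str replaced
}

// Hypothetical check: appending a string to itself through the model.
void check_append_via_temporary() {
    std::string str = "abcdef";
    append_via_temporary(str, str.data(), str.size());
    assert(str == "abcdefabcdef");
}
```

In this model the source buffer is read before the original string is modified, which is why aliasing is harmless.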
This is complicated.
One thing that can be said for certain. If you use iterators:
std::string str = "abcdef";
str.append(str.begin(), str.end());
then you are guaranteed to be safe. Yes, really. Why? Because the specification states that the behavior of the iterator functions is equivalent to calling append(basic_string(first, last)). That obviously creates a temporary copy of the string. So if you need to insert a string into itself, you're guaranteed to be able to do it with the iterator form.
Granted, implementations don't have to actually copy it. But they do need to respect the standard specified behavior. An implementation could choose to make a copy only if the iterator range is inside of itself, but the implementation would still have to check.
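If you would rather not depend on how an implementation handles the aliasing, the temporary can be spelled out by hand. A minimal sketch, with hypothetical function names, of the explicit-copy form and of the loop from the question rewritten to append from a separate copy:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Spelled-out form of the equivalence: append a temporary copy of the range.
std::string doubled(std::string str) {
    str.append(std::string(str.begin(), str.end()));
    return str;
}

// The question's loop, appending from a separate copy of the base string.
std::string repeat_to_at_least(std::string str, std::size_t num) {
    const std::string base(str);   // never modified, so never invalidated
    if (base.empty()) return str;  // avoid looping forever on an empty base
    while (str.length() < num)
        str.append(base);
    return str;
}

// Hypothetical check function.
void check_self_append_copies() {
    assert(doubled("abcdef") == "abcdefabcdef");
    assert(repeat_to_at_least("abcdef", 50).length() == 54);  // 9 copies of 6 characters
}
```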
All of the other forms of append are defined to be equivalent to calling append(const charT *s, size_t len). That is, your call to append above is equivalent to you doing append(str.data(), str.size()). So what does the standard say about what happens if s is inside of *this?
Nothing at all.
The only requirement of s is:
s points to an array of at least n elements of charT.
Since it does not expressly forbid s pointing into *this, it must be allowed. It would also be exceedingly strange if the iterator version allowed self-append but the pointer-and-size version did not.
I had a fairly quick question. The std::vector provides the following two constructors:
explicit vector( const Allocator& alloc = Allocator()); // default constructor
explicit vector( size_type count, // fill constructor
const T& value = T(),
const Allocator& alloc = Allocator());
Is there any particular reason the default constructor is not implemented with a default value of 0 for the first parameter in the fill constructor? I can imagine there must be a reason but I can't immediately see one.
Because you then can't pass just an allocator, without providing the count or default element (aka value) as well?
Defaulting count to 0 would result in an ambiguity error.
It would be much simpler if C++ had named parameters like Python. Boost has such a library, but that incurs some runtime overhead (I can't remember how much right now). I have often used this library in testing, but not where performance is important.
The reason is that the constructors place different requirements on the type contained in the vector. To use the second one, the type must be copy-constructible, and if you use the default argument for value, it must be default-constructible too. The first constructor places no such requirements on the contained type.
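A small sketch of those requirements, using a hypothetical element type NoDefault that is copy-constructible but has no default constructor:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type: copyable, but not default-constructible.
struct NoDefault {
    explicit NoDefault(int v) : value(v) {}
    int value;
};

// The default constructor of vector asks nothing of the element type.
std::vector<NoDefault> make_empty() { return std::vector<NoDefault>(); }

// Count plus prototype only needs copy construction.
std::vector<NoDefault> make_filled(std::size_t n, int v) {
    return std::vector<NoDefault>(n, NoDefault(v));
}

// std::vector<NoDefault>(n) would not compile: it needs a default
// constructor to produce the elements (or the defaulted prototype).

// Hypothetical check function.
void check_no_default() {
    assert(make_empty().empty());
    assert(make_filled(3, 7).size() == 3);
    assert(make_filled(3, 7)[0].value == 7);
}
```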
Note that the constructors you're showing in the question only existed until C++11. There, it was sufficient to differentiate those two scenarios (since any type stored in a std::vector had to be copy-constructible). C++11 introduced move semantics, and the second constructor was split further:
explicit vector(size_type count);
vector(
    size_type count,
    const T& value,
    const Allocator& alloc = Allocator()
);
That's because std::vector no longer requires its contained types to be copy-constructible; move-constructibility is enough. Therefore the count-only constructor requires default constructibility (but not copy constructibility), and the count + prototype constructor requires copy constructibility (but not default constructibility).
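A sketch of the count-only constructor with a move-only element type, assuming a C++11 or later compiler. std::unique_ptr is default-constructible but not copyable, so only the count-only form works for it; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Count-only constructor: elements are value-initialized, no copy needed.
std::vector<std::unique_ptr<int>> make_null_pointers(std::size_t n) {
    return std::vector<std::unique_ptr<int>>(n);
}

// std::vector<std::unique_ptr<int>>(n, some_ptr) would not compile,
// because that constructor copies the prototype into each element.

// Hypothetical check function.
void check_move_only_elements() {
    std::vector<std::unique_ptr<int>> v = make_null_pointers(3);
    assert(v.size() == 3);
    assert(v[0] == nullptr);                   // default-constructed unique_ptr is null
    v.push_back(std::unique_ptr<int>(new int(5)));  // moved in, never copied
    assert(*v.back() == 5);
}
```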
The evolution of std::vector constructors is really quite complex. See their page on cppreference to see how much they have changed. That evolution includes adding an optional allocator parameter to the count-only constructor in C++14, which had (I assume) been mistakenly omitted in C++11.
I noticed that
std::string str;
str += 'b'; // works
str.append('b'); // does not work
str.append(1, 'b'); // works, but not as nice as the previous
Is there any reason why the append method does not support a single character to be appended? I assumed that the operator+= is actually a wrapper for the append method, but this does not seem to be the case.
I figure that operator+=() is intended to handle all the simple cases (taking only one parameter), while append() is for things that require more than one parameter.
I am actually more surprised about the existence of the single-parameter append( const basic_string& str ) and append( const CharT* s ) than about the absence of append( CharT c ).
Also note my comment above: char is just an integer type. Adding a single-parameter, integer-type overload to append(), or to the constructors (which, by design, have several integer-type overloads already), might introduce ambiguity.
Unless somebody finds some written rationale, or some committee members post here what they remember about the discussion, that's probably as good an explanation as any.
It is interesting to note that this form of append
string& append( size_type count, CharT ch );
mirrors the constructor taking similar input:
basic_string( size_type count,
CharT ch,
const Allocator& alloc = Allocator() );
And some other methods that take a count with a character, such as resize( size_type count, CharT ch );.
The string class is large and it is possible that the particular use case (and overload) for str.append('b'); was not considered, or the alternatives were considered sufficient.
Simply introducing a single overload for this could introduce ambiguity if the integral types int and char correspond (on some platforms this may be the case).
There are several alternatives to the append adding a single character.
Appending a string containing a single character can be done with str.append("b");. Although this is not exactly the same, it has the same effect.
As mentioned, there is operator+=.
There is also push_back(), which is consistent with other standard containers
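The alternatives listed above can be put side by side. A minimal sketch (the function names are hypothetical) in which each statement appends one 'b':

```cpp
#include <cassert>
#include <string>

// Appends a single 'b' four different ways and returns the result.
std::string append_four_ways() {
    std::string str;
    str += 'b';          // operator+= with a single character
    str.append(1, 'b');  // count + character overload
    str.append("b");     // one-character C string
    str.push_back('b');  // container-style interface
    return str;
}

// Hypothetical check function.
void check_append_four_ways() {
    assert(append_four_ways() == "bbbb");
}
```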
Point is, it was probably never considered as a use case (or strong enough use case), thus, a suitable overload/signature was not added to append to cater for it.
Alternative designs could be debated, but given the maturity of the standard and this class, it is unlikely they will be changed soon - it could very well break a lot of code.
Alternate signatures for append could also be considered; one possible solution could have been to reverse the order of the count and char (possibly adding a default):
string& append(CharT ch, size_type count = 1);
Another, as described in some of the critiques of basic_string, is to remove append altogether, since there are many other methods that achieve what it does.
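The reversed-argument signature can be tried out today as a free function, without touching the standard class. append_char is a hypothetical helper, not part of any library; it simply forwards to the existing count + character overload.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: character first, count second, count defaulted to 1.
std::string& append_char(std::string& s, char ch, std::size_t count = 1) {
    return s.append(count, ch);
}

// Hypothetical check function.
void check_append_char() {
    std::string s = "a";
    append_char(s, 'b');      // count defaults to 1
    append_char(s, 'c', 3);   // explicit count
    assert(s == "abccc");
}
```

Because the helper has its own name, it avoids the overload ambiguity concerns raised above.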
It just occurred to me that std::string's substr operation could be much more efficient for rvalues if it could steal the allocated memory from *this.
The Standard library of N3225 contains the following member function declaration of std::string
basic_string substr(size_type pos = 0, size_type n = npos) const;
Can an implementation that could implement an optimized substr for rvalues overload that and provide two versions, one of which could reuse the buffer for rvalue strings?
basic_string substr(size_type pos = 0) &&;
basic_string substr(size_type pos, size_type n) const;
I imagine the rvalue version could be implemented as follows, reusing the memory of *this and setting *this to a moved-from state.
basic_string substr(size_type pos = 0) && {
    basic_string __r;
    __r.__internal_share_buf(pos, __start + pos, __size - pos);
    __start = 0; // or whatever the 'empty' state is
    return __r;
}
Does this work in an efficient fashion on common string implementations or would this take too much housekeeping?
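The overloading mechanism being asked about (ref-qualified member functions) can be sketched on a user-defined wrapper rather than on std::string itself. Text and tail are hypothetical names, and the sketch assumes a compiler that supports C++11 ref-qualifiers. It shows only the overload selection, not a real library optimization.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical wrapper with lvalue and rvalue versions of a "tail" operation.
class Text {
public:
    explicit Text(std::string s) : data_(std::move(s)) {}

    // Called on lvalues: must leave *this intact, so it copies.
    std::string tail(std::size_t pos) const & {
        return data_.substr(pos);
    }

    // Called on rvalues: may reuse the existing buffer.
    std::string tail(std::size_t pos) && {
        assert(pos <= data_.size());  // substr would throw here; this sketch asserts
        data_.erase(0, pos);          // shift the tail to the front, in place
        return std::move(data_);      // hand the buffer to the result
    }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};

// Hypothetical check function.
void check_ref_qualified_tail() {
    Text t("some random string");
    assert(t.tail(5) == "random string");             // lvalue overload
    assert(t.str() == "some random string");          // source untouched
    assert(std::move(t).tail(5) == "random string");  // rvalue overload
}
```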
Firstly, an implementation cannot add an overload that steals the source, since that would be detectable:
std::string s="some random string";
std::string s2=std::move(s).substr(5,5);
assert(s=="some random string");
assert(s2=="rando");
The first assert would fail if the implementation stole the data from s, and the C++0x wording essentially outlaws copy on write.
Secondly, this wouldn't necessarily be an optimization anyway: you'd have to add additional housekeeping in std::string to handle the case where it is a substring of a larger string, and it would mean keeping large blocks around when there were no longer any strings referencing the large string, just some substring of it.
Yes, and maybe it should be proposed to the standards committee, or maybe implemented in a library. I don't really know how valuable the optimization would be. And that would be an interesting study all on its own.
When gcc grows support for r-value this, someone ought to try it and report how useful it is.
There are a few string classes out there implementing copy-on-write. But I wouldn't recommend adding yet another string type to your project unless really justified.
Check out the discussion in Memory-efficient C++ strings (interning, ropes, copy-on-write, etc)
I like to initialize 2-dimensional arrays as vector<vector<int> >(x,y). x is passed to vector<vector<int> >'s constructor and y is passed to vector<int>'s constructor, x times. Although this seems to be forbidden by C++03, because the constructor is explicit, it always works, even on Comeau. I can also call vector::assign like this. But, for some reason, not vector::push_back.
vector< vector< int > > v( 20, 40 ); // OK: convert 40 to const T&
v.assign( 30, 50 ); // OK
v.push_back( 2 ); // error: no conversion, no matching function, etc.
Are the first two examples actually compliant for some reason? Why can I convert 40 and 50 but not 2?
Epilogue: see http://gcc.gnu.org/onlinedocs/libstdc++/ext/lwg-defects.html#438 for why most compilers allow this, but the standard is shifting the other way.
Your assumption about Comeau implicitly calling an explicit constructor is most likely incorrect. The behavior is indeed broken, but the problem is different.
I suspect that this is a bug in the implementation of the Standard Library that comes with Comeau, not in the core Comeau compiler itself (although the line is blurry in this case).
If you build a quick dummy class that has constructor properties similar to std::vector and try the same thing, you'll discover that the compiler correctly refuses to perform construction.
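Such a dummy class might look like the following sketch. Inner and Outer are hypothetical stand-ins for vector<int> and vector<vector<int> >, with only the explicit constructors relevant here, and the static_asserts assume C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Stand-in for the inner vector: explicit size constructor only.
struct Inner {
    explicit Inner(std::size_t n) : size(n) {}
    std::size_t size;
};

// Stand-in for the outer vector: explicit (count, value) constructor.
struct Outer {
    explicit Outer(std::size_t n, const Inner& value = Inner(0))
        : count(n), each(value.size) {}
    std::size_t count;
    std::size_t each;
};

// The explicit constructor blocks the implicit int -> Inner conversion...
static_assert(!std::is_convertible<int, Inner>::value,
              "int must not convert implicitly to Inner");
// ...so Outer(20, 40) is rejected, while Outer(20, Inner(40)) is accepted.
static_assert(!std::is_constructible<Outer, int, int>::value,
              "Outer(int, int) must be rejected");
static_assert(std::is_constructible<Outer, int, Inner>::value,
              "Outer(int, Inner) must be accepted");

// Hypothetical check function.
void check_dummy_classes() {
    Outer o(20, Inner(40));
    assert(o.count == 20);
    assert(o.each == 40);
}
```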
The most likely reason why it accepts your code is the well-known formal ambiguity of two-parameter constructor of std::vector. It can be interpreted as (size, initial value) constructor
explicit vector(size_type n, const T& value = T(),
const Allocator& = Allocator());
or as (begin, end) template constructor with the latter accepting two iterators
template <class InputIterator>
vector(InputIterator first, InputIterator last,
const Allocator& = Allocator());
The standard library specification explicitly states that when two integral values are used as arguments, the implementation must make sure somehow that the former constructor is selected, i.e. (size, initial value). How it is done doesn't matter. It can be done at the library level, or it can be hardcoded in the core compiler. It can be done in any other way.
However, in response to the ( 20, 40 ) arguments, the Comeau compiler appears to erroneously select and instantiate the latter constructor with InputIterator = int. I don't know how it manages to compile the specialized version of the constructor, since integral values can't and won't work as iterators.
If you try this
vector< vector< int > > v( 20U, 40 );
you'll discover that the compiler reports an error now (since it can no longer use the two-iterator version of the constructor) and the explicit on the first constructor prevents it from converting 40 to a std::vector.
The same thing happens with assign. This is certainly a defect of the Comeau implementation, but, once again, experiments show that most likely the required behavior was supposed to be enforced at the library level (the core compiler seems to work OK), and somehow it got done incorrectly.
On second thought, I see that the main idea in my explanation is correct, but the details are wrong. Also, I may be wrong about calling it a problem in Comeau. It is possible that Comeau is right here.
The standard says in 23.1.1/9 that
the constructor
template <class InputIterator>
X(InputIterator f, InputIterator l, const Allocator& a = Allocator())
shall have the same effect as:
X(static_cast<typename X::size_type>(f),
static_cast<typename X::value_type>(l),
a)
if InputIterator is an integral type
I suspect that if the above is interpreted literally, the compiler is allowed to assume that an explicit static_cast is implied there (well... so to say), and the code is legal for the same reason static_cast< std::vector<int> >(10) is legal, despite the corresponding constructor's being explicit. The presence of static_cast is what makes it possible for the compiler to use the explicit constructor.
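The static_cast point can be checked directly. A minimal sketch, with a hypothetical function name and a C++11 static_assert: the explicit constructor rules out the implicit conversion, but an explicit cast performs direct-initialization and is therefore allowed.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// No implicit conversion from int, because the size constructor is explicit.
static_assert(!std::is_convertible<int, std::vector<int>>::value,
              "int must not convert implicitly to std::vector<int>");

// An explicit cast is direct-initialization, so the explicit constructor is usable.
// n is assumed to be non-negative in this sketch.
std::vector<int> cast_to_vector(int n) {
    return static_cast<std::vector<int>>(n);
}

// Hypothetical check function.
void check_cast_to_vector() {
    assert(cast_to_vector(10).size() == 10);
    assert(cast_to_vector(10)[0] == 0);  // value-initialized elements
}
```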
If the behavior of the Comeau compiler is correct (and I suspect that it is in fact correct, as required by the standard), I wonder whether it was the intent of the committee to leave such a loophole open and allow implementations to work around the explicit restriction possibly present on the constructor of the vector element.
Are the first two examples actually compliant for some reason?
They are not compliant on the compiler I just tried. (gcc 4.4.1)
Why can I convert 40 and 50 but not 2?
Since the first two lines are not consistent with the standard, only Comeau may know why their inconsistency is inconsistent.
It is not an accident that the standard requires explicit conversions from int types to arbitrary vectors. It is done to prevent confusing code.
vector< vector< int > > v( 20, 40 ); uses a constructor that you might not be familiar with. The constructor called here is vector(iterator start, iterator end);
Internally, it is specialized with int as the iterator type, so the first parameter is treated as the count and the second parameter as the value used to initialize the vector. Because a cast is applied when the second parameter is converted to the vector's value type, the inner vector's constructor vector<T>(int, const T&) is called with the value 40. Therefore, each inner vector is constructed with 40 zeros.
Your examples aren't compliant. The constructor is explicit, as you said, so you are not allowed to pass an int (y) instead of a vector to the constructor (and y is not passed "x times" to the vector constructor: the second parameter is only created once, to initialize the inserted objects).
Your examples don't work under gcc (4.4 & 4.5).
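For reference, a form that does not rely on any of the behavior debated above is to construct the inner vector explicitly. A minimal sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build a rows x cols grid by passing an explicitly constructed inner vector.
std::vector<std::vector<int> > make_grid(std::size_t rows, std::size_t cols) {
    return std::vector<std::vector<int> >(rows, std::vector<int>(cols));
}

// Hypothetical check function covering construction, assign and push_back.
void check_make_grid() {
    std::vector<std::vector<int> > v = make_grid(20, 40);
    assert(v.size() == 20);
    assert(v[0].size() == 40);

    v.assign(30, std::vector<int>(50));  // explicit inner vector for assign
    assert(v.size() == 30);
    assert(v[29].size() == 50);

    v.push_back(std::vector<int>(2));    // explicit inner vector for push_back
    assert(v.size() == 31);
    assert(v.back().size() == 2);
}
```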