Is this expression valid? - c++

I've met this code:
std::string str;
std::getline(std::cin, str);
std::string sub = str.substr(str.find('.') + 1);
My first reaction was that this is invalid code, but after some thought it seems not so simple. So is it a valid C++ expression (does it have predictable behavior)?
PS: In case it is not clear, the question is mostly about what happens when '.' is not found in str, but it is not limited to that, as there could be other issues.

str.find('.') returns the index of the first occurrence of the character in your string. substr with one argument returns the suffix of the string starting at the given index. So what the line does is it returns the tail of the string starting right after the first dot. So if str is "hello.good.bye", sub will be good.bye.
However, there is potentially a problem with the code if the string does not in fact contain any dots: it will return the whole string. This may or may not be intended. This happens because if there is no dot, find will return npos, which is the largest number std::string::size_type can hold. Add 1 to it, and you will get 0 (that's how unsigned types behave: arithmetic is modulo 2^n, where n is the number of bits in the type).
So the code always has predictable behavior.
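A minimal sketch of the behavior described above. The helper names tail_after_first_dot and demo_tail are made up for illustration; only the substr/find expression comes from the question.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper around the expression from the question.
std::string tail_after_first_dot(const std::string& str)
{
    // find() yields npos when there is no '.'; npos + 1 wraps to 0
    // because size_type is unsigned, so substr(0) copies the whole string.
    return str.substr(str.find('.') + 1);
}

// Hypothetical usage check covering the cases discussed in the answer.
void demo_tail()
{
    assert(std::string::npos + 1 == 0);                      // unsigned wrap-around
    assert(tail_after_first_dot("hello.good.bye") == "good.bye");
    assert(tail_after_first_dot("nodots") == "nodots");      // no dot: whole string
    assert(tail_after_first_dot("ends.") == "");             // dot is last: empty tail
}
```

The last case works because find returns size() - 1, so the argument to substr equals size(), which is still a legal position.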

From http://en.cppreference.com/w/cpp/string/basic_string/npos it seems that std::string::npos = -1;. That value is returned when the find is unsuccessful. In that case, the str.substr(0) will return the entire string.
It would seem it is valid and predictable code.

If you're asking about what happens if . is not found, you don't have to worry. std::string::find then returns std::string::npos, which the standard defines as -1 converted to the unsigned size_type (that is, its largest value). Adding 1 wraps around and makes the argument 0:
std::string sub = str.substr(0);
which gives you the whole string. I don't know if that's the desired behavior, but it's certainly not undefined behavior.

Actually, in this specific case--because the string that contains "...does not matter..." actually does--the find() call is not going to find what it is looking for and so will return std::string::npos. You then add 1 to this.
npos is of type std::string::size_type, which is generally size_t, which is usually an unsigned integer of some sort.
npos is defined as the greatest possible value for size_type.
With size_type being unsigned adding 1 to the greatest possible value creates 0.
So you're calling std::string::substr(0). What does this do? It creates a copy of the whole string you called it on because substr takes a starting position and length (defaulting to npos, or "all the way to the end").

As far as I can see it is valid though it isn't particularly readable.
If str is empty then the find() method will return std::string::npos. This is the largest value representable by std::string::size_type. Adding 1 to it wraps around to 0; unsigned arithmetic is well defined, so this is not undefined behavior. This means the substr() method creates a string using chars from position 0 to the end of the string. If str is empty then sub is also empty.
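If readability is the concern, one possible rewrite makes the "no dot" case explicit instead of relying on the wrap-around. This is only a sketch; after_first_dot_or_all and demo_explicit are hypothetical names, and returning the whole string is an assumption that mirrors what the original one-liner does.

```cpp
#include <cassert>
#include <string>

// Hypothetical, more explicit variant of the one-liner from the question.
std::string after_first_dot_or_all(const std::string& str)
{
    const std::string::size_type dot = str.find('.');
    if (dot == std::string::npos) {
        return str;              // no dot: keep the whole string, as substr(0) would
    }
    return str.substr(dot + 1);  // dot + 1 <= size(), so substr cannot throw here
}

// Hypothetical usage check.
void demo_explicit()
{
    assert(after_first_dot_or_all("") == "");        // empty input, empty result
    assert(after_first_dot_or_all("abc") == "abc");
    assert(after_first_dot_or_all("a.b") == "b");
}
```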

Related

How to compare const char* with a string in C++?

I am working with C++ and I am trying to compare strings.
Below is my code which gives me back const char* -
const char* client_id() const {
return String(m_clientPos);
}
And now I am comparing the strings like this -
cout<<client_ptr->client_id()<< endl;
if (strcmp(client_ptr->client_id(), "Hello")) {
..
} else {
..
}
But it never goes into the if branch, even though my cout prints Hello correctly. Am I doing anything wrong?
You need to do if (0 == strcmp(...
See http://www.cplusplus.com/reference/cstring/strcmp/
strcmp
Returns an integral value indicating the relationship between the strings:
A zero value indicates that both strings are equal.
A value greater than zero indicates that the first character that does not match has a greater value in str1 than in str2; And a value less than zero indicates the opposite.
it never goes into if statement.
The strcmp function returns zero when the strings are the same, so you should see the code hit the else branch when the two strings are equal to each other.
Since String does not look like a built-in class and assuming that you have access to its source, you may be better off making the comparison with const char* a member function of the String class.
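A short sketch of the two comparison styles. The function names are hypothetical, and std::string stands in for the question's own String class, whose interface is not shown.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// True when two C strings have the same contents.
bool same_text(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;   // strcmp returns 0 for equal strings
}

// Alternative (assumption: std::string is acceptable here):
// operator== compares contents, not pointer values.
bool same_text_string(const std::string& a, const char* b)
{
    return a == b;
}

// Hypothetical usage check.
void demo_compare()
{
    assert(same_text("Hello", "Hello"));
    assert(!same_text("Hello", "World"));
    assert(same_text_string("Hello", "Hello"));
}
```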

Does an empty string contain an empty string in C++?

Just had an interesting argument in the comment to one of my questions. My opponent claims that the statement "" does not contain "" is wrong.
My reasoning is that if "" contained another "", that one would also contain "" and so on.
Who is wrong?
P.S.
I am talking about a std::string
P.S. P.S
I was not talking about substrings, but even if I add to my question " as a substring", it still makes no sense. An empty substring is nonsense. If you allow empty substrings to be contained in strings, that means you have an infinity of empty substrings. What is the point of that?
Edit:
Am I the only one that thinks there's something wrong with the function std::string::find?
C++ reference clearly says
Return Value: The position of the first character of the first match.
Ok, let's assume it makes sense for a minute and run this code:
string empty1 = "";
string empty2 = "";
int position = empty1.find(empty2);
cout << "found \"\" at index " << position << endl;
The output is: found "" at index 0
Nonsense part: how can there be index 0 in a string of length 0? It is nonsense.
To be able to even have a 0th position, the string must be at least 1 character long.
And C++ throws an exception in this case, which proves my point:
cout << empty2.at( empty1.find(empty2) ) << endl;
If it really contained an empty string it would have no problem printing it out.
It depends on what you mean by "contains".
The empty string is a substring of the empty string, and so is contained in that sense.
On the other hand, if you consider a string as a collection of characters, the empty string can't contain the empty string, because its elements are characters, not strings.
Relating to sets, the set
{2}
is a subset of the set
A = {1, 2, 3}
but {2} is not a member of A - all A's members are numbers, not sets.
In the same way, {} is a subset of {}, but {} is not an element in {} (it can't be because it's empty).
So you're both right.
C++ agrees with your "opponent":
#include <iostream>
#include <string>
using namespace std;
int main()
{
bool contains = string("").find(string("")) != string::npos;
cout << "\"\" contains \"\": "
<< boolalpha << contains;
}
Output: "" contains "": true
It's easy. String A contains sub-string B if there is an argument offset such that A.substr(offset, B.size()) == B. No special cases for empty strings needed.
So, let's see. std::string("").substr(0,0) turns out to be std::string(""). And we can even check your "counter-example". std::string("").substr(0,0).substr(0,0) is also well-defined and empty. Turtles all the way down.
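The definition can be transcribed almost literally. This is a sketch for illustration, not how std::string::find is implemented; contains and demo_contains are hypothetical names.

```cpp
#include <cassert>
#include <string>

// A contains B if some offset satisfies A.substr(offset, B.size()) == B.
bool contains(const std::string& a, const std::string& b)
{
    // offset == a.size() is still a legal position for substr.
    for (std::string::size_type offset = 0; offset <= a.size(); ++offset) {
        if (a.substr(offset, b.size()) == b) {
            return true;
        }
    }
    return false;
}

// Hypothetical usage check.
void demo_contains()
{
    assert(contains("", ""));        // the case from the question
    assert(contains("abc", ""));     // empty string is found in any string
    assert(contains("abc", "bc"));
    assert(!contains("abc", "cd"));
}
```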
The first thing that is unclear is whether you are talking about std::string or null-terminated C strings; the second is why it should matter. I will assume std::string.
The requirements on std::string determine how the component must behave, not what its internal representation must be (although some of the requirements affect the internal representation). As long as the requirements for the component are met, whether it holds something internally is an implementation detail that you might not even be able to test.
In the particular case of an empty string, there is nothing that mandates that it holds anything. It could just hold a size member set to 0 and a pointer (for the dynamically allocated memory if/when not empty) also set to 0. The requirement in operator[] requires that it returns a reference to a character with value 0, but since that character cannot be modified without causing undefined behavior, and since strict aliasing rules allow reading from an lvalue of char type, the implementation could just return a reference to one of the bytes in the size member (all set to 0) in the case of an empty string.
Some implementations of std::string use small object optimizations, in those implementations there will be memory reserved for small strings, including an empty string. While the std::string will obviously not contain a std::string internally, it might contain the sequence of characters that compose an empty string (i.e. a terminating null character)
empty string doesn't contain anything - it's EMPTY. :)
Of course an empty string does not contain an empty string. It'll be turtles all the way down if it did.
Take String empty = ""; that is declaring a string literal that is empty, if you want a string literal to represent a string literal that is empty you would need String representsEMpty = """"; but of course, you need to escape it, giving you string actuallyRepresentsEmpty = "\"\"";
ps, I am taking a pragmatic approach to this. Leave the maths nonsense at the door.
Thinking about your amendment, it is possible that what your 'opponent' meant was that an 'empty' std::string still has internal storage for characters, which is itself empty of characters. That would be an implementation detail, I am sure; it could perhaps just keep an array of characters of a certain size (say 10) 'just in case', so it would technically not be empty.
Of course, there is the trick question answer that 'nothing' fits into anything infinite times, a sort of 'divide by zero' situation.
Today I had the same question since I'm currently bound to a lousy STL implementation (dating back to the pre-C++98 era) that differs from C++98 and all following standards:
TEST_ASSERT(std::string().find(std::string()) == string::npos); // WRONG!!! (non-standard)
This is especially bad if you try to write portable code because it's so hard to prove that no feature depends on that behaviour. Sadly in my case that's actually true: it does string processing to shorten phone numbers input depending on a subscriber line spec.
On Cppreference, I see in std::basic_string::find an explicit description about empty strings that I think matches exactly the case in question:
an empty substring is found at pos if and only if pos <= size()
The referred pos defines the position where to start the search, it defaults to 0 (the beginning).
A standard-compliant C++ Standard Library will pass the following tests:
TEST_ASSERT(std::string().find(std::string()) == 0);
TEST_ASSERT(std::string().substr(0, 0).empty());
TEST_ASSERT(std::string().substr().empty());
This interpretation of "contain" answers the question with yes.
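The rule quoted from Cppreference (an empty substring is found at pos if and only if pos <= size()) can be checked directly. This is a sketch; find_empty_from and demo_find_empty are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Searches for the empty string, starting at position pos.
std::string::size_type find_empty_from(const std::string& haystack,
                                       std::string::size_type pos)
{
    return haystack.find(std::string(), pos);
}

// Hypothetical usage check on a standard-conforming library.
void demo_find_empty()
{
    assert(find_empty_from("", 0) == 0);                     // pos == size()
    assert(find_empty_from("abc", 3) == 3);                  // pos == size()
    assert(find_empty_from("abc", 4) == std::string::npos);  // pos > size()
}
```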

Does strcmp in C++ check every value in a string if the second parameter is "0"?

If my string input is 1234567890, and I do the following:
(strcmp(input,"0"))
Will that return 1 if there is a 0 in my character array of 1234567890 and 0 if there isn't?
I know I can test this, and I did, and the answer is yes, but I'm not sure why and I can't find absolute specifics on strcmp.
No. strcmp returns 0 if the two strings are the same, non 0 otherwise.
Looks like you didn't even bother to google!
No, it compares two strings.
strcmp() returns 0 only if both strings are the same. Otherwise, the return value says something about the first non-matching character.
In your case, this has to do with the comparison between '1' and your '0'. It makes no difference that the other string has a '0' at the end.
strcmp() will typically check all characters from the first one until the last one or until there's a mismatch in the two strings.
The exact internal implementation of strcmp(), if you're asking about that, is not specified in the language standard. In theory, it could find lengths of the two strings and if they are equal, compare the strings using units bigger than char and even do that backwards.
strcmp() compares strings, not searches for one in the other. It returns 0 if the strings are identical. Otherwise it returns either a positive or a negative value, representing the sign of the difference between the first mismatching characters (the characters' values being treated as unsigned).
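To make the sign rule concrete, here is a small sketch that reduces the result of strcmp to -1, 0 or +1. The names compare_sign and demo_sign are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Only the sign of strcmp's result is specified, so normalize it.
int compare_sign(const char* a, const char* b)
{
    const int r = std::strcmp(a, b);
    return (r > 0) - (r < 0);
}

// Hypothetical usage check with the input from the question.
void demo_sign()
{
    // First mismatch is '1' versus '0'; the trailing '0' plays no role.
    assert(compare_sign("1234567890", "0") == 1);
    assert(compare_sign("0", "0") == 0);
    assert(compare_sign("0", "1234567890") == -1);
}
```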
Does strcmp in C++ check every value in a string if the second parameter is "0"
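No, it does not. strcmp() is a string comparison function that checks how one string orders relative to another. It returns 0 if the strings are equal, a positive value if the first string is ordinally greater than the second, and a negative value otherwise. The exact nonzero values are not specified, so do not rely on 1 and -1.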
To check if it does exist, I suggest you write your own function for this.
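// Return 1 if the character exists in the string, 0 otherwise.
int DoesCharExist(const char *pData, char character)
{
    const char *data = pData;   // keep const: the string is only read
    while (*data) {             // stop at the terminating '\0'
        if (*data == character) return 1;
        ++data;                 // advance only after checking the current char
    }
    return 0;
}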
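The standard library already provides this kind of search as std::strchr, so a hand-written loop is optional. A minimal sketch, with has_char and demo_has_char as hypothetical names:

```cpp
#include <cassert>
#include <cstring>

// True if c occurs in the null-terminated string text.
// Note: std::strchr also matches the terminator, so has_char(s, '\0') is true.
bool has_char(const char* text, char c)
{
    return std::strchr(text, c) != nullptr;
}

// Hypothetical usage check with the input from the question.
void demo_has_char()
{
    assert(has_char("1234567890", '0'));
    assert(!has_char("123456789", '0'));
}
```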

Return substring starting from position past the end of string (C++)

The first parameter of C++ STL function substr(pos,n), pos, is said to have this behaviour.
"If the position passed is past the end of the string, an out_of_range exception is thrown."
However, if I do something like
string s("ab");
cout<<s.substr(2,666)<<endl;
then no exception is thrown even though the pos=2 is by definition past the end of the string.
The string::end defines the position "after the last character in the string" as "past the end of the string."
I noticed the returned character is always the '\0'. My question is if this is standard behaviour and if I can count on the fact that an empty string is returned in this case. Thank you.
The actual requirement is (§21.4.7.8):
1 Requires: pos <= size()
2 Throws: out_of_range if pos > size().
In your case, pos == size(), so you should never see an exception, and should always get an empty string.
Since the character at the position passed as the first parameter is included in the result, position 2 should not be considered to be past the end of the string: it is at the end of the string. The length of the string is a legal argument to pass to substr.
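The boundary can be demonstrated with a small sketch; substr_throws and demo_substr_boundary are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Returns true if s.substr(pos) throws std::out_of_range.
bool substr_throws(const std::string& s, std::string::size_type pos)
{
    try {
        (void)s.substr(pos);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Hypothetical usage check with the string from the question.
void demo_substr_boundary()
{
    const std::string s("ab");
    assert(s.substr(2, 666).empty());   // pos == size(): empty string, no exception
    assert(!substr_throws(s, 2));
    assert(substr_throws(s, 3));        // pos > size(): out_of_range
}
```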

char[] vs LPCSTR strange behavior

Could you please explain why, in order to convert a char array like this:
char strarr[5] = {65,83,67,73,73}; //ASCII
Into LPCSTR to be accepted by GetModuleHandleA() and GetProcAddress(), I have to first append 0 to the end ?
i.e. I have:
char strarr[6] = {65,83,67,73,73,0};
And only then convert as (LPCSTR)&strarr.
For some reason I don't understand, the first one works only sometimes (i.e. if I do not add 0 at the end), while if I do add zero at the end, it works all the time. Why do I have to add zero?
Oh and a side question - why in C++ do I have to explicitly state the size of array in [], when I am initializing it with elements right away? (If I don't state the size, then it does not work)
Thanks.
Those functions expect null-terminated strings.
Since you only give them a pointer to a char array, they can't possibly know its size, hence the need for a particular value (the terminating null character) to indicate the end of the string.
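A portable sketch of the same idea, using std::strlen in place of the Windows functions (an assumption made so the example compiles anywhere). The names visible_length, terminated, literal and demo_terminator are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Length as seen by a function that receives only a char pointer.
// It scans until the first 0 byte, so the array must contain one.
std::size_t visible_length(const char* s)
{
    return std::strlen(s);
}

const char terminated[] = {65, 83, 67, 73, 73, 0};  // "ASCII" plus explicit terminator
const char literal[]    = "ASCII";                  // a string literal adds the 0 itself

// Hypothetical usage check.
void demo_terminator()
{
    assert(visible_length(terminated) == 5);   // the terminator is not counted
    assert(sizeof(terminated) == 6);           // but it occupies one array element
    assert(std::strcmp(terminated, literal) == 0);
}
```

Without the trailing 0, the scan would run past the end of the array until it happened to hit a zero byte, which would explain why the unterminated version appears to work only sometimes.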