C++ compare two string literals - c++

When comparing a string literal with another string literal with the == operator (or !=), is the result well defined?
For example, are the following guaranteed to hold?
assert("a" == "a");
assert("a" != "b");
Please don't say stuff like "use std::string" instead. I just want to know this specific case.

"a" == "a"
This expression may yield true or false; there are no guarantees. The two "a" string literals may occupy the same storage or they may exist at two different locations in memory.
I think that the closest language in the C++ Standard is: "Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation defined" (C++11 §2.14.5/12). There are no other requirements or restrictions, so the result is left unspecified.
"a" != "b"
This expression must yield true because there is no way that these two string literals can occupy the same location in memory: "a"[0] != "b"[0].
When you compare string literals in this way, you are really comparing the pointers to the initial elements in the arrays.
Because we are comparing pointers, the relational comparisons (<, >, <=, and >=) are even more problematic than the equality comparisons (== and !=) because only a restricted set of pointer comparisons may be performed using the relational comparisons. Two pointers may only be relationally compared if they are both pointers into the same array or pointers into the same object.
If the two "a" string literals occupy the same location in memory, then "a" < "a" would be well-defined and would yield false, because both pointers point to the initial element ('a') of the same array.
However, if the two "a" string literals occupy different locations in memory, the result of "a" < "a" is unspecified, because the two pointers being compared point into entirely unrelated objects.
Because "a" and "b" can never occupy the same location in memory, the result of "a" < "b" is always unspecified. The same is true for the other relational comparison operators.
If you did, for some reason, want to relationally compare two string literals and have well-defined results, you can use the std::less comparer, which provides a total ordering over all pointers. There are also std::greater, std::greater_equal, and std::less_equal comparers. Given that string literals with the same contents may not compare equal, I don't know why one would ever want to do this, but you can.
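As a rough illustration of the std::less suggestion above, here is a minimal sketch. The helper name address_before is made up for this example. It orders two pointers by address only, so it says nothing about the characters they point to.

```cpp
#include <functional>

// Hypothetical helper (not from the original answer): orders two C-string
// pointers by address. std::less is specified to yield a total order over
// pointers even where the built-in < gives an unspecified result.
inline bool address_before(const char* p, const char* q) {
    return std::less<const char*>()(p, q);
}
```

For two pointers that compare unequal, exactly one of address_before(p, q) and address_before(q, p) is true, but which one depends on where the compiler placed the literals.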

The idea is that in C++ string literals are arrays. Since arrays do not have comparison operators defined for them, they are compared using the next best fit - the pointer comparison operator, as arrays will implicitly decay to pointers, so any comparison compares address and not content. Since "a" and "b" cannot be at the same memory location, "a" != "b" is a true assertion. It also forms a valid static assertion. No such guarantee can be made about "a" == "a", though GCC with -fmerge-constants (implied at -O1) can make a reasonably strong probability and -fmerge-all-constants can give you a guarantee (that potentially results in non-conforming behavior).
If you happen to want a content-based comparison, you can always use assert(!strcmp("a", "a")). Or, you can use some sort of constexpr based strcmp for a static assertion:
constexpr bool static_strequal_helper(const char * a, const char * b, unsigned len) {
return (len == 0) ? true : ((*a == *b) ? static_strequal_helper(a + 1, b + 1, len - 1) : false);
}
template <unsigned N1, unsigned N2>
constexpr bool static_strequal(const char (&str1)[N1], const char (&str2)[N2]) {
return (N1 == N2) ? static_strequal_helper(&(str1[0]), &(str2[0]), N1) : false;
}
static_assert(static_strequal("asdf", "asdf"), "no error - strings are equal");
static_assert(static_strequal("asdf", "jkl;"), "strings are not equal");
assert(!strcmp("asdf", "jkl;")); //no compile error - runtime error
//cannot use strcmp in static assert as strcmp is not constexpr...
Then, compile with g++ -std=c++0x (or -std=c++11 for gcc >= 4.7), and...
error: static assertion failed: "strings are not equal"
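If a C++14 or later compiler is available, the recursive helper above is not strictly needed, because constexpr functions may then contain loops. The following is a sketch under that assumption; the name literal_equal is invented for this example and is not part of the original answer.

```cpp
#include <cstddef>

// Sketch only: a loop-based variant of static_strequal, assuming C++14 or
// later. Both arguments must be character arrays such as string literals.
template <std::size_t N1, std::size_t N2>
constexpr bool literal_equal(const char (&a)[N1], const char (&b)[N2]) {
    if (N1 != N2) {
        return false;               // different lengths can never match
    }
    for (std::size_t i = 0; i < N1; ++i) {
        if (a[i] != b[i]) {
            return false;           // first differing character
        }
    }
    return true;
}

static_assert(literal_equal("asdf", "asdf"), "same contents");
static_assert(!literal_equal("asdf", "jkl;"), "different contents");
static_assert(!literal_equal("asdf", "asd"), "different lengths");
```

Because the array sizes include the terminating null character, the length check also covers the terminator.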

Related

String comparison of char* to uint8_t [duplicate]

I'm new to C and C++, and can't seem to work out how I need to compare these values:
Variable I'm being passed:
typedef struct {
uint8_t ssid[33];
String I want to match. I've tried both of these:
uint8_t AP_Match = "MatchString";
unsigned char* AP_Match = "MatchString";
How I've attempted to match:
if (strncmp(list[i].ssid, "MatchString")) {
if (list[i].ssid == AP_Match) {
if (list[i].ssid == "MatchString") {
// This one fails because String is undeclared, despite having
// an include line for string.h
if (String(reinterpret_cast<const char*>(conf.sta.ssid)) == 'MatchString') {
I've noodled around with this a few different ways, and done some searching. I know one or both of these may be the wrong type, but I'm not sure how to get from where I am to working code.
There is no such type as "String" defined by any C standard. A string is just an array of characters that are stored as unsigned values based on the chosen encoding. 'string.h' provides various functions for comparison, concatenation, etc. but it can only work if the values you are passing to it are coherent.
The operator "==" is also undefined for string comparisons, because it would require comparing each character at each index, for two arrays that may not be the same size and ultimately may use different encodings, despite the same underlying unsigned integer representation (raising the prospect of false positive comparisons). You can possibly define your own function to do it (note C doesn't allow overloading operators), but otherwise you're stuck with what the standard libraries provide.
Note that strncmp() takes a size parameter for the number of characters to compare (your code is missing this). https://www.tutorialspoint.com/c_standard_library/c_function_strncmp.htm
Otherwise you would be looking at the function strcmp(), which requires the input strings to be null-terminated (last character equal to '\0'). Ultimately it's up to you to consider what the possible combinations of inputs could be and how they are stored and to use a comparison function that is robust to all possibilities.
As a final side note
if (list[i].ssid == "MatchString") {
Since ssid is an array, you should know that when you do this comparison, you are not actually accessing the contents of ssid, but rather the address of the first element of ssid. When you pass list[i].ssid into strcmp (or strncmp), you are passing a pointer to the first element of the array in memory. The function then iterates over the entire array until it reaches the null character (in the case of strcmp) or until it has compared the specified number of elements (in the case of strncmp).
To match two strings use strcmp:
if (0==strcmp(str1, str2))
str1 and str2 are addresses to memory holding a null terminated string. Return value zero means the strings are equal.
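In your case, one of the following. Because ssid is an array of uint8_t, it needs a cast to const char * before it can be passed to strcmp, and AP_Match should be declared as const char *:
if (0==strcmp((const char *)list[i].ssid, AP_Match))
if (0==strcmp((const char *)list[i].ssid, "MatchString"))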
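To make the advice in these answers concrete, here is a small sketch. The struct in the question is only partly shown, so ApRecord is a hypothetical stand-in that assumes nothing beyond the ssid field, and ssid_matches is a made-up helper name.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the struct in the question; only the ssid field
// is known from the original post.
struct ApRecord {
    std::uint8_t ssid[33];
};

// Returns true when the null-terminated text stored in ssid equals wanted.
// strncmp is bounded by the buffer size, so it does not read past the end of
// ssid even if the terminator is missing.
inline bool ssid_matches(const ApRecord& ap, const char* wanted) {
    const char* text = reinterpret_cast<const char*>(ap.ssid);
    return std::strncmp(text, wanted, sizeof ap.ssid) == 0;
}
```

The cast is needed because strncmp takes const char *, while ssid holds uint8_t values. Comparing with == would only compare addresses, as described above.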

Unexpected results when comparing strings

I am comparing these two strings: "code" and "test"
When I type this in Visual Studio:
cout<<("t"<"c")<<endl;
cout<<("c"<"t")<<endl;
cout<<("code"<"test")<<endl;
cout<<("test"<"cose")<<endl;
The result is:
1
0
1
1
This does not make sense. When I tried it on ideone.com, the result became:
0
1
1
1
What is going wrong here?
You're comparing pointer values, not strings (note: "cose" is a different literal than "code", ¹guaranteed giving a different pointer).
Use std::string from the <string> header to get meaningful string operations.
Then you can also use literals like "code"s.
#include <iostream>
#include <string>
using namespace std;
auto main() -> int
{
cout << boolalpha;
cout << ("t"s < "c"s) << endl;
cout << ("c"s < "t"s) << endl;
cout << ("code"s < "test"s) << endl;
cout << ("test"s < "cose"s) << endl;
}
Formally the code in the question,
cout<<("t"<"c")<<endl;
cout<<("c"<"t")<<endl;
cout<<("code"<"test")<<endl;
cout<<("test"<"cose")<<endl;
… has an unspecified result, because
C++11 §5.9/2 2nd dash (expr.rel):
"If two pointers p and q of the same type point to different objects that are not members of the same object or elements of the same array or to different functions, or if only one of them is null, the results of p<q, p>q, p<=q, and p>=q are unspecified."
You can however compare such pointers in a well-defined way via std::less and family, because
C++11 20.8.5/8 (comparisons):
"For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not."
But on the third and gripping hand, while the pointer comparisons can be useful in some situations, you probably wanted to compare the string literals. The standard library offers e.g. strcmp in order to do that. But preferably use std::string, as noted at the start.
The literal "code" denotes an immutable null-terminated string of char values. With the final null-byte it's a total of five char values. Hence the type is char const[5].
As an expression used in a context where a pointer is expected, the expression denoting this array (namely, the "code" literal) decays to a pointer to the first item, a char const* pointer.
This is the usual decay of array expression to pointer, but in C++03 and earlier there was also a special rule for literals that allowed a decay to just char* (no const).
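The type and decay claims above can be checked by the compiler. This is only a sketch; first_char_of_code is a made-up helper name used for illustration.

```cpp
#include <type_traits>

// The literal itself is an lvalue of type "array of 5 const char".
static_assert(std::is_same<decltype("code"), const char (&)[5]>::value,
              "literal is a const char array of 5 elements");
static_assert(sizeof("code") == 5, "four letters plus the terminator");

// Initializing a pointer from the literal triggers array-to-pointer decay.
inline const char* first_char_of_code() {
    const char* p = "code";   // p points at the 'c'
    return p;
}
```

The pointer returned here is what a comparison such as "code" < "test" actually compares.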
Notes:
¹ Two identical string literals can give different pointers, or the same pointer, depending on the compiler and options used.
String literals, like e.g. "t" are actually constant arrays of characters (including terminator).
When you use a string literal then what you get is a pointer to its first character.
So when you do "t" < "c" you are comparing two unrelated pointers. Whether "t" < "c" is true or not depends on where the compiler has decided to put the string literal arrays.
If you want to compare strings, you either should use std::string, or the old C-function strcmp.
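For the strcmp route mentioned above, a minimal sketch follows. The helper name content_less is an assumption made for this example.

```cpp
#include <cstring>

// Content-based "less than" for null-terminated C strings.
// strcmp returns a negative value when lhs sorts before rhs.
inline bool content_less(const char* lhs, const char* rhs) {
    return std::strcmp(lhs, rhs) < 0;
}
```

With this helper, content_less("code", "test") depends only on the characters and not on where the literals are stored.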

Using strcmp on a vector

I have a vector of strings and I want to compare the first element of the vector with a bunch of different "strings".
Here is what I wanted to do:
if (strcmp(myString[0], 'a') == 0)
but strcmp doesn't work. I basically want to check the contents of myString[0] against a bunch of different characters to see if there is a match. So it would be something like
if (strcmp(myString[0], 'a') == 0){
}
else if (strcmp(myString[0], 'ah') == 0){
}
else if (strcmp(myString[0], 'xyz') == 0)
etc..
What can I use to do this comparison? The compiler complains that "no suitable conversion from std::string to const char* exists", so I know it doesn't like that I'm doing a string to char comparison, but I can't figure out how to do this correctly.
std::string overloads operator== to do a string comparison, so the equivalent to
if (strcmp(cString, "other string") == 0)
is
if (cppString == "other string")
So your code becomes (for example)
else if (myString[0] == "ah")
'a' is not a string, it is a character constant. You need to use "a", "ah", "xyz", etc.
Also, if you want to use strcmp, you need to use:
if (strcmp(myString[0].c_str(), "a") == 0)
You can also use the overloaded operator== to compare a std::string with a char const*.
if (myString[0] == "a")
You have marked this post as C++, and you want to:
"compare the first element of the vector with a bunch of different 'strings'."
If I am reading your post correctly, the first element of the vector is a std::string.
std::string has a function and an operator to use for string-to-string comparison.
The function is used like:
if (0 == pfnA.compare(pfnB))
As described in cppreference.com:
The return value from std::string.compare(std::string) is
negative value if *this appears before the character sequence specified by the arguments, in lexicographical order
positive value if *this appears after the character sequence specified by the arguments, in lexicographical order
zero if both character sequences compare equivalent
The operator==() as already described, returns true when the two strings are the same.
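Putting the two approaches together, a sketch of the check from the question might look like this. The function name match_first and the integer return codes are invented for illustration. The candidate strings are the ones from the question.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: reports which candidate equals the first element of
// the vector (0 for "a", 1 for "ah", 2 for "xyz"), or -1 if none matches or
// the vector is empty.
inline int match_first(const std::vector<std::string>& words) {
    if (words.empty()) {
        return -1;
    }
    const std::string& first = words[0];
    if (first == "a") {
        return 0;                       // operator== compares contents
    }
    if (first.compare("ah") == 0) {
        return 1;                       // compare() returns 0 on equality
    }
    if (first == "xyz") {
        return 2;
    }
    return -1;
}
```

Both operator== and compare() accept a string literal directly, so no call to c_str() is needed.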

Str[i] auto boolean check in the for loop

printArrayWithoutLength(char str [])
{
for(int i=0;str[i];i++)
cout<< str[i]<< endl;
}
Why does the above work? I am not using a boolean check on the length.
In C, any condition that isn't a direct boolean expression (that is, some other type than boolean and doesn't involve a comparison operator [>, <, ==, !=, etc]) will automatically compare as a not equal to zero, so you could rewrite your code as:
for(int i=0;str[i] != 0;i++)
or
for(int i=0;str[i] != '\0';i++)
or
for(int i=0; 0 != str[i]; i++)
with exactly the same result and exactly the same code being generated. Just a bit more or less typing, and depending on the familiarity with C or C++, you may find that it's more or less easy to read one over another.
Of course, this only works for traditional C-style strings that are terminated with a nul-character (character with the value zero). There are other ways to store strings, and this code, in whichever form would naturally not work if the string is not actually terminated with a zero character.
Writing str[i] as a conditional expression of for is equivalent to str[i] != '\0' (your string should be null terminated). When str[i] becomes \0, loop will terminate.
Whether this function (without return type)
printArrayWithoutLength(char str [])
{
for(int i=0;str[i];i++)
cout<< str[i]<< endl;
}
will or will not work depends on what the array used as the argument contains. If it contains a string that is a sequence of characters terminated by zero then this function will work because inside the loop there is condition
str[i]
that will be equal to false if the value of str[i] contains zero for some i.
According to the C++ Standard (4.12 Boolean conversions [conv.bool])
1 A prvalue of arithmetic, unscoped enumeration, pointer, or pointer
to member type can be converted to a prvalue of type bool. A zero
value, null pointer value, or null member pointer value is converted
to false; any other value is converted to true.
Otherwise if the array does not contain a string the behaviour of the function is undefined.
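The same implicit conversion can be seen in a small sketch that counts characters instead of printing them. The name count_chars is made up for this example.

```cpp
#include <cstddef>

// Counts characters up to (not including) the terminating '\0'.
// The loop condition str[i] is converted to bool: zero becomes false and any
// other value becomes true, so it is the same test as str[i] != '\0'.
inline std::size_t count_chars(const char* str) {
    std::size_t i = 0;
    for (; str[i]; ++i) {
        // nothing to do; the loop only advances i to the terminator
    }
    return i;
}
```

As with the function in the question, this only behaves well when the argument really is a null-terminated string.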

Comparison of letters and strings in C++

Why does this code
if ("j" > "J")
return false, but this:
string a = "j";
string b = "J";
if (a > b)
return true? Which is the correct answer and how can I fix it?
That is happening because "j" and "J" are const char []. For example, "j" is an array of chars where
c[0]='j' and c[1]='\0'. In C and C++ you can't compare the contents of two arrays with >; the arrays decay to pointers and the pointers are compared.
It is better to use
strcmp("j","J");
which is in <cstring>.
When you type
string a="j"
you run the constructor of class string. Class string has an overloaded operator< to compare two strings.
You can use single quotes to compare symbols: if ('j' > 'J')
This is because "j" and "J" are string literals, which are compared as const char pointers. The result of the comparison is therefore arbitrary, because the placement of literals in memory is implementation defined.
On the other hand, when you make std::string objects from these string literals, the < operator (along with other comparison operators) is routed to an overload provided by the std::string class. This overload does a lexicographic comparison of the strings, as opposed to comparing pointer values, so the results of comparison look correct.
You can try
if ("J" < "j")
and you may get a different result.
In fact "J" and "j" are constant C strings and may be placed in the .data or .text sections, depending on the output binary file format. So when you compare them, you are comparing their addresses in memory.
But std::string is a C++ class which overloads the > operator, so it's not address/pointer comparing but content comparing.
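The two content-based options mentioned in these answers can be sketched as follows. The helper names are made up, and the expected ordering of 'j' and 'J' assumes an ASCII-compatible character set, where lowercase letters have larger codes than uppercase ones.

```cpp
#include <string>

// 'j' and 'J' are single char values, so > compares their character codes.
inline bool char_greater(char lhs, char rhs) {
    return lhs > rhs;
}

// std::string's operator> compares contents lexicographically, not addresses.
inline bool string_greater(const std::string& lhs, const std::string& rhs) {
    return lhs > rhs;
}
```

Under that assumption both char_greater('j', 'J') and string_greater("j", "J") are true, which matches the std::string result in the question.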