How can char* be a condition in for loop? - c++

In a book I am reading there is a piece of code :
string x;
size_t h=0;
for(const char* s=x.c_str();*s;++s)
h=(h*17)^*s;
Regarding this code, I have two questions:
how can *s be a condition? what does it mean?
what does "h=(h*17)^*s" mean?
Thanks for help!

how can *s be a condition? what does it mean?
It means "while the value pointed to by s is not zero." C strings are null-terminated, so the last character in the string returned by c_str() will be the null character (\0, represented by all bits zero).
what does "h=(h*17)^*s" mean?
It multiplies h by 17 then xors it with the value pointed to by s.

In C (or C++) any value can be used as a "boolean". A numeric value of 0, or a NULL pointer, means "false". Anything else means "true".
Here, *s is "the character value currently pointed to by s". The loop stops if that character is a 0 (not the "0" digit, with ASCII encoding 48, but the byte with ASCII encoding 0). This is conventionally the "end-of-string" marker, so the loop stops when it reaches the end of the string.
"^" is the bitwise XOR operator. The left "*" is a plain multiplication, while the other "*" is the pointer dereference operator (i.e. the thing which takes the pointer s and looks at the value to which this pointer points). "=" is assignment. In brief, the value of h is multiplied by 17, then XORed with the character pointed to by s, and the result becomes the new value of h.

*s detects the string termination character '\0'
(h*17)^*s is what it says: h multiplied by 17 and XORed with the character pointed to by s. It looks like a simple hashing function.
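To make the points above concrete, here is a minimal sketch that wraps the loop from the question in a function and puts it next to a version with the comparison written out. The function names hash_terse and hash_explicit are made up for this illustration; they are not from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Terse form from the question: the loop runs while *s is non-zero,
// i.e. until s reaches the terminating '\0' of the C string.
std::size_t hash_terse(const std::string& x) {
    std::size_t h = 0;
    for (const char* s = x.c_str(); *s; ++s)
        h = (h * 17) ^ *s;   // multiply, then XOR with the current character
    return h;
}

// The same loop with the comparison against '\0' spelled out.
std::size_t hash_explicit(const std::string& x) {
    std::size_t h = 0;
    for (const char* s = x.c_str(); *s != '\0'; ++s)
        h = (h * 17) ^ *s;
    return h;
}
```

Both functions should return the same value for any input. For "ab" that is (97 * 17) ^ 98 = 1555, assuming ASCII. Note that a std::string containing an embedded '\0' would make either loop stop early.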

As other answers have explained, the basic answer is that any expression that evaluates to 0 gets interpreted as a 'false' condition in C or C++, and *s will evaluate to 0 when the s pointer reaches the null termination character of the string ('\0').
You could equivalently use the expression *s != 0, and some developers might argue that this is what should be used, giving the opinion that the 'fuller' expression is clearer. Whether or not you agree with that opinion, you need to be able to understand the use of the terse alternative, since it's very commonly used in C/C++ code. You'll come across these expressions a lot, even if you prefer to use the more explicit comparison.
The more rigorous explanation from the standard (for some reason I feel compelled to bring this into the discussion, even though it doesn't really change or clarify anything. In fact, it probably will muddle things unnecessarily for some people - if you don't care to get into this level of trivia, you'll miss absolutely nothing by clicking the back button right now...):
In C, the *s expression is in what the standard calls 'expression-2' of the for statement, and this particular for statement example is just taking advantage of the standard's definition of the for statement. The for statement is classified as an 'iteration statement', and among the semantics of any iteration statement are (6.8.5/4 "Iteration statements"):
An iteration statement causes a statement called the loop body to be executed repeatedly
until the controlling expression compares equal to 0.
Since the 'expression-2' part of the for statement is the controlling expression, this means that the for loop will execute repeatedly until *s compares equal to 0.
The C++ standard defines things a little differently (but with the same result). In C++, the for statement is defined in terms of the while statement, and the condition part of the while statement controls the iteration (6.5.1/1 "The while statement"):
until the value of the condition becomes false
Earlier in the C++ standard, the following describes how expressions are converted to bool (4.12 "boolean conversions"):
An rvalue of arithmetic, enumeration, pointer, or pointer to member type can be converted to an rvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true
Similar wording in the standard (in both languages) applies to the controlling expression/condition of all selection or iteration statements. All this language-lawyerese boils down to the fact that if an expression evaluates to 0 it's the same as evaluating to false (in the English sense of the word, since C doesn't have a built-in false keyword).
And that's the long, confusing explanation of the simple concept.
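A small sketch of the conversion rule quoted above, in case it helps. The helper name as_condition is hypothetical; all it does is put its argument into an if condition and report which branch was taken.

```cpp
#include <cassert>

// Uses the argument as a condition, exactly as an if/for/while would.
// Works for arithmetic and pointer types via the conversion to bool.
template <class T>
bool as_condition(T value) {
    if (value)
        return true;    // non-zero value or non-null pointer
    return false;       // zero value or null pointer
}
```

With this, as_condition(0) and as_condition('\0') are false, while as_condition(-1237830) and as_condition('0') are true (the digit '0' is not the null character).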

*s is the character that s currently points to, so it's a character. The for loop goes on until it becomes \0, meaning until the string ends.
h is assigned the value of h * 17 xored with the (ascii value of) character *s.
Here's a good tutorial about pointers.

1) *s in the condition checks whether *s!=NUL
2) h=(h*17)^*s implies multiply h by 17 and perform exclusive-OR operation with the value pointed to by s.

In C and C++, true and false are the same as non-zero, and zero. So code under if (1){ will always execute, as will code under if (-1237830){, but if (0){ is always false.
Likewise, if the value that the pointer points to is ever 0, the condition is treated as false, i.e. you will exit the loop.

Related

Character array initialization with the first element being null

I was recently faced with a line of code and four options:
char fullName[30] = {NULL};
A) First element is assigned a NULL character.
B) Every element of the array is assigned 0 ( Zeroes )
C) Every element of the array is assigned NULL
D) The array is empty.
The answer we selected was option C, as, while the array is only initialized with a single NULL, C++ populates the rest of the array with NULL.
However, our professor disagreed, stating that the answer is A, he said:
So the very first element is NULL, and when you display it, it's displaying the first element, which is NULL.
The quote shows the question in its entirety; there was no other information provided. I'm curious to which one is correct, and if someone could explain why said answer would be correct.
The question is ill-defined, but Option B seems like the most correct answer.
The result depends on how exactly NULL is defined, which depends on the compiler (more precisely, on the standard library implementation). If it's defined as nullptr, the code will not compile. (I don't think any major implementation does that, but still.)
Assuming NULL is not defined as nullptr, then it must be defined as an integer literal with value 0 (which is 0, or 0L, or something similar), which makes your code equivalent to char fullName[30] = {0};.
This fills the array with zeroes, so Option B is the right answer.
In general, when you initialize an array with a brace-enclosed list, every element is initialized with something. If you provide fewer initializers than the number of elements, the remaining elements are zeroed.
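The rule about missing initializers can be checked directly. This is only a sketch, written with a literal 0 instead of NULL so that it compiles regardless of how NULL is defined; the function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if every one of the 30 elements is the null character.
bool all_elements_zero() {
    char fullName[30] = {0};   // one initializer, 29 elements left over
    for (std::size_t i = 0; i < sizeof fullName; ++i)
        if (fullName[i] != '\0')
            return false;
    return true;
}

// What "displaying" the array sees: an empty C string.
std::size_t visible_length() {
    char fullName[30] = {0};
    return std::strlen(fullName);
}
```

all_elements_zero() inspects the whole array rather than stopping at the first null character, which is the distinction between options A and B.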
Regarding the remaining options:
Option C is unclear, because if the code compiles, then NULL is equivalent to 0, so option C can be considered equivalent to Option B.
Option A can be valid depending on how you interpret it. If it means that the remaining elements are uninitialized, then it's wrong. If it doesn't specify what happens to the remaining elements, then it's a valid answer.
Option D is outright wrong, because arrays can't be "empty".
char fullName[30] = {NULL};
This is something that should never be written.
NULL is a macro that expands to a null pointer constant. A character - not a pointer - is being initialised here, so it makes no sense to use NULL.
It just so happens that some null pointer constants are also integer literals with value 0 (i.e. 0 or 0L for example), and if NULL expands to such literal, then the shown program is technically well-formed despite the abuse of NULL. What the macro expands to exactly is defined by the language implementation.
If NULL instead expands to a null pointer constant that is not an integer literal, such as nullptr (which is entirely possible), then the program is ill-formed.
NULL shouldn't be written in C++ at all, even to initialise pointers. It exists for backwards compatibility with C to make it easier to port C programs to C++.
Now, let us assume that NULL happens to expand to an integer literal on this particular implementation of C++.
Nothing in the example is assigned. Assignment is something that is done to a pre-existing object. Here, an array is being initialised.
The first element of the array is initialised with the zero literal. The rest of the elements are value initialised. Both result in the null character. As such, the entire array will be filled with null characters.
A simple and correct way to write the same is:
char fullName[30] = {};
B and C are equally close to being correct, except for wording regarding "assignment". They fail to mention value initialisation, but at least the outcome is the same. A is not wrong either, although it is not as complete because it fails to describe how the rest of the elements are initialised.
If "empty" is interpreted as "contains no elements", then D is incorrect because the array contains 30 elements. If it is interpreted as "contains the empty string", then D would be a correct answer.
You are almost correct.
The professor is incorrect. It is true that display finishes at the first NULL (when some approaches are used), but that says nothing about the values of the remainder of the array, which could be trivially examined regardless.
[dcl.init/17.5]: [..] the i-th array element is copy-initialized with x_i for each 1 ≤ i ≤ k, and value-initialized for each k < i ≤ n. [..]
However, none of the options is strictly correct and well-worded.
What happens is that NULL is used to initialise the first element, and the other elements are zero-initialised. The end result is effectively Option B.
Thing is, if NULL were defined as an expression of type std::nullptr_t on your platform (which it isn't, but it is permitted to be), the example won't even compile!
NULL is a pointer, not a number. Historically it has been possible to mix and match the two things to some degree, but C++ has tried to tighten that up in recent years, and you should avoid blurring the line.
A better approach is:
char fullName[30] = {};
And the best approach is:
std::string fullName;
Apparently, your professor is right. Let's see how:
char someName[6] = "SAAD";
how the string name is represented in memory:
0 1 2 3 4  5
S A A D \0
Array-based C string
The individual characters that make up the string are stored in the elements of the array. The string is terminated by a null character. Array elements after the null character are not part of the string, and their contents are irrelevant.
A "null string" is a string with a null character as its first character:
0 1 2 3 4 5
\0
Null C string
The length of a null string is 0.
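For what it's worth, the layout described here can be inspected with a few lines of code. This is a sketch only; the struct Layout and the function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Layout {
    std::size_t length;   // what strlen reports
    bool tail_is_zero;    // whether elements 4 and 5 are both '\0'
};

Layout inspect_saad() {
    char someName[6] = "SAAD";
    Layout r;
    r.length = std::strlen(someName);   // counts up to the first '\0'
    r.tail_is_zero = (someName[4] == '\0') && (someName[5] == '\0');
    return r;
}

std::size_t null_string_length() {
    char empty[6] = "";                 // first character is '\0'
    return std::strlen(empty);
}
```

strlen reports 4 for "SAAD" and 0 for the null string. The elements after the terminator are not part of the string, but when the array is initialised from a shorter literal like this they are still zero-filled, so they can be examined just like any other element.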

In c++ what does if(a=b) mean? versus if(a==b)

Probably been answered, but could not find it.
In c++ what does
if(a=b)
mean?
versus
if(a==b)
I just spent two hours debugging to find that
if(a=b)
compiles as
a=b
Why does compiler not flag
if(a=b)
as an error?
In c++ what does if(a=b) mean?
a=b is an assignment expression. If the type of a is primitive, or if the assignment operator is generated by the compiler, then the effect of such an assignment is that the value of a is modified to match b. The result of the assignment is an lvalue referring to a.
If the operator is user defined, then it can technically have any behaviour, but it is conventional to conform to the expectations by doing similar modification and return of the left operand.
The returned value is converted to bool which affects whether the following statement is executed.
versus
if(a==b)
a==b is an equality comparison expression. Nothing is assigned. If the types are primitive, or if the comparison operator is generated by the compiler, then the result will be true when the operands are equal and otherwise false.
If the operator is user defined, then it can technically have any behaviour, but it is conventional to conform to the expectations by doing similar equality comparison.
Why does compiler not flag
if(a=b)
as an error?
Because it is a well-formed expression (fragment) as long as a is assignable with b.
if(a=b) is a conventional pattern to express the operation of setting value of a variable, and having conditional behaviour depending on the new value.
Some compilers do optionally "flag" it with a warning.
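A short sketch of the difference, assuming plain int operands. The function names are hypothetical. The doubled parentheses around the assignment are a common way to tell GCC and Clang that the assignment is intentional, which silences their -Wparentheses warning.

```cpp
#include <cassert>

// if (a = b): stores b into a, then tests the new value of a.
bool assign_then_test(int& a, int b) {
    if ((a = b))          // assignment; true when the assigned value is non-zero
        return true;
    return false;
}

// if (a == b): compares only, a is left untouched.
bool compare_only(const int& a, int b) {
    if (a == b)
        return true;
    return false;
}
```

Starting from a = 1, assign_then_test(a, 2) returns true and leaves a equal to 2, while compare_only(a, 2) on a fresh a = 1 returns false and leaves a unchanged. Assigning 0 makes the condition false.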
Note that if you first declare int a = 1 and then write this if statement:
if (a = 2)
{
std::cout << "Hello World!" << std::endl;
}
the body still runs and prints, even though 1 and 2 are different values.
However, if you use a double equals sign ==, it will not.
The reason is that with == you are asking whether a is equal to 2. It is not, so nothing is printed. With a single equals sign you are changing the value of a to 2, and since 2 is non-zero the condition is true and the body of the if statement runs.
To see this, remove the int a = 1 before the if statement and declare a inside the condition instead, as in if (int a = 2). It still works.

Using NULL with C's char* strings [duplicate]

As we all know, strings in C are null-terminated. Does that mean that according to the standard it is legal to use the NULL constant as the terminator? Or is the similarity of the name of NULL pointer and null-terminator for a string only a happy coincidence?
Consider the code:
char str1[] = "abc";
char str2[] = "abc";
str1[3] = NULL;
str2[3] = '\0';
Here, we change the terminator of str1 to NULL. Is this legal and well-formed C code and str1 adheres to C's definition of null-terminated string? Will it be the same in case of C++?
In practice, I have always used NULL instead of '\0' in my code for strings and everything worked - but is such practice 100% legal?
EDIT: I understand that it's very bad style and refrain from endorsing it and now understand the difference between 0, NULL and '\0' (as in a duplicate What is the difference between NULL, '\0' and 0). I'm still quite curious as for the legality of this code - and voices here seem to be mixed - and the duplicate does not give an authoritative answer to that in my opinion.
Does that mean that according to the standard it is legal to use the NULL constant as the terminator? (OP)
str1[3] = NULL;
Sometimes. Further: does it always properly cause a character array to form a string without concerns?
First, it looks wrong. Akin to int z = 0.0;. Yes it is legal well defined code, but unnecessarily draws attention to itself.
In practice, I have always used NULL instead of '\0' (OP)
I doubt you will find any modern style guide or group of coders endorsing that. NULL is best reserved for pointer contexts. [1]
These are 2 common and well understood alternatives.
str1[3] = '\0';
str1[3] = 0;
strings in C are null-terminated (OP)
The C spec consistently uses null character, not just null.
The macros are NULL which expands to an implementation-defined null pointer constant; and ... C11 §7.19 3
OK, now what is a null pointer constant?
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. §6.3.2.3 3
If the null pointer constant is a void* then we have something like
str1[3] = (void*) 0;
The above can warn about converting a pointer to a char. This is something best avoided.
Will it be the same in case of C++? (OP)
Yes, the above applies. (Aside: str1[3] = 0 may warn.) Further, NULL is less preferred than nullptr. So NULL is rarely the best to use in C++ in any context.
[1] Note: @Joshua reports a style that matches OP's in 1995 Turbo C 4.5
The bottom line is that in C/C++, NULL is for pointers and is not the same as the null character, despite the fact that both are defined as zero. You might use NULL as the null character and get away with it depending on the context and platform, but to be correct, use '\0'. This is described in both standards:
C specifies that NULL is a macro defined in <stddef.h> which "expands to an implementation-defined null pointer constant" (Section 7.17.3), which is itself defined as "an integer constant expression with the value 0, or such an expression cast to type void *" (Section 6.3.2.3.3).
The null character is defined in section 5.2.1.2: "A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string." That same section explains that \0 will be the representation of this null character.
C++ makes the same distinctions. From section 4.10.1 of the C++ standard: "A null pointer constant is an integer literal (2.13.2) with value zero or a prvalue of type std::nullptr_t." In section 2.3.3, it describes the "null character (respectively, null wide character), whose value is 0". Section C.5.2 further confirms that C++ respects NULL as a standard macro imported from the C Standard Library.
No, I don't think it's strictly legal.
NULL is specified to be either:
an integer constant expression with the value 0
an integer constant expression with the value 0 cast to the type void*
In an implementation that uses the first format, using it as the string terminator will work.
But in an implementation that uses the second format, it's not guaranteed to work. You're converting a pointer type to an integer type, and the result of this is implementation-dependent. It happens to do what you want in common implementations, but nothing requires it.
If you have the second type of implementation, you're likely to get a warning like:
warning: incompatible pointer to integer conversion assigning to 'char' from 'void *' [-Wint-conversion]
If you want to use a macro, you can define:
#define NUL '\0'
and then use NUL instead of NULL. This matches the official name of the ASCII null character.
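As a sketch of that suggestion in C++, the macro can be replaced by a typed constant. The constant NUL and the helper terminate_at below are illustrative names only, and the helper assumes the caller guarantees that index n lies inside the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// A character constant, so no pointer-to-integer conversion is involved.
const char NUL = '\0';

static_assert(NUL == 0, "the null character has the value 0");

// Cuts the C string s short at index n and returns its new length.
// Precondition (assumed): n is a valid index into the buffer behind s.
std::size_t terminate_at(char* s, std::size_t n) {
    s[n] = NUL;
    return std::strlen(s);
}
```

Writing s[n] = '\0' or s[n] = 0 directly has the same effect; the named constant only documents the intent.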

What does "compares less than 0" mean?

Context
While I was reading Consistent comparison, I have noticed a peculiar usage of the verb to compare:
There’s a new three-way comparison operator, <=>. The expression a <=> b
returns an object that compares <0 if a < b, compares >0 if a > b, and
compares ==0 if a and b are equal/equivalent.
Another example found on the internet (emphasis mine):
It returns a value that compares less than zero on failure. Otherwise,
the returned value can be used as the first argument on a later call
to get.
One last example, found on GitHub (emphasis mine):
// Perform a circular 16 bit compare.
// If the distance between the two numbers is larger than 32767,
// and the numbers are larger than 32768, subtract 65536
// Thus, 65535 compares less than 0, but greater than 65534
// This handles the 65535->0 wrap around case correctly
Of course, for experienced programmers the meaning is clear. But the way the verb to compare is used in these examples is not standard in any standardized forms of English.
Questions*
How does the programming jargon sentence "The object compares less than zero" translate into plain English?
Does it mean that if the object is compared with 0 the result will be "less than zero"?
Why would it be wrong to say "object is less than zero" instead of "object compares less than zero"?
* I asked for help on English Language Learners and English Language & Usage.
"compares <0" in plain English is "compares less than zero".
This is a common shorthand, I believe.
So to apply this onto the entire sentence gives:
The expression a <=> b returns an object that compares less than zero
if a is less than b, compares greater than zero if a is greater than
b, and compares equal to zero if a and b are equal/equivalent.
Which is quite a mouthful. I can see why the authors would choose to use symbols.
What I am interested in, more exactly, is an equivalent expression of "compares <0". Does "compares <0" mean "evaluates to a negative number"?
First, we need to understand the difference between what you quoted and actual wording for the standard. What you quoted was just an explanation for what would actually get put into the standard.
The standard wording in P0515 for the language feature operator<=> is that it returns one of 5 possible types. Those types are defined by the library wording in P0768.
Those types are not integers. Or even enumerations. They are class types. Which means they have exactly and only the operations that the library defines for them. And the library wording is very specific about them:
The comparison category types’ relational and equality friend functions are specified with an anonymous parameter of unspecified type. This type shall be selected by the implementation such that these parameters can accept literal 0
as a corresponding argument. [Example: nullptr_t satisfies this requirement. —
end example] In this context, the behaviour of a program that supplies an argument other than a literal 0 is undefined.
Therefore, Herb's text is translated directly into standard wording: it compares less than 0. No more, no less. Not "is a negative number"; it's a value type where the only thing you can do with it is comparing it to zero.
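To illustrate what "the only thing you can do with it is compare it to zero" can look like, here is a rough sketch of such a type. It is not the library's actual definition: comparison_result, unspecified and three_way are names invented for this example. It relies on the fact that a literal 0 converts to any pointer type, which is the same idea as the nullptr_t example in the quoted wording.

```cpp
#include <cassert>

class comparison_result {
    struct unspecified;   // hypothetical tag type, deliberately never defined
    int value_;
public:
    explicit constexpr comparison_result(int v) : value_(v) {}

    // The second parameter is a pointer to the private tag type, so a
    // literal 0 is accepted while an int such as 1 does not compile.
    friend constexpr bool operator<(comparison_result c, unspecified*)  { return c.value_ < 0; }
    friend constexpr bool operator>(comparison_result c, unspecified*)  { return c.value_ > 0; }
    friend constexpr bool operator==(comparison_result c, unspecified*) { return c.value_ == 0; }
};

// Stand-in for a <=> b on ints, for illustration only.
comparison_result three_way(int a, int b) {
    return comparison_result(a < b ? -1 : (a > b ? 1 : 0));
}
```

With this, three_way(1, 2) < 0 is true and three_way(2, 2) == 0 is true, yet the result is not a number: there is no way to print it, add to it, or compare it with anything other than a literal 0.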
It's important to note how Herb's descriptive text "compares less than 0" translates to the actual standard text. The standard text in P0515 makes it clear that the result of 1 <=> 2 is strong_ordering::less. And the standard text in P0768 tells us that strong_ordering::less < 0 is true.
But it also tells us that all other comparisons are the functional equivalent of the descriptive phrase "compares less than 0".
For example, if -1 "compares less than 0", then that would also imply that it does not compare equal to zero. And that it does not compare greater than 0. It also implies that 0 does not compare less than -1. And so on.
P0768 tells us that the relationship between strong_ordering::less and the literal 0 fits all of the implications of the words "compares less than 0".
"a compares less than zero" means that a < 0 is true.
"a compares == 0" means that a == 0 is true.
The other expressions I'm sure make sense now right?
Yes, an "object compares less than 0" means that object < 0 will yield true. Likewise, compares equal to 0 means object == 0 will yield true, and compares greater than 0 means object > 0 will yield true.
As to why he doesn't use the phrase "is less than 0", I'd guess it's to emphasize that this is all that's guaranteed. For example, this could be essentially any arbitrary type, including one that doesn't really represent an actual value, but instead only supports comparison with 0.
Just, for example, let's consider a type something like this:
class comparison_result {
    enum { LT, GT, EQ } res;

    // c < n: true when the stored result is "less than"
    template <class Integer>
    friend bool operator<(comparison_result c, Integer) { return c.res == LT; }

    // n < c: true when the stored result is "greater than"
    template <class Integer>
    friend bool operator<(Integer, comparison_result c) { return c.res == GT; }

    // and similarly for `>` and `==`
};
[The friend function templates are only meant to convey the basic idea; a real implementation would need more care.]
This doesn't really represent a value at all. It just represents the result of "if compared to 0, should the result be less than, equal to, or greater than". As such, it's not that it is less than 0, only that it produces true or false when compared to 0 (but produces the same results when compared to another value).
As to whether <0 being true means that >0 and ==0 must be false (and vice versa): there is no such restriction on the return type for the operator itself. The language doesn't even include a way to specify or enforce such a requirement. There's nothing in the spec to prevent them from all returning true. Returning true for all the comparisons is possible and seems to be allowed, but it's probably pretty far-fetched.
Returning false for all of them is entirely reasonable though--just, for example, any and all comparisons with floating point NaNs should normally return false. NaN means "Not a Number", and something that's not a number isn't less than, equal to or greater than a number. The two are incomparable, so in every case, the answer is (quite rightly) false.
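The NaN case is easy to try out. A minimal sketch, assuming IEEE-style floating point and no aggressive flags such as -ffast-math; the function name is made up.

```cpp
#include <cassert>
#include <limits>

// True if at least one of <, ==, > holds between x and y.
bool any_relation_holds(double x, double y) {
    return (x < y) || (x == y) || (x > y);
}
```

For ordinary numbers exactly one of the three comparisons is true, so the function returns true. When x is std::numeric_limits<double>::quiet_NaN(), all three are false.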
I think the other answers so far have answered mostly what the result of the operation is, and that should be clear by now. @VTT's answer explains it best, IMO.
However, so far none have answered the English language behind it.
"The object compares less than zero." is simply not standard English; at best it is jargon or slang, which makes it all the more confusing for non-native speakers.
An equivalent would be:
A comparison of the object using <0 (less than zero) always returns true.
That's quite lengthy, so I can understand why a "shortcut" was created:
The object compares less than zero.
It means that the expression will return an object that can be compared to <0 or >0 or ==0.
If a and b are integers, then the expression evaluates to a negative value (probably -1) if a is less than b.
The expression evaluates to 0 if a==b
And the expression will evaluate to a positive value (probably 1) if a is greater than b.

What does an exclamation mark in array index do?

While perusing through my organization's source repository I came across this little gem:
RawParameterStorage[!ParameterWorkingIdx][ParameterDataOffset] = ...
Is this valid code? (It compiles) What does the exclamation mark here do?
An invert ~ operator might make sense, since it's commonly confused with the not ! operator in boolean expressions. However, it doesn't seem to make logical sense to impose the not ! operator on an array index. Any Thoughts?
!ParameterWorkingIdx tests whether ParameterWorkingIdx is 0. If it is, !ParameterWorkingIdx evaluates to true, which may be implicitly converted to the index type (for example, 1 for an integer index, as in an array); otherwise, it evaluates to false.
If ParameterWorkingIdx == 0 then [!ParameterWorkingIdx] == [1].
If ParameterWorkingIdx != 0 then [!ParameterWorkingIdx] == [0].
It also depends on other stuff like:
The type of ParameterWorkingIdx.
overloading of ! operator by the type of ParameterWorkingIdx.
indexer overloading by the type of RawParameterStorage.
etc...
Taking a bit of a guess here, but that looks like a double-buffer pattern. ParameterWorkingIdx would flip-flop between 0 and 1 (probably with ParameterWorkingIdx = !ParameterWorkingIdx;).
Then, at any time, RawParameterStorage[ParameterWorkingIdx] would be the current buffer, and RawParameterStorage[!ParameterWorkingIdx] would be the previous buffer.
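If that guess is right, the pattern might look roughly like the sketch below. Everything here (the DoubleBuffer struct, its members, the buffer size of 4) is hypothetical and only meant to show how !index selects "the other" of two buffers; it is not the code from the repository in the question.

```cpp
#include <cassert>

struct DoubleBuffer {
    int storage[2][4];   // two buffers of four ints each
    int working;         // index of the current buffer: 0 or 1

    DoubleBuffer() : storage(), working(0) {}   // zero-fills both buffers

    int* current()  { return storage[working]; }
    int* previous() { return storage[!working]; }  // !0 -> 1, !1 -> 0
    void flip()     { working = !working; }        // swap roles
};

// The bare index trick on its own: 0 maps to 1, anything else maps to 0.
int other_index(int idx) {
    return !idx;
}
```

After writing into current() and calling flip(), the same data is reachable through previous(), and current() refers to the other buffer.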
it doesn't seem to make logical sense to impose the not ! operator on an array index
It might: all it does here is convert zero to one, and any other number to zero.
We can infer from this code that RawParameterStorage probably has two elements at the top level.
P. S. Here, I assume that RawParameterStorage is an array (as you say it is). Furthermore, I assume that ParameterWorkingIdx is an integer (as its name implies). If, for example, either is a class with overloaded operators, then the semantics could be completely different.
Is this valid code?
Yes it is. Suppose ParameterWorkingIdx is an int. When it is used as the operand of operator !, it is contextually converted to bool:
The value zero (for integral, floating-point, and unscoped enumeration) and the null pointer and the null pointer-to-member values become false. All other values become true.
The bool result is then converted to an integer by integral promotion so that it can be used as the array index:
the type bool can be converted to int with the value false becoming 0 and true becoming 1.
So !ParameterWorkingIdx is equivalent with ParameterWorkingIdx == 0 ? 1 : 0, which is much more clear IMO.