'a' == 'b'. Is it a good way to compare characters? - c++

What happens if I compare two characters in this way:
if ('a' == 'b')
doSomething();
I'm really curious to know what the language (and the compiler) does when it finds a comparison like this. And, of course, if it is a correct way to do something, or if I have to use something like strcmp().
EDIT
Wait wait.
Since some people haven't understood what I really mean, I'll explain it another way.
char x, y;
cout << "Put a character: ";
cin >> x;
cout << "Put another character: ";
cin >> y;
if (x == y)
doSomething();
Of course, in the if brackets you can replace == with any other comparison operator.
What I really want to know is: how are characters treated in C/C++? When the compiler compares two characters, how does it know that 'a' is different from 'b'? Does it refer to the ASCII table?

You can absolutely safely compare fundamental types with the comparison operator ==.

In C and C++, single character constants (and char variables) are integer values (in the mathematical sense, not in the sense of int values). The compiler compares them as integers when you use ==. You can also use the other integer comparison operators (<, <=, etc.) You can also add and subtract them. (For instance, a common idiom to change a digit character into its numerical value is c - '0'.)
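To make the "characters are integers" point concrete, here is a minimal sketch. The helper names digit_value and same_char are made up for illustration; they are not part of any library.

```cpp
#include <cassert>

// Hypothetical helper: converts a decimal digit character to its numeric value.
// The language guarantees that '0'..'9' have consecutive codes, so the
// subtraction works on any conforming implementation.
int digit_value(char c) {
    assert(c >= '0' && c <= '9');  // precondition: c must be a digit
    return c - '0';
}

// Hypothetical helper: characters compare as integers, so == is all that is needed.
bool same_char(char x, char y) {
    return x == y;
}
```

With these, digit_value('7') yields 7, and same_char('a', 'b') is false for the same reason 97 == 98 is false on an ASCII system.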

For single chars, this form is correct. If both operands are known at compile time as in your example, then the condition can (and almost certainly will) be evaluated at compile time and not result in any code.
Note that a char ('a') is different from a single-character string ("a"). For the latter, comparison has a different meaning: it would compare the pointers rather than the characters.
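A small sketch of the difference between comparing characters and comparing C strings. The function names same_text and same_address are hypothetical and only serve to contrast the two kinds of comparison.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: compares two C strings by content.
bool same_text(const char* lhs, const char* rhs) {
    assert(lhs != nullptr && rhs != nullptr);
    return std::strcmp(lhs, rhs) == 0;
}

// Hypothetical helper: this is what == on two char pointers actually does.
// It compares addresses, not the characters stored there.
bool same_address(const char* lhs, const char* rhs) {
    return lhs == rhs;
}
```

Given two separate arrays, char a[] = "a"; and char b[] = "a";, same_text(a, b) is true while same_address(a, b) is false, whereas a[0] == b[0] compares the characters themselves.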

Your processor subtracts one operand from the other; if the result is zero, the zero condition bit is set, which means the values were the same.
For example, on ARM machines the NZCV (negative, zero, carry, overflow) flags tell you what happened.

Nothing will happen, as 'a' doesn't equal 'b'.
If your question is just whether this is the correct way, then the answer is yes.

First 'a' and 'b' are not strings, they are characters. The nuance is important because of its implications.
You can compare characters to characters just fine, the same way you can compare integers to integers and floats to floats. Comparing two literals is usually not done because the outcome will always be the same, i.e. 'a' == 'b' will always be false.
If you're comparing strings, however, you'll have to use something like strcmp().

The compiler simply inserts an instruction for comparing two bytes for equality, which is a very efficient operation. Of course, in your case 'a'=='b' is equivalent to a constant false.

The compiler compares the numeric character codes (ASCII on most systems). So 'a' is never equal to 'b'. But 'a' < 'b' evaluates to true, since 'a' appears before 'b' in the ASCII table.

Of course, you want to use variables like
char myChr = 'a' ;
if( myChr == 'b' ) puts( "It's b" ) ;
Now you can start to think about "Yoda conditions", where you would write
if( 'b' == myChr ) puts( "It's a b" ) ;
so that if you accidentally typed a single equals sign in the second example:
if( 'b' = myChr ) puts( "It's a b" ) ;
it would raise a compiler error, because a character constant cannot be assigned to.

Related

Omit "> 0" in conditional?

I have recently inherited an old project to make some optimization and add new features. In this project I have seen this type of condition all over the code:
if (int_variable)
instead of
if (int_variable > 0)
I have used the first option only with boolean type of variables.
Do you think the first option is a "correct" way to check if a number is positive?
Negative numbers also evaluate to true, so you should use if (int_variable > 0) to check whether a number is positive.
No. The first option is a correct way to check if a number is non-zero.
Any non-zero value will be considered true, so your version with > is a bit sketchy if the variable truly is int, i.e. signed. For clarity I prefer to make that explicit in many cases, i.e. write
if (int_variable != 0)
This is perhaps a bit convoluted (basically computing a Boolean where one can be automatically inferred). I would never write if (bool_variable == true), for instance, but I think the explicit comparison makes the integer test clearer. Reading the two cases out loud works much better for the integer.
if (int_variable) has the same meaning as if (int_variable != 0).
This is not the same as if (int_variable > 0) unless you otherwise know that the value will never be negative.
When using an integer value as a truth value in C, zero yields false while any non-zero value yields true.
In your case, assuming int_variable is signed, (int_variable) is not equivalent to (int_variable > 0) since int_variable being negative will yield true and false, respectively.
So
if (int_variable) { … }
is not a correct way of checking whether int_variable is positive.
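The difference between the two tests can be sketched as follows. The function names are invented for this illustration.

```cpp
#include <cassert>

// Mirrors `if (v)`: true for every non-zero value, negative ones included.
bool is_nonzero(int v) {
    if (v) {
        return true;
    }
    return false;
}

// Mirrors `if (v > 0)`: true only for strictly positive values.
bool is_positive(int v) {
    return v > 0;
}

// Example usage, written as checks: the two tests disagree for negative input.
void run_examples() {
    assert(is_nonzero(-5) && !is_positive(-5));
    assert(is_nonzero(3) && is_positive(3));
    assert(!is_nonzero(0) && !is_positive(0));
}
```

So the two conditions are interchangeable only when the variable is known never to be negative.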
In C any expression which evaluates to integer 0 or a null pointer is considered false. Anything else is true.
So basically something like
int i = ...;
if ( i )
tests whether i is non-zero. However, if i is negative, the condition will also be true, so this is no replacement for i > 0, unless i is unsigned, simply because unsigned values cannot be negative. But your compiler may warn about such constructs, as they often indicate a misconception.
As for coding style, if ( i ) is bad style if you are really comparing against the value 0. You'd better compare explicitly: if ( i != 0 ). However, if i holds a boolean result, you should use a "speaking" name and not compare (the example below uses the ctype.h function isalpha() instead of a variable). Compare the readability:
char ch = ...;
if ( isalpha(ch) )
if ( isalpha(ch) != 0 )
For the latter you have to think a second about the comparison - unnecessarily.

array of char is equal to int?

I am trying to figure out why an int value is assigned to an element of a char array; I am a little confused about when to use the cast operator.
I didn't understand what happens in the do statement. I hope somebody can explain.
char *readword()
{
    int c,i;
    char t[255];
    char *p;
    //jump over chars who aren't letters
    while ((c=getchar())<'A'|| (c>'Z' && c<'a') || c>'z')
        if (c==EOF) return 0;
    i=0;
    do {
        t[i++]=c;// shouldn't be like (char)c
    } while ((c=getchar())>='A' && c<='Z' || c>='a' && c<='z');
    //keep the word in heap memory
    if ( c==EOF)
        return 0;
    t[i++]='\0';
    if ((p=(char *)malloc(i))==0)
    {
        printf(" not enough memory\n");
        exit(1);
    }
    strcpy(p,t);
    return p;
}
The getchar() function returns an int, and it is important to use an int to capture its return value. This is because if getchar() fails, it returns (int)(EOF) (as per chux's comment). When it succeeds, it returns a value that is suitable for a char.
The question code is building a char string or array, one char at a time:
t[i++]=c;
The above line could be written:
t[i++]=(char)c;
Either works, because the compiler performs the conversion automatically in the first case.
The mixture of char and int is fairly simple: EOF is intended as a value that can be distinguished from any value you could have read from the file.
To support that, you need to initially read the data from the file into something larger than a char, so it can accommodate at least one value that couldn't possibly have come from the file. The type they chose for that purpose was int.
So, you read a character from the file, into an int. You compare that to EOF to see if it's really a character that came from the file or not. If (and only if) you verify that it really came from the file, you save the value into a char, because you now know that's what it really represents.
That said, I'd consider it pretty poor code as it stands right now. Just for one particularly obvious example, instead of the c<'A'|| (c>'Z' && c<'a') || c>'z' type of code, you almost certainly want to use isalpha(c).
It's also a lot easier to do this with scanf instead.
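As a rough sketch of the "read into an int, check for EOF, then store as char" pattern with isalpha, here is a C++ variant. It is not the question's code: it swaps getchar() for std::istream::get() (which also returns an int-like value), returns a std::string instead of malloc'ed memory, and returns a word even when the input ends right after it. The name read_word is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Reads the next run of letters from `in`. Returns an empty string if the
// stream ends before any letter is found.
std::string read_word(std::istream& in) {
    using traits = std::char_traits<char>;
    int c;  // int, not char, so that EOF stays distinguishable from real data

    // Skip everything that is not a letter.
    while ((c = in.get()) != traits::eof() && !std::isalpha(c)) {
    }

    std::string word;
    while (c != traits::eof() && std::isalpha(c)) {
        word.push_back(static_cast<char>(c));  // c is known to be a real character here
        c = in.get();
    }
    return word;
}

// Example usage, written as checks.
void run_examples() {
    std::istringstream in("  hello, world");
    assert(read_word(in) == "hello");
    assert(read_word(in) == "world");
    assert(read_word(in).empty());
}
```

The cast to char happens only after the EOF comparison, which is the ordering the answer above describes.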
You can assign any int value to a char. Only the lowest 8 bits will be used (assuming the usual 8-bit char). A cast would be more "correct" in terms of communicating your intent: people might not otherwise remember that anything larger than an 8-bit value gets truncated, and the results are likely to be unexpected.
Note that since you didn't say "unsigned char t[255]", on platforms where plain char is signed you actually get 7 value bits, and the most significant (8th) bit is interpreted as the sign. So for example if you were to say
char t = 0xFF;
then you would in fact get -1 assigned to t.
If you assign numbers > 0xFF then all bits above the 8th bit get stripped. So if you were to say:
char t = 0x101;
then in fact you'd get the value 1 assigned to t.
The code in question is correct because getchar() returns an int and EOF (typically -1) is an error value, so it's important to check for it. For non-error cases the return value will fit in an 8-bit char.
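The truncation can be demonstrated portably with unsigned char, where the result of the conversion is fully defined. This is a sketch under the assumption of 8-bit bytes; low_byte and plain_char_is_signed are names invented for the example.

```cpp
#include <cassert>
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical helper: keeps only the low 8 bits of v, which is what storing
// an int into an unsigned char does.
unsigned char low_byte(int v) {
    return static_cast<unsigned char>(v);
}

// Plain char may be signed or unsigned; this reports which one the current
// implementation uses.
constexpr bool plain_char_is_signed() {
    return CHAR_MIN < 0;
}

// Example usage, written as checks.
void run_examples() {
    assert(low_byte(0x101) == 1);    // bits above the 8th are stripped
    assert(low_byte(0xFF) == 0xFF);  // fits, so nothing is lost
    assert(low_byte(-1) == 0xFF);    // -1 wraps to the largest unsigned char
}
```

Whether char t = 0xFF; ends up as -1 depends on plain_char_is_signed() being true for your compiler, which is the common case on x86 but not guaranteed everywhere.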

Non-Integer numbers in an String and using atoi

If there are non-number characters in a string and you call atoi (I'm assuming wtoi will do the same), how will atoi treat the string?
Let's say, for example, I have the following strings:
"20234543"
"232B"
"B"
I'm sure that 1 will return the integer 20234543. What I'm curious about is whether 2 will return 232 (that's what I need to solve my problem). Also, 3 should not return a value. Are these beliefs false? Also, if 2 does act as I believe, how does it handle an e character at the end of the string? (That's typically used in exponential notation.)
You can test this sort of thing yourself. I copied the code from the Cplusplus reference site. It looks like your intuition about the first two examples is correct, but the third example returns 0. 'E' and 'e' are treated just like 'B' in the second example.
So the rules are
On success, the function returns the converted integral number as an int value.
If no valid conversion could be performed, a zero value is returned.
If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.
According to the standard, "The functions atof, atoi, atol, and atoll need not affect the value of the integer expression errno on an error. If the value of the result cannot be represented, the behavior is undefined." (7.20.1, Numeric conversion functions in C99).
So, technically, anything could happen. Even for the first case, since INT_MAX is guaranteed to be at least 32767, and since 20234543 is greater than that, it could fail as well.
For better error checking, use strtol:
const char *s = "232B";
char *eptr;
long value = strtol(s, &eptr, 10); /* 10 is the base */
/* now, value is 232, eptr points to "B" */
s = "20234543";
value = strtol(s, &eptr, 10);
s = "123456789012345";
value = strtol(s, &eptr, 10);
/* If there was no overflow, value will contain 123456789012345,
otherwise, value will contain LONG_MAX and errno will be ERANGE */
If you need to parse numbers with "e" in them (exponential notation), then you should use strtod. Of course, such numbers are floating-point, and strtod returns double. If you want to make an integer out of it, you can do a conversion after checking for the correct range.
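The strtol checks described above could be wrapped up roughly like this. The function name parse_long and its out-parameters are assumptions made for this sketch; only std::strtol, errno and ERANGE come from the standard library.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical wrapper: parses a base-10 long from s. Returns true only if at
// least one digit was consumed and the value fits in a long. On success,
// `rest` points at the first character that was not part of the number.
bool parse_long(const char* s, long& value, const char*& rest) {
    assert(s != nullptr);
    char* end = nullptr;
    errno = 0;  // strtol leaves errno alone on success, so clear it first
    const long parsed = std::strtol(s, &end, 10);
    if (end == s) {
        return false;  // no digits at all, e.g. "B"
    }
    if (errno == ERANGE) {
        return false;  // out of range for long
    }
    value = parsed;
    rest = end;
    return true;
}
```

For "232B" this reports success with value 232 and rest pointing at "B"; for "B" it reports failure instead of silently returning 0 the way atoi does.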
atoi reads digits from the buffer until it can't any more. It stops when it encounters any character that isn't a digit, except whitespace (which it skips) or a '+' or a '-' before it has seen any digits (which it uses to select the appropriate sign for the result). It returns 0 if it saw no digits.
So to answer your specific questions: 1 returns 20234543. 2 returns 232. 3 returns 0. The character 'e' is not whitespace, a digit, '+' or '-' so atoi stops and returns if it encounters that character.
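The scanning rules just described can be written out as a simplified stand-in for atoi. This is an illustrative sketch, not how any particular library implements it; simple_atoi is a made-up name, and the sketch does nothing about overflow.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical, simplified atoi: skips leading whitespace, accepts one
// optional sign, then consumes digits until the first non-digit.
// Returns 0 if no digits were seen. Overflow is not handled.
int simple_atoi(const char* s) {
    assert(s != nullptr);
    while (std::isspace(static_cast<unsigned char>(*s))) {
        ++s;
    }
    bool negative = false;
    if (*s == '+' || *s == '-') {
        negative = (*s == '-');
        ++s;
    }
    int result = 0;
    while (*s >= '0' && *s <= '9') {
        result = result * 10 + (*s - '0');
        ++s;
    }
    return negative ? -result : result;
}
```

Under these rules "232B" gives 232, "B" gives 0, and "232e" gives 232 because 'e' simply ends the scan.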
If atoi encounters a non-number character, it returns the number formed up until that point.
I tried using atoi() in a project, but it wouldn't work if there were any non-digit characters in the mix and they came before the digit characters - it'll return zero. It seems to not mind if they come after the digits, for whatever reason.
Here's a pretty bare-bones string-to-int converter I wrote that doesn't seem to have that problem (bare-bones in that it doesn't handle negative numbers or do any error handling). Hopefully it's helpful in specific instances.
#include <string>

// Builds a value from the digit characters in the string, skipping any
// non-digit characters. No sign or overflow handling.
int stringToInt(const std::string& newIntString)
{
    unsigned int dataElement = 0;
    for (char ch : newIntString)
    {
        if (ch >= '0' && ch <= '9')
        {
            // Shift the digits seen so far one decimal place and append this one.
            dataElement = dataElement * 10 + static_cast<unsigned int>(ch - '0');
        }
    }
    return static_cast<int>(dataElement);
}
I ran into this atoi behaviour while I was learning to program: I was writing a program with a function that calculates the factorial of an integer given as a command-line parameter.
atoi returns 0 if the value is not numeric, and "3asdf" returns 3. As we all know, C passes command-line parameters as an array of char pointers.
I was told that the book "Linux Hater's Handbook" discusses why programmers don't really like atoi: it is somewhat foolish in that there is no way to check the validity of the input.
Someone asked me why I don't bother to use the strtol function from stdlib.h, and gave me an example adapted to my recursive factorial function. I don't handle the case where the factorial result exceeds the range of int (when the base number is too large); it just produces negative values in my program.
I solved my problem with atoi by first checking whether the user's input parameter is truly numeric, and only then calculating the factorial.
Using the isdigit() function from ctype.h, the check looks like this:
#include <ctype.h>
#include <string.h>

/* Returns 1 if the string *str contains a non-digit character, 0 otherwise. */
int checkInput(char *str[]) {
    for (size_t x = 0; x < strlen(*str); ++x)
    {
        /* (*str)[x] indexes into the string; *str[x] would index the array of strings. */
        if (!isdigit((unsigned char)(*str)[x])) return 1;
    }
    return 0;
}
A friend on another Linux programming forum told me that with strtol I could handle out-of-range values, or even parse into an unsigned long so that -0 and other negative values are not accepted.
It's important that the code above checks whether a character is not a digit. Without the negation, the function would report failure as soon as it reached the first digit in the string (or char array, in C).
Writing simple code and looking to see what it does is magical and illuminating.
On point #3, it won't return "nothing." It can't. It'll return something, but that something won't be useful to you.
http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/
On success, the function returns the converted integral number as an int value.
If no valid conversion could be performed, a zero value is returned.
If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.

Quiz: does this compile, and if so, what does it return? (I know the answer)

I found this typo recently:
if (name.find('/' != string::npos))
Obviously the dev meant to type
if(name.find('/') != string::npos)
But I was amazed to find that the typo even compiles with -Wall -Werror (I didn't try with -pedantic).
So, coffee quiz: does it evaluate to true or false?
'/' doesn't equal string::npos since npos is required to be negative, and none of the characters in the basic execution character set is allowed to be negative. Therefore, it's going to look for a value of 1 in the string (presumably a string anyway) represented by name. That's a pretty unusual value to have in a string, so it's usually not going to find it, which means it'll return std::string::npos, which will convert to true.
Edit: as Johannes pointed out, although the value assigned to npos must be negative 1 (as per 21.3/6), that's being assigned to a size_type, which must be unsigned, so the result won't be negative. This wouldn't normally make any real difference though: the '/' would be compared to npos using unsigned arithmetic, so the only way they could have the same value would be if 1) '/' was encoded as -1 (not allowed, as above) or 2) char had the same range as size_type.
In theory, the standard allows char to have the same range as other integral types. In fact, quite a bit of I/O depends on EOF having a value that couldn't originate from the file, which basically translates to a requirement that char have a range that's smaller than int, not just smaller than or equal to (as the standard directly requires).
That does leave one loophole, though it's one that would generally be quite horrible: that char and short have the same range, size_type is the same as unsigned short, and int has a greater range than char/short. Giving char and short the same range wouldn't be all that horrible, but restricting size_type to the same range as short normally would be -- in a typical case, short is 16 bits, so it would restrict containers to 64K. That kind of restriction was problematic 20 years ago under MS-DOS; it simply wouldn't be accepted in most markets today.
It depends on if name starts with a char equal to 1.
You shouldn't be amazed it compiles; there's nothing wrong with it. '/' != std::string::npos evaluates to true, and the only overload of find that works is the (char c, size_t pos) version, as bool can be converted to the integer value 1.
So now we're looking for (char)1, and what that returns depends on the string. If it starts with (char)1, it returns 0 and that's false. In any other case, it returns a non-zero integer, or true.
'/' != string::npos evaluates to true. true is converted to char (value = 1). find probably doesn't find a value of 1, so the expression probably returns string::npos, which is typically -1 (as an unsigned value), which is not zero and is therefore true. My guess: true.
I'd say false, unless name contains a char with value 0x01.
I'm surprised the implicit cast from bool to char doesn't emit a warning... as far as I can tell, it'll return true unless name begins with '\001'.
It will evaluate to true if name contains a char == SOH
otherwise false
Others have posted the correct answer already: The result of the boolean expression should be 1 (a truth value), because '/' should have a value smaller than the unsigned string::npos (defined to be the largest value a size_t can hold). Because 1 is an integer, and because 1 can't possibly be an address, the compiler finds the only overload of string::find() it can call is the one with char c, size_t pos.
But that's not the end of the story. Try to change the boolean expression from '/' != string::npos to '/' == string::npos. Now the result of the expression is 0, again an integer. Because there is no overload for string::find() that takes an int, the compiler must cast 0 -- but to what? It can cast it to a char and it can cast it to a pointer. Both are valid choices, so that's an ambiguous call.
So there you go: your code changes from a valid warning-free function call to an ambiguous function call by changing an operator from != to ==.
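For anyone who wants to try the quiz, here is a small sketch of the typo next to the intended check. The wrapper names typo_find and has_slash are invented for the example; the find calls are the ones from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The typo from the question: the comparison sits inside the call, so find()
// receives the bool `true`, which converts to a char with value 1.
std::size_t typo_find(const std::string& name) {
    return name.find('/' != std::string::npos);
}

// What the developer meant to write.
bool has_slash(const std::string& name) {
    return name.find('/') != std::string::npos;
}

// Example usage, written as checks.
void run_examples() {
    // No char with value 1 in the string: find returns npos, which is non-zero.
    assert(typo_find("a/b") == std::string::npos);
    // A string that starts with char(1): find returns 0, which tests as false.
    assert(typo_find(std::string("\001x")) == 0);
    assert(has_slash("a/b"));
    assert(!has_slash("ab"));
}
```

So the typo'd condition is true for almost every name, with or without a slash, and false only when the name starts with the character whose value is 1.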

Why is there no ^^ operator in C/C++?

& has &&. | has ||. Why doesn't ^ have ^^?
I understand that it wouldn't be short-circuiting, but it would have different semantics. In C, true is really any non-zero value. Bitwise XOR is not always the same thing as logical XOR:
int a=strcmp(str1,str2);// evaluates to 1, which is "true"
int b=strcmp(str1,str3);// evaluates to 2, which is also "true"
int c=a ^^ b; // this would be false, since true ^ true = false
int d=a ^ b; //oops, this is true again, it is 3 (^ is bitwise)
Since you can't always rely on a true value being 1 or -1, wouldn't a ^^ operator be very helpful? I often have to do strange things like this:
if(!!a ^ !!b) // looks strange
Dennis Ritchie answers
There are both historical and practical reasons why there is no ^^ operator.
The practical is: there's not much use for the operator. The main point of && and || is to take advantage of their short-circuit evaluation not only for efficiency reasons, but more often for expressiveness and correctness.
[...]
By contrast, an ^^ operator would always force evaluation of both arms of the expression, so there's no efficiency gain. Furthermore, situations in which ^^ is really called for are pretty rare, though examples can be created. These situations get rarer and stranger as you stack up the operator--
if (cond1() ^^ cond2() ^^ cond3() ^^ ...) ...
does the consequent exactly when an odd number of the condx()s are true. By contrast, the && and || analogs remain fairly plausible and useful.
Technically, one already exists:
a != b
since this will evaluate to true if the truth value of the operands differ.
Edit:
Volte's comment:
(!a) != (!b)
is correct because my answer above does not work for int types. I will delete mine if he adds his answer.
Edit again:
Maybe I'm forgetting something from C++, but the more I think about this, the more I wonder why you would ever write if (1 ^ 2) in the first place. The purpose for ^ is to exclusive-or two numbers together (which evaluates to another number), not convert them to boolean values and compare their truth values.
This seems like it would be an odd assumption for a language designer to make.
For non-bool operands, I guess what you would want is for a ^^ b to be evaluated as:
(a != 0) ^ (b != 0)
Well, you have the above option and you have a few options listed in other answers.
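A minimal sketch of that conversion as a function; logical_xor is a hypothetical name, since the language has no such operator or standard function.

```cpp
#include <cassert>

// Hypothetical helper: converts both operands to truth values first, then
// compares them. For bool operands, != is exactly exclusive or.
bool logical_xor(int a, int b) {
    return (a != 0) != (b != 0);
}

// Example usage, written as checks.
void run_examples() {
    assert(!logical_xor(1, 2));  // both "true", so the logical XOR is false
    assert((1 ^ 2) == 3);        // the bitwise XOR of the same values is non-zero
    assert(logical_xor(0, 5));
    assert(!logical_xor(0, 0));
}
```

Unlike a short-circuiting operator, a function like this always evaluates both arguments, which matches the point made about ^^ in the other answers.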
The operator ^^ would be redundant for bool operands. Talking only about boolean operands, for the sake of argument, let's pretend that ^ was bitwise-only and that ^^ existed as a logical XOR. You then have these choices:
& - Bitwise AND -- always evaluates both operands
&& - Logical AND -- does not always evaluate both operands
| - Bitwise OR -- always evaluates both operands
|| - Logical OR -- does not always evaluate both operands
^ - Bitwise XOR -- must always evaluate both operands
^^ - Logical XOR -- must always evaluate both operands
Why didn't they create ^^ to essentially convert numerical values into bools and then act as ^? That's a good question. Perhaps because it's more potentially confusing than && and ||, perhaps because you can easily construct the equivalent of ^^ with other operators.
I can't say what was in the heads of Kernighan and Ritchie when they invented C, but you made a brief reference to "wouldn't be short-circuiting", and I'm guessing that's the reason: it's not possible to implement it consistently. You can't short-circuit XOR like you can AND and OR, so ^^ could not fully parallel && and ||. So the authors might well have decided that making an operation that sort of kind of looks like it's parallel to the others but isn't quite would be worse than not having it at all.
Personally, the main reason I use && and || is for the short-circuit rather than the non-bitwise. Actually I very rarely use the bitwise operators at all.
Another workaround to the ones posted above (even if it requires another branch in the code) would be:
if ( (a? !b : b ) )
that is equivalent to xor.
In Java the ^ operator indeed does do logical XOR when used on two boolean operands (just like & and | in Java do non-short-circuiting logical AND and OR, respectively, when applied to booleans). The main difference with C / C++ is that C / C++ allows you to mix integers and booleans, whereas Java doesn't.
But I think it's bad practice to use integers as booleans anyway. If you want to do logical operations, you should stick to either bool values, or integers that are either 0 or 1. Then ^ works fine as logical XOR.
An analogous question would be to ask, how would you do non-short-circuiting logical AND and OR in C / C++? The usual answer is to use the & and | operators respectively. But again, this depends on the values being bool or either 0 or 1. If you allow any integer values, then this does not work either.
Regardless of the case for or against ^^ as an operator, your example with strcmp() sucks. It does not return a truth value (true or false); it returns a relation between its inputs, encoded as an integer.
Sure, any integer can be interpreted as a truth value in C, in which case 0 is "false" and all other values are "true", but that is the opposite of what strcmp() returns.
Your example should begin:
int a = strcmp(str1, str2) == 0; // evaluates to 0, which is "false"
int b = strcmp(str1, str3) == 0; // evaluates to 0, which is also "false"
You must compare the return value with 0 to convert it to a proper boolean value indicating if the strings were equal or not.
With "proper" booleans, represented canonically as 0 or 1, the bitwise ^ operator works a lot better, too ...
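As a sketch of that last point: once the strcmp() results have been turned into proper booleans, ^ behaves as a logical XOR. The function name matches_exactly_one is made up for this example.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when s equals exactly one of a and b.
bool matches_exactly_one(const char* s, const char* a, const char* b) {
    assert(s != nullptr && a != nullptr && b != nullptr);
    const bool equals_a = std::strcmp(s, a) == 0;  // canonical 0/1 truth value
    const bool equals_b = std::strcmp(s, b) == 0;
    // With operands restricted to 0 or 1, bitwise ^ gives the logical result.
    return (equals_a ^ equals_b) != 0;
}
```

For example, matches_exactly_one("x", "x", "y") is true, while matching both or neither string gives false.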