How to understand the following single and double quote behavior in C++?

Help me understand the following:
cout<<'a'; //prints a and it's okay but
cout<<'ab'; //prints 24930, but I was expecting an error because 'ab' has two characters inside single quotes
cout<<'a'+1; //prints 98
cout<<"ab"; // prints ab and it's okay but
cout<<"ab"+1; // prints b, why?
cout<<"a"+1; // prints nothing ?
cout<<'a'+'b'; // prints 195 ?
cout<<"a"+"b"; // gives error ?
Please help me understand all of these in detail. I am very confused. I would be very thankful.

'a' is a char type in C++. std::cout overloads << for a char to output the character, rather than the character number.
'ab' is a multicharacter literal in C++. It must have an int type. Its value is implementation defined, but 'a' * 256 + 'b' is common. With ASCII encoding, that is 24930. The overloaded << operator for an int outputs the number.
'a' + 1 is an arithmetic expression. 'a' is converted to an int type prior to the addition according to the standard integral type promotion rules.
"ab" + 1 is executing pointer arithmetic on the const char[3] type, so it's equivalent to "b". Remember that << has lower precedence than +.
"a" + 1 is similar to the above but only the NUL-terminator is output.
'a' + 'b' is an int type. Both arguments are converted to an int prior to the addition.
The arguments of "a" + "b" decay to const char* types prior to the addition. But that's the addition of two pointers, which is not valid C++.


characters in C and C++

What I am asking is: is my understanding below completely correct?
char x = 'b'; (x holds one value that can be viewed in two ways, as an int or as a char, in both languages, C and C++)
in C
If I let the compiler choose, it gives priority to the int view, because the C standard gives character literals type int, as in this code:
#include <stdio.h>
int main() {
    printf("%d", sizeof('b'));
    return 0;
}
The output is 4 (the size of int), meaning the compiler changes b to 98 first and takes its size as an int.
But I can use the char value if I choose, as in this code:
#include <stdio.h>
int main() {
    char x = 'b';
    printf("%c", x);
    return 0;
}
The output is b (a char).
in C++
If I let the compiler choose, it gives priority to the char view, because the C++ standard gives character literals type char, as in this code:
#include <stdio.h>
int main() {
    printf("%d", sizeof('b'));
    return 0;
}
The output is 1 (the size of char), meaning the compiler treats b as a char and did not change it to an int.
But I can use the int value if I choose, as in this code:
#include <stdio.h>
int main() {
    char x = 'b';
    printf("%d", x);
    return 0;
}
The output is 98 (the int that represents b in ASCII).
Character constants like 'A' have type int in C and type char in C++. The compiler picks whichever type the respective language standard specifies.
Declared character variables of type char are always char and 1 byte large (typically 8 bits on non-exotic systems).
printf("%c", some_char); is a variadic function (accepts any number of parameters) and those have special implicit type promotion rules. Something called default argument promotion will integer-promote the passed character variable to int. Read about integer promotion here: Implicit type promotion rules.
printf expects that promotion to happen, so %c will mean that parameter is converted back to char according to the internals of printf. This holds true in C++ as well, though stdio.h should be avoided.
printf("%d", x); In case x is a char this would pedantically be undefined behavior. But in practice the above mentioned integer promotion is likely to occur and so it prints an integer. Also note that there's nothing magic with char as such, they are just 1 byte large integers. So it already has the value 98 before conversion.
Some of the things you've said are true, some are mistaken. Let's go through them in turn.
char x='b'; (x has two values one of them is int and the other is char in the two languages ( C and C++))
Well, no, not really. When you say char x you get a variable generally capable of holding one character, and this is perfectly, equally true in both C and C++. The only difference, as we'll see, is the type (not the value) of the character constant 'b'.
in C if I let the compiler choose
I'm not sure what you mean by "choose", because there's not really any choice in the matter.
it will choose (will give priority to) the int value, as in this code
printf("%d", sizeof('b'));
output is: 4 (int size), meaning the compiler changes b to 98 first and takes its size as an int
Not exactly. You got 4 because, yes, the type of a character constant like 'b' is int. In ASCII, the character constant 'b' has the value 98 no matter what. The compiler didn't change a character to an int here, and the value didn't change from 'b' to 98 here. It was an int all along (because that's the type of character constants in C), and it had the value 98 all along (because that's the value of the letter b in ASCII).
But I can use the char value if I choose, as in this code:
printf("%c", x);
Right. But there's nothing magical about that. Consider:
char c1 = 'b';
int i1 = 'b';
char c2 = 98;
int i2 = 98;
printf("%c %c %c %c\n", c1, i1, c2, i2);
printf("%d %d %d %d\n", c1, i1, c2, i2);
This prints
b b b b
98 98 98 98
You can print an int, or a char, using %c, and you'll get the character with that value.
You can print an int, or a char, using %d, and you'll get a numeric value.
(More on this later.)
in C++ if I let the compiler choose it will choose (will give priority to) the char value, as in this code
printf("%d", sizeof('b'));
output is: 1 (char size), meaning the compiler did not change b to an int
What you're seeing is one of the big differences between C and C++: character constants like 'b' are type int in C, but type char in C++.
(Why is it this way? I'll speculate on that later.)
but I can use the int value if I choose, as in this code
char x = 'b';
printf("%d", x);
output is 98 (98 is the int that represents b in ASCII)
Right. And this would work exactly the same in C and C++. (Also my earlier printfs of c1, i1, c2, and i2 would work exactly the same in C and C++.)
So why are the types of character constants different? I'm not sure, but I believe it's like this:
C likes to promote everything to int. (Incidentally, that's why we were able to pass characters straight to printf and print them using %d: the characters all get promoted to int before being passed to printf, so it works just fine.) So there would be no point having character constants of type char, because any time you used one, for anything, it would get promoted to int. So character constants might as well start out being int.
In C++, on the other hand, the type of things matters more. And you might have two overloaded functions, f(int) and f(char), and if you called f('b'), clearly you want the version of f() called that accepts a char. So in C++ there was a reason, a good reason, to have character constants be type char, just like it looks like they are.
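A small sketch of that overload-resolution point, using a hypothetical pair of functions f (names chosen just for illustration):
#include <iostream>

// Hypothetical overloads, only to show why the type of a character literal matters in C++.
void f(char) { std::cout << "f(char)\n"; }
void f(int)  { std::cout << "f(int)\n"; }

int main() {
    f('b');  // in C++, 'b' has type char, so the char overload is chosen
    f(98);   // an int argument selects the int overload
}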
Addendum:
The fundamental issue that you're asking about here, that we've been kind of dancing around in this answer and these comments, is that in C (as in most languages) there are several forms of constant, that let you write constants in forms that are convenient and meaningful to you. For any different form of constant you can write, there are several things of interest, but most importantly what is the actual value? and what is the type?.
It may be easier to show this by example. Here is a rather large number of ways of representing the constant value 98 in a C or C++ program:
Form of constant    base    type        value
98                  10      int         98
0142                8       int         98
0x62                16      int         98
98.                 10      double      98
9.8e1               10      double      98
98.f                10      float       98
9.8e1f              10      float       98
98L                 10      long        98
98U                 10      unsigned    98
'b'                 ASCII   int / char  98
This table is not even complete; there are more ways than these to write constants in C and C++.
Every row in this table is equally true of both C and C++, except the last one. In C, the type of a character constant is int. In C++, the type of a character constant is char.
The type of a constant determines what happens when a constant appears in a larger expression. In that respect the type of a constant functions analogously to the type of a variable. If I write
int a = 1, b = 3;
double c = a / b;
it doesn't work right, because the rule in C is that when you divide an int by an int, you get truncating integer division. But the point is that the type of an operand directly determines the meaning of an expression. So the type of a constant becomes very interesting, too, as seen by the different behavior of these two lines:
double c2 = 1 / 3;
double c3 = 1. / 3;
Similarly, it can make a difference whether a constant has type int or type char. An expression that depends on whether a character constant has type int or type char will behave slightly differently in C versus C++. (In practice, pretty much the only difference that can be easily seen concerns sizeof.)
For completeness, it may be useful to look at the several other forms of character constant, in the same framework:
Form of constant
base
type
value
'b'
ASCII
int/char
98
'\142'
8
int/char
98
'\x62'
16
int/char
98
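As a quick check, a minimal sketch (assuming an ASCII execution character set, where 'b' is 98) verifying that the three spellings denote the same character:
#include <iostream>

int main() {
    static_assert('b' == '\142' && 'b' == '\x62',
                  "all three spellings name the same value (assuming ASCII)");
    std::cout << 'b' << ' ' << '\142' << ' ' << '\x62' << '\n';  // prints: b b b
}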

How does a C++ compiler interpret comparison logic on strings/characters?

When we compare numbers in a string/character format, how does the c++ compiler interpret it? The example below will make it clear.
#include <iostream>
using namespace std;
int main() {
    if ('1' < '2')
        cout << "true";
    return 0;
}
The output is
true
What is happening inside the compiler? Is there an implicit conversion from string to integer, just like when we index an array with a character:
arr['a']
=> arr[97]
'1' is a char type in C++ with an implementation-defined value; the ASCII value of the character 1 is common, and the value cannot be negative.
The expression arr['a'] is defined as per pointer arithmetic: *(arr + 'a'). If this is outside the bounds of the array then the behaviour of the program is undefined.
Note that '1' < '2' is true on any platform. The same cannot be said for 'a' < 'b' always being true although I've never come across a platform where it is not true. That said, in ASCII 'A' is less than 'a', but in EBCDIC (in all variants) 'A' is greater than 'a'!
The behaviour of an expression like "ab" < "cd" is unspecified. This is because both const char[3] constants decay to const char* types, and the behaviour of comparing two pointers that do not point to objects in the same array is unspecified.
(A final note: in C '1', '2', and 'a' are all int types.)
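If the goal is to compare the contents of two string literals rather than their addresses, a minimal sketch of the usual alternatives (std::strcmp or std::string) might look like this:
#include <cstring>
#include <iostream>
#include <string>

int main() {
    const char* a = "ab";
    const char* b = "cd";
    std::cout << (std::strcmp(a, b) < 0) << '\n';            // compares the characters: prints 1
    std::cout << (std::string(a) < std::string(b)) << '\n';  // std::string also compares contents: prints 1
    // a < b (or "ab" < "cd") would compare two unrelated pointers; the result is unspecified
}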
The operands '1' and '2' are not strings, they're char literals.
The characters represent specific numbers of type char, typically defined by the ASCII table, specifically 49 for '1' and 50 for '2'.
The operator < compares those numbers, and since the number representation of '1' is lesser than that of '2', the result of '1'<'2' is true.
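A tiny sketch that makes those underlying numbers visible (the values 49 and 50 assume ASCII):
#include <iostream>

int main() {
    std::cout << int('1') << ' ' << int('2') << '\n';  // 49 50 with ASCII
    std::cout << ('1' < '2') << '\n';                  // the comparison uses those values: prints 1
}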

subtracting characters of numbers

I am trying to understand what is going on with the code:
cout << '5' - '3';
Is what I am printing an int? Why does it automatically change them to ints when I use the subtraction operator?
In C++ character literals just denote integer values.
A basic literal like '5' denotes a char integer value, which with almost all extant character encodings is 48 + 5 (because the character 0 is represented as value 48, and the C++ standard guarantees that the digit values are consecutive, although there's no such guarantee for letters).
Then, when you use them in an arithmetic expression, or even just write +'5', the char values are promoted to int. More precisely, the "usual arithmetic conversions" kick in and convert each operand up to int or a higher type* that can represent all char values. This change of type affects how e.g. cout will present the value.
* Since a char is a single byte by definition, since an int can't be smaller than one byte, and since in practice all bits of an int are value-representation bits, it is at best only in the most pedantic formal sense that a char could be converted up to a type higher than int. If that possibility exists in the formal wording, it is pure language-lawyer stuff.
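A minimal sketch confirming that the result of subtracting two chars is an int:
#include <iostream>
#include <type_traits>

int main() {
    // The char operands are promoted, so the whole expression has type int.
    static_assert(std::is_same<decltype('5' - '3'), int>::value,
                  "subtracting two chars yields an int");
    std::cout << '5' - '3' << '\n';  // prints 2, not a character
}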
What you're doing here is subtracting the value for ASCII character '5' from the value for ASCII character '3'. So '5' - '3' is equivalent to 53 - 51 which results in 2.
Every character in C and C++ is represented by an integer value, commonly its ASCII value. For example, the ASCII value of 'a' is 97: if you store the character 'a' in a char variable, it is that value, 97, which is stored.
Subtracting '5' and '3' therefore means subtracting their ASCII values, so cout << '5' - '3'; behaves like cout << 53 - 51;.
Because the subtraction is performed on two integer values, the result is an integer, and it prints the integer 2.
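A common application of the same idea, sketched below, converts a digit character to its numeric value; the standard guarantees that the digits '0' through '9' are consecutive in every encoding:
#include <iostream>

int main() {
    char digit = '7';
    int value = digit - '0';     // '0'..'9' are consecutive, so this yields 7
    std::cout << value << '\n';  // prints 7
}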

Trouble understanding this output

Can someone please guide me as to how these answers are produced? For (ii), why are the letters being turned into numbers? For (iii), what is going on here?
Problem 23: Suppose that a C++ program called prog.cpp is compiled and correctly executed on venus with the instructions:
venus> g++ prog.cpp
venus> a.out file1 file2 file3
For each of the following short segments of the program prog.cpp write exactly what output is produced. Each answer should consist of those symbols printed by the given part of the program and nothing else.
(ii)
char a = 'a';
while (a <= 'f') {
    cout << 'a' - a;
    a = a + 1;
}
Answer:
0-1-2-3-4-5
(iii)
int main(int argc, char *argv[]) {
cout << argc;
Answer:
4
'a' - a
returns a number: the ASCII value of the variable a is subtracted from the ASCII value of the character literal 'a' (97), and the resulting difference is printed as an integer.
In the second case, argc is the number of command-line arguments (including the program name) with which the program was run.
cout << 'a' - a;
This is actually a very interesting case. Clearly the values should be 0, then -1, then -2, etc., but if you cout those numbers with char type (e.g. cout << char(0) << char(-1);) you'll get "garbage" (or perhaps nothing) on your terminal - 0 is a non-printable NUL character, and (char)-1, (char)-2 might end up being rendered as some strange graphics character (e.g. blank and a square dot per http://www.goldparser.org/images/chart-set-ibm-pc2.gif)...
The reason they're being printed as readable numbers is that the 'a' - a expression evaluates to an int type - not char - and ints do print in a human-readable numeric format.
'a' is a character literal of type char, encoded using the ASCII value 97. a is a variable but also of type char. And yet, before one can be subtracted from the other they undergo Integral Promotion to type int; from the C++11 Standard:
4.5 Integral promotions [conv.prom]
1 A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
Given 4.13 says char has lower rank than int, this means char can be converted to int if needed, but why is it needed?
The compiler conceptually provides an int operator-(int, int) but no char operator-(char, char), which forces the subtraction of two char values to be shoehorned into the int operator after the integral promotion.
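Putting the quoted fragment back together as a complete, runnable sketch makes it easy to see where the digits and minus signs come from:
#include <iostream>
using namespace std;

int main() {
    char a = 'a';
    while (a <= 'f') {
        cout << 'a' - a;  // an int: 0, -1, -2, -3, -4, -5 across the six iterations
        a = a + 1;
    }
    cout << '\n';         // the concatenated output is: 0-1-2-3-4-5
}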
(ii)
char a = 'a';
while (a <= 'f') {
    cout << 'a' - a;
    a = a + 1;
}
Answer: 0-1-2-3-4-5
Explanation:
This is a loop over the character codes from 'a' to 'f': it keeps running while a is less than or equal to 'f', so it executes once for each of the six characters 'a' through 'f'.
Inside the loop it subtracts the value of the variable a from the value of the literal 'a', producing 0, -1, -2, -3, -4, -5 in turn, which are printed back to back as 0-1-2-3-4-5. To understand this better, you can add print statements before each statement.
(iii)
int main(int argc, char *argv[]) {
cout << argc;
}
Answer: 4
Explanation: Since the answer is 4, I presume there is no syntax error and the missing closing brace '}' is just a typo in the question.
argv - an array of pointers to the argument strings
argc - the count of elements in argv
When you run the executable, the command-line parameters you pass are stored in argv and the number of elements is stored in argc.
argc being 4 means the program was invoked as myexe.exe param1 param2 param3: three parameters plus myexe.exe itself make the count 4.
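A minimal sketch of a main that reports argc and echoes argv (run it as, say, a.out file1 file2 file3 to see 4):
#include <iostream>

int main(int argc, char* argv[]) {
    std::cout << argc << '\n';  // with "a.out file1 file2 file3" this prints 4
    for (int i = 0; i < argc; ++i)
        std::cout << i << ": " << argv[i] << '\n';  // argv[0] is the program name
    return 0;
}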

return value of sizeof() operator in C & C++ [duplicate]

This question already has answers here:
Size of character ('a') in C/C++
(4 answers)
Closed 9 years ago.
#include <stdio.h>
int main()
{
    printf("%d", sizeof('a'));
    return 0;
}
Why does the above code produce different results when compiling in C and C++ ?
In C it prints 4, while in C++ it prints the more acceptable answer, i.e. 1.
When I replace the 'a' inside sizeof() with a char variable declared in the main function, the result is 1 in both cases!
Because, and this might be shocking, C and C++ are not the same language.
C defines character literals as having type int, while C++ considers them to have type char.
This is a case where multi-character constants can be useful:
const int foo = 'foo';
That will generate an integer whose value will probably be 6713199 or 7303014 depending on the byte-ordering and the compiler's mood. In other words, multiple-character character literals are not "portable", you cannot depend on the resulting value being easy to predict.
As commenters have pointed out (thanks!), this is valid in both C and C++; C++ simply gives multi-character literals a different type (int) than single-character literals (char). Clever!
Also, as a minor note that I like to mention when on topic: sizeof is not a function, and its result has type size_t, which is not int. Thus:
printf("the size of a character is %zu\n", sizeof 'a');
or, if your compiler is too old to support C99:
printf("the size of a character is %lu\n", (unsigned long) sizeof 'a');
represent the simplest and most correct way to print the sizes you're investigating.
In C, 'a' is a character constant with type int, so you get a size of 4 (on typical platforms), whereas in C++ it has type char.
Your example is one of the cases where C++ is not compatible with C. In C, APIs that read a single character (like getch) return an int, because this leaves room for an EOF marker (-1). For this reason, C sees 'c' as an int.
C++ introduces operator and function overloading, which means that we probably want to handle
cout << 'c'
differently from:
cout << 99
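A minimal sketch of that difference in overload resolution (the value 99 assumes ASCII):
#include <iostream>

int main() {
    std::cout << 'c' << '\n';                    // char overload of <<: prints c
    std::cout << 99 << '\n';                     // int overload of <<: prints 99
    std::cout << static_cast<int>('c') << '\n';  // force the numeric value: prints 99 with ASCII
}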