Is the following code correct?
char mychar = 200;
printf("%x", mychar);
According to http://www.cplusplus.com/reference/clibrary/cstdio/printf/ %x expects an integer (4 bytes with my compiler) and I only pass 1 byte here. Since printf makes use of varargs, my fear is that this only works because of byte alignment on the stack (i.e. a char always uses 4 bytes when pushed on the stack).
I think it would be better to write:
char mychar = 200;
printf("%x", static_cast<int>(mychar));
Do you think the first code is safe anyway? If not, do you think I could get a different output if I switched to a big-endian architecture?
In your example, the argument is actually of type int. mychar is promoted to int due to the default argument promotions.
(C99, 6.5.2.2p6) "If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions."
and (emphasis mine):
(C99, 6.5.2.2p7) "If the expression that denotes the called function has a type that does include a prototype, the arguments are implicitly converted, as if by assignment, to the types of the corresponding parameters, taking the type of each parameter to be the unqualified version of its declared type. The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter. The default argument
promotions are performed on trailing arguments."
Note that technically the x conversion specifier requires an unsigned int argument, but int and unsigned int are guaranteed to have the same representation.
This only works for you because your platform is able to interpret an int (to which your char is promoted) as the unsigned int that %x expects. To be sure that it always works, you should use an appropriate format specifier, namely %d.
Since it seems that you want to use char as a numerical quantity, it would be better to use one of the two alternative types, signed char or unsigned char, to make your intention clear. char can be a signed or an unsigned type depending on your platform. The correct specifier for unsigned char would then be %hhx.
Simply use printf("%hhx", mychar). And please, don't use cplusplus.com as a reference. This question only confirms its reputation for containing many errors and leaving out a lot of information.
If you're using C++, then you might want to use streams instead.
You will need to cast characters to integers, otherwise they will be printed as glyphs. Also, signed characters will be sign-extended during the cast, so it's better to go through unsigned char. For example, a char holding 0xFF will print as 0xFFFFFFFF unless cast to unsigned char first.
#include <iostream>
#include <iomanip>

int main(void) {
    char c = 0xFF;
    std::cout << std::hex << std::setw(2) << std::setfill('0');
    std::cout << static_cast<int>(static_cast<unsigned char>(c)) << std::endl;
    return 0;
}
I'm doing some research about variadic functions and arguments.
I noticed that va_arg is able to cast objects into other objects. For example, when the next argument is a char, but you are using va_arg like it should be an int, it casts the char to an int:
Function call:
AddNumbers(3, 5, 'a', 20);
Function declaration:
int AddNumbers(int count, ...) {
    int result = 0;  /* must be initialized before accumulating */
    va_list va;
    va_start(va, count);
    for (int i = 0; i < count; ++i) {
        result += va_arg(va, int);
    }
    va_end(va);
    return result;
}
So my question is: what kind of casting does this function use (C-style, dynamic_cast, etc.)?
It has nothing to do with casting. In this case it works because of argument promotion.
When you use a varargs function, integer types smaller than int (like e.g. char) are promoted to int.
That's why it works when passing character literals to the function. If you try some other type, like e.g. a floating point value, then it will no longer work.
As for what the va_arg macro (it's mostly implemented as a macro) does, it is totally implementation dependent. It might not do any casting at all, but use some other form of type punning instead.
It is not exactly casting, because va_arg is a macro, not a function. So the type of va_arg(va, int) is by construction int.
But promotions do occur when calling the variadic function, for the trailing arguments (those after the last declared parameter).
So in AddNumbers(3, 5, 'a', 20);, the character 'a' is promoted to an int. You can then either use it as an int, as your code does, or convert it back to a char (a conversion, not a cast), which is well-defined because the int value can be represented as a char (by construction).
If I understand your question correctly:
C performs certain "casts" by default when calling a variadic function. The C standard calls these the default argument promotions. They are:
char and short int (whether signed or unsigned) become int (or unsigned int if int cannot hold all their values).
float becomes double.
This isn't really a matter of va_arg doing a cast within AddNumbers; instead, when calling AddNumbers, the caller promotes the arguments to AddNumbers as part of pushing them onto the stack for AddNumbers to use.
What are some reasons you would cast a char to an int in C++?
It rarely makes sense to cast a char value to int.
First, a bit of terminology. A cast is an explicit conversion, specified via a cast operator, either the C-style (type)expr or one of the C++ casts such as static_cast<type>(expr). An implicit conversion is not a cast. You'll sometimes see the phrase "implicit cast", but there is no such thing in C++ (or C).
Most arithmetic operators promote their operands if they're of an integer type narrower than int or unsigned int. For example, in the expression '0' + 1, the char value '0' is promoted to int before the addition, and the result is of type int.
If you want to assign a char value to an int object, just assign it. The value is implicitly converted, just as it would be if you used a cast.
In most cases, implicit conversions are preferred to casts, partly because they're less error-prone. An implicit conversion specified by the language rules usually does the right thing.
There are cases where you really do need to cast a char to int. Here's one:
char c = 'A';
std::cout << "c = '" << c << "' = " << static_cast<int>(c) << "\n";
The overloaded << operator accepts either char or int (among many other types), so there's no implicit conversion. Without the cast, the same char value would be used twice. The output (assuming an ASCII-based character set) is:
c = 'A' = 65
This is an unusual case because the << operator treats types char and int very differently. In most contexts, since char is already an integer type, it doesn't really matter whether you use a char or int value.
I can think of one other very obscure possibility. char values are almost always promoted to int. But for an implementation in which plain char is unsigned and char and int are the same width, plain char is promoted to unsigned int. This can only happen if CHAR_BIT >= 16 and sizeof (int) == 1. You're very unlikely to encounter such an implementation. Even on such a system, it usually won't matter whether char is promoted to int or to unsigned int, since either promotion will yield the correct numeric value.
In general, this should rarely happen, as it's fairly clear when to use a char and when to use an int.
However, if you were interested in performing arithmetic on a group of chars, you would require more memory to store the overall value; for this reason you would usually use an int (or another wider type) to store it.
By doing so you would more than likely convert the chars to the chosen type implicitly.
However, you can also explicitly cast these chars before or during the calculation.
This is one such, and the more common, use for casting chars.
In practice, though, this can usually be avoided, as it makes for stronger, cleaner code in the long run.
Consider the following two C programs. My question is: in the first program, the unsigned variable prints -12, but I think it should print 4294967284. It does not print that for the %d specifier, only for the %u specifier. In the second program, the output is 144 where I expected -112. There is something fishy about the unsigned keyword that I am not getting. Any help, friends!
#include <stdio.h>

int main(void)
{
    unsigned int i = -12;
    printf(" i = %d\n", i);
    printf(" i = %u\n", i);
    return 0;
}
I got the above program from this link: Assigning negative numbers to an unsigned int?
#include <stdio.h>

int main(void)
{
    unsigned char a = 200, b = 200, c;
    c = a + b;
    printf("result=%d\n", c);
    return 0;
}
Each printf format specifier requires an argument of some particular type. "%d" requires an argument of type int; "%u" requires an argument of type unsigned int. It is entirely your responsibility to pass arguments of the correct type.
unsigned int i = -12;
-12 is of type int. The initialization implicitly converts that value from int to unsigned int. The converted value (which is positive and very large) is stored in i. If int and unsigned int are 32 bits, the stored value will be 4294967284 (2^32 - 12).
printf(" i = %d\n",i);
i is of type unsigned int, but "%d" requires an int argument. The behavior is not defined by the C standard. Typically the value stored in i will be interpreted as if it had been stored in an int object. On most systems, the output will be i = -12 -- but you shouldn't depend on that.
printf(" i = %u\n",i);
This will correctly print the value of i (assuming the undefined behavior of the previous statement didn't mess things up).
For ordinary functions, assuming you call them correctly, arguments will often be implicitly converted to the declared type of the parameter, if such a conversion is available. For a variadic function like printf, which can take a varying number and type(s) of arguments, no such conversion can be done, because the compiler doesn't know what type is expected. Instead, arguments undergo the default argument promotions. An argument of a type narrower than int is promoted to int if int can hold all values of the type, or to unsigned int otherwise. An argument of type float is promoted to double (which is why "%f" works for both float and double arguments).
The rules are such that an argument of a narrow unsigned type will often (but not always) be promoted to (signed) int.
unsigned char a=200, b=200, c;
Assuming 8-bit bytes, a and b are set to 200.
c = a+b;
The sum 400 is too big to fit in an unsigned char. For unsigned arithmetic and conversion, out-of-range results are reduced to the range of the type. c is set to 144.
printf("result=%d\n",c);
The value of c is promoted to int; even though the argument is of an unsigned type, int is big enough to hold all possible values of the type. The output is result=144.
In the first program the behaviour is undefined. It's your responsibility to make sure that the format specifier matches the data type of the argument. The compiler emits code that assumes you got it right; at runtime it does not have to do any checks (and often, cannot do any checks even if it wanted to).
(For example, the library's printf implementation does not know what arguments you gave it; it only sees some bytes, and it has to assume those are the bytes for the type that you specified using %d.)
You appear to be trying to infer something unsigned means based on the output of a program with undefined behaviour. That won't work. Stick to well-defined programs (and preferably just read the definition of unsigned).
In a comment you say:
Could you give me any reference for the unsigned keyword? The concept is still not clear to me. Where is unsigned defined in the C/C++ standard?
In the C99 standard read section 6.2.5, from part 6 onwards.
The definition of unsigned int is an integer type that can hold values from 0 up to a positive number UINT_MAX (which should be one less than a power of two), which must be at least 65535, and typically is 4294967295.
When you write unsigned int i = -12;, the compiler sees that -12 is outside of the range of permitted values for unsigned int, and it performs a conversion. The definition of that conversion is to add or subtract UINT_MAX+1 until the value is in range.
The second part of your question is unrelated to all this. There are no unsigned int in that program; only unsigned char.
In that program, 200 + 200 gives 400. As mentioned above, since this is out of range the compiler converts it by subtracting UCHAR_MAX+1 (i.e. 256) until it is in range. 400 - 256 = 144.
The %d and %u specifiers of printf have the capability of (or, are responsible for) interpreting the passed integer as an int and an unsigned int, respectively.
In fact printf (in general, any variadic function) and arithmetic operators can accept only three types of arguments (apart from the format string): 4-byte int, 8-byte long long and double (warning: very inaccurate description!). Any integral argument whose size is less than int is extended into int. Any float argument is extended into double. These rules improve uniformity of the input parameters of printf and arithmetic operators.
Regarding your 2nd example: the following steps take place
The + operator requires (unsigned) char operands to be extended into (unsigned) int values (which are 4-bytes integers in your case, I assume.)
The resulting sum is 400 of a 4-bytes unsigned int.
Only the least significant 1 byte of the above sum can fit into unsigned char c, so c has the value of 400 % 256 == 144.
printf requires all the smaller integral arguments to be expanded into int, thus what printf receives is 144 as a 4-byte int.
The %d specifier prints the above argument as "144".
Google for "default argument promotion" for more details.
I'm using a QByteArray to store raw binary data. To store the data I use QByteArray's append function.
I like using unsigned chars to represent bytes since I think 255 is easier to interpret than -1. However, when I try to append a zero-valued byte to a QByteArray as follows:
command.append( (unsigned char) 0x00 );
the compiler complains with call of overloaded append(unsigned char) is ambiguous. As I understand it this is because a zero can be interpreted as a null pointer, but why doesn't the compiler treat the unsigned char as a char, rather than wondering if it's a const char*? I would understand if the compiler complained about command.append(0) without any casting, of course.
Both overloads require a type conversion - one from unsigned char to char, the other from unsigned char to const char *. The compiler doesn't try to judge which one is better, it just tells you to make it explicit. If one were an exact match it would be used:
command.append( (char) 0x00 );
unsigned char and char are two different, yet convertible types. unsigned char and const char * also are two different types, also convertible in this specific case. This means that neither of your overloaded functions is an exact match for the argument, yet in both cases the arguments are convertible to the parameter types. From the language point of view both functions are equally good candidates for the call. Hence the ambiguity.
You seem to believe that the unsigned char version should be considered a "better" match. But the language disagrees with you.
It is true that in this case the ambiguity stems from the fact that (unsigned char) 0x00 is a valid null-pointer constant. You can work around the problem by introducing an intermediate variable
unsigned char c = 0x0;
command.append(c);
c does not qualify as a null-pointer constant, which eliminates the ambiguity. Although, as @David Rodríguez - dribeas noted in the comments, you can also eliminate the ambiguity by simply casting your zero to char instead of unsigned char.
How does C/C++ deal if you pass an int as a parameter into a method that takes in a byte (a char)? Does the int get truncated? Or something else?
For example:
void method1()
{
int i = //some int;
method2(i);
}
void method2(byte b)
{
//Do something
}
How does the int get "cast" to a byte (a char)? Does it get truncated?
If byte stands for char type, the behavior will depend on whether char is signed or unsigned on your platform.
If char is unsigned, the original int value is reduced to the unsigned char range modulo UCHAR_MAX+1. Values in [0, UCHAR_MAX] range are preserved. C language specification describes this process as
... the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
If char type is signed, then values within [SCHAR_MIN, SCHAR_MAX] range are preserved, while any values outside this range are converted in some implementation-defined way. (C language additionally explicitly allows an implementation-defined signal to be raised in such situations.) I.e. there's no universal answer. Consult your platform's documentation. Or, better, write code that does not rely on any specific conversion behavior.
It is simply truncated as a bit pattern (byte is in general unsigned char; however, you have to check):
int i = -1;
becomes
byte b = 255; /* when byte is unsigned char */
byte b = -1;  /* when byte is signed char */
and similarly:
i = 0;    b = 0;
i = 1024; b = 0;
i = 1040; b = 16;
Quoting the C++ 2003 standard:
Clause 5.2.2 paragraph 4: When a function is called, each parameter (8.3.5) shall be initialized (8.5, 12.8, 12.1) with its corresponding argument.
So, b is initialized with i. What does that mean?
8.5/14: the initial value of the object being initialized is the (possibly converted) value of the initializer expression. Standard conversions (clause 4) will be used, if necessary, to convert the initializer expression to the … destination type; no user-defined conversions are considered
Oh, i is converted, using the standard conversions. What does that mean? Among many other standard conversions are these:
4.7/2: If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).
4.7/3: If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
Oh, so if char is unsigned, the value is truncated to the number of bits in a char (or computed modulo UCHAR_MAX+1, whichever way you want to think about it.)
And if char is signed, then the value is unchanged, if it fits; implementation-defined otherwise.
In practice, on the computers and compilers you care about, the value is always truncated to fit in 8 bits, regardless of whether chars are signed or unsigned.
You don't say what a byte is, but if you pass an argument that is convertible to the parameter type, the value will be converted.
If the types have different value ranges there is a risk that the value is outside the range of the parameter type, and then it will not work. If it is within the range, it will be safe.
Here's an example:
1) Code:
#include <stdio.h>
void
method1 (unsigned char b)
{
int a = 10;
printf ("a=%d, b=%d...\n", a, b);
}
void
method2 (unsigned char * b)
{
int a = 10;
printf ("a=%d, b=%d...\n", a, *b);
}
int
main (int argc, char *argv[])
{
int i=3;
method1 (i);
method2 (i);
return 0;
}
2) Compile (with warning):
$ gcc -o x -Wall -pedantic x.c
x.c: In function `main':
x.c:22: warning: passing arg 1 of `method2' makes pointer from integer without a cast
3) Execute (with crash):
$ ./x
a=10, b=3...
Segmentation fault (core dumped)
Hope that helps, both with your original question and with related issues.
There are two cases to worry about:
// Your input "int i" gets truncated
void method2(byte b)
{
...
// Your "method2()" stack gets overwritten
void method2(byte * b)
{
...
It will be converted to a byte the same as if you cast it explicitly with (byte)i.
Your sample code above might be a different case though, unless you have a forward declaration for method2 that is not shown. Because method2 is not yet declared at the time it is called, the compiler doesn't know the type of its first parameter. In C, functions should be declared (or defined) before they are called. What happens in this case is that the compiler assumes (as an implicit declaration) that method2's first parameter is an int, and method2 receives an int. Officially that results in undefined behaviour, but on most architectures both int and byte would be passed in the same size register anyway, and it will happen to work.