Why does printf not print out just one byte when printing hex? - c++

pixel_data is a vector of char.
When I do printf(" 0x%1x ", pixel_data[0] ) I'm expecting to see 0xf5.
But I get 0xfffffff5 as though I was printing out a 4 byte integer instead of 1 byte.
Why is this? I have given printf a char to print out - it's only 1 byte, so why is printf printing 4?
NB. the printf implementation is wrapped up inside a third-party API, but I'm just wondering: is this a feature of standard printf?

You're probably getting a benign form of undefined behaviour because the %x conversion expects an unsigned int argument, and a char will usually be promoted to an int when passed to a varargs function.
You should explicitly cast the char to an unsigned int to get predictable results:
printf(" 0x%1x ", (unsigned)pixel_data[0] );
Note that a field width of one is not very useful. It merely specifies the minimum number of digits to display and at least one digit will be needed in any case.
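For instance (a minimal sketch with literal values standing in for pixel_data[0]):
printf(" 0x%1x ", 0xf5u);   /* prints 0xf5 - the value already needs two digits */
printf(" 0x%02x ", 0x5u);   /* prints 0x05 - a width of 2 with zero padding is usually what you want */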
If char on your platform is signed then this conversion will convert negative char values to large unsigned int values (e.g. fffffff5). If you want to treat byte values as unsigned and just zero-extend when converting to unsigned int, you should use unsigned char for pixel_data, cast via unsigned char, or use a masking operation after promotion.
e.g.
printf(" 0x%x ", (unsigned)(unsigned char)pixel_data[0] );
or
printf(" 0x%x ", (unsigned)pixel_data[0] & 0xffU );

Better to use the standard format flags:
printf(" %#1x ", pixel_data[0] );
Then printf puts in the hex prefix for you.
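For illustration, a minimal sketch with literal values (note the # flag emits no prefix when the value is zero):
printf(" %#x ", 0xf5u);   /* prints 0xf5 - the # flag supplies the 0x prefix */
printf(" %#x ", 0u);      /* prints 0 - no prefix for a zero value */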

Use %hhx
printf("%#04hhx ", foo);

The field width (04 here) is a minimum width, zero-padded, and the hh length modifier tells printf the argument is only a char.
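For example (a minimal sketch, assuming foo is an unsigned char holding a pixel value):
unsigned char foo = 0xf5;
printf("%#04hhx ", foo);   /* prints 0xf5 - two digits plus the 0x prefix fill the width of 4 */
foo = 0x05;
printf("%#04hhx ", foo);   /* prints 0x05 - the zero padding goes after the prefix */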

The width specifier in printf is actually a minimum width. You can do printf(" 0x%2x ", pixel_data[0] & 0xff) to print the lowest byte (notice the 2, to actually print two characters if pixel_data[0] is e.g. 0xffffff02).

Related

C/C++ Printing bytes in hex, getting weird hex values

I am using the following to print out numbers from an array in hex:
char buff[1000];
// Populate array....
int i;
for (i = 0; i < 1000; ++i)
{
    printf("[%d] %02x\n", i, buff[i]);
}
but I sometimes print weird values:
byte[280] 00
byte[281] 00
byte[282] 0a
byte[283] fffffff4 // Why is this a different length?
byte[284] 4e
byte[285] 66
byte[286] 0a
Why does this print 'fffffff4'?
Use %02hhx as the format string.
From CppReference, %02x accepts unsigned int. When you pass the arguments to printf(), which is a variadic function, buff[i] is automatically converted to int. The format specifier %02x then makes printf() interpret the value as int, so potential negative values like (char)-1 get interpreted and printed as (int)-1, which is the cause of what you observed.
It can also be inferred that your platform has signed char type, and a 32-bit int type.
The length modifier hh tells printf() to interpret whatever is supplied as a char type, so %hhx is the correct format specifier for unsigned char.
Alternatively, you can cast the data to unsigned char before printing. Like
printf("[%d] %02x\n", i, (unsigned char)buff[i]);
This also prevents negative values from showing up too long, as an int can (almost) always hold any unsigned char value.
See the following example:
#include <stdio.h>
int main(){
    signed char a = +1, b = -1;
    printf("%02x %02x %02hhx %02hhx\n", a, b, a, b);
    return 0;
}
The output of the above program is:
01 ffffffff 01 ff
Your platform apparently has signed char. On platforms where char is unsigned the output would be f4.
When calling a variadic function any integer argument smaller than int gets promoted to int.
A char value of f4 (-12 as a signed char) has the sign bit set, so when converted to int becomes fffffff4 (still -12 but now as a signed int) in your case.
%02x causes printf to treat the argument as an unsigned int and will print it using at least 2 hexadecimal digits.
The output doesn't fit in 2 digits, so as many as are required are used.
Hence the output fffffff4.
To fix it, either declare your array unsigned char buff[1000]; or cast the argument:
printf("[%d] %02x\n", i, (unsigned char)buff[i]);

bit shift for unsigned int, why negative?

Code:
unsigned int i = 1<<31;
printf("%d\n", i);
Why is the output -2147483648, a negative value?
Updated question:
#include <stdio.h>
int main(int argc, char * argv[]) {
    int i = 1<<31;
    unsigned int j = 1<<31;
    printf("%u, %u\n", i, j);
    printf("%d, %d\n", i, j);
    return 0;
}
The above print:
2147483648, 2147483648
-2147483648, -2147483648
So, does this mean signed int and unsigned int have the same bit values, and the difference is how you treat bit 31 when converting it to a number value?
%d prints the int version of the unsigned int i. Try %u for unsigned int.
printf("%u\n", i);
int main(){
    printf("%d, %u", -1, -1);
    return 0;
}
Output: -1, 4294967295
That is, understanding the way a signed integer is stored and how it gets converted between signed and unsigned will help you.
To answer your updated question: it is how the system represents them, i.e. in 2's complement, as in the above case where -1 = 2's complement of 1 = 4294967295.
Use '%u' for unsigned int
printf("%u\n", i);
Response to updated question: any sequence of bits can be interpreted as a signed or unsigned value.
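A minimal sketch of that point, assuming a 32-bit two's complement int:
#include <stdio.h>
#include <string.h>
int main(void) {
    unsigned int u = 0x80000000u;   /* the bit pattern produced by 1U << 31 */
    int s;
    memcpy(&s, &u, sizeof s);       /* reinterpret the very same bits as signed */
    printf("%u\n", u);              /* prints 2147483648 */
    printf("%d\n", s);              /* prints -2147483648 on a two's complement system */
    return 0;
}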
printf("%d\n", i); invokes UB. i is unsigned int and you try to print it as signed int. Writing 1 << 31 instead of 1U << 31 is undefined too.
Print it as:
printf("%u\n", i);
or
printf("%X\n", i);
About your updated question: it also invokes UB for the very same reasons. (If you use 1U instead of 1, the problem becomes that an int is initialized with 1U << 31, an out-of-range value. If an unsigned type is initialized with an out-of-range value, modular arithmetic comes into the picture and the remainder is assigned; for a signed type the result is implementation-defined.)
Understanding the behavior on your platform
On your platform, int appears to be 4 bytes. When you write something like 1 << 31, it produces the bit pattern 0x80000000 on your machine.
Now when you try to print this pattern as signed, it prints the signed interpretation, which is -2^31 (a.k.a. INT_MIN) on a two's complement system. When you print it as unsigned, you get the expected 2^31 as output.
Learnings
1. Use 1U << 31 instead of 1 << 31
2. Always use correct print specifiers in printf.
3. Pass correct argument types to variadic functions.
4. Be careful when implicit conversions (unsigned -> signed, wider type -> narrower type) take place; if possible, avoid them completely. A corrected version of the program is sketched below.
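Putting those points together, a minimal sketch of the corrected program:
#include <stdio.h>
int main(void) {
    unsigned int j = 1U << 31;   /* shift an unsigned operand: well defined */
    printf("%u\n", j);           /* matching specifier: prints 2147483648 */
    return 0;
}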
Try
printf("%u\n", i);
With the %d specifier, printf expects the argument to be an int and interprets it as one.
So, use %u for unsigned int.
You are not doing the shift operation on an unsigned value but on a signed int: the literal 1 is signed.
The shift then moves a bit into the sign bit of that signed int (assuming int is 32 bits wide), which is already undefined behavior.
Then you assign whatever value the compiler produced to an unsigned variable.
Then you print an unsigned as signed, which again is not defined behavior.
%u is for unsigned int.
%d is for signed int.
In your program's output:
2147483648, 2147483648 (output for unsigned int)
-2147483648, -2147483648 (output for signed int)

Getting the char integer value from a std::string & std::wstring

I am attempting to convert a string into a number by summing the int value of each letter together in C++ WinAPI. So in ASCII, the std::string "AA" would equal 130 (65+65).
The string can either be a std::string or an std::wstring.
Why does the following function always return the value of zero no matter what letter I put in it? Shouldn't it return either the ASCII or Unicode integer value of the letter?
printf("TEST a: %d \n", _tstoi(_T("a")));
printf("TEST A: %d \n", _tstoi(_T("A")));
printf("TEST b: %d \n", _tstoi(_T("b")));
My VC++ application is currently in Unicode, & the previous code prints out zero for each letter. I remember hearing that Unicode is very different to ASCII strings; can you clear up what exactly is different, other than that Unicode has a library of characters which is something like 30,000 long whilst ASCII is 256 (I think?)?
The msdn article says:
"The input string is a sequence of characters that can be interpreted
as a numerical value of the specified type. The function stops reading
the input string at the first character that it cannot recognize as
part of a number."
If you test the code with unicode strings containing actual numbers, you'll see the correct output:
printf("TEST 1: %d \n", _tstoi(_T("1")));
output:
TEST 1: 1
Like @Ylisar said, the *toi functions are used to convert string representations of numbers into integer variables instead.
The following code will output the number representation instead, but watch out for the pointer representation of the const variables. I've left both versions so you can see the difference:
printf("TEST 1: %d \n", _tstoi(_T("1")));
printf("TEST a: %d \n", _tstoi(_T("a")));
WCHAR* b(_T("b"));
printf("TEST A: %d \n", _T("A"));
printf("TEST b: %d \n", *b);
Output:
TEST 1: 1
TEST a: 0
TEST A: 13457492
TEST b: 98
Check out more at http://msdn.microsoft.com/en-us/library/yd5xkb5c%28v=vs.80%29.aspx
If you want to sum up (accumulate) the values, I would recommend checking out the STL range functions, which do wonders on such things. For example:
#include <numeric>
#include <string>
#include <cstdio>
#include <tchar.h>
int main() {
    printf("TEST a: %d \n", *_T("a")); // 97
    printf("TEST b: %d \n", *_T("b")); // 98
    std::wstring uString(_T("ba"));
    int result = std::accumulate(uString.begin(), uString.end(), 0);
    printf("TEST accumulated: %d \n", result);
    return 0;
}
Results:
TEST a: 97
TEST b: 98
TEST accumulated: 195
This way you don't have to have for-loops going through all the values. The range functions really are nice for stuff like this.
Check out more at: http://www.sgi.com/tech/stl/accumulate.html
the *toi family of functions converts a string representation to integer representation, that is, "10" becomes 10. What you actually want to do is no conversion at all. Change it to:
printf("TEST a: %d \n", _T('a'));
printf("TEST A: %d \n", _T('A'));
printf("TEST b: %d \n", _T('b'));
As for Unicode, the underlying representation depends on the encoding (for example UTF-8, which is very popular, maps its lowest 128 code points directly onto the ASCII table).
The first question, why printf does not work as intended, has already been answered by Ylisar. The other question, about summing the hexadecimal representation of a character, is a little more complex. The conversion from strings to number values with the _tstoi() function will only work if the given string represents a number, like "123" getting converted to 123. What you want is the sum of the characters' representations.
In case of Unicode code points below 0x7F (0...127) this is simply the sum of the 1-byte UTF-8 representations. However, on Windows compiled with the UNICODE flag, a 2-byte-per-character representation is used. Running the following code in the debugger will reveal this.
// ASCII 1 Byte per character
const char* letterA = "A";
int sumOfLetterA = letterA[0] + letterA[0]; // gives 130
// 2 Bytes per character (Windows)
const wchar_t* letterB = TEXT("B");
int sumOfLetterB = letterB[0] + letterB[0]; // gives 132

Converting an unsigned char* to a readable string & what's this function doing

I have googled a lot to learn how to convert my unsigned char* to a printable hex string. So far I am slowly understanding how it all works & the difference between signed & unsigned chars.
Can you tell me what this function I found does? And help me develop a function that converts an unsigned char* (which is a hashed string) to a printable string?
Does the following function do this:
- it iterates over every second character of the char array string
- on each loop it reads the char at the position string[x], converts it to an unsigned number (with a precision of 2 decimal places) then copies that converted char (number?) to the variable uChar.
- finally it stores the unsigned char uChar in hexstring
void AppManager::stringToHex( unsigned char* hexString, char* string, int stringLength )
{
    // Post:
    unsigned char uChar = 0;
    for ( int x = 0; x < stringLength; x += 2 )
    {
        sscanf_s(&string[x], "%02x", &uChar);
        hexString[x] = uChar;
    }
}
So I guess that means that it converts the character in string to unsigned (& 2dcp) to ensure that it can be correctly stored in the hexstring. Why to 2 decimal places, & won't a simple conversion from signed (if that character is signed) to unsigned result in a completely different string?
If I have an unsigned char*, how can I go about converting it to something that will let me print it out on screen?
Those aren't decimal places, they're digits. You're saying "don't give me a string shorter than 2; if it's shorter than 2 digits, then pad it with a zero."
This is so that if you have a hex sequence 0x0A it'll actually print 0A and not just A.
Also, there is no signed/unsigned conversion here. Hex strings are hex strings - they don't have a sign. They're a binary representation of the data, and depending on how they're interpreted may be read as two's complement signed integers, unsigned integers, strings, or anything else.
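To answer the last part of the question, here is a minimal sketch going the other way, from raw bytes to a printable hex string (the function name and the length parameter are my own; size the output buffer to 2 * length + 1):
#include <stdio.h>
/* Writes each byte of 'data' as two hex digits into 'out'.
** 'out' must have room for 2 * length + 1 characters. */
void bytesToHexString( char* out, const unsigned char* data, int length )
{
    for ( int x = 0; x < length; ++x )
        sprintf(&out[x * 2], "%02x", (unsigned)data[x]);   /* two digits per byte; sprintf NUL-terminates */
}
Used like this:
unsigned char hash[4] = { 0xde, 0xad, 0xf5, 0x0a };
char printable[2 * 4 + 1];
bytesToHexString(printable, hash, 4);
printf("%s\n", printable);   /* prints deadf50a */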

Using sizeof in C

I am having the following program, which is crashing. Does anybody know why it is crashing?
/* writes a, b, c into dst
** dst must have enough space for the result
** assumes all 3 numbers are positive */
void concat3(char *dst, int a, int b, int c) {
    sprintf(dst, "%08x%08x%08x", a, b, c);
}
/* usage */
int main(void) {
    printf("The size of int is %d \n", sizeof(int));
    char n3[3 * sizeof(int) + 1];
    concat3(n3, 0xDEADFACE, 0xF00BA4, 42);
    printf("result is 0x%s\n", n3);
    return 0;
}
You're confusing the size of the binary data (which is what sizeof gives you) with the size of a textual representation in hexadecimal, which is what you're trying to store.
On most current systems, sizeof(int) evaluates to 4. Your buffer n3 will therefore be 13 bytes long (3 * 4 + 1 == 13): room for 12 characters plus the terminator.
Then, you format three integers into 8-character hex format, which will require 3 * 8 + 1 == 25 characters to store. The resulting buffer overflow causes the crash.
It should be obvious that the size of the data type int doesn't matter, when you're formatting it as text (and specifying the field width yourself!).
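If you'd rather compute the required size than count digits by hand: since C99, snprintf given a NULL buffer and size 0 returns the number of characters the formatted text would need (a minimal sketch):
#include <stdio.h>
int main(void) {
    /* returns the length of the output, excluding the terminating NUL */
    int needed = snprintf(NULL, 0, "%08x%08x%08x", 0u, 0u, 0u);
    printf("need %d characters plus 1 for the terminator\n", needed);   /* 24 */
    return 0;
}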
Try 3*2*sizeof(int)+1, where 2*sizeof(int) is the number of characters needed to print an int in hex (two hex digits per byte). Of course since you're using that %08X format and expecting fixed-width results, you really should be using uint32_t. By the way, your program is also incorrectly passing 0xDEADFACE as int, which it probably doesn't fit in, and thus entering the realm of implementation-defined conversion-to-signed-type.
Here is a version with those corrections:
#include <inttypes.h>
#include <stdio.h>
/* writes a, b, c into dst
** dst must have enough space for the result
** assumes all 3 numbers are positive */
void concat3(char *dst, uint32_t a, uint32_t b, uint32_t c) {
    sprintf(dst, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32, a, b, c);
}
/* usage */
int main(void) {
    printf("The size of int is %zu \n", sizeof(int));
    char n3[25];
    concat3(n3, 0xDEADFACE, 0xF00BA4, 42);
    printf("result is 0x%s\n", n3);
    return 0;
}
I don't really see what sizeof has to do with your code. In concat3, you're attempting to print a text representation of each provided integer as an 8-char hexadecimal string: the required buffer size should thus be 8 * 3 + 1 = 25, and sizeof(int) has nothing to do with it.
You seem to be mixing up the size occupied in memory by an int and the length of its textual representation (which in your case is easily determined, as it's fixed by your sprintf format string).
On a side note : sprintf is a truly unsafe function that you should consider deprecated.
It is crashing because sizeof(int) is (most likely on your system) 4, meaning that n3 is 13 bytes long. You then try to write 8 + 8 + 8 = 24 characters to it.
Use snprintf instead of sprintf. Think of the kittens!
But seriously, you should not be creating interfaces with buffer pointers but no length information. concat3 should have a max-length parameter. Then use snprintf inside. The length to give to concat3 is sizeof(n3).
It still won't work, but it won't crash either. The other answers explain how to get the functionality right.
(Oh, and don't use gets() either. Just because it is in the standard library doesn't mean it is good code.)
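A minimal sketch of that suggestion (the size parameter and its name are my own; the int parameters are kept from the original question, so the signedness caveats above still apply):
#include <stdio.h>
/* writes a, b, c into dst, truncating instead of overflowing;
** dstSize is the total size of the dst buffer */
void concat3(char *dst, size_t dstSize, int a, int b, int c) {
    snprintf(dst, dstSize, "%08x%08x%08x", a, b, c);
}
Called as concat3(n3, sizeof n3, 0xDEADFACE, 0xF00BA4, 42), a 13-byte n3 gives a truncated but NUL-terminated result, as described above.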