Bitwise operations on elements from array of chars - c++

I have made an array of hexadecimal numbers that I would like to combine bitwise. In my program I want to combine 0xFF with 0x7F00. Here is my approach:
#include <iostream>
using namespace std;
int main() {
char data[2] = {0xFF, 0x7F};
cout << (data[0] | (data[1] << 8)) << endl;
system("pause");
return 0;
}
I expect the result to be 0x7FFF which is 32767 in decimal, but I get -1 (0xFF in hex).

The problem you're having stems from two facts:
The bitwise operators require integral promotion of both operands.
char can be either signed or unsigned.
Promotion will convert values of smaller types (like char or short) to int, and as part of that signed values will be sign-extended. If char is signed, then the value 0xff will be converted to the (32-bit) int value 0xffffffff, which is -1.
It doesn't matter what value you use in the bitwise OR, the result will still be 0xffffffff.
The simple solution is to explicitly use unsigned char (or even better uint8_t) as the type for the array elements:
uint8_t data[2] = {0xFF, 0x7F};
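As a minimal sketch of that fix, the helper below (combine_bytes is a made-up name, not something from the question) takes both bytes as uint8_t, so the promoted values are never negative:

```cpp
#include <cstdint>

// Hypothetical helper: builds a 16-bit value from a low and a high byte.
// Because the parameters are uint8_t, integral promotion yields
// non-negative ints, so no sign extension can occur.
std::uint16_t combine_bytes(std::uint8_t low, std::uint8_t high) {
    return static_cast<std::uint16_t>(low | (high << 8));
}
```

With the array from the question, combine_bytes(data[0], data[1]) gives 0x7FFF, and since the result is a uint16_t, cout prints it as the number 32767.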

Related

C++ Encrypt Char with a Key and Mask

I need to write a code which encrypt a char with a special key.
I must use own key and mask with XOR operation.
But I have a problem with understanding and implementation of this.
Firstly, I created my own key which I use to encrypt a char. This key is, e.g., the number 123456789. In binary representation the number 123456789 is: 00010101 11001101 01011011 00000111. I divided this into 4 x 8 bits because I am working with the char type.
I must also use a mask to encrypt my char. My mask is 0xFF because it clears the most significant bits and leaves the least significant 8 bits for the XOR operation.
It means that when I enter the char "a", it should be encrypted with this key and mask using the XOR operation.
What's more, I wanted to check what my compiler shows for key[0]. I have "int key[] = {00000111}". I think it should show the number 7, but the compiler shows the number 73. Why?
I would appreciate if somebody can help me to resolve this problem.
Here is my code:
#include <iostream>
using namespace std;
void encryption(char chars[], const int size);
int main() {
const int size1 = 4;
char chars1[size1];
unsigned int keys = 123456789;
int key[] = {00000111}; // why does it show number 73 instead of 7 ?
cout << "Enter a char to encrypt: " << endl;
cin >> chars1[0];
return 0; }
void encryption(char chars[], const int size) {
unsigned int keys = 123456789;
unsigned int key[] = {00010101, 11001101, 01011011, 00000111};
unsigned int mask = 0xFF;
int temp[4] = {0};
temp[0] = chars[0] ^ (keys & mask);
temp[1] = chars[0] ^ ((keys >> 8) & mask);
temp[2] = chars[0] ^ ((keys >> 16) & mask);
temp[3] = chars[0] ^ ((keys >> 24) & mask);
}
There are several problems:
unsigned int keys = 123456789 is in hex 075bcd15
00010101, 11001101, 01011011, 00000111 is in hex 15cd5b07
While both are 4 bytes, notice the reversed order of the bytes. That is because the computer uses what is known as little-endian byte order for integers. So if you read the bytes through an unsigned char pointer, you will get them in the reverse of the order you expect.
unsigned int key[] is an array of int, each of which seems to be 32 bits (4 bytes).
For a single unsigned int use
unsigned int keyBytes3 = {0x075bcd15};
For an array of 4 unsigned char:
unsigned char keyBytes[] = {0x07, 0x5b, 0xcd, 0x15};
Personally, for instances where size matters I prefer using the uint32_t and uint8_t types, because then the sizes are clear; int may be larger or smaller than 32 bits depending on the CPU.
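One way to sidestep the byte-order question entirely is to take the key bytes out by shifting, as the question's own code already does. A rough sketch, where key_byte and encrypt_char are illustrative names that are not in the original code:

```cpp
#include <cstdint>

// Hypothetical helper: returns byte `index` of `key`, where index 0 is the
// least significant byte. Shifting works on the value, not on memory,
// so the result does not depend on the machine's byte order.
std::uint8_t key_byte(std::uint32_t key, unsigned index) {
    return static_cast<std::uint8_t>((key >> (8 * index)) & 0xFFu);
}

// Hypothetical helper: XORs one character with one byte of the key.
// Going through unsigned char avoids sign extension of the character.
std::uint8_t encrypt_char(char c, std::uint32_t key, unsigned index) {
    return static_cast<std::uint8_t>(static_cast<unsigned char>(c) ^ key_byte(key, index));
}
```

For the key 123456789 (0x075bcd15), key_byte returns 0x15, 0xcd, 0x5b and 0x07 for indexes 0 to 3, and XOR-ing the encrypted byte with the same key byte again restores the original character.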
The problem is that the compiler does not interpret your 00000111 as binary. C++11 does not support binary literals (they were added in C++14, with the 0b prefix). In addition, any integer literal that starts with 0 (such as 00000111) is considered to be octal. Octal 111 is the equivalent of 73 in decimal.
You'll need to convert all of your "binary values" to decimal or hexadecimal, then use those. Or you could use boost, which has a utility that handles what you're looking to do.
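To make the difference between the literal forms concrete, here is a small sketch. It assumes a compiler in C++14 mode or later for the 0b prefix, and key_bytes is just an illustrative name:

```cpp
// A leading 0 makes an integer literal octal; the 0b prefix (C++14) makes it binary.
static_assert(00000111 == 73, "octal 111 is 64 + 8 + 1");
static_assert(0b00000111 == 7, "binary 111 is 4 + 2 + 1");
static_assert(0x07 == 7, "hexadecimal literals work in every C++ version");

// The four key bytes from the question, written as binary literals.
constexpr unsigned char key_bytes[] = {0b00010101, 0b11001101, 0b01011011, 0b00000111};
```

The static_assert lines are checked at compile time, so the file only compiles if the stated values are right.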

How to use char data type as a number rather than a character?

When I use the char datatype to add two numbers, I get the sum of the ASCII code of the characters and not the numbers itself. When I researched on the internet, various sites say that the char type can indeed be used to handle one byte numbers. But in reality, I get the sum of ASCII values. Why is this happening? Below is just a sample code which illustrates the problem:
uint8_t rows,cols; //uint8_t is just a typedef for char
cin >> rows;
cout << rows + 1 << endl;
When people talk about "one-byte numbers", they're talking about 8-bit values, ranging from -128 to 127 for a signed char, or 0 to 255 for an unsigned char, also known as octets. These can be converted directly to larger integer types and to floats:
char eight_bit = 122;
float floating_point = eight_bit; // = 122.0
If you're trying to convert a digit value such as '1' into the numeric value it represents, there's stoi:
#include <string>
int ctoi(char c) {
std::string temp;
temp.push_back(c);
return std::stoi(temp);
}
Chars store the ASCII equivalent of a character as an integer.
For example
char value = 'A'; // == int 65
It's best to use a short integer to store numbers, but if you really want to, you can do something like this:
char value1 = '2';
char value2 = '5';
char sum = (value1 - '0') + (value2 - '0'); // sum holds the integer value 7
When you use char, you are (mostly) using a signed 8-bit data type.
And you get "sum of ASCII" only because std::cout is programmed to display char as ASCII character.
Try
cout << static_cast<int16_t>(rows) + 1 << endl;
And you will see that you get the 'number' rather than an 'ASCII character'.
NOTE
uint8_t is not (or at least should not be) plain char: whether char is signed or unsigned is implementation-defined, while uint8_t is unsigned and is usually a typedef for unsigned char.
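Putting the answers above together, here is a rough sketch of reading and printing a one-byte number. The function names are made up for illustration, and it assumes that uint8_t is a character type, as it is on mainstream platforms:

```cpp
#include <cstdint>
#include <istream>
#include <sstream>

// Hypothetical helper: reads a small number from a stream.
// Extracting into an int parses digits; extracting directly into a
// uint8_t would instead store the character code of the first character.
std::uint8_t read_small_number(std::istream& in) {
    int value = 0;
    in >> value;
    return static_cast<std::uint8_t>(value);
}

// Converts to int so that operator<< prints a number, not a character.
int as_number(std::uint8_t byte) {
    return static_cast<int>(byte);
}
```

In the question's code that would be rows = read_small_number(cin); followed by cout << as_number(rows) + 1 << endl;.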

Convert four bytes to Integer using C++

I am trying to convert 4 bytes to an integer using C++.
This is my code:
int buffToInteger(char * buffer)
{
int a = (int)(buffer[0] << 24 | buffer[1] << 16 | buffer[2] << 8 | buffer[3]);
return a;
}
The code above works in almost all cases, for example:
When my buffer is: "[\x00, \x00, \x40, \x00]" the code will return 16384 as expected.
But when the buffer is filled with: "[\x00, \x00, \x3e, \xe3]", the code won't work as expected and will return "ffffffe1".
Does anyone know why this happens?
Your buffer contains signed characters. So, actually, buffer[3] == -29, which upon conversion to int gets sign-extended to 0xffffffe3, and in turn (0x3e << 8) | 0xffffffe3 == 0xffffffe3.
You need to ensure your individual buffer bytes are interpreted as unsigned, either by declaring buffer as unsigned char *, or by explicitly casting:
int a = int((unsigned char)(buffer[0]) << 24 |
(unsigned char)(buffer[1]) << 16 |
(unsigned char)(buffer[2]) << 8 |
(unsigned char)(buffer[3]));
In the expression buffer[0] << 24 the value 24 is an int, so buffer[0] will also be converted to an int before the shift is performed.
On your system a char is apparently signed, and will then be sign extended when converted to int.
There's an implicit promotion to a signed int in your shifts.
That's because char is (apparently) signed on your platform (the common thing) and << promotes to integers implicitly. In fact none of this would work otherwise because << 8 (and higher) would scrub all your bits!
If you're stuck with using a buffer of signed chars this will give you what you want:
#include <iostream>
#include <iomanip>
int buffToInteger(char * buffer)
{
int a = static_cast<int>(static_cast<unsigned char>(buffer[0]) << 24 |
static_cast<unsigned char>(buffer[1]) << 16 |
static_cast<unsigned char>(buffer[2]) << 8 |
static_cast<unsigned char>(buffer[3]));
return a;
}
int main(void) {
char buff[4]={0x0,0x0,0x3e,static_cast<char>(0xe3)};
int a=buffToInteger(buff);
std::cout<<std::hex<<a<<std::endl;
// your code goes here
return 0;
}
Be careful about bit shifting on signed values. Promotions don't just add bytes but may convert values.
For example a gotcha here is that you can't use static_cast<unsigned int>(buffer[1]) (etc.) directly because that converts the signed char value to a signed int and then reinterprets that value as an unsigned.
If anyone asks me, all implicit numeric conversions are bad. No program should have so many that making them explicit would become a chore. It's a softness in C++, inherited from C, that causes all sorts of problems that far exceed its value.
It's even worse in C++ because they make the already confusing overloading rules even more confusing.
I think this could also be done with memcpy:
#include <cstring>
int buffToInteger(char* buffer)
{
int a;
memcpy( &a, buffer, sizeof( int ) );
return a;
}
This can be faster than the example in the original post, because it copies all bytes "as is" and there is no need for any operations such as bit shifts.
It also doesn't cause any signed-unsigned issues. Note, however, that the result depends on the machine's byte order: on a little-endian machine the bytes end up in the opposite order from the shift-based code.
char buffer[4];
int a;
a = *(int*)&buffer;
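This takes the address of the buffer, casts it to an int pointer and then dereferences it. Be aware that reading a char array through an int pointer breaks the strict aliasing rules and may be misaligned, so memcpy is the safer way to get the same effect.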
int buffToInteger(char * buffer)
{
return *reinterpret_cast<int*>(buffer);
}
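This conversion is simple and fast. We only tell the compiler to treat a byte array in memory as a single integer. It does assume that the buffer is suitably aligned for an int, and the result depends on the machine's byte order.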
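For comparison, here is a sketch of the two well-defined variants side by side. The names read_be32 and read_native32 are illustrative only, and a fixed-width uint32_t is used instead of int so the size is not in doubt:

```cpp
#include <cstdint>
#include <cstring>

// Interprets four bytes as a big-endian 32-bit value (buffer[0] is the most
// significant byte), matching the shift-based code in the question.
// Each byte goes through unsigned char first, so nothing is sign-extended.
std::uint32_t read_be32(const char* buffer) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(buffer[0])) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(buffer[1])) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(buffer[2])) << 8) |
           static_cast<std::uint32_t>(static_cast<unsigned char>(buffer[3]));
}

// Copies four bytes as they sit in memory. The result depends on the
// machine's byte order, but the copy itself is well defined.
std::uint32_t read_native32(const char* buffer) {
    std::uint32_t value;
    std::memcpy(&value, buffer, sizeof value);
    return value;
}
```

For the buffer [\x00, \x00, \x3e, \xe3], read_be32 returns 0x3ee3 on any machine, while read_native32 returns whatever the host's byte order makes of those four bytes.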

C/C++ Converting a 64 bit integer to char array

I have the following simple program that uses a union to convert between a 64 bit integer and its corresponding byte array:
union u
{
uint64_t ui;
char c[sizeof(uint64_t)];
};
int main(int argc, char *argv[])
{
u test;
test.ui = 0x0123456789abcdefLL;
for(unsigned int idx = 0; idx < sizeof(uint64_t); idx++)
{
cout << "test.c[" << idx << "] = 0x" << hex << +test.c[idx] << endl;
}
return 0;
}
What I would expect as output is:
test.c[0] = 0xef
test.c[1] = 0xcd
test.c[2] = 0xab
test.c[3] = 0x89
test.c[4] = 0x67
test.c[5] = 0x45
test.c[6] = 0x23
test.c[7] = 0x1
But what I actually get is:
test.c[0] = 0xffffffef
test.c[1] = 0xffffffcd
test.c[2] = 0xffffffab
test.c[3] = 0xffffff89
test.c[4] = 0x67
test.c[5] = 0x45
test.c[6] = 0x23
test.c[7] = 0x1
I'm seeing this on Ubuntu LTS 14.04 with GCC.
I've been trying to get my head around this for some time now. Why are the first 4 elements of the char array displayed as 32 bit integers, with 0xffffff prepended to them? And why only the first 4, why not all of them?
Interestingly enough, when I use the array to write to a stream (which was the original purpose of the whole thing), the correct values are written. But comparing the array char by char obviously leads to problems, since the first 4 chars are not equal 0xef, 0xcd, and so on.
Using char is not the right thing to do since it could be signed or unsigned. Use unsigned char.
union u
{
uint64_t ui;
unsigned char c[sizeof(uint64_t)];
};
char gets promoted to an int because of the prepended unary + operator. Since your chars are signed, any element with the highest bit set to 1 is interpreted as a negative number and promoted to an int with the same negative value. There are a few different ways to solve this:
Drop the +: ... << test.c[idx] << .... This prints the char as a character rather than a number, so it is probably not a good solution.
Declare c as unsigned char. The value is then promoted to a non-negative int, so no sign extension occurs.
Explicitly cast to unsigned char before applying the +: ... << +(unsigned char)test.c[idx] << ...
Set the upper bytes of the integer to zero using binary &: ... << (test.c[idx] & 0xFF) << .... The parentheses are needed because << binds more tightly than &. This will only display the lowest-order byte no matter how the char is promoted.
Use either unsigned char or use test.c[idx] & 0xff to avoid sign extension when a char value > 0x7f is converted to int.
It is a matter of unsigned char vs signed char and how each is converted to an integer.
The unary plus causes the char to be promoted to an int (integral promotion). Because you have signed chars, the value is treated as signed and the other bytes reflect that.
It is not true that only the first four are ints; they all are. You just don't see it from the representation, since the leading zeroes are not shown.
Either use unsigned chars or & 0xff for promotion to get the desired result.
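A small sketch of the unsigned char approach, wrapped in a helper whose name (byte_to_hex) is made up for this example:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats one byte as hex digits.
// Converting to unsigned char first keeps the converted value in 0..255,
// so no sign extension shows up in the output.
std::string byte_to_hex(char c) {
    std::ostringstream out;
    out << std::hex << static_cast<unsigned>(static_cast<unsigned char>(c));
    return out.str();
}
```

In the loop from the question, cout << "0x" << byte_to_hex(test.c[idx]) would then print ef, cd, ab and 89 for the first four bytes instead of the sign-extended values.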

Unexpected result in byte representation of a variable

I scan through the byte representation of an int variable and get somewhat unexpected result.
If I do
int a = 127;
cout << (unsigned int) *((char *)&a);
I get 127 as expected. If I do
int a = 256;
cout << (unsigned int) *((char *)&a + 1);
I get 1 as expected. But if I do
int a = 128;
cout << (unsigned int) *((char *)&a);
I have 4294967168 which is, well… quite fancy.
The question is: is there a way to get 128 when looking at first byte of an int variable which value is 128?
For the same reason that (unsigned int)(char)128 is 4294967168: char is signed by default on most commonly used systems. 128 cannot fit in a signed 8-bit quantity, so when you cast it to char, you get -128 (0x80 in hex).
Then, when you cast -128 to an unsigned int, you get 2^32 - 128, which is 4294967168.
If you want to get +128, then use an unsigned char instead of char.
char is signed here. In your second example, *((char *)&a + 1) reads the second byte of a (on a little-endian machine). Since 256 is 0x00000100, that byte is 0x01, which is promoted to the int 1 and stays 1 as an unsigned int.
In your third example, *((char *)&a) reads the first byte, 0x80, which as a signed char is -128. Converting it to unsigned int sign-extends it to 0b11111111111111111111111110000000 (0xFFFFFF80), i.e. 2^32 - 128, which is 4294967168.
As the comments have pointed out, it looks like what's happening here is that you are running into an oddity of two's complement. In your last cast, since you are not using an unsigned char, the highest-order bit of the byte is being used to indicate positive or negative values. You then only have 7 bits out of the full 8 to represent your value, giving you a range of 0 to 127 for positive numbers (-128 to 127 overall).
If you exceed this range, then it wraps, and you get -128, which when cast back to an unsigned int will result in that abnormally large value.
int a = 128;
cout << (unsigned int) *((unsigned char *)&a);
Also, all of your code depends on running on a little-endian machine.
Here's how you should probably be doing these things:
int a = 127;
cout << (unsigned)(unsigned char)(0xFF & a);
int a = 256;
cout << (unsigned)(unsigned char)(0xFF & (a>>8));
int a = 128;
cout << (unsigned)(unsigned char)(0xFF & a);
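The three snippets above follow one pattern, which could be folded into a single helper. This is only a sketch, and byte_at is a hypothetical name:

```cpp
// Hypothetical helper: returns byte `index` of `value`, counting from the
// least significant byte. It works on the value rather than on memory, so
// it gives the same answer on little-endian and big-endian machines.
unsigned byte_at(unsigned value, unsigned index) {
    return (value >> (8 * index)) & 0xFFu;
}
```

With it, byte_at(127, 0) is 127, byte_at(256, 1) is 1 and byte_at(128, 0) is 128, with no char involved and therefore no sign extension.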