Converting 24 bit integer (2s complement) to 32 bit integer in C++

The dataFile.bin is a binary file with 6-byte records. The first 3
bytes of each record contain the latitude and the last 3 bytes contain
the longitude. Each 24 bit value represents radians multiplied by
0X1FFFFF
This is a task I've been working on. I haven't done C++ in years, so it's taking me way longer than I thought it would -_-. After googling around I saw this algorithm, which made sense to me.
int interpret24bitAsInt32(byte[] byteArray) {
int newInt = (
((0xFF & byteArray[0]) << 16) |
((0xFF & byteArray[1]) << 8) |
(0xFF & byteArray[2])
);
if ((newInt & 0x00800000) > 0) {
newInt |= 0xFF000000;
} else {
newInt &= 0x00FFFFFF;
}
return newInt;
}
The problem is a syntax issue: I am restricted to working within the way the other guy programmed this. I am not understanding how I can store the CHAR "data" into an INT. Wouldn't it make more sense if "data" was an Array? Since its receiving 24 integers of information stored into a BYTE.
double BinaryFile::from24bitToDouble(char *data) {
int32_t iValue;
// ****************************
// Start code implementation
// Task: Fill iValue with the 24bit integer located at data.
// The first byte is the LSB.
// ****************************
//iValue +=
// ****************************
// End code implementation
// ****************************
return static_cast<double>(iValue) / FACTOR;
}
bool BinaryFile::readNext(DataRecord &record)
{
const size_t RECORD_SIZE = 6;
char buffer[RECORD_SIZE];
m_ifs.read(buffer,RECORD_SIZE);
if (m_ifs) {
record.latitude = toDegrees(from24bitToDouble(&buffer[0]));
record.longitude = toDegrees(from24bitToDouble(&buffer[3]));
return true;
}
return false;
}
double BinaryFile::toDegrees(double radians) const
{
static const double PI = 3.1415926535897932384626433832795;
return radians * 180.0 / PI;
}
I appreciate any help or hints; even if you don't fully understand the task, a clue or hint will help me a lot. I just need to talk to someone.

I am not understanding how I can store the CHAR "data" into an INT.
Since char is a numeric type, there is no problem combining them into a single int.
Since its receiving 24 integers of information stored into a BYTE
It's 24 bits (three bytes), not 24 integers, so there are only three byte values that need to be combined.
An easier way of producing the same result without using conditionals is as follows:
int interpret24bitAsInt32(byte[] byteArray) {
return (
(byteArray[0] << 24)
| (byteArray[1] << 16)
| (byteArray[2] << 8)
) >> 8;
}
The idea is to store the three bytes supplied as an input into the upper three bytes of the four-byte int, and then shift it down by one byte. This way the program would sign-extend your number automatically, avoiding conditional execution.
Note on portability: This code is not portable, because it assumes 32-bit integer size. To make it portable use <cstdint> types:
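int32_t interpret24bitAsInt32(const std::array<uint8_t,3>& byteArray) {
// Assemble in unsigned arithmetic so the left shifts cannot overflow a signed int.
const uint32_t raw =
(static_cast<uint32_t>(byteArray[0]) << 24)
| (static_cast<uint32_t>(byteArray[1]) << 16)
| (static_cast<uint32_t>(byteArray[2]) << 8);
// Right-shifting the signed value sign-extends it (implementation-defined before C++20).
return static_cast<int32_t>(raw) >> 8;
}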
It also assumes that the most significant byte of the 24-bit number is stored in the initial element of byteArray, then comes the middle element, and finally the least significant byte.
Note on sign extension: This code automatically takes care of sign extension by constructing the value in the upper three bytes and then shifting it to the right, as opposed to constructing the value in the lower three bytes right away. This additional shift operation ensures that C++ takes care of sign-extending the result for us.
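The task comment in the question says the first byte is the LSB, which is the opposite of the byte order assumed above. A minimal sketch for that layout follows; the name read24bitLE is hypothetical, and it assumes the three bytes arrive as plain char, as in from24bitToDouble. It extends the sign by subtracting 2^24 rather than by shifting a negative value.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original code base): decodes a 24-bit
// two's complement value whose first byte is the least significant one.
int32_t read24bitLE(const char* data) {
    // Go through unsigned char first so a negative char is not sign-extended.
    const uint32_t raw =
          static_cast<uint32_t>(static_cast<unsigned char>(data[0]))
        | (static_cast<uint32_t>(static_cast<unsigned char>(data[1])) << 8)
        | (static_cast<uint32_t>(static_cast<unsigned char>(data[2])) << 16);
    // raw is below 2^24, so it always fits in an int32_t.
    const int32_t value = static_cast<int32_t>(raw);
    // If bit 23 (the 24-bit sign bit) is set, the value is negative: subtract 2^24.
    return (raw & 0x800000u) ? value - 0x1000000 : value;
}
```

Inside from24bitToDouble this would amount to something like iValue = read24bitLE(data); before the division by FACTOR.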

When an unsigned char is cast to an int, the higher-order bits are filled with 0s.
When a signed char is cast to an int, the sign bit is extended.
i.e.:
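int x;
signed char y; // plain char may be signed or unsigned, depending on the platform
unsigned char z;
y = 0xFF; // stored as -1 on two's complement machines
z = 0xFF;
x = y;
/* x will be 0xFFFFFFFF */
x = z;
/* x will be 0x000000FF */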
So your algorithm uses 0xFF as a mask to remove C's sign extension, i.e.
0xFF == 0x000000FF
0xABCDEF10 & 0x000000FF == 0x00000010
It then uses bit shifts and bitwise ORs to put the bits in their proper places.
Lastly, it checks the most significant bit of the 24-bit value, (newInt & 0x00800000) > 0, to decide whether to fill the highest byte with 0s or 1s.

int32_t upperByte = ((int32_t) dataRx[0] << 24);
int32_t middleByte = ((int32_t) dataRx[1] << 16);
int32_t lowerByte = ((int32_t) dataRx[2] << 8);
int32_t ADCdata32 = (((int32_t) (upperByte | middleByte | lowerByte)) >> 8); // Right-shift of signed data maintains signed bit

Related

How to safely extract a signed field from a uint32_t into a signed number (int or uint32_t)

I have a project in which I am getting a vector of 32-bit ARM instructions, and a part of the instructions (offset values) needs to be read as signed (two's complement) numbers instead of unsigned numbers.
I used a uint32_t vector because all the opcodes and registers are read as unsigned and the whole instruction was 32-bits.
For example:
I have this 32-bit ARM instruction encoding:
uint32_t addr = 0b00110001010111111111111111110110;
The last 19 bits are the offset of the branch that I need to read as signed integer branch displacement.
This part: 1111111111111110110
I have this function in which the parameter is the whole 32-bit instruction:
I am shifting left 13 places and then right 13 places again to keep only the offset value and remove the other part of the instruction.
I have tried this function casting to different signed variables, using different ways of casting and using other C++ functions, but it prints the number as if it were unsigned.
int getCat1BrOff(uint32_t inst)
{
uint32_t temp = inst << 13;
uint32_t brOff = temp >> 13;
return (int)brOff;
}
I get decimal number 524278 instead of -10.
The last option, which I don't think is the best one but may work, is to put all the binary digits in a string, invert the bits and add 1, and then convert the new binary number back into decimal, as I would do it on paper. But it is not a good solution.
It boils down to doing a sign extension where the sign bit is the 19th one.
There are two ways:
1. Use arithmetic shifts.
2. Detect the sign bit and OR with ones at the high bits.
There is no portable way to do 1. in C++, but it can be checked at compile time. Please correct me if the code below is UB, but I believe it is only implementation-defined, which is what we check at compile time.
The only questionable thing is conversion of unsigned to signed which overflows, and the right shift, but that should be implementation defined.
int getCat1BrOff(uint32_t inst)
{
if constexpr (int32_t(0xFFFFFFFFu) >> 1 == int32_t(0xFFFFFFFFu))
{
return int32_t(inst << uint32_t{13}) >> int32_t{13};
}
else
{
int32_t offset = inst & 0x0007FFFF;
if (offset & 0x00040000)
{
offset |= 0xFFF80000;
}
return offset;
}
}
or a more generic solution
template <uint32_t N>
int32_t signExtend(uint32_t value)
{
static_assert(N > 0 && N <= 32);
constexpr uint32_t unusedBits = (uint32_t(32) - N);
if constexpr (int32_t(0xFFFFFFFFu) >> 1 == int32_t(0xFFFFFFFFu))
{
return int32_t(value << unusedBits) >> int32_t(unusedBits);
}
else
{
constexpr uint32_t mask = uint32_t(0xFFFFFFFFu) >> unusedBits;
value &= mask;
if (value & (uint32_t(1) << (N-1)))
{
value |= ~mask;
}
return int32_t(value);
}
}
https://godbolt.org/z/rb-rRB
In practice, you just need to declare temp as signed:
int getCat1BrOff(uint32_t inst)
{
int32_t temp = inst << 13;
return temp >> 13;
}
Unfortunately this is not portable:
For negative a, the value of a >> b is implementation-defined (in most
implementations, this performs arithmetic right shift, so that the
result remains negative).
But I have yet to meet a compiler that doesn't do the obvious thing here.
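For comparison, a shift-free variant can be sketched with the XOR-and-subtract trick: flip the field's sign bit, then subtract its weight. This is a different technique from the answers above, the name signExtendPortable is hypothetical, and it assumes 1 <= bits <= 32. The subtraction is done in int64_t so no step depends on implementation-defined behaviour.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extends the low `bits` bits of `value`.
// Precondition (not checked): 1 <= bits <= 32.
constexpr int32_t signExtendPortable(uint32_t value, unsigned bits) {
    // Keep only the field itself.
    const uint32_t mask = (bits >= 32) ? 0xFFFFFFFFu : ((uint32_t{1} << bits) - 1u);
    const uint32_t signBit = uint32_t{1} << (bits - 1);
    const uint32_t field = value & mask;
    // XOR flips the sign bit; subtracting its weight then yields the two's
    // complement value. The result always fits in an int32_t.
    return static_cast<int32_t>(static_cast<int64_t>(field ^ signBit)
                                - static_cast<int64_t>(signBit));
}
```

With the instruction from the question, signExtendPortable(inst, 19) yields -10 for the 19-bit field 1111111111111110110.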

How to grab specific bits from a 256 bit message?

I'm using winsock to receive udp messages 256 bits long. I use 8 32-bit integers to hold the data.
int32_t dataReceived[8];
recvfrom(client, (char *)&dataReceived, 8 * sizeof(int), 0, &fromAddr, &fromLen);
I need to grab specific bits like, bit #100, #225, #55, etc. So some bits will be in dataReceived[3], some in dataReceived[4], etc.
I was thinking I need to bitshift each array, but things got complicated. Am I approaching this all wrong?
Why are you using int32_t type for buffer elements and not uint32_t?
I usually use something like this:
int bit_needed = 100;
uint32_t the_bit = dataReceived[bit_needed>>5] & (1U << (bit_needed & 0x1F));
Or you can use this one (but it won't work for sign in signed integers):
int bit_needed = 100;
uint32_t the_bit = (dataReceived[bit_needed>>5] >> (bit_needed & 0x1F)) & 1U;
The other answers only let you access the lowest 8 bits of each int32_t.
When you count bits and bytes from 0:
int bit_needed = 100;
So:
int byte = int(bit_needed / 8);
int bit = bit_needed % 8;
int the_bit = dataReceived[byte] & (1 << bit);
If the required bit contains 0, then the_bit will be zero. If it's 1, then the_bit will hold 2 to the power of that bit's ordinal place within the byte.
You can make a small function to do the job.
uint8_t checkbit(const uint32_t *dataReceived, int bitToCheck)
{
int word = bitToCheck / 32; // which 32-bit element holds the bit
int bit = bitToCheck % 32; // position of the bit inside that element
if( dataReceived[word] & (1U << bit))
return 1;
else
return 0;
}
Note that you should use uint32_t rather than int32_t, if you are using bit shifting. Signed integer bit shifts lead to unwanted results, especially if the MSbit is 1.
You can use a macro in C or C++ to check for specific bit:
#define bit_is_set(var,bit) ((var) & (1 << (bit)))
and then a simple if:
if(bit_is_set(message,29)){
//bit is set
}

Extracting continuous bits from a std::string bytewise with a bit offset

I'm kind of at a loss. I want to extract up to 64 bits with a defined bit offset and bit length (unsigned long long) from a string (coming from the network).
The string can be of undefined length, so I need to be sure to only access it bytewise. (This also means I can't use the _bextr_u32 intrinsic.) I can't use the std bitset class because it doesn't allow extraction of more than one bit with an offset and also only allows extraction of a predefined number of bits.
So I already calculate the byteoffset (within the string) and bitoffset (within the starting byte).
m_nByteOffset = nBitOffset / 8;
m_nBitOffset = nBitOffset % 8;
Now i can get the starting address
const char* sSource = str.c_str()+m_nByteOffset;
And the bitmask
unsigned long long nMask = 0xFFFFFFFFFFFFFFFFULL >> (64-nBitLen);
But now I just cant figure out how to extract up to 64 bits from this as there are no 128 bit integers available.
unsigned long long nResult = ((*(unsigned long long*)sSource) >> m_nBitOffset) & nMask;
This only works for up to 64 minus the bit offset bits; how can I extend it to really work for 64 bits independently of the bit offset? Also, as this is not a bytewise access, it could cause a memory read access violation.
So I'm really looking for a bytewise solution to this problem that works for up to 64 bits (preferably C or intrinsics).
Update: After searching and testing a lot I will probably use this function from RakNet:
https://github.com/OculusVR/RakNet/blob/master/Source/BitStream.cpp#L551
To do it byte-wise, just read the string (which, by the way, is better interpreted as a sequence of uint8_t rather than char) one byte at a time, updating your result by shifting it left by 8 and ORing it with the current byte. The only complications are the first and last bytes, which may both be only partially used. For the first byte, simply use a bit mask to keep the bits you need, and for the last byte, shift it down by the amount needed. Here is the code:
const uint8_t* sSource = reinterpret_cast<const uint8_t*>(str.c_str()+m_nByteOffset);
uint64_t result = 0;
uint8_t FULL_MASK = 0xFF;
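if(m_nBitOffset) {
const auto avail = 8 - m_nBitOffset; // bits left in the first byte
result = (*sSource & (FULL_MASK >> m_nBitOffset));
if(nBitLen <= avail) return result >> (avail - nBitLen); // field ends inside the first byte
nBitLen -= avail;
sSource++;
}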
while(nBitLen > 8) {
result <<= 8;
result |= *sSource;
nBitLen -= 8;
++sSource;
}
if(nBitLen) {
result <<= nBitLen;
result |= (*sSource >> (8 - nBitLen));
}
return result;
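A slower but simpler way to cross-check the byte-wise loop is to pull the field out one bit at a time. The sketch below is only an illustration: extractBits is a hypothetical name, and it assumes the same convention as the snippet above, namely that bit 0 is the most significant bit of the first byte. It only ever reads single bytes that lie inside the string.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads `bitLen` bits (1..64) starting at absolute bit
// position `bitOffset`, where bit 0 is the most significant bit of str[0].
uint64_t extractBits(const std::string& str, std::size_t bitOffset, unsigned bitLen) {
    if (bitLen == 0 || bitLen > 64 || bitOffset + bitLen > str.size() * 8) {
        throw std::out_of_range("bit field lies outside the string");
    }
    uint64_t result = 0;
    for (unsigned i = 0; i < bitLen; ++i) {
        const std::size_t pos = bitOffset + i;
        // Read exactly one byte, as unsigned, then isolate the wanted bit.
        const auto byte = static_cast<unsigned char>(str[pos / 8]);
        const unsigned bit = (byte >> (7 - pos % 8)) & 1u;
        result = (result << 1) | bit;
    }
    return result;
}
```

For the two bytes 0xAB 0xCD, extractBits(s, 4, 8) gives 0xBC, the eight bits that straddle the byte boundary.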
This is how I would do it in modern C++ style.
The bit length is determined by the size of the buffer extractedBits: instead of using an unsigned long long, you could also use any other data type (or even array type) with the desired size.
unsigned long long extractedBits;
char* extractedString = reinterpret_cast<char*>(&extractedBits);
std::transform(str.begin() + m_nByteOffset,
str.begin() + m_nByteOffset + sizeof(extractedBits),
str.begin() + m_nByteOffset + 1,
extractedString,
[=](char c, char d)
{
char bitsFromC = (c << m_nBitOffset);
char bitsFromD =
(static_cast<unsigned char>(d) >> (CHAR_BIT - m_nBitOffset));
return bitsFromC | bitsFromD;
});

XTEA-function with std::vector

I'm trying to encrypt a std::vector with XTEA. Because using std::vector brings various benefits when dealing with large amounts of data, I want to use it.
The XTEA algorithm uses two unsigned longs (v0 and v1), which together take 64 bits of data, to encrypt them.
xtea_enc(unsigned char buf[], int length, unsigned char key[], unsigned char** outbuf)
/* Source http://pastebin.com/uEvZqmUj */
unsigned long v0 = *((unsigned long*)(buf+n));
unsigned long v1 = *((unsigned long*)(buf+n+4));
My problem is, that I'm looking for the best way to convert my char vector into a unsigned long pointer.
Or is there another way to split vector in 64-bit parts for the encryption function?
The insight comes in realizing that each char is a byte; thus a 64 bit number consists of 8 bytes or two 32 bit numbers.
Thus one 32 bit number can store 4 bytes, so you would for each 8 byte block in your char vector, store a pair of 4 byte numbers in a pair of 32 bit numbers. You would then pass this pair in to your xtea function, something like:
uint32_t datablock[2];
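datablock[0] = (uint32_t(uint8_t(buf[0])) << 24) | (uint32_t(uint8_t(buf[1])) << 16) | (uint32_t(uint8_t(buf[2])) << 8) | uint32_t(uint8_t(buf[3]));
datablock[1] = (uint32_t(uint8_t(buf[4])) << 24) | (uint32_t(uint8_t(buf[5])) << 16) | (uint32_t(uint8_t(buf[6])) << 8) | uint32_t(uint8_t(buf[7]));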
where in this example, buf is the type char[8] (or more appropriately uint8_t[8]).
The bit-shift '<<' operator moves a given byte's bits to the position where they should be stored in the uint32_t (thus, for example, the first byte in the above example is stored in the most significant 8 bits of datablock[0]). The '|' operator concatenates all the bits so that you end up with the full 32-bit number. Hope that makes sense.
My problem is, that I'm looking for the best way to convert my char vector into a unsigned long pointer.
((unsigned long*)vec.data()) since C++11 or ((unsigned long*)&vec[0])) pre-c++11?
PS: i guess someone will come along and argue that it should be a reinterpret_cast<unsigned long*>() or something sooner or later, and they'll probably be right.
Also, I used a std::string, but here's how I did the encipher loop:
string message = readMessage();
for (size_t i = 0; i < message.length(); i += 8)
{
encipher(32, (uint32_t *)&message[i], keys);
}
// now message is encrypted
and
for (size_t i = 0; i < message.length(); i += 8)
{
decipher(32, (uint32_t *)&message[i], keys);
}
// now message is decrypted (still may have padding bytes tho)
and I just used the sample C encipher/decipher functions from XTEA's Wikipedia page.
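The pointer casts above read the bytes through a uint32_t*, which can run into alignment and aliasing trouble on some platforms. A hedged alternative is to copy each 8-byte block with std::memcpy; loadBlock and storeBlock are hypothetical names, and the sketch assumes the vector has already been padded to a multiple of 8 bytes. Like the cast version, it keeps the machine's native byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies the 8 bytes at `offset` into two 32-bit words.
// Assumes offset + 8 <= data.size(); the caller is responsible for padding.
void loadBlock(const std::vector<unsigned char>& data, std::size_t offset,
               uint32_t (&v)[2]) {
    std::memcpy(v, data.data() + offset, sizeof v);
}

// Hypothetical helper: writes the two 32-bit words back to the same place.
void storeBlock(std::vector<unsigned char>& data, std::size_t offset,
                const uint32_t (&v)[2]) {
    std::memcpy(data.data() + offset, v, sizeof v);
}
```

In a loop over i += 8, one would call loadBlock(data, i, v), pass v to the encipher function, and then call storeBlock(data, i, v).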

How to convert an array of bits to a char

I am trying to edit each byte of a buffer by modifying the LSB(Least Significant Bit) according to some requirements.
I am using the unsigned char type for the bytes, so please let me know IF that is correct/wrong.
unsigned char buffer[MAXBUFFER];
Next, i'm using this function
char *uchartob(char s[9], unsigned char u)
which modifies and returns the first parameter as an array of bits. This function works just fine, as the bits in the array represent the second parameter.
Here's where the hassle begins. I am going to point out what I'm trying to do step by step so you guys can let me know where i'm taking the wrong turn.
I am saving the result of the above function (called for each element of the buffer) in a variable
char binary_byte[9]; // array of bits
I am testing the LSB by simply comparing it to some flag, like below.
if (binary_byte[7]==bit_flag) // i go on and modify it like this
binary_byte[7]=0; // or 1, depending on the case
Next, I'm trying to convert the array of bits binary_byte (it is an array of bits, isn't it?) back into a byte/unsigned char and update the data in the buffer at the same time. I hope I am making myself clear enough, as I am really confused at the moment.
buffer[position_in_buffer]=binary_byte[0]<<7| // actualize the current BYTE in the buffer
binary_byte[1]<<6|
binary_byte[2]<<5|
binary_byte[3]<<4|
binary_byte[4]<<3|
binary_byte[5]<<2|
binary_byte[6]<<1|
binary_byte[7];
Keep in mind that the bit at the position binary_byte[7] may be modified, that's the point of all this.
The solution is not really elegant, but it's working, even though I am really unsure about what I did (I tried to do it with bitwise operators but without success).
The weird thing is when I try to print the updated character from the buffer. It has the same bits as the previous character, but it's a completely different one.
My final question is: What effect does changing only the LSB in a byte have? What should I expect? As you can see, I'm getting only "new" characters even when I shouldn't.
So I'm still a little unsure what you are trying to accomplish here but since you are trying to modify individual bits of a byte I would propose using the following data structure:
union bit_byte
{
struct{
unsigned bit0 : 1;
unsigned bit1 : 1;
unsigned bit2 : 1;
unsigned bit3 : 1;
unsigned bit4 : 1;
unsigned bit5 : 1;
unsigned bit6 : 1;
unsigned bit7 : 1;
} bits;
unsigned char all;
};
This will allow you to access each bit of your byte and still get your byte representation. Here is some quick sample code:
bit_byte myValue;
myValue.all = 0; // start from a known value before touching individual bits
myValue.bits.bit0 = 1; // Set the LSB
// Test the LSB
if(myValue.bits.bit0 == 1) {
myValue.bits.bit7 = 1;
}
printf("%i", myValue.all);
bitwise:
set bit => a |= 1 << x;
reset bit => a &= ~(1 << x);
bit check => a & (1 << x);
flip bit => a ^= (1 << x);
If you can not manage this you can always use std::bitset.
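Applied to the buffer from the question, the operations above reduce to one mask and one OR per byte. The following is a minimal sketch; setLsbs and the bits array are hypothetical and stand in for whatever decides the new LSB of each byte.

```cpp
#include <cstddef>

// Hypothetical helper: overwrites the least significant bit of each of the
// first `count` bytes in `buffer` with the matching entry of `bits` (0 or 1).
void setLsbs(unsigned char* buffer, const unsigned char* bits, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        // Clear bit 0, then OR in the new value; the other seven bits are untouched.
        buffer[i] = static_cast<unsigned char>((buffer[i] & 0xFEu) | (bits[i] & 1u));
    }
}
```

Changing only the LSB moves a byte's value by at most one, so 'A' (0x41) becomes '@' (0x40) when its LSB is cleared and 'B' (0x42) becomes 'C' (0x43) when its LSB is set.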
Helper macros:
#define SET_BIT(where, bit_number) ((where) |= 1 << (bit_number))
#define RESET_BIT(where, bit_number) ((where) &= ~(1 << (bit_number)))
#define FLIP_BIT(where, bit_number) ((where) ^= 1 << (bit_number))
#define GET_BIT_VALUE(where, bit_number) (((where) & (1 << (bit_number))) >> (bit_number)) //this will return 0 or 1
Helper application to print bits:
#include <iostream>
#include <cstdint>
#define GET_BIT_VALUE(where, bit_number) (((where) & (1 << (bit_number))) >> (bit_number))
template<typename T>
void print_bits(T const& value)
{
for(uint8_t bit_count = 0;
bit_count < (sizeof(T)<<3);
++bit_count)
{
std::cout << GET_BIT_VALUE(value, bit_count) << std::endl;
}
}
int main()
{
unsigned int f = 8;
print_bits(f);
}