type cast to integer - c++

Suppose I have
unsigned char * buffer; // buffer length is 10000
I want to convert buffer+50 to buffer+54 to an int. The following code works:
int c = *((int *)(buffer + 50));
But is there a better way to do this, and how many instructions should it take?
Thanks a lot.

Something like this would work:
#include <cstdint>

std::uint32_t convert_to_int32(std::uint8_t* buffer) // assume size 4
{
    std::uint32_t result = (static_cast<std::uint32_t>(buffer[0]) << 24) |
                           (static_cast<std::uint32_t>(buffer[1]) << 16) |
                           (static_cast<std::uint32_t>(buffer[2]) << 8) |
                           (static_cast<std::uint32_t>(buffer[3]));
    return result;
}
The main problem you will have with your current method is alignment: if you cast to an integer pointer at a point in the buffer that is not on an integer alignment boundary, the load can be slow or even fault on some platforms. The shifting method gets around that.
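Another option, if the bytes in the buffer are already in the host's byte order, is std::memcpy, which also sidesteps the alignment and aliasing problems; a minimal sketch (convert_native_order is just an illustrative name):
#include <cstdint>
#include <cstring>

// Copies four bytes into an integer. The result follows the host's endianness,
// unlike the shifting version above, which pins down the byte order explicitly.
std::uint32_t convert_native_order(const std::uint8_t* buffer) // assume size 4
{
    std::uint32_t result;
    std::memcpy(&result, buffer, sizeof(result));
    return result;
}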

Related

Converting 24 bit integer (2s complement) to 32 bit integer in C++

The dataFile.bin is a binary file with 6-byte records. The first 3
bytes of each record contain the latitude and the last 3 bytes contain
the longitude. Each 24 bit value represents radians multiplied by
0X1FFFFF
This is a task I've been working on. I haven't done C++ in years, so it's taking me way longer than I thought it would -_-. After googling around I saw this algorithm, which made sense to me.
int interpret24bitAsInt32(byte[] byteArray) {
    int newInt = (
        ((0xFF & byteArray[0]) << 16) |
        ((0xFF & byteArray[1]) << 8) |
        (0xFF & byteArray[2])
    );
    if ((newInt & 0x00800000) > 0) {
        newInt |= 0xFF000000;
    } else {
        newInt &= 0x00FFFFFF;
    }
    return newInt;
}
The problem is a syntax issue: I am restricted to working within the way the other guy had programmed this. I am not understanding how I can store the CHAR "data" into an INT. Wouldn't it make more sense if "data" were an array? Since it's receiving 24 integers of information stored into a BYTE.
double BinaryFile::from24bitToDouble(char *data) {
    int32_t iValue;
    // ****************************
    // Start code implementation
    // Task: Fill iValue with the 24bit integer located at data.
    // The first byte is the LSB.
    // ****************************
    //iValue +=
    // ****************************
    // End code implementation
    // ****************************
    return static_cast<double>(iValue) / FACTOR;
}

bool BinaryFile::readNext(DataRecord &record)
{
    const size_t RECORD_SIZE = 6;
    char buffer[RECORD_SIZE];
    m_ifs.read(buffer, RECORD_SIZE);
    if (m_ifs) {
        record.latitude = toDegrees(from24bitToDouble(&buffer[0]));
        record.longitude = toDegrees(from24bitToDouble(&buffer[3]));
        return true;
    }
    return false;
}

double BinaryFile::toDegrees(double radians) const
{
    static const double PI = 3.1415926535897932384626433832795;
    return radians * 180.0 / PI;
}
I appreciate any help or hints; even if you don't understand it all, a clue or hint will help me a lot. I just need to talk to someone.
I am not understanding how I can store the CHAR "data" into an INT.
Since char is a numeric type, there is no problem combining them into a single int.
Since it's receiving 24 integers of information stored into a BYTE
It's 24 bits, not bytes, so there are only three integer values that need to be combined.
An easier way of producing the same result without using conditionals is as follows:
int interpret24bitAsInt32(byte[] byteArray) {
    return (
        (byteArray[0] << 24)
        | (byteArray[1] << 16)
        | (byteArray[2] << 8)
    ) >> 8;
}
The idea is to store the three bytes supplied as an input into the upper three bytes of the four-byte int, and then shift it down by one byte. This way the program would sign-extend your number automatically, avoiding conditional execution.
Note on portability: This code is not portable, because it assumes a 32-bit integer size. To make it portable, use <cstdint> types:
#include <array>
#include <cstdint>

int32_t interpret24bitAsInt32(const std::array<uint8_t, 3> byteArray) {
    return (
        (static_cast<int32_t>(byteArray[0]) << 24)
        | (static_cast<int32_t>(byteArray[1]) << 16)
        | (static_cast<int32_t>(byteArray[2]) << 8)
    ) >> 8;
}
It also assumes that the most significant byte of the 24-bit number is stored in the initial element of byteArray, then comes the middle element, and finally the least significant byte.
Note on sign extension: This code automatically takes care of sign extension by constructing the value in the upper three bytes and then shifting it to the right, as opposed to constructing the value in the lower three bytes right away. This additional shift operation ensures that C++ takes care of sign-extending the result for us.
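As a quick sanity check, here is a small usage example (assuming the int32_t version above is in scope; the byte values are made up for illustration):
#include <array>
#include <cstdint>
#include <iostream>

int main()
{
    std::array<uint8_t, 3> negative = { 0xFF, 0xFF, 0xFE }; // -2 as a 24-bit two's complement value
    std::array<uint8_t, 3> positive = { 0x00, 0x00, 0x05 }; // plain 5

    std::cout << interpret24bitAsInt32(negative) << "\n"; // prints -2
    std::cout << interpret24bitAsInt32(positive) << "\n"; // prints 5
    return 0;
}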
When an unsigned char is cast to an int, the higher-order bits are filled with 0s.
When a signed char is cast to an int, the sign bit is extended.
i.e.:
int x;
char y;
unsigned char z;
y = 0xFF;
z = 0xFF;
x = y;
/* x will be 0xFFFFFFFF (assuming char is signed on your platform) */
x = z;
/* x will be 0x000000FF */
So your algorithm uses 0xFF as a mask to remove C's sign extension, i.e.
0xFF == 0x000000FF
0xABCDEF10 & 0x000000FF == 0x00000010
It then uses bit shifts and bitwise ORs to put the bits in their proper place.
Lastly it checks the most significant bit, (newInt & 0x00800000) > 0, to decide whether to fill the highest byte with 0s or 1s.
int32_t upperByte = ((int32_t) dataRx[0] << 24);
int32_t middleByte = ((int32_t) dataRx[1] << 16);
int32_t lowerByte = ((int32_t) dataRx[2] << 8);
int32_t ADCdata32 = (((int32_t) (upperByte | middleByte | lowerByte)) >> 8); // Right-shift of signed data maintains signed bit

Why does this function return a different value than if I did it inline?

Having come back from a .NET high and looking to make a faster and more efficient image library, I figured I'd try doing everything manually for maximum control (raises fist into the air).
A learning experience, you know?
So, I'm reading some test bitmaps (.bmp extension). While reading some bitmap file headers, I noticed the size block gives me negative values.
I've managed to track down the issue somewhat, but I'm not that great at debugging. When I replace my get_u32 function with the same chunk of code - inline - it gives me the right size. I imagine there is some implicit type conversion going on under the hood, but I can't really tell.
Here's the relevant code
typedef struct IMGLIB_API {
    unsigned short type;
    unsigned long size;
    unsigned short reserved1;
    unsigned short reserved2;
    unsigned long offbits;
} BITMAP_FILE_HEADER;

unsigned short get_u16(char b0, char b1){
    return ((b1 << 8) | b0);
}

unsigned long get_u32(char b0, char b1, char b2, char b3){
    return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
}
IMAGE_DATA bitmap_loader(const char* file_path){
    IMAGE_DATA r;
    ifstream ifs;
    const unsigned short valid_formats[] = {
        16973, // BM - Windows 3.1x, 95, NT, ...
        16961, // BA - OS/2 struct bitmap array
        17225, // CI - OS/2 struct color icon
        17232, // CP - OS/2 const color pointer
        18755, // IC - OS/2 struct icon
        20564  // PT - OS/2 pointer
    };
    const int BMP_FILE_HEADER_SIZE = 14;
    ifs.open(file_path, ifstream::in | ifstream::binary);
    unsigned char* rhead = new unsigned char[BMP_FILE_HEADER_SIZE];
    ifs.read((char*)rhead, BMP_FILE_HEADER_SIZE);
    BITMAP_FILE_HEADER bmp_header;
    bmp_header.type = get_u16(rhead[0], rhead[1]);
    bmp_header.size = get_u32(rhead[2], rhead[3], rhead[4], rhead[5]); // - doesn't work
    //bmp_header.size = (rhead[5] << 24) | (rhead[4] << 16) | (rhead[3] << 8) | rhead[2]; // - works
    bmp_header.reserved1 = get_u16(rhead[6], rhead[7]);
    bmp_header.reserved2 = get_u16(rhead[8], rhead[9]);
    bmp_header.offbits = get_u32(rhead[10], rhead[11], rhead[12], rhead[13]); // correct, reports 54 bytes
    // TODO: check if valid bitmap type
    // TODO: read the dib header
    // TODO: read in pixel data and decompress (if needed)
    r.size = bmp_header.size;
    r.type = img_type::IM_BITMAP;
    ifs.close();
    delete rhead;
    rhead = 0;
    return r;
}
As an example, when loading a particular bitmap with a size of 86 454 bytes, I get -74 when using the function get_u32. I figured that setting the return type as unsigned, and the struct member to unsigned, would mean it can't be... signed. Better rename myself to Snow, because apparently I know nothing (grumble).
Feel free to give some tips on optimizations/better ways to go about it.
char can be signed or unsigned. A signed char, when converted to a larger type, is sign-extended. When you do math on a char, it converts to int, which is a larger type.
When doing bit manipulation, default to unsigned.
unsigned short get_u16(char b0, char b1){
    return ((b1 << 8) | b0);
}
is an example of the problem. b0 is sign extended, maybe.
Implementations are free to treat char as either signed or unsigned.
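The usual fix, following the "default to unsigned" advice, is to take the bytes as unsigned char (or mask them with 0xFF) so the promotion to int cannot sign-extend; a minimal sketch of what the helpers might look like:
#include <cstdint>

// Each byte stays in the range 0..255 before the shifts,
// so no sign extension can leak into the high bits of the result.
std::uint16_t get_u16(unsigned char b0, unsigned char b1){
    return static_cast<std::uint16_t>((b1 << 8) | b0);
}

std::uint32_t get_u32(unsigned char b0, unsigned char b1, unsigned char b2, unsigned char b3){
    return (static_cast<std::uint32_t>(b3) << 24)
         | (static_cast<std::uint32_t>(b2) << 16)
         | (static_cast<std::uint32_t>(b1) << 8)
         |  static_cast<std::uint32_t>(b0);
}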

XTEA-function with std::vector

I'm trying to encrypt a std::vector with XTEA. Because using std::vector brings various benefits when dealing with big amounts of data, I want to use it.
The XTEA algorithm uses two unsigned longs (v0 and v1), which hold 64 bits of data, to encrypt them.
xtea_enc(unsigned char buf[], int length, unsigned char key[], unsigned char** outbuf)
/* Source http://pastebin.com/uEvZqmUj */
unsigned long v0 = *((unsigned long*)(buf+n));
unsigned long v1 = *((unsigned long*)(buf+n+4));
My problem is that I'm looking for the best way to convert my char vector into an unsigned long pointer.
Or is there another way to split the vector into 64-bit parts for the encryption function?
The insight comes in realizing that each char is a byte; thus a 64-bit number consists of 8 bytes, or two 32-bit numbers.
Thus one 32-bit number can store 4 bytes, so for each 8-byte block in your char vector you would store the two 4-byte halves in a pair of 32-bit numbers. You would then pass this pair in to your xtea function, something like:
uint32_t datablock[2];
datablock[0] = (buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | (buf[3]);
datablock[1] = (buf[4] << 24) | (buf[5] << 16) | (buf[6] << 8) | (buf[7]);
where in this example, buf has the type char[8] (or, more appropriately, uint8_t[8]).
The bit-shift '<<' operator shifts each byte's bits into the position where they should be stored in the uint32_t (so, for example, the first byte in the above example ends up in the top 8 bits of datablock[0]). The '|' operator provides a concatenation of all the bits so that you end up with the full 32-bit number. Hope that makes sense.
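For the std::vector from the question, you might loop over it in 8-byte blocks, pack each block into two uint32_t values, and hand them to the XTEA routine. A rough sketch under those assumptions (xtea_encipher is a placeholder name for whatever encryption function you use, and the vector length is assumed to be a multiple of 8):
#include <cstddef>
#include <cstdint>
#include <vector>

void xtea_encipher(unsigned num_rounds, uint32_t v[2], const uint32_t key[4]); // placeholder declaration

void encrypt_buffer(std::vector<unsigned char>& buf, const uint32_t key[4])
{
    for (std::size_t n = 0; n + 8 <= buf.size(); n += 8) {
        // Pack 8 bytes into two big-endian 32-bit words.
        uint32_t datablock[2];
        datablock[0] = (uint32_t(buf[n])     << 24) | (uint32_t(buf[n + 1]) << 16)
                     | (uint32_t(buf[n + 2]) << 8)  |  uint32_t(buf[n + 3]);
        datablock[1] = (uint32_t(buf[n + 4]) << 24) | (uint32_t(buf[n + 5]) << 16)
                     | (uint32_t(buf[n + 6]) << 8)  |  uint32_t(buf[n + 7]);

        xtea_encipher(32, datablock, key);

        // Unpack the two 32-bit words back into the same 8 bytes.
        for (int i = 0; i < 2; ++i) {
            buf[n + 4 * i]     = (datablock[i] >> 24) & 0xFF;
            buf[n + 4 * i + 1] = (datablock[i] >> 16) & 0xFF;
            buf[n + 4 * i + 2] = (datablock[i] >> 8)  & 0xFF;
            buf[n + 4 * i + 3] =  datablock[i]        & 0xFF;
        }
    }
}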
My problem is that I'm looking for the best way to convert my char vector into an unsigned long pointer.
((unsigned long*)vec.data()) since C++11, or ((unsigned long*)&vec[0]) pre-C++11?
PS: I guess someone will come along and argue that it should be a reinterpret_cast<unsigned long*>() or something sooner or later, and they'll probably be right.
Also, I used a std::string, but here's how I did the encipher loop:
string message = readMessage();
for (size_t i = 0; i < message.length(); i += 8)
{
    encipher(32, (uint32_t *)&message[i], keys);
}
// now message is encrypted
and
for (size_t i = 0; i < message.length(); i += 8)
{
    decipher(32, (uint32_t *)&message[i], keys);
}
// now message is decrypted (still may have padding bytes tho)
And I just used the sample C encipher/decipher functions from XTEA's Wikipedia page.
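For reference, the sample encipher routine on XTEA's Wikipedia page looks roughly like this (decipher is the same round structure run in reverse):
#include <cstdint>

void encipher(unsigned int num_rounds, uint32_t v[2], uint32_t const key[4])
{
    uint32_t v0 = v[0], v1 = v[1], sum = 0, delta = 0x9E3779B9;
    for (unsigned int i = 0; i < num_rounds; i++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += delta;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    v[0] = v0; v[1] = v1;
}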

having char[1] and offset how to read int?

I have such a structure:
typedef struct {
int32_t DataLen;
char Data[1];
} MTEMSG;
So Data contains DataLen symbols that should be decoded by certain rules. I should write ReadInt, ReadString, etc. methods.
As a first step I want to write ReadInt. From the documentation, this is "Four bytes in the format of an x86 CPU (the little-endian byte goes first)." How can I convert char[1] to an int? I guess it should be something like:
MTEMSG* data;
int offset;
....
int Reader::ReadInt()
{
int result = // read 4 bytes starting from offset
offset += 4;
}
It's allowed to use Boost and C++11. I'm just looking for a simple and fast method to convert.
I hope that once you suggest how to convert the int, I can do many of the rest of the methods myself.
Totally illegal and UB, but you would do something like *reinterpret_cast<int*>(data->Data + offset).
Watch out for alignment and stuff.
First of all, in C++, as they have stated in the comments, this is illegal. Nevertheless, assuming your compiler expects you might do something like this and has well-defined behavior for it, let's go ahead.
So semantically, you have such a struct:
typedef struct {
int32_t DataLen;
char Data[N];
} MTEMSG;
where N is "large enough".
And you need to convert Data to a 4-byte little endian integer. That's quite simple:
MTEMSG* data;
int offset = 0;
....
int Reader::ReadInt()
{
    /* Note: int32_t would be more precise.
       The & 0xFF masks keep a possibly signed char from sign-extending into the upper bytes. */
    int result = (data->Data[offset + 0] & 0xFF)
               | ((data->Data[offset + 1] & 0xFF) << 8)
               | ((data->Data[offset + 2] & 0xFF) << 16)
               | ((data->Data[offset + 3] & 0xFF) << 24);
    offset += 4;
    return result;
}
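As a quick sanity check of the little-endian order, here is a small made-up example (the byte values are illustrative, not from any real message):
#include <cassert>
#include <cstdint>

int main()
{
    const unsigned char data[4] = { 0x78, 0x56, 0x34, 0x12 }; // least significant byte first
    int32_t value = (data[0] & 0xFF)
                  | ((data[1] & 0xFF) << 8)
                  | ((data[2] & 0xFF) << 16)
                  | ((data[3] & 0xFF) << 24);
    assert(value == 0x12345678); // the LSB-first bytes reassemble into 0x12345678
    return 0;
}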

C/C++: Bitwise operators on dynamically allocated memory

In C/C++, is there an easy way to apply bitwise operators (specifically left/right shifts) to dynamically allocated memory?
For example, let's say I did this:
unsigned char * bytes=new unsigned char[3];
bytes[0]=1;
bytes[1]=1;
bytes[2]=1;
I would like a way to do this:
bytes>>=2;
(then the 'bytes' would have the following values):
bytes[0]==0
bytes[1]==64
bytes[2]==64
Why the values should be that way:
After allocation, the bytes look like this:
[00000001][00000001][00000001]
But I'm looking to treat the bytes as one long string of bits, like this:
[000000010000000100000001]
A right shift by two would cause the bits to look like this:
[000000000100000001000000]
Which finally looks like this when separated back into the 3 bytes (thus the 0, 64, 64):
[00000000][01000000][01000000]
Any ideas? Should I maybe make a struct/class and overload the appropriate operators? Edit: If so, any tips on how to proceed? Note: I'm looking for a way to implement this myself (with some guidance) as a learning experience.
I'm going to assume you want bits carried from one byte to the next, as John Knoeller suggests.
The requirements here are insufficient. You need to specify the order of the bits relative to the order of the bytes: when the least significant bit falls out of one byte, does it go to the next higher or next lower byte?
What you are describing, though, used to be very common for graphics programming. You have basically described a monochrome bitmap horizontal scrolling algorithm.
Assuming that "right" means higher addresses but less significant bits (i.e. matching the normal writing conventions for both), a single-bit shift will be something like...
void scroll_right (unsigned char* p_Array, int p_Size)
{
    unsigned char orig_l = 0;
    unsigned char orig_r;
    unsigned char* dest = p_Array;

    while (p_Size > 0)
    {
        p_Size--;
        orig_r = *p_Array++;
        *dest++ = (orig_l << 7) + (orig_r >> 1);
        orig_l = orig_r;
    }
}
Adapting the code for variable shift sizes shouldn't be a big problem. There are obvious opportunities for optimisation (e.g. doing 2, 4 or 8 bytes at a time), but I'll leave that to you.
To shift left, though, you should use a separate loop which should start at the highest address and work downwards.
If you want to expand "on demand", note that the orig_l variable contains the last byte above. To check for an overflow, check if (orig_l << 7) is non-zero. If your bytes are in an std::vector, inserting at either end should be no problem.
EDIT I should have said - optimising to handle 2, 4 or 8 bytes at a time will create alignment issues. When reading 2-byte words from an unaligned char array, for instance, it's best to do the odd byte read first so that later word reads are all at even addresses up until the end of the loop.
On x86 this isn't necessary, but it is a lot faster. On some processors it's necessary. Just do a switch based on the base (address & 1), (address & 3) or (address & 7) to handle the first few bytes at the start, before the loop. You also need to special case the trailing bytes after the main loop.
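As a usage sketch against the question's example: a shift by two bits is just two single-bit scrolls with the scroll_right above (a variable-shift version would do it in one pass):
#include <cassert>

int main()
{
    unsigned char bytes[3] = { 1, 1, 1 };   // [00000001][00000001][00000001]
    scroll_right(bytes, 3);                 // one bit:  [00000000][10000000][10000000]
    scroll_right(bytes, 3);                 // two bits: [00000000][01000000][01000000]
    assert(bytes[0] == 0 && bytes[1] == 64 && bytes[2] == 64);
    return 0;
}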
Decouple the allocation from the accessor/mutators
Next, see if a standard container like bitset can do the job for you
Otherwise check out boost::dynamic_bitset
If all else fails, roll your own class
Rough example:
#include <climits>
#include <cstddef>

typedef unsigned char byte;

/* Extract bitcount bits of value, starting at startbit (1 = the most significant bit). */
byte extract(byte value, int startbit, int bitcount)
{
    byte result;
    result = (byte)(value << (startbit - 1));
    result = (byte)(result >> (CHAR_BIT - bitcount));
    return result;
}

/* Shift the whole array right by n bits (0 < n < CHAR_BIT), carrying bits across bytes. */
byte *right_shift(byte *bytes, size_t nbytes, size_t n) {
    byte rollover = 0;
    for (size_t i = 0; i < nbytes; ++i) {
        /* Save the low n bits before they are shifted out; they roll into the next byte. */
        byte next_rollover = extract(bytes[i], CHAR_BIT - n + 1, n);
        bytes[i] = (byte)((bytes[i] >> n) | (rollover << (CHAR_BIT - n)));
        rollover = next_rollover;
    }
    return &bytes[0];
}
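For the std::bitset suggestion above, a minimal sketch for the question's fixed 24-bit case (std::bitset needs the size at compile time; boost::dynamic_bitset lifts that restriction):
#include <bitset>
#include <cassert>

int main()
{
    std::bitset<24> bits(0x010101);        // [00000001][00000001][00000001]
    bits >>= 2;
    assert(bits.to_ulong() == 0x004040);   // [00000000][01000000][01000000], i.e. 0, 64, 64
    return 0;
}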
Here's how I would do it for two bytes:
unsigned int rollover = bytes[0] & 0x3;
bytes[0] >>= 2;
bytes[1] = bytes[1] >> 2 | (rollover << 6);
From there, you can generalize this into a loop for n bytes, as in the sketch below. For flexibility, you will want to generate the magic numbers (0x3 and 6) rather than hardcode them.
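A hedged sketch of that generalization, for shifts of 1 to 7 bits:
#include <cstddef>

void shift_right(unsigned char* bytes, std::size_t nbytes, unsigned shift) // 1 <= shift <= 7
{
    const unsigned char mask = static_cast<unsigned char>((1u << shift) - 1); // generalizes the 0x3
    unsigned char rollover = 0;
    for (std::size_t i = 0; i < nbytes; ++i) {
        const unsigned char next = bytes[i] & mask;  // bits that fall off this byte
        bytes[i] = static_cast<unsigned char>((bytes[i] >> shift) | (rollover << (8 - shift))); // 8 - shift generalizes the 6
        rollover = next;                             // carried into the next, less significant byte
    }
}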
I'd look into something similar to this:
#define number_of_bytes 3

template<size_t num_bytes>
union MyUnion
{
    char bytes[num_bytes];
    __int64 ints[num_bytes / sizeof(__int64) + 1];
};

int main()
{
    MyUnion<number_of_bytes> mu;
    mu.bytes[0] = 1;
    mu.bytes[1] = 1;
    mu.bytes[2] = 1;
    mu.ints[0] >>= 2;
}
Just play with it. You'll get the idea I believe.
Operator overloading is syntactic sugar. It's really just a way of calling a function and passing your byte array without having it look like you are calling a function.
So I would start by writing this function
unsigned char * ShiftBytes(unsigned char * bytes, size_t count_of_bytes, int shift);
Then if you want to wrap this up in an operator overload in order to make it easier to use or because you just prefer that syntax, you can do that as well. Or you can just call the function.
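A minimal sketch of such a wrapper (BitBuffer is a made-up name; ShiftBytes still does all the real work):
#include <cstddef>

unsigned char * ShiftBytes(unsigned char * bytes, std::size_t count_of_bytes, int shift); // defined elsewhere, as above

// Thin wrapper so the call site can read like the built-in operator.
class BitBuffer {
public:
    BitBuffer(unsigned char* bytes, std::size_t count) : m_bytes(bytes), m_count(count) {}

    BitBuffer& operator>>=(int shift)
    {
        ShiftBytes(m_bytes, m_count, shift);  // all the real work stays in the free function
        return *this;
    }

private:
    unsigned char* m_bytes;
    std::size_t m_count;
};

With that, the question's bytes >>= 2; becomes something like BitBuffer(bytes, 3) >>= 2;.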