Legacy code seems to have an overflow, I'm not sure though - c++

I'm working on some legacy code right now (converting some of it to C#), and I've stumbled upon a problem:
A byte array is created (length is ulcLen):
CSLAutoArray<BYTE> pMem(new BYTE[ulcLen]);
Now some stuff is put into the byte array, after which a CRC / Hash value is supposed to be written to the first four bytes (ULONG / UInt32):
__CfgCRC(pMem + sizeof(ULONG), ulcLen - sizeof(ULONG))
->
inline ULONG __CfgCRC(const void* const cpcMem, const ULONG ulcMemSize)
{
ULONG ulRes = 0;
const BYTE* const cpcUseMem = reinterpret_cast<const BYTE*>(cpcMem);
for(const BYTE* pcLook = cpcUseMem; cpcUseMem + ulcMemSize > pcLook; pcLook++)
{
ulRes ^= static_cast<ULONG>(*pcLook);
//[...]
};
return ulRes;
};
Now, is it just me, or is the static_cast reading 1/2/3 bytes past the end of the byte array at the end of the for loop, since pcLook (the memory pointer) is increased until it reaches the full length of the data (ulcLen + sizeof(ULONG))? Or am I wrong? Or does static_cast somehow not read past the end of an array? (CSLAutoArray is some kind of managed pointer class, but as far as I can see it does not interfere with this code.)

*pcLook is just a BYTE, so no: it only reads 1 octet at a time. The static_cast converts the BYTE value that has already been read into a ULONG; it does not reinterpret what pcLook points to, so no extra bytes are read.
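To make this concrete, here is a minimal sketch. The names `Byte` and `xor_bytes` are hypothetical stand-ins, and the elided part of the original loop body is not reproduced. It shows that dereferencing a byte pointer reads exactly one byte and that static_cast only widens the value that was read:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using Byte = unsigned char;  // assumption: stand-in for the Windows BYTE typedef

// XORs every byte of [mem, mem + size) into a 32-bit accumulator.
// *p reads a single byte; static_cast widens that value and does not
// touch any neighbouring bytes.
inline std::uint32_t xor_bytes(const void* mem, std::size_t size)
{
    const Byte* const begin = static_cast<const Byte*>(mem);
    std::uint32_t result = 0;
    for (const Byte* p = begin; p != begin + size; ++p) {
        result ^= static_cast<std::uint32_t>(*p);
    }
    return result;
}

// Usage: the last byte is widened to 0x000000FF; nothing past it is read.
inline void xor_bytes_example()
{
    const Byte data[] = {0x01, 0x02, 0xFF};
    assert(xor_bytes(data, sizeof data) == (0x01u ^ 0x02u ^ 0xFFu));
}
```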

Related

SystemC Transfer Level Modeling Extract Two Integers from tlm_generic_payload

I am working with the SystemC TLM library. I would like to send a payload with two integers to a module that will perform an operation on those two integers. My question is simply how to setup and decode the payload.
Doulos provided documentation on both setting up and decoding here https://www.doulos.com/knowhow/systemc/tlm2/tutorial__1/
Setup
tlm::tlm_command cmd = static_cast<tlm::tlm_command>(rand() % 2);
if (cmd == tlm::TLM_WRITE_COMMAND) data = 0xFF000000 | i;
trans->set_command( cmd );
trans->set_address( i );
trans->set_data_ptr( reinterpret_cast<unsigned char*>(&data) );
trans->set_data_length( 4 );
trans->set_streaming_width( 4 );
trans->set_byte_enable_ptr( 0 );
trans->set_dmi_allowed( false );
trans->set_response_status( tlm::TLM_INCOMPLETE_RESPONSE );
socket->b_transport( *trans, delay );
Decode
virtual void b_transport( tlm::tlm_generic_payload& trans, sc_time& delay )
{
tlm::tlm_command cmd = trans.get_command();
sc_dt::uint64 adr = trans.get_address() / 4;
unsigned char* ptr = trans.get_data_ptr();
unsigned int len = trans.get_data_length();
unsigned char* byt = trans.get_byte_enable_ptr();
unsigned int wid = trans.get_streaming_width();
So it looks to me like you would send a pointer to a memory location where there are two integers written.
int1: | ptr+0x0 | ptr+0x(wid) | ptr+0x(2*wid) | ptr+0x(3*wid) |
int2: | ptr+0x(4*wid) | ptr+0x(5*wid) | ptr+0x(6*wid) | ptr+0x(7*wid) |
Is my interpretation of this documentation correct?
How could you get those first 4 memory locations [3:0] and combine them into an int32 and how could you get the second 4 [7:4] and turn them into the second integer?
So it looks to me like you would send a pointer to a memory location
where there are two integers written.
Is my interpretation of this documentation correct?
Yes
To get them back you just need to copy them:
int32_t val0, val1;
memcpy(&val0, ptr, sizeof(int32_t));
memcpy(&val1, ptr + sizeof(int32_t), sizeof(int32_t));
or something like
int32_t val[2];
memcpy(val, ptr, sizeof val);
But make sure the initiator keeps the memory under the pointer valid long enough; e.g. it might be better to avoid keeping the data on the stack. And don't forget to check that the payload's data length attribute has a valid value: you want to detect those issues as soon as possible.
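The copy and the length check can be combined in one helper. This is only a sketch that does not depend on the TLM headers: `decode_two_ints` is a hypothetical name, and in a real target `ptr` and `len` would presumably come from trans.get_data_ptr() and trans.get_data_length().

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies two 32-bit integers out of a payload buffer.
// Returns false if the buffer is missing or too short to hold both values.
inline bool decode_two_ints(const unsigned char* ptr, unsigned int len,
                            std::int32_t& first, std::int32_t& second)
{
    if (ptr == nullptr || len < 2 * sizeof(std::int32_t)) {
        return false;
    }
    std::memcpy(&first, ptr, sizeof first);
    std::memcpy(&second, ptr + sizeof first, sizeof second);
    return true;
}

// Usage: round-trip two values through a raw byte buffer.
inline void decode_two_ints_example()
{
    const std::int32_t sent[2] = {7, -3};
    unsigned char raw[sizeof sent];
    std::memcpy(raw, sent, sizeof sent);

    std::int32_t a = 0, b = 0;
    assert(decode_two_ints(raw, static_cast<unsigned int>(sizeof raw), a, b));
    assert(a == 7 && b == -3);
    assert(!decode_two_ints(raw, 4, a, b));  // too short for two integers
}
```

Because both sides use memcpy on the same machine, byte order is not a concern here; it would be if the buffer crossed to a system with a different endianness.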

Convert char buffer to struct

I have a char buffer buf containing buf[0] = 10, buf[1] = 3, buf[2] = 3, buf[3] = 0, buf[4] = 58,
and a structure:
typedef struct
{
char type;
int version;
int length;
}Header;
I wanted to convert the buf into a Header. Now I am using the function
int getByte( unsigned char* buf)
{
int number = buf[0];
return number;
}
int getInt(unsigned char* buf)
{
int number = (buf[0]<<8)+buf[1];
return number;
}
main()
{
Header *head = new Header;
int location = 0;
head->type = getByte(&buf[location]);
location++; // location = 1
head->version = getInt(&buf[location]);
location += 2; // location = 3
head->length = getInt(&buf[location]);
location += 2; // location = 5
}
I am searching for a solution such as
Header *head = new Header;
memcpy(head, buf, sizeof(head));
In this, first value in the Header, head->type is proper and rest is garbage. Is it possible to convert unsigned char* buf to Header?
The only fully portable and safe way is:
void convertToHeader(unsigned char const * const buffer, Header *header)
{
header->type = buffer[0];
header->version = (buffer[1] << 8) | buffer[2];
header->length = (buffer[3] << 8) | buffer[4];
}
and
void convertFromHeader(Header const * const header, unsigned char * buffer)
{
buffer[0] = header->type;
buffer[1] = (static_cast<unsigned int>(header->version) >> 8) & 0xFF;
buffer[2] = header->version & 0xFF;
buffer[3] = (static_cast<unsigned int>(header->length) >> 8) & 0xFF;
buffer[4] = header->length & 0xFF;
}
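As a worked example, here is the field-by-field decoding applied to the five bytes from the question. The function name `decode_header` is hypothetical, and the sketch assumes two-byte fields stored most significant byte first, as in the code above:

```cpp
#include <cassert>

struct Header { char type; int version; int length; };

// Field-by-field decoding: two-byte fields, most significant byte first.
inline Header decode_header(const unsigned char* buffer)
{
    Header h;
    h.type = static_cast<char>(buffer[0]);
    h.version = (buffer[1] << 8) | buffer[2];
    h.length = (buffer[3] << 8) | buffer[4];
    return h;
}

// Usage with the buffer contents given in the question.
inline void decode_header_example()
{
    const unsigned char buf[5] = {10, 3, 3, 0, 58};
    const Header h = decode_header(buf);
    assert(h.type == 10);
    assert(h.version == 771);  // (3 << 8) | 3
    assert(h.length == 58);    // (0 << 8) | 58
}
```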
Example
see Converting bytes array to integer for explanations
EDIT
A quick summary of the previous link: other possible solutions (memcpy or a union, for example) are not portable because endianness differs between systems (what you are doing is probably some sort of communication between at least two heterogeneous systems): on some systems byte[0] is the LSB of the int and byte[1] is the MSB, and on others it is the reverse.
Also, due to alignment, struct Header can be bigger than 5 bytes (probably 6 bytes in your case, if alignment is 2 bytes!) (see here for example)
Finally, depending on the alignment restrictions and aliasing rules of some platforms, the compiler can generate incorrect code.
What you want would need your version and length to have the same length as 2 elements of your buf array; that is, you'd need to use the type uint16_t, defined in <cstdint>, rather than int, which is likely longer. You'd also need to make buf an array of uint8_t, as char is allowed to be wider than 8 bits (sizeof(char) is always 1, but CHAR_BIT may exceed 8).
You probably also need to move type to the end, as otherwise the compiler will almost certainly insert a padding byte after it to be able to align version to a 2-byte boundary (once you have made it uint16_t and thus 2 bytes); your buf[1] would then end up there rather than where you want it.
This is probably what you observe right now, by the way: by having a char followed by an int, which is probably 4 bytes, you have 3 bytes of padding, and the elements 1 to 3 of your array are being inserted there (=lost forever).
Another solution would be to modify your buf array to be longer and have empty padding bytes as well, so that the data will be actually aligned with the struct fields.
Worth mentioning again is that, as pointed out in the comments, sizeof(head) returns the size of a pointer on your system, not of the Header structure. You can directly write sizeof(Header); but at this level of micromanagement, you won't be losing any more flexibility if you just write "5", really.
Also, endianness can screw with you. Processors have no obligation to store the bytes of a number in the order you expect rather than the opposite one; both make internal sense after all. This means that blindly copying bytes buf[0], buf[1] into a number can result in (buf[0]<<8)+buf[1], but also in (buf[1]<<8)+buf[0], or even in (buf[1]<<24)+(buf[0]<<16) if the data type is 4 bytes (as int usually is). And even if it works on your computer now, there is at least one out there where the same code will result in garbage. Unless, that is, those bytes actually come from reinterpreting a number in the first place. In which case the code is wrong (not portable) now, however.
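The padding and sizeof points can be checked directly. This is a small sketch with a hypothetical function name `layout_example`; the exact numbers are implementation-defined (on a typical platform with a 4-byte int one would expect sizeof(Header) to be 12 and version to start at offset 4), so only the portable relations are asserted:

```cpp
#include <cassert>
#include <cstddef>

struct Header { char type; int version; int length; };

inline void layout_example()
{
    Header* head = nullptr;
    // sizeof(head) measures the pointer, not the structure it points to.
    static_assert(sizeof(head) == sizeof(Header*), "size of a pointer");
    // The struct holds one char and two ints; padding may add more on top.
    static_assert(sizeof(Header) >= sizeof(char) + 2 * sizeof(int),
                  "members plus optional padding");
    // version cannot start before type ends; alignment usually pushes it further.
    assert(offsetof(Header, version) >= sizeof(char));
    assert(offsetof(Header, length) >= offsetof(Header, version) + sizeof(int));
    (void)head;
}
```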
...is it worth it?
All things considered, my advice is strongly to keep the way you handle them now. Maybe simplify it.
It really makes no sense to convert a byte to an int and then to a byte again, or to take the address of a byte only to dereference it again; nor is there any need for helper variables with no descriptive name and no purpose other than being returned, or for a variable whose value you know in advance at all times.
Just do
int getTwoBytes(unsigned char* buf)
{
return (buf[0]<<8)+buf[1];
}
int main()
{
Header *head = new Header;
head->type = buf[0];
head->version = getTwoBytes(buf + 1);
head->length = getTwoBytes(buf + 3);
}
A better way is to create some sort of serialization/deserialization routines.
Also, I would not use plain int or char types, but more specific ones such as int32_t; that is the platform-independent way (you can also pack your data structures with pragma pack).
struct Header
{
char16_t type;
int32_t version;
int32_t length;
};
struct Tools
{
std::shared_ptr<Header> deserializeHeader(const std::vector<unsigned char> &loadedBuffer)
{
std::shared_ptr<Header> header(new Header);
memcpy(&(*header), &loadedBuffer[0], sizeof(Header));
return header;
}
std::vector<unsigned char> serializeHeader(const Header &header)
{
std::vector<unsigned char> buffer;
buffer.resize(sizeof(Header));
memcpy(&buffer[0], &header, sizeof(Header));
return buffer;
}
}
tools;
Header header = {'B', 5834, 4665};
auto v1 = tools.serializeHeader(header);
auto v2 = tools.deserializeHeader(v1);

Setting a buffer of char* with intermediate casting to int*

I could not fully understand the consequences of what I read here: Casting an int pointer to a char ptr and vice versa
In short, would this work?
set4Bytes(unsigned char* buffer) {
const uint32_t MASK = 0xffffffff;
if ((uintmax_t)buffer % 4) {//misaligned
for (int i = 0; i < 4; i++) {
buffer[i] = 0xff;
}
} else {//4-byte alignment
*((uint32_t*) buffer) = MASK;
}
}
Edit
There was a long discussion (it was in the comments, which mysteriously got deleted) about what type the pointer should be cast to in order to check the alignment. The subject is now addressed here.
This conversion is safe if you are filling same value in all 4 bytes. If byte order matters then this conversion is not safe.
Because when you use integer to fill 4 Bytes at a time it will fill 4 Bytes but order depends on the endianness.
No, it won't work in every case. Aside from endianness, which may or may not be an issue, you assume that the alignment of uint32_t is 4. But this quantity is implementation-defined (C11 Draft N1570 Section 6.2.8). You can use the _Alignof operator to get the alignment in a portable way.
Second, the effective type (ibid. Sec. 6.5) of the location pointed to by buffer may not be compatible to uint32_t (e.g. if buffer points to an unsigned char array). In that case you break strict aliasing rules once you try reading through the array itself or through a pointer of different type.
Assuming that the pointer actually points to an array of unsigned char, the following code will work
typedef union { unsigned char chr[sizeof(uint32_t)]; uint32_t u32; } conv_t;
void set4Bytes(unsigned char* buffer) {
const uint32_t MASK = 0xffffffffU;
if ((uintptr_t)buffer % _Alignof(uint32_t)) {// misaligned
for (size_t i = 0; i < sizeof(uint32_t); i++) {
buffer[i] = 0xffU;
}
} else { // correct alignment
conv_t *cnv = (conv_t *) buffer;
cnv->u32 = MASK;
}
}
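A different technique that avoids both the alignment test and the aliasing question is to copy the value in with memcpy. This is not what the answer above proposes; it is a sketch under the assumption that only the resulting bytes matter, and `set4BytesMemcpy` is a hypothetical name:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Writes 0xFFFFFFFF into the first four bytes of buffer.
// memcpy has no alignment requirement on its arguments and copies the
// object representation byte by byte.
inline void set4BytesMemcpy(unsigned char* buffer)
{
    const std::uint32_t mask = 0xffffffffu;
    std::memcpy(buffer, &mask, sizeof mask);
}

// Usage: start at an odd offset to force a misaligned destination.
inline void set4BytesMemcpy_example()
{
    unsigned char storage[8] = {0};
    set4BytesMemcpy(storage + 1);
    assert(storage[0] == 0x00);
    for (int i = 1; i <= 4; ++i) {
        assert(storage[i] == 0xff);
    }
    assert(storage[5] == 0x00);
}
```

For a value whose bytes differ, the order in which memcpy lays them out still depends on the machine's endianness.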
This code might be of help to you. It shows a 32-bit number being built by assigning its contents a byte at a time, forcing misalignment. It compiles and works on my machine.
#include<stdint.h>
#include<stdio.h>
#include<inttypes.h>
#include<stdlib.h>
int main () {
uint32_t *data = (uint32_t*)malloc(sizeof(uint32_t)*2);
char *buf = (char*)data;
uintptr_t addr = (uintptr_t)buf;
int i,j;
i = !(addr%4) ? 1 : 0;
uint32_t x = (1<<6)-1;
for( j=0;j<4;j++ ) buf[i+j] = ((char*)&x)[j];
printf("%" PRIu32 "\n",*((uint32_t*) (addr+i)) );
}
As mentioned by @Learner, endianness must be obeyed. The code above is not portable and would break on a big endian machine.
Note that my compiler throws the error "cast from ‘char*’ to ‘unsigned int’ loses precision [-fpermissive]" when trying to cast a char* to an unsigned int, as done in the original post. This post explains that uintptr_t should be used instead.
In addition to the endian issue, which has already been mentioned here:
CHAR_BIT - the number of bits per char - should also be considered.
It is 8 on most platforms, where for (int i=0; i<4; i++) should work fine.
A safer way of doing it would be for (int i=0; i<sizeof(uint32_t); i++).
Alternatively, you can include <limits.h> and use for (int i=0; i<32/CHAR_BIT; i++).
Use reinterpret_cast<>() if you want to ensure the underlying data does not "change shape".
As Learner has mentioned, when you store data in machine memory endianess becomes a factor. If you know how the data is stored correctly in memory (correct endianess) and you are specifically testing its layout as an alternate representation, then you would want to use reinterpret_cast<>() to test that memory, as a specific type, without modifying the original storage.
Below, I've modified your example to use reinterpret_cast<>():
void set4Bytes(unsigned char* buffer) {
const uint32_t MASK = 0xffffffff;
// Test the address itself for alignment, not the value stored at it.
if (reinterpret_cast<uintptr_t>(buffer) % 4) {//misaligned
for (int i = 0; i < 4; i++) {
buffer[i] = 0xff;
}
} else {//4-byte alignment
*reinterpret_cast<uint32_t *>(buffer) = MASK;
}
}
It should also be noted that your function appears to set the buffer (32 bits, i.e. 4 bytes, of contiguous memory) to 0xFFFFFFFF, regardless of which branch it takes.
Your code is perfect for working with any architecture with 32bit and up. There is no issue with byte ordering since all your source bytes are 0xFF.
On x86 or x64 machines, the extra work needed to deal with a possibly unaligned access to RAM is handled by the CPU and is transparent to the programmer (since the Pentium II), at some performance cost on each access. So, if you are just setting the first four bytes of a buffer a few times, you can simplify your function:
void set4Bytes(unsigned char* buffer) {
const uint32_t MASK = 0xffffffff;
*((uint32_t *)buffer) = MASK;
}
Some readings:
A Linux kernel doc about UNALIGNED MEMORY ACCESSES
Intel Architecture Optimization Manual, section 3.4
Windows Data Alignment on IPF, x86, and x64
A Practical 'Aligned vs. unaligned memory access', by Alexander Sandler

4 chars to int in c++

I have to read 10 bytes from a file, and the last 4 bytes are an unsigned integer. But I got an 11-byte char array / pointer. How do I convert the last 4 bytes (without the zero terminating character at the end) to an unsigned integer?
// pseudo code
char *p = readBytesFromFile();
unsigned int myInt = 0;
for( int i = 6; i < 10; i++ )
myInt += (int)p[i];
Is that correct? Doesn't seem correct to me.
The following code might work:
myInt = *(reinterpret_cast<unsigned int*>(p + 6));
iff:
1. There are no alignment problems (e.g. on a GPU memory space this is very likely to blow if some guarantees aren't provided).
2. You can guarantee that the system endianness is the same used to store the data
3. You can be sure that sizeof(int) == 4, this is not guaranteed everywhere
If not, as Dietmar suggested, you should loop over your data (forward or reverse according to the endianness) and do something like
myInt = myInt << 8 | static_cast<unsigned char>(p[i]);
this is alignment-safe (it should be on every system). Still pay attention to points 1 and 3.
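Spelled out as complete functions, the loop might look like the sketch below. The names `read_u32_be` and `read_u32_le` are hypothetical; which of the two applies depends on how the file was written, which the question does not say.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Big-endian: the byte at the lowest offset is the most significant.
inline std::uint32_t read_u32_be(const char* p, std::size_t offset)
{
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        value = (value << 8) | static_cast<unsigned char>(p[offset + i]);
    }
    return value;
}

// Little-endian: the byte at the lowest offset is the least significant,
// so the bytes are consumed from the highest offset downwards.
inline std::uint32_t read_u32_le(const char* p, std::size_t offset)
{
    std::uint32_t value = 0;
    for (std::size_t i = 4; i-- > 0; ) {
        value = (value << 8) | static_cast<unsigned char>(p[offset + i]);
    }
    return value;
}

// Usage: 11-char buffer, integer in bytes 6..9, terminator at index 10.
inline void read_u32_example()
{
    const char data[11] = {0, 0, 0, 0, 0, 0, '\x12', '\x34', '\x56', '\xF0', 0};
    assert(read_u32_be(data, 6) == 0x123456F0u);
    assert(read_u32_le(data, 6) == 0xF0563412u);
}
```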
I agree with the previous answer but just wanna add that this solution may not work 100% if the file was created with a different endianness.
I do not want to confuse you with extra information but keep in mind that endianness may cause you problem when you cast directly from a file.
Here's a tutorial on endianness : http://www.codeproject.com/Articles/4804/Basic-concepts-on-Endianness
Try myInt = *(reinterpret_cast<unsigned int*>(p + 6));.
This takes the address of the character at index 6 (the seventh character), reinterprets it as a pointer to an unsigned int, and then returns the (unsigned int) value it points to.
Maybe using a union is an option? I think this might work;
UPDATE: Yes, it works.
union intc32 {
char c[4];
int v;
};
int charsToInt(char a, char b, char c, char d) {
intc32 r = { { a, b, c, d } };
return r.v;
}

C/C++: Bitwise operators on dynamically allocated memory

In C/C++, is there an easy way to apply bitwise operators (specifically left/right shifts) to dynamically allocated memory?
For example, let's say I did this:
unsigned char * bytes=new unsigned char[3];
bytes[0]=1;
bytes[1]=1;
bytes[2]=1;
I would like a way to do this:
bytes>>=2;
(then the 'bytes' would have the following values):
bytes[0]==0
bytes[1]==64
bytes[2]==64
Why the values should be that way:
After allocation, the bytes look like this:
[00000001][00000001][00000001]
But I'm looking to treat the bytes as one long string of bits, like this:
[000000010000000100000001]
A right shift by two would cause the bits to look like this:
[000000000100000001000000]
Which finally looks like this when separated back into the 3 bytes (thus the 0, 64, 64):
[00000000][01000000][01000000]
Any ideas? Should I maybe make a struct/class and overload the appropriate operators? Edit: If so, any tips on how to proceed? Note: I'm looking for a way to implement this myself (with some guidance) as a learning experience.
I'm going to assume you want bits carried from one byte to the next, as John Knoeller suggests.
The requirements here are insufficient. You need to specify the order of the bits relative to the order of the bytes: when the least significant bit falls out of one byte, does it go to the next higher or the next lower byte?
What you are describing, though, used to be very common for graphics programming. You have basically described a monochrome bitmap horizontal scrolling algorithm.
Assuming that "right" means higher addresses but less significant bits (ie matching the normal writing conventions for both) a single-bit shift will be something like...
void scroll_right (unsigned char* p_Array, int p_Size)
{
unsigned char orig_l = 0;
unsigned char orig_r;
unsigned char* dest = p_Array;
while (p_Size > 0)
{
p_Size--;
orig_r = *p_Array++;
*dest++ = (orig_l << 7) + (orig_r >> 1);
orig_l = orig_r;
}
}
Adapting the code for variable shift sizes shouldn't be a big problem. There are obvious opportunities for optimisation (e.g. doing 2, 4 or 8 bytes at a time) but I'll leave that to you.
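One possible adaptation to variable shift sizes is sketched here. The function name `shift_right_bits` is hypothetical, and the sketch assumes 8-bit bytes and the same convention as above (higher addresses hold less significant bits). It walks from the highest address downwards, so each source byte is read before it is overwritten:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Shifts the whole array right by `shift` bits, carrying bits from each
// byte into the byte at the next higher address. Vacated bits become 0.
inline void shift_right_bits(unsigned char* data, std::size_t size, unsigned shift)
{
    const std::size_t byte_shift = shift / 8;  // whole bytes to move
    const unsigned bit_shift = shift % 8;      // remaining bits to move

    for (std::size_t i = size; i-- > 0; ) {
        unsigned char value = 0;
        if (i >= byte_shift) {
            const std::size_t src = i - byte_shift;
            value = static_cast<unsigned char>(data[src] >> bit_shift);
            if (bit_shift != 0 && src > 0) {
                // Low bits of the previous byte become the high bits here.
                value = static_cast<unsigned char>(
                    value | (data[src - 1] << (8 - bit_shift)));
            }
        }
        data[i] = value;
    }
}

// Usage: the example from the question, {1, 1, 1} shifted right by 2.
inline void shift_right_bits_example()
{
    unsigned char bytes[3] = {1, 1, 1};
    shift_right_bits(bytes, 3, 2);
    assert(bytes[0] == 0 && bytes[1] == 64 && bytes[2] == 64);
}
```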
To shift left, though, you should use a separate loop which should start at the highest address and work downwards.
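If you want to expand "on demand", note that the orig_l variable contains the last byte above. To check for an overflow, check whether the lowest bit of that byte is set, i.e. whether (orig_l & 1) is non-zero (the unmasked expression (orig_l << 7) is promoted to int and is non-zero for any non-zero byte). If your bytes are in a std::vector, inserting at either end should be no problem.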
EDIT I should have said - optimising to handle 2, 4 or 8 bytes at a time will create alignment issues. When reading 2-byte words from an unaligned char array, for instance, it's best to do the odd byte read first so that later word reads are all at even addresses up until the end of the loop.
On x86 this isn't necessary, but it is a lot faster. On some processors it's necessary. Just do a switch based on the base (address & 1), (address & 3) or (address & 7) to handle the first few bytes at the start, before the loop. You also need to special case the trailing bytes after the main loop.
Decouple the allocation from the accessor/mutators
Next, see if a standard container like bitset can do the job for you
Otherwise check out boost::dynamic_bitset
If all else fails, roll your own class
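For the standard-container route, a minimal sketch with std::bitset follows. It assumes the number of bits (24 here, for the three bytes in the question) is known at compile time, and the function name `bitset_example` is hypothetical:

```cpp
#include <bitset>
#include <cassert>

inline void bitset_example()
{
    // The three bytes 1, 1, 1 viewed as one 24-bit string: 0x010101.
    std::bitset<24> bits(0x010101ul);
    bits >>= 2;

    const unsigned long v = bits.to_ulong();
    assert(((v >> 16) & 0xFF) == 0);   // first byte
    assert(((v >> 8) & 0xFF) == 64);   // second byte
    assert((v & 0xFF) == 64);          // third byte
}
```

When the size is only known at run time, std::bitset does not fit, which is where the boost::dynamic_bitset suggestion comes in.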
Rough example:
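#include <climits> // CHAR_BIT
#include <cstddef> // size_t
typedef unsigned char byte;
// Extracts bitcount bits starting at startbit (0 = most significant bit), right-aligned.
byte extract(byte value, int startbit, int bitcount)
{
byte result;
result = (byte)(value << startbit);
result = (byte)(result >> (CHAR_BIT - bitcount));
return result;
}
// Shifts the array right by n bits (0 < n < CHAR_BIT), carrying bits into the next byte.
byte *right_shift(byte *bytes, size_t nbytes, size_t n) {
byte rollover = 0;
for (size_t i = 0; i < nbytes; ++i) {
// Save the low n bits before they are shifted out.
byte next = extract(bytes[ i ], (int)(CHAR_BIT - n), (int)n);
bytes[ i ] = (byte)((bytes[ i ] >> n) | (rollover << (CHAR_BIT - n)));
rollover = next;
}
return &bytes[ 0 ];
}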
Here's how I would do it for two bytes:
unsigned int rollover = byte[0] & 0x3;
byte[0] >>= 2;
byte[1] = byte[1] >> 2 | (rollover << 6);
From there, you can generalize this into a loop for n bytes. For flexibility, you will want to generate the magic numbers (0x3 and 6) rather than hardcode them.
I'd look into something similar to this:
#define number_of_bytes 3
template<size_t num_bytes>
union MyUnion
{
char bytes[num_bytes];
__int64 ints[num_bytes / sizeof(__int64) + 1];
};
int main()
{
MyUnion<number_of_bytes> mu;
mu.bytes[0] = 1;
mu.bytes[1] = 1;
mu.bytes[2] = 1;
mu.ints[0] >>= 2;
}
Just play with it. You'll get the idea I believe.
Operator overloading is syntactic sugar. It's really just a way of calling a function and passing your byte array without having it look like you are calling a function.
So I would start by writing this function
unsigned char * ShiftBytes(unsigned char * bytes, size_t count_of_bytes, int shift);
Then if you want to wrap this up in an operator overload in order to make it easier to use or because you just prefer that syntax, you can do that as well. Or you can just call the function.
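A sketch of that wrapping follows. The class name `BitBuffer` is hypothetical, the helper is a simplified stand-in for the ShiftBytes function declared above, and it assumes 8-bit bytes and only handles shifts of 1 to 7 bits:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class BitBuffer {
public:
    explicit BitBuffer(std::vector<unsigned char> bytes)
        : bytes_(std::move(bytes)) {}

    // The operator only forwards to the plain function below.
    BitBuffer& operator>>=(unsigned shift)
    {
        shift_right(bytes_.data(), bytes_.size(), shift);
        return *this;
    }

    unsigned char operator[](std::size_t i) const { return bytes_[i]; }

private:
    // Shifts right by 1..7 bits, carrying into the next higher address.
    // Other shift values are ignored in this sketch.
    static void shift_right(unsigned char* bytes, std::size_t count, unsigned shift)
    {
        if (shift == 0 || shift > 7) {
            return;
        }
        unsigned char carry = 0;
        for (std::size_t i = 0; i < count; ++i) {
            const unsigned char next =
                static_cast<unsigned char>(bytes[i] << (8 - shift));
            bytes[i] = static_cast<unsigned char>((bytes[i] >> shift) | carry);
            carry = next;
        }
    }

    std::vector<unsigned char> bytes_;
};

// Usage: the syntax asked for in the question.
inline void bit_buffer_example()
{
    const std::vector<unsigned char> initial = {1, 1, 1};
    BitBuffer bytes(initial);
    bytes >>= 2;
    assert(bytes[0] == 0 && bytes[1] == 64 && bytes[2] == 64);
}
```

Keeping the work in a separate function means the operator stays a thin convenience layer, and the function can still be called directly on raw arrays.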