Convert a vector of 4 bytes into an int in C++

My function read_address always returns a vector of 4 bytes.
For other use cases it has to be a vector rather than a const array,
but in this use case it always returns 4 bytes.
std::vector<uint8_t> content;
_client->read_address(start_addr, sizeof(int), content);
uint32_t res = reinterpret_cast<uint32_t>(content.data());
return res;
The vector called content always holds the same values.
However, the value of "res" keeps changing to seemingly random numbers.
Can you please explain my mistake?

Your code is casting the pointer value (an effectively random address) to an integer instead of dereferencing it to read what it points to. Easy fix.
Instead of this:
uint32_t res = reinterpret_cast<uint32_t>(content.data());
This:
uint32_t* ptr = reinterpret_cast<uint32_t*>(content.data());
uint32_t res = *ptr;
Language lawyers may take exception to the above: reading a uint8_t buffer through a uint32_t* violates the strict aliasing rule, and there can be issues with reading data on unaligned boundaries on some platforms. Copying the 4 bytes out of content and into the 32-bit integer avoids both problems:
uint32_t res;
memcpy(&res, content.data(), sizeof(res));
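A minimal sketch of the memcpy approach as a standalone function. The name bytes_to_u32 and the size check are assumptions added for illustration; the result is in the host's native byte order, exactly as with the one-line memcpy above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: interprets the first 4 bytes of `content`
// as a uint32_t in the host's native byte order.
std::uint32_t bytes_to_u32(const std::vector<std::uint8_t>& content) {
    // The caller must supply at least 4 bytes.
    assert(content.size() >= sizeof(std::uint32_t));
    std::uint32_t res = 0;
    // memcpy has no alignment or aliasing requirements on the source.
    std::memcpy(&res, content.data(), sizeof(res));
    return res;
}
```

If the bytes come from a device or protocol with a fixed byte order, the value may still need a byte swap on hosts with the opposite endianness.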

Related

C++ adding 4 bytes to pointer address

I have a question about pointers, and memory addresses:
Supposing I have the following code:
int * array = (int *) malloc(sizeof(int) * 4);
Now array stores a memory address. I know that C++ already takes care of scaling: adding +1 to this pointer adds 4 bytes. But what if I want to add 4 bytes manually?
array + 0x004
If I'm correct, this will add 4*4 (16) bytes, but my idea is to add just those 4 bytes manually.
Why? Just playing around. I tried this and got a totally different result from what I expected; then I did some research and saw that C++ already scales the offset when you add +1 to a pointer (it adds 4 bytes in this case).
Any idea?
For a pointer p to a type T with value v, the expression p+n will (on most systems anyway) result in a pointer to the address v+n*sizeof(T). To get a fixed-byte offset to the pointer, you can first cast it to a character pointer, like this:
reinterpret_cast<T*>(reinterpret_cast<char*>(p) + n)
In C++, sizeof(char) is defined to be equal to 1.
Do note that accessing improperly aligned values can have large performance penalties, and on some platforms it faults outright.
Another thing to note is that, in general, accessing an object through a pointer to an unrelated type is not allowed (the strict aliasing rule), but an exception is explicitly made for inspecting any object through a char* (as well as unsigned char* and std::byte*), and casting a pointer to char* and back yields the original pointer.
The trick is to convert the type of array into any pointer type whose pointee has a size of 1 byte, or to store the pointer value in an integer.
#include <stdint.h>
int* increment_1(int* ptr) {
//C-Style
return (int*)(((char*)ptr) + 4);
}
int* increment_2(int* ptr) {
//C++-Style
char* result = reinterpret_cast<char*>(ptr);
result += 4;
return reinterpret_cast<int*>(result);
}
int* increment_3(int* ptr) {
//Store in integer
intptr_t result = reinterpret_cast<intptr_t>(ptr);
result += 4;
return reinterpret_cast<int*>(result);
}
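As a hedged illustration of the difference between element-scaled and byte-scaled arithmetic, the sketch below wraps the char* cast in a helper. The name advance_bytes and the alignment assertion are assumptions for this example, not part of the answers above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: advances an int pointer by a raw byte count.
// The result is only safe to dereference if it still points at a
// properly aligned int inside the same array.
int* advance_bytes(int* p, std::size_t n_bytes) {
    // Assumption for this sketch: only offsets that keep int alignment.
    assert(n_bytes % alignof(int) == 0);
    return reinterpret_cast<int*>(reinterpret_cast<char*>(p) + n_bytes);
}
```

With int a[4], advance_bytes(a, sizeof(int)) gives the same address as a + 1, whereas a + sizeof(int) would skip sizeof(int) whole elements.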
Consider that if you add an arbitrary number of bytes to an address of an object of type T, it no longer makes sense to use a pointer of type T, since there might not be an object of type T at the incremented memory address.
If you want to access a particular byte of an object, you can do so using a pointer to a char, unsigned char or std::byte. Such objects are the size of a byte, so incrementing behaves just as you would like. Furthermore, while the rules of C++ disallow accessing objects through incompatible pointers, these three types are exempt from that rule and are allowed to access objects of any type.
So, given
int * array = ....
You can access the byte at index 4 like this:
auto ptr = reinterpret_cast<unsigned char*>(array);
auto byte_at_index_4 = ptr[4]; // same as *(ptr + 4)
array + 0x004
If I'm correct, this will add 4*4 (16) bytes
Assuming sizeof(int) happens to be 4, then yes. But the size of int is not guaranteed to be 4.

Obtaining an int from a void pointer which points to a short

I have a return value from a library which is a void pointer. I know that it points to a short int; I try to obtain the int value in the following way (replacing the function call with a simple assignment to a void *):
short n = 1;
void* s = &n;
int k = *(int*)s;
I cast the void pointer, which points to an address holding a short, to a pointer to int, and when I dereference it the output is a rubbish value. While I understand why it behaves like that, I don't know whether there is a solution.
If the problem you are dealing with truly deals with short and int, you can simply avoid the pointer and use:
short n = 1;
int k = n;
If the object types you are dealing with are different, then the solution will depend on what those types are.
Update, in response to OP's comment
In a comment, you said,
I have a function that returns a void pointer and I would need to cast the value accordingly.
If you know that the function returns a void* that truly points to a short object, then, your best bet is:
void* ptr = function_returning_ptr();
short* sptr = reinterpret_cast<short*>(ptr);
int k = *sptr;
The last line works since *sptr evaluates to a short and the conversion of a short to an int is a valid operation. On the other hand,
int k = *(int*)sptr;
does not work since the resulting int* does not point to an int object, so dereferencing it is undefined behavior.
Your code is subject to undefined behavior, as it violates the so-called strict aliasing rules. Without going into too much detail and simplifying a bit, the rule states that you cannot access an object of type X through a pointer to type Z unless types X and Z are related. There is a special exception for char pointers, but it doesn't apply here.
In your example, short and int are not related types, and as such, accessing one through a pointer to the other is not allowed.
A short is typically 16 bits and an int typically 32 bits (in most cases, not always), so you are tricking the computer into thinking that your pointer to a short actually points to an int. This causes it to read more memory than it should, and the extra bytes are garbage. If you cast s to a pointer to a short and then dereference it, it will work.
short n = 1;
void* s = &n;
int k = *(short*)s;
Assuming you have 2-byte shorts and 4-byte ints, there are 3 problems with casting pointers the way you do.
First off, the 4 byte int will necessarily pick up some garbage memory when using the short's pointer. If you're lucky the 2 bytes after short n will be 0.
Second, the 4 byte int may not be properly aligned. Basically, the memory address of a 4 byte int has to be a multiple of 4, or else you risk bus errors. Your 2 byte short is not guaranteed to be properly aligned.
Finally, you have a big-endian/little-endian dependency. You can't turn a big-endian short into a little-endian int by just tacking on some 0's at the end.
In the very fortunate circumstance that the bytes following the short are 0, AND the short is integer aligned, AND the system uses little-endian representation, then such a cast will probably work. It would be terrible, but it would (probably) work.
The proper solution is to use the original type and let the compiler cast. Instead of int k = *(int*)s;, you need to use int k = *(short *)s;
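The recommended pattern can be sketched as follows. Here function_returning_ptr is only a stand-in for the library call from the question, and read_short_as_int is a hypothetical helper name; the sketch assumes the pointer really does point at a short.

```cpp
#include <cassert>

// Hypothetical stand-in for the library call: returns a void* that
// really points at a short.
void* function_returning_ptr() {
    static short value = 1234;
    return &value;
}

// Hypothetical helper: reads the short behind `p` and widens it to int.
int read_short_as_int(void* p) {
    assert(p != nullptr);
    // Cast back to the real pointee type first; the short -> int
    // widening then happens on the value, not on the pointer.
    return *static_cast<short*>(p);
}
```

static_cast is enough here because converting void* back to the original object pointer type is a standard conversion in that direction.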

C++ Vector data access

I've got an array of bytes, declared like so:
typedef unsigned char byte;
vector<byte> myBytes = {255, 0 , 76 ...} //individual bytes no larger in value than 255
The problem I have is I need to access the raw data of the vector (without any copying of course), but I need to assign an arbitrary amount of bits to any given pointer to an element.
In other words, I need to assign, say an unsigned int to a certain position in the vector.
So given the example above, I am looking to do something like below:
myBytes[0] = static_cast<unsigned int>(76535); //assign n-bit (here 32-bit) value to any index in the vector
So that the vector data would now look like:
{2, 247, 42, 1} //raw representation of a 32-bit int (76535)
Is this possible? I kind of need to use a vector and am just wondering whether the raw data can be accessed in this way, or does how the vector stores raw data make this impossible or worse - unsafe?
Thanks in advance!
EDIT
I didn't want to add complication, but I'm constructing variously sized integer as follows:
//**N_TYPES
u16& VMTypes::u8sto16(u8& first, u8& last) {
return *new u16((first << 8) | last & 0xffff);
}
u8* VMTypes::u16to8s(u16& orig) {
u8 first = (u8)orig;
u8 last = (u8)(orig >> 8);
return new u8[2]{ first, last };
}
What's terrible about this is that I'm not sure of the endianness of the numbers generated. But I know that I am constructing and destructing them the same way everywhere (I'm writing a stack machine), so if I'm not mistaken, endianness is not affected by what I'm trying to do.
EDIT 2
I am constructing ints in the following horrible way:
u32 a = 76535;
u16* b = VMTypes::u32to16s(a);
u8 aa[4] = { VMTypes::u16to8s(b[0])[0], VMTypes::u16to8s(b[0])[1], VMTypes::u16to8s(b[1])[0], VMTypes::u16to8s(b[1])[1] };
Could this then work?:
memcpy(&_stack[0], aa, sizeof(u32));
Yes, it is possible. You take the starting address by &myVector[n] and memcpy your int to that location. Make sure that you stay in the bounds of your vector.
The other way around works too. Take the location and memcpy out of it to your int.
As suggested: by using memcpy you will copy the byte representation of your integer into the vector. That byte representation or byte order may be different from your expectation. Keywords are big and little endian.
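A minimal sketch of the memcpy approach in both directions, with the bounds check mentioned above. The names store_u32 and load_u32 are assumptions for illustration; the bytes land in the host's native order, so the layout in the vector differs between little-endian and big-endian machines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies the native byte representation of
// `value` into `bytes`, starting at index `pos`.
void store_u32(std::vector<unsigned char>& bytes, std::size_t pos,
               std::uint32_t value) {
    // Stay inside the bounds of the vector.
    assert(pos <= bytes.size() && bytes.size() - pos >= sizeof(value));
    std::memcpy(bytes.data() + pos, &value, sizeof(value));
}

// Hypothetical helper: the other way around, copies 4 bytes starting
// at index `pos` back into an integer.
std::uint32_t load_u32(const std::vector<unsigned char>& bytes,
                       std::size_t pos) {
    assert(pos <= bytes.size() &&
           bytes.size() - pos >= sizeof(std::uint32_t));
    std::uint32_t value = 0;
    std::memcpy(&value, bytes.data() + pos, sizeof(value));
    return value;
}
```

A round trip such as store_u32(v, 2, 76535) followed by load_u32(v, 2) returns 76535 on any host, because both directions use the same byte order.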
As knivil says, memcpy will work if you know the endianness of your system. However, if you want to be safe, you can do this with bitwise arithmetic:
unsigned int myInt = 76535;
const int ratio = sizeof(int) / sizeof(byte);
for(int b = 0; b < ratio; b++)
{
// Most significant byte first; the shift count must stay below the width of myInt.
myBytes[b] = byte(myInt >> (8*sizeof(byte)*(ratio - 1 - b)));
}
The int can be read out of the vector using a similar pattern; let me know if you want me to show you how.
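One way the shift-based pattern could look in both directions, as a sketch under stated assumptions: the value is a fixed-width std::uint32_t, bytes are 8 bits wide, and the most significant byte is stored first. The names put_u32_be and get_u32_be are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: writes `value` into bytes[pos..pos+3],
// most significant byte first, regardless of host endianness.
void put_u32_be(std::vector<unsigned char>& bytes, std::size_t pos,
                std::uint32_t value) {
    assert(pos <= bytes.size() && bytes.size() - pos >= 4);
    for (std::size_t b = 0; b < 4; ++b) {
        // Shift counts are 24, 16, 8, 0: always below 32.
        bytes[pos + b] =
            static_cast<unsigned char>((value >> (8 * (3 - b))) & 0xFFu);
    }
}

// Hypothetical helper: reads the 4 bytes back in the same order.
std::uint32_t get_u32_be(const std::vector<unsigned char>& bytes,
                         std::size_t pos) {
    assert(pos <= bytes.size() && bytes.size() - pos >= 4);
    std::uint32_t value = 0;
    for (std::size_t b = 0; b < 4; ++b) {
        value = (value << 8) | bytes[pos + b];
    }
    return value;
}
```

For 76535 (0x00012AF7) this stores the bytes 0, 1, 42, 247 on every platform, which is the point of preferring shifts over memcpy when the layout matters.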

When to use unsigned char pointer

What is the use of unsigned char pointers? I have seen in many places that a pointer is cast to a pointer to unsigned char. Why do we do so?
We receive a pointer to int and then cast it to unsigned char*. But if we try to print an element of that array using cout, it does not print anything. Why? I do not understand. I am new to C++.
EDIT Sample Code Below
int Stash::add(void* element)
{
if(next >= quantity)
// Enough space left?
inflate(increment);
// Copy element into storage, starting at next empty space:
int startBytes = next * size;
unsigned char* e = (unsigned char*)element;
for(int i = 0; i < size; i++)
storage[startBytes + i] = e[i];
next++;
return(next - 1); // Index number
}
You are actually looking for pointer arithmetic:
unsigned char* bytes = (unsigned char*)ptr;
for(int i = 0; i < size; i++)
// work with bytes[i]
In this example, bytes[i] is equal to *(bytes + i) and it is used to access the memory at the address bytes + (i * sizeof(*bytes)). In other words: if you have int* intPtr and you access intPtr[1], you are actually accessing the integer stored at bytes 4 to 7 (assuming a 4-byte int):
0 1 2 3
4 5 6 7 <--
The size of type your pointer points to affects where it points after it is incremented / decremented. So if you want to iterate your data byte by byte, you need to have a pointer to type of size 1 byte (that's why unsigned char*).
unsigned char is usually used for holding binary data where 0 is valid value and still part of your data. While working with "naked" unsigned char* you'll probably have to hold the length of your buffer.
char is usually used for holding characters representing a string, and 0 is equal to '\0' (the terminating character). If your buffer of characters is always terminated with '\0', you don't need to know its length because the terminating character exactly specifies the end of your data.
Note that in both of these cases it's better to use some object that hides the internal representation of your data and will take care of memory management for you (see RAII idiom). So it's much better idea to use either std::vector<unsigned char> (for binary data) or std::string (for string).
In C, unsigned char is the only type guaranteed to have no trapping values, and which guarantees copying will result in an exact bitwise image. (C++ extends this guarantee to char as well.) For this reason, it is traditionally used for "raw memory" (e.g. the semantics of memcpy are defined in terms of unsigned char).
In addition, unsigned integral types in general are used when bitwise operations (&, |, >> etc.) are going to be used. unsigned char is the smallest unsigned integral type, and may be used when manipulating arrays of small values on which bitwise operations are used. Occasionally, it's also used because one needs the modulo behavior in case of overflow, although this is more frequent with larger types (e.g. when calculating a hash value). Both of these reasons apply to unsigned types in general; unsigned char will normally only be used for them when there is a need to reduce memory use.
The unsigned char type is usually used as a representation of a single byte of binary data. Thus, an array of unsigned char is often used as a binary data buffer, where each element is a single byte.
The unsigned char* construct will be a pointer to the binary data buffer (or its 1st element).
The C++ standard does not fix the size of unsigned char at 8 bits: sizeof(unsigned char) is 1 by definition, and a byte has CHAR_BIT bits, which must be at least 8. Usually it is exactly 8.
After seeing your code
When you use something like void* input as a parameter of a function, you deliberately strip away information about the input's original type. This is a very strong suggestion that the input will be treated in a very general manner, i.e. as an arbitrary string of bytes. int* input, on the other hand, would suggest it will be treated as a "string" of signed integers.
void* is mostly used in cases when the input gets encoded, or treated bit/byte-wise for whatever reason, since you cannot draw conclusions about its contents.
Then in your function you seem to want to treat the input as a string of bytes. But to operate on objects, e.g. performing operator= (assignment), the compiler needs to know what to do. Since you declare input as void*, an assignment such as *input = something would make no sense because *input is of void type. To make the compiler treat the input elements as the "smallest raw memory pieces" you cast it to the appropriate pointer type, which is unsigned char*.
The cout probably did not work because of a wrong or unintended type conversion. char* is considered a null-terminated string and it is easy to confuse signed and unsigned versions in code. If you pass unsigned char* to ostream::operator<< it is treated like a char*: the stream expects the bytes to be normal ASCII characters, where 0 marks the end of the string rather than an integer value of 0. When you want to print the contents of memory it is best to explicitly cast the values you print.
Also note that to print the memory contents of a buffer you would need to use a loop, since otherwise the printing function would not know when to stop.
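A small sketch of such a loop. The helper name dump_bytes and the decimal, space-separated format are assumptions for illustration; the key points are the explicit length and the cast to an integer type before streaming.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: formats `size` bytes starting at `data` as
// space-separated decimal numbers.
std::string dump_bytes(const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::ostringstream out;
    for (std::size_t i = 0; i < size; ++i) {
        if (i != 0) out << ' ';
        // Cast to an integer type so the stream prints a number rather
        // than the character with that code (or nothing visible for 0).
        out << static_cast<unsigned>(bytes[i]);
    }
    return out.str();
}
```

Writing std::cout << dump_bytes(&x, sizeof(x)) would then show every byte of x, including zero bytes that would otherwise end the output early.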
Unsigned char pointers are useful when you want to access the data byte by byte. For example, a function that copies data from one area to another could need this:
void memcpy (unsigned char* dest, unsigned char* source, unsigned count)
{
for (unsigned i = 0; i < count; i++)
dest[i] = source[i];
}
It also has to do with the fact that the byte is the smallest addressable unit of memory. If you want to read anything smaller than a byte from memory, you need to get the byte that contains that information, and then select the information using bit operations.
You could very well copy the data in the above function using an int pointer, but that would copy chunks of sizeof(int) bytes (typically 4), which may not be the correct behavior in some situations.
As for why nothing appears on the screen when you try to use cout, the most likely explanation is that the data starts with a zero character, which in C++ marks the end of a string of characters.

C++ How to directly access memory

Say I have manually allocated a large portion of memory in C++, say 10 MB.
Say for the heck of it I want to store a few bits around the middle of this region.
How would I get at the memory at that location?
The only way I know of accessing raw memory is using array notation.
And array notation works well for that, as the allocated memory can be seen as a large array.
// Set the byte in the middle to `123`
((char *) memory_ptr)[5 * 1024 * 1024] = 123;
I typecast to a char pointer in case the pointer is of another type. If it's already a char pointer then the typecast isn't needed.
If you only want to set a single bit, see the memory as a giant bit field with 80 million separate bits. To find the bit you want, say bit number 40000000, you must first find the byte it's in and then the bit. This is done with normal division (to find the char) and modulo (to find the bit):
int wanted_bit = 40000000;
int char_index = wanted_bit / 8; // 8 bits to a byte
int bit_number = wanted_bit % 8;
((char *) memory_ptr)[char_index] |= 1 << bit_number; // Set the bit
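The same division and modulo steps can be wrapped in small helpers, sketched below. The names set_bit, clear_bit and test_bit are hypothetical, the buffer is taken as unsigned char* to avoid sign issues with the top bit, and 8 bits per byte is assumed as in the answer above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers treating a raw buffer as one long bit field.
// Bit 0 is the least significant bit of byte 0.
void set_bit(unsigned char* memory, std::size_t bit) {
    assert(memory != nullptr);
    memory[bit / 8] |= static_cast<unsigned char>(1u << (bit % 8));
}

void clear_bit(unsigned char* memory, std::size_t bit) {
    assert(memory != nullptr);
    memory[bit / 8] &= static_cast<unsigned char>(~(1u << (bit % 8)));
}

bool test_bit(const unsigned char* memory, std::size_t bit) {
    assert(memory != nullptr);
    return ((memory[bit / 8] >> (bit % 8)) & 1u) != 0;
}
```

The caller is responsible for keeping bit / 8 inside the allocated block; these helpers do not know the buffer size.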
Array notation is just another way of writing pointers. You can use that, or use pointers directly like so:
char *the_memory_block = // your allocated block.
char b = *(the_memory_block + 10); // get the 11th byte, *-operator is a dereference.
*(the_memory_block + 20) = b; // set the 21st byte to b, same operator.
memcpy, memset, memmove, memcmp and others may also be very useful, like this:
char *the_memory_block = // your allocated block.
memcpy(the_memory_block + 20, the_memory_block + 10, 1);
Of course this code is also the same:
char *the_memory_block = // your allocated block.
char b = the_memory_block[10];
the_memory_block[20] = b;
And so is this:
char *the_memory_block = // your allocated block.
memcpy(&the_memory_block[20], &the_memory_block[10], 1);
Also, one is not safer than the other; they are completely equivalent.
I think the array notation would be your answer... You can use the bitshift operators << and >> with AND and OR bitmasks to access specific bits.
You can use array notation, or you can use pointer arithmetic:
char* buffer = new char[1024 * 1024 * 10];
// copy 3 bytes to the middle of the memory region using pointer arithmetic
//
std::memcpy(buffer + (1024 * 1024 * 5), "XXX", 3);
In C/C++, an array is converted to a pointer to its first element in most expressions.
So, an array name used in an expression usually yields the address of its first element:
*pName is equivalent to pName[0]
And then:
*(pName+1) == pName[1];
*(pName+2) == pName[2];
And so on. Parentheses are used to avoid precedence issues. Never forget to use them.
After compilation, both ways will behave the same.
I do prefer brackets notation for readability.