Does endianness have an effect when copying bytes in memory? - c++

Am I right in thinking that endianess is only relevant when we're talking about how to store a value and not relevant when copying memory?
For example
if I have a value 0xf2fe0000 and store it on a little endian system - the bytes get stored in the order 00, 00, fe and f2. But on a big endian system the bytes get stored f2, fe, 00 and 00.
Now - if I simply want to copy these 4 bytes to another 4 bytes (on the same system), on a little endian system am I going to end up with another 4 bytes containing 00, 00, fe and f2 in that order?
Or does endianness have an effect when copying these bytes in memory?

Endianness is only relevant in two scenarios
When manually inspecting a byte-dump of a multibyte object, you need to know if the bytes are ordered in little endian or big endian order to be able to correctly interpret the bytes.
When the program is communicating multibyte values with the outside world, e.g. over a network connection or a file. Then both parties need to agree on the endianness used in the communication and, if needed, convert between the internal and external byte orders.

Answering the question title.
Assume 'int' to be of 4 bytes
union{
unsigned int i;
char a[4];
};
// elsewhere
i = 0x12345678;
cout << a[0]; // output depends on endianness. This is relevant during porting code
// to different architectures
So, it is not about copying (alone)? It's about how you access?
It is also of significance while transferring raw bytes over a network!.
Here's the info on finding endianness programatically

memcpy doesn't know what it is copying. If it has to copy 43 61 74 00, it doesn't know whether it is copying 0x00746143 or 0x43617400 or a float or "Cat"

no when working on the same machine you don't have to worry about endianess, only when transferring binary data between little and big endian machines

Basically, you have to worry about endianess only when you need to transfer binary data between architectures which differ in endianess.
However, when you transfer binary data between architectures, you will also have to worry about other things, like the size of integer types, the format of floating numbers and other nasty headaches.

Yes, you are correct thinking that you should be endianness-aware when storing or communicating binary values outside your current "scope".
Generally you dont need to worry as long as everything is just inside your own program.
If you copy memory, have in mind what you are copying. (You could get in trouble if you store long values and read ints).
Have a look at htonl(3) or books about network programming for some good explanations.

Memcpy just copies bytes and doesn't care about endianness.
So if you want to copy one network stream to another use memcpy.

Related

Is it normal memcpy overwrites data it just wrote?

I use memcpy() to write data to a device, with a logic analyzer/PCIe analyzer, I can see the actual stores.
My device gets more stores than expected.
For example,
auto *data = new uint8_t[1024]();
for (int i=0; i<50; i++){
memcpy((void *)(addr), data, i);
}
For i=9, I see these stores:
4B from byte 0 to 3
4B from byte 4 to 7
3B from byte 5 to 7
1B-aligned only, re-writing the same data -> inefficient and useless store
1B the byte 8
In the end, all the 9 Bytes are written but memcpy creates an extra store of 3B re-writing what it has already written and nothing more.
Is it the expected behavior? The question is for C and C++, I'm interested in knowing why this happens, it seems very inefficient.
Is it the expected behavior?
The expected behavior is that it can do anything it feels like (including writing past the end, especially in a "read 8 bytes into a register, modify the first byte in the register, then write 8 bytes" way) as long as the result works as if the rules for the C abstract machine were followed.
Using a logic analyzer/PCIe analyzer to see the actual stores is so far beyond the scope of "works as if the rules for the abstraction machine were followed" that it's unreasonable to have any expectations.
Specifically; you can't assume the writes will happen in any specific order, can't assume anything about the size of any individual write, can't assume writes won't overlap, can't assume there won't be writes past the end of the area, can't assume writes will actually occur at all (without volatile), and can't even assume that CHAR_BIT isn't larger than 8 (or that memcpy(dest, source, 10); isn't asking to write 20 octets/"8 bit bytes").
If you need guarantees about writes, then you need to enforce those guarantees yourself (e.g. maybe create a structure of volatile fields to force the compiler to ensure writes happen in a specific order, maybe use inline assembly with explicit fences/barriers, etc).
The following illustrates why memcpy may be implemented this way.
To copy 9 bytes, starting at a 4-byte aligned address, memcpy issues these instructions (described as pseudo code):
Load four bytes from source+0 and store four bytes to destination+0.
Load four bytes from source+4 and store four bytes to destination+4.
Load four bytes from source+5 and store four bytes to destination+5.
The processor implements the store instructions with these data transfer in the hardware:
Since destination+0 is aligned, store 4 bytes to destination+0.
Since destination+4 is aligned, store 4 bytes to destination+4.
Since destination+5 is not aligned, store 3 bytes to destination+3 and store 1 byte to destination+8.
This is an easy and efficient way to write memcpy:
If length is less than four bytes, jump to separate code for that.
Loop copying four bytes until fewer than four bytes are left.
if length is not a multiple of four, copy four bytes from source+length−4 to destination+length−4.
That single step to copy the last few bytes may be more efficient than branching to three different cases with various cases.

Does endianness affect writing an odd number of bytes?

Imagine you had a uint64_t bytes and you know that you only need 7 bytes because the integers you store will not exceed the limit of 7 bytes.
When writing a file you could do something like
std::ofstream fout(fileName);
fout.write((char *)&bytes, 7);
to only write 7 bytes.
The question I'm trying to figure out is whether endianess of a system affects the bytes that are written to the file. I know that endianess affects the order in which the bytes are written, but does it also affect which bytes are written? (Only for the case when you write less bytes than the integer usually has.)
For example, on a little endian system the first 7 bytes are written to the file, starting with the LSB. On a big endian system what is written to the file?
Or to put it differently, on a little endian system the MSB(the 8th byte) is not written to the file. Can we expect the same behavior on a big endian system?
Endianess affects only the way (16, 32, 64) int are written. If you are writing bytes, (as it is your case) they will be written in the exact same order you are doing it.
For example, this kind of writing will be affected by endianess:
std::ofstream fout(fileName);
int i = 67;
fout.write((char *)&i, sizeof(int));
uint64_t bytes = ...;
fout.write((char *)&bytes, 7);
This will write exactly 7 bytes starting from the address of &bytes. There is a difference between LE and BE systems how the eight bytes in memory are laid out, though (let's assume the variable is located at address 0xff00):
0xff00 0xff01 0xff02 0xff03 0xff04 0xff05 0xff06 0xff07
LE: [byte 0 (LSB!)][byte 1][byte 2][byte 3][byte 4][byte 5][byte 6][byte 7 (MSB)]
BE: [byte 7 (MSB!)][byte 6][byte 5][byte 4][byte 3][byte 2][byte 1][byte 0 (LSB)]
Starting address (0xff00) won't change if casting to char*, and you'll print out the byte at exactly this address plus the next six following ones – in both cases (LE and BE), address 0xff07 won't be printed. Now if you look at my memory table above, it should be obvious that on BE system, you lose the LSB while storing the MSB, which does not carry information...
On a BE-System, you could instead write fout.write((char *)&bytes + 1, 7);. Be aware, though, that this yet leaves a portability issue:
fout.write((char *)&bytes + isBE(), 7);
// ^ giving true/false, i. e. 1 or 0
// (such function/test existing is an assumption!)
This way, data written by a BE-System would be misinterpreted by a LE-system, when read back, and vice versa. Safe version would be decomposing each single byte as geza did in his answer. To avoid multiple system calls, you might decompose the values into an array instead and print out that one.
If on linux/BSD, there's a nice alternative, too:
bytes = htole64(bytes); // will likely result in a no-op on LE system...
fout.write((char *)&bytes, 7);
The question I'm trying to figure out is whether endianess of a system affects the bytes that are written to the file.
Yes, it affects the bytes are written to the file.
For example, on a little endian system the first 7 bytes are written to the file, starting with the LSB. On a big endian system what is written to the file?
The first 7 bytes are written to the file. But this time, starting with the MSB. So, in the end, the lowest byte is not written in the file, because on big endian systems, the last byte is the lowest byte.
So, this is not what you've wanted, because you lose information.
A simple solution is to convert uint64_t to little endian, and write the converted value. Or just write the value byte-by-byte in a way that a little endian system would write it:
uint64_t x = ...;
write_byte(uint8_t(x));
write_byte(uint8_t(x>>8));
write_byte(uint8_t(x>>16));
// you get the idea how to write the remaining bytes

endianness influence in C++ code

I know that this might be a silly question, but I am a newbie C++ developer and I need some clarifications about the endianness.
I have to implement a communication interface that relies on SCTP protocol in order to communicate between two different machines (one ARM based, and the other Intel based).
The aim is to:
encode messages into a stream of bytes to be sent on the socket (I used a vector of uint8_t, and positioned each byte of the different fields -taking care of splitting uint16/32/64 to single bytes- following big-endian convention)
send the bytestream via socket to the receiver (using stcp)
retrieve the stream and parse it in order to fill the message object with the correct elements (represented by header + TV information elements)
I am confused on where I could have problem with the endianness of the underlying architecture of the 2 machines in where the interface will be used.
I think that taking care of splitting objects into single bytes and positioning them using big-endian can preclude that, at the arrival, the stream is represented differently, right? or am I missing something?
Also, I am in doubt about the role of C++ representation of multiple-byte variables, for example:
uint16_t var=0x0123;
//low byte 0x23
uint8_t low = (uint8_t)var;
//hi byte 0x01
uint8_t hi = (uint8_t)(var >> 8);
This piece of code is endianness dependent or not? i.e. if I work on a big-endian machine I suppose that the above code is ok, but if it is little-endian, will I pick up the bytes in different order?
I've searched already for such questions but no one gave me a clear reply, so I have still doubts on this.
Thank you all in advance guys, have a nice day!
This piece of code is endianness dependent or not?
No the code doesn't depend on endianess of the target machine. Bitwise operations work the same way as e.g. mathematical operators do.
They are independent of the internal representation of the numbers.
Though if you're exchanging data over the wire, you need to have a defined byte order known at both sides. Usually that's network byte ordering (i.e. big endian).
The functions of the htonx() ntohx() family will help you do en-/decode the (multibyte) numbers correctly and transparently.
The code you presented is endian-independent, and likely the correct approach for your use case.
What won't work, and is not portable, is code that depends on the memory layout of objects:
// Don't do this!
uint16_t var=0x0123;
auto p = reinterpret_cast<char*>(&var);
uint8_t hi = p[0]; // 0x01 or 0x23 (probably!)
uint8_t lo = p[1]; // 0x23 or 0x01 (probably!)
(I've written probably in the comments to show that these are the likely real-world values, rather than anything specified by Standard C++)

Types bit length and architecture specific implementations

I'm doing stuff in C++ but lately I've found that there are slight differences regarding how much data a type can accomodate and also the byte order is an issue.
Suppose I got a binary file, where I've encoded shorts that are 2 bytes in size. The file is in binary format like:
FA C8 - data segment 1
BA 32 - data segment 2
53 56 - data segment 3
Now all is well up to this point. Now I want to read this data. There are 2 problems:
1 what data type to choose to store this values?
2 how to deal with endianness of the target architecture?
The first problem is actually related to the second because here I will have to do bit shifts in order to swap the order of bytes.
I know that I could read the file byte by byte and add every two bytes. But is there an approach that could ease that pain?
I'm sorry If I'm being ambiguous. The problem is hard to explain. Hope you get a glimpse of what I'm talking about. I just want to store this data internally.
So I would appreciate some advices or if you can share some of your experience in this topic.
If you use big endian on the file that stores the data then you could just rely on htons(), htonl(), ntohs(), ntohl() to convert the integers to the right endianess before saving or after reading.
There is no easy way to do this.
Rather than doing that yourself, you might want to look into serialization libraries (for example Protobuf or boost serialization), they'll take care of a lot of that for you.
If you want to do it yourself, use fixed-width types (uint32_t and the like from <cstdint>), and endian conversion functions as appropriate. Either have a "prefix" in your file that determines what endianness it contains (a BOM/Byte Order Mark), or always store in either big or little endian, and systematically convert.
Be extra careful if you need to serialize strings, they have encoding problems of their own too.

Storing hexadecimal addresses in a file

I have a pintool application which store the memory address accessed by an application in a file. These addresses are in hexadecimal form. If I write these addresses in form of string, it will take a huge amount of storage(nearly 300GB). Writing such a large file will also take large amount of time. So I think of an alternate way to reduce the amount of storage used.
Each character of hexadecimal address represent 4 bits and each ASCII character is of 8 bits. So I am thinking of representing two hexadecimal characters by one ASCII character.
For example :
if my hexadecimal address is 0x26234B
then corresponding converted ASCII address will be &#K (0x is ignored as I know all address will be hexadecimal).
I want to know that is there any other much more efficient method for doing this which takes less amount of storage.
NOTE : I am working in c++
This is a good start. If you really want to go further, you can consider compressing the data using something like a zip library or Huffman encoding.
Assuming your addresses are 64-bit pointers, and that such a representation is sensible for your platform, you can just store them as 64-bit ints. For example, you list 0x1234567890abcdef, which could be stored as the four bytes:
12 34 56 78 90 ab cd ef
(your pointer, stored in 8 bytes.)
or the same, but backwards, depending on what endianness you choose. Specifically, you should read this.
We can even do this somewhat platform-independently: uintptr_t is unsigned integer type the same width as a pointer (assuming one exists, which it usually does, but it's not a sure thing), and sizeof(our_pointer), which gives us the size in bytes of a pointer. We can arrive at the above bytes with:
Convert the pointer to an integer representation (i.e., 0x0026234b)
Shift the bytes around to pick out the one we want.
Stick it somewhere.
In code:
unsigned char buffer[sizeof(YourPointerType)];
for(unsigned int i = 0; i < sizeof(YourPointerType); ++i) {
buffer[i] = (
(reinterpret_cast<uintptr_t>(your_pointer) >> (sizeof(YourPointerType) - i - 1))
& 0xff
);
}
Some notes:
That'll do a >> 0 on the last loop iteration. I suspect that might be undefined behavior, and you'll need an if-case to handle it.
This will write out pointers of the size of your platform, and requires that they can be converted sensibly to integers. (I think uintptr_t won't exist if this isn't the case.) It won't do the same thing on 64- as it will on 32-bit platforms, as they have different pointer sizes. (Or any other pointer-sized platform you run across.)
A program's pointers aren't valid once the program dies, and might not even remain valid when the program is still running. (If the pointer points to memory that the program decides to free, then the pointer is invalid.)
There's likely a library that'll do this for you. (struct, in Python, does this.)
The above is a big-endian encoder. Alternatively, you can write out little endian — the Wikipedia article details the difference.
Last, you can just cast a pointer to the pointer to a unsigned char *, and write that. (I.e., dump the actual memory of the pointer to a file.) That's way more platform dependent though.
If you need even more space, I'd run it through gzip.