What's an AoB (Array of Bytes) - c++

I have encountered this term a couple of times now, and I have googled for explanations, but couldn't find any.
I'm accessing the memory of a running software-game. I do have an address but I'm also given an AoB, for example
89 8B ? ? 00 00 8B 50 ? 89 93 ? ?.
What do I do with it?
I'd appreciate it if you could give me a guide or something.
Thanks

An array of bytes is best explained in C/C++ as an array of [unsigned] char.
The values you see are only the hexadecimal representations of these bytes (unsigned chars).
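As for what to do with a pattern like the one in the question: a minimal sketch, assuming (the question doesn't confirm it) that each ? stands for "any byte" and that you have already copied a region of the target's memory into your own buffer. PatternByte and find_pattern are names made up for this example; reading another process's memory is platform specific and not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One element of the pattern: a concrete byte value, or a wildcard ("?").
struct PatternByte {
    std::uint8_t value;
    bool wildcard;
};

// Returns the offset of the first place where `pattern` matches `data`,
// or -1 if it never matches. Wildcard elements match any byte.
std::ptrdiff_t find_pattern(const std::vector<std::uint8_t>& data,
                            const std::vector<PatternByte>& pattern) {
    assert(!pattern.empty());  // an empty pattern is treated as a caller bug here
    if (pattern.size() > data.size()) return -1;
    for (std::size_t start = 0; start + pattern.size() <= data.size(); ++start) {
        std::size_t j = 0;
        while (j < pattern.size() &&
               (pattern[j].wildcard || pattern[j].value == data[start + j])) {
            ++j;
        }
        if (j == pattern.size()) return static_cast<std::ptrdiff_t>(start);
    }
    return -1;
}
```

The returned offset, added to the base address the buffer was read from, would give the address of the matching bytes in the target.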

An array of bytes is a contiguous series of values, usually in the range 0 to 255 (0x00 to 0xFF).
The contents must be interpreted by the programmer and can be anything from addresses to pixels for a bitmap.
A common use of an AoB, a.k.a. a buffer, is for I/O: reading and writing data. The fundamental I/O routines do not care about content, just quantity, source and destination. A program may read large amounts of data into an AoB, then later cast it as some kind of structure or assign fields with data from the buffer. See also "serialization." This is a performance technique with I/O: convert many small reads into one large block read.
Not all data has to be in structures or objects; those are just a convenience.
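To illustrate "assign fields with data from the buffer", here is a small sketch. The Record layout (a 2-byte id followed by a 4-byte length, both in the host's native byte order) is invented for the example, and parse_record is a hypothetical helper, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record: bytes 0..1 hold the id, bytes 2..5 hold the length.
struct Record {
    std::uint16_t id;
    std::uint32_t length;
};

// Fills a Record from 6 bytes of `buffer` starting at `offset`.
// memcpy is used instead of a pointer cast so that alignment and
// struct padding cannot cause trouble.
Record parse_record(const std::vector<unsigned char>& buffer, std::size_t offset) {
    assert(offset + 6 <= buffer.size());
    Record r{};
    std::memcpy(&r.id, buffer.data() + offset, sizeof r.id);
    std::memcpy(&r.length, buffer.data() + offset + 2, sizeof r.length);
    return r;
}
```

Copying field by field keeps the on-disk layout independent of whatever padding the compiler puts inside the struct.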

Related

Why is the CAN BUS Frame ID backwards when reading from a Socket?

So I have a Raspberry Pi reading CAN data from a vehicle. If I use the candump program included in canutils I get a bunch of data; an example looks like:
can0 1C4 [8] 03 F3 26 08 00 00 7F 70
I then wrote a simple C++ app to open a socket to the can0 bus and read some data into a char buffer. If I loop through each character of the buffer after a read and convert each char to an int in hex format (and put a pipe in between each char) I get the following:
c4|1|0|0|8|0|0|0|3|f3|26|8|0|0|7f|70|
My question is, why are the ID bytes reversed when I read the data using a socket and a char buffer? This behavior is consistent across all CAN IDs. The data length code and the data are in the correct format/order, but the ID is backwards.
Thanks,
Adam
Congratulations, you just discovered endianness.
Endianness refers to the sequential order in which bytes are arranged into larger numerical values, when stored in computer memory or secondary storage, or when transmitted over digital links. Endianness is of interest in computer science because two conflicting and incompatible formats are in common use: words may be represented in big-endian or little-endian format, depending on whether bits or bytes or other components are ordered from the big end (most significant bit) or the little end (least significant bit).
As a convention, network (including bus) data is big endian. Your PC architecture is probably little endian.
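In general, you fix this kind of mismatch by passing your data to the ntoh*() functions, which reverse its byte order (if necessary) from network (n) to host (h) endianness. Note, however, that a SocketCAN socket delivers a struct can_frame whose can_id field is a 32-bit integer already in host byte order. On a little-endian machine the bytes c4 01 00 00 are simply 0x1C4 stored least significant byte first, so reading the buffer as a struct can_frame (rather than byte by byte) gives the ID without any swap.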
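To see why the dump starts with c4|1|0|0, here is a small sketch that assembles a 32-bit value from four bytes stored least significant byte first, which is how a little-endian machine lays out an integer in memory. load_le32 is a name chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// Builds a 32-bit value from 4 bytes stored least significant byte first.
// Works the same on any host, because it uses shifts instead of
// reinterpreting memory.
std::uint32_t load_le32(const unsigned char* bytes) {
    assert(bytes != nullptr);
    return static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Fed the first four bytes from the question (c4 01 00 00), this yields 0x1C4, the ID that candump prints.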

Types bit length and architecture specific implementations

I'm doing stuff in C++ but lately I've found that there are slight differences regarding how much data a type can accommodate, and the byte order is also an issue.
Suppose I got a binary file, where I've encoded shorts that are 2 bytes in size. The file is in binary format like:
FA C8 - data segment 1
BA 32 - data segment 2
53 56 - data segment 3
Now all is well up to this point. Now I want to read this data. There are 2 problems:
1. What data type should I choose to store these values?
2. How do I deal with the endianness of the target architecture?
The first problem is actually related to the second because here I will have to do bit shifts in order to swap the order of bytes.
I know that I could read the file byte by byte and add every two bytes. But is there an approach that could ease that pain?
I'm sorry if I'm being ambiguous. The problem is hard to explain. Hope you get a glimpse of what I'm talking about. I just want to store this data internally.
So I would appreciate some advice, or if you could share some of your experience on this topic.
If you use big endian in the file that stores the data, then you could just rely on htons(), htonl(), ntohs(), ntohl() to convert the integers to the right endianness before saving or after reading.
There is no easy way to do this.
Rather than doing that yourself, you might want to look into serialization libraries (for example Protobuf or boost serialization), they'll take care of a lot of that for you.
If you want to do it yourself, use fixed-width types (uint32_t and the like from <cstdint>), and endian conversion functions as appropriate. Either have a "prefix" in your file that determines what endianness it contains (a BOM/Byte Order Mark), or always store in either big or little endian, and systematically convert.
Be extra careful if you need to serialize strings, they have encoding problems of their own too.
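If you do roll it yourself, a sketch of the "fixed-width types plus systematic conversion" approach might look like this. It assumes the file stores each short most significant byte first (big endian), which the question does not actually state; decode_be16 is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes consecutive 2-byte big-endian values into uint16_t.
// The shifts give the same result on little- and big-endian hosts.
std::vector<std::uint16_t> decode_be16(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() % 2 == 0);  // sketch: require whole 2-byte values
    std::vector<std::uint16_t> out;
    out.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i < bytes.size(); i += 2) {
        out.push_back(static_cast<std::uint16_t>((bytes[i] << 8) | bytes[i + 1]));
    }
    return out;
}
```

With the bytes from the question (FA C8, BA 32, 53 56) this gives 0xFAC8, 0xBA32 and 0x5356. If the values are meant to be signed, converting each result to int16_t afterwards is one option.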

Storing hexadecimal addresses in a file

I have a pintool application which stores the memory addresses accessed by an application in a file. These addresses are in hexadecimal form. If I write these addresses as strings, it will take a huge amount of storage (nearly 300GB). Writing such a large file will also take a large amount of time. So I am looking for an alternative way to reduce the amount of storage used.
Each character of hexadecimal address represent 4 bits and each ASCII character is of 8 bits. So I am thinking of representing two hexadecimal characters by one ASCII character.
For example :
if my hexadecimal address is 0x26234B
then corresponding converted ASCII address will be &#K (0x is ignored as I know all address will be hexadecimal).
I want to know whether there is a more efficient method for doing this that takes less storage.
NOTE : I am working in c++
This is a good start. If you really want to go further, you can consider compressing the data using something like a zip library or Huffman encoding.
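The packing idea from the question (two hex digits per byte) can be sketched as below. It assumes the input has no "0x" prefix, has an even number of digits, and uses an ASCII-compatible character set; pack_hex and hex_value are names invented for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Converts one hex digit ('0'-'9', 'a'-'f', 'A'-'F') to its value 0..15.
static unsigned hex_value(char c) {
    assert(std::isxdigit(static_cast<unsigned char>(c)));
    if (c >= '0' && c <= '9') return static_cast<unsigned>(c - '0');
    return static_cast<unsigned>(std::tolower(static_cast<unsigned char>(c)) - 'a') + 10u;
}

// Packs a hex string such as "26234B" into raw bytes, two digits per byte.
std::vector<unsigned char> pack_hex(const std::string& hex) {
    assert(hex.size() % 2 == 0);
    std::vector<unsigned char> out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        out.push_back(static_cast<unsigned char>(
            (hex_value(hex[i]) << 4) | hex_value(hex[i + 1])));
    }
    return out;
}
```

For "26234B" the result is the three bytes 26 23 4B, which viewed as ASCII is the &#K from the question. The output should be written to a file opened in binary mode, since the bytes can take any value, including 0 and newline.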
Assuming your addresses are 64-bit pointers, and that such a representation is sensible for your platform, you can just store them as 64-bit ints. For example, an address such as 0x1234567890abcdef could be stored as the eight bytes:
12 34 56 78 90 ab cd ef
(your pointer, stored in 8 bytes.)
or the same, but backwards, depending on what endianness you choose. Specifically, you should read this.
We can even do this somewhat platform-independently: uintptr_t is an unsigned integer type the same width as a pointer (assuming one exists, which it usually does, but it's not a sure thing), and sizeof(our_pointer) gives us the size in bytes of a pointer. We can arrive at the above bytes with:
Convert the pointer to an integer representation (i.e., 0x0026234b)
Shift the bytes around to pick out the one we want.
Stick it somewhere.
In code:
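unsigned char buffer[sizeof(YourPointerType)];
// Convert once; uintptr_t comes from <cstdint>.
const uintptr_t value = reinterpret_cast<uintptr_t>(your_pointer);
for (unsigned int i = 0; i < sizeof(YourPointerType); ++i) {
    // Shift by whole bytes (8 bits each); the most significant byte goes first.
    buffer[i] = static_cast<unsigned char>(
        (value >> (8 * (sizeof(YourPointerType) - i - 1))) & 0xff
    );
}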
Some notes:
That'll do a >> 0 on the last loop iteration. A shift by zero is well defined in C++ (it leaves the value unchanged), so no special case is needed.
This will write out pointers of the size of your platform, and requires that they can be converted sensibly to integers. (I think uintptr_t won't exist if this isn't the case.) It won't do the same thing on 64- as it will on 32-bit platforms, as they have different pointer sizes. (Or any other pointer-sized platform you run across.)
A program's pointers aren't valid once the program dies, and might not even remain valid when the program is still running. (If the pointer points to memory that the program decides to free, then the pointer is invalid.)
There's likely a library that'll do this for you. (struct, in Python, does this.)
The above is a big-endian encoder. Alternatively, you can write out little endian — the Wikipedia article details the difference.
Last, you can just cast a pointer to the pointer to an unsigned char *, and write that. (I.e., dump the actual memory of the pointer to a file.) That's way more platform dependent though.
If you need to save even more space, I'd run it through gzip.
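If the addresses are kept as plain 64-bit integers rather than pointers, the fixed-width big-endian encoding described above, together with the matching decoder, could be sketched like this. append_be64 and read_be64 are illustrative names, and always using 8 bytes per address is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends `value` to `out` as 8 bytes, most significant byte first.
void append_be64(std::vector<unsigned char>& out, std::uint64_t value) {
    for (int shift = 56; shift >= 0; shift -= 8) {
        out.push_back(static_cast<unsigned char>((value >> shift) & 0xFF));
    }
}

// Reads back the 8-byte big-endian value that starts at `offset`.
std::uint64_t read_be64(const std::vector<unsigned char>& in, std::size_t offset) {
    assert(offset + 8 <= in.size());
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        value = (value << 8) | in[offset + i];
    }
    return value;
}
```

Each address costs 8 bytes this way, compared with up to 16 characters (plus a separator) as hex text, and the file reads back identically on any host.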

AES-256-CBC encryption with OpenSSL doesn't return the same size of data, even when the input doesn't need padding?

I am trying to use OpenSSL AES to encrypt my data. I found a pretty nice example at this link: http://saju.net.in/code/misc/openssl_aes.c.txt
The question I still couldn't find an answer to is why it pads the data even though the input length is a multiple of the block size.
For example, it needs 16 bytes of input to encrypt, or any multiple of 16.
I gave it 1024 bytes, including the null terminator, and it still gives me an output of size 1040.
But as far as I know, for AES the input size equals the output size if the input is a multiple of 128 bits (16 bytes).
Has anyone tried this example before, or can anyone give me an idea?
Thanks in advance.
Most padding schemes require that some minimum amount of padding always be added. This is (at least primarily) so that on the receiving end, you can look at the last byte (or some small amount of data at the end) and know how much of the data at the end is padding, and how much is real data.
For example, a typical padding scheme puts zero bytes after the data with one byte at the end containing the number of bytes that are padding. For example, if you added 4 bytes of padding, the padding bytes (in hex) would be something like 00 00 00 04. Another common possibility puts that same value in all the padding bytes, so it would look like 04 04 04 04.
On the receiving end, the algorithm has to be ready to strip off the padding bytes. To do that, it looks at the last byte to tell it how many bytes of data to remove from the end and ignore. If there's no padding present, that's going to contain some value (whatever the last byte in the message happened to be). Since it has no way to know that no padding was added, it looks at that value, and removes that many bytes of data -- only in this case, it's removing actual data instead of padding.
Although it might be possible to devise a padding scheme that avoided adding extra data when/if the input happened to be an exact multiple of the block size, it's a lot simpler to just add at least one byte of padding to every message, so the receiver can count on always reading the last byte and finding how much of what it received is padding.
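Here is a small sketch of the second scheme described above (every padding byte holds the padding length, commonly known as PKCS#7 padding). It is not OpenSSL's code; pad and unpad are made-up helpers, and a real implementation would also validate every padding byte instead of trusting the last one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pads `data` up to a multiple of `block_size`. Always adds between 1 and
// `block_size` bytes, each holding the number of padding bytes added.
std::vector<unsigned char> pad(std::vector<unsigned char> data, std::size_t block_size) {
    assert(block_size > 0 && block_size < 256);
    const std::size_t pad_len = block_size - (data.size() % block_size);
    data.insert(data.end(), pad_len, static_cast<unsigned char>(pad_len));
    return data;
}

// Strips the padding by trusting the last byte. This only works because
// pad() guarantees that some padding is always present.
std::vector<unsigned char> unpad(std::vector<unsigned char> data) {
    assert(!data.empty());
    const std::size_t pad_len = data.back();
    assert(pad_len >= 1 && pad_len <= data.size());
    data.resize(data.size() - pad_len);
    return data;
}
```

With a 16-byte block, a 1024-byte input already sits on a block boundary, so pad() adds a full block of sixteen 0x10 bytes and the result is 1040 bytes, matching the size seen in the question.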

Does endianness have an effect when copying bytes in memory?

Am I right in thinking that endianness is only relevant when we're talking about how to store a value and not relevant when copying memory?
For example
if I have a value 0xf2fe0000 and store it on a little endian system - the bytes get stored in the order 00, 00, fe and f2. But on a big endian system the bytes get stored f2, fe, 00 and 00.
Now - if I simply want to copy these 4 bytes to another 4 bytes (on the same system), on a little endian system am I going to end up with another 4 bytes containing 00, 00, fe and f2 in that order?
Or does endianness have an effect when copying these bytes in memory?
Endianness is only relevant in two scenarios:
When manually inspecting a byte-dump of a multibyte object, you need to know if the bytes are ordered in little endian or big endian order to be able to correctly interpret the bytes.
When the program is communicating multibyte values with the outside world, e.g. over a network connection or a file. Then both parties need to agree on the endianness used in the communication and, if needed, convert between the internal and external byte orders.
Answering the question title.
Assume 'int' to be of 4 bytes
union {
    unsigned int i;
    unsigned char a[4];
};
// elsewhere
i = 0x12345678;
// Prints 78 on a little-endian machine and 12 on a big-endian one.
// This is relevant when porting code to different architectures.
cout << hex << static_cast<int>(a[0]);
So it is not about copying alone; it is about how you access the bytes.
It is also significant when transferring raw bytes over a network.
Here's the info on finding endianness programmatically
memcpy doesn't know what it is copying. If it has to copy 43 61 74 00, it doesn't know whether it is copying 0x00746143 or 0x43617400 or a float or "Cat"
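A short sketch of both points: a byte-for-byte copy keeps whatever order the bytes already have, and the host's byte order can be detected by looking at the first byte of a known integer. The function names are invented for this example.

```cpp
#include <cstdint>
#include <cstring>

// Returns the byte stored at the lowest address of `value`.
unsigned char first_byte(std::uint32_t value) {
    unsigned char b = 0;
    std::memcpy(&b, &value, 1);
    return b;
}

// True if this machine stores the least significant byte first.
bool is_little_endian() {
    return first_byte(0x00000001u) == 0x01;
}

// Copies a 32-bit value byte for byte. memcpy never reorders bytes, so the
// copy has the same in-memory layout, and therefore the same value, as the
// source on any architecture.
std::uint32_t copy_bytes(std::uint32_t source) {
    std::uint32_t destination = 0;
    std::memcpy(&destination, &source, sizeof destination);
    return destination;
}
```

For the value 0xf2fe0000 from the question, first_byte() returns 00 on a little-endian machine and f2 on a big-endian one, and the copy returned by copy_bytes() has exactly the same first byte as the original.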
No. When working on the same machine you don't have to worry about endianness, only when transferring binary data between little- and big-endian machines.
Basically, you have to worry about endianness only when you need to transfer binary data between architectures which differ in endianness.
However, when you transfer binary data between architectures, you will also have to worry about other things, like the size of integer types, the format of floating numbers and other nasty headaches.
Yes, you are correct thinking that you should be endianness-aware when storing or communicating binary values outside your current "scope".
Generally you don't need to worry as long as everything stays inside your own program.
If you copy memory, keep in mind what you are copying. (You could get into trouble if you store long values and read ints.)
Have a look at htonl(3) or books about network programming for some good explanations.
Memcpy just copies bytes and doesn't care about endianness.
So if you want to copy one network stream to another use memcpy.