Difference between byte flip and byte swap - c++

I am trying to understand the byte flip functionality I see in the Calculator app on Mac in Programmer's view.
So I wrote a program that byte swaps a value, which is what we do to go from little to big endian or the other way round, and I call that a byte swap. But when I see byte flip I do not understand what exactly it is and how it is different from byte swap. I did confirm that the results are different.
For example, for an int with value 12976128
Byte Flip gives me 198;
Byte swap gives me 50688.
I want to implement an algorithm for byte flip, since 198 is the value I want to get when reading the data. Everything I find on Google says byte flip is just another name for byte swap, which isn't the case for me.

Byte flip and byte swap are synonyms.
The results you see are just two different ways of swapping the bytes, depending on whether you look at the number as a 32-bit number (consisting of 4 bytes), or as the smallest size of number that can hold 12976128, which is 24 bits or 3 bytes.
The 4-byte swap is more usual in computer culture, because 32-bit processors are currently predominant (even 64-bit architectures still do most of their mathematics in 32-bit numbers, partly because of backward-compatible software infrastructure, partly because it is enough for many practical purposes). But the Mac Calculator seems to use the minimum-width swap, in this case a 3-byte swap.
12976128, when converted to hexadecimal, gives you 0xC60000. That's 3 bytes total; each hexadecimal digit is 4 bits, or half a byte, wide. The bytes to be swapped are 0xC6, zero, and another zero.
After the 3-byte swap: 0x0000C6 = 198
After the 4-byte swap: 0x0000C600 = 50688
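If you want to reproduce both results in C++, here is a minimal sketch (my own helper names, not a library API), assuming the value fits in 32 bits. swap32 reverses all four bytes, while swap_minimum_width reverses only as many bytes as the value actually needs, which is what the Calculator's byte flip appears to do:
#include <cstdint>
#include <iostream>

// Reverse all four bytes of a 32-bit value (the usual byte swap).
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

// Reverse only the minimum number of bytes needed to hold the value.
uint32_t swap_minimum_width(uint32_t v)
{
    int bytes = 1;                       // a value always occupies at least one byte
    for (uint32_t t = v >> 8; t != 0; t >>= 8)
        ++bytes;

    uint32_t result = 0;
    for (int i = 0; i < bytes; ++i)
        result = (result << 8) | ((v >> (8 * i)) & 0xFFu);
    return result;
}

int main()
{
    uint32_t x = 12976128;                      // 0x00C60000
    std::cout << swap32(x) << '\n';             // 50688 (0x0000C600)
    std::cout << swap_minimum_width(x) << '\n'; // 198   (0x000000C6)
}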

Related

Is the msb of hex representation of binary the left side or right side?

In an answer to a bit-padding-related question here, the respondent made the following statement:
A uint32 value of 69 is an integer value; if you want to pad it to 32 bytes (the EVM-native length), then you have to consider the "endianness" (read more here):
* Big-endian: 0x0000.....0045 (32 bytes)
* Little-endian: 0x4500.....0000 (32 bytes)
Apologies if this is a trite question, but what determines that
in the case of big-endian the most significant bit is on the left side, hence the padding goes on the left, but
in the case of little-endian, the most significant bit is on the right side, hence the padding goes on the right?
Because my understanding is that with big-endian the most significant bits get read first into the least significant memory address but in little-endian, the least significant bits get read first into the least significant memory address.
The above padding scheme seems to suggest that, when represented in hex, the left side gets read in first with big-endian, but with little-endian the right side gets read in first.
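For concreteness, here is a small sketch (illustrative only) that prints the bytes of a uint32_t from the lowest address to the highest. The hex literal we write always has its most significant digit on the left; what changes between big- and little-endian is which byte ends up at the lowest address:
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main()
{
    uint32_t value = 69;                        // written in hex as 0x00000045

    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);   // copy the object representation

    for (std::size_t i = 0; i < sizeof value; ++i)  // lowest address first
        std::printf("%02X ", bytes[i]);
    std::printf("\n");
    // Little-endian machine (e.g. x86): 45 00 00 00
    // Big-endian machine:               00 00 00 45
}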

How can a program on a 16-bit system access integers larger than 65535, but not addresses?

A 16-bit system can (normally) only access up to 64 KB of RAM. The idea is that with 16-bit memory addresses the system can form 2^16 numbers, so with unsigned integers it can only represent 2^16 = 65536 values (0 to 65535); by the same small calculation, a 16-bit system can only use addresses up to 64 KB. Now the main question: when we define an integer as 'long int', how can it hold integers larger than 65535?
There are a bunch of misconceptions in this post:
I came to know in previous days that a 16 bit system can only access RAM upto 64kbytes
This is factually wrong: the 8086 has an external address bus of 20 bits, so it can access 1,048,576 bytes (~1 MB). You can read more about the 8086 architecture here: https://en.wikipedia.org/wiki/Intel_8086.
When we define an integer to be 'long int', how can it access integers more than 65535?
Are you asking about register size? In that case the answer is easy: it doesn't. It can access the first 16 bits, and then it can access the other 16 bits, and whatever the application does with those two 16-bit values is up to it (and the framework used, like the C runtime).
As to how you can access the full 20-bit address space with just 16-bit integers, the answer is address segmentation. You have a second register (CS, DS, SS, and ES on the 8086) that stores the high part of the address, and the CPU "stitches" them together to send to the memory controller.
Computers can perform arithmetic on values larger than a machine word in much the same way as humans can perform arithmetic on values larger than a digit: by splitting operations into multiple parts, and keeping track of "carries" that would move data between them.
On the 8086, for example, if AX holds the bottom half of a 32-bit number and DX holds the top half, the sequence:
ADD AX,[someValue]
ADC DX,[someValue+2]
will add to DX::AX the 32-bit value whose lower half is at address [someValue] and whose upper half is at [someValue+2]. The ADD instruction will update a "carry" flag indicating whether there was a carry out from the addition, and the ADC instruction will add an extra 1 if the carry flag was set.
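The same carry propagation can be sketched in C++ with 16-bit halves; the comparison of the low half against its old value plays the role of the carry flag (this is an illustration of the idea, not the code a compiler would emit):
#include <cstdint>

// Add the 32-bit value addend_hi:addend_lo into the 32-bit value held in dx:ax.
void add32(uint16_t &dx, uint16_t &ax, uint16_t addend_lo, uint16_t addend_hi)
{
    uint16_t old_ax = ax;
    ax = static_cast<uint16_t>(ax + addend_lo);          // ADD AX,[someValue]
    uint16_t carry = (ax < old_ax) ? 1 : 0;              // carry out of the low half
    dx = static_cast<uint16_t>(dx + addend_hi + carry);  // ADC DX,[someValue+2]
}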
Some processors don't have a carry flag, but have an instruction that will compare two registers, and set a third register to 1 if the first was greater than the second, and 0 otherwise. On those processors, if one wants to add R1::R0 to R3::R2 and place the result in R5::R4, one can use the sequence:
Add R0 to R2 and store the result in R4
Set R5 to 1 if R4 is less than R0 (will happen if there was a carry), and 0 otherwise
Add R1 to R5, storing the result in R5
Add R3 to R5, storing the result in R5
Four times as slow as a normal single-word addition, but still at least somewhat practical. Note that while the carry-flag approach is easily extensible to operate on numbers of any size, extending this approach beyond two words is much harder.
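A sketch of that four-step sequence in C++, with 32-bit unsigned values standing in for the registers R0..R5 (so the result R5::R4 is a 64-bit sum):
#include <cstdint>

// Add R1::R0 to R3::R2 and place the result in R5::R4.
void add_double_word(uint32_t r0, uint32_t r1, uint32_t r2, uint32_t r3,
                     uint32_t &r4, uint32_t &r5)
{
    r4 = r0 + r2;            // Add R0 to R2 and store the result in R4
    r5 = (r4 < r0) ? 1 : 0;  // Set R5 to 1 if there was a carry, 0 otherwise
    r5 = r5 + r1;            // Add R1 to R5
    r5 = r5 + r3;            // Add R3 to R5
}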

Byte order for packed images

So from http://en.wikipedia.org/wiki/RGBA_color_space, I learned that the byte order for ARGB is, from lowest address to highest address, BGRA on a little-endian machine, in certain interpretations.
How does this affect the naming convention of packed data, e.g. a uint8_t ar[] = {R,G,B,R,G,B,R,G,B}?
Little endian by definition stores the bytes of a number least significant byte first, i.e. in reverse order. This does not strictly matter if you are treating them as byte arrays; however, any vaguely efficient code base will actually treat the 4 bytes as a 32-bit unsigned integer, which speeds up software blitting by a factor of almost 4.
Now the real question is why. It comes from the fact that, when treating a pixel as a 32-bit int as described above, coders want to be able to run arithmetic and shifts in a predictable way, and this relies on the bytes being stored in reverse order.
In short, this is not actually odd: on little-endian machines the last byte (highest address) is the most significant byte and the first byte the least significant. A field like this will therefore naturally be in reverse order so that it is the correct way around when treated as a number (as a number it will appear ARGB, but as a byte array it will appear BGRA).
Sorry if this is unclear, but I hope it helps. If you do not understand or I have missed something please comment.
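To make the byte-array view versus the 32-bit-integer view concrete, here is a short sketch (illustrative only) that packs an ARGB pixel into a uint32_t and then prints its bytes from lowest to highest address:
#include <cstdint>
#include <cstdio>
#include <cstring>

int main()
{
    // Pack one ARGB pixel: A=0x11, R=0x22, G=0x33, B=0x44.
    uint32_t pixel = (0x11u << 24) | (0x22u << 16) | (0x33u << 8) | 0x44u;

    unsigned char bytes[4];
    std::memcpy(bytes, &pixel, 4);
    std::printf("%02X %02X %02X %02X\n", bytes[0], bytes[1], bytes[2], bytes[3]);
    // Little-endian: 44 33 22 11  (B, G, R, A in memory)
    // Big-endian:    11 22 33 44  (A, R, G, B in memory)
}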
If you are storing data in a byte array like you have specified, you are using BGR format which is basically RGB reversed:
(image: BGR color space)

Why are all datatypes a power of 2?

Why are all data type sizes always a power of 2?
Let's take two examples:
short int 16
char 8
Why are they not like the following?
short int 12
That's an implementation detail, and it isn't always the case. Some exotic architectures have non-power-of-two data types. For example, 36-bit words were common at one stage.
The reason powers of two are almost universal these days is that it typically simplifies internal hardware implementations. As a hypothetical example (I don't do hardware, so I have to confess that this is mostly guesswork), the portion of an opcode that indicates how large one of its arguments is might be stored as the power-of-two index of the number of bytes in the argument. Two bits are then sufficient to express which of 8, 16, 32 or 64 bits the argument is, and the circuitry required to convert that into the appropriate latching signals would be quite simple.
The reason why builtin types are those sizes is simply that this is what CPUs support natively, i.e. it is the fastest and easiest. No other reason.
As for structs, you can have variables in there which have (almost) any number of bits, but you will usually want to stay with integral types unless there is a really urgent reason for doing otherwise.
You will also usually want to group identical-size types together and start a struct with the largest types (usually pointers). That will avoid needless padding and it will make sure you don't have the access penalties that some CPUs exhibit with misaligned fields (some CPUs may even trigger an exception on unaligned access, but in this case the compiler would add padding to avoid it anyway).
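As a small illustration of that advice (exact sizes depend on the ABI), reordering the members of a struct can remove padding:
#include <iostream>

struct Poorly      // small, large, small: padding before and after 'b'
{
    char   a;
    double b;
    char   c;
};

struct Reordered   // largest member first, identical sizes grouped
{
    double b;
    char   a;
    char   c;
};

int main()
{
    std::cout << sizeof(Poorly)    << '\n';  // typically 24 on a 64-bit ABI
    std::cout << sizeof(Reordered) << '\n';  // typically 16
}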
The size of char, short, int, long etc differ depending on the platform. 32 bit architectures tend to have char=8, short=16, int=32, long=32. 64 bit architectures tend to have char=8, short=16, int=32, long=64.
Many DSPs don't have power of 2 types. For example, Motorola DSP56k (a bit dated now) has 24 bit words. A compiler for this architecture (from Tasking) has char=8, short=16, int=24, long=48. To make matters confusing, they made the alignment of char=24, short=24, int=24, long=48. This is because it doesn't have byte addressing: the minimum accessible unit is 24 bits. This has the exciting (annoying) property of involving lots of divide/modulo 3 when you really do have to access an 8 bit byte in an array of packed data.
You'll only find non-power-of-2 sizes in special-purpose cores, where the size is tailored to a particular usage pattern, to the benefit of performance and/or power. In the case of the 56k, this was because there was a multiply-add unit which could load two 24-bit quantities and add them to a 48-bit result in a single cycle on 3 buses simultaneously. The entire platform was designed around it.
The fundamental reason most general purpose architectures use powers-of-2 is because they standardized on the octet (8 bit bytes) as the minimum size type (aside from flags). There's no reason it couldn't have been 9 bit, and as pointed out elsewhere 24 and 36 bit were common. This would permeate the rest of the design: if x86 was 9 bit bytes, we'd have 36 octet cache lines, 4608 octet pages, and 569KB would be enough for everyone :) We probably wouldn't have 'nibbles' though, as you can't divide a 9 bit byte in half.
This is pretty much impossible to do now, though. It's all very well having a system designed like this from the start, but inter-operating with data generated by 8 bit byte systems would be a nightmare. It's already hard enough to parse 8 bit data in a 24 bit DSP.
Well, they are powers of 2 because they are multiples of 8, and this comes (simplifying a little) from the fact that usually the atomic allocation unit in memory is a byte, which (edit: often, but not always) is made of 8 bits.
Bigger data sizes are made by taking multiple bytes at a time.
So you could have 8, 16, 24, 32... bit data sizes.
Then, for the sake of memory access speed, only powers of 2 are used as a multiplier of the minimum size (8), so you get data sizes along these lines:
8 => 8 * 2^0 bits => char
16 => 8 * 2^1 bits => short int
32 => 8 * 2^2 bits => int
64 => 8 * 2^3 bits => long long int
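Those sizes are easy to check with sizeof (the results are in bytes and implementation-defined; on a typical platform with 8-bit bytes you would see 1, 2, 4, 8):
#include <iostream>

int main()
{
    std::cout << sizeof(char)      << '\n';  // 1 by definition
    std::cout << sizeof(short int) << '\n';  // typically 2
    std::cout << sizeof(int)       << '\n';  // typically 4
    std::cout << sizeof(long long) << '\n';  // typically 8
}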
8 bits is the most common size for a byte (but not the only size, examples of 9 bit bytes and other byte sizes are not hard to find). Larger data types are almost always multiples of the byte size, hence they will typically be 16, 32, 64, 128 bits on systems with 8 bit bytes, but not always powers of 2, e.g. 24 bits is common for DSPs, and there are 80 bit and 96 bit floating point types.
The sizes of the standard integral types are defined as multiples of 8 bits, because a byte is 8 bits (with a few extremely rare exceptions) and the data bus of the CPU is normally a multiple of 8 bits wide.
If you really need 12-bit integers then you could use bit fields in structures (or unions) like this:
struct mystruct
{
    short int twelveBitInt : 12;
    short int threeBitInt  : 3;
    short int bitFlag      : 1;
};
This can be handy in embedded/low-level environments - but bear in mind that the overall size of the structure will still be padded out to a whole number of the underlying type's size.
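For example (sizes are implementation-defined), the three bit fields above total 16 bits, so the whole struct typically occupies one short:
#include <iostream>

struct mystruct
{
    short int twelveBitInt : 12;
    short int threeBitInt  : 3;
    short int bitFlag      : 1;
};

int main()
{
    mystruct s{};
    s.twelveBitInt = 2047;                  // largest value a signed 12-bit field can hold
    std::cout << sizeof(mystruct) << '\n';  // typically 2
}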
They aren't necessarily. On some machines and compilers, sizeof(long double) == 12 (96 bits).
It's not necessary that all data types use a power of 2 as their number of bits. For example, long double uses 80 bits (though how many bits are allocated for it is implementation-dependent).
One advantage you gain with using powers of 2 is that larger data types can be composed of smaller ones. For example, 4 chars (8 bits each) can make up an int (32 bits). In fact, some compilers used to simulate 64-bit numbers using two 32-bit numbers.
Most of the time, your computer tries to keep all data formats at either a whole multiple (2, 3, 4...) or a whole fraction (1/2, 1/3, 1/4...) of the machine data size. It does this so that each time it loads N data words it loads an integer number of pieces of your data. That way, it doesn't have to recombine parts later on.
You can see this in the x86 for example:
a char is 1/4th of 32-bits
a short is 1/2 of 32-bits
an int / long are a whole 32 bits
a long long is 2x 32 bits
a float is a single 32-bits
a double is two times 32-bits
a long double may be either three or four times 32 bits, depending on your compiler settings. This is because on 32-bit machines it takes three native machine words (so no overhead) to load 96 bits. On 64-bit machines that is 1.5 native machine words, so 128 bits would be more efficient (no recombining). The actual data content of a long double on x86 is 80 bits, so both of these are already padded.
A last aside: the computer doesn't always load data in its native size. It first fetches a cache line and then reads from that in native machine words. The cache line is larger, usually around 64 or 128 bytes. It's very useful to have a meaningful piece of data fit into this and not straddle the edge, as you'd then have to load two whole cache lines to read it. That's why most computer structures are a power of two in size: they will fit into any power-of-two-sized storage (half of it, all of it, twice over, or more), and you're guaranteed never to end up straddling a boundary.
There are a few cases where integral types must be an exact power of two. If the exact-width types in <stdint.h> exist, such as int16_t or uint32_t, their widths must be exactly that size, with no padding. Floating-point math that declares itself to follow the IEEE standard forces float and double to be powers of two (although long double often is not). There are additionally types char16_t and char32_t in the standard library now, or built-in to C++, defined as exact-width types. The requirements about support for UTF-8 in effect mean that char and unsigned char have to be exactly 8 bits wide.
In practice, a lot of legacy code would already have broken on any machine that didn’t support types exactly 8, 16, 32 and 64 bits wide. For example, any program that reads or writes ASCII or tries to connect to a network would break.
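Code that depends on those assumptions can state them explicitly; a sketch using static_assert:
#include <climits>
#include <cstdint>

// These will fail to compile on any platform that does not provide
// exact-width 8/16/32/64-bit types -- which today is essentially none.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(std::int16_t)  == 2, "int16_t must be exactly 16 bits");
static_assert(sizeof(std::uint32_t) == 4, "uint32_t must be exactly 32 bits");
static_assert(sizeof(std::uint64_t) == 8, "uint64_t must be exactly 64 bits");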
Some historically-important mainframes and minicomputers had native word sizes that were multiples of 3, not powers of two, particularly the DEC PDP-6, PDP-8 and PDP-10.
This was the main reason that base 8 used to be popular in computing: since each octal digit represented three bits, a 9-, 12-, 18- or 36-bit pattern could be represented more neatly by octal digits than decimal or hex. For example, when using base-64 to pack characters into six bits instead of eight, each packed character took up two octal digits.
The two most visible legacies of those architectures today are that, by default, character escapes such as '\123' are interpreted as octal rather than decimal in C, and that Unix file permissions/masks are represented as three or four octal digits.

Endianness in casting an array of two bytes into a single short

Problem: I cannot understand the number 256 (2^8) in the extract of the IBM article:
On the other hand, if it's a
big-endian system, the high byte is 1
and the value of x is 256.
Assume each element in an array consumes 4 bits; then the processor should somehow read: 1000 0000. If it is big endian, it is 0001 0000, because endianness does not affect bits inside bytes. [2] Contradiction to the 256 in the article!?
Question: Why is the number 256_dec (=1000 0000_bin) and not 32_dec (=0001 0000_bin)?
[2] Endian issues do not affect sequences that have single bytes, because "byte" is considered an atomic unit from a storage point of view.
Because a byte is 8 bits, not 4. The 9th least significant bit in an unsigned int will have value 2^(9-1)=256. (the least significant has value 2^(1-1)=1).
From the IBM article:
unsigned char endian[2] = {1, 0};
short x;
x = *(short *) endian;
They're correct; the value is (short)256 on big-endian, or (short)1 on little-endian.
Writing out the bits, it's an array of {00000001_{base2}, 00000000_{base2}}. Big endian would interpret that bit array read left to right; little endian would swap the two bytes.
256dec is not 1000_0000bin, it's 0000_0001_0000_0000bin.
With swapped bytes (1 byte = 8 bits) this looks like 0000_0000_0000_0001bin, which is 1dec.
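Note that the cast in the IBM snippet type-puns through a pointer; a memcpy-based version (assuming sizeof(short) == 2) copies the same two bytes and gives the same answer without the aliasing and alignment concerns:
#include <cstdio>
#include <cstring>

int main()
{
    unsigned char endian[2] = {1, 0};
    short x;
    std::memcpy(&x, endian, sizeof x);   // reinterpret the two bytes as a short
    std::printf("%d\n", x);              // 1 on little-endian, 256 on big-endian
}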
Answering your followup question: briefly, there is no "default size of an element in an array" in most programming languages.
In C (perhaps the most popular programming language), the size of an array element -- or anything, really -- depends on its type. For an array of char, the elements are usually 1 byte. But for other types, the size of each element is whatever the sizeof() operator gives. For example, many C implementations give sizeof(short) == 2, so if you make an array of short, it will then occupy 2*N bytes of memory, where N is the number of elements.
Many high-level languages discourage you from even attempting to discover how many bytes an element of an array requires. Giving a fixed number of bytes ties the designers' hands to always using that many bytes, which is good for transparency and code that relies on its binary representation, but bad for backward compatibility whenever some reason comes along to change the representation.
Hope that helps. (I didn't see the other comments until after I wrote the first version of this.)