Adding n bits to the first n bits of another number - bit-manipulation

I am doing a project on digital filters. I needed to know how to add a 4 bit binary number to the most significant 4 bits of an 8 bit number. For example:
0 1 0 0 0 0 0 0 //x
+ 1 0 1 0 //y
= 1 1 1 0 0 0 0 0 //z
Can I add using a code somewhat like this?
z=[7:4]x + y
or should I have to concatenate the 4 bit number with another four zeros and add?

Assuming y is the 4 bit number and x the 8 bit number:
If you do
assign z = x[7:4] + y;
Then you are adding y to the upper four bits of x taken as a plain 4-bit value, so the sum lands in the least significant bits of z and the most significant part of z is padded with 0's.
If you do
assign z = y[7:4] + x;
You will get an error message from the synthesizer, because y is only 4 bits wide and the subscripts [7:4] are out of range for it.
So do as this:
assign z = {y,4'b0} + x;
Which performs an 8-bit addition with x and the value of y shifted 4 bits to the left, which is what you wanted.
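For readers more at home in software, the same idea can be sketched in C. This is only an illustration of the arithmetic, not Verilog; the function name add_to_high_nibble is made up for this example, and it assumes a carry out of bit 7 should simply be discarded, as it would be in an 8-bit z.

```c
#include <stdint.h>

/* Add the 4-bit value y to the upper nibble of the 8-bit value x.
   Mirrors the Verilog expression {y, 4'b0} + x.
   Assumption: a carry out of bit 7 is dropped (8-bit result). */
uint8_t add_to_high_nibble(uint8_t x, uint8_t y)
{
    /* Keep only the low 4 bits of y, then move them into bits 7..4. */
    return (uint8_t)(x + ((y & 0x0Fu) << 4));
}
```

With the values from the question, add_to_high_nibble(0x40, 0x0A) gives 0xE0, i.e. 1110 0000.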

Related

I'm having trouble understanding the syntax used in a piece of code

I've been looking at this for a while and I have some ideas on what this code could be doing, but I'm not sure if I correctly understand what the syntax of the code does.
The code iterates through a 2D array of unsigned chars. It's meant to fill the array with 0's unless the spot in the array represents the bottom or the sides; in that case it fills the spot with a 9 instead.
The part I'm confused about is the statement (pField[y*fieldWidth + x] =). I believe this is a conditional statement. I understand the logic after it; my question is specifically about this conditional: how should it be interpreted using if statements, if possible? If it's not a conditional statement, what kind of statement is it?
pField = new unsigned char[fieldWidth*fieldHeight]; // Create play field buffer
for (int x = 0; x < fieldWidth; x++) // Board Boundary
    for (int y = 0; y < fieldHeight; y++)
        pField[y*fieldWidth + x] = (x == 0 || x == fieldWidth - 1 || y == fieldHeight - 1) ? 9 : 0;
The code is using a 2-dimensional array that is allocated in memory as a 1-dimensional array. The expression y*fieldWidth + x is calculating a 1D array index from a pair of 2D indexes.
The array represents a rectangle. The code is assigning a 9 to the 1D array elements that represent the rectangle’s left, right, and bottom edges (but not the top edge), and a 0 in the 1D array elements representing the rest of the rectangle.
For example, a 5x5 rectangle would look like this:
9 0 0 0 9
9 0 0 0 9
9 0 0 0 9
9 0 0 0 9
9 9 9 9 9
The corresponding 1D array elements would look like this:
x        = 0 1 2 3 4 | 0 1 2 3 4 | 0 1 2 3 4 | 0 1 2 3 4 | 0 1 2 3 4
y        = 0 0 0 0 0 | 1 1 1 1 1 | 2 2 2 2 2 | 3 3 3 3 3 | 4 4 4 4 4
--------------------------------------------------------------------
pField[] = 9 0 0 0 9 | 9 0 0 0 9 | 9 0 0 0 9 | 9 0 0 0 9 | 9 9 9 9 9
The ternary ?: operator can be rewritten using an if statement like this:
int value;
if (x == 0 || x == fieldWidth - 1 || y == fieldHeight - 1)
    value = 9;
else
    value = 0;
pField[y*fieldWidth + x] = value;
This code allocates a 1D array to hold a 2D field and then sets each element in it. Regarding the specific code you ask about: on the left-hand side of the assignment, the logic translates the 2D indices into a 1D index. On the right-hand side, the value is calculated by combining boolean predicates with || into a single boolean result, which the ternary operator uses to select between 9 and 0; that value is then written to the unsigned char array element.
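The index mapping described in these answers can be pulled out into a small helper. The following is a minimal sketch in plain C rather than the original C++; the names field_index and fill_border are invented for illustration, and the caller is assumed to supply a buffer of at least width*height bytes.

```c
#include <stddef.h>

/* Map a 2D coordinate (x, y) to its position in a row-major 1D buffer. */
size_t field_index(int x, int y, int width)
{
    return (size_t)y * (size_t)width + (size_t)x;
}

/* Write 9 on the left, right and bottom edges and 0 everywhere else.
   Assumption: field points to at least width * height bytes. */
void fill_border(unsigned char *field, int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int on_edge = (x == 0 || x == width - 1 || y == height - 1);
            field[field_index(x, y, width)] = on_edge ? 9 : 0;
        }
    }
}
```

For a 5x5 field, the element at x = 2, y = 3 lives at index 3*5 + 2 = 17.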

Why does "number & (~(1 << 3))" not work for 0's?

I'm writing a program that exchanges the values of the bits on positions 3, 4 and 5 with bits on positions 24, 25 and 26 of a given 32-bit unsigned integer.
So let's say I use the number 15 and I want to turn the 4th bit into a 0. I'd use...
int number = 15;
int newnumber = number & (~(1 << 3));
// output is 7
This makes sense because I'm changing the 4th bit from 1 to 0, so 15 (1111) becomes 7 (0111).
However, this won't work the other way round (changing a 0 to a 1). I know how to change a 0 to a 1 via a different method, but I really want to understand the code in this method.
So why won't it work?
The truth table for x AND y is:
x y Output
-----------
0 0 0
0 1 0
1 0 0
1 1 1
In other words, the output/result will only be 1 if both inputs are 1, which means that you cannot change a bit from 0 to 1 through a bitwise AND. Use a bitwise OR for that (e.g. int newnumber = number | (1 << 3);)
To summarize:
Use & ~(1 << n) to clear bit n.
Use | (1 << n) to set bit n.
To set the fourth bit to 0, you AND it with ~(1 << 3) which is the negation of 1000, or 0111.
By the same reasoning, you can set it to 1 by ORing with 1000.
To toggle it, XOR with 1000.
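The three operations from the answers above can be written as small C helpers. This is a sketch with invented function names; it assumes 32-bit unsigned values and a bit index n in the range 0 to 31.

```c
#include <stdint.h>

/* Assumption for all three helpers: 0 <= n <= 31. */

/* Force bit n to 0: AND with a mask that has every bit set except bit n. */
uint32_t clear_bit(uint32_t value, unsigned n)
{
    return value & ~(UINT32_C(1) << n);
}

/* Force bit n to 1: OR with a mask that has only bit n set. */
uint32_t set_bit(uint32_t value, unsigned n)
{
    return value | (UINT32_C(1) << n);
}

/* Flip bit n: XOR with a mask that has only bit n set. */
uint32_t toggle_bit(uint32_t value, unsigned n)
{
    return value ^ (UINT32_C(1) << n);
}
```

clear_bit(15, 3) yields 7, and clear_bit(7, 3) still yields 7, which is exactly why the AND form cannot turn a 0 into a 1; set_bit(7, 3) yields 15.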

Binary quicksort starting bit position

I am reading about binary quicksort at the following location:
http://books.google.co.in/books?id=hyvdUQUmf2UC&pg=PA426&lpg=PA426&dq=robert+sedwick+binary+quick+sort&source=bl&ots=kAYK3_LkCg&sig=BjKk4g68h8xG87Vx2vS_TiUKDQY&hl=en&sa=X&ei=uuKzUq4-iY-tB7nZgdgL&ved=0CEYQ6AEwBA#v=onepage&q=robert%20sedwick%20binary%20quick%20sort&f=false
Text snippet:
For full-word keys consisting of random bits, the starting point in Program 10.1 should be the leftmost bit of the words, or bit 0. In general, the starting point that should be used depends in a straightforward way on the application, on the number of bits per word in the machine, and on the machine representation of integers and negative numbers. For the one-letter 5-bit keys in Figures 10.2 and 10.3, the starting point on a 32-bit machine would be bit 27.
My question on above text is:
Why does the author conclude that the starting point on a 32-bit machine should be bit 27 for 5-bit keys?
The text excerpt is confusing because it is incomplete.
It appears the text assumes a big-endian bit numbering for bits within a machine word. In big endian bit numbering, bit 0 is the leftmost bit within a word. The hint comes from the phrase "the leftmost bit of the words, or bit 0."
Therefore, for a 5 bit number held in a 32 bit register, bit 0 of that number would be held in bit 27 of the machine word, for a right-aligned value in a big-endian numbered word.
0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 machine word
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 bit numbers
+-----------------------------------------------------+---------+
|x x x x x x x x x x x x x x x x x x x x x x x x x x x|0 1 2 3 4| char to sort
+-----------------------------------------------------+---------+
Big endian bit numbering is uncommon in most places these days. IBM POWER / PowerPC still use big endian numbering, as did older big endian architectures such as the TMS9900 / TMS99000 family.
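To make the numbering concrete, here is a small C sketch that translates a big-endian (MSB-0) bit number into the usual shift count. The helper names are hypothetical and a 32-bit word is assumed, as in the excerpt.

```c
#include <stdint.h>

/* Convert an MSB-0 bit number (bit 0 = leftmost) into the right-shift
   count that brings that bit down to the least significant position. */
unsigned msb0_to_shift(unsigned bit, unsigned word_bits)
{
    return word_bits - 1u - bit;
}

/* Read one bit of a 32-bit word using MSB-0 numbering.
   Assumption: 0 <= bit <= 31. */
unsigned bit_msb0(uint32_t word, unsigned bit)
{
    return (unsigned)((word >> msb0_to_shift(bit, 32u)) & 1u);
}
```

For a right-aligned 5-bit key such as 10110, bit_msb0(key, 27) returns the key's leftmost bit, which matches the book's starting point of bit 27.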

How can I count amount of sequentially set bits in a byte from left to right until the first 0?

I'm not good at English and I can't phrase it better, so please see the examples below:
if byte in binary is 1 0 0 0 0 0 0 0 then result is 1
if byte in binary is 1 1 0 0 0 0 0 0 then result is 2
if byte in binary is 1 1 1 0 0 0 0 0 then result is 3
if byte in binary is 1 1 1 1 0 0 0 0 then result is 4
if byte in binary is 1 1 1 1 1 0 0 0 then result is 5
if byte in binary is 1 1 1 1 1 1 0 0 then result is 6
if byte in binary is 1 1 1 1 1 1 1 0 then result is 7
if byte in binary is 1 1 1 1 1 1 1 1 then result is 8
But if for example the byte in binary is 1 1 1 0 * * * * then result is 3.
I would like to determine how many bits are set contiguously from left to right with one operation.
The results are not necessary numbers from 1-8, just something to distinguish.
I think it's possible in one or two operations, but I don't know how.
If you don't know a solution as short as 2 operations, please write that too, and I won't try it anymore.
Easiest non-branching solution I can think of:
y=~x
y|=y>>4
y|=y>>2
y|=y>>1
Invert x, and extend the leftmost 1-bit (which corresponds to the leftmost 0-bit in the non-inverted value) to the right. This gives distinct values (not 1-8, but a mapping is easy to do).
110* ****
turns into
001* ****
001* **1*
001* 1*1*
0011 1111
EDIT:
As pointed out in a different answer, using a precomputed lookup table is probably the fastest. Given only 8 bits, it's probably even feasible in terms of memory consumption.
EDIT:
Heh, whoops, my bad. You can skip the invert, and do ANDs instead.
x&=x>>4
x&=x>>2
x&=x>>1
here
110* ****
gives
110* **0*
110* 0*0*
1100 0000
As you can see all values beginning with 110 will result in the same output (1100 0000).
EDIT:
Actually, the 'and' version relies on implementation-defined behavior (right-shifting negative numbers). It will usually do the right thing with a signed 8-bit type (i.e. signed char, rather than unsigned char in C), because most compilers use an arithmetic shift that copies the sign bit, but as I said the behavior is not guaranteed and might not always work.
I'd second a lookup table... otherwise you can also do something like:
#include <intrin.h> // MSVC header that declares _BitScanReverse
unsigned long inverse_bitscan_reverse(unsigned long value)
{
    unsigned long bsr = 0;
    _BitScanReverse(&bsr, ~value); // x86 bsr instruction: index of the highest set bit of ~value
    return bsr;
}
EDIT: Note that you have to be careful of the special case where "value" has no zeroed bits. See the documentation for _BitScanReverse.
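For comparison, here is a portable sketch that does not depend on signed shifts or on a compiler intrinsic. It is a plain loop, not the two-operation trick the question asks for, and the function name leading_ones is made up; a 256-entry lookup table like the one suggested above could be filled once by calling it for every byte value.

```c
#include <stdint.h>

/* Count how many 1 bits appear from the left of a byte before the first 0. */
unsigned leading_ones(uint8_t b)
{
    unsigned n = 0;
    while (b & 0x80u) {        /* is the current leftmost bit set? */
        n++;
        b = (uint8_t)(b << 1); /* move the next bit into the top position */
    }
    return n;
}
```

The loop stops after at most 8 steps, because each shift feeds a 0 in from the right.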

Warning about data loss c++/c

I am getting a benign warning about possible data loss
warning C4244: 'argument' : conversion from 'const int' to 'float', possible loss of data
Question
I remember float as having a larger precision than int. So how can data be lost if I convert from a smaller data type (int) to a larger data type (float)?
Because floating-point numbers are not precise. You cannot represent every possible value of an int in a float, even though the maximum value of a float is much higher.
For instance, run this simple program:
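#include <stdio.h>
int main(void)
{
    for (int i = 0; i < 2147483647; i++)
    {
        float value = i; // rounds to the nearest representable float
        // Compare in double, which holds both values exactly; converting an
        // out-of-range float back to int would be undefined behavior.
        if ((double)value != (double)i)
            printf("Integer %d is represented as %.0f in a float\n", i, value);
    }
}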
You'll quickly see that there are billions of integers that can't be represented exactly as floats. For instance, all integers in the range 16,777,219 to 16,777,221 are represented as 16,777,220.
EDIT again: Running the program above indicates that there are 2,071,986,175 positive integers that cannot be represented precisely as floats. That leaves only about 75 million non-negative integers that fit exactly into a float, which means only about one integer in 28 survives unchanged when you put it into a float.
I expect the numbers to be the same for the negative integers.
On most architectures int and float are the same size, in that they have the same number of bits. However, in a float those bits are split between exponent and mantissa, meaning that there are actually fewer bits of precision in the float than the int. This is only likely to be a problem for larger integers, though.
On systems where an int is 32 bits, a double is usually 64 bits and so can exactly represent any int.
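A round-trip comparison is a quick way to check a single value. The sketch below is illustrative only: the function name is invented, and it assumes float is IEEE 754 single precision and double is IEEE 754 double precision, so that a double can hold any 32-bit int exactly.

```c
#include <stdint.h>

/* Return 1 if converting i to float keeps its value exactly, 0 otherwise.
   Assumption: IEEE 754 float (24-bit significand) and double (53-bit). */
int survives_float_conversion(int32_t i)
{
    float f = (float)i;            /* may round to a neighbouring value */
    return (double)f == (double)i; /* compare in a type that holds both exactly */
}
```

Under those assumptions 16,777,216 (2^24) survives the conversion, while 16,777,217 does not.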
Both types are composed of 4 bytes (32 bits).
Only one of them allows a fraction (the float).
Take this for a float example;
34.156
(integer).(fraction)
Now use your logic;
If one of them must also save fraction information (after all it should represent a number), then it has fewer bits left for the digits of the number itself.
Thus, a float cannot hold every integer value that an int can, even though its overall range is far larger.
To be more specific, a 32-bit "int" uses all 32 bits to represent an integer number (the maximal unsigned integer is 4,294,967,295). A "float" has only 24 bits of significand (23 stored plus one implicit), so it represents every integer exactly only up to 16,777,216 (2^24); above that, only some integers are representable.
That's why when you convert from int to float you might lose data.
Example:
int = 1,158,354,125
You cannot store this number exactly in a "float".
More information at:
http://en.wikipedia.org/wiki/Single_precision_floating-point_format
http://en.wikipedia.org/wiki/Integer_%28computer_science%29
Precision does not matter. The precision of int is 1, while the precision of a typical float (IEEE 754 single precision) is approximately 5.96e-8. What matters is the sets of numbers that the two formats can represent. If there are numbers that int can represent exactly that float cannot, then there is a possible loss of data.
Floats and ints are typically both 32 bits these days, but that's not guaranteed. Assuming it is the case on your machine, it follows that there must be int values that float cannot represent exactly, because there are obviously float values that int cannot represent exactly. The range of one format cannot be a proper super-set of the other if both formats use the same number of bits efficiently.
A 32 bit int effectively has 31 bits that code for the absolute value of the number. An IEEE 754 float effectively has only 24 bits that code for the mantissa (one implicit).
The fact is that both a float and an int are represented using 32 bits. The integer value uses all 32 bits, so it can accommodate numbers from -2^31 to 2^31-1. However, a float uses 1 bit for the sign (which is why -0.0f exists) and 8 bits for the exponent. This means 32 - 9 = 23 bits are left for the mantissa. However, the float assumes that if the exponent is not zero, the mantissa starts with an implicit 1. So you more or less have 24 bits for your integer, instead of 32. However, because the mantissa can be shifted, a float accommodates more than 2^24 integers (just not every integer above 2^24).
A floating point uses a Sign, an eXponent, and a Mantissa
S X X X X X X X X M M M M M M M M M M M M M M M M M M M M M M M
An integer has a Sign, and a Mantissa
S M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M
So, a 29 bit integer such as:
0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
fits in a float because it can be shifted:
0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
|       |                                           |
|       +---------+                                 +---------+
|                 |                                           |
v                 v                                           v
S X X X X X X X X M M M M M M M M M M M M M M M M M M M M M M M
0 1 0 0 1 1 0 1 1 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0
The eXponent represents a biased shift (the stored value is the shift of the mantissa plus 127; the shift counts from the binary point). Here the shift is 28, so the exponent field holds 155, or 10011011. This clearly shows you that if you have to shift by 5 bits, you're going to lose the 5 lower bits.
So this other integer can be converted to a float only with a loss of bits: the low five bits of its magnitude do not fit in the mantissa (the diagram simply drops them, whereas an actual conversion rounds to the nearest representable value):
1 1 1 0 0 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1
|                             ||
|                             || complement
|                             vv
| 0 0 1 1 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1
|       |                                           | | | | | |
|       +---------+                                 | +-+-+-+-+--> lost bits
|                 |                                 +---------+
v                 v                                           v
S X X X X X X X X M M M M M M M M M M M M M M M M M M M M M M M
1 1 0 0 1 1 0 1 1 1 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 1 1 1
Note: For negative numbers, we first generate the complement, which is subtracting 1 then reversing all the bits from 0 to 1. That complement is what gets saved in the mantissa. The sign, however, still gets copied as is.
Pretty simple stuff really.
IMPORTANT NOTE: Yes, the first 1 in the integer is the sign, then the next 1 is not copied in the mantissa, it is assumed to be 1 so it is not required.
A float is usually in the standard IEEE single-precision format. This means there are only 24 bits of precision in a float, while an int is likely to be 32-bit. So, if your int contains a number whose absolute value cannot fit in 24 bits, you are likely to have it rounded to the nearest representable number.
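The 24-bit rule can also be checked without any floating-point arithmetic. The following sketch uses an invented function name and assumes the IEEE 754 single-precision format: an integer is exactly representable when, after dropping trailing zero bits (which the exponent can absorb), what remains of its magnitude fits in 24 bits.

```c
#include <stdint.h>

/* Return 1 if i needs at most 24 significant bits, i.e. it would be
   exactly representable in an IEEE 754 single-precision float. */
int fits_in_24_significant_bits(int32_t i)
{
    /* Magnitude as unsigned, written so that INT32_MIN does not overflow. */
    uint32_t m = (i < 0) ? (0u - (uint32_t)i) : (uint32_t)i;

    if (m == 0)
        return 1;
    while ((m & 1u) == 0)   /* trailing zeros are absorbed by the exponent */
        m >>= 1;
    return m < (UINT32_C(1) << 24);
}
```

The 29-bit example shown earlier, 0x1F861000, passes this test because its low 12 bits are zero, while any odd number above 2^24 fails it.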
My stock answer to such questions is to read this - What Every Computer Scientist Should Know About Floating-Point Arithmetic.