Extracting a bit from a sequence of bits in C++

Hi everyone, I have the following function:

#define GET_BIT(p, n) ((((unsigned char *)p)[n/8] >> (n%8)) & 0x01)

void extractBit(void const * data, int bitIndex)
{
    string result = "";
    result.append(std::to_string(GET_BIT(data, bitIndex)));
}
The following link shows the bits pointed to by the void const* data pointer: http://prntscr.com/3znmpz . void const* data points to the part of the screenshot marked by the red box (the first member is the "00000000" shown in the green box). In case it matters, my file is written and displayed in little-endian order.
With this function I want to append the bit at the given bit position to my result string.
For example, with extractBit(data, 23) I want to add the first 1 in the red box to my result string, but it gives me 0. Although I have looked at my code for a couple of hours, I cannot find my mistake. Can anyone help me?

The first '1' is not the 23rd bit, it's the 16th.
It might look like the 23rd if you just count from left to right, but that's not how your function works.
Inside a byte, you enumerate bits from right to left (0th is rightmost bit, 7th is leftmost, which is a common convention and should be fine).
So, bit numbers as seen by your function are:
7 6 5 4 3 2 1 0 | 15 14 13 12 11 10 9 8 | 23 22 21 20 19 18 17 16 | 31 30 ...
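
To make that concrete, here is a minimal sketch of how the macro maps a bit index to a byte and to a bit within that byte. The buffer contents are only an assumption standing in for the screenshot (the third byte is taken to be 0x01):

#include <cstdio>

#define GET_BIT(p, n) ((((unsigned char *)(p))[(n) / 8] >> ((n) % 8)) & 0x01)

int main()
{
    // Hypothetical data standing in for the screenshot: bytes 0 and 1 are zero,
    // byte 2 has only its least significant bit set.
    unsigned char buf[4] = {0x00, 0x00, 0x01, 0x00};

    // n = 16 selects buf[16/8] = buf[2] and its bit 16 % 8 = 0 (the rightmost bit).
    std::printf("bit 16 = %d\n", GET_BIT(buf, 16)); // prints 1

    // n = 23 also selects buf[2], but its bit 7 (the leftmost bit).
    std::printf("bit 23 = %d\n", GET_BIT(buf, 23)); // prints 0
    return 0;
}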


Why does it take up 21 bytes? On a 32-bit system, three pointers plus two numbers is 5 * 4 = 20, so shouldn't it be 20 bytes?

On a 32-bit system, three pointers plus two numbers should take 5 * 4 = 20 bytes, so why does the book say 21 bytes? Thank you for your answers!
https://redis.com/ebook/part-2-core-concepts/01chapter-9-reducing-memory-use/9-1-short-structures/9-1-1-the-ziplist-representation/
Your book counts the terminating \0 byte at the end of "one\0" as overhead, bringing the total to 21.
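
As a rough sanity check of the arithmetic only (a sketch assuming the book's 32-bit layout, where every pointer and integer occupies 4 bytes):

#include <cstdio>

int main()
{
    const unsigned pointers = 3 * 4; // three 4-byte pointers on a 32-bit system
    const unsigned numbers  = 2 * 4; // two 4-byte integers
    const unsigned nul      = 1;     // terminating '\0' of "one"

    std::printf("%u\n", pointers + numbers + nul); // prints 21
    return 0;
}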

Is there a way to set a specific maximum number of spaces for the TAB character "\t" in C++?

I want to print a square matrix whose elements are separated by 3 spaces each. I found that the '\t' character might be the easiest way, but I think the number of spaces it produces is defined by some algorithm. Could someone walk me through that algorithm, or is there a way in C++ to set a specific number of spaces for '\t'?
I know how to do it manually by writing out the right number of space characters, but '\t' seems simpler to code than looping.
For a simple square matrix:

for (int x = 0, num = 1; x < 5; x++) {
    for (int y = 0; y < 5; y++, num++) {
        cout << num << "\t";
    }
    cout << endl;
}
The code outputs

1       2       3       4       5
6       7       8       9       10
11      12      13      14      15
16      17      18      19      20
21      22      23      24      25

while I need

1   2   3   4   5
6   7   8   9   10
11   12   13   14   15
16   17   18   19   20
21   22   23   24   25
Is there a way to set '\t'?
But, I think the number of spaces is somehow defined in certain algorithm.
...
Is there a way to set '\t'?
Not unless your terminal lets you configure its tab stops.
The usual way to control output formatting is with I/O manipulators; for your case (see the sketch after this list):
std::setw()
std::left (or perhaps better, std::right for numbers)
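A minimal sketch combining the two; the field width of 4 is just an assumption that fits two-digit numbers plus some padding:

#include <iostream>
#include <iomanip>

int main()
{
    for (int x = 0, num = 1; x < 5; x++) {
        for (int y = 0; y < 5; y++, num++) {
            // A fixed field width replaces the terminal-dependent tab stops.
            std::cout << std::left << std::setw(4) << num;
        }
        std::cout << '\n';
    }
    return 0;
}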
I'm pretty sure the size of a tab is defined by your console, so this will be a setting of your IDE.
If you want a consistent number of spaces, you're better off using spaces: that way you don't have to alter the console settings of everything you run your code on. Why not just do something like
"   "
instead of "\t"?
EDIT: Apparently Stack overflow doesn't appreciate multiple spaces in-text.

Why do these two functions to print binary representation of an integer have the same output?

I have two functions that print 32bit number in binary.
First one divides the number into bytes and starts printing from the last byte (from the 25th bit of the whole integer).
Second one is more straightforward and starts from the 1st bit of the number.
It seems to me that these functions should have different outputs, because they process the bits in different orders. However, the outputs are the same. Why?
#include <stdio.h>

void printBits(size_t const size, void const * const ptr)
{
    unsigned char *b = (unsigned char*) ptr;
    unsigned char byte;
    int i, j;

    for (i = size - 1; i >= 0; i--)
    {
        for (j = 7; j >= 0; j--)
        {
            byte = (b[i] >> j) & 1;
            printf("%u", byte);
        }
    }
    puts("");
}

void printBits_2(unsigned *A) {
    for (int i = 31; i >= 0; i--)
    {
        printf("%u", (A[0] >> i) & 1u);
    }
    puts("");
}

int main()
{
    unsigned a = 1014750;
    printBits(sizeof(a), &a); // -> 00000000000011110111101111011110
    printBits_2(&a);          // -> 00000000000011110111101111011110
    return 0;
}
Both your functions print the binary representation of the number from the most significant bit to the least significant bit. Today's PCs (and the majority of other computer architectures) use the so-called Little Endian format, in which multi-byte values are stored with the least significant byte first.
That means that the 32-bit value 0x01020304 stored at address 0x1000 will look like this in memory:
+--------++--------+--------+--------+--------+
|Address || 0x1000 | 0x1001 | 0x1002 | 0x1003 |
+--------++--------+--------+--------+--------+
|Data || 0x04 | 0x03 | 0x02 | 0x01 |
+--------++--------+--------+--------+--------+
Therefore, on Little Endian architectures, printing a value's bits from MSB to LSB is equivalent to taking its bytes in reverse order and printing each byte's bits from MSB to LSB.
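You can observe this layout directly; a minimal sketch (not from the question, any conforming compiler will do):

#include <cstdio>
#include <cstdint>
#include <cstring>

int main()
{
    std::uint32_t value = 0x01020304;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value); // copy out the raw in-memory bytes

    for (std::size_t i = 0; i < sizeof value; i++)
        std::printf("offset %zu: 0x%02X\n", i, (unsigned)bytes[i]);
    // On a little-endian machine this prints 0x04, 0x03, 0x02, 0x01.
    return 0;
}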
This is the expected result when:
1) You use both functions to print a single integer, in binary.
2) Your C++ implementation is on a little-endian hardware platform.
Change either one of these factors (with printBits_2 appropriately adjusted), and the results will be different.
They don't process the bits in different orders. Here's a visual:
Bytes:            4                         3                         2                         1
Bits:   8  7  6  5  4  3  2  1    8  7  6  5  4  3  2  1    8  7  6  5  4  3  2  1    8  7  6  5  4  3  2  1
Bits:  32 31 30 29 28 27 26 25   24 23 22 21 20 19 18 17   16 15 14 13 12 11 10  9    8  7  6  5  4  3  2  1
The fact that the output is the same from both of these functions tells you that your platform uses Little-Endian encoding, which means the most significant byte comes last.
The first two rows show how the first function works on your program, and the last row shows how the second function works.
However, the first function will fail on platforms that use Big-Endian encoding and will output the bits in the order shown in the third row here:
Bytes:            4                         3                         2                         1
Bits:   8  7  6  5  4  3  2  1    8  7  6  5  4  3  2  1    8  7  6  5  4  3  2  1    8  7  6  5  4  3  2  1
Bits:   8  7  6  5  4  3  2  1   16 15 14 13 12 11 10  9   24 23 22 21 20 19 18 17   32 31 30 29 28 27 26 25
In the printBits function, the uint32 pointer is assigned to a char pointer:
unsigned char *b = (unsigned char*) ptr;
The outer loop then walks from b[size-1] down to b[0]. On a little-endian processor, b[size-1] is the most significant byte of the uint32 value; the inner loop prints that byte in binary, and the next iteration moves to the next most significant byte. Therefore this method prints the uint32 value MSB first.
As for printBits_2, you are using
unsigned *A
i.e. an unsigned int. This loop runs from bit 31 down to bit 0 and prints the uint32 value in binary, also MSB first.
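If you want to check which byte order your own machine uses, a small sketch like this (not part of the original answers) inspects the first stored byte of a known value:

#include <cstdio>
#include <cstdint>

int main()
{
    std::uint32_t probe = 1;
    unsigned char first = *reinterpret_cast<unsigned char *>(&probe);

    // On a little-endian machine the low-order byte is stored first, so 'first' is 1.
    std::printf("%s-endian\n", first == 1 ? "little" : "big");
    return 0;
}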

UDP receive data as unsigned char

I am trying to receive some data from the network using UDP and parse it.
Here is the code,
char recvline[1024];
int n = recvfrom(sockfd, recvline, 1024, 0, NULL, NULL);

for (int i = 0; i < n; i++)
    cout << hex << static_cast<short int>(recvline[i]) << " ";
It printed this output:
19 ffb0 0 0 ff88 d 38 19 48 38 0 0 2 1 3 1 ff8f ff82 5 40 20 16 6 6 22 36 6 2c 0 0 0 0 0 0 0 0
But I am expecting the output like,
19 b0 0 0 88 d 38 19 48 38 0 0 2 1 3 1 8f 82 5 40 20 16 6 6 22 36 6 2c 0 0 0 0 0 0 0 0
The ff shouldn't be there in the printed output.
I actually have to parse this data character by character, like this:
parseCommand(recvline);
and the parsing code looks like this:
void parseCommand(char *msg) {
    int commId = *(msg + 1);
    switch (commId) {
        case 0xb0: // do some operation
            break;
        case 0x20: // do another operation
            break;
    }
}
And while debugging I am getting commId=-80 on watch.
Note:
On Linux I get the expected output with this code; note that I used unsigned char instead of char for the receive buffer.
unsigned char recvline[1024];
int n=recvfrom(sockfd,recvline,1024,0,NULL,NULL);
Whereas on Windows, recvfrom() does not accept the second argument as unsigned char and gives a build error, so I chose char.
Looks like you might be getting the correct values, but your cast to short int during printing sign-extends your char value, causing ff to be propagated to the top byte if the top bit of your char is 1 (i.e. it is negative). You should first cast it to an unsigned type and then extend to int, so you need 2 casts:
cout << hex << static_cast<short int>(static_cast<uint8_t>(recvline[i]))<<" ";
I have tested this and it behaves as expected.
In response to your extension: the data read is fine, it is a matter of how you interpret it. To parse correctly you should:
uint8_t commId = static_cast<uint8_t>(*(msg + 1));
switch (commId) {
    case 0xb0: // do some operation
        break;
    case 0x20: // do another operation
        break;
}
As you store your data in a signed data type, conversions/promotions to bigger data types will first sign-extend the value (filling the high-order bits with the value of the MSB), even if it then gets converted to an unsigned data type.
One solution is to define recvline as uint8_t[] in the first place and cast it to char* when passing it to the recvfrom function. That way you only have to cast once, and you can use the same code in your Windows and Linux versions. Also, uint8_t[] is (at least to me) a clear indication that you are using the array as raw memory instead of as a string of some kind.
Another possibility is to simply perform a bitwise AND: (recvline[i] & 0xff). Thanks to automatic integral promotion this doesn't even require a cast.
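A minimal sketch of that masking approach; the 0xB0 test value is just an assumption mirroring the byte from the question:

#include <iostream>

int main()
{
    // A received byte stored in a plain (possibly signed) char.
    char c = static_cast<char>(0xB0);

    // c is promoted to int (sign-extended), then the mask keeps only the
    // low 8 bits, so this prints "b0" rather than a sign-extended value.
    std::cout << std::hex << (c & 0xff) << "\n";
    return 0;
}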
Personal Note:
It is really annoying that the C and C++ standards don't provide a separate type for raw memory (yet), but with any luck we'll get a byte type in a future standard revision.

IP header bit order not clear

I read the IP RFC, and it says the first 4 bits of the IP header are the version. The drawing also shows that bits 0 to 3 are the version.
https://www.rfc-editor.org/rfc/rfc791#section-3.1
But when I look at the first byte of the header (as captured using pcap lib) I see this byte:
0x45
This is a version 4 IP header but obviously bits 4 to 7 are equal to 4 and not bits 0 to 3 as I expected.
I expected that ANDing the first byte with 0x0F would give me the version, but it seems that I need to AND with 0xF0 instead.
Am I missing something? Understanding something incorrectly?
You should read Appendix B of the RFC:
Whenever an octet represents a numeric quantity the left most bit in the
diagram is the high order or most significant bit. That is, the bit
labeled 0 is the most significant bit. For example, the following
diagram represents the value 170 (decimal).
0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1 0 1 0 1 0 1 0|
+-+-+-+-+-+-+-+-+
Which means everything is correct except your assumption that the "first four bits" are the least significant; they are in fact the most significant.
E.g. in the 7th and 8th bytes, containing the flags and the fragment offset, you can separate those as follows (consider that pseudocode, even though it is working C#):
byte flagsAndFragmentHi = packet[6];
byte fragmentLo = packet[7];
bool flagReserved0 = (flagsAndFragmentHi & 0x80) != 0;
bool flagDontFragment = (flagsAndFragmentHi & 0x40) != 0;
bool flagMoreFragments = (flagsAndFragmentHi & 0x20) != 0;
int fragmentOffset = ((flagsAndFragmentHi & 0x1F) << 8) | (fragmentLo);
Note that the more significant (left-shifted 8 bits) portion of the fragment offset is in the first byte (because IP works in big endian). Generally: bits on the left in the diagram are always more significant.
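Applying the same idea to the version field in C++, a minimal sketch (the 0x45 test value is taken from the question):

#include <cstdio>

int main()
{
    unsigned char firstByte = 0x45; // first octet of the captured IP header

    int version = (firstByte >> 4) & 0x0F; // high nibble: version
    int ihl     = firstByte & 0x0F;        // low nibble: header length in 32-bit words

    std::printf("version = %d, IHL = %d\n", version, ihl); // version = 4, IHL = 5
    return 0;
}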