How to convert an array of bits to a char

How to convert an array of bits to a char - c++

I am trying to edit each byte of a buffer by modifying the LSB(Least Significant Bit) according to some requirements.
I am using the unsigned char type for the bytes, so please let me know IF that is correct/wrong.
unsigned char buffer[MAXBUFFER];
Next, i'm using this function
char *uchartob(char s[9], unsigned char u)
which modifies and returns the first parameter as an array of bits. This function works just fine, as the bits in the array represent the second parameter.
Here's where the hassle begins. I am going to point out what I'm trying to do step by step so you guys can let me know where i'm taking the wrong turn.
I am saving the result of the above function (called for each element of the buffer) in a variable
char binary_byte[9]; // array of bits
I am testing the LSB simply comparing it to some flag like above.
if (binary_byte[7]==bit_flag) // i go on and modify it like this
binary_byte[7]=0; // or 1, depending on the case
Next, I'm trying to convert the array of bits binary_byte (it is an array of bits, isn't it?) back into a byte/unsigned char and update the data in the buffer at the same time. I hope I am making myself clear enough, as I am really confused at the moment.
buffer[position_in_buffer]=binary_byte[0]<<7| // actualize the current BYTE in the buffer
binary_byte[1]<<6|
binary_byte[2]<<5|
binary_byte[3]<<4|
binary_byte[4]<<3|
binary_byte[5]<<2|
binary_byte[6]<<1|
binary_byte[7];
Keep in mind that the bit at the position binary_byte[7] may be modified, that's the point of all this.
The solution is not really elegant, but it's working, even though i am really insecure of what i did (I tried to do it with bitwise operators but without success)
The weird thing is when I am trying to print the updated character from the buffer. It has the same bits as the previous character, but it's a completely different one.
My final question is : What effect does changing only the LSB in a byte have? What should I expect?. As you can see, I'm getting only "new" characters even when i shouldn't.

So I'm still a little unsure what you are trying to accomplish here but since you are trying to modify individual bits of a byte I would propose using the following data structure:
union bit_byte
{
struct{
unsigned bit0 : 1;
unsigned bit1 : 1;
unsigned bit2 : 1;
unsigned bit3 : 1;
unsigned bit4 : 1;
unsigned bit5 : 1;
unsigned bit6 : 1;
unsigned bit7 : 1;
} bits;
unsigned char all;
};
This will allow you to access each bit of your byte and still get your byte representation. Here some quick sample code:
bit_byte myValue;
myValue.bits.bit0 = 1; // Set the LSB
// Test the LSB
if(myValue.bits.bit0 == 1) {
myValue.bits.bit7 = 1;
}
printf("%i", myValue.all);

bitwise:
set bit => a |= 1 << x;
reset bit => a &= ~(1 << x);
bit check => a & (1 << x);
flip bit => a ^= (1 << x)
If you can not manage this you can always use std::bitset.
Helper macros:
#define SET_BIT(where, bit_number) ((where) |= 1 << (bit_number))
#define RESET_BIT(where, bit_number) ((where) &= ~(1 << (bit_number)))
#define FLIP_BIT(where, bit_number) ((where) ^= 1 << (bit_number))
#define GET_BIT_VALUE(where, bit_number) (((where) & (1 << (bit_number))) >> bit_number) //this will retun 0 or 1
Helper application to print bits:
#include <iostream>
#include <cstdint>
#define GET_BIT_VALUE(where, bit_number) (((where) & (1 << (bit_number))) >> bit_number)
template<typename T>
void print_bits(T const& value)
{
for(uint8_t bit_count = 0;
bit_count < (sizeof(T)<<3);
++bit_count)
{
std::cout << GET_BIT_VALUE(value, bit_count) << std::endl;
}
}
int main()
{
unsigned int f = 8;
print_bits(f);
}

Related

How to safely extract a signed field from a uint32_t into a signed number (int or uint32_t)

I have a project in which I am getting a vector of 32-bit ARM instructions, and a part of the instructions (offset values) needs to be read as signed (two's complement) numbers instead of unsigned numbers.
I used a uint32_t vector because all the opcodes and registers are read as unsigned and the whole instruction was 32-bits.
For example:
I have this 32-bit ARM instruction encoding:
uint32_t addr = 0b00110001010111111111111111110110
The last 19 bits are the offset of the branch that I need to read as signed integer branch displacement.
This part: 1111111111111110110
I have this function in which the parameter is the whole 32-bit instruction:
I am shifting left 13 places and then right 13 places again to have only the offset value and move the other part of the instruction.
I have tried this function casting to different signed variables, using different ways of casting and using other c++ functions, but it prints the number as it was unsigned.
int getCat1BrOff(uint32_t inst)
{
uint32_t temp = inst << 13;
uint32_t brOff = temp >> 13;
return (int)brOff;
}
I get decimal number 524278 instead of -10.
The last option that I think is not the best one, but it may work is to set all the binary values in a string. Invert the bits and add 1 to convert them and then convert back the new binary number into decimal. As I would of do it in a paper, but it is not a good solution.

It boils down to doing a sign extension where the sign bit is the 19th one.
There are two ways.
Use arithmetic shifts.
Detect sign bit and or with ones at high bits.
There is no portable way to do 1. in C++. But it can be checked on compilation time. Please correct me if the code below is UB, but I believe it is only implementation defined - for which we check at compile time.
The only questionable thing is conversion of unsigned to signed which overflows, and the right shift, but that should be implementation defined.
int getCat1BrOff(uint32_t inst)
{
if constexpr (int32_t(0xFFFFFFFFu) >> 1 == int32_t(0xFFFFFFFFu))
{
return int32_t(inst << uint32_t{13}) >> int32_t{13};
}
else
{
int32_t offset = inst & 0x0007FFFF;
if (offset & 0x00040000)
{
offset |= 0xFFF80000;
}
return offset;
}
}
or a more generic solution
template <uint32_t N>
int32_t signExtend(uint32_t value)
{
static_assert(N > 0 && N <= 32);
constexpr uint32_t unusedBits = (uint32_t(32) - N);
if constexpr (int32_t(0xFFFFFFFFu) >> 1 == int32_t(0xFFFFFFFFu))
{
return int32_t(value << unusedBits) >> int32_t(unusedBits);
}
else
{
constexpr uint32_t mask = uint32_t(0xFFFFFFFFu) >> unusedBits;
value &= mask;
if (value & (uint32_t(1) << (N-1)))
{
value |= ~mask;
}
return int32_t(value);
}
}
https://godbolt.org/z/rb-rRB

In practice, you just need to declare temp as signed:
int getCat1BrOff(uint32_t inst)
{
int32_t temp = inst << 13;
return temp >> 13;
}
Unfortunately this is not portable:
For negative a, the value of a >> b is implementation-defined (in most
implementations, this performs arithmetic right shift, so that the
result remains negative).
But I have yet to meet a compiler that doesn't do the obvious thing here.

How to grab specific bits from a 256 bit message?

I'm using winsock to receive udp messages 256 bits long. I use 8 32-bit integers to hold the data.
int32_t dataReceived[8];
recvfrom(client, (char *)&dataReceived, 8 * sizeof(int), 0, &fromAddr, &fromLen);
I need to grab specific bits like, bit #100, #225, #55, etc. So some bits will be in dataReceived[3], some in dataReceived[4], etc.
I was thinking I need to bitshift each array, but things got complicated. Am I approaching this all wrong?

Why are you using int32_t type for buffer elements and not uint32_t?
I usually use something like this:
int bit_needed = 100;
uint32_t the_bit = dataReceived[bit_needed>>5] & (1U << (bit_needed & 0x1F));
Or you can use this one (but it won't work for sign in signed integers):
int bit_needed = 100;
uint32_t the_bit = (dataReceived[bit_needed>>5] >> (bit_needed & 0x1F)) & 1U;
In other answers you can access only lowes 8bits in each int32_t.

When you count bits and bytes from 0:
int bit_needed = 100;
So:
int byte = int(bit_needed / 8);
int bit = bit_needed % 8;
int the_bit = dataReceived[byte] & (1 << bit);
If the recuired bit contains 0, then the_bit will be zero. If it's 1, then the_bit will hold 2 to the power of that bit ordinal place within the byte.

You can make a small function to do the job.
uint8_t checkbit(uint32_t *dataReceived, int bitToCheck)
{
byte = bitToCheck/32;
bit = bitToCheck - byte*32;
if( dataReceived[byte] & (1U<< bit))
return 1;
else
return 0;
}
Note that you should use uint32_t rather than int32_t, if you are using bit shifting. Signed integer bit shifts lead to unwanted results, especially if the MSbit is 1.

You can use a macro in C or C++ to check for specific bit:
#define bit_is_set(var,bit) ((var) & (1 << (bit)))
and then a simple if:
if(bit_is_set(message,29)){
//bit is set
}

Converting 24 bit integer (2s complement) to 32 bit integer in C++

The dataFile.bin is a binary file with 6-byte records. The first 3
bytes of each record contain the latitude and the last 3 bytes contain
the longitude. Each 24 bit value represents radians multiplied by
0X1FFFFF
This is a task I've been working on. I havent done C++ in years so its taking me way longer than I thought it would -_-. After googling around I saw this algorthim which made sense to me.
int interpret24bitAsInt32(byte[] byteArray) {
int newInt = (
((0xFF & byteArray[0]) << 16) |
((0xFF & byteArray[1]) << 8) |
(0xFF & byteArray[2])
);
if ((newInt & 0x00800000) > 0) {
newInt |= 0xFF000000;
} else {
newInt &= 0x00FFFFFF;
}
return newInt;
}
The problem is a syntax issue I am restricting to working by the way the other guy had programmed this. I am not understanding how I can store the CHAR "data" into an INT. Wouldn't it make more sense if "data" was an Array? Since its receiving 24 integers of information stored into a BYTE.
double BinaryFile::from24bitToDouble(char *data) {
int32_t iValue;
// ****************************
// Start code implementation
// Task: Fill iValue with the 24bit integer located at data.
// The first byte is the LSB.
// ****************************
//iValue +=
// ****************************
// End code implementation
// ****************************
return static_cast<double>(iValue) / FACTOR;
}
bool BinaryFile::readNext(DataRecord &record)
{
const size_t RECORD_SIZE = 6;
char buffer[RECORD_SIZE];
m_ifs.read(buffer,RECORD_SIZE);
if (m_ifs) {
record.latitude = toDegrees(from24bitToDouble(&buffer[0]));
record.longitude = toDegrees(from24bitToDouble(&buffer[3]));
return true;
}
return false;
}
double BinaryFile::toDegrees(double radians) const
{
static const double PI = 3.1415926535897932384626433832795;
return radians * 180.0 / PI;
}
I appreciate any help or hints even if you dont understand a clue or hint will help me alot. I just need to talk to someone.

I am not understanding how I can store the CHAR "data" into an INT.
Since char is a numeric type, there is no problem combining them into a single int.
Since its receiving 24 integers of information stored into a BYTE
It's 24 bits, not bytes, so there are only three integer values that need to be combined.
An easier way of producing the same result without using conditionals is as follows:
int interpret24bitAsInt32(byte[] byteArray) {
return (
(byteArray[0] << 24)
| (byteArray[1] << 16)
| (byteArray[2] << 8)
) >> 8;
}
The idea is to store the three bytes supplied as an input into the upper three bytes of the four-byte int, and then shift it down by one byte. This way the program would sign-extend your number automatically, avoiding conditional execution.
Note on portability: This code is not portable, because it assumes 32-bit integer size. To make it portable use <cstdint> types:
int32_t interpret24bitAsInt32(const std::array<uint8_t,3> byteArray) {
return (
(const_cast<int32_t>(byteArray[0]) << 24)
| (const_cast<int32_t>(byteArray[1]) << 16)
| (const_cast<int32_t>(byteArray[2]) << 8)
) >> 8;
}
It also assumes that the most significant byte of the 24-bit number is stored in the initial element of byteArray, then comes the middle element, and finally the least significant byte.
Note on sign extension: This code automatically takes care of sign extension by constructing the value in the upper three bytes and then shifting it to the right, as opposed to constructing the value in the lower three bytes right away. This additional shift operation ensures that C++ takes care of sign-extending the result for us.

When an unsigned char is casted to an int the higher order bits are filled with 0's
When a signed char is casted to a casted int, the sign bit is extended.
ie:
int x;
char y;
unsigned char z;
y=0xFF
z=0xFF
x=y;
/*x will be 0xFFFFFFFF*/
x=z;
/*x will be 0x000000FF*/
So, your algorithm, uses 0xFF as a mask to remove C' sign extension, ie
0xFF == 0x000000FF
0xABCDEF10 & 0x000000FF == 0x00000010
Then uses bit shifts and logical ands to put the bits in their proper place.
Lastly checks the most significant bit (newInt & 0x00800000) > 0 to decide if completing with 0's or ones the highest byte.

int32_t upperByte = ((int32_t) dataRx[0] << 24);
int32_t middleByte = ((int32_t) dataRx[1] << 16);
int32_t lowerByte = ((int32_t) dataRx[2] << 8);
int32_t ADCdata32 = (((int32_t) (upperByte | middleByte | lowerByte)) >> 8); // Right-shift of signed data maintains signed bit

Writing values as an arbitrary amount of bits into a byte buffer in C++

Hey, I need to pack bit values into a byte buffer in C++. My Buffer class has a char array and a position, similar to Java's ByteBuffer. I need a good way to pack bits into this buffer, like so:
void put_bits(int amount, uint32_t value);
It needs to support up to 32 bits. I've seen a solution implemented in Java (that requires start/end access methods before bits can be packed) but I'm not sure how to do this in C++ because the endianness and other low level factors aren't hidden like they are in Java.
I have an inline function declared as endianness() which returns 0 (defined as BIG_ENDIAN) or 1 (defined as LITTLE_ENDIAN) that can be used, but I'm just not sure how to properly pack bits into a byte buffer.
This is the Java version of what I need to implement:
public void writeBits(int numBits, int value) {
int bytePos = bitPosition >> 3;
int bitOffset = 8 - (bitPosition & 7);
bitPosition += numBits;
for(; numBits > bitOffset; bitOffset = 8) {
buffer[bytePos] &= ~ bitMaskOut[bitOffset];
buffer[bytePos++] |= (value >> (numBits-bitOffset)) & bitMaskOut[bitOffset];
numBits -= bitOffset;
}
if(numBits == bitOffset) {
buffer[bytePos] &= ~ bitMaskOut[bitOffset];
buffer[bytePos] |= value & bitMaskOut[bitOffset];
}
else {
buffer[bytePos] &= ~ (bitMaskOut[numBits]<<(bitOffset - numBits));
buffer[bytePos] |= (value&bitMaskOut[numBits]) << (bitOffset - numBits);
}
}
Which requires these two methods as well:
public void initBitAccess() {
bitPosition = currentOffset * 8;
}
public void finishBitAccess() {
currentOffset = (bitPosition + 7) / 8;
}
How should I go about solving this? Thanks.
EDIT: I also still need to be able to write normal bytes before and after writing bits.

Just remove all the public keywords, and I would say that you have your C++ implementation right there.

As long as you use the byte buffer only as such, you can translate the Java code one-to-one. It only gets dangerous if you interpret a byte pointer as another type and try to store a complete int in the byte buffer.
You don't even need the endianness function in this case, since you store a byte in a byte buffer, and there is nothing to convert or adjust size or whatever.

bit pattern matching and replacing

I come across a very tricky problem with bit manipulation.
As far as I know, the smallest variable size to hold a value is one byte of 8 bits. The bit operations available in C/C++ apply to an entire unit of bytes.
Imagine that I have a map to replace a binary pattern 100100 (6 bits) with a signal 10000 (5 bits). If the 1st byte of input data from a file is 10010001 (8 bits) being stored in a char variable, part of it matches the 6 bit pattern and therefore be replaced by the 5 bit signal to give a result of 1000001 (7 bits).
I can use a mask to manipulate the bits within a byte to get a result of the left most bits to 10000 (5 bit) but the right most 3 bits become very tricky to manipulate. I cannot shift the right most 3 bits of the original data to get the correct result 1000001 (7 bit) followed by 1 padding bit in that char variable that should be filled by the 1st bit of next followed byte of input.
I wonder if C/C++ can actually do this sort of replacement of bit patterns of length that do not fit into a Char (1 byte) variable or even Int (4 bytes). Can C/C++ do the trick or we have to go for other assembly languages that deal with single bits manipulations?
I heard that Power Basic may be able to do the bit-by-bit manipulation better than C/C++.

If time and space are not important then you can convert the bits to a string representation and perform replaces on the string, then convert back when needed. Not an elegant solution but one that works.

<< shiftleft
^ XOR
>> shift right
~ one's complement
Using these operations, you could easily isolate the pieces that you are interested in and compare them as integers.
say the byte 001000100 and you want to check if it contains 1000:
char k = (char)68;
char c = (char)8;
int i = 0;
while(i<5){
if((k<<i)>>(8-3-i) == c){
//do stuff
break;
}
}
This is very sketchy code, just meant to be a demonstration.

I wonder if C/C++ can actually do this
sort of replacement of bit patterns of
length that do not fit into a Char (1
byte) variable or even Int (4 bytes).
What about std::bitset?

Here's a small bit reader class which may suit your needs. Of course, you may want to create a bit writer for your use case.
#include <iostream>
#include <sstream>
#include <cassert>
class BitReader {
public:
typedef unsigned char BitBuffer;
BitReader(std::istream &input) :
input(input), bufferedBits(8) {
}
BitBuffer peekBits(int numBits) {
assert(numBits <= 8);
assert(numBits > 0);
skipBits(0); // Make sure we have a non-empty buffer
return (((input.peek() << 8) | buffer) >> bufferedBits) & ((1 << numBits) - 1);
}
void skipBits(int numBits) {
assert(numBits >= 0);
numBits += bufferedBits;
while (numBits > 8) {
buffer = input.get();
numBits -= 8;
}
bufferedBits = numBits;
}
BitBuffer readBits(int numBits) {
assert(numBits <= 8);
assert(numBits > 0);
BitBuffer ret = peekBits(numBits);
skipBits(numBits);
return ret;
}
bool eof() const {
return input.eof();
}
private:
std::istream &input;
BitBuffer buffer;
int bufferedBits; // How many bits are buffered into 'buffer' (0 = empty)
};

Use a vector<bool> if you can read your data into the vector mostly at once. It may be more difficult to find-and-replace sequences of bits, though.

If I understood your questions correctly, you have an input stream and and output stream and you want to replace the 6bits of the input with 5 in the output - and your output still should be a bit stream?
So, the most important programmer's rule can be applied: Divide et impera!
You should split your component in three parts:
Input Stream converter: Convert every pattern in the input stream to a char array (ring) buffer. If I understood you correctly your input "commands" are 8bit long, so there is nothing special about this.
Do the replacement on the ring buffer in a way that you replace every matching 6-bit pattern with the 5bit one, but "pad" the 5 bit with a leading zero, so the total length is still 8bit.
Write an output handler that reads from the ring buffer and let this output handler write only the 7 LSB to the output stream from each input byte. Of course some bit manipulation is necessary again for this.
If your ring buffer size can be divided by 8 and 7 (= is a multiple of 56) you will have a clean buffer at the end and can start again with 1.
The most simplest way to implement this is to iterate over this 3 steps as long as input data is available.
If a performance really matters and you are running on a multi-core CPU you even could split the steps and 3 threads, but then you must carefully synchronize the access to the ring buffer.

I think the following does what you want.
PATTERN_LEN = 6
PATTERNMASK = 0x3F //6 bits
PATTERN = 0x24 //b100100
REPLACE_LEN = 5
REPLACEMENT = 0x10 //b10000
void compress(uint8* inbits, uint8* outbits, int len)
{
uint16 accumulator=0;
int nbits=0;
uint8 candidate;
while (len--) //for all input bytes
{
//for each bit (msb first)
for (i=7;i<=0;i--)
{
//add 1 bit to accumulator
accumulator<<=1;
accumulator|=(*inbits&(1<<i));
nbits++;
//check for pattern
candidate = accumulator&PATTERNMASK;
if (candidate==PATTERN)
{
//remove pattern
accumulator>>=PATTERN_LEN;
//add replacement
accumulator<<=REPLACE_LEN;
accumulator|=REPLACMENT;
nbits+= (REPLACE_LEN - PATTERN_LEN);
}
}
inbits++;
//move accumulator to output to prevent overflow
while (nbits>8)
{
//copy the highest 8 bits
nbits-=8;
*outbits++ = (accumulator>>nbits)&0xFF;
//clear them from accumulator
accumulator&= ~(0xFF<<nbits);
}
}
//copy remainder of accumulator to output
while (nbits>0)
{
nbits-=8;
*outbits++ = (accumulator>>nbits)&0xFF;
accumulator&= ~(0xFF<<nbits);
}
}
You could use a switch or a loop in the middle to check the candidate against multiple patterns. There might have to be some special handling after doing a replacment to ensure the replacement pattern is not re-checked for matches.

#include <iostream>
#include <cstring>
size_t matchCount(const char* str, size_t size, char pat, size_t bsize) noexcept
{
if (bsize > 8) {
return 0;
}
size_t bcount = 0; // curr bit number
size_t pcount = 0; // curr bit in pattern char
size_t totalm = 0; // total number of patterns matched
const size_t limit = size*8;
while (bcount < limit)
{
auto offset = bcount%8;
char c = str[bcount/8];
c >>= offset;
char tpat = pat >> pcount;
if ((c & 1) == (tpat & 1))
{
++pcount;
if (pcount == bsize)
{
++totalm;
pcount = 0;
}
}
else // mismatch
{
bcount -= pcount; // backtrack
//reset
pcount = 0;
}
++bcount;
}
return totalm;
}
int main(int argc, char** argv)
{
const char* str = "abcdefghiibcdiixyz";
char pat = 'i';
std::cout << "Num matches = " << matchCount(str, 18, pat, 7) << std::endl;
return 0;
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to convert an array of bits to a char - c++

Related

How to safely extract a signed field from a uint32_t into a signed number (int or uint32_t)

How to grab specific bits from a 256 bit message?

Converting 24 bit integer (2s complement) to 32 bit integer in C++

Writing values as an arbitrary amount of bits into a byte buffer in C++

bit pattern matching and replacing

Categories

Resources