Nibble shifting - c++

I was working on an encryption algorithm and I wonder how I can change the following code into something simpler and how to reverse this code.
typedef struct {
    unsigned low : 4;
    unsigned high : 4;
} nibles;

static void crypt_enc(char *data, int size)
{
    char last = 0;
    // Pass 2
    for (int i = 0; i < size; i++) {
        nibles *n = (nibles *)&data[i];
        n->low = last;
        last = n->high;
        n->high = n->low;
    }
    ((nibles *)&data[0])->low = last;
}
data is the input and the output for this code.

You are setting both nibbles of every byte to the same value, because at the end of each iteration you set the high nibble to the low nibble you have just overwritten. I'll assume this is a bug and that your intention was to shift all the nibbles in the data, carrying from one byte to the next and wrapping around. That is, ABCDEF (nibbles ordered from low to high) would become FABCDE. Please correct me if I got that wrong.
The code should be something like:
static void crypt_enc(char *data, int size)
{
    char last = 0;
    // Pass 2
    for (int i = 0; i < size; i++) {
        nibles *n = (nibles *)&data[i];
        unsigned char old_low = n->low;  // save it before it is overwritten
        n->low = last;
        last = n->high;
        n->high = old_low;
    }
    ((nibles *)&data[0])->low = last;
}
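Is everything okay now? No. The cast to nibles* is only well-defined if the alignment of nibles is not stricter than the alignment of char, and that is not guaranteed (however, with a small change, GCC generates a type with the same alignment). The order in which bit-fields are laid out inside the struct is also implementation-defined, so low is not guaranteed to be the four least significant bits.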
Personally, I'd avoid this issue altogether. Here's how I'd do it:
void set_low_nibble(char& c, unsigned char nibble) {
    // assumes nibble has no bits set in the four higher bits
    unsigned char& b = reinterpret_cast<unsigned char&>(c);
    b = (b & 0xF0) | nibble;
}

void set_high_nibble(char& c, unsigned char nibble) {
    // assumes nibble has no bits set in the four higher bits
    unsigned char& b = reinterpret_cast<unsigned char&>(c);
    b = (b & 0x0F) | (nibble << 4);
}

unsigned char get_low_nibble(unsigned char c) {
    return c & 0x0F;
}

unsigned char get_high_nibble(unsigned char c) {
    return (c & 0xF0) >> 4;
}

static void crypt_enc(char *data, int size)
{
    unsigned char last = 0;  // initialised so the first set_low_nibble gets a valid nibble
    // Pass 2
    for (int i = 0; i < size; ++i) {
        unsigned char old_low = get_low_nibble(data[i]);
        set_low_nibble(data[i], last);
        last = get_high_nibble(data[i]);
        set_high_nibble(data[i], old_low);
    }
    set_low_nibble(data[0], last);
}
Doing the reverse amounts to changing "low" to "high" and vice-versa; rolling to the last nibble, not the first; and going through the data in the opposite direction:
for (i = size-1; i >= 0; --i) {
    unsigned char old_high = get_high_nibble(data[i]);
    set_high_nibble(data[i], last);
    last = get_low_nibble(data[i]);
    set_low_nibble(data[i], old_high);
}
set_high_nibble(data[size-1], last);
If you want you can get rid of all the transfers to the temporary last. You just need to save the last nibble of all, and then shift the nibbles directly without the use of another variable:
last = get_high_nibble(data[size-1]);
for (i = size-1; i > 0; --i) { // the last one needs special care
    set_high_nibble(data[i], get_low_nibble(data[i]));
    set_low_nibble(data[i], get_high_nibble(data[i-1]));
}
set_high_nibble(data[0], get_low_nibble(data[0]));
set_low_nibble(data[0], last);
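For reference, here is a minimal, self-contained sketch of the rotation and its inverse that works on unsigned bytes with plain shifts and masks, so no bit-fields or casts are involved. The names rotate_nibbles_forward and rotate_nibbles_backward are hypothetical, and the sketch assumes the nibble order used above (low nibble first within each byte), so that ABCDEF becomes FABCDE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers, not taken from the original post.
// Nibble order: low nibble first within each byte, bytes in array order,
// so {0xBA, 0xDC, 0xFE} reads as the nibble sequence A B C D E F.

// Move every nibble one position toward the end; the very last nibble
// (high nibble of the last byte) wraps around to the front.
void rotate_nibbles_forward(std::uint8_t* data, std::size_t size) {
    if (size == 0) return;
    assert(data != nullptr);
    std::uint8_t carry = static_cast<std::uint8_t>(data[size - 1] >> 4);  // wraps around
    for (std::size_t i = 0; i < size; ++i) {
        // The old high nibble becomes the low nibble of the next byte.
        const std::uint8_t next_carry = static_cast<std::uint8_t>(data[i] >> 4);
        data[i] = static_cast<std::uint8_t>((data[i] << 4) | carry);
        carry = next_carry;
    }
}

// Inverse of rotate_nibbles_forward: move every nibble one position toward
// the front; the very first nibble wraps around to the end.
void rotate_nibbles_backward(std::uint8_t* data, std::size_t size) {
    if (size == 0) return;
    assert(data != nullptr);
    std::uint8_t carry = static_cast<std::uint8_t>(data[0] & 0x0F);  // wraps around
    for (std::size_t i = size; i-- > 0;) {
        // The old low nibble becomes the high nibble of the previous byte.
        const std::uint8_t next_carry = static_cast<std::uint8_t>(data[i] & 0x0F);
        data[i] = static_cast<std::uint8_t>((carry << 4) | (data[i] >> 4));
        carry = next_carry;
    }
}
```

Applying the forward rotation to {0xBA, 0xDC, 0xFE} gives {0xAF, 0xCB, 0xED}, and the backward rotation restores the original bytes.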

It looks like you're just shifting each nibble one place and then taking the low nibble of the last byte and moving it to the beginning. Just do the reverse to decrypt (start at the end of data, move to the beginning)

As you are using bit fields, it is very unlikely that there will be a shift style method to move nibbles around. If this shifting is important to you, then I recommend you consider storing them in an unsigned integer of some sort. In that form, bit operations can be performed effectively.

Kevin's answer is right in what you are attempting to do. However, you've made an elementary mistake. The end result is that your whole array is filled with zeros instead of rotating nibbles.
To see why that is the case, I'd suggest you first implement a byte rotation ({a, b, c} -> {c, a, b}) the same way - which is by using a loop counter increasing from 0 to array size. See if you can do better by reducing transfers into the variable last.
Once you see how you can do that, you can simply apply the same logic to nibbles ({al:ah, bl:bh, cl:ch} -> {ch:al, ah:bl, bh:cl}). My representation here is incorrect if you think in terms of hex values. The hex value 0xXY is Y:X in my notation. If you think about how you've done the byte rotation, you can figure out how to save only one nibble, and simply transfer nibbles without actually moving them into last.
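As a hedged illustration of the byte-rotation exercise suggested above, one possible version saves only the final byte and moves the others in place by walking from the end toward the start. The function name rotate_bytes is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: rotate bytes one place toward the end,
// so {a, b, c} becomes {c, a, b}.
void rotate_bytes(unsigned char* data, std::size_t size) {
    if (size < 2) return;  // nothing to rotate
    assert(data != nullptr);
    const unsigned char last = data[size - 1];  // the only value that must be saved
    // Walk backwards so each source byte is read before it is overwritten.
    for (std::size_t i = size - 1; i > 0; --i) {
        data[i] = data[i - 1];
    }
    data[0] = last;
}
```

Walking forward from index 0 instead would require carrying each overwritten byte in a temporary, which is the extra transfer the answer asks you to eliminate.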

Reversing the code is impossible as the algorithm nukes the first byte entirely and discards the lower half of the rest.
On the first iteration of the for loop, the lower part of the first byte is set to zero.
n->low = last;
It's never saved off anywhere. It's simply gone.
// I think this is what you were trying for
last = ((nibles *)&data[0])->low;
for (i = 0; i < size-1; i++) {
    nibles *n = (nibles *)&data[i];
    nibles *next = (nibles *)&data[i+1];
    n->low = n->high;
    n->high = next->low;
}
// the final byte also shifts its high nibble down before taking the wrapped one
nibles *end = (nibles *)&data[size-1];
end->low = end->high;
end->high = last;
To reverse it:
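last = ((nibles *)&data[size-1])->high;
for (i = size-1; i > 0; i--) {
    nibles *n = (nibles *)&data[i];
    nibles *prev = (nibles *)&data[i-1];
    n->high = n->low;
    n->low = prev->high;
}
// the first byte also shifts its low nibble up before taking the wrapped one
nibles *first = (nibles *)&data[0];
first->high = first->low;
first->low = last;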
... unless I got high and low backwards.
But anyway, this is NOWHERE near the field of encryption. This is obfuscation at best. Security through obscurity is a terrible, terrible practice, and home-brew encryption gets people in trouble. If you're playing around, all the more power to you. But if you actually want something to be secure, please, for the love of all your bytes, use a well-known and secure encryption scheme.


Fast bitwise comparison of unaligned bit streams

I have two bit streams A[1..a] and B[1..b], where a is always smaller than b. Now, given an index c in B, I want to know if A matches the area B[c..c+a-1] (assume c+a-1<=b always holds).
I can't just use memcmp because A and B[c..c+a-1] are not necessarily byte-aligned.
So I have a custom function that compares A and B[c..c+a-1] bitwise, where B is encoded within a class that performs bit operations. This is my C++ code:
struct bitstream{
    constexpr static uint8_t word_bits = 64;
    constexpr static uint8_t word_shift = 6;
    const static size_t masks[65];
    size_t *B;

    inline bool compare_chunk(const void* A, size_t a, size_t c) {
        size_t n_words = a / word_bits;
        size_t left = c & (word_bits - 1UL);
        size_t right = word_bits - left;
        size_t cell_i = c >> word_shift;
        auto tmp_in = reinterpret_cast<const size_t *>(A);
        size_t tmp_data;
        //shift every cell in B[c..c+a-1] to compare it against A
        for(size_t k=0; k < n_words - 1; k++){
            tmp_data = (B[cell_i] >> left) & masks[right];
            tmp_data |= (B[++cell_i] & masks[left]) << right;
            if(tmp_data != tmp_in[k]) return false;
        }
        size_t read_bits = (n_words - 1) << word_shift;
        return (tmp_in[n_words - 1] & masks[(a-read_bits)]) == read(c + read_bits, c+a-1);
    }

    inline size_t read(size_t i, size_t j) const{
        size_t cell_i = i >> word_shift;
        size_t i_pos = (i & (word_bits - 1UL));
        size_t cell_j = j >> word_shift;
        if(cell_i == cell_j){
            return (B[cell_i] >> i_pos) & masks[(j - i + 1UL)];
        }
        size_t right = word_bits-i_pos;
        size_t left = 1+(j & (word_bits - 1UL));
        return ((B[cell_j] & masks[left]) << right) | ((B[cell_i] >> i_pos) & masks[right]);
    }
};
const size_t bitstream::masks[65]={0x0,
0x1,0x3, 0x7,0xF,
0x1F,0x3F, 0x7F,0xFF,
0x1FF,0x3FF, 0x7FF,0xFFF,
0x1FFF,0x3FFF, 0x7FFF,0xFFFF,
The function read belongs to the class that wraps B and reads an area of B of at most 64 bits.
The code above works, but it seems to be the bottleneck of my application (I run it exhaustively over massive inputs).
Now, my question is: do you know if there is a technique to compare A and B[c..c+a-1] faster?
I know I could use SIMD instructions, but I don't think it will produce a significant improvement as B is encoded in 64-bit cells.
Here are some extra details:
A is usually short (maybe 20 or 30 64-bit cells), but there is no guarantee. It could also be arbitrarily large, although always smaller than B.
I can't make any assumption about A's encoding. It could be uint8_t, uint16_t, uint32_t or uint64_t. That is the reason I pass it as void* to the function.
Link to godbolt with the code above compiling example
A few things you can try:
You can't simply cast A to size_t*: the pointer may not be 8-byte aligned, and reading through it can violate the strict-aliasing rules. Either go byte by byte, or handle the unaligned beginning and end separately from the 8-byte-aligned middle.
Move the declaration of tmp_data inside the loop as a single 'size_t const tmp_data' assignment, refer to B[cell_i] and B[cell_i+1], and increment cell_i in the for statement. That makes it much easier for the compiler to detect that it can unroll the loop.
Finally, if memory is not an issue, you can keep 8 copies of B (each shifted one more bit to the right) and use the one in which B[c] is the beginning of a new byte. Then you can use memcmp, which will presumably give you the fastest code.
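The first two suggestions can be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's class: read_word and compare_bits are hypothetical names, B is taken as an array of 64-bit words with bit 0 stored in the least significant bit of B[0], the host is little-endian, and B has one readable padding word after the last word that holds data. A is read with memcpy, which is valid whatever its alignment or element type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: fetch 64 bits of B starting at bit position pos.
// Assumes one readable padding word after the last data word of B.
inline std::uint64_t read_word(const std::uint64_t* B, std::size_t pos) {
    const std::size_t cell = pos >> 6;
    const unsigned shift = static_cast<unsigned>(pos & 63u);
    const std::uint64_t lo = B[cell] >> shift;
    // Shifting by 64 is undefined, so the aligned case skips the second word.
    const std::uint64_t hi = shift ? (B[cell + 1] << (64 - shift)) : 0;
    return lo | hi;
}

// Hypothetical helper: does the a-bit string stored at A equal B[c .. c+a-1]?
bool compare_bits(const void* A, std::size_t a, const std::uint64_t* B, std::size_t c) {
    assert(A != nullptr && B != nullptr);
    const unsigned char* src = static_cast<const unsigned char*>(A);
    const std::size_t full_words = a / 64;
    for (std::size_t k = 0; k < full_words; ++k) {
        std::uint64_t expected;
        std::memcpy(&expected, src + 8 * k, sizeof expected);  // alignment-safe load
        if (read_word(B, c + 64 * k) != expected) return false;
    }
    const std::size_t tail_bits = a % 64;
    if (tail_bits == 0) return true;
    std::uint64_t expected = 0;
    // Copy only the bytes that belong to A (little-endian host assumed).
    std::memcpy(&expected, src + 8 * full_words, (tail_bits + 7) / 8);
    const std::uint64_t mask = (std::uint64_t{1} << tail_bits) - 1;
    return ((read_word(B, c + 64 * full_words) ^ expected) & mask) == 0;
}
```

Each loop iteration depends only on k, which gives the compiler a fair chance to unroll it. Whether this is actually faster than the original has to be measured on the real workload.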

Convert array values of 1's and 0's to binary

In Arduino IDE, I am placing all of input values to an array like so:
int eOb1 = digitalRead(PrOb1);
int eLoop = digitalRead(PrLoop);
int eOb2 = digitalRead(PrOb2);
InputValues[0] = eOb1;
InputValues[1] = eLoop;
InputValues[2] = eOb2;
InputValues[3] = 0;
InputValues[4] = 0;
InputValues[5] = 0;
InputValues[6] = 0;
InputValues[7] = 0;
I would like to convert it to a byte array like so: 00000111.
Can you show me how, please? I tried using a for loop to iterate through the values, but it doesn't work.
char bin[8];
for(int i = 0; i < 8; ++i) {
    bin &= InputValues[i];
}
If I understand your requirement correctly, you have an array of individual bits and you need to convert it into a byte that has the corresponding bits set.
So to start, you should declare bin to be of type unsigned char instead of char[8]. char[8] means an array of 8 bytes, whereas you only need a single byte.
Then you need to initialize it to 0. (This is important since |= needs the variable to have some defined value.)
unsigned char bin = 0;
Now, unsigned char is guaranteed to be 1 byte in size, but a byte is not guaranteed to be exactly 8 bits. So you should use something like uint8_t if it is available.
Finally, you can set the appropriate bits in bin like this:
for(int i = 0; i < 8; ++i) {
    bin |= (InputValues[i] << i);
}
There are two things I have changed.
I used |= instead of &=. This is the bitwise OR operator. You need OR because it only sets the correct bits in the left-hand side and leaves the other bits untouched. An AND won't necessarily set that bit and will also mask away (set to 0) the other bits.
I shifted each bit from the array to its corresponding position using << i.
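Put together, the loop could be wrapped in a small function along these lines. This is a sketch only: the name pack_bits is hypothetical, element i becomes bit i (matching the << i above), and any non-zero element is treated as a 1, which is an assumption that goes slightly beyond the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: pack up to 8 flags into one byte, element i -> bit i.
std::uint8_t pack_bits(const int* values, std::size_t count) {
    assert(values != nullptr);
    std::uint8_t packed = 0;  // must start at 0 so that |= only adds bits
    for (std::size_t i = 0; i < count && i < 8; ++i) {
        if (values[i] != 0) {
            packed |= static_cast<std::uint8_t>(1u << i);
        }
    }
    return packed;
}
```

With the question's array {1, 1, 1, 0, 0, 0, 0, 0} this returns 0x07, that is, 00000111 in binary.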

Convert hex integer into form "\x" (c++ - memory)

DWORD FindPattern(DWORD base, DWORD size, char *pattern, char *mask) {
    // Get length for our mask, this will allow us to loop through our array
    DWORD patternLength = (DWORD)strlen(mask);
    for (DWORD i = 0; i < size - patternLength; i++) {
        bool found = true;
        for (DWORD j = 0; j < patternLength; j++) {
            // If we have a ? in our mask then we have true by default,
            // or if the bytes match then we keep searching until finding it or not
            found &= mask[j] == '?' || pattern[j] == *(char*)(base + i + j);
        }
        // Found = true, our entire pattern was found
        // Return the memory addy so we can write to it
        if (found) {
            return base + i;
        }
    }
    return NULL;
}
Above is my FindPattern function, which I use to find bytes in a given section of memory. Here's how I call it:
DWORD PATTERN = FindPattern(0xC0000000, 0x20000,"\x1F\x37\x66\xE3", "xxxx");
PrintStringBottomCentre("%02x", PATTERN);
Now, say I had an integer for example: 0xDEADBEEF
I want to convert this into a char pointer like: "\xDE\xAD\xBE\xEF", this is so that I can put it into my FindPattern function. How would I do this?
You have to be careful here. On many architectures including x86, ints are stored using little endian, meaning that the int 0xDEADBEEF is stored in memory in this order: EF BE AD DE. But the char array is stored in the order DE AD BE EF.
So the question is, are you trying to find an int 0xDEADBEEF stored in memory, or do you actually want the sequence of bytes DE AD BE EF?
If you want the int, don't use a char* array at all. Pass in your pattern and mask as DWORDs, and you can simplify that function a lot.
If you want to find the sequence of bytes, then don't store it as an int in the first place. Just get the input as a char array and pass it directly in as your pattern.
Edit: you can try something like this, which I think will give you what you want:
DWORD a = 0xDEADBEEF;  // unsigned: 0xDEADBEEF does not fit in a 32-bit signed int
char pattern[4];
pattern[0] = (a >> 24) & 0xFF;
pattern[1] = (a >> 16) & 0xFF;
pattern[2] = (a >> 8) & 0xFF;
pattern[3] = a & 0xFF;
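The same byte extraction can be sketched as a small helper. The name to_big_endian_bytes is hypothetical and std::uint32_t stands in for DWORD. Because it uses shifts rather than reinterpreting memory, the output order is DE AD BE EF regardless of the host's endianness.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: write the four bytes of value into out,
// most significant byte first.
void to_big_endian_bytes(std::uint32_t value, unsigned char* out) {
    assert(out != nullptr);
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(value & 0xFF);
}
```

To pass the result to a function that takes char*, such as the FindPattern above, the buffer would need a reinterpret_cast<char*>. Note that the buffer is not null-terminated, which is fine here because FindPattern measures the mask, not the pattern.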
The \ character in C/C++ is an escape character, so anything that follows it is translated to the escape character you want, hex conversion (\x) in your string. In order to avoid that, add another \ before it so it will be considered as a normal character.
Ex.) \\xDE\\xAD\\xBE\\xEF

Checking a pattern of bits in a sequence

So basically I need to check whether a certain sequence of bits occurs in another sequence of bits (32 bits).
The function should take 3 arguments:
n, the number of rightmost bits of a value to use
a value
the sequence in which the n bits should be checked for occurrence
The function has to return the bit number at which the desired sequence starts. Example: check whether the last 3 bits of 0x5 occur in 0xe1f4.
void bitcheck(unsigned int source, int operand,int n)
int i,lastbits,mask;
for(i=0; i<32; i++)
printf("It start at bit number %i\n",i+n);
Your loop goes too far, I'm afraid. It could, for example 'find' the bit pattern '0001' in a value ~0, which consists of ones only.
This will do better (I hope):
void checkbit(unsigned value, unsigned pattern, unsigned n)
{
    unsigned size = 8 * sizeof value;
    if( 0 < n && n <= size) {
        unsigned mask = ~0U >> (size - n);
        pattern &= mask;
        for(unsigned i = 0; i <= size - n; i ++, value >>= 1) {
            if((value & mask) == pattern) {
                printf("pattern found at bit position %u\n", i+n);
            }
        }
    }
}
I take you to mean that you want to take source as a bit array, and to search it for a bit sequence specified by the n lowest-order bits of operand. It seems you would want to perform a standard mask & compare; the only (minor) complication being that you need to scan. You seem already to have that idea.
I'd write it like this:
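void bitcheck(uint32_t source, uint32_t operand, unsigned int n) {
    /* shifting a 32-bit value by 32 is undefined, so n == 32 is handled separately */
    uint32_t mask = (n < 32) ? ~(~UINT32_C(0) << n) : ~UINT32_C(0);
    uint32_t needle = operand & mask;
    int i;
    for(i = 0; i <= (32 - n); i += 1) {
        if (((source >> i) & mask) == needle) {
            /* found it */
        }
    }
}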
There are some differences in the details between mine and yours, but the main functional difference is the loop bound: you must be careful to ignore cases where some of the bits you compare against the target were introduced by a shift operation, as opposed to originating in source, lest you get false positives. The way I've written the comparison makes it clearer (to me) what the bound should be.
I also use the explicit-width integer data types from stdint.h for all values where the code depends on a specific width. This is an excellent habit to acquire if you want to write code that ports cleanly.
Finding 10 in 11 will succeed with your old code. In fact, your original condition will always be true when 'source' consists of all ones.
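A compact sketch of the mask-and-compare scan with the corrected loop bound is shown below. The name find_bit_pattern is hypothetical, and unlike the printf versions above it returns the offset of the lowest bit of each match (0 being the least significant bit) rather than printing i+n.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: every offset at which the lowest n bits of pattern
// occur inside value. Offsets count from the least significant bit.
std::vector<unsigned> find_bit_pattern(std::uint32_t value, std::uint32_t pattern, unsigned n) {
    assert(n >= 1 && n <= 32);
    // A shift by 32 would be undefined, so the full-width mask is a special case.
    const std::uint32_t mask =
        (n == 32) ? ~std::uint32_t{0} : ((std::uint32_t{1} << n) - 1);
    const std::uint32_t needle = pattern & mask;
    std::vector<unsigned> offsets;
    // Stop while all n compared bits still come from value, not from the shift.
    for (unsigned i = 0; i + n <= 32; ++i) {
        if (((value >> i) & mask) == needle) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

For the question's example, the lowest three bits of 0x5 (101) occur in 0xe1f4 only at offset 2, and searching for 0001 in a value of all ones finds nothing.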

GCC Bit-scan-forward to find next set bit?

I have a uint64_t and I would like to find the index of the first set bit, reset it to zero and find the next set bit.
How do I know when to terminate? BSF on all zeros is undefined...
const uint64_t input = source;
if(0 != input){
    int32_t setIndex = GCC_BSF_INTRINSIC(input);
    while(setIndex != UNDEFINED???){
        //Do my logic
        input[setIndex] = 0;
        setIndex = BSF_Variant(input);
    }
}
Could somebody please help?
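The simplest would be to just check the input (which has to be a non-const copy so that the found bit can be cleared):
while (input) {
    int32_t index = __builtin_ffsll(input);  // one plus the index of the lowest set bit
    // do stuff
    input &= input - 1;  // clear the bit just found, otherwise the loop never ends
}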
More complicatedly, according to the docs:
— Built-in Function: int __builtin_ffs (int x)
Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero.
Which lets you do:
for (int index = __builtin_ffsll(input);
     index;
     index = __builtin_ffsll(input)) {
    // do stuff, including clearing the bit that was just found
}
Which accomplishes the same thing, you just have to repeat the __builtin_ffsll call, so it's more verbose and in my opinion doesn't contribute to clarity.
Two points to keep in mind when using __builtin_ffs:
In order to get the next bit, you need to clear the most recently found bit.
If you are planning to use the result for bit shifting or table indexing, you will most likely need to decrease it by one.
while (m) {
    // Get the rightmost bit location (1-based).
    int BitOffset = __builtin_ffs(m);
    // Clear the bit before the next iteration; m is used in the loop condition.
    // Caveat: the shift is undefined when BitOffset equals the bit width of m.
    m = (m >> BitOffset) << BitOffset;
    // Do your stuff .....
}
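An alternative sketch for a uint64_t, assuming GCC or Clang: __builtin_ctzll returns the zero-based index of the lowest set bit directly (it is undefined for 0, which the loop condition rules out), and x &= x - 1 clears exactly that bit without any variable shift. The helper names pop_lowest_set_bit and set_bit_indices are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: remove the lowest set bit from bits and return its
// zero-based index. Precondition: bits != 0, since __builtin_ctzll(0) is undefined.
inline int pop_lowest_set_bit(std::uint64_t& bits) {
    assert(bits != 0);
    const int index = __builtin_ctzll(bits);
    bits &= bits - 1;  // clears exactly the lowest set bit
    return index;
}

// Hypothetical helper: indices of all set bits, lowest first.
std::vector<int> set_bit_indices(std::uint64_t bits) {
    std::vector<int> indices;
    while (bits != 0) {  // terminates once every set bit has been cleared
        indices.push_back(pop_lowest_set_bit(bits));
    }
    return indices;
}
```

The loop condition answers the termination question: iteration stops as soon as the working copy reaches zero, so the intrinsic is never called on an all-zero value.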