I am trying to fix this part of an abandonware program because I failed to find an alternative program.
As you can see, the data of PUSH instructions is printed in the wrong order, whereas Ethereum is a big-endian machine (addresses are correctly represented because they use a smaller type).
An alternative is to run porosity.exe --code '0x61004b60026319e44e32' --disassm
The u256 type is defined as
using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;
Here’s a minimal example to reproduce the bug:
#include <sstream>
#include <iostream>
#include <iomanip>
#include <boost/multiprecision/cpp_int.hpp>
using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;
int main() {
std::stringstream stream;
u256 data=0xFEDEFA;
for (int i = 0; i<5; ++i) { // print only the first 5 bytes
uint8_t dataByte = int(data & 0xFF);
data >>= 8;
stream << std::setfill('0') << std::setw(sizeof(char) * 2) << std::hex << int(dataByte) << " ";
}
std::cout << stream.str();
}
So numbers are converted to string with a space between each byte (and only the first bytes).
But then I ran into an endianness problem: bytes were printed in the reverse order. I mean for example, 31722 is written 8a 02 02 on my machine and 02 02 8a when compiled for a big endian target.
So as I don't know which Boost function to call, I modified the code:
#include <sstream>
#include <iostream>
#include <iomanip>
#include <boost/multiprecision/cpp_int.hpp>
using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;
int main() {
std::stringstream stream;
u256 data=0xFEDEFA;
for (int i = 0; i<5; ++i) {
uint8_t dataByte = int(data >> ((32 - i - 1) * 8));
stream << std::setfill('0') << std::setw(sizeof(char) * 2) << std::hex << int(dataByte) << " ";
}
std::cout << stream.str();
}
Now, why are my 256-bit integers printed mostly as a series of 00 00 00 00 00?
BTW, this is not an endianness issue; you aren't doing byte accesses to the object-representation. You're operating on it as a 256-bit integer and simply asking for the low 8 bits at a time with data & 0xFF.
If you did know the endianness of the target C implementation, and the data layout of the boost object, you could efficiently loop over it in descending address order with unsigned char*.
You're introducing the idea of endianness only because it's associated with byte-reversal, which is what you're trying to do. But that's really inefficient; just loop over the bytes of your bigint the other way.
I'm hesitant to recommend a specific solution because I don't know what will compile efficiently. But you might want something like this instead of byte-reversing ahead of time:
for (outer loop) {
uint64_t chunk = data >> (64*3); // grab the highest 64-bit chunk
data <<= 64; // and shift everything up
// alternative: maybe keep a shift-count in a variable instead of modifying `data`
// Then pick apart the chunk into its component bytes, in MSB first order
for (int i = 0 ; i<8 ; i++) {
unsigned tmp = (chunk >> 56) & 0xFF;
// do something with it
chunk <<= 8; // bring the next byte to the top
}
}
In the inner loop, a rotate can be more efficient than two shifts: it brings the high byte to the bottom (for & 0xFF) at the same time as shifting the lower bytes upward. See "Best practices for circular shift (rotate) operations in C++".
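A rough sketch of the rotate idea, using a plain uint64_t as a stand-in for one 64-bit chunk of the big integer. The names rotl64 and bytes_msb_first are made up for this illustration and are not part of Boost or any other library:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Portable rotate-left; compilers commonly recognise this pattern
// and emit a single rotate instruction.
static inline uint64_t rotl64(uint64_t x, unsigned n) {
    assert(n < 64);
    return (x << n) | (x >> ((64 - n) & 63));
}

// Format the 8 bytes of `chunk` as hex pairs, most significant byte first.
std::string bytes_msb_first(uint64_t chunk) {
    static const char hex_lut[] = "0123456789abcdef";
    std::string out;
    for (int i = 0; i < 8; ++i) {
        chunk = rotl64(chunk, 8);        // the old top byte is now the low byte
        unsigned byte = chunk & 0xFF;
        if (i) out += ' ';
        out += hex_lut[byte >> 4];
        out += hex_lut[byte & 0xF];
    }
    return out;
}
```

After eight rotations the chunk is back to its original value, so nothing needs to be copied or restored.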
In the outer loop, IDK if boost::multiprecision::number has any APIs for efficient indexing of chunks built in; if so using that is probably more efficient.
I used nested loops because I assume data <<= 8 doesn't compile particularly efficiently, and neither would (data >> (256-8)) & 0xFF. But that's how you'd grab bytes from the top instead of the bottom.
Another option is the standard trick for converting numbers to strings: store characters into a buffer in descending order. A 256-bit (32-byte) number will take 64 hex digits, and you want another 32 bytes of spaces between them.
For example:
// 96 = 32 * 2 hex digits + 32 spaces, plus 1 byte for an implicit-length C string terminator
// plus another 1 spare byte
char buf[98]; // small enough to use automatic storage
char *outp = buf+96; // pointer to the end
*outp = 0; // terminator
const char *hex_lut = "0123456789abcdef";
for (int i=0 ; i<32 ; i++) {
uint8_t byte = static_cast<uint8_t>(data & 0xFF); // the conversion from u256 must be explicit
*--outp = hex_lut[byte & 0xF]; // storing backwards: the low nibble goes in first so it prints last
*--outp = hex_lut[byte >> 4];
*--outp = ' ';
data >>= 8;
}
// outp points at an extra ' '
outp++;
// outp points at the first byte of a string like "12 ab cd"
stream << outp;
If you want to break that up into chunks to put a line break in there, you can do that too.
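For reference, here is a self-contained sketch of the same descending-buffer trick that does not depend on Boost. It assumes the 256-bit value is held as four 64-bit limbs, least significant limb first; that layout, the Limbs alias and the to_spaced_hex name are all assumptions of this example:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// 256-bit value as four 64-bit limbs, least significant limb first (assumed layout).
using Limbs = std::array<uint64_t, 4>;

// Hex pairs separated by spaces, most significant byte first,
// covering the lowest `nbytes` bytes of the value.
std::string to_spaced_hex(const Limbs& v, int nbytes = 32) {
    assert(nbytes >= 1 && nbytes <= 32);
    static const char hex_lut[] = "0123456789abcdef";
    std::string buf(static_cast<std::size_t>(nbytes) * 3 - 1, ' ');  // "hh hh ... hh"
    std::size_t pos = buf.size();
    for (int i = 0; i < nbytes; ++i) {
        unsigned byte = (v[i / 8] >> (8 * (i % 8))) & 0xFF;  // byte i, counted from the LSB
        buf[--pos] = hex_lut[byte & 0xF];   // written backwards: low nibble first
        buf[--pos] = hex_lut[byte >> 4];
        if (pos) --pos;                     // step over the separating space
    }
    return buf;
}
```

Inserting a line break every so many bytes would only need a different separator character at the chosen positions.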
If you're interested in efficient conversion to hex for 8, 16 or 32 bytes of data at once, see How to convert a number to hex? for some x86 SIMD ways. The asm should port easily to C++ intrinsics. (You can use SIMD shuffles to handle putting bytes into MSB-first printing order after loading from little-endian integers.)
You could also use a SIMD shuffle to space-separate your pairs of hex digits before storing to memory like you apparently want here.
Bug in the code you added:
So I added this code before the loop above:
for(unsigned int i=0,data,_data;i<33;++i)
unsigned i, data, _data declares new variables of type unsigned int that shadow the previous declarations of data and _data. That loop has zero effect on data or _data outside the scope of the loop. (And contains UB because you read _data and data without initializing them.)
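A small sketch of the shadowing effect, with plain unsigned values standing in for the u256 variables (the function names are invented for the example):

```cpp
#include <cassert>

// The for-init statement declares a NEW `data` that hides the outer one.
unsigned outer_after_shadowing_loop() {
    unsigned data = 0xFEDEFA;
    for (unsigned i = 0, data = 0; i < 4; ++i) {
        data += i;                  // modifies the loop-local `data` only
    }
    assert(data == 0xFEDEFA);       // the outer variable was never touched
    return data;
}

// Without the redeclaration the loop works on the outer variable.
unsigned outer_after_plain_loop() {
    unsigned data = 0xFEDEFA;
    for (unsigned i = 0; i < 4; ++i) {
        data += i;                  // modifies the outer `data`
    }
    return data;                    // 0xFEDEFA + 0 + 1 + 2 + 3
}
```

Compiling with a warning such as -Wshadow (GCC and Clang) should flag the first loop.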
If those vars are actually both still the u256 vars of the outer scope, I don't see an obvious problem other than efficiency, but maybe I'm missing the obvious too. I didn't look very hard because using 64x 256-bit shifts and 32x ORs seems like a horrible idea. It's possible it could optimize away completely, or into bswap byte-reverse instructions on ISAs that have them, but I doubt it. Especially not through the extra complication of the boost::multiprecision::number wrapper functions.
Related
I need to create a program, that calculates CRC from file. It needs to be done bit by bit.
The way I would like to read a file:
unsigned char byte;
ifstream file;
bool result;
int number;
file.open("test.txt", ios::binary);
while(true)
{
byte = file.get();
number = (int)byte;
result = file.good();
if(!result)
break;
}
However, I don't know how to read it bit by bit.
My CRC's divisor (called a "polynomial") is 0x04C11DB7 and I need to import 1 new bit from file each time I calculate my buffer.
My idea is to add first 4 bytes to variable (for let's say "1234" it would be 0x31323334), then remove last bit (by moving the number 1 bit to the left), but I don't know how to add a new bit from the next char.
Do you mean something along these lines?
The CRC calculation may vary, but the focus here is on getting the file content "bit by bit".
#include <iostream>
#include <fstream>
int main(int argc, char* argv[])
{
unsigned char next;
unsigned long crc = 0;
if (argc < 2)
return -1;
std::ifstream fs(argv[1], std::ios::binary);
char c;
while (fs.get(c)) // stops at end of file or on error, and does not skip whitespace
{
next = static_cast<unsigned char>(c);
for (int i = 0; i < 8; i++)
{
crc += next & 1;
next >>= 1;
}
}
std::cout << "CRC " << crc << std::endl;
return 0;
}
The divisor is not just called a polynomial. It means that each bit is a coefficient of a polynomial (of degree 32), and thus computing with polynomials differs significantly from working with integers. You can add (and subtract, which is the same in this case) two polynomials with a simple XOR operation. Multiplying/dividing with/by X means shifting. To the right or to the left depends on the order in which the coefficients of the polynomials are written. This is important to know because both directions (left to right and right to left) actually exist. In the case of 0x04C11DB7, the coefficient of X^0 is bit 0 and the coefficient of X^31 is bit 31. Be aware that the popular implementation of the IEEE 802.3 CRC has the opposite bit order. So, just copying the implementation of an Ethernet CRC will not work.
This means the next bit to process is always bit 31. You must therefore check for 0x80000000 and remember whether it is set. Then shift the work register to the left; a 0 bit is shifted in at the right. Replace it with the next bit to process using a binary OR operation (| in C++). If the bit you checked was set, now XOR your polynomial into the register. This subtracts the polynomial from your work register: the set bit that was shifted out cancels the X^32 term, which 0x04C11DB7 leaves implicit. You obtain the next input bit in the same way: if you are reading byte by byte, your next bit is 1 or 0, depending on whether 0x80 is set in your input. Then shift your input to the left.
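A minimal sketch of that bit-by-bit procedure. It assumes a zero initial register and no final XOR, and it feeds 32 extra zero bits at the end to flush the register; those choices and the function names are assumptions of this example, not requirements stated in the question. The XOR is applied after the shift because 0x04C11DB7 omits the X^32 term.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const uint32_t POLY = 0x04C11DB7u;

// Shift one message bit into the work register (MSB-first convention).
inline uint32_t crc_shift_in(uint32_t reg, unsigned bit) {
    assert(bit <= 1);
    bool top = (reg & 0x80000000u) != 0;  // the bit about to be shifted out
    reg = (reg << 1) | bit;               // make room, then import the new bit
    if (top) reg ^= POLY;                 // "subtract" the polynomial
    return reg;
}

// Feed `len` bytes bit by bit, then 32 zero bits to flush the register.
uint32_t crc_bitwise(const unsigned char* data, std::size_t len) {
    uint32_t reg = 0;                     // zero start value: an assumption of this sketch
    for (std::size_t n = 0; n < len; ++n) {
        unsigned char in = data[n];
        for (int i = 0; i < 8; ++i) {
            reg = crc_shift_in(reg, (in & 0x80) ? 1u : 0u);
            in = static_cast<unsigned char>(in << 1);
        }
    }
    for (int i = 0; i < 32; ++i) reg = crc_shift_in(reg, 0);
    return reg;
}
```

With these particular parameters, running the same function over the message followed by its CRC (most significant byte first) yields zero.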
I am using the boost crc library to calculate a 32-bit crc of an 112-bit (80-bit data + 32-bit crc) bitset. For testing, I reset all 80 data bits to 0. The calculation of the crc seems to work fine, but when I append the calculated crc to the data and calculate the crc again, I get a value > 0. But I expected a crc value of exactly 0. Here is the code:
// creating 112-bit bitset and reset all values to 0
bitset<112> b_data;
b_data.reset();
// create string containing data and blank crc
string crc_string = b_data.to_string<char,string::traits_type,string::allocator_type>();
// calculating crc using boost library
boost::crc_32_type crc32;
crc32 = for_each(crc_string.begin(), crc_string.end(), crc32);
bitset<32> b_crc(crc32());
// writing calculated crc to the last 32-bit of the 112-bit bitset.
for(int i = 0; i!= b_crc.size(); i++){
b_data[i+80] = b_crc[i];
}
// create string containing data and calculated crc
string crc_string_check = b_data.to_string<char,string::traits_type,string::allocator_type>();
// calculate crc again to check if the data is correct
boost::crc_32_type crc32_check;
crc32_check = std::for_each(crc_string_check.begin(), crc_string_check.end(), crc32_check);
// output the result
cout << crc32() << endl;
cout << crc32_check() << endl;
The output is:
1326744236
559431208
The output I have expected is:
1326744236
0
So something goes wrong, but what? Any ideas ?
Thanks in advance.
Your expectation of what is "satisfactory" is not correct. A CRC has the property you expect only if it has a zero initialization and no exclusive-or of the result. However the standard CRC-32 that you requested, crc_32_type initializes the CRC register with 0xffffffff, and exclusive-ors the result with 0xffffffff.
However you will always get the same constant when you take the CRC of the message concatenated with the message's CRC (assuming that you order the bytes of the CRC correctly). That constant is 0x2144df1c for this particular CRC, and is the CRC of four zero bytes.
It is common for CRCs to be defined in this way, so that the CRC of a string of zeros is not zero. Since the initialization and exclusive-or are the same, the CRC of the empty set is conveniently zero.
What you should be doing is simply computing the CRC on the message without the CRC, and compare that to the transmitted CRC. That is what is normally done, and applies to all message hashes in general.
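To illustrate both points without depending on Boost, here is a sketch with a hand-written bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). The function names are invented for this example, and the CRC is appended least significant byte first:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32 with the standard parameters described above.
uint32_t crc32_bitwise(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t n = 0; n < len; ++n) {
        crc ^= data[n];
        for (int i = 0; i < 8; ++i)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// Receiver side: recompute over the payload only and compare.
bool frame_ok(const std::vector<unsigned char>& payload, uint32_t received_crc) {
    return crc32_bitwise(payload.data(), payload.size()) == received_crc;
}

// CRC over the payload followed by its own CRC (least significant byte first).
uint32_t crc_of_payload_plus_crc(std::vector<unsigned char> payload) {
    uint32_t c = crc32_bitwise(payload.data(), payload.size());
    for (int i = 0; i < 4; ++i)
        payload.push_back(static_cast<unsigned char>(c >> (8 * i)));
    return crc32_bitwise(payload.data(), payload.size());
}
```

For any payload, crc_of_payload_plus_crc returns the constant 0x2144df1c rather than zero, while frame_ok performs the comparison recommended above.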
First, like the commenter said, do not include the CRC bits when you check the sum. I suggest making a helper function to check n bits (80, in this case).
Secondly, consider refactoring the code to be readable and expressive of the intent:
#include <iostream>
#include <string>
#include <bitset>
#include <algorithm>
#include <boost/crc.hpp>
#include <cassert>
#include <cstdint> // uint32_t
#include <cstdlib> // srand, rand
#include <ctime> // time
template <size_t N>
uint32_t crc32_n(std::bitset<N> const& bs, size_t n) {
assert(n <= N);
std::string const s = bs.template to_string<char>();
auto f = s.begin();
return std::for_each(f, f+n, boost::crc_32_type{})();
}
int main() {
std::bitset<112> b_data;
// set some random data (above the first 32 bits that are for CRC)
srand(time(0));
b_data.set(rand()%80 + 32);
auto const crc = crc32_n(b_data, 80);
std::cout << crc << ":\t" << b_data << "\n";
b_data |= crc;
std::cout << crc32_n(b_data, 80) << ":\t" << b_data << "\n";
}
Prints e.g.
1770803766: 0000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000
1770803766: 0000000000000000000000000000000000000000000000000000000000000000000000010000000001101001100011000101001000110110
or (on another run)
2436181323: 0000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000
2436181323: 0000000000000000000000000000000000000000000000000000000000000000000000100000000010010001001101010010110101001011
I would love to go on and remove the horrific to_string call but see How do I convert bitset to array of bytes/uint8?
If you're going to include the CRC in the CRC check, which in my view is the correct technique, the expected result is always zero. Not the CRC that was transmitted.
I have a large mass of integers that I'm reading from a file. All of them will be either 0 or 1, so I have converted each read integer to a boolean.
What I need to do is take advantage of the space (8 bits) that a character provides by packing every 8 bits/booleans into a single character. How can I do this?
I have experimented with binary operations, but I'm not coming up with what I want.
int count = 7;
unsigned char compressedValue = 0x00;
while(/*Not end of file*/)
{
...
compressedValue |= booleanValue << count;
count--;
if (count == 0)
{
count = 7;
//write char to stream
compressedValue &= 0;
}
}
Update
I have updated the code to reflect some corrections suggested so far. My next question is, how should I initialize/clear the unsigned char?
Update
Reflected the changes to clear the character bits.
Thanks for the help, everyone.
Several notes:
while(!in.eof()) is wrong, you have to first try(!) to read something and if that succeeded, you can use the data.
Use an unsigned char to get an integer of at least eight bits. Alternatively, look into stdint.h and use uint8_t (or uint_least8_t).
The shift operation is in the wrong direction, use uint8_t(1) << count instead.
If you want to do something like that in memory, I'd use a bigger type, like 32 or 64 bits, because reading a byte is still a single RAM access even if much more than a byte could be read at once.
After writing a byte, don't forget to zero the temporary.
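Putting these notes together, a packing routine might look roughly like the sketch below. It works on values already read into a vector instead of on a file, puts the first value into the most significant bit, and pads a final partial byte with zero bits; all of that, along with the name pack_bits, is an assumption of this example:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Pack 0/1 values into bytes, first value in the most significant bit.
std::vector<uint8_t> pack_bits(const std::vector<int>& values) {
    std::vector<uint8_t> out;
    uint8_t current = 0;   // byte being assembled
    int count = 0;         // number of bits collected in `current`
    for (int v : values) {
        assert(v == 0 || v == 1);
        current = static_cast<uint8_t>((current << 1) | v);
        if (++count == 8) {        // a full byte: emit it and start over
            out.push_back(current);
            current = 0;
            count = 0;
        }
    }
    if (count > 0)                 // pad the final partial byte with zero bits
        out.push_back(static_cast<uint8_t>(current << (8 - count)));
    return out;
}
```

Counting collected bits upwards and testing for 8 avoids the off-by-one that appears when a down-counter is compared with 0 after the decrement.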
As Mooing Duck suggested, you can use a bitset.
The source code is only a proof of concept - especially the file-read has to be implemented.
#include <bitset>
#include <cstdint>
#include <iostream>
int main() {
char const fcontent[56] { "\0\001\0\001\0\001\0\001\0\001"
"\001\001\001\001\001\001\001\001\001\001\001\001"
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"
"\0\001\0\001\0\001\0\001" };
for( int i { 0 }; i < 56; i += 8 ) {
std::bitset<8> const bs(fcontent+i, 8, '\0', '\001');
std::cout << bs.to_ulong() << " ";
}
std::cout << std::endl;
return 0;
}
Output:
85 127 252 0 0 1 84
The standard library's vector<bool> specialization is designed to pack its elements this way (typically one bit per bool), although the exact layout is left to the implementation. Don't reinvent the wheel.
I come across a very tricky problem with bit manipulation.
As far as I know, the smallest variable size to hold a value is one byte of 8 bits. The bit operations available in C/C++ apply to an entire unit of bytes.
Imagine that I have a map to replace a binary pattern 100100 (6 bits) with a signal 10000 (5 bits). If the 1st byte of input data from a file is 10010001 (8 bits) being stored in a char variable, part of it matches the 6 bit pattern and therefore be replaced by the 5 bit signal to give a result of 1000001 (7 bits).
I can use a mask to manipulate the bits within a byte to get a result of the left most bits to 10000 (5 bit) but the right most 3 bits become very tricky to manipulate. I cannot shift the right most 3 bits of the original data to get the correct result 1000001 (7 bit) followed by 1 padding bit in that char variable that should be filled by the 1st bit of next followed byte of input.
I wonder if C/C++ can actually do this sort of replacement of bit patterns of length that do not fit into a Char (1 byte) variable or even Int (4 bytes). Can C/C++ do the trick or we have to go for other assembly languages that deal with single bits manipulations?
I heard that Power Basic may be able to do the bit-by-bit manipulation better than C/C++.
If time and space are not important then you can convert the bits to a string representation and perform replaces on the string, then convert back when needed. Not an elegant solution but one that works.
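A sketch of that string-based approach, assuming bits are written most significant bit first and a final partial byte is padded with zero bits (the helper names are made up for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Expand bytes into a string of '0'/'1' characters, MSB first.
std::string to_bit_string(const std::vector<unsigned char>& bytes) {
    std::string bits;
    for (unsigned char b : bytes)
        for (int i = 7; i >= 0; --i) bits += ((b >> i) & 1) ? '1' : '0';
    return bits;
}

// Replace every occurrence of `from`, continuing the search after each replacement.
std::string replace_all(std::string bits, const std::string& from, const std::string& to) {
    assert(!from.empty());
    for (std::size_t pos = bits.find(from); pos != std::string::npos;
         pos = bits.find(from, pos + to.size()))
        bits.replace(pos, from.size(), to);
    return bits;
}

// Pack the characters back into bytes, padding the last byte with zero bits.
std::vector<unsigned char> from_bit_string(const std::string& bits) {
    std::vector<unsigned char> out((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i)
        if (bits[i] == '1') out[i / 8] |= static_cast<unsigned char>(0x80 >> (i % 8));
    return out;
}
```

With the input byte 10010001 from the question, replacing 100100 by 10000 gives the 7-bit string 1000001.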
<< shift left
^ XOR
>> shift right
~ one's complement
Using these operations, you could easily isolate the pieces that you are interested in and compare them as integers.
say the byte is 01000100 and you want to check if it contains 1000:
unsigned char k = 68; // 01000100
unsigned char c = 8;  // 1000 (4 bits)
int i = 0;
while(i<5){
    if(((k>>i)&0xF) == c){ // compare the 4 bits that start at bit i
        //do stuff
        break;
    }
    i++;
}
This is very sketchy code, just meant to be a demonstration.
"I wonder if C/C++ can actually do this sort of replacement of bit patterns of length that do not fit into a Char (1 byte) variable or even Int (4 bytes)."
What about std::bitset?
Here's a small bit reader class which may suit your needs. Of course, you may want to create a bit writer for your use case.
#include <iostream>
#include <sstream>
#include <cassert>
class BitReader {
public:
typedef unsigned char BitBuffer;
BitReader(std::istream &input) :
input(input), buffer(0), bufferedBits(8) { // 8 = every bit of 'buffer' is already consumed
}
BitBuffer peekBits(int numBits) {
assert(numBits <= 8);
assert(numBits > 0);
skipBits(0); // Make sure we have a non-empty buffer
return (((input.peek() << 8) | buffer) >> bufferedBits) & ((1 << numBits) - 1);
}
void skipBits(int numBits) {
assert(numBits >= 0);
numBits += bufferedBits;
while (numBits > 8) {
buffer = input.get();
numBits -= 8;
}
bufferedBits = numBits;
}
BitBuffer readBits(int numBits) {
assert(numBits <= 8);
assert(numBits > 0);
BitBuffer ret = peekBits(numBits);
skipBits(numBits);
return ret;
}
bool eof() const {
return input.eof();
}
private:
std::istream &input;
BitBuffer buffer;
int bufferedBits; // How many bits of 'buffer' have already been consumed (8 = buffer exhausted)
};
Use a vector<bool> if you can read your data into the vector mostly at once. It may be more difficult to find-and-replace sequences of bits, though.
If I understood your question correctly, you have an input stream and an output stream and you want to replace 6 bits of the input with 5 in the output, and your output should still be a bit stream?
So, the most important programmer's rule can be applied: Divide et impera!
You should split your component into three parts:
1. Input stream converter: Convert every pattern in the input stream to a char array (ring) buffer. If I understood you correctly your input "commands" are 8 bits long, so there is nothing special about this.
2. Do the replacement on the ring buffer in a way that you replace every matching 6-bit pattern with the 5-bit one, but "pad" the 5 bits with a leading zero, so the total length is still 8 bits.
3. Write an output handler that reads from the ring buffer and let this output handler write only the 7 LSBs of each input byte to the output stream. Of course some bit manipulation is necessary again for this.
If your ring buffer size can be divided by 8 and 7 (i.e. is a multiple of 56) you will have a clean buffer at the end and can start again with step 1.
The simplest way to implement this is to iterate over these 3 steps as long as input data is available.
If performance really matters and you are running on a multi-core CPU, you could even split the steps across 3 threads, but then you must carefully synchronize the access to the ring buffer.
I think the following does what you want.
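#include <cstdint>

const int      PATTERN_LEN = 6;
const uint32_t PATTERNMASK = 0x3F; //6 bits
const uint32_t PATTERN     = 0x24; //b100100
const int      REPLACE_LEN = 5;
const uint32_t REPLACEMENT = 0x10; //b10000

//outbits needs room for len bytes (the output is never longer than the input)
void compress(const uint8_t* inbits, uint8_t* outbits, int len)
{
    uint32_t accumulator=0; //pending bits, at most 20 at a time
    int nbits=0;            //number of valid bits in accumulator
    uint32_t candidate;
    while (len-- > 0) //for all input bytes
    {
        //for each bit (msb first)
        for (int i=7;i>=0;i--)
        {
            //add 1 bit to accumulator
            accumulator<<=1;
            accumulator|=(*inbits>>i)&1;
            nbits++;
            //check for pattern
            candidate = accumulator&PATTERNMASK;
            if (nbits>=PATTERN_LEN && candidate==PATTERN)
            {
                //remove pattern
                accumulator>>=PATTERN_LEN;
                //add replacement
                accumulator<<=REPLACE_LEN;
                accumulator|=REPLACEMENT;
                nbits+= (REPLACE_LEN - PATTERN_LEN);
            }
        }
        inbits++;
        //move accumulator to output to prevent overflow, but keep the last
        //PATTERN_LEN-1 bits so a pattern spanning two bytes is still found
        while (nbits>=8+PATTERN_LEN-1)
        {
            //copy the highest 8 bits
            nbits-=8;
            *outbits++ = (accumulator>>nbits)&0xFF;
            //clear them from accumulator
            accumulator&= ~(0xFFu<<nbits);
        }
    }
    //copy remainder of accumulator to output
    while (nbits>=8)
    {
        nbits-=8;
        *outbits++ = (accumulator>>nbits)&0xFF;
        accumulator&= ~(0xFFu<<nbits);
    }
    //left-align a final partial byte and pad it with zero bits
    if (nbits>0)
        *outbits++ = (accumulator<<(8-nbits))&0xFF;
}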
You could use a switch or a loop in the middle to check the candidate against multiple patterns. There might have to be some special handling after doing a replacement to ensure the replacement pattern is not re-checked for matches.
#include <iostream>
#include <cstring>
size_t matchCount(const char* str, size_t size, char pat, size_t bsize) noexcept
{
if (bsize > 8) {
return 0;
}
size_t bcount = 0; // curr bit number
size_t pcount = 0; // curr bit in pattern char
size_t totalm = 0; // total number of patterns matched
const size_t limit = size*8;
while (bcount < limit)
{
auto offset = bcount%8;
char c = str[bcount/8];
c >>= offset;
char tpat = pat >> pcount;
if ((c & 1) == (tpat & 1))
{
++pcount;
if (pcount == bsize)
{
++totalm;
pcount = 0;
}
}
else // mismatch
{
bcount -= pcount; // backtrack
//reset
pcount = 0;
}
++bcount;
}
return totalm;
}
int main(int argc, char** argv)
{
const char* str = "abcdefghiibcdiixyz";
char pat = 'i';
std::cout << "Num matches = " << matchCount(str, 18, pat, 7) << std::endl;
return 0;
}
In C/C++, is there an easy way to apply bitwise operators (specifically left/right shifts) to dynamically allocated memory?
For example, let's say I did this:
unsigned char * bytes=new unsigned char[3];
bytes[0]=1;
bytes[1]=1;
bytes[2]=1;
I would like a way to do this:
bytes>>=2;
(then the 'bytes' would have the following values):
bytes[0]==0
bytes[1]==64
bytes[2]==64
Why the values should be that way:
After allocation, the bytes look like this:
[00000001][00000001][00000001]
But I'm looking to treat the bytes as one long string of bits, like this:
[000000010000000100000001]
A right shift by two would cause the bits to look like this:
[000000000100000001000000]
Which finally looks like this when separated back into the 3 bytes (thus the 0, 64, 64):
[00000000][01000000][01000000]
Any ideas? Should I maybe make a struct/class and overload the appropriate operators? Edit: If so, any tips on how to proceed? Note: I'm looking for a way to implement this myself (with some guidance) as a learning experience.
I'm going to assume you want bits carried from one byte to the next, as John Knoeller suggests.
The requirements here are insufficient. You need to specify the order of the bits relative to the order of the bytes: when the least significant bit falls out of one byte, does it go to the next higher or next lower byte?
What you are describing, though, used to be very common for graphics programming. You have basically described a monochrome bitmap horizontal scrolling algorithm.
Assuming that "right" means higher addresses but less significant bits (ie matching the normal writing conventions for both) a single-bit shift will be something like...
void scroll_right (unsigned char* p_Array, int p_Size)
{
unsigned char orig_l = 0;
unsigned char orig_r;
unsigned char* dest = p_Array;
while (p_Size > 0)
{
p_Size--;
orig_r = *p_Array++;
*dest++ = (orig_l << 7) + (orig_r >> 1);
orig_l = orig_r;
}
}
Adapting the code for variable shift sizes shouldn't be a big problem. There's obvious opportunities for optimisation (e.g. doing 2, 4 or 8 bytes at a time) but I'll leave that to you.
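One possible adaptation for shift counts from 1 to 7 is sketched below. It keeps the same convention (index 0 holds the most significant bits); the name shift_right_bits is an assumption, and shifts of 8 or more would additionally need whole bytes to be moved:

```cpp
#include <cassert>
#include <cstddef>

// Shift a byte array right by `shift` bits (1..7); bytes[0] is the most significant byte.
void shift_right_bits(unsigned char* bytes, std::size_t size, unsigned shift) {
    assert(shift >= 1 && shift <= 7);
    unsigned char carry = 0;   // original value of the previous (more significant) byte
    for (std::size_t i = 0; i < size; ++i) {
        unsigned char orig = bytes[i];
        // The low `shift` bits of the previous byte become the high bits of this one.
        bytes[i] = static_cast<unsigned char>((carry << (8 - shift)) | (orig >> shift));
        carry = orig;
    }
}
```

Applied to the bytes 1, 1, 1 with a shift of 2, this gives 0, 64, 64 as in the question.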
To shift left, though, you should use a separate loop which should start at the highest address and work downwards.
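A matching sketch for the left shift, again limited to 1 to 7 bits and using an invented name. It walks from the highest index down so that each byte can pick up the bits leaving the byte to its right:

```cpp
#include <cassert>
#include <cstddef>

// Shift a byte array left by `shift` bits (1..7); bytes[0] is the most significant byte.
void shift_left_bits(unsigned char* bytes, std::size_t size, unsigned shift) {
    assert(shift >= 1 && shift <= 7);
    unsigned char carry = 0;   // original value of the byte to the right
    for (std::size_t i = size; i-- > 0; ) {
        unsigned char orig = bytes[i];
        // The high `shift` bits of the right-hand byte become the low bits of this one.
        bytes[i] = static_cast<unsigned char>((orig << shift) | (carry >> (8 - shift)));
        carry = orig;
    }
}
```

Bits shifted out of bytes[0] are simply lost here; after the loop, carry still holds the original bytes[0] if they are needed.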
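If you want to expand "on demand", note that the orig_l variable contains the last byte above. To check for an overflow, check whether its lowest bit is set, i.e. whether (orig_l & 1) is non-zero; (orig_l << 7) on its own is computed as an int, so it is non-zero for any non-zero byte. If your bytes are in a std::vector, inserting at either end should be no problem.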
EDIT I should have said - optimising to handle 2, 4 or 8 bytes at a time will create alignment issues. When reading 2-byte words from an unaligned char array, for instance, it's best to do the odd byte read first so that later word reads are all at even addresses up until the end of the loop.
On x86 this isn't necessary, but it is a lot faster. On some processors it's necessary. Just do a switch based on the base (address & 1), (address & 3) or (address & 7) to handle the first few bytes at the start, before the loop. You also need to special case the trailing bytes after the main loop.
Decouple the allocation from the accessor/mutators
Next, see if a standard container like bitset can do the job for you
Otherwise check out boost::dynamic_bitset
If all else fails, roll your own class
Rough example:
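#include <climits>
#include <cstddef>
typedef unsigned char byte;
// Return `bitcount` bits of `value`, starting at bit `startbit` (0 = least significant).
byte extract(byte value, int startbit, int bitcount)
{
    return (byte)((value >> startbit) & ((1u << bitcount) - 1));
}
// Shift the whole array right by n bits (0 < n < CHAR_BIT); bytes[0] is the most significant byte.
byte *right_shift(byte *bytes, size_t nbytes, size_t n) {
    byte rollover = 0;
    for (size_t i = 0; i < nbytes; ++i) {
        byte next_rollover = extract(bytes[ i ], 0, (int)n); // bits that fall off this byte
        bytes[ i ] = (byte)((bytes[ i ] >> n) | (rollover << (CHAR_BIT - n)));
        rollover = next_rollover;
    }
    return &bytes[ 0 ];
}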
Here's how I would do it for two bytes:
unsigned int rollover = byte[0] & 0x3;
byte[0] >>= 2;
byte[1] = byte[1] >> 2 | (rollover << 6);
From there, you can generalize this into a loop for n bytes. For flexibility, you will want to generate the magic numbers (0x3 and 6) rather than hardcode them.
I'd look into something similar to this:
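#include <cstddef>
#include <cstdint>
#define number_of_bytes 3
template<std::size_t num_bytes>
union MyUnion
{
    unsigned char bytes[num_bytes];
    std::uint64_t ints[num_bytes / sizeof(std::uint64_t) + 1];
};
int main()
{
    MyUnion<number_of_bytes> mu;
    mu.ints[0] = 0; // clear the bytes beyond the first three
    mu.bytes[0] = 1;
    mu.bytes[1] = 1;
    mu.bytes[2] = 1;
    // which bytes the bits move into depends on the machine's byte order
    mu.ints[0] >>= 2;
}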
Just play with it. You'll get the idea I believe.
Operator overloading is syntactic sugar. It's really just a way of calling a function and passing your byte array without having it look like you are calling a function.
So I would start by writing this function
unsigned char * ShiftBytes(unsigned char * bytes, size_t count_of_bytes, int shift);
Then if you want to wrap this up in an operator overload in order to make it easier to use or because you just prefer that syntax, you can do that as well. Or you can just call the function.
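As a sketch of how the two pieces could fit together: the body of ShiftBytes below is only one possible implementation of the declaration above (right shift for non-negative counts, bytes[0] most significant), and the Bits wrapper struct is a made-up name for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical implementation: shift right by `shift` bits (shift >= 0).
unsigned char* ShiftBytes(unsigned char* bytes, std::size_t count_of_bytes, int shift) {
    assert(shift >= 0);
    std::size_t whole = static_cast<std::size_t>(shift) / 8;  // whole-byte part
    unsigned bits = static_cast<unsigned>(shift) % 8;         // remaining bit part
    // Walk from the last byte upwards so every source byte is read before it is overwritten.
    for (std::size_t i = count_of_bytes; i-- > 0; ) {
        unsigned hi = (i >= whole) ? bytes[i - whole] : 0u;
        unsigned lo = (i >= whole + 1) ? bytes[i - whole - 1] : 0u;
        bytes[i] = static_cast<unsigned char>((hi >> bits) | (lo << (8 - bits)));
    }
    return bytes;
}

// Thin wrapper whose operator just forwards to the function.
struct Bits {
    std::vector<unsigned char> bytes;
    Bits& operator>>=(int shift) {
        ShiftBytes(bytes.data(), bytes.size(), shift);
        return *this;
    }
};
```

With this in place, `Bits b{{1, 1, 1}}; b >>= 2;` leaves the bytes 0, 64, 64, and calling ShiftBytes directly does exactly the same thing.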