Can I use a binary literal in C or C++?

I need to work with a binary number.
I tried writing:
const char x = 00010000;
But it didn't work.
I know that I can use a hexadecimal number with the same value as 00010000, but I want to know whether there is a type in C++ for binary numbers and, if there isn't, whether there is another solution to my problem.

If you are using GCC then you can use a GCC extension (which is included in the C++14 standard) for this:
int x = 0b00010000;

You can use binary literals. They are standardized in C++14. For example,
int x = 0b11000;
Support in GCC
Support in GCC began in GCC 4.3 (see https://gcc.gnu.org/gcc-4.3/changes.html) as an extension to the C language family (see https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html#C-Extensions). Since GCC 4.9 it is recognized either as a C++14 feature or as an extension (see Difference between GCC binary literals and C++14 ones?).
Support in Visual Studio
Support in Visual Studio started in Visual Studio 2015 Preview (see https://www.visualstudio.com/news/vs2015-preview-vs#C++).
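For completeness, a small usage sketch of my own (not from the answer above), assuming a C++14 compiler; it also uses the C++14 digit separator ' for readability:

#include <cstdint>

// 0b literals and the ' digit separator are both C++14 features.
constexpr std::uint8_t flag = 0b0001'0000;   // decimal 16
static_assert(flag == 16, "0b0001'0000 is 16");

int main() {}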

template<unsigned long N>
struct bin {
    enum { value = (N%10) + 2*bin<N/10>::value };
};

template<>
struct bin<0> {
    enum { value = 0 };
};

// ...
std::cout << bin<1000>::value << '\n';
The leftmost digit of the literal still has to be 1 (a leading zero would make the compiler read the literal as octal), but nonetheless.
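Here is a small compile-time check of the template above (my own sketch; the expected values are worked out by hand):

#include <iostream>

template<unsigned long N>
struct bin {
    enum { value = (N%10) + 2*bin<N/10>::value };
};

template<>
struct bin<0> {
    enum { value = 0 };
};

// The recursion peels decimal digits, so the argument must be written without
// a leading zero: bin<0101> would be parsed as octal 101 (decimal 65) and
// silently yield 17 instead of the intended 0b101 == 5.
static_assert(bin<1000>::value == 8, "1000 -> 8");
static_assert(bin<11000>::value == 24, "11000 -> 24");

int main() {
    std::cout << bin<1000>::value << '\n';   // prints 8
}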

You can use BOOST_BINARY while waiting for C++0x. :) BOOST_BINARY arguably has an advantage over the template implementation insofar as it can be used in C programs as well (it is 100% preprocessor-driven).
To do the converse (i.e. print out a number in binary form), you can use the non-portable itoa function, or implement your own.
Unfortunately you cannot do base 2 formatting with STL streams (since setbase will only honour bases 8, 10 and 16), but you can use either a std::string version of itoa, or (the more concise, yet marginally less efficient) std::bitset.
#include <boost/utility/binary.hpp>
#include <stdio.h>
#include <stdlib.h>
#include <bitset>
#include <iostream>
#include <iomanip>
using namespace std;
int main() {
    unsigned short b = BOOST_BINARY( 10010 );

    char buf[sizeof(b)*8+1];

    printf("hex: %04x, dec: %u, oct: %06o, bin: %16s\n", b, b, b, itoa(b, buf, 2));

    cout << setfill('0') <<
        "hex: " << hex << setw(4) << b << ", " <<
        "dec: " << dec << b << ", " <<
        "oct: " << oct << setw(6) << b << ", " <<
        "bin: " << bitset< 16 >(b) << endl;

    return 0;
}
produces:
hex: 0012, dec: 18, oct: 000022, bin: 10010
hex: 0012, dec: 18, oct: 000022, bin: 0000000000010010
Also read Herb Sutter's The String Formatters of Manor Farm for an interesting discussion.
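If itoa is not available, here is a sketch of the "implement your own" option mentioned above (the function name to_binary is my own):

#include <iostream>
#include <string>

// Portable stand-in for itoa(value, buf, 2): collect binary digits starting
// from the least significant bit, inserting each at the front of the string.
std::string to_binary(unsigned long long v) {
    if (v == 0) return "0";
    std::string s;
    for (; v != 0; v >>= 1)
        s.insert(s.begin(), char('0' + (v & 1)));
    return s;
}

int main() {
    std::cout << to_binary(18) << '\n';   // prints 10010
}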

A few compilers (usually those for microcontrollers) have a special feature that recognizes literal binary numbers by the prefix "0b..." preceding the number, but most compilers (and the C/C++ standards) don't have such a feature. If that is your case, here is my alternative solution:
#define B_0000 0
#define B_0001 1
#define B_0010 2
#define B_0011 3
#define B_0100 4
#define B_0101 5
#define B_0110 6
#define B_0111 7
#define B_1000 8
#define B_1001 9
#define B_1010 a
#define B_1011 b
#define B_1100 c
#define B_1101 d
#define B_1110 e
#define B_1111 f
#define _B2H(bits) B_##bits
#define B2H(bits) _B2H(bits)
#define _HEX(n) 0x##n
#define HEX(n) _HEX(n)
#define _CCAT(a,b) a##b
#define CCAT(a,b) _CCAT(a,b)
#define BYTE(a,b) HEX( CCAT(B2H(a),B2H(b)) )
#define WORD(a,b,c,d) HEX( CCAT(CCAT(B2H(a),B2H(b)),CCAT(B2H(c),B2H(d))) )
#define DWORD(a,b,c,d,e,f,g,h) HEX( CCAT(CCAT(CCAT(B2H(a),B2H(b)),CCAT(B2H(c),B2H(d))),CCAT(CCAT(B2H(e),B2H(f)),CCAT(B2H(g),B2H(h)))) )
// Using example
char b = BYTE(0100,0001); // Equivalent to b = 65; or b = 'A'; or b = 0x41;
unsigned int w = WORD(1101,1111,0100,0011); // Equivalent to w = 57155; or w = 0xdf43;
unsigned long int dw = DWORD(1101,1111,0100,0011,1111,1101,0010,1000); //Equivalent to dw = 3745774888; or dw = 0xdf43fd28;
Disadvantages (they are not such big ones):

The binary digits have to be grouped 4 by 4;
The binary literals can only be unsigned integer numbers;

Advantages:

Totally preprocessor-driven, so no processor time is spent on pointless runtime operations (such as "?..:..", "<<", "+") in the executable program (they may be performed hundreds of times in the final application);
It works in plain C compilers as well as in C++ (the template+enum solution works only in C++ compilers);
It is limited only by the maximum length of "literal constant" values. Had the constant values been expressed by parsing an "enum solution", the limit would have been hit early (usually 8 bits: 0-255, where the enum definition becomes unwieldy); with "literal constants", the compiler allows much greater numbers;
Some other solutions demand an exaggerated number of constant definitions (too many defines, in my opinion), including long or several header files (in most cases not easily readable or understandable, and they make the project unnecessarily confusing and bloated, as with BOOST_BINARY());
Simplicity of the solution: easily readable, understandable and adjustable to other cases (it could be extended to group 8 by 8 too).

This thread may help.
/* Helper macros */
#define HEX__(n) 0x##n##LU

#define B8__(x) ((x&0x0000000FLU)?1:0) \
               +((x&0x000000F0LU)?2:0) \
               +((x&0x00000F00LU)?4:0) \
               +((x&0x0000F000LU)?8:0) \
               +((x&0x000F0000LU)?16:0) \
               +((x&0x00F00000LU)?32:0) \
               +((x&0x0F000000LU)?64:0) \
               +((x&0xF0000000LU)?128:0)

/* User macros */
#define B8(d) ((unsigned char)B8__(HEX__(d)))

#define B16(dmsb,dlsb) (((unsigned short)B8(dmsb)<<8) \
                       + B8(dlsb))

#define B32(dmsb,db2,db3,dlsb) (((unsigned long)B8(dmsb)<<24) \
                               + ((unsigned long)B8(db2)<<16) \
                               + ((unsigned long)B8(db3)<<8) \
                               + B8(dlsb))

#include <stdio.h>

int main(void)
{
    // 261, evaluated at compile-time
    unsigned const number = B16(00000001,00000101);
    printf("%d \n", number);
    return 0;
}
It works! (All the credits go to Tom Torfs.)

The C++ over-engineering mindset is already well accounted for in the other answers here. Here's my attempt at doing it with a C, keep-it-simple-ffs mindset:
unsigned char x = 0xF; // binary: 00001111

As already answered, the C standards have no way to directly write binary numbers. There are compiler extensions, however, and apparently C++14 includes the 0b prefix for binary. (Note that this answer was originally posted in 2010.)
One popular workaround is to include a header file with helper macros. One easy option is also to generate a file that includes macro definitions for all 8-bit patterns, e.g.:
#define B00000000 0
#define B00000001 1
#define B00000010 2
…
This results in only 256 #defines, and if binary constants larger than 8 bits are needed, these definitions can be combined with shifts and ORs, possibly with helper macros (e.g., BIN16(B00000001,B00001010), as sketched below). (Having individual macros for every 16-bit, let alone 32-bit, value is not plausible.)
Of course the downside is that this syntax requires writing all the leading zeroes, but this may also make it clearer for uses like setting bit flags and contents of hardware registers. For a function-like macro resulting in a syntax without this property, see bithacks.h linked above.
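A minimal sketch of that combination (my own; a real generated header would define all 256 B-prefixed names, and the BIN16 helper shown here is just one possible shape for it):

#include <stdio.h>

/* Three entries from the hypothetical generated header. */
#define B00000001 1
#define B00001010 10   /* bit pattern 00001010 */
#define B00010000 16

/* Combine two 8-bit constants into one 16-bit constant. */
#define BIN16(hi, lo) (((unsigned)(hi) << 8) | (unsigned)(lo))

int main(void) {
    unsigned x = BIN16(B00000001, B00001010);   /* 0x010a == 266 */
    printf("%u\n", x);                          /* prints 266 */
    return 0;
}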

C does not have native notation for pure binary numbers. Your best bet here would be either octal (e.g. 07777) or hexadecimal (e.g. 0xfff).

You can use the function found in this question to get up to 22 bits in C++. Here's the code from the link, suitably edited:
template< unsigned long long N >
struct binary
{
    enum { value = (N % 8) + 2 * binary< N / 8 >::value };
};

template<>
struct binary< 0 >
{
    enum { value = 0 };
};
So you can do something like binary<0101011011>::value.

The smallest unit you can work with is a byte (which is of char type). You can work with bits though by using bitwise operators.
As for integer literals, you can only work with decimal (base 10), octal (base 8) or hexadecimal (base 16) numbers. There are no binary (base 2) literals in C or C++.
Octal numbers are prefixed with 0 and hexadecimal numbers are prefixed with 0x. Decimal numbers have no prefix.
In C++0x, by the way, you'll be able to do what you want via user-defined literals.
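For reference, a sketch of what such a user-defined literal can look like once available (the suffix name _b is my own choice, and the constexpr loop needs C++14's relaxed constexpr rules):

#include <cstdio>

// Raw literal operator: for 10010_b the compiler passes the token "10010".
// Assumes the digits are only '0' and '1'; anything else is silently misread.
constexpr unsigned long long operator"" _b(const char* s) {
    unsigned long long value = 0;
    for (; *s; ++s)
        value = value * 2 + static_cast<unsigned long long>(*s - '0');
    return value;
}

int main() {
    constexpr auto x = 10010_b;          // 18
    static_assert(x == 18, "binary literal");
    std::printf("%llu\n", x);
}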

Based on some other answers, but this one will reject programs with illegal binary literals. Leading zeroes are optional.
template<bool> struct BinaryLiteralDigit;

template<> struct BinaryLiteralDigit<true> {
    static bool const value = true;
};

template<unsigned long long int OCT, unsigned long long int HEX>
struct BinaryLiteral {
    enum {
        value = (BinaryLiteralDigit<(OCT%8 < 2)>::value && BinaryLiteralDigit<(HEX >= 0)>::value
                 ? (OCT%8) + (BinaryLiteral<OCT/8, 0>::value << 1)
                 : -1)
    };
};

template<>
struct BinaryLiteral<0, 0> {
    enum {
        value = 0
    };
};

#define BINARY_LITERAL(n) BinaryLiteral<0##n##LU, 0x##n##LU>::value
Example:
#define B BINARY_LITERAL
#define COMPILE_ERRORS 0

int main (int argc, char ** argv) {
    int _0s[] = { 0, B(0), B(00), B(000) };
    int _1s[] = { 1, B(1), B(01), B(001) };
    int _2s[] = { 2, B(10), B(010), B(0010) };
    int _3s[] = { 3, B(11), B(011), B(0011) };
    int _4s[] = { 4, B(100), B(0100), B(00100) };
    int neg8s[] = { -8, -B(1000) };

#if COMPILE_ERRORS
    int errors[] = { B(-1), B(2), B(9), B(1234567) };
#endif

    return 0;
}

You can also use inline assembly like this:
int i;

__asm {
    mov eax, 00000000000000000000000000000000b
    mov i, eax
}

std::cout << i;
Okay, it might be somewhat overkill, but it works.

The "type" of a binary number is the same as any decimal, hex or octal number: int (or even char, short, long long).
When you assign a constant, you can't write it as 11011011 (curiously and unfortunately), but you can use hex. Hex is a little easier to translate mentally: chunk the bits into nibbles (4 bits each) and translate each nibble to a character in [0-9a-f].
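A worked example of that nibble rule (my own illustration): 1101 1011 splits into 1101 and 1011, i.e. the hex digits D and B, so the constant can be written as 0xDB.

#include <cassert>

int main() {
    const unsigned char x = 0xDB;   // intended bit pattern 1101 1011
    // 128 + 64 + 16 + 8 + 2 + 1 == 219 == 0xDB
    assert(x == ((1<<7) | (1<<6) | (1<<4) | (1<<3) | (1<<1) | (1<<0)));
}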

Since C++14 you can use binary literals; they are now part of the language:
unsigned char a = 0b00110011;

You can use a bitset
bitset<8> b(string("00010000"));
int i = (int)(b.to_ulong());
cout << i;

I extended the good answer given by @renato-chandelier by adding support for:
_NIBBLE_(…) – 4 bits, 1 nibble as argument
_BYTE_(…) – 8 bits, 2 nibbles as arguments
_SLAB_(…) – 12 bits, 3 nibbles as arguments
_WORD_(…) – 16 bits, 4 nibbles as arguments
_QUINTIBBLE_(…) – 20 bits, 5 nibbles as arguments
_DSLAB_(…) – 24 bits, 6 nibbles as arguments
_SEPTIBBLE_(…) – 28 bits, 7 nibbles as arguments
_DWORD_(…) – 32 bits, 8 nibbles as arguments
I am actually not so sure about the terms “quintibble” and “septibble”. If anyone knows any alternative please let me know.
Here is the macro rewritten:
#define __CAT__(A, B) A##B
#define _CAT_(A, B) __CAT__(A, B)
#define __HEX_0000 0
#define __HEX_0001 1
#define __HEX_0010 2
#define __HEX_0011 3
#define __HEX_0100 4
#define __HEX_0101 5
#define __HEX_0110 6
#define __HEX_0111 7
#define __HEX_1000 8
#define __HEX_1001 9
#define __HEX_1010 a
#define __HEX_1011 b
#define __HEX_1100 c
#define __HEX_1101 d
#define __HEX_1110 e
#define __HEX_1111 f
#define _NIBBLE_(N1) _CAT_(0x, _CAT_(__HEX_, N1))
#define _BYTE_(N1, N2) _CAT_(_NIBBLE_(N1), _CAT_(__HEX_, N2))
#define _SLAB_(N1, N2, N3) _CAT_(_BYTE_(N1, N2), _CAT_(__HEX_, N3))
#define _WORD_(N1, N2, N3, N4) _CAT_(_SLAB_(N1, N2, N3), _CAT_(__HEX_, N4))
#define _QUINTIBBLE_(N1, N2, N3, N4, N5) _CAT_(_WORD_(N1, N2, N3, N4), _CAT_(__HEX_, N5))
#define _DSLAB_(N1, N2, N3, N4, N5, N6) _CAT_(_QUINTIBBLE_(N1, N2, N3, N4, N5), _CAT_(__HEX_, N6))
#define _SEPTIBBLE_(N1, N2, N3, N4, N5, N6, N7) _CAT_(_DSLAB_(N1, N2, N3, N4, N5, N6), _CAT_(__HEX_, N7))
#define _DWORD_(N1, N2, N3, N4, N5, N6, N7, N8) _CAT_(_SEPTIBBLE_(N1, N2, N3, N4, N5, N6, N7), _CAT_(__HEX_, N8))
And here is Renato's using example:
char b = _BYTE_(0100, 0001); /* equivalent to b = 65; or b = 'A'; or b = 0x41; */
unsigned int w = _WORD_(1101, 1111, 0100, 0011); /* equivalent to w = 57155; or w = 0xdf43; */
unsigned long int dw = _DWORD_(1101, 1111, 0100, 0011, 1111, 1101, 0010, 1000); /* Equivalent to dw = 3745774888; or dw = 0xdf43fd28; */

Just use the standard library in C++:
#include <bitset>
You need a variable of type std::bitset:
std::bitset<8ul> x;
x = std::bitset<8>(10);
for (int i = x.size() - 1; i >= 0; i--) {
    std::cout << x[i];
}
In this example, I stored the binary form of 10 in x.
8ul defines the number of bits, so 7ul means seven bits, and so on.

Here is my function, without adding the Boost library:

Usage: BOOST_BINARY(10001); (do not write leading zeroes, or the argument is read as an octal literal)

int BOOST_BINARY(int a) {
    int b = 0;
    for (int i = 0; i < 8; i++) {
        b += (a % 10) << i;
        a = a / 10;
    }
    return b;
}

I nominate my solution:
#define B(x) \
    ((((x) >>  0) & 0x01) \
   | (((x) >>  2) & 0x02) \
   | (((x) >>  4) & 0x04) \
   | (((x) >>  6) & 0x08) \
   | (((x) >>  8) & 0x10) \
   | (((x) >> 10) & 0x20) \
   | (((x) >> 12) & 0x40) \
   | (((x) >> 14) & 0x80))

const uint8 font6[] = {
    B(00001110), //[00]
    B(00010001),
    B(00000001),
    B(00000010),
    B(00000100),
    B(00000000),
    B(00000100),
    B(00000000),
    // ...
};
I define 8-bit fonts and graphics this way, but it could work with wider fonts as well. The macro B can be defined to produce the 0b format instead, if the compiler supports it.
Operation: the binary-looking numbers are interpreted as octal, and then the bits are masked and shifted together. The intermediate value is limited by the largest integer the compiler can work with; I guess 64 bits should be OK.
It's entirely processed by the compiler; no code is needed at runtime.

Binary constants are to be standardised in C23. As of writing, 6.4.4.1/4 of the latest C2x draft standard says of the proposed notation:
[...] A binary constant consists of the prefix 0b or 0B followed by a sequence of the digits 0 or 1.
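A minimal sketch of that notation (it needs a C23-capable compiler, e.g. GCC or Clang with -std=c2x):

#include <stdio.h>

int main(void) {
    int x = 0b00010000;   /* C23 binary constant, value 16 */
    printf("%d\n", x);    /* prints 16 */
    return 0;
}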

C++ provides a standard template named std::bitset. Try it if you like.

Usage: BINARY(10001); (avoid leading zeroes, which would make the argument an octal literal)

int BINARY(int a) {
    int b = 0;
    for (int i = 0; i < 8; i++) {
        b += (a % 10) << i;
        a = a / 10;
    }
    return b;
}

You could try using an array of bool:
bool i[8] = {0,0,1,1,0,1,0,1};

Related

How to perform multiplication for integers larger than 64 bits in C++ and VHDL?

I want to multiply a 57-bit integer by an 11-bit integer. The result can be up to 68 bits, so I'm planning to split my result into two different integers. I cannot use any library, and it should be as simple as possible because the code will be translated to VHDL.
There are some ways to do that online, but none of them meet my criteria. I want to split the result into a 60-bit lower part and an 8-bit higher part.
C++
int main() {
    unsigned long long int log2 = 0b101100010111001000010111111101111101000111001111011110011;
    unsigned short int absE;
    unsigned int result_M;
    unsigned long long int result_L;
    result_L = absE * log2;
    result_M = 0;
}
VHDL
signal absE : std_logic_vector(10 downto 0);
signal log2 : std_logic_vector(57 downto 0) := "101100010111001000010111111101111101000111001111011110011";
signal result: std_logic_vector(67 downto 0);
result <= absE * log2;
You can split the 57-bit value into smaller chunks to perform the multiplications and recombine into the required parts, for example 8+49 bits:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main() {
#define MASK(n) ((1ULL << (n)) - 1)
    uint64_t log2 = MASK(57);   // 57 bits
    uint16_t absE = MASK(11);   // 11 bits
    uint32_t m1 = (log2 >> 49) * absE;                                 // middle 19 bits at offset 49
    uint64_t m0 = (log2 & MASK(49)) * absE + ((m1 & MASK(11)) << 49);  // low 61 bits
    uint16_t result_H = (uint16_t)(m1 >> 11) + (uint16_t)(m0 >> 60);   // final high 8 bits
    uint64_t result_L = m0 & MASK(60);
    printf("%#" PRIx64 " * %#" PRIx16 " = %#" PRIx16 "%012" PRIx64 "\n",
           log2, absE, result_H, result_L);
    return 0;
}
Output: 0x1ffffffffffffff * 0x7ff = 0xffdfffffffffff801
You may need more steps if you cannot use the 64-bit multiplication used for the 49-bit by 11-bit step.
In GCC:
__int128 a;
__int128 b;
__int128 c;
uint64_t c_lo;
uint8_t c_hi;
a = 0x789;
b = 0x123456789ABCDEF;
c = a * b;
c_lo = (uint64_t)c & ((UINT64_C(1) << 60) - 1);
c_hi = (unsigned __int128)c >> 60;
You will need the header file <stdint.h> (<cstdint> in C++) from the standard library, but that shouldn't be a problem when translating into VHDL.
VHDL is different from C. Here is a paper on how to implement multiplication; expand it to as many bits as you need:
http://www.eng.auburn.edu/~nelsovp/courses/elec4200/Slides/VHDL%207%20Multiplier%20Example.pdf
If you (or anyone else who wants this) are not dealing with arbitrary lengths, you can use a library like this: int512.h

Is there a 128 bit integer in C++?

I need to store a 128 bits long UUID in a variable. Is there a 128-bit datatype in C++? I do not need arithmetic operations, I just want to easily store and read the value very fast.
A new feature from C++11 would be fine, too.
Although GCC does provide __int128, it is supported only for targets (processors) which have an integer mode wide enough to hold 128 bits. On a given system, sizeof(intmax_t) and sizeof(uintmax_t) determine the maximum values that the compiler and the platform support.
GCC and Clang support __int128
Check out Boost's implementation:
#include <boost/multiprecision/cpp_int.hpp>
using namespace boost::multiprecision;
int128_t v = 1;
This is better than strings and arrays, especially if you need to do arithmetic operations with it.
Your question has two parts.

1. A 128-bit integer. As suggested by @PatrikBeck, boost::multiprecision is a good way to handle really big integers.

2. A variable to store a UUID / GUID / CLSID or whatever you call it. In this case boost::multiprecision is not a good idea. You need a GUID structure, which is designed for that purpose. As the cross-platform tag was added, you can simply copy that structure into your code and make it like:
struct GUID
{
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
};
This format was defined by Microsoft for some internal reasons; you can even simplify it to:

struct GUID
{
    uint8_t Data[16];
};

You will get better performance with a simple structure rather than an object that can handle a bunch of different things. Anyway, you don't need to do math with GUIDs, so you don't need any fancy object.
I would recommend using std::bitset<128> (you can always do something like using UUID = std::bitset<128>;). It will probably have a similar memory layout to the custom struct proposed in the other answers, but you won't need to define your own comparison operators, hash etc.
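A small sketch of that suggestion (the alias name and the container choice are just for illustration); std::bitset already provides operator== and a std::hash specialization:

#include <bitset>
#include <unordered_set>

using UUID = std::bitset<128>;

int main() {
    UUID id;                         // 128 zero bits
    id.set(0);                       // flip the lowest bit
    std::unordered_set<UUID> seen;   // works because std::hash<std::bitset<N>> exists
    seen.insert(id);
    return seen.count(id) == 1 ? 0 : 1;
}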
There is no 128-bit integer in Visual C++, because the Microsoft calling convention only allows returning two 32-bit values in the EDX:EAX pair. This presents a constant headache, because when you multiply two integers together, the result is a two-word integer. Most load-and-store machines support working with two CPU-word-sized integers, but working with four requires a software hack, so a 32-bit CPU cannot process 128-bit integers, and 8-bit and 16-bit CPUs can't do 64-bit integers, without a rather costly software hack. 64-bit CPUs can and regularly do work with 128-bit integers, because if you multiply two 64-bit integers you get a 128-bit integer, so GCC version 4.6 and later support 128-bit integers. This presents a problem with writing portable code, because you have to do an ugly hack where you return one 64-bit word in the return register and pass the other out via a reference. For example, in order to print a floating-point number fast with Grisu, we use 128-bit unsigned multiplication as follows:
#include <cstdint>
#if defined(_MSC_VER) && defined(_M_AMD64)
#define USING_VISUAL_CPP_X64 1
#include <intrin.h>
#include <intrin0.h>
#pragma intrinsic(_umul128)
#elif (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))
#define USING_GCC 1
#if defined(__x86_64__)
#define COMPILER_SUPPORTS_128_BIT_INTEGERS 1
#endif
#endif
#if USING_VISUAL_CPP_X64
UI8 h;
UI8 l = _umul128(f, rhs_f, &h);
if (l & (UI8(1) << 63)) // rounding
h++;
return TBinary(h, e + rhs_e + 64);
#elif USING_GCC
UIH p = static_cast<UIH>(f) * static_cast<UIH>(rhs_f);
UI8 h = p >> 64;
UI8 l = static_cast<UI8>(p);
if (l & (UI8(1) << 63)) // rounding
h++;
return TBinary(h, e + rhs_e + 64);
#else
const UI8 M32 = 0xFFFFFFFF;
const UI8 a = f >> 32;
const UI8 b = f & M32;
const UI8 c = rhs_f >> 32;
const UI8 d = rhs_f & M32;
const UI8 ac = a * c;
const UI8 bc = b * c;
const UI8 ad = a * d;
const UI8 bd = b * d;
UI8 tmp = (bd >> 32) + (ad & M32) + (bc & M32);
tmp += 1U << 31; /// mult_round
return TBinary(ac + (ad >> 32) + (bc >> 32) + (tmp >> 32), e + rhs_e + 64);
#endif
}
Use the TBigInt template and set any bit width in the template argument, like TBigInt<128,true> for a signed 128-bit integer or TBigInt<128,false> for an unsigned 128-bit integer.
Hope that helps; maybe this is a late reply and someone else already found this method.
TBigInt is a structure defined by Unreal Engine. It provides a multi-bit integer implementation.
Basic usage (as far as I can tell):
#include <Math/BigInt.h>

void foo() {
    TBigInt<128, true> signed128bInt = 0;
    TBigInt<128, false> unsigned128bInt = 0;
}

What is this C++ define doing? How can I write it in Python?

I have this C++ define
#define CSYNC_VERSION_INT(a, b, c) ((a) << 16 | (b) << 8 | (c))
I need to define the same in Python. What is this doing? How can I do the same in Python?
The equivalent would be
def CSYNC_VERSION_INT(a, b, c):
    return a << 16 | b << 8 | c
It shifts a left by 16 bits and b left by 8, leaving c intact; then all these numbers are bitwise ORed together. It thus packs a, b and c into the bytes of an integer, so that the lowest byte is the value of c, the second lowest is b and the topmost bytes are the value of a.
CSYNC_VERSION_INT(3, 2, 8) is equal to 0x30208 in hex, or 197128 in decimal.
I want to add to Antti Haapala's answer what that macro does: it creates an int from three bytes, which are a, b and c.
Example:
int main()
{
    unsigned int a = 0x02;
    unsigned int b = 0xf4;
    unsigned int c = 0x56;

    unsigned int p = CSYNC_VERSION_INT(a, b, c);
    // now p == 0x02f456
}
It is using bit-shifts to store a version number in a single int. It will store the "major" version in the upper 16 bits, the "minor" version in the first 8 bits of the lower 16, and the "revision" number in the lowest 8 bits.
It will not work well if the inputs are too large (e.g. if a is outside the valid range for an unsigned short, or if b or c are outside the range of an unsigned char). Since it has no type-safety, a better approach would be to make an inline function that does the same operation with the appropriate types:
inline unsigned long MakeVersion(unsigned short major, unsigned char minor, unsigned char revision)
{
    unsigned long l = (static_cast<unsigned long>(major) << 16) | (static_cast<unsigned long>(minor) << 8) | static_cast<unsigned long>(revision);
    return l;
}
Python supports the same bit-shift and OR operators, so you should be able to use the same expression to accomplish the same task.
You can write this in Python with the same meaning:
res = ((a) << 16 | (b) << 8 | (c))
Assuming you have a 1-byte data type (like char) and you want to store all the data in a bigger data type (>= 3 bytes), you have to use this shift, so for
a = 01001110
b = 11010001
c = 00100011
the result will be
res = 01001110 11010001 00100011
(a dump, all in binary)
'<<' means bitwise shift left
'|' means bitwise OR (a logical OR applied to every bit)
You can also go the opposite way and recover a, b and c from res:
a = (res >> 16) & 0xFF
b = (res >> 8) & 0xFF
c = res & 0xFF
So shift out what you need and then select only the last byte and store it.
Very useful when making a calculator with unlimited precision :)

Convert an MD5 string to a base-62 string in C++

I'm trying to convert an MD5 string (base 16) to a base-62 string in C++. Every solution I've found so far for converting to base 62 only works if you can represent your number as a 64-bit integer or smaller. An MD5 string is 128 bits, and I'm not getting anywhere with this on my own.
Should I just include a bigint library and be done with it?
Let's see. 128/log2(62)=21.497. That means you'd need 22 "digits" for a base-62 representation.
If you're just interested in a string representation that's not longer than 22 characters and doesn't use more than 62 different characters, you don't need a real base-62 representation. You can break up the 128 bits into smaller pieces and encode the pieces separately. This way you won't need any 128-bit arithmetic. You could split the 128 bits into 2x64 bits and encode each 64-bit chunk with a string of length 11. Doing so is even possible with fewer than 62 different characters, so you could drop some of them to avoid any "visual ambiguities". For example, remove l, 1, B and 8. That leaves 58 different characters, and 11*log2(58)=64.438, which is just enough to encode 64 bits.
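A sketch of that fixed-width encoding for one 64-bit chunk (the 58-character alphabet below just drops 1, 8, l and B as suggested; the function name encode64 is my own):

#include <cstdint>
#include <iostream>
#include <string>

// Exactly 11 output characters per 64-bit chunk, since 58^11 > 2^64.
std::string encode64(std::uint64_t v) {
    static const char alphabet[] =
        "02345679"
        "abcdefghijkmnopqrstuvwxyz"
        "ACDEFGHIJKLMNOPQRSTUVWXYZ";   // 58 characters, no 1/8/l/B
    std::string out(11, alphabet[0]);
    for (int i = 10; i >= 0; --i) {    // fill from the least significant digit
        out[i] = alphabet[v % 58];
        v /= 58;
    }
    return out;
}

int main() {
    std::cout << encode64(0x0123456789abcdefULL) << '\n';
}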
Getting the two 64 bit chunks is not that difficult:
#include <climits>

#if CHAR_BIT != 8
#error "platform not supported, CHAR_BIT==8 expected"
#endif

// 'long long' is not yet part of C++
// But it's usually a supported extension
typedef unsigned long long uint64;

uint64 bits2uint64_bigendian(unsigned char const buff[]) {
    return (static_cast<uint64>(buff[0]) << 56)
         | (static_cast<uint64>(buff[1]) << 48)
         | (static_cast<uint64>(buff[2]) << 40)
         | (static_cast<uint64>(buff[3]) << 32)
         | (static_cast<uint64>(buff[4]) << 24)
         | (static_cast<uint64>(buff[5]) << 16)
         | (static_cast<uint64>(buff[6]) << 8)
         | static_cast<uint64>(buff[7]);
}

int main() {
    unsigned char md5sum[16] = {...};
    uint64 hi = bits2uint64_bigendian(md5sum);
    uint64 lo = bits2uint64_bigendian(md5sum+8);
}
For simplicity, you can use my uint128_t c++ class (http://www.codef00.com/code/uint128.h). With it, a base converter would look pretty much as simple as this:
#include "uint128.h"
#include <iostream>
#include <algorithm>
int main() {
char a[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
uint128_t x = U128_C(0x130eb6003540debd012d96ce69453aed);
std::string r;
r.reserve(22); // shouldn't result in more than 22 chars
// 6-bits per 62-bit value means (128 / 6 == 21.3)
while(x != 0) {
r += a[(x % 62).to_integer()];
x /= 62;
}
// when converting bases by division, the digits are reversed...fix that :-)
std::reverse(r.begin(), r.end());
std::cout << r << std::endl;
}
This prints:
J7JWEJ0YbMGqaJFCGkUxZ
GMP provides a convenient C++ binding for arbitrary-precision integers.
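A sketch of that route using GMP's C++ binding, gmpxx (as far as I recall, GMP's get_str accepts bases up to 62; link with -lgmpxx -lgmp). The hex value is the same example digest used above:

#include <gmpxx.h>
#include <iostream>

int main() {
    // MD5 digest as a 32-digit hex string, interpreted as one 128-bit number.
    mpz_class n("130eb6003540debd012d96ce69453aed", 16);
    std::cout << n.get_str(62) << '\n';   // base-62 representation
}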

Doing 64-bit manipulation using 32-bit data in fixed-point arithmetic in C

I am stuck with a problem. I am working on hardware which only supports 32-bit operations.
sizeof(int64_t) is 4. sizeof(int) is 4.
I am porting an application which assumes the size of int64_t to be 8 bytes. The problem is that it has this macro:
BIG_MULL(a,b) ( (int64_t)(a) * (int64_t)(b) >> 23)
The result is always a 32-bit integer, but since my system doesn't support 64-bit operations, it always returns me the LSBs of the operation, rounding off all the results and making my system crash.
Can someone help me out?
Regards,
Vikas Gupta
You simply cannot reliably store 64 bits of data in a 32-bit integer. You either have to redesign the software to work with 32-bit integers as the maximum size available, or provide 64 bits of storage for the 64-bit integers. Neither is simple - to be polite about it.
One possibility - not an easy one - is to create a structure:
typedef struct { uint32_t msw; uint32_t lsw; } INT64_t;
You can then store the data in the two 32-bit integers and do arithmetic with components of the structure. Of course, in general, a 32-bit by 32-bit multiply produces a 64-bit answer; to do full multiplication without overflowing, you may be forced to store four 16-bit unsigned numbers (because 16-bit numbers can be multiplied to give 32-bit results without overflowing). You will use functions to do the hard work, so the macro becomes a call to a function that accepts two (pointers to?) INT64_t structures and returns one.
It won't be as fast as before...but it has some chance of working if they used the macros everywhere that was necessary.
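To make the idea above concrete, here is a minimal sketch of a 32x32 -> 64 multiply using only 32-bit arithmetic, storing the result in the two-word structure (the helper name mul32x32 is mine, not from the original code):

#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

typedef struct { uint32_t msw; uint32_t lsw; } INT64_t;

static INT64_t mul32x32(uint32_t a, uint32_t b) {
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;

    uint32_t lo   = al * bl;   /* contributes to bits 0..31  */
    uint32_t mid1 = al * bh;   /* contributes to bits 16..47 */
    uint32_t mid2 = ah * bl;   /* contributes to bits 16..47 */
    uint32_t hi   = ah * bh;   /* contributes to bits 32..63 */

    INT64_t r;
    r.lsw = lo + (mid1 << 16);
    uint32_t carry = (r.lsw < lo);          /* carry out of the low word */
    uint32_t lsw2 = r.lsw + (mid2 << 16);
    carry += (lsw2 < r.lsw);
    r.lsw = lsw2;
    r.msw = hi + (mid1 >> 16) + (mid2 >> 16) + carry;
    return r;
}

int main(void) {
    INT64_t p = mul32x32(0xFFFFFFFFu, 0xFFFFFFFFu);
    printf("%08" PRIx32 "%08" PRIx32 "\n", p.msw, p.lsw);   /* fffffffe00000001 */
    return 0;
}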
I assume that the numbers that you are trying to multiply together are 32-bit integers. You just want to generate a product that may be larger than 32 bits. You then want to drop some known number of least significant bits from the product.
As a start, this will multiply the two integers together and overflow.
#define WORD_MASK ((1<<16) - 1)
#define LOW_WORD(x) (x & WORD_MASK)
#define HIGH_WORD(x) ((x & (WORD_MASK<<16)) >> 16)
#define BIG_MULL(a, b) \
    ((LOW_WORD(a) * LOW_WORD(b)) << 0) + \
    ((LOW_WORD(a) * HIGH_WORD(b)) << 16) + \
    ((HIGH_WORD(a) * LOW_WORD(b)) << 16) + \
    ((HIGH_WORD(a) * HIGH_WORD(b)) << 32)

If you want to drop the 23 least-significant bits from this, you could adjust it like so.

#define WORD_MASK ((1<<16) - 1)
#define LOW_WORD(x) (x & WORD_MASK)
#define HIGH_WORD(x) ((x & (WORD_MASK<<16)) >> 16)
#define BIG_MULL(a, b) \
    ((LOW_WORD(a) * HIGH_WORD(b)) >> 7) + \
    ((HIGH_WORD(a) * LOW_WORD(b)) >> 7) + \
    ((HIGH_WORD(a) * HIGH_WORD(b)) << 9)
Note that this will still overflow if the actual product of the multiplication is greater than 41 (=64-23) bits.
Update:
I have adjusted the code to handle signed integers.
#define LOW_WORD(x) (((x) << 16) >> 16)
#define HIGH_WORD(x) ((x) >> 16)
#define ABS(x) (((x) >= 0) ? (x) : -(x))
#define SIGN(x) (((x) >= 0) ? 1 : -1)

#define UNSIGNED_BIG_MULT(a, b) \
    (((LOW_WORD((a)) * HIGH_WORD((b))) >> 7) + \
     ((HIGH_WORD((a)) * LOW_WORD((b))) >> 7) + \
     ((HIGH_WORD((a)) * HIGH_WORD((b))) << 9))

#define BIG_MULT(a, b) \
    (UNSIGNED_BIG_MULT(ABS((a)), ABS((b))) * \
     SIGN((a)) * \
     SIGN((b)))
If you change your macro to
#define BIG_MULL(a,b) ( (int64_t)(a) * (int64_t)(b))
it should work, since it looks like int64_t is defined for you.
While there are other questions raised by sizeof(int64_t) == 4, this is wrong:
#define BIG_MULL(a,b) ( (int64_t)(a) * (int64_t)(b) >> 23)
The standard requires intN_t types for values of N = 8, 16, 32, and 64... if the platform supports them.
The type you should use is intmax_t, which is defined to be the largest integral type the platform supports. If your platform doesn't have 64-bit integers, your code won't break with intmax_t.
You might want to look at a bignum library such as GNU GMP. In one sense a bignum library is overkill, since such libraries typically support arbitrarily sized numbers, not just an increased fixed size. However, since it's already done, the fact that it does more than you want might not be an issue.
The alternative is to pack a couple 32-bit ints into a struct similar to Microsoft's LARGE_INTEGER:
typedef union _LARGE_INTEGER {
    struct {
        DWORD LowPart;
        LONG HighPart;
    };
    struct {
        DWORD LowPart;
        LONG HighPart;
    } u;
    LONGLONG QuadPart;
} LARGE_INTEGER;
And create functions that take parameters of this type and return results in structs of this type. You could also wrap these operations in a C++ class that will let you define operator overloads that let the expressions look more natural. But I'd look at the already made libraries (like GMP) to see if they can be used - it may save you a lot of work.
I just hope you don't need to implement division using structures like this in straight C - it's painful and runs slow.