C++ most efficient way to convert string to int (faster than atoi) - c++

As mentioned in the title, I'm looking for something that can give me more performance than atoi. Presently, the fastest way I know is
atoi(mystring.c_str())
Finally, I would prefer a solution that doesn't rely on Boost. Does anybody have good performance tricks for doing this?
Additional Information: int will not exceed 2 billion, it is always positive, the string has no decimal places in it.

I experimented with solutions using lookup tables, but found them fraught with issues, and actually not very fast. The fastest solution turned out to be the least imaginative:
int fast_atoi( const char * str )
{
    int val = 0;
    while( *str ) {
        val = val*10 + (*str++ - '0');
    }
    return val;
}
Running a benchmark with a million randomly generated strings:
fast_atoi : 0.0097 seconds
atoi : 0.0414 seconds
To be fair, I also tested this function by forcing the compiler not to inline it. The results were still good:
fast_atoi : 0.0104 seconds
atoi : 0.0426 seconds
Provided your data conforms to the requirements of the fast_atoi function, that is pretty reasonable performance. The requirements are:
Input string contains only numeric characters, or is empty
Input string represents a number from 0 up to INT_MAX

atoi can be improved upon significantly, given certain assumptions. This was demonstrated powerfully in a presentation by Andrei Alexandrescu at the C++ and Beyond 2012 conference. His replacement used loop unrolling and ALU parallelism to achieve orders-of-magnitude performance improvements. I don't have his materials, but this link uses a similar technique: http://tombarta.wordpress.com/2008/04/23/specializing-atoi/

This page compares conversion speed between different string->int functions using different compilers. The naive function, which does no error checking, is roughly twice as fast as atoi(), according to the results presented.
// Taken from http://tinodidriksen.com/uploads/code/cpp/speed-string-to-int.cpp
int naive(const char *p) {
    int x = 0;
    bool neg = false;
    if (*p == '-') {
        neg = true;
        ++p;
    }
    while (*p >= '0' && *p <= '9') {
        x = (x*10) + (*p - '0');
        ++p;
    }
    if (neg) {
        x = -x;
    }
    return x;
}
it is always positive
Remove the negative checks in the above code for a micro optimization.
If you can guarantee the string will not have anything but numeric characters, you can micro optimize further by changing the loop
while (*p >= '0' && *p <= '9') {
to
while (*p != '\0' ) {
Which leaves you with
unsigned int naive(const char *p) {
    unsigned int x = 0;
    while (*p != '\0') {
        x = (x*10) + (*p - '0');
        ++p;
    }
    return x;
}

Quite a few of the code examples here are quite complex and do unnecessary work, meaning the code could be slimmer and faster.
Conversion loops are often written to do three different things with each character:
bail out if it is the end-of-string character
bail out if it is not a digit
convert it from its code point to the actual digit value
First observation: there is no need to check for the end-of-string character separately, since it is not a digit. Hence the check for 'digitness' covers the EOS condition implicitly.
Second observation: double conditions for range testing as in (c >= '0' && c <= '9') can be converted to a single test condition by using an unsigned type and anchoring the range at zero; that way there can be no unwanted values below the beginning of the range, all unwanted values are mapped to the range above the upper limit: (uint8_t(c - '0') <= 9)
It just so happens that c - '0' needs to be computed here anyway...
Hence the inner conversion loop can be slimmed to
uint64_t n = digit_value(*p);
unsigned d;
while ((d = digit_value(*++p)) <= 9)
{
    n = n * 10 + d;
}
The code here is called with the precondition that p be pointing at a digit, which is why the first digit is extracted without further ado (which also avoids a superfluous MUL).
That precondition is less outlandish than might appear at first, since p pointing at a digit is the reason why this code is called by the parser in the first place. In my code the whole shebang looks like this (assertions and other production-quality noise elided):
unsigned digit_value (char c)
{
    return unsigned(c - '0');
}

bool is_digit (char c)
{
    return digit_value(c) <= 9;
}

uint64_t extract_uint64 (char const **read_ptr)
{
    char const *p = *read_ptr;
    uint64_t n = digit_value(*p);
    unsigned d;
    while ((d = digit_value(*++p)) <= 9)
    {
        n = n * 10 + d;
    }
    *read_ptr = p;
    return n;
}
The first call to digit_value() is often elided by the compiler, if the code gets inlined and the calling code has already computed that value by calling is_digit().
n * 10 happens to be faster than manual shifting (e.g. n = (n << 3) + (n << 1) + d), at least on my machine with gcc 4.8.1 and VC++ 2013. My guess is that both compilers use LEA with index scaling for adding up to three values in one go and scaling one of them by 2, 4, or 8.
In any case that's exactly how it should be: we write nice clean code in separate functions and express the desired logic (n * 10, x % CHAR_BIT, whatever) and the compiler converts it to shifting, masking, LEAing and so on, inlines everything into the big bad parser loop and takes care of all the required messiness under the hood to make things fast. We don't even have to stick inline in front of everything anymore. If anything then we have to do the opposite, by using __declspec(noinline) judiciously when compilers get over-eager.
I'm using the above code in a program that reads billions of numbers from text files and pipes; it converts 115 million uints per second if the length is 9..10 digits, and 60 million/s for length 19..20 digits (gcc 4.8.1). That's more than ten times as fast as strtoull() (and just barely enough for my purposes, but I digress...). That's the timing for converting text blobs containing 10 million numbers each (100..200 MB), meaning that memory timings make these numbers appear a bit worse than they would be in a synthetic benchmark running from cache.

Paddy's implementation of fast_atoi is faster than atoi - without a shadow of a doubt - however it works only for unsigned integers.
Below, I put my evaluated version of Paddy's fast_atoi that also allows only unsigned integers but speeds conversion up even more by replacing the costly multiplication with shifts and adds:
unsigned int fast_atou(const char *str)
{
    unsigned int val = 0;
    while(*str) {
        val = (val << 1) + (val << 3) + *(str++) - 48;
    }
    return val;
}
Here is the complete version of fast_atoi() that I sometimes use, which converts signed integers as well:
int fast_atoi(const char *buff)
{
    int c = 0, sign = 0, x = 0;
    const char *p = buff;
    // eat whitespace and check sign
    for(c = *(p++); (c < 48 || c > 57); c = *(p++)) {
        if (c == 45) { sign = 1; c = *(p++); break; }
    }
    for(; c > 47 && c < 58; c = *(p++))
        x = (x << 1) + (x << 3) + c - 48;
    return sign ? -x : x;
}

Here's the entirety of the atoi function in gcc:
long atoi(const char *str)
{
    long num = 0;
    int neg = 0;
    while (isspace(*str)) str++;
    if (*str == '-')
    {
        neg = 1;
        str++;
    }
    while (isdigit(*str))
    {
        num = 10*num + (*str - '0');
        str++;
    }
    if (neg)
        num = -num;
    return num;
}
The whitespace and negative checks are superfluous in your case, but they also cost only nanoseconds.
isdigit is almost certainly inlined, so that's not costing you any time.
I really don't see room for improvement here.

A faster conversion function, for positive integers only, without error checking.
Multiplication is always slower than shift-and-add, therefore replace the multiply with shifts.
int fast_atoi( const char * str )
{
    int val = 0;
    while( *str ) {
        val = (val << 3) + (val << 1) + (*str++ - '0');
    }
    return val;
}

I did a quick benchmark of the different functions given here + some extras, and I converted them to int64_t by default. Compiler = MSVC.
Here are the results (left = normal time, right = time with overhead deduction):
atoi : 153283912 ns => 1.000x : 106745800 ns => 1.000x
atoll : 174446125 ns => 0.879x : 127908013 ns => 0.835x
std::stoll : 358193237 ns => 0.428x : 311655125 ns => 0.343x
std::stoull : 354171912 ns => 0.433x : 307633800 ns => 0.347x
-----------------------------------------------------------------
fast_null : 46538112 ns => 3.294x : 0 ns => infx (overhead estimation)
fast_atou : 92299625 ns => 1.661x : 45761513 ns => 2.333x (#soerium)
FastAtoiBitShift: 93275637 ns => 1.643x : 46737525 ns => 2.284x (#hamSh)
FastAtoiMul10 : 93260987 ns => 1.644x : 46722875 ns => 2.285x (#hamSh but with *10)
FastAtoiCompare : 86691962 ns => 1.768x : 40153850 ns => 2.658x (#DarthGizka)
FastAtoiCompareu: 86960900 ns => 1.763x : 40422788 ns => 2.641x (#DarthGizka + uint)
-----------------------------------------------------------------
FastAtoi32 : 92779375 ns => 1.652x : 46241263 ns => 2.308x (handle the - sign)
FastAtoi32u : 86577312 ns => 1.770x : 40039200 ns => 2.666x (no sign)
FastAtoi32uu : 87298600 ns => 1.756x : 40760488 ns => 2.619x (no sign + uint)
FastAtoi64 : 93693575 ns => 1.636x : 47155463 ns => 2.264x
FastAtoi64u : 86846912 ns => 1.765x : 40308800 ns => 2.648x
FastAtoi64uu : 86890537 ns => 1.764x : 40352425 ns => 2.645x
FastAtoiDouble : 90126762 ns => 1.701x : 43588650 ns => 2.449x (only handle int)
FastAtoiFloat : 92062775 ns => 1.665x : 45524663 ns => 2.345x (same)
DarthGizka's code is the fastest and has the advantage of stopping when the char is non-digit.
Also, the bitshifting "optimization" is a tiny bit slower than just doing * 10.
The benchmark runs each algorithm with 10 million iterations on a pseudo-random string, to limit the branch prediction as much as possible, and then it re-runs everything 15 more times. For each algorithm, the 4 slowest and 4 fastest times are discarded, and the result given is the average of the 8 median times. This provides a lot of stability. Also, I run fast_null in order to estimate the overhead in the benchmark (loop + string changes + function call), and then this value is deducted in the second numbers.
Here is the code for the functions:
int64_t fast_null(const char* str) { return (str[0] - '0') + (str[1] - '0'); }

int64_t fast_atou(const char* str)
{
    int64_t val = 0;
    while (*str) val = (val << 1) + (val << 3) + *(str++) - 48;
    return val;
}

int64_t FastAtoiBitShift(const char* str)
{
    int64_t val = 0;
    while (*str) val = (val << 3) + (val << 1) + (*str++ - '0');
    return val;
}

int64_t FastAtoiMul10(const char* str)
{
    int64_t val = 0;
    while (*str) val = val * 10 + (*str++ - '0');
    return val;
}

int64_t FastAtoiCompare(const char* str)
{
    int64_t val = 0;
    uint8_t x;
    while ((x = uint8_t(*str++ - '0')) <= 9) val = val * 10 + x;
    return val;
}

uint64_t FastAtoiCompareu(const char* str)
{
    uint64_t val = 0;
    uint8_t x;
    while ((x = uint8_t(*str++ - '0')) <= 9) val = val * 10 + x;
    return val;
}

int32_t FastAtoi32(const char* str)
{
    int32_t val = 0;
    int sign = 0;
    if (*str == '-')
    {
        sign = 1;
        ++str;
    }
    uint8_t digit;
    while ((digit = uint8_t(*str++ - '0')) <= 9) val = val * 10 + digit;
    return sign ? -val : val;
}

int32_t FastAtoi32u(const char* str)
{
    int32_t val = 0;
    uint8_t digit;
    while ((digit = uint8_t(*str++ - '0')) <= 9) val = val * 10 + digit;
    return val;
}

uint32_t FastAtoi32uu(const char* str)
{
    uint32_t val = 0;
    uint8_t digit;
    while ((digit = uint8_t(*str++ - '0')) <= 9) val = val * 10u + digit;
    return val;
}

int64_t FastAtoi64(const char* str)
{
    int64_t val = 0;
    int sign = 0;
    if (*str == '-')
    {
        sign = 1;
        ++str;
    }
    uint8_t digit;
    while ((digit = uint8_t(*str++ - '0')) <= 9) val = val * 10 + digit;
    return sign ? -val : val;
}

int64_t FastAtoi64u(const char* str)
{
    int64_t val = 0;
    uint8_t digit;
    while ((digit = uint8_t(*str++ - '0')) <= 9) val = val * 10 + digit;
    return val;
}

uint64_t FastAtoi64uu(const char* str)
{
    uint64_t val = 0;
    uint8_t digit;
    while ((digit = uint8_t(*str++ - '0')) <= 9) val = val * 10u + digit;
    return val;
}

float FastAtoiFloat(const char* str)
{
    float val = 0;
    uint8_t x;
    while ((x = uint8_t(*str++ - '0')) <= 9) val = val * 10.0f + x;
    return val;
}

double FastAtoiDouble(const char* str)
{
    double val = 0;
    uint8_t x;
    while ((x = uint8_t(*str++ - '0')) <= 9) val = val * 10.0 + x;
    return val;
}
And the benchmark code I used, just in case...
void Benchmark()
{
    std::map<std::string, std::vector<int64_t>> funcTimes;
    std::map<std::string, std::vector<int64_t>> funcTotals;
    std::map<std::string, int64_t> funcFinals;

#define BENCH_ATOI(func) \
    do \
    { \
        auto start = NowNs(); \
        int64_t z = 0; \
        char string[] = "000001987"; \
        for (int i = 1e7; i >= 0; --i) \
        { \
            string[0] = '0' + (i + 0) % 10; \
            string[1] = '0' + (i + 1) % 10; \
            string[2] = '0' + (i + 3) % 10; \
            string[3] = '0' + (i + 5) % 10; \
            string[4] = '0' + (i + 9) % 10; \
            z += func(string); \
        } \
        auto elapsed = NowNs() - start; \
        funcTimes[#func].push_back(elapsed); \
        funcTotals[#func].push_back(z); \
    } \
    while (0)

    for (int i = 0; i < 16; ++i)
    {
        BENCH_ATOI(atoi);
        BENCH_ATOI(atoll);
        BENCH_ATOI(std::stoll);
        BENCH_ATOI(std::stoull);
        //
        BENCH_ATOI(fast_null);
        BENCH_ATOI(fast_atou);
        BENCH_ATOI(FastAtoiBitShift);
        BENCH_ATOI(FastAtoiMul10);
        BENCH_ATOI(FastAtoiCompare);
        BENCH_ATOI(FastAtoiCompareu);
        //
        BENCH_ATOI(FastAtoi32);
        BENCH_ATOI(FastAtoi32u);
        BENCH_ATOI(FastAtoi32uu);
        BENCH_ATOI(FastAtoi64);
        BENCH_ATOI(FastAtoi64u);
        BENCH_ATOI(FastAtoi64uu);
        BENCH_ATOI(FastAtoiFloat);
        BENCH_ATOI(FastAtoiDouble);
    }

    for (auto& [func, times] : funcTimes)
    {
        std::sort(times.begin(), times.end());
        fmt::print("{:<16}: {}\n", func, funcTotals[func][0]);
        int64_t total = 0;
        for (int i = 4; i <= 11; ++i) total += times[i];
        total /= 8;
        funcFinals[func] = total;
    }

    const auto base = funcFinals["atoi"];
    const auto overhead = funcFinals["fast_null"];
    for (const auto& [func, final] : funcFinals)
        fmt::print("{:<16}: {:>9} ns => {:.3f}x : {:>9} ns => {:.3f}x\n",
                   func, final, base * 1.0 / final,
                   final - overhead, (base - overhead) * 1.0 / (final - overhead));
}

Why not use a stringstream? I'm not sure of its particular overhead, but you could define:
int myInt;
string myString = "1561";
stringstream ss(myString);
ss >> myInt;
Of course, you'd need to
#include <sstream>

The only definitive answer comes from checking with your compiler and your real data.
Something I'd try (even if it's using memory accesses so it may be slow depending on caching) is
int value = t1[s[n-1]];
if (n > 1) value += t10[s[n-2]]; else return value;
if (n > 2) value += t100[s[n-3]]; else return value;
if (n > 3) value += t1000[s[n-4]]; else return value;
... continuing for how many digits you need to handle ...
If t1, t10, and so on are statically allocated and constant, the compiler shouldn't fear any aliasing, and the machine code generated should be quite decent.

Here is mine. Atoi is the fastest I could come up with. I compiled with MSVC 2010, so it might be possible to combine both templates. In MSVC 2010, when I combined the templates, it made the case where you provide a cb argument slower.
Atoi handles nearly all the special atoi cases, and is as fast as or faster than this:
int val = 0;
while( *str )
val = val*10 + (*str++ - '0');
Here is the code:
#define EQ1(a,a1) (BYTE(a) == BYTE(a1))
#define EQ2(a,a1,a2) (BYTE(a) == BYTE(a1) && EQ1(a,a2))
#define EQ3(a,a1,a2,a3) (BYTE(a) == BYTE(a1) && EQ2(a,a2,a3))

// Atoi is 4x faster than atoi. There is also an overload that takes a cb argument.
template <typename T>
T Atoi(LPCSTR sz) {
    T n = 0;
    bool fNeg = false; // for unsigned T, this is removed by optimizer
    const BYTE* p = (const BYTE*)sz;
    BYTE ch;
    // Test for most exceptions in the leading chars. Most of the time
    // this test is skipped. Note we skip over leading zeros to avoid the
    // useless math in the second loop. We expect leading 0 to be the most
    // likely case, so we test it first, however the cpu might reorder that.
    for ( ; (ch = *p - '1') >= 9; ++p) { // unsigned trick for range compare
        // ignore leading 0's, spaces, and '+'
        if (EQ3(ch, '0'-'1', ' '-'1', '+'-'1'))
            continue;
        // for unsigned T this is removed by optimizer
        if (!((T)-1 > 0) && ch == BYTE('-'-'1')) {
            fNeg = !fNeg;
            continue;
        }
        // atoi ignores these. Remove this code for a small perf increase.
        if (BYTE(*p - 9) > 4) // \t, \n, 11, 12, \r. unsigned trick for range compare
            break;
    }
    // Deal with the rest of the digits; stop the loop on a non-digit.
    for ( ; (ch = *p - '0') <= 9; ++p) // unsigned trick for range compare
        n = n * 10 + ch;
    // for unsigned T, the (fNeg) test is removed by optimizer
    return fNeg ? -n : n;
}

// You could go with a single template that took a cb argument, but I could not
// get the optimizer to create good code when both the cb and !cb cases were combined.
// The code above contains the comments.
template <typename T>
T Atoi(LPCSTR sz, BYTE cb) {
    T n = 0;
    bool fNeg = false;
    const BYTE* p = (const BYTE*)sz;
    const BYTE* p1 = p + cb;
    BYTE ch;
    for ( ; p < p1 && (ch = *p - '1') >= 9; ++p) {
        if (EQ3(ch, BYTE('0'-'1'), BYTE(' '-'1'), BYTE('+'-'1')))
            continue;
        if (!((T)-1 > 0) && ch == BYTE('-'-'1')) {
            fNeg = !fNeg;
            continue;
        }
        if (BYTE(*p - 9) > 4) // \t, \n, 11, 12, \r
            break;
    }
    for ( ; p < p1 && (ch = *p - '0') <= 9; ++p)
        n = n * 10 + ch;
    return fNeg ? -n : n;
}

Related

hex string arithmetic in c++

I want to do basic arithmetic (addition, subtraction, and comparison) with 64-digit hex numbers represented as strings. For example:
"ffffa"+"2" == "ffffc"
Since the binary representation of such a number requires 256 bits, I cannot convert the string to basic integer types. One solution is to use gmp or boost/xint, but they are too big for this simple functionality.
Is there a lightweight solution that can help me?
Just write a library which will handle the strings with conversion between hex to int and will add one char at a time, taking care of overflow. It took minutes to implement such an algorithm:
#include <cstdio>
#include <sstream>
#include <iostream>
using namespace std;
namespace hexstr {

char int_to_hexchar(int v) {
    if (0 <= v && v <= 9) {
        return v + '0';
    } else {
        return v - 10 + 'a';
    }
}

int hexchar_to_int(char c) {
    if ('0' <= c && c <= '9') {
        return c - '0';
    } else {
        return c - 'a' + 10;
    }
}

int add_digit(char a, char b) {
    return hexchar_to_int(a) + hexchar_to_int(b);
}

void reverseStr(string& str) {
    int n = str.length();
    for (int i = 0; i < n / 2; i++)
        swap(str[i], str[n - i - 1]);
}

void _add_val_to_string(string& s, int& val) {
    s.push_back(int_to_hexchar(val % 16));
    val /= 16;
}

string add(string a, string b)
{
    auto ita = a.end();
    auto itb = b.end();
    int tmp = 0;
    string ret;
    while (ita != a.begin() && itb != b.begin()) {
        tmp += add_digit(*--ita, *--itb);
        _add_val_to_string(ret, tmp);
    }
    while (ita != a.begin()) {
        tmp += hexchar_to_int(*--ita);
        _add_val_to_string(ret, tmp);
    }
    while (itb != b.begin()) {
        tmp += hexchar_to_int(*--itb);
        _add_val_to_string(ret, tmp);
    }
    while (tmp) {
        _add_val_to_string(ret, tmp);
    }
    reverseStr(ret);
    return ret;
}

}  // namespace hexstr

int main()
{
    std::cout
        << "1bd5adead01230ffffc" << endl
        << hexstr::add(
               std::string() + "dead0000" + "00000" + "ffffa",
               std::string() + "deaddead" + "01230" + "00002"
           ) << endl;
    return 0;
}
This can be optimized: the string reversal can perhaps be omitted, and some CPU cycles and memory allocations spared. Error handling is also lacking, and it will work only on implementations that use the ASCII character set, and so on... But it's as simple as that. I guess this small lib can handle hex strings way over 64 digits, limited only by host memory.
Implementing addition, subtraction and comparison over fixed-base numeric strings yourself should be quite easy.
For instance, for addition and subtraction, simply do it as you would in paper: start on the right-hand end of both strings, parse the chars, compute the result, then carry over, etc. Comparison is even easier, and you go left-to-right.
Of course, all this is assuming you don't need performance (otherwise you should be using a proper library).

How can I keep only non-zero digits from an integer?

I am currently using the code below that removes all digits equal to zero from an integer.
int removeZeros(int candid)
{
    int output = 0;
    string s = to_string(candid);
    for (size_t i = 0; i != s.size(); ++i)
    {
        if (s[i] != '0') output = output * 10 + (s[i] - '0');
    }
    return output;
}
The expected output for e.g. 102304 would be 1234.
Is there a more compact way of doing this by directly working on the integer, that is, not string representation? Is it actually going to be faster?
Here's a way to do it without strings and buffers.
I've only tested this with positive numbers. To make this work with negative numbers is an exercise left up to you.
int removeZeros(int x)
{
    int result = 0;
    int multiplier = 1;
    while (x > 0)
    {
        int digit = x % 10;
        if (digit != 0)
        {
            int val = digit * multiplier;
            result += val;
            multiplier *= 10;
        }
        x = x / 10;
    }
    return result;
}
For maintainability, I would suggest: don't work directly on the numeric value. You can express your requirements in a very straightforward way using string manipulations, and while it's true that this will likely perform slower than number manipulation, I expect either to be fast enough that you don't have to worry about performance unless it's in an extremely tight loop.
int removeZeros(int n) {
    auto s = std::to_string(n);
    s.erase(std::remove(s.begin(), s.end(), '0'), s.end());
    return std::stoi(s);
}
As a bonus, this simpler implementation handles negative numbers correctly. For zero, it throws std::invalid_argument, because removing all zeros from 0 doesn't produce a number.
You could try something like this:
template<typename T> T nozeros( T const & z )
{
return z==0 ? 0 : (z%10?10:1)*nozeros(z/10)+(z%10);
}
If you want to take your processing one step further, you can do a nice tail recursion; no need for a helper function:
template<typename T> inline T pow10(T p, T res=1)
{
    return p==0 ? res : pow10(--p, res*10);
}
template<typename T> T nozeros( T const & z , T const & r=0, T const & zp =0)
{
    static int digit = -1;
    return not ( z ^ r ) ? digit=-1, zp : nozeros(z/10, z%10, r ? r*pow10(++digit)+zp : zp);
}
Here is how this will work with input 32040
Ret, z,     r, zp,  digits
-,   32040, 0, 0,   -1
-,   3204,  0, 0,   -1
-,   320,   4, 0,   -1
-,   32,    0, 4,   0
-,   3,     2, 4,   0
-,   0,     3, 24,  1
-,   0,     0, 324, 2
324, -,     -, -,   -1
Integer calculations are always faster than actually transforming your integer to string, making comparisons on strings, and looking up strings to turn them back to integers.
The cool thing is that if you try to pass floats you get nice compile time errors.
I claim this to be slightly faster than other solutions as it makes less conditional evaluations which will make it behave better with CPU branch prediction.
int number = 9042100;
stringstream strm;
strm << number;
string str = strm.str();
str.erase(remove(str.begin(), str.end(), '0'), str.end());
number = atoi(str.c_str());
No string representation is used here. I can't say anything about the speed though.
int removezeroes(int candid)
{
    int x, y = 0, n = 0;
    // I did this to reverse the number as my next loop
    // reverses the number while removing zeroes.
    while (candid > 0)
    {
        x = candid % 10;
        n = n * 10 + x;
        candid /= 10;
    }
    candid = n;
    while (candid > 0)
    {
        x = candid % 10;
        if (x != 0)
            y = y * 10 + x;
        candid /= 10;
    }
    return y;
}
If C++11 is available, I do it like this with a lambda function:
int removeZeros(int candid){
    std::string s = std::to_string(candid);
    std::string output;
    std::for_each(s.begin(), s.end(), [&](char& c){ if (c != '0') output += c; });
    return std::stoi(output);
}
A fixed implementation of g24l's recursive solution:
template<typename T> T nozeros(T const & z)
{
    if (z == 0) return 0;
    if (z % 10 == 0) return nozeros(z / 10);
    else return (z % 10) + ( nozeros(z / 10) * 10);
}

Print long long via fast i/o

The following code is used to print an int. How can I modify it to print a long long int? Please explain.
For pc, read putchar_unlocked
inline void writeInt (int n)
{
    int N = n, rev, count = 0;
    rev = N;
    if (N == 0) { pc('0'); pc('\n'); return; }
    while ((rev % 10) == 0) { count++; rev /= 10; }
    rev = 0;
    while (N != 0) { rev = (rev<<3) + (rev<<1) + N % 10; N /= 10; }
    while (rev != 0) { pc(rev % 10 + '0'); rev /= 10; }
    while (count--) pc('0');
    pc('\n');
}
There's nothing specific about int in the code. Just replace both occurrences of "int" by "long long int", and you're done.
(I find the "optimization" of *10 via shift and add quite ridiculous with all the divisions that remain. Any decent C compiler will do that (and much more) automatically. And don't forget to profile this "fast" version against the stdlib routine, to be sure it really was worth the effort).
This code is a bit more complex than it needs to be:
inline void writeLongLong (long long n)
{
    char buffer[sizeof(n) * 8 * 3 / 10 + 3]; // 3 digits per 10 bits + two extra and space for terminating zero.
    int index = sizeof(buffer) - 1;
    buffer[index--] = 0;
    do {
        buffer[index--] = (n % 10) + '0';
        n /= 10;
    } while (n);
    puts(&buffer[index + 1]);
}
This does the same job, with about half as many divide/modulo operations and at least I can follow it better. Note that stdio/stdlib functions are probably better than this, and this function does not cope with negative numbers (neither does the one posted above).

Efficiently convert an unsigned short to a char*

What would be an efficient, portable way to convert an unsigned short to a char* (i.e. convert 25 to "25")?
I'd like to avoid things such as getting (std::string) strings involved. Performance is important in this case, since this conversion will need to happen quickly and often.
I was looking into things such as sprintf, but would like to explore any and all ideas.
First off, do it right, then do it fast; only optimize if you can see for certain that a piece of code is not performant.
snprintf() into a buffer will do what you want. Is it the fastest possible solution? Not at all. But it is among the simplest, and it will suffice to get your code into a working state. From there, if you see that those calls to snprintf() are so laborious that they need to be optimized, then and only then seek out a faster solution.
An array of strings such that
array[25] = "25";
array[26] = "26";
array[255] = "255";
maybe? You could write a small program that generates the table source code for you quite easily, and then use this file in your project.
Edit: I don't get what you mean by not wanting to get strings involved.
try this:
int convert(unsigned short val, char* dest)
{
    int i = 0;
    // Once a digit has been emitted (i > 0), keep emitting for the lower
    // positions as well, so interior zeros (e.g. in 105) are not dropped.
    if (val >= 10000)
    {
        dest[i++] = (val / 10000) | 0x30;
        val %= 10000;
    }
    if (val >= 1000 || i)
    {
        dest[i++] = (val / 1000) | 0x30;
        val %= 1000;
    }
    if (val >= 100 || i)
    {
        dest[i++] = (val / 100) | 0x30;
        val %= 100;
    }
    if (val >= 10 || i)
    {
        dest[i++] = (val / 10) | 0x30;
        val %= 10;
    }
    dest[i++] = val | 0x30;
    dest[i] = 0;
    return i;
}
I would say at least try sprintf and since you have this tagged as C++, try StringStream, and actually profile them. In many cases the compiler is smart enough to build something that works pretty well. Only when you know it's going to be a bottleneck do you need to actually find a faster way.
I hacked together a test of various functions here, and this is what I came up with:
write_ushort: 7.81 s
uShortToStr: 8.16 s
convert: 6.71 s
use_sprintf: 49.66 s
(Write_ushort is my version, which I tried to write as clearly as possible, rather than micro-optimize, to format into a given character buffer; use_sprintf is the obvious sprintf(buf, "%d", x) and nothing else; the other two are taken from other answers here.)
This is a pretty amazing difference between them, isn't it? Who would ever think to use sprintf faced with almost an order of magnitude difference? Oh, yeah, how many times did I iterate each tested function?
// Taken directly from my hacked up test, but should be clear.
// Compiled with gcc 4.4.3 and -O2. This test is interesting, but not authoritative.
int main() {
    using namespace std;
    char buf[100];
#define G2(NAME,STMT) \
    { \
        clock_t begin = clock(); \
        for (int count = 0; count < 3000; ++count) { \
            for (unsigned x = 0; x <= USHRT_MAX; ++x) { \
                NAME(x, buf, sizeof buf); \
            } \
        } \
        clock_t end = clock(); \
        STMT \
    }
#define G(NAME) G2(NAME,) G2(NAME, cout << #NAME ": " << double(end - begin) / CLOCKS_PER_SEC << " s\n";)
    G(write_ushort)
    G(uShortToStr)
    G(convert)
    G(use_sprintf)
#undef G
#undef G2
    return 0;
}
Sprintf converted the entire possible range of unsigned shorts, then did the whole range again 2,999 more times at about 0.25 µs per conversion, on average, on my ~5 year old laptop.
Sprintf is portable; is it also efficient enough for your requirements?
My version:
// Returns number of non-null bytes written, or would be written.
// If ret is null, does not write anything; otherwise retlen is the length of
// ret, and must include space for the number plus a terminating null.
int write_ushort(unsigned short x, char *ret, int retlen) {
    assert(!ret || retlen >= 1);
    char s[uint_width_10<USHRT_MAX>::value]; // easy implementation agnosticism
    char *n = s;
    if (x == 0) {
        *n++ = '0';
    }
    else while (x != 0) {
        *n++ = '0' + x % 10;
        x /= 10;
    }
    int const digits = n - s;
    if (ret) {
        // not needed by checking retlen and only writing to available space
        //assert(retlen >= digits + 1);
        while (--retlen && n != s) {
            *ret++ = *--n;
        }
        *ret = '\0';
    }
    return digits;
}
Compile-time log TMP functions are nothing new, but including this complete example because it's what I used:
template<unsigned N>
struct uint_width_10_nonzero {
    enum { value = uint_width_10_nonzero<N/10>::value + 1 };
};
template<>
struct uint_width_10_nonzero<0> {
    enum { value = 0 };
};
template<unsigned N>
struct uint_width_10 {
    enum { value = uint_width_10_nonzero<N>::value };
};
template<>
struct uint_width_10<0> {
    enum { value = 1 };
};

How to check if the binary representation of an integer is a palindrome?

How to check if the binary representation of an integer is a palindrome?
Hopefully correct:
_Bool is_palindrome(unsigned n)
{
    unsigned m = 0;
    for (unsigned tmp = n; tmp; tmp >>= 1)
        m = (m << 1) | (tmp & 1);
    return m == n;
}
Since you haven't specified a language in which to do it, here's some C code (not the most efficient implementation, but it should illustrate the point):
#include <limits.h>
#include <stdbool.h>

#define WORDSIZE (sizeof(unsigned int) * CHAR_BIT)

/* flip n */
unsigned int flip(unsigned int n)
{
    unsigned int i, newInt = 0;
    for (i = 0; i < WORDSIZE; ++i)
    {
        newInt = (newInt << 1) | (n & 0x0001);
        n >>= 1;
    }
    return newInt;
}

bool isPalindrome(int n)
{
    if (n == 0) return true; /* avoid an endless loop below */
    unsigned int flipped = flip(n);
    /* shift to remove trailing zeroes */
    while (!(flipped & 0x0001))
        flipped >>= 1;
    return (unsigned int)n == flipped;
}
EDIT fixed for your 10001 thing.
Create a 256-entry table mapping each char to its bit-reversed counterpart.
Given a 4-byte integer,
take the first char, look it up in the table, and compare the result to the last char of the integer.
If they differ, it is not a palindrome; if they are the same, repeat with the middle chars.
If those differ, it is not a palindrome; else it is.
Plenty of nice solutions here. Let me add one that is not the most efficient, but very readable, in my opinion:
/* Reverses the digits of num assuming the given base. */
uint64_t
reverse_base(uint64_t num, uint8_t base)
{
    uint64_t rev = num % base;
    for (; num /= base; rev = rev * base + num % base);
    return rev;
}

/* Tells whether num is palindrome in the given base. */
bool
is_palindrome_base(uint64_t num, uint8_t base)
{
    /* A palindrome is equal to its reverse. */
    return num == reverse_base(num, base);
}

/* Tells whether num is a binary palindrome. */
bool
is_palindrome_bin(uint64_t num)
{
    /* A binary palindrome is a palindrome in base 2. */
    return is_palindrome_base(num, 2);
}
The following should be adaptable to any unsigned type. (Bit operations on signed types tend to be fraught with problems.)
bool test_pal(unsigned n)
{
    unsigned t = 0;
    for (unsigned bit = 1; bit && bit <= n; bit <<= 1)
        t = (t << 1) | !!(n & bit);
    return t == n;
}
int palindrome (int num)
{
    int rev = 0;
    int orig = num;
    while (num != 0)
    {
        rev = (rev << 1) | (num & 1);
        num >>= 1;
    }
    return (rev == orig) ? 1 : 0;
}
I always have a palindrome function that works with Strings, that returns true if it is, false otherwise, e.g. in Java. The only thing I need to do is something like:
int number = 245;
String test = Integer.toString(number, 2);
if(isPalindrome(test)){
...
}
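The same string-based trick can be sketched in C++ (names are mine): convert to a binary string with no leading zeros and compare it with its own reverse.

```cpp
#include <string>
#include <algorithm>
#include <cassert>

// Binary representation of n without leading zeros ("0" for n == 0).
std::string to_binary(unsigned n) {
    if (n == 0) return "0";
    std::string s;
    for (; n; n >>= 1) s += char('0' + (n & 1));
    std::reverse(s.begin(), s.end());
    return s;
}

// A number is a binary palindrome iff its binary string reads the
// same forwards and backwards.
bool is_palindrome(unsigned n) {
    std::string s = to_binary(n);
    return std::equal(s.begin(), s.end(), s.rbegin());
}
```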
A generic version:
#include <iostream>
#include <climits>
using namespace std;
template <class T>
bool ispalindrome(T x) {
    if (x == 0) return true;
    size_t f = 0, l = (CHAR_BIT * sizeof x) - 1;
    // strip leading zeros
    while (!(x & ((T)1 << l))) l--;
    for (; f < l; ++f, --l) {
        bool left = (x & ((T)1 << f)) != 0;
        bool right = (x & ((T)1 << l)) != 0;
        //cout << left << '\n';
        //cout << right << '\n';
        if (left != right) return false;
    }
    return true;
}
int main() {
    cout << ispalindrome(17u) << "\n";
}
I think the best approach is to start at the ends and work your way inward, i.e. compare the first bit and the last bit, the second bit and the second-to-last bit, and so on, which takes O(N/2) comparisons where N is the number of bits in the int. If at any point a pair doesn't match, it isn't a palindrome.
bool IsPalindrome(int n) {
    bool palindrome = true;
    size_t len = sizeof(n) * 8;
    for (size_t i = 0; i < len / 2; i++) {
        bool left_bit = !!(n & (1u << (len - i - 1)));
        bool right_bit = !!(n & (1u << i));
        if (left_bit != right_bit) {
            palindrome = false;
            break;
        }
    }
    return palindrome;
}
Sometimes it's good to report a failure too.
There are lots of great answers here about the obvious way to do it, by analyzing in some form or other the bit pattern. I got to wondering, though, if there were any mathematical solutions? Are there properties of palindromic numbers that we might take advantage of?
So I played with the math a little bit, but the answer should really have been obvious from the start. It's trivial to prove that all binary palindromic numbers must be either odd or zero. That's about as far as I was able to get with it.
A little research showed no such approach for decimal palindromes, so it's either a very difficult problem or not solvable via a formal system. It might be interesting to prove the latter...
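The odd-or-zero property is easy to confirm by brute force (a throwaway check; the function names are mine): the highest significant bit of any nonzero number is 1, so by symmetry a palindrome's lowest bit must be 1 too.

```cpp
#include <cassert>

// Reverse the significant bits of n (trailing zeros are dropped).
unsigned reverse_bits(unsigned n) {
    unsigned r = 0;
    for (; n; n >>= 1) r = (r << 1) | (n & 1);
    return r;
}

// Verify that no even binary palindrome > 0 exists up to limit.
bool property_holds_up_to(unsigned limit) {
    for (unsigned n = 1; n <= limit; ++n)
        if (n == reverse_bits(n) && (n & 1) == 0)
            return false;   // found an even binary palindrome > 0
    return true;
}
```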
public static bool IsPalindrome(int n) {
    for (int i = 0; i < 16; i++) {
        if (((n >> i) & 1) != ((n >> (31 - i)) & 1)) {
            return false;
        }
    }
    return true;
}
bool PaLInt (unsigned int i, unsigned int bits)
{
    unsigned int t = i;
    unsigned int x = 0;
    while (i)
    {
        x = x << bits;
        x = x | (i & ((1 << bits) - 1));
        i = i >> bits;
    }
    return x == t;
}
Call PaLInt(i, 1) for binary palindromes.
Call PaLInt(i, 3) for octal palindromes.
Call PaLInt(i, 4) for hex palindromes.
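To see it in action, the function can be exercised with a few bases (the function body is repeated here so the snippet stands alone; the example values are mine):

```cpp
#include <cassert>

// Generalized palindrome check: treats i as a sequence of
// `bits`-wide digits, reverses them, and compares.
bool PaLInt (unsigned int i, unsigned int bits)
{
    unsigned int t = i;
    unsigned int x = 0;
    while (i)
    {
        x = x << bits;
        x = x | (i & ((1u << bits) - 1));
        i = i >> bits;
    }
    return x == t;
}
```

For example, PaLInt(5, 1) is true (101 in binary), PaLInt(6, 1) is false (110), and PaLInt(0x1B1, 4) is true (digits 1, B, 1 in hex).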
I know that this question was posted 2 years ago, but I have a better solution which doesn't depend on the word size:
bool isPalindrome(int num)
{
    int temp = 0;
    int i = num;   /* num is the number to be checked */
    while (1)
    {
        if (i & 0x1)
        {
            temp = temp + 1;
        }
        i = i >> 1;
        if (i) {
            temp = temp << 1;
        }
        else
        {
            break;
        }
    }
    return temp == num;
}
In Java there is an easy way if you understand basic binary arithmetic; here is the code:
public static void main(String[] args) {
    Integer num = 73;
    String bin = getBinary(num);
    String revBin = reverse(bin);
    Integer revNum = getInteger(revBin);
    System.out.println("Is Palindrome: " + ((num ^ revNum) == 0));
}
static String getBinary(int c) {
    return Integer.toBinaryString(c);
}
static Integer getInteger(String c) {
    return Integer.parseInt(c, 2);
}
static String reverse(String c) {
    return new StringBuilder(c).reverse().toString();
}
#include <iostream>
#include <math.h>
using namespace std;
int main()
{
    unsigned int n = 134217729;
    unsigned int bits = floor(log(n)/log(2) + 1);
    cout << "Number of bits: " << bits << endl;
    unsigned int i = 0;
    bool isPal = true;
    while (i < (bits / 2))
    {
        /* test bit (bits-i-1) against bit i; shifts are exact, unlike floating-point pow */
        if (((n & (1u << (bits-i-1))) && (n & (1u << i)))
            ||
            (!(n & (1u << (bits-i-1))) && !(n & (1u << i))))
        {
            i++;
            continue;
        }
        else
        {
            cout << "Not a palindrome" << endl;
            isPal = false;
            break;
        }
    }
    if (isPal)
        cout << "Number is binary palindrome" << endl;
}
The solution below works in Python:
def CheckBinPal(b):
    b = bin(b)              # bin() already returns a string, e.g. '0b101'
    if b[2:] == b[:1:-1]:   # compare the digits (after the '0b' prefix) with their reverse
        return True
    else:
        return False
where b is the integer
If you're using Clang, you can make use of some __builtins.
bool binaryPalindrome(const uint32_t n) {
    return n == __builtin_bitreverse32(n << __builtin_clz(n));
}
One thing to note is that __builtin_clz(0) is undefined so you'll need to check for zero. If you're compiling on ARM using Clang (next generation mac), then this makes use of the assembly instructions for reverse and clz (compiler explorer).
clz w8, w0
lsl w8, w0, w8
rbit w8, w8
cmp w8, w0
cset w0, eq
ret
x86 has instructions for clz (sort of) but not for bit reversal. Still, Clang will emit the fastest code possible for reversing on the target architecture.
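On compilers without these builtins, the same idea can be sketched portably (my code, not from the answer): guard zero explicitly, reverse the significant bits in a loop, and compare.

```cpp
#include <cstdint>
#include <cassert>

// Portable equivalent of the builtin version: reverse the
// significant bits of n and compare with the original.
bool binaryPalindromePortable(uint32_t n) {
    if (n == 0) return true;   // the zero guard the builtin version also needs
    uint32_t rev = 0, m = n;
    while (m) {
        rev = (rev << 1) | (m & 1);
        m >>= 1;
    }
    return rev == n;
}
```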
JavaScript solution:
function isPalindrome(num) {
    const binaryNum = num.toString(2);
    for (let i = 0, j = binaryNum.length - 1; i <= j; i++, j--) {
        if (binaryNum[i] !== binaryNum[j]) return false;
    }
    return true;
}
console.log(isPalindrome(0))