Understanding how long long int gets stored? - c++

I am running the following program: (URL: http://ideone.com/aoJoI5)
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
long long int N=pow(2, 36);
cout << N <<endl;
int count = 0;
cout << "Positions where bits are set : " << endl;
for(int j=0; j<sizeof(long long int)*8; ++j){
if(N&(1<<j)){
++count;
cout << j << endl;
}
}
return 0;
}
This program gives me output as:
68719476736
Positions where bits are set :
31
63
I am using N=2^36, so bit 36 should be the only bit set. Why does the program report positions 31 and 63? Is anything wrong with my program?
I have also observed that for N=2^{exp} with exp >= 32, the reported set-bit positions are always 31 and 63. Can anybody please explain why this happens?

If int is 32 bits wide, 1<<j shifts an int too far for the larger values of j, which invokes undefined behavior.
Here is my guess at the cause:
When j becomes 31, the 1 bit lands in the sign bit.
Because the sign bit is 1, the value is sign-extended when it is converted to long long for the bitwise AND with N, so bits 31 through 63 (0-origin) all become 1.
Bit 36 (0-origin) of N is 1, so the result of the bitwise AND is nonzero.
The condition evaluates to true and the position is printed.
When j is 63 on an IA-32 CPU, the shift count is masked to 5 bits, so it is treated as 31 and the same thing happens.
To avoid this undefined behavior, shift an unsigned long long value, as in 1ull<<j.
Note that shifting a signed long long value (1ll<<j) is not a good fix either, because moving the 1 bit into the sign bit invokes undefined behavior (in C++11; it is fully defined only since C++20).

Related

Finding the largest prime factor? (Doesn't work in large number?)

I am a beginner in C++, and I just finished reading chapter 1 of C++ Primer. I tried the problem of computing the largest prime factor, and found that my program works well for numbers up to about 10e9 but fails beyond that, e.g. for 600851475143: it always returns a weird number, e.g. 2147483647, when I feed any large number into it. I know a similar question has been asked many times; I just wonder why this happens to me. Thanks in advance.
P.S. I guess the reason is that some part of my program cannot handle large numbers.
#include <iostream>
int main()
{
int val = 0, temp = 0;
std::cout << "Please enter: " << std::endl;
std::cin >> val;
for (int num = 0; num != 1; val = num){
num = val/2;
temp = val;
while (val%num != 0)
--num;
}
std::cout << temp << std::endl;
return 0;
}
Your int type is 32 bits (as on most systems). The largest value a two's complement signed 32-bit integer can store is 2 ** 31 - 1, or 2147483647. Take a look at man limits.h if you want to know the constants defining the limits of your types, and/or use larger types (e.g. unsigned would double your range at basically no cost; uint64_t from stdint.h/inttypes.h would expand it by a factor of about 8.6 billion and only cost something meaningful on 32-bit systems).
2147483647 isn't a weird number; it's INT_MAX, which is defined in the climits header file. This happens when the input exceeds the maximum capacity of an int: since C++11, std::cin >> val then stores INT_MAX in val and sets the stream's failbit.
You can use a bigger data type such as unsigned long long int (or std::size_t on a typical 64-bit platform), whose maximum value is typically 18446744073709551615.

I ran into a weird bug in C++ where a statement adding two small integers overflows into a long long value

I recently ran into this weird C++ bug that I could not understand. Here's my code:
#include <bits/stdc++.h>
using namespace std;
typedef vector <int> vi;
typedef pair <int, int> ii;
#define ff first
#define ss second
#define pb push_back
const int N = 2050;
int n, k, sum = 0;
vector <ii> a;
vi pos;
int main (void) {
cin >> n >> k;
for (int i = 1; i < n+1; ++i) {
int val;
cin >> val;
a.pb(ii(val, i));
}
cout << a.size()-1 << " " << k << " " << a.size()-k-1 << "\n";
}
When I tried out with test:
5 5
1 1 1 1 1
it returned:
4 5 4294967295
but when I changed the declaration from:
int n, k, sum = 0;
to:
long long n, k, sum = 0;
then the program returned the correct value which was:
4 5 -1
I could not figure out why the program behaved like that, since -1 should fit in an integer. Can anyone explain this to me? I would really appreciate your help.
Thanks
Evidently, on your machine size_t is a 32-bit integer, whereas long long is 64 bits. size_t is always an unsigned type, so you get:
cout << a.size() - 1
// a.size() is unsigned; the literal 1 is converted to unsigned
// -> output as uint32_t (!)
a.size() - k - 1
// with long long k: a.size() is converted to long long, as size_t is the smaller type
// -> overall expression is int64_t (!)
You would not have seen any difference between the two results (both would have been 18446744073709551615) if size_t were 64 bits as well, because then the signed long long k (int64_t) would have been converted to unsigned (uint64_t) instead.
Be aware that static_cast<UnsignedType>(-1) always evaluates (according to C++ conversion rules) to std::numeric_limits<UnsignedType>::max()!
Side note about size_t: it is defined as an unsigned integral type large enough to hold the maximum size you can allocate on your system for an object, so its size in bits is hardware dependent and, in the end, correlates with the width in bits of the memory address bus (the first power of two not smaller than it).
vector::size returns size_t (unsigned), so with int k the expression a.size()-k-1 is evaluated in an unsigned type and wraps around instead of producing -1.

Discrepancy in Snippet

Though the two snippets below have a slight difference in the manipulation of the find variable, still the output seems to be the same. Why so?
First Snippet
#include<iostream>
using namespace std;
int main()
{
int number = 3,find;
find = number << 31;
find *= -1;
cout << find;
return 0;
}
Second Snippet
#include<iostream>
using namespace std;
int main()
{
int number = 3,find;
find = number << 31;
find *= 1;
cout << find;
return 0;
}
Output for both snippets:
-2147483648
(according to Ideone: 1, 2)
In both of your samples, assuming 32-bit ints, you're invoking undefined behavior, as pointed out in "Why does left shift operation invoke Undefined Behaviour when the left side operand has negative value?"
Why? Because number is a signed int with 32 bits of storage, and (3<<31) is not representable in that type.
Once you're in undefined behavior territory, the compiler can do as it pleases.
(You can't rely on any of the following because it is UB - this is just an observation of what your compiler appears to be doing).
In this case it looks like the compiler performs the left shift the straightforward way and keeps only the low 32 bits, resulting in 0x80000000 as a binary representation. This happens to be the two's complement representation of INT_MIN. So the second snippet is not surprising.
Why does the first one output the same thing? In two's complement, INT_MIN is -2^31, but the max value is 2^31-1. INT_MIN * -1 would be 2^31 (if it were representable). And guess what representation that would have? 0x80000000. Back to where you started!

computing permutation of specific bits in a number

As part of my master's thesis, I get a number (e.g. 5 bits) with 2 significant bits (the 2nd and 4th). For example x1x0x, where $x \in \{0,1\}$ (x can be 0 or 1) and the 1 and 0 are bits with fixed values.
My first task is to compute all the combinations of the number given above, $2^3 = 8$ of them. This is called the S_1 group.
Then I need to compute the S_2 group: all the combinations of the two numbers x0x0x and x1x1x (one mismatch in the significant bits), which should give $\binom{2}{1} \cdot 2^3 = 2 \cdot 2^3 = 16$ numbers.
EDIT
Each of the numbers x1x1x and x0x0x differs from the original number, x1x0x, in one significant bit.
The last group, S_3, has two mismatches in the significant bits, i.e. all the numbers of the form x0x1x, 8 possibilities.
The computation can be done recursively or independently; that is not a problem.
I would be happy if someone could give me a starting point for these computations, since what I have is not very efficient.
EDIT
Maybe I chose my words poorly in using "significant bits". What I meant is that at specific places in a five-bit number the bits are fixed. Those places are what I call specific bits.
EDIT
I have already seen 2 answers, and it seems I should have been clearer. What I am more interested in is finding the numbers x0x0x, x1x1x and x0x1x, bearing in mind that this is just a simple example. In reality, the group S_1 (in this example x1x0x) would be built from numbers at least 12 bits long and could contain 11 significant bits. Then I would have 12 groups...
If something is still not clear please ask ;)
#include <vector>
#include <string>
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
string format = "x1x0x";
unsigned int sigBits = 0;
unsigned int sigMask = 0;
unsigned int numSigBits = 0;
for (unsigned int i = 0; i < format.length(); ++i)
{
sigBits <<= 1;
sigMask <<= 1;
if (format[i] != 'x')
{
sigBits |= (format[i] - '0');
sigMask |= 1;
++numSigBits;
}
}
unsigned int numBits = format.length();
unsigned int maxNum = (1 << numBits);
vector<vector<unsigned int> > S;
for (unsigned int i = 0; i <= numSigBits; i++)
S.push_back(vector<unsigned int>());
for (unsigned int i = 0; i < maxNum; ++i)
{
unsigned int changedBits = (i & sigMask) ^ sigBits;
unsigned int distance = 0;
for (unsigned int j = 0; j < numBits; j++)
{
if (changedBits & 0x01)
++distance;
changedBits >>= 1;
}
S[distance].push_back(i);
}
for (unsigned int i = 0; i <= numSigBits; ++i)
{
cout << dec << "Set with distance " << i << endl;
vector<unsigned int>::iterator iter = S[i].begin();
while (iter != S[i].end())
{
cout << hex << showbase << *iter << endl;
++iter;
}
cout << endl;
}
return 0;
}
sigMask has a 1 where all your specific bits are. sigBits has a 1 wherever your specific bits are 1. changedBits has a 1 wherever the current value of i is different from sigBits. distance counts the number of bits that have changed. This is about as efficient as you can get without precomputing a lookup table for the distance calculation.
Of course, it doesn't actually matter what the fixed-bit values are, only that they're fixed. xyxyx, where y is fixed and x isn't, will always yield 8 potentials. The number of potential combinations of two groups where y varies between them is always a simple multiplication; that is, for each state that the first may be in, the second may be in each of its states.
Use bit logic.
//x1x1x
if ((test_byte & 0x0A) == 0x0A) //--> 0x0A is binary 01010; true when both fixed positions are 1.
There's probably a number-theoretic solution, but, this is very simple.
This needs to be done with a fixed-width integer type. Some dynamic languages (Python, for example) will extend an integer's bits if they think it's a good idea.
This is not hard, but it is time consuming, and TDD would be particularly appropriate here.

64bit shift problem

Why does this code print 18446744073709551615 rather than 0 as the last element?
(compiled with g++)
#include <iostream>
using namespace std;
int main(){
unsigned long long x = (unsigned long long) (-1);
for(int i=0; i <= 64; i++)
cout << i << " " << (x >> i) << endl;
cout << (x >> 64) << endl;
return 0;
}
When you shift a value by the word size or more, many CPUs (x86, for example) use only the shift count modulo the word size. So shifting by 64 effectively means shifting by 0 bits, which is no shift at all. You shouldn't rely on this, though: the standard does not define it, and it can differ between architectures.
Shifting a number by a count that is equal to or larger than its width is undefined behavior. You can only safely shift a 64-bit integer by 0 to 63 positions.
This warning from the compiler should be a hint:
"warning: right shift count >= width of type"
This results in undefined behavior:
http://sourcefrog.net/weblog/software/languages/C/bitshift.html
Well, you are shifting one too many times. You are shifting by 0 to 64 inclusive, which is a total of 65 shifts. You generally want:
for(int i=0; i < 64; i++)
....
You overflow the shift. If you've noticed, GCC even warns you:
warning: right shift count >= width of type
How come? You include 64 as a valid shift count, which is undefined behavior.
Counting from 0 to 64 gives 65 numbers (0 included), with 0 being the first bit (much like array indices).
#include <iostream>
using namespace std;
int main(){
unsigned long long x = (unsigned long long) (-1);
for(int i=0; i < 64; i++)
cout << i << " " << (x >> i) << endl;
cout << (x >> 63) << endl;
return 0;
}
Will produce the output you'd expect.
You can use:
static inline pack_t lshift_fix64(pack_t shiftee, short_idx_t shifter){
return (shiftee << shifter) & (-(shifter < 64));
}
for such a trick,
(-(shifter < 64)) == 0xffff ffff ffff ffff
if shifter < 64 and
(-(shifter < 64)) == 0x0
otherwise.
I get:
test.c:8: warning: right shift count >= width of type
so perhaps it's undefined behavior?
The bit pattern of -1 looks like 0xFFFFFFFFFFFFFFFF in hex, for 64 bit types. Thus if you print it as an unsigned variable you will see the largest value an unsigned 64 bit variable can hold, i.e. 18446744073709551615.
Since x is unsigned here, the right shift simply moves all bits to the right and fills the vacated positions with zeros. (For a negative signed value the result would be implementation-defined before C++20, and is typically an arithmetic shift that copies the sign bit.)
Another trap for the unwary: I know this is an old thread, but I came here looking for help. I got caught out on a 64-bit machine using 1<<k when I meant 1L<<k; no help from the compiler in this case :(