Converting a size_t into an integer (C++)

I've been trying to make a for loop that iterates based on the length of a network packet. The API exposes a size_t variable, event.packet->dataLength. I want to iterate from 0 to event.packet->dataLength - 7, increasing i by 10 on each iteration, but I am having a world of trouble.
I looked for solutions but have been unable to find anything useful. I tried converting the size_t to an unsigned int and doing the arithmetic with that but unfortunately it didn't work. Basically all I want is this:
for (int i = 0; i < event.packet->dataLength - 7; i+=10) { }
Though every time I do something like this or attempt at my conversions the i < # part is a huge number. They gave a printf statement in a tutorial for the API which used "%u" to print the actual number however when I convert it to an unsigned int it is still incorrect. I'm not sure where to go from here. Any help would be greatly appreciated :)

Why don't you change the type of i?
for (size_t i = 0; i < event.packet->dataLength - 7; i+=10) { }
Try to keep the types of all variables used together the same type; casts should be avoided.
There is no format specifier for size_t in C++03; you have to cast to the largest unsigned integer type you can and print that. (In C++11, formerly C++0x, the format specifier for size_t is %zu.) However, you shouldn't be using printf anyway:
std::cout << i; // print i, even if it's a size_t
While streams may be more verbose, they're more type safe and don't require you to memorize anything.
Keep in mind your actual loop logic may be flawed. (What happens, as genpfault notes, when dataLength is less than 7? The unsigned subtraction dataLength - 7 cannot go negative; it wraps around to a very large value.)

Do everything with signed arithmetic. Try:
for (int i = 0; i < int(event.packet->dataLength) - 7; i+=10) { }
Once you start using unsigned arithmetic with values that may be negative, and using comparison operators like <, you're in trouble. Much easier to keep things signed.

Is dataLength >= 7? If the result of dataLength-7 is negative, if you interpret it as unsigned, the result is a very large integer.
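A minimal sketch of that wraparound. The helper name loop_bound is hypothetical and simply stands in for the expression event.packet->dataLength - 7:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper standing in for event.packet->dataLength - 7.
// size_t is unsigned, so the subtraction wraps modulo 2^N
// instead of producing a negative number.
std::size_t loop_bound(std::size_t dataLength)
{
    return dataLength - 7;
}
```

With dataLength equal to 3, loop_bound returns SIZE_MAX - 3, which is the "huge number" seen in the loop condition.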

Use size_t for i.
For printf, if you don't have C99, only C90, cast to unsigned long, or unsigned long long. E.g.:
for (size_t i = 0; i < 10; ++i)
//printf("%llu\n", (unsigned long long)i);
printf("%lu\n", (unsigned long)i);
Otherwise use %zu

You should first check whether event.packet->dataLength < 7. If it's less than 7, the subtraction produces a value below 0, which wraps around when interpreted as unsigned: e.g. (with 32 bits) 0 = 0x00000000; -1 = 0 - 1 = 0xFFFFFFFF.
Again, the check:
if (event.packet->dataLength < 7) {
...
} else {
for (size_t i = 0; i < event.packet->dataLength - 7; i+=10) { }
}

"every time I do something like this or attempt at my conversions the i < # part is a huge number."
That indicates that original packet length is less than 7 (you're subtracting 7).
One fix is to use an in-practice-large-enough signed integer type, and the standard library provides ptrdiff_t for that purpose. Like,
#include <stddef.h> // ptrdiff_t is defined here (<cstddef> in C++).
typedef ptrdiff_t Size;
typedef Size Index;
void foo()
{
// ...
for( Index i = 0; i < Size( event.packet->dataLength ) - 7; i += 10 )
{
// ...
}
}
A more cumbersome workaround is to embed the whole thing in an if that checks that the size is at least 7.
Cheers & hth.,

Since event.packet->dataLength returns an unsigned type size_t:
1) Use size_t as the index variable type.
2) Ensure the math does not underflow (as @beldaz notes). Rather than subtract 7 from event.packet->dataLength, add 7 to i.
// for (int i = 0; i < event.packet->dataLength - 7; i+=10) { }
for (size_t i = 0; i + 7 < event.packet->dataLength; i += 10) { }
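To see how the rearranged condition behaves for short packets, here is a small sketch. The function count_iterations is hypothetical, and its dataLength parameter stands in for event.packet->dataLength:

```cpp
#include <cstddef>

// Hypothetical helper: counts how many times the loop body would run
// for a packet of the given length.
std::size_t count_iterations(std::size_t dataLength)
{
    std::size_t iterations = 0;
    // i + 7 does not wrap for realistic packet sizes,
    // whereas dataLength - 7 wraps whenever dataLength < 7.
    for (std::size_t i = 0; i + 7 < dataLength; i += 10) {
        ++iterations;
    }
    return iterations;
}
```

For any length of 7 or less the loop body never runs, so no separate length check is needed.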

Related

Get exact bit representation of a double, in C++

Let us say that we have a double, say, x = 4.3241;
Quite simply, I would like to know, how in C++, can one simply retrieve an int for each bit in the representation of a number?
I have seen other questions and read the page on bitset, but I'm afraid I still do not understand how to retrieve those bits.
So, for example, I would like the input to be x = 4.53, and if the bit representation was 10010101, then I would like 8 ints, each one representing each 1 or 0.
Something like:
double doubleValue = ...whatever...;
uint8_t *bytePointer = (uint8_t *)&doubleValue;
for(size_t index = 0; index < sizeof(double); index++)
{
uint8_t byte = bytePointer[index];
for(int bit = 0; bit < 8; bit++)
{
printf("%d", byte&1);
byte >>= 1;
}
}
... will print the bits out, ordered from least significant to most significant within bytes and reading the bytes from first to last. Depending on your machine architecture that means the bytes may or may not be in order of significance. Intel is strictly little endian so you should get all bits from least significant to most; most CPUs use the same endianness for floating point numbers as for integers but even that's not guaranteed.
Just allocate an array and store the bits instead of printing them.
(an assumption made: that there are eight bits in a byte; not technically guaranteed in C but fairly reliable on any hardware you're likely to encounter nowadays)
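One way to store the bits instead of printing them is sketched below. It is not the code above: it copies the double into a 64-bit integer with memcpy and then shifts, which yields the bits in order of significance as long as the platform uses the same byte order for doubles and integers. The name double_bits and the 64-bit double are assumptions.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Returns the 64 bits of a double as ints, most significant bit first.
// Assumes double is a 64-bit type (e.g. IEEE 754 binary64).
std::vector<int> double_bits(double value)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t raw = 0;
    std::memcpy(&raw, &value, sizeof raw);  // copy the object representation
    std::vector<int> bits;
    bits.reserve(64);
    for (int bit = 63; bit >= 0; --bit) {
        bits.push_back(static_cast<int>((raw >> bit) & 1u));
    }
    return bits;
}
```

On an IEEE 754 platform, double_bits(1.0) starts with a 0 sign bit, followed by the exponent 01111111111 and 52 zero bits of significand.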
This is extremely architecture-dependent. After gathering the following information
The endianness of your target architecture
The floating point representation (e.g. IEEE754)
The size of your double type
you should be able to get the bit representation you're searching for. An example tested on a x86_64 system
#include <iostream>
#include <climits>
int main()
{
double v = 72.4;
// Boilerplate to circumvent the fact bitwise operators can't be applied to double
union {
double value;
char array[sizeof(double)];
};
value = v;
for (int i = 0; i < sizeof(double) * CHAR_BIT; ++i) {
int relativeToByte = i % CHAR_BIT;
bool isBitSet = (array[sizeof(double) - 1 - i / CHAR_BIT] &
(1 << (CHAR_BIT - relativeToByte - 1))) == (1 << (CHAR_BIT - relativeToByte - 1));
std::cout << (isBitSet ? "1" : "0");
}
return 0;
}
The output is
0100000001010010000110011001100110011001100110011001100110011010
which, split into sign, exponent and significand (or mantissa), is
0 10000000101 (1.)0010000110011001100110011001100110011001100110011010
Anyway you're required to know how your target representation works, otherwise these numbers will pretty much be useless to you.
Since your question is unclear about whether you want those integers in the order that makes sense with regard to the internal representation of your number, or simply want to dump out the bytes at that address as you encounter them, I'm adding another, easier method that just dumps out every byte at that address (and shows another way of dealing with bit operators and double)
double v = 72.4;
uint8_t *array = reinterpret_cast<uint8_t*>(&v);
for (int i = 0; i < sizeof(double); ++i) {
uint8_t byte = array[i];
for (int bit = CHAR_BIT - 1; bit >= 0; --bit) // Print each byte
std::cout << ((byte & (1 << bit)) == (1 << bit));
}
The above code will simply print each byte from the one at lower address to the one with higher address.
Edit: since it seems you're just interested in how many 1s and 0s are there (i.e. the order totally doesn't matter), in this specific instance I agree with the other answers and I would also just go for a counting solution
uint8_t *array = reinterpret_cast<uint8_t*>(&v);
for (int i = 0; i < sizeof(double); ++i) {
uint8_t byte = array[i];
for (int j = 0; j < CHAR_BIT; ++j) {
std::cout << (byte & 0x1);
byte >>= 1;
}
}

Convert a decoded Base64 byte string to a vector of bools

I have a base64 string containing bits, and I have already decoded it with the code in here. But I'm unable to transform the resulting string into bits I can work with. Is there a way to convert the bytes contained in the string to a vector of bools containing the bits of the string?
I have tried converting each char with this code, but it failed to convert properly
void DecodedStringToBit(std::string const& decodedString, std::vector<bool> &bits) {
int it = 0;
for (int i = 0; i < decodedString.size(); ++i) {
unsigned char c = decodedString[i];
for (unsigned char j = 128; j > 0; j <<= 1) {
if (c&j) bits[++it] = true;
else bits[++it] = false;
}
}
}
Your inner for loop is botched: it's shifting j the wrong way. And honestly, if you want to work with 8-bit values, you should use the proper <stdint.h> types instead of unsigned char:
for (uint8_t j = 128; j; j >>= 1)
bits.push_back(c & j);
Also, remember to call bits.reserve(decodedString.size() * 8); so your program doesn't waste a bunch of time on resizing.
I'm assuming the bit order is MSB first. If you want LSB first, the loop becomes:
for (uint8_t j = 1; j; j <<= 1)
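Putting the corrected MSB-first loop and the reserve call together might look like the sketch below. The function name bytes_to_bits and the choice to return the vector by value are assumptions made for illustration:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Expands every byte of the decoded string into eight bools, MSB first.
std::vector<bool> bytes_to_bits(const std::string& decoded)
{
    std::vector<bool> bits;
    bits.reserve(decoded.size() * 8);  // avoid repeated reallocation
    for (unsigned char c : decoded) {
        // The mask starts at the top bit and shifts right until it is 0.
        for (std::uint8_t mask = 128; mask != 0; mask >>= 1) {
            bits.push_back((c & mask) != 0);
        }
    }
    return bits;
}
```

For the single byte 'A' (0x41, binary 01000001) this produces eight entries, with only the second and the last set to true.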
In OP's code, it is not clear if the vector bits is of sufficient size, for example, if it is resized by the caller (It should not be!). If not, then the vector does not have space allocated, and hence bits[++it] may not work; the appropriate thing might be to push_back. (Moreover, I think the code might need the post-increment of it, i.e. bits[it++] to start from bits[0].)
Furthermore, in OP's code, the purpose of unsigned char j = 128 and j <<= 1 is not clear. Wouldn't j be all zeros after the first iteration? If so, the inner loop would always run for only one iteration.
I would try something like this (not compiled):
void DecodedStringToBit(std::string const& decodedString,
std::vector<bool>& bits) {
for (std::size_t charIndex = 0; charIndex != decodedString.size(); ++charIndex) {
const unsigned char c = decodedString[charIndex];
for (int bitIndex = 0; bitIndex != CHAR_BIT; ++bitIndex) {
// CHAR_BIT = bits in a char = 8
const bool bit = c & (1 << bitIndex); // bitwise-AND with mask
bits.push_back(bit);
}
}
}

Separating digits of integer using pointers

I have an integer(i) occupying 4 bytes and i am assuming that it is stored in the memory like this, with starting address as 1000,
If i write int*p=&i;
p now stores the starting address which is 1000 here.
if i increment p it points to the address 1004.
Is there any way to traverse the address 1000, 1001, 1002 and 1003 so that i can separate and print the digits 1 ,5,2,6 using pointers?
Please help..... :( (newbie)
My assumption of storage maybe wrong Can anyone please help me correct it? :(
EDIT 1
According to the answer given by Mohit Jain below and suggestions by others,
unsigned char *cp = reinterpret_cast<unsigned char *>(&i);
for(size_t idx = 0; idx < sizeof i; ++idx) {
cout << static_cast<int>(cp[idx]);
}
I am getting the answer as
246 5 0 0 .
I realized that the way I was assuming the memory structure was wrong,
So is there no way to get the actual digits using pointers??
An int with the value 1526 will not normally be stored as four bytes with the values 1, 5, 2 and 6.
Instead, it'll be stored in binary. Assuming a little-endian machine with a 4-byte int, the bytes will have the values 246, 5, 0, 0 (and if it's big-endian, you'll get the same values in the reverse order: 0, 0, 5, 246). The reason for those numbers is that each byte can store a value from 0 to 255, so 1526 is stored as 5 * 256 + 246. When dealing with values in memory like this, it's often convenient (and quite common) to use hexadecimal instead of decimal, in which case you'd be looking at it as 0x05F6.
The usual way to get decimal digits involves more math than pointers. For example, the least significant digit will be the remainder after dividing the value by 10.
To list the memory contents
Using pointer (endian-ness dependent output)
unsigned char *cp = reinterpret_cast<unsigned char *>(&i);
for(size_t idx = 0; idx < sizeof i; ++idx) {
cout << static_cast<int>(cp[idx]);
}
Without using pointer (endian-ness independent output), because digits are not stored the way you assume.
int copy = i;
unsigned int mask = (1U << CHAR_BIT) - 1U;
for(size_t idx = 0; idx < sizeof i; ++idx) {
cout << (copy & mask);
copy >>= CHAR_BIT;
}
To list the digits
If you want the digits of integer using pointer you should first convert the integer to a string:
std::string digits = std::to_string(i); // You can alternatively use stringstream
const char *p = digits.c_str(); // c_str() returns a pointer to const char
for(size_t idx = 0; idx < digits.length(); ++idx) cout << (*p++);
You can cast the pointer to (char *) and increment that pointer to point to beginning of individual bytes. However, your assumption of storage is wrong, so you will not get the digits like that.
As far as I can see, you want to extract each digit of a number.
To achieve that, you need to:
1) get the remainder of i divided by 10: const int r = i % 10;
2) divide i by 10: i /= 10;
3) if i is not 0, go to step 1.
Implementation (not tested) could be like this:
do
{
const int r = i % 10;
// do anything you need with r
i /= 10;
} while (i > 0);
This will give you each digit, starting from the least significant.
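A self-contained sketch of the same remainder-and-divide idea, collecting the digits and reversing them so the most significant comes first. The name digits_of and the restriction to non-negative values are assumptions:

```cpp
#include <algorithm>
#include <vector>

// Returns the decimal digits of a non-negative value,
// most significant digit first.
std::vector<int> digits_of(unsigned int value)
{
    std::vector<int> digits;
    do {
        digits.push_back(static_cast<int>(value % 10));  // least significant digit
        value /= 10;
    } while (value > 0);
    std::reverse(digits.begin(), digits.end());
    return digits;
}
```

For the value 1526 from the question this yields 1, 5, 2, 6. The do-while form ensures that 0 produces the single digit 0.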

Find largest unsigned int .... Why doesn't this work?

Couldn't you initialize an unsigned int and then increment it until it doesn't increment anymore? That's what I tried to do and I got a runtime error "Timeout." Any idea why this doesn't work? Any idea how to do it correctly?
#include <iostream>
int main() {
unsigned int i(0), j(1);
while (i != j) {
++i;
++j;
}
std::cout << i;
return 0;
}
Unsigned arithmetic is defined as modulo 2^n in C++ (where n is
the number of bits). So when you increment the maximum value,
you get 0.
Because of this, the simplest way to get the maximum value is to
use -1:
unsigned int i = -1;
std::cout << i;
(If the compiler gives you a warning, and this bothers you, you
can use 0U - 1, or initialize with 0, and then decrement.)
Since i will never be equal to j, you have an infinite loop.
Additionally, this is a very inefficient method for determining the maximum value of an unsigned int. numeric_limits gives you the result without looping for 2^(16, 32, 64, or however many bits are in your unsigned int) iterations. If you didn't want to do that, you could write a much smaller loop:
unsigned int shifts = sizeof(unsigned int) * 8; // or use CHAR_BIT from <climits> instead of 8
unsigned int maximum_value = 1;
for (int i = 1; i < shifts; ++i)
{
maximum_value <<= 1;
++maximum_value;
}
Or simply do
unsigned int maximum = (unsigned int)-1;
i will always be different than j, so you have entered an endless loop. If you want to take this approach, your code should look like this:
unsigned int i(0), j(1);
while (i < j) {
++i;
++j;
}
std::cout << i;
return 0;
Notice I changed it to while (i<j). Once j overflows i will be greater than j.
When an overflow happens, the value doesn't just stay at the highest; it wraps back around to the lowest possible number.
i and j will never be equal to each other. When an unsigned integral value reaches its maximum, adding 1 to it wraps it around to the minimum, which is 0.
For example if to consider unsigned char then its maximum is 255. After adding 1 you will get 0.
So your loop is infinite.
I assume you're trying to find the maximum value that an unsigned int can store (65,535 in decimal if unsigned int is 16 bits wide; 4,294,967,295 for the more common 32 bits). The program times out because when the value hits the maximum it can store, it "goes off the end" and wraps to 0. When j wraps, j will be 0 and i will be at the maximum.
This means that the way you're going about it, i would NEVER equal j, and the loop would run indefinitely. If you changed it to what Damien has, you'd have i == 65,535; j equal to 0.
Couldn't you initialize an unsigned int and then increment it until it doesn't increment anymore?
No. Unsigned arithmetic is modular, so it wraps around to zero after the maximum value. You can carry on incrementing it forever, as your loop does.
Any idea how to do it correctly?
unsigned int max = -1; // or
unsigned int max = std::numeric_limits<unsigned int>::max();
or, if you want to use a loop to calculate it, change your condition to (j != 0) or (i < j) to stop when j wraps. Since i is one behind j, that will contain the maximum value. Note that this might not work for signed types - they give undefined behaviour when they overflow.
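The approaches above can be compared in a short sketch. The function names are hypothetical, and the counting version uses unsigned char rather than unsigned int so that the loop finishes after a few hundred iterations:

```cpp
#include <limits>

// Relies on unsigned arithmetic being modulo 2^N:
// converting -1 to an unsigned type yields its maximum value.
unsigned int max_by_conversion()
{
    return static_cast<unsigned int>(-1);
}

// Counts upward until j, which stays one ahead of i, wraps to 0.
// At that point i holds the maximum value of the type.
unsigned char max_by_counting()
{
    unsigned char i = 0, j = 1;
    while (j != 0) {
        ++i;
        ++j;
    }
    return i;
}
```

Both results should agree with std::numeric_limits<T>::max() for the respective type, which remains the most direct way to get the value.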

Bit packing of array of integers

I have an array of integers; let's assume they are of type int64_t. Now, I know that only the first n bits of every integer are meaningful (that is, I know that they are limited by some bounds).
What is the most efficient way to convert the array in the way that all unnecessary space is removed (i.e. I have the first integer at a[0], the second one at a[0] + n bits and so on) ?
I would like it to be general as much as possible, because n would vary from time to time, though I guess there might be smart optimizations for specific n like powers of 2 or sth.
Of course I know that I can just iterate value over value, I just want to ask you StackOverflowers if you can think of some more clever way.
Edit:
This question is not about compressing the array to take as little space as possible. I just need to "cut" n bits from every integer, and given the array I know the exact number n of bits I can safely cut.
Today I released: PackedArray: Packing Unsigned Integers Tightly (github project).
It implements a random access container where items are packed at the bit-level. In other words, it acts as if you were able to manipulate a e.g. uint9_t or uint17_t array:
PackedArray principle:
. compact storage of <= 32 bits items
. items are tightly packed into a buffer of uint32_t integers
PackedArray requirements:
. you must know in advance how many bits are needed to hold a single item
. you must know in advance how many items you want to store
. when packing, behavior is undefined if items have more than bitsPerItem bits
PackedArray general in memory representation:
|-------------------------------------------------- - - -
|       b0       |       b1       |       b2       |
|-------------------------------------------------- - - -
| i0 | i1 | i2 | i3 | i4 | i5 | i6 | i7 | i8 | i9 |
|-------------------------------------------------- - - -
. items are tightly packed together
. several items end up inside the same buffer cell, e.g. i0, i1, i2
. some items span two buffer cells, e.g. i3, i6
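The layout described above can be illustrated with a minimal sketch. This is not PackedArray's actual API: the functions pack_items and get_item are hypothetical, and the choice to place items starting from the least significant bit of each cell is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs the low `bits` bits (1..32) of each item
// into 32-bit cells, filling each cell from its least significant bit.
std::vector<std::uint32_t> pack_items(const std::vector<std::uint32_t>& items,
                                      unsigned bits)
{
    std::vector<std::uint32_t> cells((items.size() * bits + 31) / 32, 0);
    const std::uint64_t mask = (std::uint64_t(1) << bits) - 1;
    for (std::size_t i = 0; i < items.size(); ++i) {
        const std::size_t bitPos = i * bits;
        const std::size_t cell = bitPos / 32;
        const unsigned offset = static_cast<unsigned>(bitPos % 32);
        const std::uint64_t shifted = (items[i] & mask) << offset;
        cells[cell] |= static_cast<std::uint32_t>(shifted);
        if (offset + bits > 32)  // item spans two cells
            cells[cell + 1] |= static_cast<std::uint32_t>(shifted >> 32);
    }
    return cells;
}

// Hypothetical helper: reads item `index` back from the packed cells.
std::uint32_t get_item(const std::vector<std::uint32_t>& cells,
                       unsigned bits, std::size_t index)
{
    const std::uint64_t mask = (std::uint64_t(1) << bits) - 1;
    const std::size_t bitPos = index * bits;
    const std::size_t cell = bitPos / 32;
    const unsigned offset = static_cast<unsigned>(bitPos % 32);
    std::uint64_t window = cells[cell];
    if (offset + bits > 32)  // pull in the neighbouring cell
        window |= std::uint64_t(cells[cell + 1]) << 32;
    return static_cast<std::uint32_t>((window >> offset) & mask);
}
```

Packing the values 0 to 20 with 5 bits per item takes 105 bits, so four 32-bit cells, and item 6 (bits 30 to 34) is one of those that spans two cells.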
I agree with keraba that you need to use something like Huffman coding or perhaps the Lempel-Ziv-Welch algorithm. The problem with bit-packing the way you are talking about is that you have two options:
Pick a constant n such that the largest integer can be represented.
Allow n to vary from value to value.
The first option is relatively easy to implement, but is really going to waste a lot of space unless all integers are rather small.
The second option has the major disadvantage that you have to convey changes in n somehow in the output bitstream. For instance, each value will have to have a length associated with it. This means you are storing two integers (albeit smaller integers) for every input value. There's a good chance you'll increase the file size with this method.
The advantage of Huffman or LZW is that they create codebooks in such a way that the length of the codes can be derived from the output bitstream without actually storing the lengths. These techniques allow you to get very close to the Shannon limit.
I decided to give your original idea (constant n, remove unused bits and pack) a try for fun and here is the naive implementation I came up with:
#include <stdint.h> // int64_t
#include <stdio.h>
int pack(int64_t* input, int nin, void* output, int n)
{
int64_t inmask = 0;
unsigned char* pout = (unsigned char*)output;
int obit = 0;
int nout = 0;
*pout = 0;
for(int i=0; i<nin; i++)
{
inmask = (int64_t)1 << (n-1);
for(int k=0; k<n; k++)
{
if(obit>7)
{
obit = 0;
pout++;
*pout = 0;
}
*pout |= (((input[i] & inmask) >> (n-k-1)) << (7-obit));
inmask >>= 1;
obit++;
nout++;
}
}
return nout;
}
int unpack(void* input, int nbitsin, int64_t* output, int n)
{
unsigned char* pin = (unsigned char*)input;
int64_t* pout = output;
int nbits = nbitsin;
unsigned char inmask = 0x80;
int inbit = 0;
int nout = 0;
while(nbits > 0)
{
*pout = 0;
for(int i=0; i<n; i++)
{
if(inbit > 7)
{
pin++;
inbit = 0;
}
*pout |= ((int64_t)((*pin & (inmask >> inbit)) >> (7-inbit))) << (n-i-1);
inbit++;
}
pout++;
nbits -= n;
nout++;
}
return nout;
}
int main()
{
int64_t input[] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20};
int64_t output[21];
unsigned char compressed[21*8];
int n = 5;
int nbits = pack(input, 21, compressed, n);
int nout = unpack(compressed, nbits, output, n);
for(int i=0; i<=20; i++)
printf("input: %lld output: %lld\n", (long long)input[i], (long long)output[i]); // cast so the arguments match %lld
}
This is very inefficient because it steps one bit at a time, but that was the easiest way to implement it without dealing with issues of endianness. I have not tested this with a wide range of values either, just the ones in the test. Also, there is no bounds checking and it is assumed the output buffers are long enough. So what I am saying is that this code is probably only good for educational purposes to get you started.
Most any compression algorithm will get close to the minimum entropy needed to encode the integers, for example, Huffman coding, but accessing it like an array will be non-trivial.
Starting from Jason B's implementation, I eventually wrote my own version which processes bit-blocks instead of single bits. One difference is that it is LSB-first: it starts from the lowest output bits and goes to the highest. This only makes it harder to read with a binary dump, like Linux xxd -b. As a detail, int* can be trivially changed to int64_t*, and it would be even better to make it unsigned. The code uses std::min, so it needs #include <algorithm>. I have already tested this version with a few million arrays and it seems solid, so I will share it with the rest:
int pack2(int *input, int nin, unsigned char* output, int n)
{
int obit = 0;
int ibit = 0;
int ibite = 0;
int nout = 0;
if(nin>0) output[0] = 0;
for(int i=0; i<nin; i++)
{
ibit = 0;
while(ibit < n) {
ibite = std::min(n, ibit + 8 - obit);
output[nout] |= (input[i] & (((1 << ibite)-1) ^ ((1 << ibit)-1))) >> ibit << obit;
obit += ibite - ibit;
nout += obit >> 3;
if(obit & 8) output[nout] = 0;
obit &= 7;
ibit = ibite;
}
}
return nout;
}
int unpack2(int *oinput, int nin, unsigned char* ioutput, int n)
{
int obit = 0;
int ibit = 0;
int ibite = 0;
int nout = 0;
for(int i=0; i<nin; i++)
{
oinput[i] = 0;
ibit = 0;
while(ibit < n) {
ibite = std::min(n, ibit + 8 - obit);
oinput[i] |= (ioutput[nout] & (((1 << (ibite-ibit+obit))-1) ^ ((1 << obit)-1))) >> obit << ibit;
obit += ibite - ibit;
nout += obit >> 3;
obit &= 7;
ibit = ibite;
}
}
return nout;
}
I know this might seem like the obvious thing to say as I'm sure there's actually a solution, but why not use a smaller type, like uint8_t (max 255)? or uint16_t (max 65535)?. I'm sure you could bit-manipulate on an int64_t using defined values and or operations and the like, but, aside from an academic exercise, why?
And on the note of academic exercises, Bit Twiddling Hacks is a good read.
If you have fixed sizes, e.g. you know your number is 38 bits rather than 64, you can build structures using bit-field specifications, assuming you also have smaller elements to fit in the remaining space.
struct example {
/* 64bit number cut into 3 different sized sections */
uint64_t big_num:38;
uint64_t small_num:16;
uint64_t itty_num:10;
/* 8 bit number cut in two */
uint8_t nibble_A:4;
uint8_t nibble_B:4;
};
This isn't big/little endian safe without some hoop-jumping, so it can only be used within a program rather than in an exported data format. It's quite often used to store boolean values in single bits without defining shifts and masks.
I don't think you can avoid iterating across the elements.
AFAIK Huffman encoding requires the frequencies of the "symbols", which unless you know the statistics of the "process" generating the integers, you will have to compute (by iterating across every element).