Make your own data range in C++

I want to have a data variable which will be an integer, and its range will be from
0 to 1,000,000.
For example normal int variables can store numbers from -2,147,483,648 to 2,147,483,647.
I want the new data type to have a smaller range so it can have LESS SIZE.
If there is a way to do that, please let me know.

There isn't; you can't specify arbitrary ranges for variables like this in C++.
You need 20 bits to store 1,000,000 different values, so a 32-bit integer is the best you can do without creating a custom data type (and even then you'd only save 1 byte by going to 24 bits, since you can't allocate less than 8 bits).
As for enforcing the range of values, you could do that with a custom class, but I assume your goal isn't the validation but the size reduction.

So, there's no true good answer to this problem. Here are a few thoughts though:
If you're talking about an array of these 20-bit values, then perhaps the answers at this question will be helpful: Bit packing of array of integers. A sketch of the idea follows.
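As a minimal sketch of that idea (the struct packed20 and its member names are mine, not from the linked question), here is a buffer of N 20-bit values that are allowed to straddle 32-bit word boundaries:
#include <array>
#include <cstddef>
#include <cstdint>

template <std::size_t N>
struct packed20 {
    std::array<uint32_t, (N * 20 + 31) / 32> words{};

    uint32_t get(std::size_t i) const {
        const std::size_t bit = i * 20, w = bit / 32, off = bit % 32;
        uint64_t pair = words[w];
        if (off + 20 > 32)                        // value spills into the next word
            pair |= uint64_t(words[w + 1]) << 32;
        return uint32_t(pair >> off) & 0xFFFFFu;  // extract the 20-bit field
    }

    void set(std::size_t i, uint32_t v) {
        const std::size_t bit = i * 20, w = bit / 32, off = bit % 32;
        uint64_t pair = words[w];
        if (off + 20 > 32)
            pair |= uint64_t(words[w + 1]) << 32;
        pair = (pair & ~(uint64_t{0xFFFFF} << off))  // clear the old field
             | (uint64_t(v & 0xFFFFFu) << off);      // write the new one
        words[w] = uint32_t(pair);
        if (off + 20 > 32)
            words[w + 1] = uint32_t(pair >> 32);
    }
};
For example, packed20<8> stores eight values in five 32-bit words (20 bytes) instead of the 32 bytes that eight plain ints would take.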
On the other hand, perhaps we are talking about an object, that has 3 int20_ts in it, and you'd like it to take up less space than it would normally. In that case, we could use a bitfield.
#include <cstdio>

struct object {
    int a : 20;
    int b : 20;
    int c : 20;
} __attribute__((__packed__));

int main() {
    printf("sizeof object: %zu\n", sizeof(object)); // %zu: sizeof yields a size_t
}
This code will probably print 8, signifying that it is using 8 bytes of space, not the 12 that you would normally expect.

Data types can only be a multiple of 8 bits in size. This is because, otherwise, that data type wouldn't be addressable: imagine a pointer to 5 bits of data. No such thing can exist.

Related

How can I have exactly 2 bits in memory?

I need to store a value in a data structure that can go from 0 to 3, so I need 2 bits. The data structure will have 2^16 locations, so I want 2^16 * 2 bits in total. How do you get exactly 2 bits in memory in C++?
You need two bits per unit (not three), so you can pack four units into one byte, or 16 units into one 32-bit integer.
So you will need a std::array<uint32_t, 4096> to accommodate 2^16 units of 2-bit values.
You access the nth value as follows:
#include <array>
#include <cstddef>
#include <cstdint>

unsigned int get(std::size_t n, std::array<uint32_t, 4096> const & arr)
{
    const uint32_t u = arr[n / 16];      // the 32-bit word holding value n
    return (u >> (2 * (n % 16))) & 0x3;  // shift it down and mask off two bits
}
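For completeness, here is a matching setter, as a sketch (the name set is mine, not from the answer):
void set(std::size_t n, unsigned int v, std::array<uint32_t, 4096> & arr)
{
    const unsigned shift = 2 * (n % 16);
    arr[n / 16] = (arr[n / 16] & ~(uint32_t{0x3} << shift)) // clear the 2-bit slot
                | ((uint32_t(v) & 0x3u) << shift);          // write the new value
}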
Alternatively, you could go with a bitfield:
struct BF32 {
    uint32_t u0 : 2;
    uint32_t u1 : 2;
    //...
    uint32_t uF : 2;
};
And then make an std::array<BF32, 4096>.
You cannot allocate a single object that is less than 1 byte (because 1 byte is the smallest addressable unit in the system).
You can, however, have portions of a structure that are smaller than a byte using bitfields. You could create one of these to hold 8 of your values; its size is exactly 3 bytes:
#pragma pack(1) // MSVC requires this
struct three_by_eight {
    unsigned value1 : 3;
    unsigned value2 : 3;
    unsigned value3 : 3;
    unsigned value4 : 3;
    unsigned value5 : 3;
    unsigned value6 : 3;
    unsigned value7 : 3;
    unsigned value8 : 3;
} __attribute__ ((packed)); // GCC requires this
These can be clumsy to work with since they can't be accessed using []. Your best bet would be to create your own class that works like a bitset but on 3 bits instead of 1, as sketched below.
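As a sketch of such a class (the name tribit_array and its interface are mine), packing 3-bit values into a byte buffer:
#include <cstddef>
#include <cstdint>
#include <vector>

class tribit_array {
    std::vector<uint8_t> bytes_;
public:
    explicit tribit_array(std::size_t n) : bytes_((n * 3 + 7) / 8, 0) {}

    unsigned get(std::size_t i) const {   // read value i, bit by bit
        std::size_t bit = i * 3;
        unsigned v = 0;
        for (int k = 0; k < 3; ++k, ++bit)
            v |= unsigned((bytes_[bit / 8] >> (bit % 8)) & 1u) << k;
        return v;
    }

    void set(std::size_t i, unsigned v) { // write value i, bit by bit
        std::size_t bit = i * 3;
        for (int k = 0; k < 3; ++k, ++bit) {
            const uint8_t mask = uint8_t(1u << (bit % 8));
            if ((v >> k) & 1u) bytes_[bit / 8] |= mask;
            else               bytes_[bit / 8] &= uint8_t(~mask);
        }
    }
};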
If you are not working on an embedded system and resources are sufficient, you can have a look at std::bitset<> which will make your job as a programmer easier.
But if you are working on an embedded system, the bitset is probably not good for you (your compiler probably doesn't even support templates). There are a number of techniques for manipulating bits, each with its own quirks; here's an article that might help you:
> http://www.atmel.com/dyn/resources/prod_documents/avr_3_04.pdf
0 to 3 has 4 possible values. Since log2(4) == 2, or because 2^2 == 4, you need two bits, not three.
You might want to use bit fields.
There was a discussion on the size allocated to bit-field structs last night. A struct cannot be smaller than a byte, and on most machines and compilers it will be either 2 or 4 bytes, depending on the compiler and word size. So, no, you can't get a 3-bit struct (2-bit, as you actually need). You can, however, pack bits yourself into an array of, say, uint64_ts. Or you could make a struct with 16 2-bit members and see if gcc makes that a 4-byte struct, then use an array of those.
There is a very old trick to sneak a couple of bits around if you already have some data structures. It is quite nasty, and unless you have extremely good reasons, it is most likely not a good idea. I'm just pointing it out in case you really, really need to save a couple of bits.
Due to alignment, pointers on x86 or x64 are often multiples of 4, hence the two least significant bits of such pointers (e.g. pointers to int) are always 0. You can exploit this and sneak your two bits in there, but you have to make sure to mask them off before dereferencing those pointers.
Again, this is nasty, dangerous, and pretty much undefined behaviour, but perhaps it is worth it in your case.
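A sketch of the trick (the function names are mine; this assumes int objects are at least 4-byte aligned, which is common on x86/x64 but not guaranteed by the standard):
#include <cassert>
#include <cstdint>

int* tag_pointer(int* p, unsigned two_bits) {
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    assert((bits & 0x3u) == 0 && "pointer must be 4-byte aligned");
    return reinterpret_cast<int*>(bits | (two_bits & 0x3u));
}

unsigned extract_tag(int* tagged) {
    return unsigned(reinterpret_cast<std::uintptr_t>(tagged) & 0x3u);
}

int* strip_tag(int* tagged) { // must be called before dereferencing!
    return reinterpret_cast<int*>(
        reinterpret_cast<std::uintptr_t>(tagged) & ~std::uintptr_t{0x3});
}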
3^5 = 243, so if each entry has only three possible values, five entries fit in 8 bits (compared with 2 bits per entry, that is about 20% less space when storing a lot of data). All you need is a lookup table for lookups and manipulations in both directions.
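If each entry really has only three possible values (0 through 2), the packing might look like this sketch (function names are mine):
#include <cstdint>

// Five base-3 digits fit in one byte, since 3^5 = 243 <= 256.
uint8_t pack5(const uint8_t d[5]) {
    uint16_t v = 0;
    for (int i = 4; i >= 0; --i)
        v = uint16_t(v * 3 + d[i]); // Horner's scheme in base 3
    return uint8_t(v);
}

void unpack5(uint8_t packed, uint8_t d[5]) {
    unsigned v = packed;
    for (int i = 0; i < 5; ++i) {
        d[i] = uint8_t(v % 3); // peel off the lowest base-3 digit
        v /= 3;
    }
}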

Creating an integer variable of a defined size

I want to define an integer variable in C/C++ such that my integer can store 10 bytes of data, or maybe x bytes of data as defined by me in the program.
For now, I tried:
int *ptr;
ptr = (int *)malloc(10);
Now if I take sizeof(ptr), it shows 4 and not 10. Why?
C and C++ compilers implement several sizes of integer (typically 1, 2, 4, and 8 bytes, i.e. 8, 16, 32, and 64 bits), but without some helper code to perform the arithmetic operations you can't really make arbitrarily sized integers.
The declarations you did:
int *ptr;
ptr = (int *)malloc(10);
Made what is probably a broken array of integers. Broken because unless you are on a system where 10 % sizeof(int) == 0, you have extra bytes at the end which can't be used to store an entire integer.
There are several big-number class libraries you should be able to locate for C++ which do implement many of the operations you may want to perform on your 10-byte (80-bit) integers. With C you would have to do the operations as function calls, because the language lacks operator overloading.
Your sizeof(ptr) evaluated to 4 because you are using a machine with 4-byte pointers (a 32-bit system). sizeof tells you nothing about the size of the data that a pointer points to. The only place where this gets tricky is when you use sizeof on an array's name, which is different from using it on a pointer. I mention this because array names and pointers share so many similarities.
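To make the distinction concrete, a small sketch (the printed sizes assume the 32-bit platform from the question):
#include <cstdio>

int main() {
    int arr[10];
    int *p = arr;
    printf("%zu\n", sizeof(arr)); // 40 with 4-byte ints: the whole array
    printf("%zu\n", sizeof(p));   // 4 on a 32-bit system: just the pointer
}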
Because on your machine the size of a pointer is 4 bytes. Note that the type of the variable ptr is int *. You cannot get the complete allocated size with the sizeof operator if you malloc or new the memory, because sizeof is a compile-time operator: the value is evaluated at compile time.
It is showing 4 bytes because a pointer on your platform is 4 bytes. The block of memory the pointer addresses may be of any arbitrary size, in your case it is 10 bytes. You need to create a data structure if you need to track that:
struct VariableInteger
{
    int *ptr;    // the allocated integer data
    size_t size; // number of bytes ptr points to
};
Also, using an int type for your ptr variable doesn't mean the language will allow you to do arithmetic operations on anything of a size different from the size of int on your platform.
Because the size of the pointer is 4. Try something like:
typedef struct
{
    int a[10];
} big_int_t;

big_int_t x;
printf("%zu\n", sizeof(x)); // %zu, since sizeof yields a size_t
Note also that an int is typically not 1 byte in size, so this will probably print 20 or 40, depending on your platform.
Integers in C++ are of a fixed size. Do you mean an array of integers? As for sizeof, the way you are using it, it tells you that your pointer is four bytes in size. It doesn't tell you the size of a dynamically allocated block.
Few or no compilers support 10-byte integer arithmetic. If you want to use integers bigger than the values specified in <limits.h>, you'll need to either find a library with support for big integers or make your own class which defines the mathematical operators.
I believe what you're looking for is known as "Arbitrary-precision arithmetic". It allows you to have numbers of any size and any number of decimals. Instead of using fixed-size assembly level math functions, these libraries are coded to do math how one would do them on paper.
Here's a link to a list of arbitrary-precision arithmetic libraries in a few different languages, compliments of Wikipedia: link.

How do I define a 24-bit array in C++?

How do I define a 24-bit array in C++? (variable declaration)
There is no 24-bit variable type in C++.
You can use a bitpacked struct:
struct ThreeBytes {
    uint32_t value : 24;
};
But it is not guaranteed that sizeof ThreeBytes == 3.
You can also just use uint32_t or int32_t, depending on what you need.
Another choice is to use std::bitset:
typedef std::bitset<24> ThreeBytes;
Then make an array out of that:
ThreeBytes *myArray = new ThreeBytes[10];
Of course, if you really just need "three bytes", you can make an array of arrays:
typedef uint8_t ThreeBytes[3];
Note that uint8_t and friends are non-standard (they were only standardized later, in C++11), and are used simply for clarification.
An unsigned byte array of 3 bytes is 24 bits. Depending on how you are planning to use it, it could do.
unsigned char arrayname[3];
As @GMan points out, you should be aware that not all systems have 8-bit chars.
If you intend to perform bitwise operations on them, then simply use an integral type that has at least 24 bits. An int is 32 bits on most platforms, so an int may be suitable for this purpose.
EDIT: Since you actually wanted an array of 24 bit variables, the most straightforward way to do this is to create an array of ints or longs (as long as it's an integral data type that contains at least 24 bits) and treat each element as though it was 24 bits.
Depending on your purpose (for example, if you are concerned that using a 32bit type might waste too much memory), you might also consider creating an array of bytes with three times the length.
I used to do that a lot for storing RGB images. To access the n-th element, you multiply by three and then add zero, one, or two depending on which "channel" of the element you want. Of course, if you want to access all 24 bits as one integer, this approach requires some additional arithmetic; see the sketch below.
So simply unsigned char myArray[ELEMENTS * 3]; where ELEMENTS is the number of 24-bit elements that you want.
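A sketch of that additional arithmetic (the helper get24 and the little-endian byte order are my choices, not from the answer):
#include <cstddef>
#include <cstdint>

const std::size_t ELEMENTS = 1024; // example size; use whatever you need
unsigned char myArray[ELEMENTS * 3];

// Read element n as one 24-bit integer assembled from its three bytes.
std::uint32_t get24(std::size_t n) {
    return std::uint32_t(myArray[3 * n])
         | (std::uint32_t(myArray[3 * n + 1]) << 8)
         | (std::uint32_t(myArray[3 * n + 2]) << 16);
}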
Use bitset or bitvector if they are supported on your platform. (They are :) )
std::vector<std::bitset<24> > myArray;

Custom byte size?

So, you know how the primitive type char has a size of 1 byte? How would I make a primitive with a custom size? So, like, instead of an int with a size of 4 bytes, I make one with a size of, let's say, 16.
Is there a way to do this? Is there a way around it?
It depends on why you are doing this. Usually, you can't use types of less than 8 bits, because that is the addressable unit for the architecture. You can use structs, however, to define different lengths:
struct s {
    unsigned int a : 4;  // a is 4 bits
    unsigned int b : 4;  // b is 4 bits
    unsigned int c : 16; // c is 16 bits
};
However, there is no guarantee that the struct will be 24 bits long. Also, this can cause endianness issues. Where you can, it's best to use system-independent types, such as uint16_t, etc. You can also use bitwise operators and bit shifts to twiddle things very specifically; a sketch follows.
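As a sketch of the shift-and-mask approach (the field layout and names are mine), packing a 4-bit and a 12-bit field into one 16-bit word without bitfields:
#include <cstdint>

uint16_t pack(uint16_t small4, uint16_t big12) {
    // low 4 bits hold small4, the 12 bits above them hold big12
    return uint16_t((small4 & 0xFu) | ((big12 & 0xFFFu) << 4));
}

uint16_t get_small4(uint16_t word) { return word & 0xFu; }                  // bits 0-3
uint16_t get_big12(uint16_t word)  { return uint16_t(word >> 4) & 0xFFFu; } // bits 4-15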
Normally you'd just make a struct that represents the data in which you're interested. If it's 16 bytes of data, either it's an aggregate of a number of smaller types or you're working on a processor that has a native 16-byte integral type.
If you're trying to represent extremely large numbers, you may need to find a special library that handles arbitrarily-sized numbers.
In C++11, there is an excellent solution for this: std::aligned_storage.
#include <iostream>
#include <type_traits>

int main()
{
    // Raw, uninitialized storage suitably sized and aligned for an int.
    typedef std::aligned_storage<sizeof(int), alignof(int)>::type memory_type;
    memory_type i;
    reinterpret_cast<int&>(i) = 5;
    std::cout << reinterpret_cast<int&>(i) << std::endl;
    return 0;
}
It allows you to declare a block of uninitialized storage on the stack.
If you want to make a new type, typedef it. If you want it to be 16 bytes in size, typedef a struct that has 16 bytes of member data within it. Just beware that quite often compilers will pad things to match your system's alignment needs. A 1-byte struct rarely remains 1 byte without care.
You could just static_cast to and from std::string. I don't know enough C++ to give an example, but I think this would be pretty intuitive.

Define the smallest possible datatype in C++ that can hold six values

I want to define my own datatype that can hold a single one of six possible values, in order to learn more about memory management in C++. In numbers, I want to be able to hold 0 through 5. In binary, three bits would suffice (101 = 5), although some bit patterns (6 and 7) won't be used. The datatype should also consume as little memory as possible.
I'm not sure how to accomplish this. First, I tried an enum with defined values for all the fields. As far as I know, the values are given in hex there, so one "hex digit" should allow me to store 0 through 15. But comparing it to a char (with sizeof) shows that it is 4 times the size of a char, and a char holds 0 through 255 if I'm not mistaken.
#include <iostream>
enum Foo
{
    a = 0x0,
    b = 0x1,
    c = 0x2,
    d = 0x3,
    e = 0x4,
    f = 0x5,
};

int main()
{
    Foo myfoo = a;
    char mychar = 'a';
    std::cout << sizeof(myfoo);  // prints 4
    std::cout << sizeof(mychar); // prints 1
    return 1;
}
I've clearly misunderstood something, but fail to see what, so I turn to SO. :)
Also, when writing this post I realised that I clearly lack some parts of the vocabulary. I've made this post a community wiki; please edit it so I can learn the correct words for everything.
A char is the smallest possible type.
If you happen to know that you need several such 3-bit values in a single place, you can use a structure with bitfield syntax:
struct foo {
    unsigned int val1 : 3;
    unsigned int val2 : 3;
};
and hence get 2 of them within one byte. In theory you could pack 10 such fields into a 32-bit "int" value.
C++0x will contain strongly typed enumerations, where you can specify the underlying datatype (in your example char), but current C++ does not support this. The standard is not clear about the use of a char here (the examples are with int, short, and long), but it mentions the underlying integral type, and that would include char as well.
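As a sketch of that coming syntax (this is the C++0x/C++11 form; the static_assert, also C++0x, is my addition):
// Scoped enum with an explicit one-byte underlying type.
enum class Foo : char { a, b, c, d, e, f };
static_assert(sizeof(Foo) == 1, "Foo occupies exactly one byte");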
As of today Neil Butterworth's answer to create a class for your problem seems the most elegant, as you can even extend it to contain a nested enumeration if you want symbolical names for the values.
C++ does not express units of memory smaller than bytes. If you're producing them one at a time, that's the best you can do; your own example works well. If you need just a few, you can use bit-fields as Alnitak suggests. If you're planning on allocating them one at a time, then you're even worse off: most allocators hand out memory in aligned chunks, with 16 bytes being a common granularity.
Another choice might be to wrap std::bitset to do your bidding. This will waste very little space, if you need many such values, only about 1 bit for every 8.
If you think about your problem as a number expressed in base 6, and convert that number to base 2, possibly using an unlimited-precision integer (for example GMP), you won't waste any bits at all.
This assumes, of course, that your values have a uniform, random distribution. If they follow a different distribution, your best bet will be general compression of the first example, with something like gzip.
You can store values smaller than 8 or 32 bits. You just need to pack them into a struct (or class) and use bit fields.
For example:
struct example
{
    unsigned int a : 3;  // Three bits, can be 0 through 7.
    bool b : 1;          // One bit, stores 0 or 1.
    unsigned int c : 10; // Ten bits, can be 0 through 1023.
    unsigned int d : 19; // 19 bits, can be 0 through 524287.
};
In most cases, your compiler will round the total size of your structure up to the next multiple of 32 bits on a 32-bit platform. The other problem is, as you pointed out, that your values may not have a power-of-two range. This makes for wasted space: if you read the entire struct as one number, you will find values that are impossible to set if your input ranges aren't all powers of 2.
Another feature you may find interesting is a union. It works like a struct, but the members share memory, so writing to one field overwrites the others.
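A minimal sketch of a union (the member names are mine):
#include <cstdint>

// Both members share the same 4 bytes of storage: writing `whole`
// changes what `bytes` reads back, and vice versa.
union Example {
    std::uint32_t whole;
    unsigned char bytes[4];
};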
Now, if you are really tight for space and you want to push each bit to the maximum, there is a simple encoding method. Let's say you want to store 3 numbers, each of which can be from 0 to 5. Bit fields are wasteful, because if you use 3 bits each, you'll waste some values (i.e. you could never set 6 or 7, even though you have room to store them). So, let's do an example:
//Here are three example values, each can be from 0 to 5:
const int one = 3, two = 4, three = 5;
To pack them together most efficiently, we should think in base 6 (since each value is from 0-5). So packed into the smallest possible space is:
//This packs all the values into one int, from 0 - 215.
//pack could be any value from 0 - 215. There are no 'wasted' numbers.
int pack = one + (6 * two) + (6 * 6 * three);
See how it looks like we're encoding in base six? Each number is multiplied by its place value, 6^n, where n is the place (starting at 0).
Then to decode:
const int one = pack % 6;
pack /= 6;
const int two = pack % 6;
pack /= 6;
const int three = pack;
These schemes are extremely handy when you have to encode some fields in a bar code or in an alphanumeric sequence for human typing. Saving those few partial bits can make a huge difference. Also, the fields don't all have to have the same range: if one field is from 0 through 7, you'd use 8 instead of 6 in the proper place. There is no requirement that all fields have the same range.
The minimal size you can use is 1 byte.
But if you store a group of enum values (writing them to a file or storing them in a container, etc.), you can pack the group at 3 bits per value.
You don't have to enumerate the values of the enum:
enum Foo
{
    a,
    b,
    c,
    d,
    e,
    f,
};
Foo myfoo = a;
Here Foo is backed by an int, which on your machine takes 4 bytes.
The smallest type is char, which is defined as the smallest addressable data on the target machine. The CHAR_BIT macro yields the number of bits in a char and is defined in limits.h.
[Edit]
Note that generally speaking you shouldn't ask yourself such questions. Always use [unsigned] int if it's sufficient, except when you allocate quite a lot of memory (e.g. int[100*1024] vs char[100*1024], but consider using std::vector instead).
The size of an enumeration is defined to be the same as that of an int. But depending on your compiler, you may have the option of creating a smaller enum. For example, in GCC, you may declare:
enum Foo {
    a, b, c, d, e, f
} __attribute__((__packed__));
Now, sizeof(Foo) == 1.
The best solution is to create your own type implemented using a char. This should have sizeof(MyType) == 1, though this is not guaranteed.
#include <iostream>
using namespace std;

class MyType {
public:
    MyType( int a ) : val( a ) {
        if ( val < 0 || val > 5 ) { // six legal values: 0 through 5
            throw( "bad value" );
        }
    }
    int Value() const {
        return val;
    }
private:
    char val;
};

int main() {
    MyType v( 2 );
    cout << sizeof(v) << endl;
    cout << v.Value() << endl;
}
It is likely that packing oddly sized values into bitfields will incur a sizable performance penalty, because most architectures do not support bit-level operations directly (thus requiring several processor instructions per access). Before you implement such a type, ask yourself if it is really necessary to use as little space as possible, or if you are committing the cardinal sin of programming that is premature optimization. At most, I would encapsulate the value in a class whose backing store can be changed transparently, if you really do need to squeeze every last byte for some reason.
You can use an unsigned char. Probably typedef it to a BYTE. It will occupy only one byte.