Adding a 1 bit flag in a 64 bit pointer - c++

If you define a union for traversing a data structure squentially (start and end offset) or via a pointer to a tree structure depending on the number of data elements on a 64 bit system where these unions are aligned with cache lines, is there a possibility of both adding a one bit flag at one of those 64 bits in order to know which traversal must be used and still allowing to reconstruct the right pointer?
union {
uint32_t offsets[2];
Tree<NodeData> * tree;
};

It's system dependent, but I don't think any 64-bit system really uses its full pointer-length yet.
Also, if you know your data is 2n-aligned, chances are those n bits are just sitting idle there (On some old systems they just would not exist. But I don't think any of those were 64-bit systems, and anyway they are no longer of interest).
As an example, x86_64 uses 48bits, the upper 16 must be the same as bit47. (sign-extended)
Another example, ARM64 uses 49bits (2 mappings of 48bit at the same time), so there you only have 15 bits left.
Just remember to corect the pilfered bits. (You might want to use uintptr_t instead of a pointer, and convert after the correction.)
Using a mis-aligned or impossible pointer causes behavior ranging from silent auto-correction, ranging over silent mis-behavior, to loud crashes.

Related

Range of pointer values on 64 bit systems

Recently I was reading about the small string optimization (SSO): What are the mechanics of short string optimization in libc++?. As we know, a string typically consists of 3 pointers, which is 24 bytes on a 64 bit system. The linked answer says that in libc++'s implementation, the very first bit of the first pointer is used to indicate whether the string is in "long" or "short" mode, i.e. heap allocation and external storage vs internal storage of up to some 22 characters.
This however assumes however that the first bit of the first pointer cannot ever meaningfully be part of the address, because whenever the string is in "long" mode, that bit will always be set (or unset, depending which convention was chosen). This seems reasonable on its face, since with 64 bit pointers that allows 2^64 addresses, larger than 1 followed by 18 zeroes in bytes, or more than 1 billion gigabytes.
So this is reasonable, though not certain. My question is: is this guaranteed somewhere? And if it is guaranteed, where is it guaranteed? By the architecture spec, or by something else? To take it a step further: how many bits is it safe to do this with? I have a vague recollection reading somewhere that only 48 bits are used, but I don't recall.
If there are some number of bits, e.g. 8 or 16 that are guaranteed to be untouched, that is certainly something that could be leveraged in some interesting ways. It would be nice to exploit this, but not at the cost of having code mysteriously failure on some machine.
As we know, a string typically consists of 3 pointers, which is 24 bytes on a 64 bit system.
This is not true with libc++. The __long structure, for "long strings" is defined as:
struct __long
{
size_type __cap_;
size_type __size_;
pointer __data_;
};
The short flag therefore goes into the capacity field, making the whole thing moot.
As for pointer tagging, there is no universal guarantee about the size of a pointer. On x86_64, the data structures that the CPU uses for virtual address translation only use 48 bits (or 52 with physical address extension), so virtual addresses never use the upper 16 (or 12) bits. Additionally, most operating systems map their kernel into every process and reserve some amount of the high end of the address space for it, so in practice, user-mode pointers are even more restricted. On Windows, the most significant hardware-usable bit of a pointer tells whether it belongs to kernel-space or user-space.
These limits can change in the future and will vary across platforms, so it would be bad form to use them in a platform-independent standard library. In general, it's much better practice to use the least-significant bits for pointer tagging, since your application is in control of these.
The "long-bit" isn't part of a pointer, but of the capacity:
struct __long
{
size_type __cap_;
size_type __size_;
pointer __data_;
};
The "trick" is that if you always allocate an even number of characters and reserve one for the nul terminator, the resulting capacity will always be an odd number. And you get the 1-bit for free!

Why booleans take a whole byte? [duplicate]

In C++,
Why is a boolean 1 byte and not 1 bit of size?
Why aren't there types like a 4-bit or 2-bit integers?
I'm missing out the above things when writing an emulator for a CPU
Because the CPU can't address anything smaller than a byte.
From Wikipedia:
Historically, a byte was the number of
bits used to encode a single character
of text in a computer and it is
for this reason the basic addressable
element in many computer
architectures.
So byte is the basic addressable unit, below which computer architecture cannot address. And since there doesn't (probably) exist computers which support 4-bit byte, you don't have 4-bit bool etc.
However, if you can design such an architecture which can address 4-bit as basic addressable unit, then you will have bool of size 4-bit then, on that computer only!
Back in the old days when I had to walk to school in a raging blizzard, uphill both ways, and lunch was whatever animal we could track down in the woods behind the school and kill with our bare hands, computers had much less memory available than today. The first computer I ever used had 6K of RAM. Not 6 megabytes, not 6 gigabytes, 6 kilobytes. In that environment, it made a lot of sense to pack as many booleans into an int as you could, and so we would regularly use operations to take them out and put them in.
Today, when people will mock you for having only 1 GB of RAM, and the only place you could find a hard drive with less than 200 GB is at an antique shop, it's just not worth the trouble to pack bits.
The easiest answer is; it's because the CPU addresses memory in bytes and not in bits, and bitwise operations are very slow.
However it's possible to use bit-size allocation in C++. There's std::vector specialization for bit vectors, and also structs taking bit sized entries.
Because a byte is the smallest addressible unit in the language.
But you can make bool take 1 bit for example if you have a bunch of them
eg. in a struct, like this:
struct A
{
bool a:1, b:1, c:1, d:1, e:1;
};
You could have 1-bit bools and 4 and 2-bit ints. But that would make for a weird instruction set for no performance gain because it's an unnatural way to look at the architecture. It actually makes sense to "waste" a better part of a byte rather than trying to reclaim that unused data.
The only app that bothers to pack several bools into a single byte, in my experience, is Sql Server.
You can use bit fields to get integers of sub size.
struct X
{
int val:4; // 4 bit int.
};
Though it is usually used to map structures to exact hardware expected bit patterns:
// 1 byte value (on a system where 8 bits is a byte)
struct SomThing
{
int p1:4; // 4 bit field
int p2:3; // 3 bit field
int p3:1; // 1 bit
};
bool can be one byte -- the smallest addressable size of CPU, or can be bigger. It's not unusual to have bool to be the size of int for performance purposes. If for specific purposes (say hardware simulation) you need a type with N bits, you can find a library for that (e.g. GBL library has BitSet<N> class). If you are concerned with size of bool (you probably have a big container,) then you can pack bits yourself, or use std::vector<bool> that will do it for you (be careful with the latter, as it doesn't satisfy container requirments).
Think about how you would implement this at your emulator level...
bool a[10] = {false};
bool &rbool = a[3];
bool *pbool = a + 3;
assert(pbool == &rbool);
rbool = true;
assert(*pbool);
*pbool = false;
assert(!rbool);
Because in general, CPU allocates memory with 1 byte as the basic unit, although some CPU like MIPS use a 4-byte word.
However vector deals bool in a special fashion, with vector<bool> one bit for each bool is allocated.
The byte is the smaller unit of digital data storage of a computer. In a computer the RAM has millions of bytes and anyone of them has an address. If it would have an address for every bit a computer could manage 8 time less RAM that what it can.
More info: Wikipedia
Even when the minimum size possible is 1 Byte, you can have 8 bits of boolean information on 1 Byte:
http://en.wikipedia.org/wiki/Bit_array
Julia language has BitArray for example, and I read about C++ implementations.
Bitwise operations are not 'slow'.
And/Or operations tend to be fast.
The problem is alignment and the simple problem of solving it.
CPUs as the answers partially-answered correctly are generally aligned to read bytes and RAM/memory is designed in the same way.
So data compression to use less memory space would have to be explicitly ordered.
As one answer suggested, you could order a specific number of bits per value in a struct. However what does the CPU/memory do afterward if it's not aligned? That would result in unaligned memory where instead of just +1 or +2, or +4, there's not +1.5 if you wanted to use half the size in bits in one value, etc. so it must anyway fill in or revert the remaining space as blank, then simply read the next aligned space, which are aligned by 1 at minimum and usually by default aligned by 4(32bit) or 8(64bit) overall. The CPU will generally then grab the byte value or the int value that contains your flags and then you check or set the needed ones. So you must still define memory as int, short, byte, or the proper sizes, but then when accessing and setting the value you can explicitly compress the data and store those flags in that value to save space; but many people are unaware of how it works, or skip the step whenever they have on/off values or flag present values, even though saving space in sent/recv memory is quite useful in mobile and other constrained enviornments. In the case of splitting an int into bytes it has little value, as you can just define the bytes individually (e.g. int 4Bytes; vs byte Byte1;byte Byte2; byte Byte3; byte Byte4;) in that case it is redundant to use int; however in virtual environments that are easier like Java, they might define most types as int (numbers, boolean, etc.) so thus in that case, you could take advantage of an int dividing it up and using bytes/bits for an ultra efficient app that has to send less integers of data (aligned by 4). As it could be said redundant to manage bits, however, it is one of many optimizations where bitwise operations are superior but not always needed; many times people take advantage of high memory constraints by just storing booleans as integers and wasting 'many magnitudes' 500%-1000% or so of memory space anyway. It still easily has its uses, if you use this among other optimizations, then on the go and other data streams that only have bytes or few kb of data flowing in, it makes the difference if overall you optimized everything to load on whether or not it will load,or load fast, at all in such cases, so reducing bytes sent could ultimately benefit you alot; even if you could get away with oversending tons of data not required to be sent in an every day internet connection or app. It is definitely something you should do when designing an app for mobile users and even something big time corporation apps fail at nowadays; using too much space and loading constraints that could be half or lower. The difference between not doing anything and piling on unknown packages/plugins that require at minumim many hundred KB or 1MB before it loads, vs one designed for speed that requires say 1KB or only fewKB, is going to make it load and act faster, as you will experience those users and people who have data constraints even if for you loading wasteful MB or thousand KB of unneeded data is fast.

How to find out if we are using really 48, 56 or 64 bits pointers

I am using tricks to store extra information in pointers, At the moment some bits are not used in pointers(the highest 16 bits), but this will change in the future. I would like to have a way to detect if we are compiling or running on a platform that will use more than 48 bits for pointers.
related things:
Why can't OS use entire 64-bits for addressing? Why only the 48-bits?
http://developer.amd.com/wordpress/media/2012/10/24593_APM_v2.pdf
The solution is needed for x86-64, Windows, C/C++, preferably something that can be done compile-time.
Solutions for other platforms are also of interest but will not marked as correct answer.
Windows has exactly one switch for 32bit and 64bit programs to determine the top of their virtual-address-space:
IMAGE_FILE_LARGE_ADDRESS_AWARE
For both types, omitting it limits the program to the lower 2 GB of address-space, severely reducing the memory an application can map and thus also reducing effectiveness of Address-Space-Layout-Randomization (ASLR, an attack mitigation mechanism).
There is one upside to it though, and just what you seem to want: Only the lower 31 bits of a pointer can be set, so pointers can be safely round-tripped through int (32 bit integer, sign- or zero-extension).
At run-time the situation is slightly better:
Just use the cpuid-instruction from intel, function EAX=80000008H, and read the maximum number of used address bits for virtual addresses from bits 8-15.
The OS cannot use more than the CPU supports, remember that intel insists on canonical addresses (sign-extended).
See here for how to use cpuid from C++: CPUID implementations in C++

On Linux, in C/C++, will a pointer ever have the MSB set?

I want to use a long integer that will be interpreted as a number when the MSB is set otherwise it will be interpreted as a pointer. So would this work or would I run into problems in either C or C++?
This is on a 64-bit system.
Edited for clarity and a better description.
On x86-64, you WILL have a pointer that is over 47 bits in address have the 63rd bit set, since all the bits above "max number of bits supported by the architecture" (which is currently 48) must all have the same value as the most significant bit of the value itself. (That is any address above 0007 FFFF FFFF FFFF will be FFF8 0000 0000 0000 - everything in between is "invalid" as a pointer)
That may well be addresses ONLY used by the kernel, but I'm not sure it's guaranteed to be.
However, I would try to avoid using tricks like this - it's likely to come back and haunt you at some point.
People have tried tricks like this before.
It never works out well in the long run.
Simply don't do it.
Edit: better link - see reference to 'bit31', which was previously never returned as set. Once it could be set (over 2 gigs of RAM, gasp!) it would break naughty programs and therefore programs needed to opt into this option once this much memory became the norm as people had used trickery like this (amongst other things). And now my lovely, short and to the point answer has become too long :-)
So would this work or would I run into problems in either C or C++?
Do you have 64 bits? Do you want your code to be portable to 32 bit systems? long does not necessarily have 64 bits. Big-endian v. little-endian? (Do you know which your system is?)
Plus, hopeless confusion. Please just use an extra variable to store this information or you will have many many bugs surrounding this.
It depends on the architecture. x86_64 architecture, for example, is currently using 48-bit addressing. It means that you could use 16 bits for your own needs (a trick that sometimes referred to as "pointer packing"). However, even the x86_64 architecture definition allows this limit to be raised in future implementations to the full 64 bits. If that happens, you may run into a situation where a lot of your code might need to be changed. So if you really must go that way, make sure your pointer packing is kept in one place that is easy to change in the future. For other architectures you have to check for yourself.
Unless you really need the space, or you're keeping alot of these things around, I would just use a plain union, and add a tag field. If you're going to go down that route, make sure that your memory is aligned to fit your needs.
Take a look at boost::lockfree::detail::tagged_ptr from boost.lockfree
This is a class that was introduced in latest 1_53 boost. It stores pointer and additional 16 bites in 64 bites variable.
Don't do such tricks. If you need to distinguish integers from pointers inside some container, consider using separate bit set to indicate such flag. In C++ std::bitset could be good enough.
Reasons:
Actually nobody guarantees pointers are long unsigned or long long unsigned. If you need
to store them, always apply sizeof() and void * type (if you need
to remove information about pointed object).
Even on one system addresses are highly dependent on architecture.
Kernel modules could seriously change mapping logics for process so you never know what addresses you will need.
Remember that the virtual address returned to your program does may necessarily line up to the actual physical address in memory. Infact, unless you are directly manipulating pretty special memory [e.g. some forms of graphics memory] then this is absolutely the case.
In this case, its the maximum value of the MMU which defines the values of the pointers your program sees. In which case, for x64 I'm pretty sure its (currently) 48bits, but as Mats specifies above once you've got the top bit set in the 48, you get the 63'd bit says aswell.
So taking his answer and mine - its entirely possible to get a pointer with the 47th bit set even with a small amount of RAM, and once you do you get the 63rd bit set.
If the "64-bit system" in question is x86_64, then yes, it will work.

Why the size of a pointer is 4bytes in C++

On the 32-bit machine, why the size of a pointer is 32-bit? Why not 16-bit or 64-bit? What's the cons and pros?
Because it mimics the size of the actual "pointers" in assembler. On a machine with a 64 bit address bus, it will be 64 bits. In the old 6502, it was an 8 bit machine, but it had 16 bit address bus so that it could address 64K of memory. On most 32 bit machines, 32 bits were enough to address all the memory, so that's what the pointer size was in C++. I know that some of the early M68000 series chips only had a 24 bit memory address space, but it was addressed from a 32 bit register so even on those the pointer would be 32 bits.
In the bad old days of the 80286, it was worse - there was a 16 bit address register, and a 16 bit segment register. Some C++ compilers didn't hide that from you, and made you declare your pointers as near or far depending on whether you wanted to change the segment register. Mercifully, I've recycled most of those brain cells, so I forget if near pointers were 16 bits - but at the machine level, they would be.
The size of a pointer in C++ is implementation-defined. C++ might run on anything from your toaster's chip up to huge mainframes. Different architectures require different sizes of the data types.
If on your implementation a pointer is 32bit, then that's very likely an architecture which can address 2^32 bytes. (Note that even the size of bytes might be different depending on the implementation.) 64bit architectures generally can address 2^64 bytes, so implementations on these architectures will likely have a pointer size of 64bit.
16 bit would obviously be insufficient - you could only address 64K then.
Why not emulate 64 bit on 32 bit systems - I guess because the performance of pointer arithmetic would degrade.
As mentioned in many other answers, the size of a pointer need not be 32-bits - the implementation will set the size of a pointer to be whatever the architecture of the platform dictates. On a system with 64-bit addressing, the size of a pointer will generally be 64-bits.
However, you should also note that even on a single implementation, different types of pointers might have different sizes. In particular, pointer-to-member types (which I'll grant are odd-ball pointers) may have different sizes than plain-old pointers to objects.
The same is true about pointers to plain old functions - they might have a different size than pointers to objects (this applies to C as well as C++). However on modern desktop systems you'll usually find that pointers to functions are the same size as pointers to objects.
Here's a short example of fun with pointer-to-member-functions:
#include <stdio.h>
class A {};
class B {};
class VirtD: public virtual A, public virtual B {
public:
virtual int Dfunc() { return 5; };
};
typedef int (VirtD::* Derived_mfp)();
int main()
{
VirtD virtd;
Derived_mfp mfp = &VirtD::Dfunc;
printf( "sizeof( mfp) == %u\n", (unsigned int) sizeof( mfp));
}
Displays: sizeof( mfp) == 12 on MSVC.
The size of the pointer has little to do with the architecture(32bit, 64bit). 32bit usually refers to the fact that the register size is 32bit. As a result, the maximum possible number of address that you can address using one register is 2^32. So, it boils down to efficiency of addressing the memory slots using a register.
With a 32-bit pointer you can point to a wider range of memory than with 16-bit pointers. When 32-bit pointers were standardized, 64-bit CPUs were not very popular (or even existent?). Therefor a pointer would not be able to fit inside the CPU register, which is a very important factor for speed.
Why not 16-bit? Because, presuming a flat 32-bit address space, you cannot address every byte. Far from it: you can only address 216 unique locations with a 16-bit pointer. Even if your pointers only point to dwords and not bytes, this still leaves 1073676288 dwords unaddressable.
Assuming a flat 32-bit address space, you can already address every single byte with a 32-bit pointer. At this point, 64-bit pointers are just wasting space, unless you want to add additional information to each pointer. For example, on 32-bit PowerPC, a function descriptor is actually a 96-bit entity, with one third pointing to the executable code and the rest being data that helps make relocating modules easier.
In a segmented address space, having larger-than-32-bit pointers to data could be useful. Windows NT on the DEC Alpha was a 32-bit operating system, but the Alpha hardware was 64-bit capable. Your ordinary address space was still 32-bit, but there were special APIs to allow 32-bit programs to access 64-bit addresses, as if they were in otherwise-inaccessible segments.
To answer your question: C++ itself says very little about the size of a pointer, and certainly not that it has to be 32 bits or anything. The size of a pointer should be the natural one for the machine architecture.