A very long boolean array/std::bitset - c++

For some C++ code, my logic requires a boolean array with 4*10^10 entries. I am using the STL container std::bitset, but its declaration as template <size_t N> class bitset; restricts the number of bits to the upper limit of the size_t (unsigned integral) type, which is 2^32-1 (or 2^64-1); can someone confirm this as well?
I thought of a workaround for this issue by creating an array of bitset, as in bitset<100000000> checkSum[400];
Is this legal? I am getting the following error when I compile it (test.cpp is my C++ file):
/tmp/cc0gR0c6.o: In function `__static_initialization_and_destruction_0(int, int)':
test.cpp:(.text+0x35f): relocation truncated to fit: R_X86_64_32 against `.bss'
test.cpp:(.text+0x373): relocation truncated to fit: R_X86_64_32 against `.bss'
collect2: ld returned 1 exit status
Can this somehow be fixed or is there a better workaround?

I think you should use a vector instead of an array, like this:
vector<bitset<1000000> > checkSum;
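As a rough sketch of how such a chunked bitset could be indexed (the chunk size, names, and helper type here are illustrative, not part of the original answer; 400 chunks of 10^8 bits cover the 4*10^10 bits asked for, at roughly 5 GB of memory):
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative chunk size; any reasonable chunk size works the same way.
constexpr std::size_t kBitsPerChunk = 100000000;

struct BigBitset {
    std::vector<std::bitset<kBitsPerChunk>> chunks;

    // Round up so the last partial chunk is covered too.
    explicit BigBitset(std::uint64_t totalBits)
        : chunks((totalBits + kBitsPerChunk - 1) / kBitsPerChunk) {}

    void set(std::uint64_t i)        { chunks[i / kBitsPerChunk].set(i % kBitsPerChunk); }
    bool test(std::uint64_t i) const { return chunks[i / kBitsPerChunk].test(i % kBitsPerChunk); }
};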

size_t is either 32 or 64 bits, depending on whether you're compiling 32-bit or 64-bit code. It looks like you can simply compile for a 64-bit target and solve this.
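A quick way to confirm the size_t width on your target (a minimal sketch, nothing project-specific assumed):
#include <cstdio>
#include <cstddef>
#include <climits>

int main() {
    // 32 on a 32-bit target, 64 on a 64-bit target; bitset<N> takes N as a size_t.
    std::printf("size_t is %zu bits wide\n", sizeof(std::size_t) * CHAR_BIT);
}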

Related

C++ How do you initialize a long double? [duplicate]

I am trying to print a simple long double, but it doesn't work.
What I tried:
long double ld=5.32;
printf("ld with le = %Le \n",ld);
printf("ld with lf = %Lf \n",ld);
printf("ld with lg = %Lg \n",ld);
Output:
ld with le = -3.209071e-105
ld with lf = -0.000000
ld with lg = -3.20907e-105
With a new value:
ld=6.72;
Output:
ld with le = -1.972024e+111
ld with lf = -1972024235903379200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000
ld with lg = -1.97202e+111
There's a similar problem with MinGW under Windows. If that's not what you're using, this answer probably doesn't apply.
The problem is that the compiler (GCC) and the runtime library (Microsoft's) are implemented by different groups that happen to have different ideas about how the type long double should be represented. (GCC uses the x87 80-bit extended format, stored in 128 bits, for long double; Microsoft uses 64 bits, with the same representation as double.)
Either choice of representation is perfectly legitimate, but they're incompatible with each other. It's not a bug either in GCC or in Microsoft's library, but in the way MinGW integrates them.
Your options are to use an implementation other than MinGW, to write or copy code that handles long double correctly, or to avoid calling any library functions that take arguments or return results of type long double (computations on long double shouldn't be a problem as long as they don't call any library functions). For example, you can convert to double and print with %g, with some loss of range and precision.
Another (probably better) workaround is to compile with -D__USE_MINGW_ANSI_STDIO, which causes MinGW to use its own implementation of printf and friends rather than relying on the Microsoft implementation.
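For completeness, a minimal sketch of the convert-to-double workaround mentioned above (values are only examples):
#include <cstdio>

int main() {
    long double ld = 5.32L;
    // Convert to double and print with %g/%e/%f, accepting some loss of
    // range and precision; this avoids passing a long double to printf.
    std::printf("ld as double = %g\n", static_cast<double>(ld));
}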

What is a valid pointer in gcc linux x86-64 C++?

I am programming C++ using gcc on an obscure system called linux x86-64. I was hoping that maybe there are a few folks out there who have used this same, specific system (and might also be able to help me understand what is a valid pointer on this system). I do not care to access the location pointed to by the pointer; I just want to calculate it via pointer arithmetic.
According to section 3.9.2 of the standard:
A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null pointer.
And according to [expr.add]/4:
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined. Likewise, the expression P - J points to the (possibly-hypothetical) element x[i − j] if 0 ≤ i − j ≤ n; otherwise, the behavior is undefined.
And according to a stackoverflow question on valid C++ pointers in general:
Is 0x1 a valid memory address on your system? Well, for some embedded systems it is. For most OSes using virtual memory, the page beginning at zero is reserved as invalid.
Well, that makes it perfectly clear! So, besides NULL, a valid pointer is a byte in memory, no, wait, it's an array element including the element right after the array, no, wait, it's a virtual memory page, no, wait, it's Superman!
(I guess that by "Superman" here I mean "garbage collectors"... not that I read that anywhere, just smelled it. Seriously, though, all the best garbage collectors don't break in a serious way if you have bogus pointers lying around; at worst they just don't collect a few dead objects every now and then. Doesn't seem like anything worth messing up pointer arithmetic for.)
So, basically, a proper compiler would have to support all of the above flavors of valid pointers. I mean, a hypothetical compiler having the audacity to generate undefined behavior just because a pointer calculation is bad would be dodging at least the 3 bullets above, right? (OK, language lawyers, that one's yours).
Furthermore, many of these definitions are next to impossible for a compiler to know about. There are just so many ways of creating a valid memory byte (think lazy segfault trap microcode, sideband hints to a custom pagetable system that I'm about to access part of an array, ...), mapping a page, or simply creating an array.
Take, for example, a largish array I created myself, and a smallish array that I let the default memory manager create inside of that:
#include <iostream>
#include <inttypes.h>
#include <assert.h>
using namespace std;
extern const char largish[1000000000000000000L];
asm("largish = 0");
int main()
{
    char* smallish = new char[1000000000];
    cout << "largish base = " << (long)largish << "\n"
         << "largish length = " << sizeof(largish) << "\n"
         << "smallish base = " << (long)smallish << "\n";
}
Result:
largish base = 0
largish length = 1000000000000000000
smallish base = 23173885579280
(Don't ask how I knew that the default memory manager would allocate something inside of the other array. It's an obscure system setting. The point is I went through weeks of debugging torment to make this example work, just to prove to you that different allocation techniques can be oblivious to one another).
Given the number of ways of managing memory and combining program modules that are supported in linux x86-64, a C++ compiler really can't know about all of the arrays and various styles of page mappings.
Finally, why do I mention gcc specifically? Because it often seems to treat any pointer as a valid pointer... Take, for instance:
char* super_tricky_add_operation(char* a, long b) {return a + b;}
While after reading all the language specs you might expect the implementation of super_tricky_add_operation(a, b) to be rife with undefined behavior, it is in fact very boring, just an add or lea instruction. Which is so great, because I can use it for very convenient and practical things like non-zero-based arrays if nobody is putzing with my add instructions just to make a point about invalid pointers. I love gcc.
In summary, it seems that any C++ compiler supporting standard linkage tools on linux x86-64 would almost have to treat any pointer as a valid pointer, and gcc appears to be a member of that club. But I'm not quite 100% sure (given enough fractional precision, that is).
So... can anyone give a solid example of an invalid pointer in gcc linux x86-64? By solid I mean leading to undefined behavior. And explain what gives rise to the undefined behavior allowed by the language specs?
(or provide gcc documentation proving the contrary: that all pointers are valid).
Usually pointer math does exactly what you'd expect regardless of whether pointers are pointing at objects or not.
UB doesn't mean it has to fail. Only that it's allowed to make the whole rest of the program behave strangely in some way. UB doesn't mean that just the pointer-compare result can be "wrong", it means the entire behaviour of the whole program is undefined. This tends to happen with optimizations that depend on a violated assumption.
Interesting corner cases include an array at the very top of virtual address space: a pointer to one-past-the-end would wrap to zero, so start < end would be false?!? But pointer comparison doesn't have to handle that case, because the Linux kernel won't ever map the top page, so pointers into it can't be pointing into or just past objects. See Why can't I mmap(MAP_FIXED) the highest virtual page in a 32-bit Linux process on a 64-bit kernel?
Related:
GCC does have a max object size of PTRDIFF_MAX (the maximum value of the signed type ptrdiff_t). So for example, on 32-bit x86, an array larger than 2GB isn't fully supported for all cases of code-gen, although you can mmap one.
See my comment on What is the maximum size of an array in C? This restriction lets gcc implement pointer subtraction (to get a size) without keeping the carry-out from the high bit, for types wider than char, where the C subtraction result is in objects, not bytes, so in asm it's (a - b) / sizeof(T).
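As a small illustration of that point (a sketch, not taken from the linked answer): pointer subtraction yields a count of elements, so the generated code has to divide the raw byte difference by sizeof(T), and that only works cleanly if the byte difference fits in a signed ptrdiff_t.
#include <cstddef>

// p - q counts elements, not bytes; in asm this is (byte difference) / sizeof(int).
// Only defined when both pointers point into (or one past) the same array.
std::ptrdiff_t element_distance(const int *p, const int *q) {
    return p - q;
}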
Don't ask how I knew that the default memory manager would allocate something inside of the other array. It's an obscure system setting. The point is I went through weeks of debugging torment to make this example work, just to prove to you that different allocation techniques can be oblivious to one another.
First of all, you never actually allocated the space for largish[]. You used inline asm to make it start at address 0, but did nothing to actually get those pages mapped.
The kernel won't hand out pages that overlap existing mappings when brk or mmap is used to get new memory, so in fact static and dynamic allocation can't overlap.
Second, char[1000000000000000000L] ~= 2^59 bytes. Current x86-64 hardware and software only support canonical 48-bit virtual addresses (sign-extended to 64-bit). This will change with a future generation of Intel hardware which adds another level of page tables, taking us up to 48+9 = 57-bit addresses. (Still with the top half used by the kernel, and a big hole in the middle.)
Your unallocated space from 0 to ~2^59 covers all user-space virtual memory addresses that are possible on x86-64 Linux, so of course anything you allocate (including other static arrays) will be somewhere "inside" this fake array.
Removing the extern const from the declaration (so the array is actually allocated, https://godbolt.org/z/Hp2Exc) runs into the following problems:
//extern const
char largish[1000000000000000000L];
//asm("largish = 0");
/* rest of the code unchanged */
RIP-relative or 32-bit absolute (-fno-pie -no-pie) addressing can't reach static data that gets linked after largish[] in the BSS, with the default code model (-mcmodel=small, where all static code+data is assumed to fit in 2GB):
$ g++ -O2 large.cpp
/usr/bin/ld: /tmp/cc876exP.o: in function `_GLOBAL__sub_I_largish':
large.cpp:(.text.startup+0xd7): relocation truncated to fit: R_X86_64_PC32 against `.bss'
/usr/bin/ld: large.cpp:(.text.startup+0xf5): relocation truncated to fit: R_X86_64_PC32 against `.bss'
collect2: error: ld returned 1 exit status
Compiling with -mcmodel=medium places largish[] in a large-data section where it doesn't interfere with addressing other static data, but it is itself addressed using 64-bit absolute addressing. (Or -mcmodel=large does that for all static code/data, so every call is an indirect movabs reg,imm64 / call reg instead of call rel32.)
That lets us compile and link, but then the executable won't run: the kernel knows that only 48-bit virtual addresses are supported, so it refuses to map the program in its ELF loader before running it (or, for a PIE, before running ld.so on it).
peter@volta:/tmp$ g++ -fno-pie -no-pie -mcmodel=medium -O2 large.cpp
peter@volta:/tmp$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffd788a4b60 /* 52 vars */) = -1 EINVAL (Invalid argument)
+++ killed by SIGSEGV +++
Segmentation fault (core dumped)
peter@volta:/tmp$ g++ -mcmodel=medium -O2 large.cpp
peter@volta:/tmp$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffdd3bbad00 /* 52 vars */) = -1 ENOMEM (Cannot allocate memory)
+++ killed by SIGSEGV +++
Segmentation fault (core dumped)
(Interesting that we get different error codes for PIE vs non-PIE executables, but still before execve() even completes.)
Tricking the compiler + linker + runtime with asm("largish = 0"); is not very interesting, and creates obvious undefined behaviour.
Fun fact #2: x64 MSVC doesn't support static objects larger than 2^31-1 bytes. IDK if it has a -mcmodel=medium equivalent. (GCC, by contrast, fails to warn about objects too large for the selected memory model.)
<source>(7): error C2148: total size of array must not exceed 0x7fffffff bytes
<source>(13): warning C4311: 'type cast': pointer truncation from 'char *' to 'long'
<source>(14): error C2070: 'char [-1486618624]': illegal sizeof operand
<source>(15): warning C4311: 'type cast': pointer truncation from 'char *' to 'long'
Also, it points out that long is the wrong type for pointers in general (because Windows x64 is an LLP64 ABI, where long is 32 bits). You want intptr_t or uintptr_t, or something equivalent to printf("%p") that prints a raw void*.
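A minimal sketch of those alternatives (illustrative only):
#include <cstdio>
#include <cstdint>

int main() {
    int x = 0;
    int *p = &x;
    std::printf("%p\n", static_cast<void *>(p));              // print the pointer directly
    std::uintptr_t u = reinterpret_cast<std::uintptr_t>(p);   // integer wide enough to hold it
    std::printf("%ju\n", static_cast<std::uintmax_t>(u));     // %ju expects uintmax_t
}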
The Standard does not anticipate the existence of any storage beyond that which the implementation provides via objects of static, automatic, or thread duration, or the use of standard-library functions like calloc. It consequently imposes no restrictions on how implementations process pointers to such storage, since from its perspective such storage doesn't exist, pointers that meaningfully identify non-existent storage don't exist, and things that don't exist don't need to have rules written about them.
That doesn't mean that the people on the Committee weren't well aware that many execution environments provided forms of storage that C implementations might know nothing about. They expected, however, that people who actually worked with various platforms would be better placed than the Committee to determine what kinds of things programmers would need to do with such "outside" addresses, and how best to support such needs. There was no need for the Standard to concern itself with such things.
As it happens, there are some execution environments where it is more convenient for a compiler to treat pointer arithmetic like integer math than to do anything else, and many compilers for such platforms treat pointer arithmetic usefully even in cases where they're not required to do so. For 32-bit x86 and 64-bit x86-64, I don't think there are any bit patterns for invalid non-null addresses, but it may be possible to form pointers that don't behave as valid pointers to the objects they address.
For example, given something like:
char x=1,y=2;
ptrdiff_t delta = (uintptr_t)&y - (uintptr_t)&x;
char *p = &x+delta;
*p = 3;
even if pointer representation is defined in such a way that using integer arithmetic to add delta to the address of x would yield y, that would in no way guarantee that a compiler would recognize that operations on *p might affect y, even if p holds y's address. Pointer p would effectively behave as though its address was invalid even though the bit pattern would match that of y's address.
The following examples show that GCC specifically assumes at least the following:
A global array cannot be at address 0.
An array cannot wrap around address 0.
Examples of unexpected behavior arising from arithmetic on invalid pointers in gcc linux x86-64 C++ (thank you melpomene):
largish == NULL evaluates to false in the program in the question.
unsigned n = ...; if (ptr + n < ptr) { /*overflow */ } can be optimized to if (false).
int arr[123]; int n = ...; if (arr + n < arr || arr + n > arr + 123) can be optimized to if (false).
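For illustration, a hedged sketch of the second example above (whether the check is actually folded away depends on the GCC version and optimization flags):
// ptr + n cannot legally wrap past the end of the object ptr points into,
// so at -O2 GCC may assume the comparison is always false and remove the check.
bool overflow_check(char *ptr, unsigned n) {
    return ptr + n < ptr;   // relies on undefined behaviour to detect "overflow"
}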
Note that these examples all involve comparison of the invalid pointers, and so may not affect the practical case of non-zero-based arrays. Therefore I have opened a new question of a more practical nature.
Thank you everyone in the chat for helping to narrow down the question.

`Relocation truncated to fit` error in Fortran with large arrays

I have written a Fortran 90 code to extract angles from molecular simulation data.
In this code I used a module named all_parameter. In this module I defined an array, CH_Angles, as follows:
INTEGER,PARAMETER :: totalFrames = 32000
INTEGER,PARAMETER :: AAA=75
REAL,DIMENSION(45:AAA,1:256,1:totalFrames) :: CH_Angles
If I use the value AAA = 75, I can compile this code without any error and I get the values I wanted. But if I change the value to AAA = 105, then I get the error messages shown below:
gfortran lipid-Tilt-Magnitude-thermo-cello.f90
/tmp/ccXOhMqQ.o: In function `__all_parameter_MOD_find_angle_ch':
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x35): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_x' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x48): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_y' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x5b): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_z' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x6e): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_x' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x81): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_y' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x94): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_z' defined in .bss section in /tmp/ccXOhMqQ.o
/tmp/ccXOhMqQ.o: In function `__all_parameter_MOD_find_mid_point_vector':
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x126): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_x' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x139): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_y' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x14c): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_z' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x15f): relocation truncated to fit: R_X86_64_32S against symbol `__all_parameter_MOD_x' defined in .bss section in /tmp/ccXOhMqQ.o
lipid-Tilt-Magnitude-thermo-cello.f90:(.text+0x172): additional relocation overflows omitted from the output
collect2: ld returned 1 exit status
vijay@glycosim:~/Simulation-Folder-Feb2013/chapter5-thermo-paper2-Vj/thermo2-Analysis/analysis-bcm-/23_acf-tail-tilt-angle-bcm-thermo-all-layers$ gfortran lipid-Tilt-Magnitude-thermo-cello.f90
I also tried compiling this code with different values of AAA. With a value of 80, the compilation goes through without error, but with AAA = 85 the compilation stops with the error messages.
I found that AAA = 82 is the limiting value; any value of AAA above 82 gives the error.
I cannot figure out what causes the error.
Is there any way to solve this issue?
Note: I am using gfortran compiler from Ubuntu 11.10 64 bit with 16 GB RAM memory.
The error you get is returned by the linker because the size of the statically-allocated block exceeds the range of what can be addressed by a 32-bit addressing instruction, which is 2 GB. This is regardless of whether you index your array using 32-bit or 64-bit integers; the problem is the total size of a statically-allocated array. This is explained in detail here:
gfortran for dummies: What does mcmodel=medium do exactly?
In order to work around this, as you have noticed, you can compile your code with -mcmodel=medium or -mcmodel=large. Statically-allocated arrays larger than 2 GB are then allowed.
A better way to deal with this, but involves more work, is to dynamically allocate any large arrays.
If I remember aright, gfortran, like most current Fortran compilers, still defaults to 4-byte integers even on 64-bit hardware. This means, inter alia, that the largest array index will be 2^31 - 1. Since multi-rank arrays are just a convenient wrapper around rank-1 arrays, it's no surprise to me (or to @MikeDunlavey) that your compiler baulks at allocating an array with as many elements as you want.
Try using 64-bit integers for array indexing. You could do this either by explicitly setting the kind for them, eg
use, intrinsic :: iso_fortran_env, only : int64
...
INTEGER(int64),PARAMETER :: totalFrames = 32000
INTEGER(int64),PARAMETER :: AAA=75
REAL,DIMENSION(45_int64:AAA,1_int64:256_int64,1_int64:totalFrames) :: CH_Angles
or by using a compiler flag to set the default size of integers to 64 bits. For gfortran this would be -fdefault-integer-8.
I won't guarantee that this will work for gfortran, which is not a compiler I use regularly, but it does for Intel Fortran.
Your array CH_Angles is around 2 GB in size with AAA = 105, so index arithmetic is going to push the 32-bit limit.
I would expect things to get a bit screwy at that size.

Error typecasting void pointer to int

I am trying to build a project on Xcode 4.2 in which some code typecasts a void* to an int. This typecast does not produce an error during C++ compilation, as I tried here.
It was also working fine in my project until I changed the Valid Architectures in the Build Settings from i386 to i386 x86_64, which basically compiles the code in 64-bit mode too. I had to make this change since I am working on de-carbonizing the project. After that change, many errors were introduced, including this one, which I am finding a bit difficult to digest. Any ideas what might be going on?
On x86, a void* is 32 bits long, and an int is very likely to also be 32 bits long, so everything works.
On x86_64, however, a void* is 64 bits long, while an int is likely to remain 32 bits, so the value no longer fits.
If you need to store a pointer in an integral type, use intptr_t or uintptr_t, which are designed for this purpose.
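A minimal sketch of the round trip (illustrative only):
#include <cassert>
#include <cstdint>

int main() {
    int x = 42;
    void *p = &x;
    // intptr_t/uintptr_t are guaranteed to be wide enough to hold a pointer;
    // casting through int would truncate on a 64-bit target.
    std::intptr_t ip = reinterpret_cast<std::intptr_t>(p);
    void *q = reinterpret_cast<void *>(ip);
    assert(q == p);
}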

how to use macro for a unsigned long number?

Here is my code:
#define MSK 0x0F
#define UNT 1
#define N 3000000000
unsigned char aln[1+N];
unsigned char pileup[1+N];
void set(unsigned long i)
{
    if ((aln[i] & MSK) != MSK) {
        aln[i] += UNT;
    }
}
int main(void) {}
When I try to compile it, the compiler complains like this:
/tmp/ccJ4IgSa.o: In function `set':
bitmacs.c:(.text+0xf): relocation truncated to fit: R_X86_64_32S against symbol `aln' defined in COMMON section in /tmp/ccJ4IgSa.o
bitmacs.c:(.text+0x29): relocation truncated to fit: R_X86_64_32S against symbol `aln' defined in COMMON section in /tmp/ccJ4IgSa.o
bitmacs.c:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `aln' defined in COMMON section in /tmp/ccJ4IgSa.o
I think the reason may be that N is too big, because the code compiles successfully if I change N to 2000000000, but I need 3000000000 as the value of N.
Does anyone have an idea about this?
Per your original question: use the integer literal suffix UL (or similar) to force the storage type of N:
#define N 3000000000UL
However, (per your comment on HLundvall's answer) the relocation truncated to fit error obviously isn't due to this; it may (as Mystical and Matt Lacey say) simply be that the array is too big to fit in the segment.
As an aside, if you ask a separate question explaining what you're trying to accomplish with your huge arrays, someone may be able to suggest a better solution (one that is more likely to fit in memory).
For example:
your sample code only uses the low nibble of each byte: you could pack two entries per byte and halve the size (which is admittedly still much too large); see the sketch after this list
depending on your access patterns, you might be able to keep the array on disk and cache a working subset in memory
there may be better overall algorithms and data structures if we knew what you needed
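A hedged sketch of the nibble-packing idea from the first bullet (the names and interface are made up for illustration; it still needs about 1.5 GB on the heap):
#include <cstdint>
#include <vector>

// Two 4-bit saturating counters per byte instead of one counter per byte.
struct NibbleArray {
    std::vector<std::uint8_t> bytes;

    explicit NibbleArray(std::uint64_t n) : bytes((n + 1) / 2, 0) {}

    std::uint8_t get(std::uint64_t i) const {
        std::uint8_t b = bytes[i / 2];
        return (i % 2) ? (b >> 4) : (b & 0x0F);
    }

    // Mirrors the original set(): increment until the nibble saturates at 0x0F.
    void increment(std::uint64_t i) {
        std::uint8_t v = get(i);
        if (v == 0x0F) return;
        std::uint8_t &b = bytes[i / 2];
        b = (i % 2) ? static_cast<std::uint8_t>((b & 0x0F) | ((v + 1) << 4))
                    : static_cast<std::uint8_t>((b & 0xF0) | (v + 1));
    }
};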
Disregarding the "formal" problem that your numeric literal isn't of the correct type (see the other answers for the correct syntax), the key point here is that it's a very bad idea to allocate a 3 GB static/global array.
static and global[1] variables on most platforms are mapped directly from the executable image, which means that your executable would have to be as big as 3 GB, which is quite big even by current-day standards. Even if on some platforms this limitation may be lifted (see the comments), you don't have any control over how to handle the failure of allocation.
Most importantly, global variables are not intended for such big stuff, and you are likely to find problems with arbitrary limits imposed by the linker (such as the one you found) and the loader. Instead, you should allocate anything that's bigger than a few KBs on the heap, using malloc, new or some platform-specific function, handling gracefully the possible failure at runtime.
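As a minimal sketch of that suggestion, using the same N as the question (the array name and error message are just for illustration):
#include <cstdio>
#include <cstdlib>

int main(void) {
    const unsigned long N = 3000000000UL;
    // Heap allocation turns the failure into a checkable runtime condition
    // instead of a link-time "relocation truncated to fit" error.
    unsigned char *aln = (unsigned char *)calloc(N + 1, 1);
    if (!aln) {
        fprintf(stderr, "could not allocate %lu bytes\n", N + 1);
        return EXIT_FAILURE;
    }
    /* ... use aln[i] as before ... */
    free(aln);
    return 0;
}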
Still, keep in mind that for an application running under almost any 32 bit operating system it's not possible to get 3 GB of contiguous memory as you request, and it's impossible altogether to get more than one of these arrays (=more than 4 GB of contiguous memory) without resorting to platform-specific tricks (e.g. mapping only specific parts of the arrays in memory at a given moment).
Also, are you sure that you do need all that contiguous memory since your program starts to run? Isn't there some better data structure/algorithm that could avoid allocating all that memory?
[1] In general, what the standard calls variables with static storage duration.
To enter a numeric constant of type unsigned long use:
#define N 3000000000UL
The problem is that gcc (by default) uses pc-relative accesses to get the address of static data objects on x86_64 targets, and those accesses are limited to 2^31 bytes maximum. So if the symbol ends up getting placed more than 2GB away from the code that accesses it, you'll end up getting this link error when it tries to use an offset that is too big to fit in the 32 bits of space allowed in the instruction.
You can avoid this problem by using the -mcmodel=large option to gcc. This tells it not to assume that it can use 32-bit PC-relative offsets to access symbols (among other things).
Note that the type suffix of the constant literal is mostly irrelevant -- a constant literal that is too big for an int will automatically become a long (or even long long if needed) without any suffix. See 6.4.4.1.5 of the C99 spec.
Your executable is trying to put objects in memory beyond what the 32-bit relocations used by the default code model can reach (about 2 GB), which is not allowed. See this link: http://www.technovelty.org/code/c/relocation-truncated.html.
From the article: "If you're seeing this and you're not hand-coding, you probably want to check out the -mcmodel argument to gcc."