Very large array on the heap (Visual C++)

I hope someone can help me. I'm trying to create an int[400000000] (400 million) array in my application using Visual C++ 2010, but it generates an overflow error.
The same code runs on Linux with g++.
I need this because I'm working with large matrices.
Thank you in advance.

If you are using a 32-bit application then by default you have just 2 GB of user address space. 400 million integers is about 1.5 GB, so you are very unlikely to have that much contiguous address space. It is possible to force 32-bit Windows to give each process a 3 GB user address space, but that may just be a stopgap for your situation.
If you can move to a 64-bit architecture then this should not be an issue; otherwise you should find a way of storing your matrix data in a way that does not require a single block of contiguous storage, for example storing it in chunks.
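As a minimal sketch of the chunking idea (the class name ChunkedArray and the chunk size are arbitrary choices for illustration): the data is split into fixed-size blocks held in separate allocations, so the 1.5 GB never has to be one contiguous block, although the total memory requirement stays the same.

#include <cstddef>
#include <vector>

class ChunkedArray {
    static const std::size_t kChunk = 1u << 20;          // 1M ints per chunk (4 MB)
    std::vector<std::vector<int> > chunks_;
public:
    explicit ChunkedArray(std::size_t n)
        : chunks_((n + kChunk - 1) / kChunk)
    {
        for (std::size_t i = 0; i < chunks_.size(); ++i) {
            std::size_t remaining = n - i * kChunk;       // elements left for this and later chunks
            chunks_[i].resize(remaining < kChunk ? remaining : kChunk, 0);
        }
    }
    int& operator[](std::size_t i) { return chunks_[i / kChunk][i % kChunk]; }
};

// Usage: ChunkedArray a(400000000); a[399999999] = 7;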

I think what you need is a divide-and-conquer algorithm, not more memory.

In your case it might even be better to use STXXL.

Perhaps sparse matrices are of use in your application. The concept is used when dealing with big matrices that have a lot of zero entries, which is the case in quite a lot of applications.
And by the way, you do not gain anything by storing such a huge amount of data on the heap. Consider that your CPU cache is perhaps 12 MB! At least use some intelligent dynamic memory allocation mechanism.

Does the whole array really need to be allocated? Do you really use the whole array? Is it an array with lots of zeros? If so, the fact that it works better on Linux can be explained.
In that case a sparse array might be more appropriate. Using an existing sparse array implementation would reduce the memory footprint and maybe allow faster computation; a minimal sketch of the idea follows.
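This sketch only illustrates the concept (SparseArray and its members are hypothetical names; a real project would more likely use an existing library): only non-zero entries are stored, and everything else reads as zero.

#include <cstddef>
#include <map>

class SparseArray {
    std::map<std::size_t, int> data_;   // index -> non-zero value
public:
    void set(std::size_t i, int v) {
        if (v == 0) data_.erase(i);     // never store explicit zeros
        else        data_[i] = v;
    }
    int get(std::size_t i) const {
        std::map<std::size_t, int>::const_iterator it = data_.find(i);
        return it == data_.end() ? 0 : it->second;
    }
};

// Usage: SparseArray a; a.set(399999999, 7); int x = a.get(123); // x == 0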

I just found a very simple solution, but I don't know if it is advisable:
#include <iostream>

int tab[400000000] = {0}; // global array: lives in the zero-initialized data segment, not on the stack or heap
int main(array<System::String ^> ^args)
{
    std::cout << tab[399999999] << std::endl; // ok
    /*
    int* tab = new int[400000000]; // doesn't work
    ...
    delete[] tab;
    */
    return 0;
}

Related

How to make a character array of size 1000000000000

Since I am new to programming, I was trying to make a character array of a very large size, say for example 1000000000000, but my compiler shows the error:
Array too large
I am using Turbo C++.
Can anyone please tell me how to do that?
You have several problems:
Firstly, Turbo C++ is a 16-bit compiler, and even with the best will in the world it is not going to be able to cope. Even a 32-bit compiler (maximum address space just over 4,000,000,000 bytes) won't be able to cope. You need to use a 64-bit compiler.
Your next problem is that if you try to allocate such an enormous array on the stack, it won't fit. Most systems use a stack of around 1 MB. You need to allocate the array on the heap. Normally(*) I would recommend using std::vector (because it manages releasing the memory for you). So instead of:
char big[1000ull*1000*1000*1000];
You need:
std::vector<char> big(1000ull*1000*1000*1000);
Your final problem is that very few machines are going to have 1TB of RAM installed. On Windows 10 you can allocate that much address space - but most of it is going to be in the swap, not in RAM.
*: This is why I wouldn't recommend std::vector here. Something involving either memory mapped files, or a more efficient data structure is going to be better. We can't tell what, unless you explain your actual problem.

c++ Alternative implementation to avoid shifting between RAM and SWAP memory

I have a program that uses dynamic programming to calculate some information. The problem is that, in theory, the memory used grows exponentially. Some filters that I use limit this space, but for a big input they still can't prevent my program from running out of RAM.
The program runs on 4 threads. When I run it with a really big input, I noticed that at some point the program starts to use swap because my RAM is not big enough. The consequence is that my CPU usage decreases from about 380% to 15% or lower.
There is only one variable that uses the memory, which is the following data structure.
Edit (added the type); it uses the CLN library:
#include <tbb/concurrent_hash_map.h>
#include <cln/integer.h>

class My_Map {
    typedef std::pair<double, short> key;
    typedef cln::cl_I value;
public:
    tbb::concurrent_hash_map<key, value>* map;
    My_Map() { map = new tbb::concurrent_hash_map<key, value>(); }
    ~My_Map() { delete map; }
    // some functions for operations on the map
};
In my main program I am using this data structure as a global variable:
My_Map* container = new My_Map();
Question:
Is there a way to avoid this shifting of memory between swap and RAM? I thought pushing all the memory onto the heap would help, but it seems not to. So I don't know whether it is possible to make full use of the swap memory, or something else. This shifting of memory just costs a lot of time, and the CPU usage decreases dramatically.
If you have 1 GB of RAM and a program that uses 2 GB, then you're going to have to find somewhere else to store the excess data, obviously. The default OS way is to swap, but the alternative is to manage your own 'swapping' by using a memory-mapped file.
You open a file and allocate a virtual memory block in it, then you bring pages of the file into RAM to work on. The OS manages this for you for the most part, but you should think about your memory usage and try to finish with the blocks that are currently in memory before moving on, if you can.
On Windows you use CreateFileMapping(), on Linux and Mac you use mmap(); a POSIX sketch follows.
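This is a minimal POSIX sketch of the idea, assuming Linux or Mac (the file name and the 8 GB size are made up for illustration): the working area is backed by a file you control rather than by anonymous memory that the OS would push to swap.

#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>
#include <cstdio>

int main() {
    const char*  path  = "scratch.bin";              // hypothetical backing file
    const size_t bytes = 8ull * 1024 * 1024 * 1024;  // e.g. an 8 GB working area

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, static_cast<off_t>(bytes)) != 0) { perror("ftruncate"); return 1; }

    void* base = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    // The region now behaves like ordinary memory; the kernel pages pieces of
    // the file in and out as they are touched.
    uint64_t* data = static_cast<uint64_t*>(base);
    data[0] = 42;

    munmap(base, bytes);
    close(fd);
    return 0;
}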
The OS is working properly - it doesn't distinguish between stack and heap when swapping - it pages out whatever you don't seem to be using and loads whatever you ask for.
There are a few things you could try:
consider whether myType can be made smaller - e.g. using int8_t or even width-appropriate bitfields instead of int, using pointers to pooled strings instead of worst-case-length character arrays, using offsets into arrays where they're smaller than pointers, etc. If you show us the type, maybe we can suggest things.
think about your paging - if you have many objects on one memory page (likely 4 KB) they will all need to stay in memory if any one of them is being used, so try to get objects that will be used around the same time onto the same memory page - this may involve hashing to small arrays of related myType objects, or even moving all your data into a packed array if possible (binary searching can be pretty quick anyway). Naively used hash tables tend to thrash memory because similar objects are put in completely unrelated buckets.
serialisation/deserialisation with compression is a possibility: instead of letting the OS swap out full myType memory, you may be able to proactively serialise them into a more compact form then deserialise them only when needed
consider whether you need to process all the data simultaneously... if you can batch up the work in such a way that you get all "group A" out of the way using less memory then you can move on to "group B"
UPDATE: now you've posted your actual data types...
Sadly, using short might not help much, because sizeof(key) needs to be 16 anyway for alignment of the double; if you don't need the precision, you could consider float. Another option would be to create an array of separate maps...
tbb::concurrent_hash_map<double,value> map[65536];
You can then index into map[my_short] with my_double as the key. It could be better or worse, but it is easy to try, so you might as well benchmark it; a usage sketch follows below.
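For concreteness, here is a minimal usage sketch assuming Intel TBB is available; concurrent_hash_map has no operator[], so entries are reached through accessors. The helpers put/get are hypothetical names, and long long stands in for cln::cl_I so the sketch stays self-contained.

#include <tbb/concurrent_hash_map.h>

typedef long long Value;                                // stand-in for cln::cl_I
typedef tbb::concurrent_hash_map<double, Value> DoubleMap;

static DoubleMap maps[65536];                           // one map per possible short value

void put(short s, double d, const Value& v) {
    DoubleMap::accessor acc;                            // holds a write lock on the entry
    maps[static_cast<unsigned short>(s)].insert(acc, d);
    acc->second = v;
}

bool get(short s, double d, Value& out) {
    DoubleMap::const_accessor acc;                      // holds a read lock on the entry
    if (!maps[static_cast<unsigned short>(s)].find(acc, d))
        return false;
    out = acc->second;
    return true;
}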
For cl_I, a two-minute dig suggests the data is stored in a union - presumably word is used for small values and one of the pointers when necessary... that looks like a pretty good design - hard to improve on.
If numbers tend to repeat a lot (a big if) you could experiment with e.g. keeping a registry of big cl_Is with a bi-directional mapping to packed integer ids, which you'd store in My_Map::map - fussy though. To explain: say you get 987123498723489 - you push_back it onto a vector<cl_I>, then in a hash_map<cl_I, int> you map 987123498723489 to that index (i.e. vector.size() - 1). Keep going as new numbers are encountered. You can always map from an int id back to a cl_I using direct indexing into the vector, and the other way is an O(1) amortised hash table lookup.
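A minimal sketch of that registry idea (Registry, intern and lookup are hypothetical names, and long long stands in for cln::cl_I so the sketch stays self-contained): new big numbers get a packed int id, and either direction can be looked up cheaply.

#include <cstddef>
#include <unordered_map>
#include <vector>

typedef long long BigInt;   // stand-in for cln::cl_I in this sketch

class Registry {
    std::vector<BigInt> values_;                 // id -> big number
    std::unordered_map<BigInt, int> ids_;        // big number -> id
public:
    int intern(const BigInt& v) {                // get-or-create the packed id for v
        std::unordered_map<BigInt, int>::iterator it = ids_.find(v);
        if (it != ids_.end()) return it->second;
        values_.push_back(v);
        int id = static_cast<int>(values_.size()) - 1;
        ids_[v] = id;
        return id;                               // amortised O(1)
    }
    const BigInt& lookup(int id) const { return values_[id]; }  // O(1) direct indexing
};

// The small int returned by intern() is what you would store in My_Map::map.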

C++ Vector Memory

I'm working on a largish project, and we are having some memory issues now. Vectors have been used for all arrays, and a quick search suggests there are about 2000 member vectors.
Going through the code, it seems nobody has ever used reserve or swap (we're not on C++11 yet for this project).
Are there any tools or techniques I can use to find out how much memory is being lost in these vectors?
Use valgrind (its massif heap profiler) for debugging memory issues:
http://valgrind.org/docs/manual/ms-manual.html
One fast but dirty trick to see the effect of capacity on memory would be to modify std::vector (or typedef std::vector to your own custom vector type).
The idea is to ensure that this custom vector increases capacity by exactly what is needed instead of doubling it (yes, it will be super slow), and then see how the memory usage of the application changes when you run it with this custom vector.
While not useful for actually optimizing the code, it at least quickly gives you an idea of how much you can gain by optimizing your vectors. A sketch of the idea follows.
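This sketch shows one way the experiment could look (exact_vector is a hypothetical wrapper; note that reserve() only guarantees at least the requested capacity, though common implementations allocate exactly that much): push_back requests room for exactly one more element, so capacity stays close to size.

#include <cstddef>
#include <vector>

template <typename T>
class exact_vector {
    std::vector<T> v_;
public:
    void push_back(const T& x) {
        if (v_.size() == v_.capacity())
            v_.reserve(v_.size() + 1);   // grow by (roughly) one element - very slow
        v_.push_back(x);
    }
    std::size_t size() const     { return v_.size(); }
    std::size_t capacity() const { return v_.capacity(); }
    T&       operator[](std::size_t i)       { return v_[i]; }
    const T& operator[](std::size_t i) const { return v_[i]; }
};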
Just add some periodic logging lines that print the vector size, capacity and
sizeof(v) + sizeof(element_type) * v.capacity();
for each of your vectors v (this last expression is roughly the size of the vector in memory). You could register all your vectors somewhere central to keep this tidy.
Then you can do some analysis by searching through your log files to see which vectors are using significant amounts of memory and how the usage varies over time. If it is only peak usage that is high, then you may be able to 'resize' your vectors to get rid of the spare capacity.
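A minimal sketch of such a logging line (log_vector is a hypothetical helper name):

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

template <typename T>
void log_vector(const std::string& name, const std::vector<T>& v) {
    std::size_t bytes = sizeof(v) + sizeof(T) * v.capacity();
    std::cout << name << ": size=" << v.size()
              << " capacity=" << v.capacity()
              << " bytes=" << bytes << '\n';
}

// Usage: call log_vector("particles", particles); periodically, then grep the logs.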

How to create an array with size more than C++ limits

I have a little problem here: I write C++ code to create an array, but when I want to set the array size to 100,000,000 or more I get an error.
This is my code:
int i = 0;
double *a = new double[n*n];
This part is very important for my project.
When you think you need an array of 100,000,000 elements, what you actually need is a different data structure that you probably have never heard of before. Maybe a hash map, or maybe a sparse matrix.
If you tell us more about the actual problem you are trying to solve, we can provide better help.
In general, the only reason that would fail is lack of memory / memory fragmentation / available address space; that is, you are trying to allocate 800 MB of memory. Granted, I have no idea why your system's virtual memory can't handle that, but maybe you allocated a bunch of other stuff. It doesn't matter.
Your alternatives are tricks like memory-mapped files, sparse arrays, and so forth, instead of an explicit C-style array.
If you do not have sufficient memory, you may need to use a file to store your data and process it in smaller chunks.
I don't know whether IMSL provides what you are looking for; however, if you want to work on smaller chunks you might devise an algorithm that calls IMSL functions with these small chunks and later merges the results. For example, you can do matrix multiplication by combining multiplications of sub-matrices, as in the sketch below.
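This is a minimal sketch of the sub-matrix idea, using plain blocked multiplication on row-major n x n matrices held in vectors (in practice each block could be read from a file instead of RAM); blocked_multiply and the parameter names are choices made for this illustration.

#include <cstddef>
#include <vector>

// C += A * B for n x n row-major matrices, processed in blocks of size bs.
void blocked_multiply(const std::vector<double>& A,
                      const std::vector<double>& B,
                      std::vector<double>& C,
                      std::size_t n, std::size_t bs)
{
    for (std::size_t ii = 0; ii < n; ii += bs)
        for (std::size_t kk = 0; kk < n; kk += bs)
            for (std::size_t jj = 0; jj < n; jj += bs)
                // multiply one pair of bs x bs blocks and accumulate into C
                for (std::size_t i = ii; i < ii + bs && i < n; ++i)
                    for (std::size_t k = kk; k < kk + bs && k < n; ++k)
                        for (std::size_t j = jj; j < jj + bs && j < n; ++j)
                            C[i*n + j] += A[i*n + k] * B[k*n + j];
}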

Is there a way to make sure an array variable (unsigned int*) will be in memory?

I need to set a default value for all entries in a very large array.
It takes quite a long time (110-120 ms), and I suspect this happens because of misses in memory.
I use memset/std::fill to set the default value. Is there a way to make sure that the array will reside in memory before the memset/fill?
Assuming this is a large memory-mapped file, you can use the madvise() libc call with the MADV_WILLNEED argument to hint to the OS that you'll be wanting to access the region mentioned soon.
However YMMV, as the array needs to be large enough that the benefit of the resulting syscall isn't outweighed by the cost of making the call.
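A minimal sketch of the madvise() route, assuming a POSIX system and an existing backing file (the file name data.bin is made up for illustration):

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstddef>
#include <cstdio>

int main() {
    const char* path = "data.bin";                 // hypothetical backing file
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }
    size_t len = static_cast<size_t>(st.st_size);

    void* addr = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    // Hint that the whole mapping will be needed soon, so the kernel can start
    // reading it in before the first access.
    if (madvise(addr, len, MADV_WILLNEED) != 0)
        perror("madvise");

    // ... memset / std::fill / other work on the mapped region ...

    munmap(addr, len);
    close(fd);
    return 0;
}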
You can lock memory at per-page granularity using mlock, though only up to a fixed amount (I'm not sure what the limit is on OS X, but you can check it using getrlimit with RLIMIT_MEMLOCK); a sketch follows.
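A minimal sketch of checking the limit and pinning an array with mlock (the buffer size here is an arbitrary example):

#include <sys/mman.h>
#include <sys/resource.h>
#include <cstdio>
#include <vector>

int main() {
    // See how much memory this process is allowed to lock.
    rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) == 0)
        std::printf("soft mlock limit: %llu bytes\n",
                    static_cast<unsigned long long>(rl.rlim_cur));

    std::vector<unsigned int> buf(1u << 20);       // example array, about 4 MB

    // Pin the pages backing the array so they cannot be swapped out.
    if (mlock(buf.data(), buf.size() * sizeof(unsigned int)) != 0) {
        perror("mlock");                           // likely the limit was exceeded
        return 1;
    }

    // ... memset / std::fill and further work on the pinned array ...

    munlock(buf.data(), buf.size() * sizeof(unsigned int));
    return 0;
}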
Most likely you have a multi-core processor, and functions like memset can actually degrade in performance when not used on single-core CPUs. It's possible that mutex locking is causing the slowdown. Try allocating the memory on the stack instead of dynamically. Since it's a very large array, I would experiment with making my own memory manager and storing segments of it in multiple threads (but that's just an idea I had after quickly reading an article). A standard way of doing it would be to use one memory allocator per thread. In any case, I would look into something other than memset.
Maybe the following article would help.