How to make a character array of size 1000000000000 - c++

Since I am new to the programming field and I was trying to make a character array of very large size say for example 1000000000000 but my compiler is showing error:
Array too large
I am using turbo c++.
Can anyone please tell me how to do that?

You have several problems:
Firstly Turbo-C++ is a 16-bit compiler, and even with the best will in the world, it is not going to be able to cope. Even a 32-bit compiler (maximum address space just over 4,000,000,000 bytes) won't be able to cope. You need to use a 64-bit compiler.
Your next problem is that if you try to allocate such an enormous array on the stack, it won't fit. Most systems use a stack of around 1MB. You need to allocate the array on the heap. I normally(*) I would recommend using std::vector (because it manages releasing the memory for you). So instead of:
char big[1000ull*1000*1000*1000];
You need:
std::vector<char> big(1000ull*1000*1000*1000);
Your final problem is that very few machines are going to have 1TB of RAM installed. On Windows 10 you can allocate that much address space - but most of it is going to be in the swap, not in RAM.
*: This is why I wouldn't recommend std::vector here. Something involving either memory mapped files, or a more efficient data structure is going to be better. We can't tell what, unless you explain your actual problem.

Related

how to manage large arrays

I have a c++ program that uses several very large arrays of doubles, and I want to reduce the memory footprint of this particular part of the program. Currently, I'm allocating 100 of them and they can be 100 Mb each.
Now, I do have the advantage, that eventually parts of these arrays become obsolete during later parts of the program's execution, and there is little need to ever have the whole of any one of then in memory at any one time.
My question is this:
Is there any way of telling the OS after I have created the array with new or malloc that a part of it is unnecessary any more ?
I'm coming to the conclusion that the only way to achieve this is going to be to declare an array of pointers, each of which may point to a chunk say 1Mb of the desired array, so that old chunks that are not needed any more can be reused for new bits of the array. This seems to me like writing a custom memory manager which does seem like a bit of a sledgehammer, that's going to create a bit of a performance hit as well
I can't move the data in the array because it is going to cause too many thread contention issues. the arrays may be accessed by any one of a large number of threads at any time, though only one thread ever writes to any given array.
It depends on the operating system. POSIX - including Linux - has the system call madvise to do improve memory performance. From the man page:
The madvise() system call advises the kernel about how to handle paging input/output in the address range beginning at address addr and with size length bytes. It allows an application to tell the kernel how it expects to use some mapped or shared memory areas, so that the kernel can choose appropriate read-ahead and caching techniques. This call does not influence the semantics of the application (except in the case of MADV_DONTNEED), but may influence its performance. The kernel is free to ignore the advice.
See the man page of madvise for more information.
Edit: Apparently, the above description was not clear enough. So, here are some more details, and some of them are specific to Linux.
You can use mmap to allocate a block of memory (directly from the OS instead of the libc), that is not backed by any file. For large chunks of memory, malloc is doing exactly the same thing. You have to use munmap to release the memory - regardless of the usage of madvise:
void* data = ::mmap(nullptr, size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
// ...
::munmap(data, size);
If you want to get rid of some parts of this chunk, you can use madvise to tell the kernel to do so:
madvise(static_cast<unsigned char*>(data) + 7 * page_size,
3 * page_size, MADV_DONTNEED);
The address range is still valid, but it is no longer backed - neither by physical RAM nor by storage. If you access the pages later, the kernel will allocate some new pages on the fly and re-initialize them to zero. Be aware, that the dontneed pages are also part of the virtual memory size of the process. It might be necessary to make some configuration changes to the virtual memory management, e.g. activating over-commit.
It would be easier to answer if we had more details.
1°) The answer to the question "Is there any way of telling the OS after I have created the array with new or malloc that a part of it is unnecessary any more ?" is "not really". That's the point of C and C++, and any language that let you handle memory manually.
2°) If you're using C++ and not C, you should not be using malloc.
3°) Nor arrays, unless for a very specific reason. Use a std::vector.
4°) Preferably, if you need to change often the content of the array and reduce the memory footprint, use a linked list (std::list), though it'll be more expensive to "access" individually the content of the list (but will be almost as fast if you only iterate through it).
A std::deque with pointers to std::array<double,LARGE_NUMBER> may do the job, but you better make a dedicated container with the deque, so you can remap the indexes and most importantly, define when entries are not used anymore.
The dedicated container can also contain a read/write lock, so it can be used in a thread-safe way.
You could try using lists instead of arrays. Of course list is 'heavyer' than array but on the other hand it is easy to reconstruct a list so that you can throw away a part of it when it becomes obsolete. You could also use a wrapper which would only contain indexes saying which part of the list is up-to-date and which part may be reused.
This will help you improve performance, but will require a little bit more (reusable) memory.
Allocating by chunk and delete[]-ing and new[]-ing on the way seems like the good solution. It may be possible to do as little as memory management as possible. Do not reuse chunk yourself, simply deallocate old one and allocate new chunks when needed.

Declaring large character array in c++

I am trying right now to declare a large character array. I am using the character array as a bitmap (as in a map of booleans, not the image file type). The following code generates a compilation error.
//This is code before main. I want these as globals.
unsigned const long bitmap_size = (ULONG_MAX/(sizeof(char)));
char bitmap[bitmap_size];
The error is overflow in array dimension. I recognize that I'm trying to have my process consume a lot of data and that there might be some limit in place that prevents me from doing so. I am curious as to whether I am making a syntax error or if I need to request more resources from the kernel. Also, I have no interest in creating a bitmap with some class. Thank you for your time.
EDIT
ULONG_MAX is very much dependent upon the machine that you are using. On the particular machine I was compiling my code on it was well over 4.2 billion. All in all, I wouldn't to use that constant like a constant, at least for the purpose of memory allocation.
ULONG_MAX/sizeof(char) is the same as ULONG_MAX, which is a very large number. So large, in fact, that you don't have room for it even in virtual memory (because ULONG_MAX is probably the number of bytes in your entire virtual memory).
You definitely need to rethink what you are trying to do.
It's impossible to declare an array that large on most systems -- on a 32-bit system, that array is 4 GB, which doesn't fit into the available address space, and on most 64-bit systems, it's 16 exabytes (16 million terabytes), which doesn't fit into the available address space there either (and, incidentally, may be more memory than exists on the entire planet).
Use malloc() to allocate large amounts of memory. But be realistic. :)
As I understand it, the maximum size of an array in c++ is the largest integer the platform supports. It is likely that your long-type bitmap_size constant exceeds that limit.

Create too large array in C++, how to solve?

Recently, I work in C++ and I have to create a array[60.000][60.000]. However, i cannot create this array because it's too large. I tried float **array or even static float array but nothing is good. Does anyone have an ideas?
Thanks for your helps!
A matrix of size 60,000 x 60,000 has 3,600,000,000 elements.
You're using type float so it becomes:
60,000 x 60,000 * 4 bytes = 14,400,000,000 bytes ~= 13.4 GB
Do you even have that much memory in your machine?
Note that the issue of stack vs heap doesn't even matter unless you have enough memory to begin with.
Here's a list of possible problems:
You don't have enough memory.
If the matrix is declared globally, you'll exceed the maximum size of the binary.
If the matrix is declared as a local array, then you will blow your stack.
If you're compiling for 32-bit, you have far exceeded the 2GB/4GB addressing limit.
Does "60.000" actually mean "60000"? If so, the size of the required memory is 60000 * 60000 * sizeof(float), which is roughly 13.4 GB. A typical 32-bit process is limited to only 2 GB, so it is clear why it doesn't fit.
On the other hand, I don't see why you shouldn't be able to fit that into a 64-bit process, assuming your machine has enough RAM.
Allocate the memory at runtime -- consider using a memory mapped file as the backing. Like everyone says, 14 gigs is a lot of memory. But it's not unreasonable to find a computer with 14GB of memory, nor is it unreasonable to page the memory as necessary.
With a matrix of this size, you will likely become very curious about memory access performance. Remember to consider the cache grain of your target architecture and if your target has a TLB you may be able to use larger pages to relieve some TLB pressure. Then again, if you don't have enough memory you'll likely care only about how fast your storage I/O is.
If it's not already obvious, you'll need an architecture that supports a 64-bit address space in order to access this memory directly/conveniently.
To initialise the 2D array of floats that you want, you will need:
60000 * 60000 * 4 bytes = 14400000000 bytes
Which is approximately 14GB of memory. That's a LOT of memory. To even hold that theoretically, you will need to be running a 64bit machine, not to mention one with quite a bit of RAM installed.
Furthermore, allocating this much memory is almost never necessary in most situations, are you sure no optimisations could be made here?
EDIT:
In light of new information from your comments on other answers: You only have 4GB memory (RAM). Your operating system is hence going to have to page at least 9GB on the Hard Drive, in reality probably more. But you also only have 20GB of Hard Drive space. This is barely enough to page all that data, especially if the disk is fragmented. Finally, (I could be wrong because you haven't stated explicitly) it is quite possible that you're running a 32bit machine. This isn't really capable of handling more than 4GB of memory at a time.
I had this problem too. I did a workaround where I chopped the array into sections (my biggest allowed array was float A_sub_matrix_20[62944560]). When I declared just one of these in main(), it seems to be put in RAM as I got a runtime exception as soon as main() starts. I was able to declare 20 buffers of that size as global variables which works (looks like in global form they are stored on the HDD - when I added A_sub_matrix_20[n] to the watch list in VisualStudio it gave a message "reading from file").

How to allocate long array of struct?

I have such struct:
struct Heap {
int size;
int *heap_array;
};
And I need to create an array:
Heap *rooms = new Heap[k];
k may be even equal 1000000. For k about 1000 it works, with k about 10000 I got:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted
Edit:
I forgot to add, I cant use vector, it is task in my school... Only <cstdio> and <math> allowed.
Are you using 32 or 64 bit?
Depending on this your process can only consume memory up to a maximum size. I am guessing you are on 32 bits. Maybe you don't even have so much memory to begin with.
Also take a look here :
http://www.cplusplus.com/reference/std/new/bad_alloc/
Updated
You should ensure you are not leaking anything and that your heap allocations do not live longer than needed. Attempt to reduce the allocation requirements per Heap. As well, if you are allocating that many Heaps: Where is the storage for heap_array? Are those all also new[]ed?
If you exceed the amount of addressable memory for your system, you may need to run your program as a 64 bit executable.
bad_alloc basically means that new is unable to allocate the requested space. It surprises me that you see it already when attempting to allocate 10 000. How much memory are you using besides that?
You might want to check what your alignment is set to (how you do this is compiler specific) Using a vector shouldn't really help in avoiding the bad_alloc exception, especially not if you know from start the number of elements needed.
You might be running your head against the wall here if you are trying to allocate more memory than you have (2 Gb on Win 32 bit), if this is the case try looking at this answer:
C++ memory allocation: 'new' throws bad_alloc?
You can also risk running into fragmentation issues, there might be enough space counting the number of free bytes, but not enough space in a single cluster. The link above brings some suggestions for that as well in the post by the user Crashworks he suggests using the (although OS specific) functions HeapAlloc and VirtualAlloc. But then again this would conflict with your school assignment.
Instead try investigating if you receive the same problem on a different computer.
Perhaps if it is truly necessary to allocate and process enough structs to cause a bad_alloc exception you could consider processing only a few at a time, preferrably reusing already allocated structs. This would improve your memory usage numbers, and might even prove to be faster.

Very large array on the heap (Visual C++)

I hope some one can help me, i'm trying to create an int[400000000] (400 millions) array on my application using visual c++ 2010 but it generates an overflow error
The same code runs on linux with g++.
I need this because i'm working with large matrices.
Thank you in advance.
If you are using a 32-bit application then by default you have just 2GB of user address space. 400 million integers is about 1.5GB. You are very likely not to have this much contiguous address space. It is possible to force 32-bit windows to allocate a 3GB user address space for each process but this may just be a stop gap for your situation.
If you can move to a 64-bit architecture then this should not be an issue; otherwise you should find a way of storing your matrix data in a way that does not require a single block of contiguous storage, for example storing it in chunks.
I think what you need is a Divide-and-Conquer algorithm. Not memory space.
I'm not sure if in you're case it wouldn't even be better to use STXXL.
Perhaps sparse matrices are of use in your application. This concept is used when dealing with big matrices which have a lot of 0 entries, which can be the case in quite a lot of applications.
And by the way, you do not gain anything by storing such a huge amount of data on the heap. Consider, that your CPU cache has perhaps 12 MB! At least use some intelligent dynamic memory allocation mechanism.
Does the whole array really needs to be allocated ? do you really use the whole array ? Is it an array with lots of 0 ? if it is the case, then the fact that it works better on linux can be explained.
In that case using a sparse array might be more appropriate. Using an existing sparse array implementation would reduce the memory footprint and maybe allow faster computation.
I just found a very simple solution but i don't know if it is advisable
int tab[400000000]={0};//global array
int main(array<System::String ^> ^args)
{
std::cout<<tab[399999999]<<std::endl;//ok
/*
int* tab=new int[400000000];//doesn't work
...
delete[] tab;
*/
return 0;
}