Arrays and Caching [closed] - c++

I have a char array that is 8 billion bytes long. Would breaking it into smaller arrays improve performance through better caching? Basically, I will iterate over the array and do some comparisons. If not, what is the optimal way to use an array of that length?
I am reading a file in binary form into an array, and will be performing binary comparisons on different parts of the file.

8 GB worth of data will inevitably ruin data locality, so one way or another you either have to manage your memory in smaller pieces yourself or your OS will end up swapping virtual memory to disk.
There is, however, an alternative: mmap. Essentially, this lets you map a file into your virtual address space; your OS then takes on the task of loading the necessary pages into RAM, and your access to the file becomes nothing more than simple memory addressing.
Read more about mmap at http://en.wikipedia.org/wiki/Mmap
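For illustration only, here is a minimal sketch of that approach using POSIX mmap (the file name "data.bin" and the byte value being compared are placeholder assumptions):

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstddef>
    #include <cstdio>

    int main() {
        const char* path = "data.bin";           // hypothetical input file
        int fd = open(path, O_RDONLY);
        if (fd < 0) { std::perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) != 0) { std::perror("fstat"); return 1; }

        // Map the whole file read-only; the OS pages it in on demand,
        // so only the parts you actually touch ever occupy RAM.
        void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { std::perror("mmap"); return 1; }
        const char* data = static_cast<const char*>(p);

        // Iterate and compare exactly as with an ordinary in-memory array.
        std::size_t matches = 0;
        for (off_t i = 0; i < st.st_size; ++i)
            if (data[i] == 0x42) ++matches;      // placeholder comparison
        std::printf("matches: %zu\n", matches);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }

Since the mapping is read-only and demand-paged, the kernel can drop and refetch pages as needed, which is exactly the "manage memory in smaller pieces" part handled for you.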

If you are only going to do this once, just run through it; the programming effort may not be worth the time gained.
I am assuming you want to do this again and again, which is why you want to optimize it. It would certainly help to know whether your iteration and comparisons need to be done sequentially. Without some problem-domain input it is difficult to give a generic optimization here.
If it can be done in parallel and you have to do it multiple times, I suggest you take a look at MapReduce techniques to solve this.
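If the comparisons are independent per chunk, the "map" step can simply hand each thread its own slice of the array and the "reduce" step can sum the partial results. A rough sketch under that assumption (the matches() predicate is a placeholder):

    #include <cstddef>
    #include <numeric>
    #include <thread>
    #include <vector>

    // Hypothetical per-byte predicate; replace with the real comparison.
    static bool matches(char c) { return c == 0x42; }

    // "Map": each thread counts matches in its own chunk.
    // "Reduce": the partial counts are summed at the end.
    std::size_t count_matches(const char* data, std::size_t n, unsigned threads) {
        std::vector<std::size_t> partial(threads, 0);
        std::vector<std::thread> workers;
        std::size_t chunk = n / threads;
        for (unsigned t = 0; t < threads; ++t) {
            std::size_t begin = t * chunk;
            std::size_t end = (t + 1 == threads) ? n : begin + chunk;
            workers.emplace_back([&, t, begin, end] {
                for (std::size_t i = begin; i < end; ++i)
                    if (matches(data[i])) ++partial[t];
            });
        }
        for (auto& w : workers) w.join();
        return std::accumulate(partial.begin(), partial.end(), std::size_t{0});
    }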

Related

Why does using std::vector<> instead of std::list<> cause an increase in code size? [closed]

In a project at work, std::list and std::vector are used a lot. Since random inserts were seldom needed, I started changing the std::lists to std::vectors. But with every switch the resulting code size increased (not by a fixed amount, but roughly 1 kB on average). Given that std::vector was already used, I don't see why switching a std::list to a std::vector should increase the code size. Any ideas why? The compiler used is g++.
Maybe you have added a vector of a new type (e.g. in your original code you used vector<int> and now you added a vector<string>: they are different types, so the code size will increase to include the new instantiation).
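To illustrate the point with hypothetical types: every distinct instantiation of a class template emits its own copy of whatever member functions are used, so the first vector of a new element type adds code even if other vectors already exist in the binary.

    #include <string>
    #include <vector>

    // vector<int> was presumably instantiated somewhere in the project already...
    std::vector<int> ids = {1, 2, 3};

    // ...but this line instantiates a second, unrelated set of member functions
    // (vector<std::string>'s push_back, destructor, etc.), growing the binary.
    std::vector<std::string> names = {"a", "b"};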
Is this in debug mode? If yes, it could be inlined range-checking code that increases the code size. Note that a list needs much less of this, since you only have to check whether the next node is null.
Okay, without further details we can only guess.
A vector's memory is contiguous (which is guaranteed by the standard), but a list's memory is not. It might therefore be that the compiler is able to vectorize and unroll your vector-based code better, which leads to bigger instructions and longer binary code.
std::vector also provides more member functions than std::list (there is no random access in a list, for example), so more code may end up being instantiated.

c++ Really large floating point calculation [closed]

I want to do really large floating-point calculations, and they should be fast enough.
How can I make use of the graphics processor if one is available? If no GPU is available, then I would like to use the main CPU.
Thanks
Depending on the 'size' of these numbers, you can try MPFR. Although it's not a GPU solution, it can handle big numbers and should be relatively fast; it's used by a few open-source compilers (GCC & LLVM) to do static constant folding, so it's designed to preserve accuracy.
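A minimal sketch of MPFR's C API (the values and the precision are placeholders; link with -lmpfr -lgmp):

    #include <mpfr.h>
    #include <cstdio>

    int main() {
        mpfr_t a, b, c;
        // 256 bits of precision (roughly 77 decimal digits); pick what the problem needs.
        mpfr_inits2(256, a, b, c, (mpfr_ptr) 0);

        mpfr_set_str(a, "1e300", 10, MPFR_RNDN);
        mpfr_set_str(b, "3.14159265358979323846264338327950288", 10, MPFR_RNDN);
        mpfr_mul(c, a, b, MPFR_RNDN);        // c = a * b, rounded to nearest

        mpfr_printf("a * b = %.40Rg\n", c);  // print 40 significant digits
        mpfr_clears(a, b, c, (mpfr_ptr) 0);
        return 0;
    }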
To do work on a GPU (really a GPGPU), you'd need to write a kernel using something like OpenCL or DirectCompute, and do your number crunching in that.
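To give a feel for what that involves, here is a stripped-down OpenCL host-plus-kernel sketch in C++ (a simple element-wise multiply; error checking and cleanup are omitted, and the kernel, types, and sizes are placeholder assumptions):

    // Build with something like: g++ gpu_mul.cpp -lOpenCL
    #define CL_TARGET_OPENCL_VERSION 120
    #include <CL/cl.h>
    #include <cstdio>
    #include <vector>

    // The kernel is plain OpenCL C, compiled at runtime for whatever device is found.
    static const char* kSrc = R"CLC(
    __kernel void mul(__global const float* a, __global const float* b,
                      __global float* out) {
        size_t i = get_global_id(0);
        out[i] = a[i] * b[i];
    })CLC";

    int main() {
        const size_t n = 1 << 20;
        std::vector<float> a(n, 1.5f), b(n, 2.0f), out(n);

        cl_platform_id plat; cl_device_id dev;
        clGetPlatformIDs(1, &plat, nullptr);
        clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, nullptr);

        cl_context ctx = clCreateContext(nullptr, 1, &dev, nullptr, nullptr, nullptr);
        cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, nullptr);

        cl_program prog = clCreateProgramWithSource(ctx, 1, &kSrc, nullptr, nullptr);
        clBuildProgram(prog, 1, &dev, nullptr, nullptr, nullptr);
        cl_kernel k = clCreateKernel(prog, "mul", nullptr);

        cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                   n * sizeof(float), a.data(), nullptr);
        cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                   n * sizeof(float), b.data(), nullptr);
        cl_mem dout = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, n * sizeof(float),
                                     nullptr, nullptr);

        clSetKernelArg(k, 0, sizeof(cl_mem), &da);
        clSetKernelArg(k, 1, sizeof(cl_mem), &db);
        clSetKernelArg(k, 2, sizeof(cl_mem), &dout);

        clEnqueueNDRangeKernel(q, k, 1, nullptr, &n, nullptr, 0, nullptr, nullptr);
        clEnqueueReadBuffer(q, dout, CL_TRUE, 0, n * sizeof(float), out.data(),
                            0, nullptr, nullptr);

        std::printf("out[0] = %f\n", out[0]);  // expect 3.0
        return 0;
    }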
You may also be interested in Intel's new AVX extensions.
Floating-point calculations can handle very large numbers, very small numbers, and numbers somewhere in between. The problem is that the number of significant digits is limited, and any number that doesn't fit perfectly in a base-two representation (numbers like 1/3 or 1/7) will pick up errors as it is translated to its closest base-2 counterpart.
If a GPU is available, as one is in nearly all computers with video, then a GPGPU framework should help you access it without writing tons of assembly language. That said, unless you are sure your computations involve operations similar to those a GPU already performs, you are better off avoiding it: GPUs are excellent at doing what they already do, and poor at being adapted to anything else.

Generating word library - C or C++ [closed]

I need to create a simple application, but speed is very important here. The application is pretty simple.
It will generate all available character combinations and save them to a text file. The user will enter the length to be used for generation, so the application will use a recursive function with a loop inside.
Will C be faster than C++ for this, or does it not matter?
Speed is very important because my application may need to generate and save 10 million+ words to the file.
It doesn't really matter; chances are your application will be I/O bound rather than CPU bound, unless you have enough RAM to hold all of that in memory.
It's much more important that you choose the best algorithm, and the best data structures to back that algorithm up.
Then implement that in the language you're most familiar with. C++ has the advantage of having easy to use containers in its standard libraries, but that's about it. You can write slow code in both, and fast code in both.
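For what it's worth, here is a minimal sketch of the generation itself (the alphabet, word length, and output file name are placeholder assumptions); with buffered stdio the recursion is cheap and the run stays I/O bound:

    #include <cstddef>
    #include <cstdio>
    #include <string>

    static const std::string kAlphabet = "abcdefghijklmnopqrstuvwxyz";  // assumed charset

    // Recursively fill position 'pos' and emit the word once it is complete.
    void emit(std::string& word, std::size_t pos, std::FILE* out) {
        if (pos == word.size()) {
            std::fwrite(word.data(), 1, word.size(), out);
            std::fputc('\n', out);
            return;
        }
        for (char c : kAlphabet) {
            word[pos] = c;
            emit(word, pos + 1, out);
        }
    }

    int main() {
        std::FILE* out = std::fopen("words.txt", "w");
        if (!out) return 1;
        std::string word(4, ' ');   // length would come from the user; 4 as an example
        emit(word, 0, out);
        std::fclose(out);
        return 0;
    }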

FFmpeg potential memory-leak-like behavior points. Are there such? [closed]

So I mean, like, it tries to educate itself and collects some data... Are there any such points/fields in the encoding part of FFmpeg (that, I hope, can be disabled)?
BTW: My problem is simple: I looked through all my code, and it seriously looks like some part of my FFmpeg Windows build leaks memory a little, all the time, while I am encoding... So I hope FFmpeg is just trying to learn, so that I would be able to tell it not to learn!
FFmpeg libraries use a very object-oriented design. All memory allocated should be kept track of in the context structures and freed when the relevant context is destroyed. There may be some one-time allocation and initialization of constant global data which one could call a "leak", but I believe that was all replaced with static const tables to facilitate better use of shared memory and eliminate memory leaks associated with dynamic loading. If you really think it's leaking (and if you care), you need to use some memory debugging tools to identify where the leaks might be and coordinate finding/fixing them with the developers.
If what you mean is that during a single encode, memory usage grows slightly, this is probably normal and to be expected. It shouldn't be much, and the memory should all be released when the encoding context is freed.
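As a concrete illustration, here is a minimal sketch of keeping every allocation tied to its context, assuming a reasonably modern libavcodec (older builds used avcodec_close() instead of avcodec_free_context(); the codec choice is a placeholder):

    extern "C" {
    #include <libavcodec/avcodec.h>
    }

    int encode_something() {
        const AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_MPEG4);
        if (!codec) return -1;

        AVCodecContext* ctx = avcodec_alloc_context3(codec);
        AVFrame* frame = av_frame_alloc();
        AVPacket* pkt = av_packet_alloc();
        // ... configure ctx, call avcodec_open2(), feed frames, drain packets ...

        // Everything allocated above is released here. If memory still grows
        // steadily across many encodes with this pattern, it is worth reporting
        // upstream with a memory-debugger trace.
        av_packet_free(&pkt);
        av_frame_free(&frame);
        avcodec_free_context(&ctx);
        return 0;
    }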

How to fix heap corruption in c/c++? [closed]

This is a follow-up question to my previous one here.
Is there a general solution for this?
Fix all dangling pointers
Fix all buffer overflows
Use pointers only where they are really needed
Reading your original post, I'm not 100% sure you are facing heap corruption, and you should really hope you aren't, because if you are, it is one of the trickiest errors to track down, and AFAIK there are no solutions that always work.
However, if you are positive it's heap corruption and you are on the Windows platform, you could try out the tool gflags. This advanced debugging tool lets you install a debug heap manager in order to make it possible to find certain kinds of heap corruption. One of its neat options is to place each allocation in its own page, or to write-protect the heap data structures. This makes it easier to pinpoint the exact location of the heap corruption. It uses a lot of memory and CPU, though.
Another tip if you are using Visual Studio: if you manage to find that something corrupts the data after a certain point in time and you wish to find the code responsible, you can use a data breakpoint to break whenever anything changes the data in question.
I wish you good luck.
Adding to swegi's answer, I would check all your C++ constructors and all C initializations. Transform all dynamic C++ constructors (those where you put the init code in the function body) into static ones (where you initialize in the member-initializer list). And be sure that you initialize all pointers and pointer members to 0. For C, I would initialize all variables.
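A small sketch of that advice (the type is hypothetical; C++11 deleted copies are used for brevity):

    #include <cstddef>

    struct Widget {
        int* buffer;
        std::size_t size;

        // Initialize members in the initializer list rather than in the body,
        // and default every pointer to null so a stray use is easy to spot.
        Widget() : buffer(nullptr), size(0) {}

        explicit Widget(std::size_t n) : buffer(new int[n]()), size(n) {}

        ~Widget() { delete[] buffer; }

        // Without these, copying would double-delete the buffer, which is a
        // classic source of heap corruption.
        Widget(const Widget&) = delete;
        Widget& operator=(const Widget&) = delete;
    };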
Then, on Unix, I would use Valgrind; it is usually quite good at finding access violations, and if you compile with full debugging information it can trace them to the source line. There should be something similar on Windows, too.