How to Parallelize PDF to HTML conversion on GPU? [closed] - c++

Closed. This question needs details or clarity. It is not currently accepting answers. Closed 6 years ago.
I want to parallelize PDF to HTML conversion, not at the file level but at the page or object level. Is this a wise choice for parallelization? If so, how can it be done? Will the speedup on a GPU be appreciable compared with the same work on a CPU?

My simplest answer would be: it may not be feasible.
Basically, the most important classification here is whether a problem is task parallel or data parallel. The first refers, roughly speaking, to problems where several threads work on their own tasks, more or less independently. The second refers to problems where many threads all do the same thing, but on different parts of the data. The latter is the kind of problem GPUs are good at: they have many cores, and all the cores do the same thing but operate on different parts of the input data.
The next issue is moving the data around: transferring the input to GPU memory and the results back can easily cost more than the computation saves.
GPU programming is an art, and it can be very, very challenging to get right.
So the question is: can you parallelize this particular format conversion? I did some conversions before, and almost none of them were feasible for parallel processing.
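If you do decide to try page-level parallelism, ordinary CPU threads are the natural first step, because pages are independent tasks rather than uniform data. Here is a minimal sketch using std::async; convert_page() is a hypothetical stand-in for whatever your PDF library actually exposes, and many real libraries keep per-document state that is not thread-safe, so each task may need its own document handle.

    #include <future>
    #include <string>
    #include <vector>

    // Hypothetical per-page converter; the stub output marks where a real
    // implementation would parse the page and emit HTML.
    std::string convert_page(const std::string& pdf_path, int page) {
        return "<div data-src='" + pdf_path + "' data-page='"
             + std::to_string(page) + "'></div>";
    }

    std::vector<std::string> convert_all(const std::string& pdf_path,
                                         int pages) {
        std::vector<std::future<std::string>> futures;
        futures.reserve(pages);
        for (int p = 0; p < pages; ++p)
            futures.push_back(std::async(std::launch::async, convert_page,
                                         std::cref(pdf_path), p));
        std::vector<std::string> html;
        html.reserve(pages);
        for (auto& f : futures)
            html.push_back(f.get());  // blocks; keeps results in page order
        return html;
    }

    int main() {
        for (const auto& page : convert_all("report.pdf", 4))
            (void)page;  // hand each page's HTML to the output writer here
    }

Only if some inner step turned out to be uniform number crunching over large buffers (say, decoding embedded images) would that specific kernel be a GPU candidate.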

Related

How do I compare two functions' speed and performance [closed]

Closed. This question needs to be more focused. It is not currently accepting answers. Closed 2 years ago.
I have two functions performing the same process but with different techniques, and I need to know, at a large scale, which technique is faster than the other; maybe in the future more techniques will be available. So my question is: how can I do that, specifically in C++? Is there a specific method and header to be used to perform this task?
More details:
For example, isLargest() takes three parameters and has two versions: one uses nested ifs, and the other uses initializers and fewer if statements. If I need to know which one is faster, how can I do that?
Try your code in the real world and measure
There is a tool called a profiler that is meant to solve this problem. Broadly speaking, there are two kinds (note: some are a mix between the two):
Sampling profilers.
Instrumenting profilers.
It's worth learning about what each does and their pros/cons, but if you don't know what to use, go with a sampling profiler.
There are many sampling profilers, but support depends on your platform. If you're on Windows, Visual Studio comes with a really nice sampling profiler and I recommend you start there!
If you go down this route, it's important to make sure you use your functions as you would "for real" when you're profiling them, as there are many subtle factors that can affect the result.
An alternative
If you don't want to try your code running in a real program, perhaps if you're just trying to understand general characteristics of the function, there are libraries to help you do this such as Google Benchmark.
Benchmarking code can be surprisingly difficult to get right, so I would strongly recommend using an existing benchmarking tool like Google Benchmark wherever possible.
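As an illustration, here is a minimal sketch using Google Benchmark; isLargestNested() and isLargestFlat() are hypothetical stand-ins for your two versions, so substitute your real implementations and realistic inputs.

    #include <benchmark/benchmark.h>

    // Stand-ins: both return true iff a is strictly the largest.
    static bool isLargestNested(int a, int b, int c) {
        if (a > b) {
            if (a > c) return true;
        }
        return false;
    }
    static bool isLargestFlat(int a, int b, int c) {
        return a > b && a > c;
    }

    static void BM_Nested(benchmark::State& state) {
        int i = 0;
        for (auto _ : state)  // varying input resists constant folding
            benchmark::DoNotOptimize(isLargestNested(i++ % 3, 1, 2));
    }
    BENCHMARK(BM_Nested);

    static void BM_Flat(benchmark::State& state) {
        int i = 0;
        for (auto _ : state)
            benchmark::DoNotOptimize(isLargestFlat(i++ % 3, 1, 2));
    }
    BENCHMARK(BM_Flat);

    BENCHMARK_MAIN();

Link against the benchmark library (e.g., -lbenchmark -lpthread) and the binary prints per-iteration timings for each variant.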

Interior Point Method (Path Following) vs Simplex [closed]

Closed. This question needs to be more focused. It is not currently accepting answers. Closed 5 years ago.
What are the pros and cons of these two LP methods?
I can only think of fewer iterations in the Interior Point Method (when the LP problem is sufficiently large).
I'm going to list some features of both algorithms to explain what differentiates them.
Simplex
provides a basic solution, useful for branch and bound solvers in integer programming
easy to warm (or hot) start from a suboptimal solution, also necessary for integer programming
very high iteration speed mainly due to preservation of sparse data structures, but sometimes requires many iterations to reach optimality
memory efficient
numerically very stable
Interior Point
iteration count independent of problem size
often faster to reach optimality
easier to parallelize (Cholesky factorization)
In summary, IPM is the way to go for pure LPs, while for reoptimization-heavy applications like (mixed) integer programming the Simplex is better suited. One may also combine both approaches and perform a Simplex-like cross-over after the IPM has found an optimal solution, to obtain a basic one.
Often, it is a good idea to try both methods and decide then what works best, because performance is very much problem dependent.
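If you want a concrete way to run that comparison, many LP libraries expose both methods on the same problem object. Here is a minimal sketch using GLPK, with a toy problem that is purely illustrative (maximize x + y subject to x + y <= 10, 0 <= x <= 6, 0 <= y <= 8):

    #include <cstdio>
    #include <glpk.h>

    int main() {
        glp_prob* lp = glp_create_prob();
        glp_set_obj_dir(lp, GLP_MAX);

        glp_add_rows(lp, 1);
        glp_set_row_bnds(lp, 1, GLP_UP, 0.0, 10.0);  // x + y <= 10

        glp_add_cols(lp, 2);
        glp_set_col_bnds(lp, 1, GLP_DB, 0.0, 6.0);   // 0 <= x <= 6
        glp_set_col_bnds(lp, 2, GLP_DB, 0.0, 8.0);   // 0 <= y <= 8
        glp_set_obj_coef(lp, 1, 1.0);
        glp_set_obj_coef(lp, 2, 1.0);

        // Constraint matrix entries; GLPK arrays are 1-indexed.
        int    ia[] = {0, 1, 1};
        int    ja[] = {0, 1, 2};
        double ar[] = {0.0, 1.0, 1.0};
        glp_load_matrix(lp, 2, ia, ja, ar);

        glp_simplex(lp, nullptr);   // default simplex settings
        std::printf("simplex objective:  %g\n", glp_get_obj_val(lp));

        glp_interior(lp, nullptr);  // default interior point settings
        std::printf("interior objective: %g\n", glp_ipt_obj_val(lp));

        glp_delete_prob(lp);
    }

Both calls accept a parameter struct (glp_smcp and glp_iptcp) in place of the null pointers if either method needs tuning.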

Is my company doing this right, sharing data between exes? [closed]

Closed. This question is opinion-based. It is not currently accepting answers. Closed 6 years ago.
First off, my company is in power grids, not IT, so software plays kind of a secondary role here.
I work on a power system simulation software with a very old C++/MFC code base, around 15 years old. What we do is take a large amount of data, ~100,000 floating point values, then format and write it to a text file (most of the code actually uses the C FILE structure for this). That file is then read by a separate engine exe which runs the electrical algorithm (mostly numerical solutions of systems of differential equations) and writes a huge amount of data back to another text file, which we read to update the UI.
My question is: is this how it should be done? Is there a way to skip writing to the text files and pass the data to the exes directly?
The exes are launched using the Win32 CreateProcess() function.
EDIT:
Sorry, the site won't let me comment.
@Vlad Feinstein: Well, yes, it's like a ladder. A component called load flow solves the power flow through the lines, which in turn is used to find the stability of the systems, which in turn feeds overvoltage analysis, etc. It's huge; the UI is a million+ lines of code, and the engine exes are maybe another million.
Doesn't MFC already implement IPC using Dynamic Data Exchange? I can pass strings to another process's PreTranslateMessage() function. A scaled-up version of that?
There is no such thing as "how it should be done"; there are multiple methods of doing IPC, and while the method you describe might not be the fastest, it is a viable solution nevertheless. If the performance doesn't bother you in this particular case, you should not bother with modifying it. It is exactly the case where the phrase "if it ain't broke, don't fix it" applies.
You probably would not want to build any new IPC in the application that way, though.
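For example, one common alternative to temp files on Windows is a named shared-memory block. The sketch below is illustrative only; the mapping name, the fixed size, and the minimal error handling are assumptions rather than anything from your code base. The UI process creates and fills the mapping before calling CreateProcess(), and the engine opens the same name with OpenFileMapping() to read the floats directly, skipping the disk entirely.

    #include <windows.h>

    int main() {
        const DWORD kBytes = 100000 * sizeof(float);  // ~100,000 values

        // Pagefile-backed shared memory; no file ever touches the disk.
        // The name "Local\\PowerSimData" is a made-up example.
        HANDLE hMap = CreateFileMappingA(INVALID_HANDLE_VALUE, nullptr,
                                         PAGE_READWRITE, 0, kBytes,
                                         "Local\\PowerSimData");
        if (!hMap) return 1;

        float* data = static_cast<float*>(
            MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, kBytes));
        if (!data) { CloseHandle(hMap); return 1; }

        for (int i = 0; i < 100000; ++i)
            data[i] = 0.0f;  // fill with the real simulation input here

        // ... launch the engine with CreateProcess(); inside the engine,
        // OpenFileMappingA(FILE_MAP_READ, FALSE, "Local\\PowerSimData")
        // followed by MapViewOfFile() yields the same float array ...

        UnmapViewOfFile(data);
        CloseHandle(hMap);
    }

Keeping the mapping name in a shared header, plus a signal the engine can wait on (for instance an event from CreateEvent()) to announce "data ready", would make this a drop-in replacement for the current file handshake.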

Should I use Priority Queues for scheduling tasks (functions, etc.) in a highly dynamic system? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers. Closed 7 years ago.
I am a hobby programmer interested in dynamic systems like game engines, and I had the idea of a task scheduling system that uses something like a priority queue to sort different tasks dynamically, perhaps with a parallel feature to use multiple cores efficiently. My explicit idea was to use some kind of Task class that stores a function pointer and two queue parameters, one being the gravity (importance) of the task and the other being the time since it was pushed onto the queue, which would then be multiplied to obtain its position in the queue.
Now here comes my question: would such a system be more efficient in general, or at least pay off in any way, compared to a hard-coded system (like some 'main loop')?
E.g., is it a better solution / is it faster?
Thanks for the replies.
This is exactly what priority queues were designed for. Start your design with priority queues and see how well it goes; then you may want to tweak it if specific issues come up.
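As a concrete starting point, here is a minimal sketch of such a scheduler built on std::priority_queue; all the names are illustrative. One caveat for the gravity-times-age idea: std::priority_queue requires keys that do not change while stored, so true wall-clock aging would need periodic re-prioritization (or a different structure). The sketch approximates aging with a push-order tie-break instead.

    #include <cstdint>
    #include <cstdio>
    #include <functional>
    #include <queue>
    #include <utility>
    #include <vector>

    struct Task {
        std::function<void()> fn;
        double gravity;     // importance supplied by the caller
        std::uint64_t seq;  // monotonically increasing push counter
    };

    struct TaskCompare {
        bool operator()(const Task& a, const Task& b) const {
            // Higher gravity runs first; among equal gravities, older
            // tasks (smaller seq) run first.
            if (a.gravity != b.gravity) return a.gravity < b.gravity;
            return a.seq > b.seq;
        }
    };

    class Scheduler {
    public:
        void push(std::function<void()> fn, double gravity) {
            queue_.push(Task{std::move(fn), gravity, next_seq_++});
        }
        void run_one() {
            if (queue_.empty()) return;
            Task t = queue_.top();  // copy, then pop
            queue_.pop();
            t.fn();
        }
        bool empty() const { return queue_.empty(); }
    private:
        std::priority_queue<Task, std::vector<Task>, TaskCompare> queue_;
        std::uint64_t next_seq_ = 0;
    };

    int main() {
        Scheduler s;
        s.push([] { std::puts("render"); }, 1.0);
        s.push([] { std::puts("physics"); }, 5.0);  // runs first
        while (!s.empty()) s.run_one();
    }

For the parallel part, several worker threads could pop from the same queue under a mutex; whether that beats a hard-coded main loop is exactly what you would then measure.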

On what amount of work does using OpenMP start making sense? [closed]

Closed. This question is opinion-based. It is not currently accepting answers. Closed 8 years ago.
I recently started looking into parallelization using OpenMP and found a decent number of good resources describing how to use it. However, I was unable to find documentation on when parallelization starts making sense, or in other words: where is the turning point at which parallelization starts to compensate for the overhead of OpenMP's thread creation, and in which cases is it better to go without it? How complex does work have to be for parallelizing it to make sense?
Is there any documentation or guide available on that?
From my own experience: if your computation is well suited to parallelization, you can expect a substantial gain once the serial computation (for example, the loop you want to parallelize) takes a few milliseconds.
Below about 1 millisecond, using multiple threads will not help, due to the overhead involved.
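A quick way to locate that turning point on your own machine is to time the same loop serially and in parallel; a minimal sketch (compile with -fopenmp; the arithmetic and the size n are arbitrary placeholders):

    #include <cstdio>
    #include <vector>
    #include <omp.h>

    int main() {
        // Shrink n toward ~10,000 and the thread overhead starts to win.
        const long long n = 10000000;
        std::vector<double> a(n, 1.0), b(n, 2.0), c(n);

        double t0 = omp_get_wtime();
        for (long long i = 0; i < n; ++i)
            c[i] = a[i] * b[i] + a[i];
        double serial = omp_get_wtime() - t0;

        t0 = omp_get_wtime();
        #pragma omp parallel for
        for (long long i = 0; i < n; ++i)
            c[i] = a[i] * b[i] + a[i];
        double parallel = omp_get_wtime() - t0;

        std::printf("serial %.3f ms, parallel %.3f ms\n",
                    serial * 1e3, parallel * 1e3);
    }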
Image processing would be one good example. I used it when running SIFT and SURF on two consecutive images. I find it useful when you need to do heavy mathematical calculations, especially on matrices.