Why do books on concurrent programming always ignore data parallelism? [closed] - concurrency

There has been a significant shift towards data-parallel programming via systems like OpenCL and CUDA over the last few years, and yet books published even within the last six months never even mention the topic.
It's not suitable for every problem, but it seems that there is a significant gap here that isn't being addressed.

First off, I'll point out that concurrent programming is not necessarily synonymous with parallel programming. Concurrent programming is about constructing applications from loosely-coupled tasks. For instance, a dialog window could have interactions with each control implemented as a separate task. Parallel programming, on the other hand, is explicitly about spreading the solution of some computational task across more than a single piece of execution hardware, essentially always for performance reasons of some sort (note: even too little RAM is a performance reason when the alternative is swapping).
So, I have to ask in return: What books are you referring to? Are they about concurrent programming (I have a few of these, there's a lot of interesting theory there), or about parallel programming?
If they really are about parallel programming, I'll make a few observations:
CUDA is a rapidly moving target, and has been since its release. A book written about it today would be half-obsolete by the time it made it into print.
OpenCL's standard was released just under a year ago, and stable implementations only came out over the last 8 months or so. There simply hasn't been enough time to get a book written yet, let alone revised and published.
OpenMP is covered in at least a few of the parallel programming textbooks that I've used. Up to version 2 (v3 was just released), it was essentially all about data-parallel programming.
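For anyone unfamiliar with the term, here is a minimal sketch of what "data parallel" means in the OpenMP sense: not from any particular book, just the canonical parallel-for idiom (compile with, e.g., g++ -fopenmp).

```cpp
#include <cstdio>
#include <vector>

int main() {
    const int n = 1000000;
    std::vector<double> a(n), b(n), c(n);
    for (int i = 0; i < n; ++i) { a[i] = i; b[i] = 2.0 * i; }

    // The same operation is applied independently to every element;
    // OpenMP splits the iteration space across the available threads.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];

    std::printf("c[42] = %f\n", c[42]);
    return 0;
}
```

The defining property is that each iteration is independent, so the runtime is free to divide the work across however many cores are available.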

I think those working with parallel computing academically today usually come from the cluster computing field. OpenCL and CUDA use graphics processors, which have more or less inadvertently evolved into general-purpose processors along with the development of more advanced graphics rendering algorithms.
However, the graphics people and the high-performance computing people have been "discovering" each other for some time now, and a lot of research is being done on using GPUs for general-purpose computing.

"always" is a bit strong; there are resources out there (example) that include data parallelism topics.

The classic book "The Connection Machine" by Hillis was all data parallelism. It's one of my favorites.

Risks of maintaining/sustaining two code sets, one for CPU one for GPU, that need to perform very similar functions [closed]

This is a bad title, but hopefully my description is clearer. I am managing a modeling and simulation application that is decades old. For the longest time we have been interested in writing some of the code to run on GPUs because we believe it will speed up the simulations (yes, we are very behind the times). We finally have the opportunity to do this (i.e. money), and so now we want to make sure we understand the consequences, specifically for sustaining the code. The problem is that since many of our users do not have high-end GPUs (at the moment), we would still need our code to support both normal processing and GPU processing (i.e. I believe we will now have two sets of code performing very similar operations). Has anyone been through this, and do you have any lessons learned and/or advice to share? If it helps, our current application is developed in C++ and we are looking at going with NVIDIA, writing the GPU code in CUDA.
This is similar to maintaining a hand-crafted assembly version with vectorization or other specialized instructions alongside a C/C++ version. There is a lot of long-term experience with doing that out there, and this advice is based on it. (My experience with the GPU case is both shorter-term (a few years) and smaller (a few cases).)
You will want to write unit tests.
The unit tests use the CPU implementations (because I have yet to find a situation where they are not simpler) to test the GPU implementations.
The test runs a few simulations/models, and asserts that the results are identical if possible. These run nightly, and/or with every change to the code base as part of the acceptance suite.
This ensures that neither code base goes "stale", since both are constantly exercised, and each of the two independent implementations actually helps with maintaining the other.
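To make that concrete, here is a minimal sketch of such a cross-check. The saxpy functions are hypothetical stand-ins, not the poster's code; in a real code base the GPU variant would copy to the device, launch a CUDA kernel, and copy back behind the same interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <vector>

// Reference implementation: simple, obviously correct, easy to maintain.
void saxpy_cpu(float a, const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] += a * x[i];
}

// Placeholder for the optimized path, so the sketch runs as-is.
void saxpy_gpu(float a, const std::vector<float>& x, std::vector<float>& y) {
    saxpy_cpu(a, x, y);  // in reality: device copy + kernel launch
}

int main() {
    std::vector<float> x(1000), y_cpu(1000, 1.0f), y_gpu(1000, 1.0f);
    for (std::size_t i = 0; i < x.size(); ++i) x[i] = 0.001f * i;

    saxpy_cpu(2.0f, x, y_cpu);
    saxpy_gpu(2.0f, x, y_gpu);

    // GPU results are rarely bit-identical to the CPU's, so compare
    // within a tolerance instead of demanding exact equality.
    for (std::size_t i = 0; i < x.size(); ++i)
        assert(std::fabs(y_cpu[i] - y_gpu[i]) <=
               1e-5f * std::fabs(y_cpu[i]) + 1e-6f);
    std::printf("CPU and GPU implementations agree\n");
    return 0;
}
```

Note the tolerance-based comparison: "identical if possible" is the goal, but floating-point results on a GPU often differ in the last bits, so a relative tolerance keeps the test meaningful without being brittle.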
Another approach is to run blended solutions. Sometimes running a mix of CPU and GPU is faster than one or the other, even if they are both solving the same problem.
When you have to switch technology (say, to a new GPU language, or to a distributed network of devices, or whatever new whiz-bang that shows up in the next 20 years), the "simpler" CPU implementation will be a life saver.

Do programmers actually draw flow charts? [closed]

I'm in first-year computer science at university, and for my current assignment I need to write a program with an algorithm, a hand trace, and a flow chart. I kind of get the usefulness of the algorithm and the hand trace, but to me it seems like the flowchart is a monumental waste of time, especially since the program took me about an hour to write and was super simple. I was just wondering if real programmers actually use flow charts for help.
Flow charts are one of many ways I'll work through my planning of a program or script; but I won't necessarily use it every time. Pick the right tool for the situation - sometimes it's pseudocode, sometimes it's a flowchart, sometimes it's prose or simple wireframes/schematics. Often, it's a combination.
Once you get to any application or system that's of non-trivial size, some sort of visualization or planning process that doesn't involve code on a display becomes immensely helpful. It also aids in communicating your ideas to users, testers and other developers - you won't be building your application in a vacuum.
Back from about 1980 to 2000 I used to "draw" a lot of "Chapin" charts: feed a PL/I sort of syntax into a program and it printed a form of flow chart on 14"-wide fanfold paper. These were incredibly useful, and we used them extensively in code reviews.
But then they did away with paper, and the charts are not nearly as useful if you can't set them on your desk and mark on them, and even if you print them out on letter-sized paper instead of fanfold you lose a lot of effectiveness because you can't easily stretch them out to see 3-4 pages at a time.
Similarly, code reviews are less effective now that code is reviewed online rather than on fanfold listings. It's harder to follow the flow of a long method, and there's no good way to make notes, highlight or underline lines, etc.
Using this technology I once did the design for an incredibly complex algorithm internal to a database system. The detailed flow charts ran, I'm thinking, about 1000 lines, and got down to the details of variable names, etc. We reviewed that intensively. Then I was transferred to another area and an (admittedly bright) new hire was given the task of actually coding the algorithm. It came to about 4000 lines, IIRC. There were 3 minor bugs found in testing.
A lot has been lost in the past 15-20 years, due to "technological advancement".

What are the advantages of extending Python with C or C++? [closed]

Recently I've read that we can write modules in C/C++ and call them from Python. I know that C/C++ is fast and strongly typed and all those things, but what advantages do I get if I write some module and then call it from Python? In what case/scenario/context would it be worthwhile to do this?
Thanks in advance.
Performance. That's why NumPy is so fast ("The NumPy array: a structure for efficient numerical computation").
If you need to access a system library that doesn't have a wrapper in Python (example: Shapely wraps libgeos to do geometrical computations), or if you're writing a wrapper around a system library.
If you have a performance bottleneck in a function that needs to be made a lot faster (and can benefit from using C). Like Charlie said, profiling is essential to find out whether you want to do this or not.
Profile your application. If it really is spending time in a couple of places that you can recode in C consider doing so. Don't do this unless profiling tells you you really need to.
Another reason is that there might be a C/C++ library with functionality not available in Python. You might write a Python extension in C/C++ so that you can access and use that C/C++ library.
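For reference, the mechanics of such an extension are small. Below is a minimal sketch of a CPython extension module; the module and function names are made up for illustration.

```cpp
// fastmath.cpp -- a minimal CPython extension (hypothetical names).
// Build with, e.g.:
//   c++ -shared -fPIC $(python3-config --includes) fastmath.cpp -o fastmath.so
#include <Python.h>

// Sum the squares of the first n integers in a tight compiled loop.
static PyObject* sum_squares(PyObject* self, PyObject* args) {
    long n;
    if (!PyArg_ParseTuple(args, "l", &n))
        return NULL;
    long long total = 0;
    for (long i = 0; i < n; ++i)
        total += (long long)i * i;
    return PyLong_FromLongLong(total);
}

static PyMethodDef methods[] = {
    {"sum_squares", sum_squares, METH_VARARGS, "Sum of squares of 0..n-1."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef moduledef = {
    PyModuleDef_HEAD_INIT, "fastmath", NULL, -1, methods
};

PyMODINIT_FUNC PyInit_fastmath(void) {
    return PyModule_Create(&moduledef);
}
```

Once built, Python code can simply import fastmath and call fastmath.sum_squares(10**7), with the hot loop running at compiled-code speed. In practice you'd more likely reach for a binding tool like pybind11 or Cython rather than the raw C API.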
The primary advantage I see is speed. That's the price paid for the generality and flexibility of a dynamic language like Python. The execution model of the language doesn't match the execution model of the processor, by a wide margin, so there must be a translation layer someplace at runtime.
There are significant sections of work in many applications that can be encapsulated as strongly-typed functions. FFTs, convolutions, matrix operations and other types of array processing are good examples where a tightly-coded compiled loop can outperform a pure Python solution more than enough to pay for the runtime data copying and "environment switching" overhead.
There is also the case for "C as a portable assembler" to gain access to hardware functions for accelerating these functions. The Python library may have a high-level interface that depends on driver code that's not available widely enough to be part of the Python execution model. Multimedia hardware for video and audio processing, and array processing instructions on the CPU or GPU are examples.
The costs for a custom hybrid solution are in development and maintenance. There is added complexity in design, coding, building, debugging and deployment of the application. You also need expertise on-staff for two different programming languages.

Recommend a good resource for approaches to concurrent programming? [closed]

I've read various bits and pieces on concurrency, but was hoping to find a single resource that details and compares the various approaches. Ideally taking in threads, co-routines, message passing, actors, futures... and whatever else might be new that I don't know about yet!
I'd prefer a lay coder's guide to something overtly theoretical/mathematical.
Thank you.
I recommend An Introduction to Parallel Programming by Pacheco. It's clearly written, and a good intro to parallel programming.
If you don't care about something being tied to a language, then Java Concurrency in Practice is a great resource.
Oracle's online tutorial is free, but probably a bit more succinct than what you're looking for.
That being said, the best teacher for concurrency is probably experience. I'd try to get some practice, myself. Start out by making a simulation of the Dining Philosophers problem. It's a classic.
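If it helps to see why the exercise is a classic, here is a minimal C++ sketch (my own illustration, requires C++17 and -pthread). The interesting part is acquiring two forks at once without deadlocking; doing it with naive one-at-a-time locking is where the famous deadlock comes from.

```cpp
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    const int N = 5;                       // philosophers and forks
    std::vector<std::mutex> forks(N);
    std::vector<std::thread> philosophers;

    for (int id = 0; id < N; ++id)
        philosophers.emplace_back([&, id] {
            int left = id, right = (id + 1) % N;
            for (int meal = 0; meal < 3; ++meal) {
                // scoped_lock acquires both forks deadlock-free, which
                // is exactly the pitfall this classic problem teaches.
                std::scoped_lock both(forks[left], forks[right]);
                std::printf("philosopher %d eats meal %d\n", id, meal);
                std::this_thread::sleep_for(std::chrono::milliseconds(10));
            }
        });

    for (auto& p : philosophers) p.join();
    return 0;
}
```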
First, let's see if you're interested in the topic or not. To grasp the big picture about concurrency, the best practice is to take a look at operating systems books, like Operating Systems: Internals and Design Principles by Stallings or Modern Operating Systems by Tanenbaum. They can give you an intuition about what this is all about.
There's also an old book named Concurrent Programming by Ben-Ari. If you can find it, it can be helpful.
Besides reading textbooks, it's good to get your hands dirty by writing some concurrent programs. Python is a very good choice if you want to start using threads. Every Python book has a part dedicated to this topic. Also, with a simple search on the web you can find a lot of resources about it, but I give these two higher preference:
Multithreaded Programming (POSIX pthreads Tutorial), a very comprehensive introduction to concurrency and multi-threading. It's mainly about C multi-threading.
The other one is Thread Synchronization Mechanisms in Python.
Now if you still find yourself interested in concurrent programming, it's time to go deeper. You have the basic knowledge of concurrency; the best approach at this point is to start solving problems and becoming familiar with patterns. To achieve this goal, you can use The Little Book of Semaphores. It's one of the best books in the field, and it's also free. This is a book that can head you toward becoming proficient at concurrent programming.
These should be enough if you want to approach concurrent programming, but if you have enough time and you're eager, it's good to take a look at some other paradigms of concurrent programming, like the actors used in Erlang. I'd say it's worth reading some chapters of the book Seven Languages in Seven Weeks, especially the chapters about Erlang and Io. At first glance it might seem hard and strange, but it's good to become familiar with other solutions to concurrency.
I would recommend Concepts, Techniques, and Models of Computer Programming by Peter Van Roy and Seif Haridi. All the major programming techniques have follow-up sections on concurrent programming. The authors also start with the basics and define the main concurrent programming concepts and their shortcomings, all accompanied by examples in the Oz programming language.
If you happen to be a C# programmer or maybe someday, I highly recommend this tutorial series by Michal Habalcik.

Getting started with parallel programming [closed]

So it looks like multicore and all its associated complications are here to stay. I am planning a software project that will definitely benefit from parallelism. The problem is that I have very little experience writing concurrent software. I studied it at university and understand the concepts and theory very well, but have zero hands-on experience building software to run on multiple processors since school.
So my question is, what is the best way to get started with multiprocessor programming?
I am mostly familiar with Linux development in C/C++, and Obj-C on Mac OS X, with almost zero Windows experience. Also, my planned software project will require FFTs and probably floating-point comparisons of a lot of data.
There are OpenCL, OpenMP, MPI, POSIX threads, etc. What technologies should I start with?
Here are a couple of stack options I am considering, though I'm not sure if they will let me experiment in working towards my goal:
Should I get Snow Leopard and try to get OpenCL Obj-C programs to run on the ATI X1600 GPU in my laptop? or
Should I get a PlayStation and try writing C code to throw across its six available Cell SPE cores? or
Should I build out a Linux box with an Nvidia card and try working with CUDA?
Thanks in advance for your help.
I'd suggest going with OpenMP and MPI initially; I'm not sure it matters which you choose first, but you definitely ought to want (in my opinion :-) ) to learn both the shared and distributed memory approaches to parallel computing.
I suggest avoiding OpenCL, CUDA, POSIX threads, at first: get a good grounding in the basics of parallel applications before you start to wrestle with the sub-structure. For example, it's much easier to learn to use broadcast communications in MPI than it is to program them in threads.
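To illustrate that point, here is about the smallest possible MPI broadcast (my own sketch, assuming a working MPI installation): one collective call distributes a value to every process, versus the hand-rolled signalling you'd need to write with raw threads.

```cpp
// broadcast.cpp -- build and run with, e.g.:
//   mpic++ broadcast.cpp -o broadcast && mpirun -np 4 ./broadcast
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = 0;
    if (rank == 0) value = 42;   // only the root process knows the value...

    // ...until one collective call shares it with every process.
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    std::printf("rank %d sees value %d\n", rank, value);
    MPI_Finalize();
    return 0;
}
```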
I'd stick with C/C++ on your Mac since you are already familiar with them, and there are good open-source OpenMP and MPI libraries for that platform and those languages.
And, for some of us it's a big plus: whatever you learn about C/C++ and MPI (to a lesser extent it's true of OpenMP too) will serve you well when you graduate to real supercomputers.
All subjective and argumentative, so ignore this if you wish.
If you're interested in parallelism in OS X, make sure to check out Grand Central Dispatch, especially since the tech has been open-sourced and may soon see much wider adoption.
The traditional and imperative "shared state with locks" isn't your only choice. Rich Hickey, the creator of Clojure (a Lisp-1 for the JVM), makes a very compelling argument against shared state. He basically argues that it's almost impossible to get right. You may want to read up on message passing à la Erlang actors, or on STM libraries.
You should Learn You Some Erlang. For great good.
You don't need special hardware like graphics cards and Cells to do parallel programming. Your simple multi-core CPU will also profit from parallel programming. If you have experience with C/C++ and Objective-C, start with one of those and learn to use threads. Start with simple examples like matrix multiplication (see the sketch below) or maze solving, and you'll learn about those pesky problems (parallel software is non-deterministic and full of Heisenbugs).
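As an illustration of the matrix multiplication starter, here is my own sketch with std::thread (compile with -pthread). Each thread takes an interleaved subset of rows, so every output element has exactly one writer and there are no data races.

```cpp
#include <algorithm>
#include <cstdio>
#include <thread>
#include <vector>

// Row-interleaved threaded multiply of n x n matrices stored row-major.
void matmul(const std::vector<double>& A, const std::vector<double>& B,
            std::vector<double>& C, int n) {
    unsigned nthreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t)
        workers.emplace_back([&, t] {
            for (int i = (int)t; i < n; i += (int)nthreads)  // this thread's rows
                for (int j = 0; j < n; ++j) {
                    double sum = 0.0;
                    for (int k = 0; k < n; ++k)
                        sum += A[i * n + k] * B[k * n + j];
                    C[i * n + j] = sum;
                }
        });
    for (auto& w : workers) w.join();
}

int main() {
    const int n = 256;
    std::vector<double> A(n * n, 1.0), B(n * n, 2.0), C(n * n);
    matmul(A, B, C, n);
    std::printf("C[0] = %f (expect %f)\n", C[0], 2.0 * n);
    return 0;
}
```

Maze solving, by contrast, typically involves a shared work queue, which is exactly where the non-determinism and Heisenbugs start to show up.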
If you want to go into massive multiparallelism, I'd choose OpenCL as it's the most portable. CUDA still has a larger community, more documentation and examples, and is a bit easier, but you'd need an Nvidia card.
Maybe your problem is suitable for the MapReduce paradigm. It automatically takes care of load balancing and concurrency issues, and the research paper from Google is already a classic. There is a single-machine implementation called Mars that runs on GPUs; this may work fine for you. There is also Phoenix, which runs MapReduce on multicore and symmetric multiprocessor machines.
I would start with MPI, as you learn how to deal with distributed memory. Pacheco's book is an oldie but a goodie, and MPI now runs fine out of the box on OS X, giving pretty good multicore performance.