While the computing power of a single processing unit has steadily kept increasing, one cannot ignore the impenetrable wall that we are fast approaching. This leaves us with the option of stringing together multiple computers to take on computational tasks that are too heavy to be performed on a single machine. Multi-core processing and GPU processing are fast becoming the gold standard for performance optimization in scientific computing. Parallel computing has taken a great leap forward, thanks in part to the gaming industry, and today we use technologies such as OpenMP, MPI, and CUDA to solve, in a matter of hours, large engineering problems that would usually take days.

The notion of parallelism stems from the idea of the division of labor, which is very natural to the human psyche. Yet while humans can easily conceive the idea of distributed computing, algorithm development and implementation can get ridiculously tedious once one brings parallelism into the picture. Sequential programming does not entail the intricate job of managing data flow and distributing work among processes; a lot of the memory management is taken care of by the compiler. Parallel algorithms, by contrast, can get really complex depending on the data structures one uses. Distributed memory models such as the Message Passing Model are exemplary in this regard: programmers are forced to consider higher levels of topological and algorithmic abstraction to utilize standard parallel programming paradigms effectively. This makes getting a parallel program to produce correct results consistently, with a short turnaround time, a great challenge. Rookie programmers often fall into the dangerous habit of "programming on the go": approaching programming as a mere writing exercise, starting to code without giving due consideration to the algorithm in its entirety.
Big-Oh notation is a simple and powerful way to express how the running time of a particular algorithm depends on the size of the input. When you say that a particular algorithm runs in O(N^2) time, you mean that the number of steps the algorithm takes is proportional to the input size squared. Or, in mathematical terms, there is some fixed constant C such that, to process input of size N, the algorithm needs at most C x N^2 steps.

One interesting question is how to define a "step". The step could be a clock cycle, a hardware instruction, or an expression in source code. The beauty of the Big-Oh notation is that any somewhat reasonable definition of a step will do: if an algorithm takes O(N) steps according to one definition of a "step", it takes O(N) steps according to all reasonable definitions. While this is great, you have probably already heard it a thousand times.

"Why is parallel programming so hard?" - Everyone who's used MPI

But what about the complexity of parallel programs? A major departure from the sequential case is that the number of steps an algorithm takes and the actual real-world time may not be proportional: in the parallel case, we may have potentially many cores executing the computation steps! In fact, the number of computational cores is a new variable that we need to take into account. There are several ways to incorporate this variable into the Big-Oh notation, and I am going to describe them one by one.

Method 1: "the algorithm runs in O(N) time in parallel"

The crudest approach is simply to assume that there will be enough processors available on the machine, regardless of what "enough" means. Even this approximation tells us something deep about a particular algorithm. If the algorithm runs in O(N) time sequentially but in O(log N) in parallel, that means that the algorithm parallelizes well. On the other hand, if the algorithm runs in O(N) both sequentially and in parallel, then a large number of cores will probably not make the algorithm run much faster.