Saturday, October 3, 2009

Fork/Join Parallelism

The first question that comes to mind is: why do we need a framework for parallel computing? Doesn't the operating system, in conjunction with the hardware, already provide this support for us? In the case of Java Fork/Join, doesn't the JVM provide some parallelization oomph? Well, the article explicitly states that this framework is intended for programs that execute subtasks in parallel, and for that kind of task parallelism pthreads and the java.lang.Thread class fall short. It's worth restating their deficiencies:

1. Thread synchronization and management are generalized to accommodate many different kinds of parallel computing, so they block threads unnecessarily in task-parallel programs.
2. Tasks are so coarse-grained that this "limits opportunities for exploiting parallelism".

But aren't there automated tools with configurable parameters that one can use for parallelization? Looking into this led me to the following passage from Wikipedia:

"Automatic parallelization by compilers or tools is very difficult due to the following reasons:

* dependence analysis is hard for code using indirect addressing, pointers, recursion, and indirect function calls;
* loops have an unknown number of iterations;
* accesses to global resources are difficult to coordinate in terms of memory allocation, I/O, and shared variables."
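
To make the first two bullets concrete, here is a small sketch in Java. The class and method names (DependenceDemo, scale, histogram, prefixSum) are my own, made up for illustration, and are not from the Wikipedia article or the Fork/Join paper. The first loop is the easy case, while the other two show why a tool cannot simply split the iterations across threads.

```java
class DependenceDemo {
    // Easy case: iteration i touches only in[i] and out[i], so the
    // iterations are independent and could run in any order.
    static int[] scale(int[] in, int factor) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[i] * factor;
        }
        return out;
    }

    // Indirect addressing: the element written is only known at run time.
    // If idx contains duplicates, two iterations update the same counter,
    // so running them concurrently without coordination would race.
    static int[] histogram(int[] idx, int buckets) {
        int[] counts = new int[buckets];
        for (int i = 0; i < idx.length; i++) {
            counts[idx[i]]++;
        }
        return counts;
    }

    // Loop-carried dependence: iteration i reads the value that
    // iteration i - 1 just wrote, so the order of iterations matters.
    static int[] prefixSum(int[] in) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[i] + (i > 0 ? out[i - 1] : 0);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(scale(new int[] {1, 2, 3}, 2)));
        System.out.println(java.util.Arrays.toString(histogram(new int[] {0, 2, 2, 1, 2}, 3)));
        System.out.println(java.util.Arrays.toString(prefixSum(new int[] {1, 2, 3})));
    }
}
```

A programmer who knows that idx never repeats, or who is willing to restructure the prefix sum, can parallelize these by hand. That knowledge is exactly what an automatic tool usually lacks.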

For these reasons, we have the POSIX Threads library, Cilk, OpenCL, and CUDA, among other programming models. And why Java? It makes fork/join programs more portable, since they can run on any machine that has a JVM. And one will not have to worry about porting the fork/join framework, since it is slated for inclusion in Java SE 7, as listed in http://www.javaworld.com/community/node/3458. According to http://www.javaworld.com/javaworld/jw-12-2008/jw-12-year-in-review-2.html?page=4, there is a ParallelArray class that represents an array on which you can perform operations such as filter, map, and apply on the data items in parallel. The ForkJoinPool class will provide the threads to perform these concurrent operations and "can automatically take advantage of increasing processor-core counts in the future without modifying the functions". This should be particularly useful on 16-32 core machines, since SE 6 only functioned well on 4-8 core machines.
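
As a rough idea of what a fork/join program looks like, here is a minimal sketch that sums an array by recursively splitting it in half. It assumes the java.util.concurrent class names ForkJoinPool and RecursiveTask; the preview builds put these classes in a separate jsr166y package instead. The names ForkJoinSum, SumTask, and parallelSum, and the THRESHOLD value, are my own choices for illustration and are not taken from the article.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSum {
    // Hypothetical cutoff: ranges at or below this size are summed sequentially.
    static final int THRESHOLD = 1_000;

    static class SumTask extends RecursiveTask<Long> {
        private final long[] data;
        private final int lo; // inclusive start of the range
        private final int hi; // exclusive end of the range

        SumTask(long[] data, int lo, int hi) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= THRESHOLD) {
                // Small enough: do the work directly.
                long sum = 0;
                for (int i = lo; i < hi; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = lo + (hi - lo) / 2;
            SumTask left = new SumTask(data, lo, mid);
            SumTask right = new SumTask(data, mid, hi);
            left.fork();                     // schedule the left half asynchronously
            long rightSum = right.compute(); // work on the right half in this thread
            return left.join() + rightSum;   // wait for the left half and combine
        }
    }

    static long parallelSum(long[] data) {
        // The default constructor sizes the pool to the available processors.
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data)); // sum of 1..100000 = 5000050000
    }
}
```

Nothing in SumTask mentions the number of cores. The pool decides how many worker threads to run, which is the sense in which the same code can benefit from more cores later.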
