Opinion: Multiple Processor Computing Challenges Go Beyond Purely Technical Issues
In his position statement for the International Computer Music Conference 2008, Peter Van Roy raises a number of issues related to the emergence of multi-core processors and loosely coupled systems (e.g. the Internet), the two driving forces behind the rising trend of multiple processor computing. Though the challenges posed by each form of concurrent computing are very different in nature, in both cases they go beyond purely technical issues.
With regard to programming multi-core processors, i.e. processors combining two or more processing elements that “share the interconnect to the rest of the system and often share on-chip cache memory”, Peter Van Roy believes that the challenge is purely sociological. Dataflow programming is “a simple, natural, and powerful approach for programming these machines”, while “deterministic concurrency, that has no race conditions, is as easy to program as sequential programs, and can exploit parallel processors as a bonus.” Hence, the challenge is not to find new technical solutions but rather “to educate programmers” to use existing ones effectively, following the example of Google, whose MapReduce framework builds on dataflow ideas.
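The article does not include code, but the dataflow idea behind MapReduce can be sketched in Python (the function names and toy input below are illustrative, not from Van Roy's paper): each map task may run concurrently, yet because the merge step is commutative and associative, the result is deterministic regardless of how the tasks are scheduled.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_phase(chunk):
    # Map: count words in one chunk of the input.
    return Counter(chunk.split())

def word_count(chunks):
    # The map tasks run in parallel; merging Counters is commutative
    # and associative, so the final result is the same no matter the
    # scheduling order -- there are no race conditions to reason about.
    with ThreadPoolExecutor() as pool:
        partials = pool.map(map_phase, chunks)
        return reduce(lambda a, b: a + b, partials, Counter())

chunks = ["to be or not to be", "to thine own self be true"]
counts = word_count(chunks)
print(counts["to"], counts["be"])  # 'to' and 'be' each appear 3 times
```

This is only a sketch of the dataflow style; a real MapReduce system adds partitioning, distribution, and fault tolerance on top of the same idea.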
Responding to this on Lambda the Ultimate, Marijn Haverbeke stresses, however, that even though “there are a lot of techniques available and documented” for multi-core concurrency, “specific situations or performance requirements often make it necessary to go beyond existing techniques, since none of them […] is general, composable, efficient, and silver-bullety enough to solve all problems.”
Peter Van Roy goes on to discuss the challenges of programming loosely coupled systems, which he believes to be more significant. These systems, consisting “of a set of processors and a network connecting them”, face the issues inherent to distributed systems:
- lack of global knowledge, i.e. the fact that no single agent has a view of the system as a whole and each can learn about other agents’ actions only through messaging;
- partial failure, i.e. the fact that an agent may leave the system or start behaving erratically.
Van Roy believes that those are “low-level technical problems” that can be overcome by using appropriate algorithms “such as clock synchronization, distributed snapshots, and fault tolerance”.
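As a rough illustration of coping with partial failure (a toy sketch, not one of the specific algorithms Van Roy cites), a heartbeat-based failure detector suspects peers that have gone silent for too long:

```python
import time

class HeartbeatFailureDetector:
    """Toy failure detector: a peer that has not sent a heartbeat
    within `timeout` seconds is suspected of having failed."""

    def __init__(self, timeout=2.0):
        self.timeout = timeout
        self.last_seen = {}      # peer id -> time of last heartbeat

    def heartbeat(self, peer, now=None):
        # Record that we heard from this peer.
        self.last_seen[peer] = now if now is not None else time.time()

    def suspected(self, now=None):
        # Peers silent for longer than the timeout are suspected.
        now = now if now is not None else time.time()
        return [p for p, t in self.last_seen.items()
                if now - t > self.timeout]

detector = HeartbeatFailureDetector(timeout=2.0)
detector.heartbeat("node-1", now=0.0)
detector.heartbeat("node-2", now=1.5)
print(detector.suspected(now=3.0))  # ['node-1']
```

Such detectors can only ever suspect failure, not confirm it, which is precisely why agreement in distributed systems needs the more involved algorithms Van Roy mentions.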
However, loosely coupled systems also face issues at a higher level. The first is the problem of conflicting goals, which appears in peer-to-peer file sharing and is often referred to as the “freeloader problem”. To solve it, the system should be designed so that “agent's goals overlap with the overall system's goals” and freeloaders are discouraged. Van Roy gives the example of BitTorrent protocols and tools that help achieve this goal. The other issue is emergent behavior, i.e. behavior that is not exhibited by any of the system’s parts but is shown by the system as a whole. According to Peter Van Roy, this should rather be treated as an opportunity that appropriate tools and design can help exploit. Google’s PageRank algorithm, for instance, extracts from Web pages information that helps to assess their usefulness, correctness, and popularity, qualities that would be hard to evaluate if each Web page were taken in isolation.
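To make the PageRank example concrete, here is a minimal power-iteration sketch over a hypothetical four-page link graph (the graph and parameter values are illustrative, not from the paper): a page's rank emerges from the whole graph, not from any page in isolation.

```python
def pagerank(links, damping=0.85, iterations=50):
    # links: page -> list of pages it links to (a toy web graph
    # with no dangling pages, i.e. every page has outgoing links).
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Each page keeps a base share and receives a damped share
        # of the rank of every page that links to it.
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            share = rank[p] / len(outs)
            for q in outs:
                new[q] += damping * share
        rank = new
    return rank

# Toy graph: B, C, and D all link to A, so A's high rank is an
# emergent property of the link structure.
graph = {"A": ["B"], "B": ["A"], "C": ["A"], "D": ["A"]}
ranks = pagerank(graph)
```

The example illustrates the emergent-behavior point: no single page "knows" that A is important; the importance only appears when the graph is analyzed as a whole.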
Addressing the design of loosely coupled systems, Peter Van Roy advocates a decentralized architecture in which “each computing node is by default independent of all the others [and] contains the whole application and works even if there is no communication whatsoever between nodes [but] can use information from other nodes when it is available”. He believes that this kind of architecture is instrumental in addressing the issues raised by system distribution and provides some examples of tools built that way.
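The design principle can be sketched as follows (a toy illustration with a hypothetical `Node` class, not one of the tools Van Roy cites): each node holds the whole application state and works with no communication at all, but opportunistically merges peer data when it becomes available.

```python
class Node:
    """A node that is independent by default: it holds the whole
    application state and keeps working with no peer communication."""

    def __init__(self):
        self.state = {}          # key -> (version, value)

    def write(self, key, value):
        # Local writes always succeed; no coordination required.
        version, _ = self.state.get(key, (0, None))
        self.state[key] = (version + 1, value)

    def read(self, key):
        return self.state.get(key, (0, None))[1]

    def merge(self, peer_state):
        # Opportunistic sync: when peer data is available, keep the
        # newer version of each entry; otherwise nothing changes.
        for key, (version, value) in peer_state.items():
            if version > self.state.get(key, (0, None))[0]:
                self.state[key] = (version, value)

a, b = Node(), Node()
a.write("greeting", "hello")   # a works with no communication at all
b.merge(a.state)               # b uses a's information when available
```

The version-based merge here is deliberately simplistic (last-writer-wins); real decentralized tools need more careful conflict resolution, but the default-independence shape is the same.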
According to Scott Johnson, who reacts to Van Roy’s paper on Lambda the Ultimate, “a bigger distinction should be made between 'loosely coupled' systems, and 'widely distributed' systems” because “in the second case, one must seek to guard against attacks of all sorts”, since one cannot assume “the absence of malicious entities, a single administrative domain, as well as more desirable technical parameters (bandwidth, latency).” He then suggests a more detailed classification of systems, comparing them along a number of characteristics, e.g. deterministic scheduling, topology, migration, partial failure, global consensus (global knowledge), bandwidth, and latency.
Anatole Tresch Mar 03, 2015