Andreas Olofsson reviews the history of processors, outlines some of the challenges ahead, and introduces the Parallella project, which aims to speed up the transition to massively parallel computing.
Andreas Olofsson (@adapteva) founded Adapteva in 2008. Prior to Adapteva, he was a key contributor to a number of successful products at Analog Devices, including the groundbreaking TigerSHARC DSP architecture. The TigerSHARC was the first processor to enable software-programmable 3G and WiMAX base station platforms and was the most energy-efficient floating-point microprocessor at that time.
Code Mesh London is an annual conference dedicated to non-mainstream languages and technologies. In 2013 it featured over 50 talks from experts in languages, libraries, operating systems and technologies that handle the programming and business challenges of today. Languages discussed include Haskell, Clojure, Erlang, Elixir, Rust, Go and Julia.
A chip does not need to be either sequential or massively parallel. It could also be a tree-like structure of nodes (each with memory and a CPU), where each node may have some degree of parallelism of its own.
It seems to me that the world around us, our software and our networks often have a tree-like structure (fractal or not), so why not hardware?
The chip could have the shape of a sphere, with the top-level processor at the center.
I/O could be done at any level, but in practice inputs at low levels and outputs at high levels could be a common pattern.
Programming could look something like this (using Java syntax):
// Creates a thread running on (cores of) processor of current thread,
// and located in corresponding RAM (as objects created from that thread).
// Main thread would run on top-level processor.
// That would make current Java code still usable.
// Number of cores available in the processor this thread runs on.
// Number of processors that are children of the one this thread runs on.
// Creates a thread running on (cores of) child processor
// of specified index, in [0 .. availableChildProcessors-1].
Threads and objects of a processor could only interact (using references, seeing volatile writes, etc.) with threads and objects of parent and child processors, not of grandparents or grandchildren.
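To make the idea more concrete, here is a minimal sketch that simulates such a processor tree in ordinary Java. Everything in it is an assumption made for illustration: the class ProcessorNode and its methods (availableCores, availableChildProcessors, child, newThread, canInteractWith) are hypothetical names, not an existing JDK or hardware API. Placement on a processor is only recorded in a ThreadLocal, and the sketch assumes that threads may also interact with objects on their own processor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one processor in the tree; not a real API.
final class ProcessorNode {
    // Records which (simulated) processor the current thread was placed on.
    private static final ThreadLocal<ProcessorNode> CURRENT = new ThreadLocal<>();

    private final ProcessorNode parent;   // null for the top-level processor
    private final int cores;              // cores inside this processor
    private final List<ProcessorNode> children = new ArrayList<>();

    ProcessorNode(ProcessorNode parent, int cores) {
        this.parent = parent;
        this.cores = cores;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    static ProcessorNode current() { return CURRENT.get(); }

    int availableCores() { return cores; }

    int availableChildProcessors() { return children.size(); }

    // Child processor of the given index, in [0 .. availableChildProcessors()-1].
    ProcessorNode child(int index) { return children.get(index); }

    // Creates (but does not start) a thread "placed" on this processor.
    Thread newThread(Runnable task) {
        return new Thread(() -> {
            CURRENT.set(this);
            task.run();
        });
    }

    // Interaction rule: same processor (assumed), direct parent or direct child only.
    boolean canInteractWith(ProcessorNode other) {
        if (other == null) {
            return false;
        }
        return other == this || other == parent || other.parent == this;
    }

    public static void main(String[] args) throws InterruptedException {
        ProcessorNode root = new ProcessorNode(null, 4);
        ProcessorNode child = new ProcessorNode(root, 2);
        ProcessorNode grandchild = new ProcessorNode(child, 1);

        // Run a task on child processor 0 of the top-level processor.
        Thread t = root.child(0).newThread(() ->
                System.out.println("cores here: " + current().availableCores()));
        t.start();
        t.join();

        System.out.println(root.canInteractWith(child));      // true
        System.out.println(root.canInteractWith(grandchild)); // false
    }
}
```

On real hardware the rule in canInteractWith would have to be enforced by the memory system or the runtime rather than checked by hand. The sketch only shows that the rule itself is a cheap adjacency test on the tree.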