Questions for an Enterprise Architect
Erik Dörnenburg answers: What is Enterprise and Evolutionary Architecture?, discussing 4 issues: Turning strategy into execution, Ensuring conformance, Where do the architects sit? Buying or building?
Posted by Werner Schuster on May 23, 2007
As you know, YARV support native thread. It means that you can run each Ruby thread on each native thread concurrently.
It doesn't mean that every Ruby thread runs in parallel. YARV has global VM lock (global interpreter lock) which only one running Ruby thread has. This decision maybe makes us happy because we can run most of the extensions written in C without any modifications.
This means: no matter how many cores or CPUs are available, only one Ruby thread will be able to run at any given time. There are workarounds and native extensions can handle the Global Interpreter Lock (GIL) in more flexible ways, for instance, release it before starting a long running operation. Sasada Koichi explains the API available for releasing the GIL:
You must release Giant VM Lock before doing blocking task. If you need do this in extension libraries, use rb_thread_blocking_region() API.
rb_thread_blocking_region(
    blocking_func, /* function that will block */
    data,          /* this will be passed to the above function */
    unblock_func   /* if another thread causes an exception with Thread#raise,
                      this function is called to unblock, or NULL */
)
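The effect of giving up the lock around a blocking call can also be observed from plain Ruby, without writing a C extension. The following is a minimal sketch and not part of the quoted post: it assumes an interpreter that lets other threads run while one thread is blocked in `sleep`, and the helper name `timed_blocking_threads` is hypothetical.

```ruby
# Runs `count` threads that each block for `seconds`, and returns the
# wall-clock time the whole batch took plus the value of each thread.
def timed_blocking_threads(count, seconds)
  started = Time.now
  threads = Array.new(count) do |i|
    Thread.new do
      sleep(seconds)  # blocking call: other threads may run while this one waits
      i * 2           # becomes the thread's value
    end
  end
  values = threads.map(&:value)  # Thread#value waits for the thread to finish
  [Time.now - started, values]
end

elapsed, values = timed_blocking_threads(4, 0.2)
puts "values: #{values.inspect}"
puts format("elapsed: %.2fs", elapsed)
```

If the waits overlap, the elapsed time stays close to the duration of a single wait instead of growing with the number of threads. A CPU-bound loop in place of `sleep` would not be expected to show the same overlap under a global lock.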
Nevertheless, you're right the GIL is not as bad as you would initially think: you just have to undo the brainwashing you got from Windows and Java proponents who seem to consider threads as the only way to approach concurrent activities.
Just because Java was once aimed at a set-top box OS that didn't support multiple address spaces, and just because process creation in Windows used to be slow as a dog, doesn't mean that multiple processes (with judicious use of IPC) aren't a much better approach to writing apps for multi-CPU boxes than threads.
Just Say No to the combined evils of locking, deadlocks, lock granularity, livelocks, nondeterminism and race conditions.
The benefits of preemptively scheduled threads that share an address space have long been debated. Unix was, for the longest time, single threaded or user space threaded. Parallelism was implemented with multiple processes which communicated via different means of InterProcess Communication (IPC), such as pipes, FIFOs, or explicitly shared memory regions. This was supported by the fork syscall, which made it cheap to duplicate a running process.
[...] if we have multiple VM instance on a process, these VMs can be run in parallel. I'll work on that theme in the near future (as my research topic).
[...] if there are many many problems on native threads, I'll implement green thread. As you know, it's has some benefit against native thread (lightweight thread creation, etc). It will be lovely hack (FYI. my graduation thesis is to implement userlevel thread library on our specific SMT CPU).
This indicates that a userspace (green) threads version of Ruby is not off the table, particularly in light of implementation problems of threading systems on different OSes, such as this one:
Programming on native thread has own difficulty. For example, on MacOSX, exec() doesn't work (cause exception) if other threads are running (one of portability problem). If we find critical problems on native thread, I will make green thread version on trunk (YARV).
Why is there a need for Sasada Koichi's Multiple VM (MVM) solution? Running multiple Ruby interpreters and having them communicate via IPC methods (e.g. sockets) is possible today as well. However, it comes with a host of problems:
x = Thread.new{
p "hello"
}
Or this Erlang sample:
Pid_x = spawn(Node, Mod, Func, Args)
This Erlang code spawns a new lightweight process, and indeed: this is all the code that's needed. All the setup code is taken care of, with none of the problems explained above.
Pid_x ! a_message
This sends a simple message to the process whose pid is stored in Pid_x. The message can consist of various types, for instance atoms, Erlang's version of Ruby's symbols.
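For comparison, here is roughly what the multi-process route looks like in Ruby today. This is a minimal sketch under stated assumptions, not code from any of the quoted authors: it assumes a Unix-like platform where `fork` is available, and the helper name `spawn_worker` and the "got ..." reply format are made up for illustration.

```ruby
# Forks a child process and wires up two pipes: one carrying messages
# to the child, one carrying replies back. Returns [pid, inbox, replies].
def spawn_worker
  to_child_r, to_child_w = IO.pipe
  from_child_r, from_child_w = IO.pipe

  pid = fork do
    # Child: close the pipe ends that belong to the parent.
    to_child_w.close
    from_child_r.close
    # Answer every incoming line until the parent closes its end (EOF).
    to_child_r.each_line { |line| from_child_w.puts("got #{line.chomp}") }
    from_child_w.close
    exit!(0)  # leave immediately, skipping the parent's at_exit handlers
  end

  # Parent: close the pipe ends that belong to the child.
  to_child_r.close
  from_child_w.close
  [pid, to_child_w, from_child_r]
end

pid, inbox, replies = spawn_worker
inbox.puts "a_message"   # rough analogue of sending a message to a pid
inbox.close              # EOF tells the child there is nothing more to read
answer = replies.read    # blocks until the child closes its reply pipe
Process.wait(pid)
puts answer
```

Most of the lines are plumbing: creating the pipes, closing the unused ends on each side, and signalling end-of-input. That setup is the part the Erlang one-liners hide from the programmer.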
I like Guido's ideas, and it certainly would work for concurrency in the large.
However, if you look at the micro sort of parallelism that Fortress has, or that Haskell can do (where even small operations are parallel), IPC is really not suitable for that (think of a multicore CPU, for instance).
Ruby is a nice high level language, I would like to see threads come in in a high level way (if possible) not just do what other languages do for the sake of completeness.
If they are going to use multiple cores, the chance of visibility issues is going to increase as well. It took a long time before Java's memory model was fixed (and a lot of very smart people worked on it). How are these issues going to be tackled in Ruby?
Now here's a level of technical coverage I wouldn't have expected from infoq. Good thing though, these things are important when considering the future of Ruby (including IronRuby and JRuby) for enterprise development. Good work!
I'll second what Stefan said.
Me too. Great article!
Notably missing from the listed Ruby implementations is Rubinius, which as I understand it will support a wide variety of threading models, including green threads and the Erlang lightweight-process model. MenTaLguY is, I believe, pushing this forward, and considering porting his work over to some of the other implementations, too.
Hanging around in #rubinius while MenTaLguY is around will probably yield more enlightenment.
This seems to be one of *the* most comprehensive pieces of coverage of threading in Ruby and the latest trends/research.
But the problem I see is that there is no clear line of thought developing, and no research heading somewhere concrete.
Although Ruby MVM is posed as the best option available, issues with that approach are still mentioned. Also, any successful work on it is hard to find.
It seems threading in Ruby will remain experimental in nature and may only improve with the advent of some other language that has a solid implementation of the threading experiments done on the Ruby side.
Antonio: yes, you're right. Some Rubinius coverage would be good, especially now that Evan Phoenix is paid for his Rubinius work. The current state of Rubinius threading seems to be userspace threads, with some more ideas/concepts on the horizon.
Ismail: Ruby's threading is not experimental in Ruby 1.8.x and JRuby 1.0: one has userspace threads, the other has kernel threads. The implementations are solid and their issues are known.
Ruby 1.9.x is a bit of a wildcard right now, but that's alright since it's not a release yet, actually I think it's not even an 'alpha' or something like that.
If you want a "solid implementation", then just use the existing Ruby 1.8.x or JRuby. If you want a "new" language with a mature multiprogramming story, take a peek at Erlang. Erlang has been used for some highly scalable and rock solid apps.
Also: as for using only "solid implementations" ... Java's memory model (which is crucial for threading) was broken for the first 10 years of its existence (it was fixed in 1.5), yet this didn't have a bad impact on the success of Java or its applications.
Interesting times are ahead...
I find this topic to be very interesting. Since Ruby 1.9, JRuby and Rubinius have a lot of leeway to experiment with new features like MxN threading or Erlang style multiprocess concurrency, I am confident good things are on the horizon. I think it is particularly important not to fall into the trap of thinking that native threads will be a concurrency panacea for Ruby (or any other language). Erlang definitely seems to have the right idea.
BTW, I wrote about this topic on my blog earlier this month: Are Native Threads Worth It?
"The Ruby process needs to exec a new Ruby interpreter, ..."
It doesn't seem like that is necessary if you just use fork()?
sRp
Oz/Mozart is another language with super lightweight internal threading, very much like Erlang's lightweight threading. They've also gone the route of no SMP, using multiple processes instead. At the moment, I think the main disadvantage of this is that it is slightly harder to take advantage of multiple processors/cores. For instance, instead of running a spawn for every element in a list and collecting the results, you'd need to include logic to distribute the spawns to different processes and possibly even worry about dealing with a single process crashing. Most of this could be solved by having a good set of libraries to make the default case easy. The advantage is that it should make it much easier in the future to take fuller advantage of things like NUMA/SUMA/SUMO without having to deal with memory locality issues and expensive barriers. Such libraries should also be beneficial for making some very closely networked machines (such as blades) function together.
sRp