Posted by Michael Hunger on Sep 08, 2010
Cloudant, the company behind CouchDB, just released the Java View Server for CouchDB. This means that Map-Reduce jobs can now be written not only in Erlang and interpreted languages such as JavaScript or Python, but also in JVM-based languages. The approach will be discussed at the CouchDB community meeting this week. Currently it can only be used on Cloudant's hosted BigCouch service.
The main advantage cited is the massive number of Java libraries available for all kinds of functionality that could be relevant in Map-Reduce tasks. The second is the greater reliability that comes with static typing (though that remains to be proven).
A performance comparison would be interesting, but so far no benchmark has been performed. Performance is expected to be lower than that of native Erlang views (Java and Erlang can be mixed within a view). There is some overhead due to JSON serialization and deserialization by the org.json library.
To use the Java-based Map-Reduce views, just implement a simple JavaView interface that offers callbacks for map, reduce and rereduce. For example, the following design document defines a simple view that aggregates word counts in configured JSON fields:
{
  "_id": "_design/splittext",
  "language": "java",
  "views": {
    "title": {"map": "{\"classname\":\"com.cloudant.javaviews.SplitText\",\"configure\":\"title\"}", "reduce": "com.cloudant.javaviews.SplitText"}
  }
}
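To make the three callbacks more concrete, here is a minimal sketch of what a word-count view could look like. The interface `JavaViewSketch` and the class `WordCountView` are hypothetical stand-ins written for this illustration; the article does not show the real JavaView signatures, which may well differ (for instance, they likely work with org.json types rather than plain Java collections).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the JavaView interface (assumed signatures).
interface JavaViewSketch {
    void configure(String config);                                  // e.g. the "configure" value from the design document
    List<Map.Entry<String, Integer>> map(Map<String, Object> doc);  // emits (key, value) rows for one document
    int reduce(List<Integer> values);                               // combines values emitted by map
    int rereduce(List<Integer> partials);                           // combines results of earlier reduce calls
}

// Counts words in one configured string field of each document.
class WordCountView implements JavaViewSketch {
    private String field = "title";

    public void configure(String config) {
        this.field = config;
    }

    public List<Map.Entry<String, Integer>> map(Map<String, Object> doc) {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>();
        Object value = doc.get(field);
        if (!(value instanceof String)) {
            return rows; // field missing or not a string: emit nothing
        }
        for (String word : ((String) value).toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                rows.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return rows;
    }

    public int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public int rereduce(List<Integer> partials) {
        // For a plain sum, combining partial results is the same operation.
        return reduce(partials);
    }
}

class WordCountDemo {
    public static void main(String[] args) {
        WordCountView view = new WordCountView();
        view.configure("title");
        Map<String, Object> doc = new HashMap<>();
        doc.put("title", "Hello, hello world");
        System.out.println(view.map(doc));
        System.out.println(view.rereduce(Arrays.asList(3, 2)));
    }
}
```

The separate rereduce callback matters because the database may reduce a large view in chunks and then combine the partial results; for a sum both steps happen to be identical, but for other aggregates they are not.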
InfoQ spoke with David Hardtke, the Director of Search at Cloudant, who is responsible for this project.
InfoQ: CouchDB runs on Erlang. How does this interact with JVM code? What were the implementation challenges?
David: The Java View Server, like all CouchDB view servers (except native Erlang), runs as an external process. There is a well-defined protocol for communication between CouchDB and view servers.
Normally, communication occurs via standard I/O, but we actually use the OtpErlang Java-Erlang library for performance reasons (it allows for multiple threads).
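For readers unfamiliar with the standard I/O variant David mentions, the following is a simplified sketch of a line-based view server loop: CouchDB writes one JSON array per line, with the command name as its first element, and the server answers with one JSON line. This is not Cloudant's implementation (which uses OtpErlang instead of standard I/O). The class `StdioViewServerSketch` is hypothetical, it avoids a JSON library by only extracting the command name, it handles just `reset`, `add_fun` and `map_doc`, and the error response shape is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified view server speaking a line-based protocol.
class StdioViewServerSketch {
    private final List<String> functions = new ArrayList<>();

    // Produces the single-line response for one incoming command line.
    String handle(String line) {
        switch (commandName(line)) {
            case "reset":
                functions.clear();   // forget all previously registered functions
                return "true";
            case "add_fun":
                functions.add(line); // a real server would parse and load the function here
                return "true";
            case "map_doc":
                // One result list per registered function; this sketch emits no rows.
                StringBuilder sb = new StringBuilder("[");
                for (int i = 0; i < functions.size(); i++) {
                    if (i > 0) {
                        sb.append(",");
                    }
                    sb.append("[]");
                }
                return sb.append("]").toString();
            default:
                // Assumed error shape, for illustration only.
                return "[\"error\",\"unknown_command\",\"command not supported by this sketch\"]";
        }
    }

    // Extracts the first quoted string of the line, i.e. the command name.
    static String commandName(String line) {
        int start = line.indexOf('"');
        int end = start < 0 ? -1 : line.indexOf('"', start + 1);
        return end < 0 ? "" : line.substring(start + 1, end);
    }

    public static void main(String[] args) throws IOException {
        StdioViewServerSketch server = new StdioViewServerSketch();
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(server.handle(line));
            System.out.flush(); // the caller waits for each response line
        }
    }
}
```

A loop like this handles one request at a time per process, which hints at why a library that allows multiple threads is attractive for performance.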
InfoQ: Any limitations on what code / libraries can be used in this context?
David: The main challenge was security, both at a System level and from a user data level. We are running this on a shared cluster. We use dynamic class loading to load user libraries. The class loader has a fairly tight security manager in place that restricts malicious calls. There is no FS access and a limited set of System calls allowed.
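The dynamic class loading David describes can be sketched roughly as follows. The class `UserClassLoading` and its methods are hypothetical names for this illustration, not part of Cloudant's code. The sketch only shows loading and instantiating a class by name from a user-supplied jar; it deliberately leaves out the security manager, which is the part that actually restricts what loaded code may do (note that the `SecurityManager` API has been deprecated in recent Java releases).

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical helper: loads a user-provided view class by name.
class UserClassLoading {

    // Builds a class loader that also sees classes from the given jar URL.
    static ClassLoader loaderFor(URL userJar) {
        return new URLClassLoader(new URL[] { userJar },
                UserClassLoading.class.getClassLoader());
    }

    // Loads the named class through the given loader and calls its no-arg constructor.
    static Object instantiate(ClassLoader loader, String className)
            throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className, true, loader);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Demo with a JDK class, so no user jar is needed.
        Object list = instantiate(ClassLoader.getSystemClassLoader(), "java.util.ArrayList");
        System.out.println(list.getClass().getName());
    }
}
```

In a design like this, a class name such as the `classname` entry in the design document above would be passed to `instantiate`, and the resulting object would then be cast to the view interface.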
The current architecture of the view server is quite simple: it just uses Java threading, driven by calls from the Erlang-based CouchDB instances. If the Java server fails, it is simply shut down and restarted. Interesting approaches for such a server would be the Scala-based Akka framework or Jetty's non-blocking requests. The Java View Server runs on any JVM.
Great potential lies in using Java.next languages such as Clojure, Scala or Groovy (among others) for this kind of work, as they are much more concise and powerful than Java in expressing such tasks. According to David, a Clojure-based view server is being developed by another party.
To evaluate the new Java View Server, you can use a free account available from Cloudant's site. Detailed instructions can be found in the couchjava GitHub repository.