Apache Tez is a new distributed execution framework targeted towards data-processing applications on Hadoop. But what exactly is it? How does it work? In the presentation “Apache Tez: Accelerating Hadoop Query Processing”, Bikas Saha and Arun Murthy discuss Tez’s design, highlight some of its features, and share initial results obtained by making Hive use Tez instead of MapReduce.
Apache Samza is a stream processor LinkedIn recently open-sourced. In his presentation, Samza: Real-time Stream Processing at LinkedIn, Chris Riccomini discusses Samza's feature set, how Samza integrates with YARN and Kafka, how it's used at LinkedIn, and what's next on the roadmap.
MetaModel, an Apache Incubator project, is a Java library used to browse, query, and update various types of data stores in a uniform and programmatic way, including traditional SQL databases, unusual stores such as CSV or Excel files, and the more modern NoSQL stores.
Apache Hadoop YARN, a new Hadoop resource manager, has just been promoted to a high-level Hadoop subproject. InfoQ had the chance to discuss YARN with Arun Murthy, founder of Hortonworks.
Citing a need to be able to respond faster to events, and disappointment in both feature set and timeframe for Java 7, the guardian.co.uk team is using Scala rather than Java for new projects.
Apache Avro, a new marshaling framework, provides a lot of interesting new features. In his new article, Boris Lublinsky takes it for a test drive and provides some suggestions on its proper usage.
"Tuscany SCA in Action" by Simon Laws, Mark Combellack, Raymond Feng, Haleh Mahbod and Simon Nash provides a simple step-by-step guide on how to develop applications leveraging SCA and Apache Tuscany.
Heshan Suriyaarachchi explores how Apache Axis2 can be extended to support JVM-based scripting languages, allowing them to be used both to expose web services and to write web service clients.
Scout is an extensible server and application monitoring service that focuses on ease of installation and configuration.
This article discusses the use of bindings on services and references (including the case of non-configured bindings) as the means to implement SCA communications in a Web and SOA environment.
When performance and speed are not an issue, SMTP and POP3 can be used to integrate applications that communicate with each other through a mail server.
InfoQ spoke to the lead developers of the most important open source Java web services stacks about their design goals, standards, data binding, XML, interoperability, REST support, and maturity.