Community the Focus at ApacheCon NA 2014
This year's ApacheCon North America conference saw key speakers focus on open source and its community. With more than 400 attendees, over 70 projects represented, and 180 conference sessions, it covered topics as diverse as the Apache Software Foundation projects themselves.
Roman Shaposhnik, senior manager of the Hadoop open source platform at Pivotal, commented on his blog about the profile of the attendees.
While there are many software conferences with developers, this conference has one of the highest concentrations of folks who actually write tons of code and lead sizable open source communities.
There were 10 different tracks, focused on a wide variety of technologies such as OpenOffice, Big Data, and httpd. In keeping with the Apache Way motto "Community over Code", there was also a dedicated community track, and, on top of that, rooms were booked throughout the conference to host hackathons, where anybody could swing by and join other attendees or ASF committers, as well as several meetups organized by ASF projects.
Jim Zemlin of the Linux Foundation presented a keynote about the role of foundations in open source, focusing on the social coding benefits, neutrality, openness, and fairness that foundations provide in the post-GitHub era, allowing resources and intellectual property to be collected and shared in a fair and open manner. Open source foundations are now filling the role that standards-developing organizations played for many years.
James Watters, Pivotal Software's head of product, presented on open source in the enterprise and corporate collaboration with open source initiatives, drawing on lessons learned from his experience with OpenSolaris and now at Pivotal. In his opinion, OpenSolaris failed because it became open 10 years too late, a mistake that they are trying to avoid at Pivotal, a company that has built a PaaS platform powered by Apache software and other open source projects, most of them under the Apache license, avoiding vendor lock-in and copyright issues. He also shares Zemlin's view that open source is the new standard.
Another keynote focused on companies and open source was given by Mark Hinkle, senior director of open source solutions at Citrix Systems, on the reasons behind donating CloudStack to the ASF: both wanted the same things, namely to promote the core functionality, build the Apache and company brands, and be charitable for the greater good, mixing corporate and community interests.
Hilary Mason, data scientist in residence at Accel Partners, talked about the past, present and future of data engineering. For her, Big Data is data that is useful, defined by its capability to solve human problems, not necessarily by its size. For instance, a Big Data problem is the deciphering of ancient languages, addressed by data scientists and linguists collaborating. Another example is the 1880 US census, which took seven years to compile until counting machines from the Tabulating Machine Company were applied to it. The Tabulating Machine Company eventually became the well-known IBM. People are now collecting more data than ever before, from all kinds of devices and sensors, and she hopes that in the future data engineering will be easier, with data systems that can be queried in natural language, as some early examples already demonstrate.
Jason Hibbets, project manager in corporate marketing at Red Hat and lead administrator for Opensource.com, covered open initiatives at the community, city and government levels. The open source way can be applied to other disciplines in order to improve the citizen experience in multiple areas. The Open Government Initiative at the federal government was started on President Obama's first day in office, and is based on the principles and philosophy of open source. It has been complemented over time; for instance, just last year the administration released an executive order mandating that any agency producing data should do so openly and in machine-readable formats. At the local level there are several other initiatives such as CityCamp, an unconference focused on innovation for municipal governments and community organizations, or Code for America, a non-profit organization facilitating the use of technology by residents and governments. The key elements that these movements have in common are the promotion of culture and participation, open government and open data policy, open source conferences and user groups, and the economic development resulting from these actions.
The Big Data track used two rooms for the entire event, focusing mainly on Apache Hadoop, the framework that allows for the distributed processing of large data sets across clusters of computers, and its vast ecosystem of related projects. There were talks about Big Data infrastructure, covering projects such as HBase, the scalable, distributed database for structured data, Hive, a data warehouse infrastructure, and Apache Mesos, a cluster manager that provides resource isolation and sharing across distributed applications. Mesos is used by Twitter, where it runs on tens of thousands of cores, and Airbnb, which runs all of its data infrastructure on it, processing petabytes of data. Mesos can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes.
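The core idea behind Mesos can be illustrated with a toy sketch in Python. All class and method names here are hypothetical, invented for illustration; this is not the real Mesos API. A master advertises free resources from a shared pool of nodes, and each application accepts only the share it needs, so several frameworks coexist on the same machines with their allocations kept separate.

```python
# Toy sketch of Mesos-style resource sharing (hypothetical names,
# NOT the real Mesos API): a master offers free capacity from a shared
# pool of nodes, and frameworks accept only the share they need.

class Node:
    def __init__(self, name, cpus, mem_gb):
        self.name, self.free_cpus, self.free_mem = name, cpus, mem_gb

class Master:
    def __init__(self, nodes):
        self.nodes = nodes

    def offer(self):
        """Advertise the currently free resources on every node."""
        return [(n, n.free_cpus, n.free_mem) for n in self.nodes if n.free_cpus > 0]

    def launch(self, node, cpus, mem_gb, framework):
        """Reserve the accepted share; the remainder stays offerable."""
        assert node.free_cpus >= cpus and node.free_mem >= mem_gb
        node.free_cpus -= cpus
        node.free_mem -= mem_gb
        return f"{framework} task on {node.name} ({cpus} cpus, {mem_gb} GB)"

master = Master([Node("node-1", cpus=8, mem_gb=32),
                 Node("node-2", cpus=8, mem_gb=32)])

# Two different applications end up sharing the same node:
offers = master.offer()
print(master.launch(offers[0][0], 4, 16, "hadoop"))  # Hadoop takes half of node-1
print(master.launch(offers[0][0], 2, 8, "spark"))    # Spark reuses node-1's remainder
print([(n.name, n.free_cpus) for n in master.nodes])
```

The real system adds the pieces this sketch omits, notably task isolation via Linux containers and a two-level scheduler in which each framework decides for itself which offers to accept.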
Other sessions covered projects for accessing Big Data efficiently. For instance, Apache Phoenix targets applications on Apache HBase that demand low latency, as opposed to the typical map-reduce, batch-oriented applications. It is the technology that supports big data at Salesforce.com and the first SQL query engine built specifically for HBase.
The Tomcat track covered Tomcat 8, the version that implements the latest Java server specifications, Servlet 3.1, JSP 2.3, EL 3.0 and WebSockets, targeting Java 7 and higher. Apache TomEE, built on top of Tomcat, adds the power of Java EE and is a Java EE 6 Web Profile certified version of Tomcat.
Apache httpd continues to be the most used HTTP server around the world, and had its own track to discuss recent performance enhancements, cloud functionality, and reverse proxy improvements in the latest release, as well as security, especially in the aftermath of Heartbleed.
The cloud track briefly covered Apache CloudStack, the Infrastructure as a Service (IaaS) cloud computing platform, as well as the libraries to manage cloud resources across different cloud providers through a unified API: Apache Libcloud for Python, and Apache jclouds for Java and Clojure developers. Co-located with and immediately following ApacheCon, the CloudStack Collaboration Conference took place, featuring three days of hackathons, keynotes and conference sessions about the Apache CloudStack project.
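The appeal of Libcloud and jclouds is that application code targets one interface while per-provider drivers handle each vendor's API. A minimal pure-Python sketch of that unified-driver idea follows; the class and method names are hypothetical, chosen for illustration, and do not reproduce the actual Libcloud API.

```python
# Sketch of the unified-driver pattern behind Apache Libcloud and
# jclouds (hypothetical classes, NOT the real library API): application
# code targets one interface, and per-provider drivers adapt to it.

from abc import ABC, abstractmethod

class NodeDriver(ABC):
    """The single API that application code programs against."""
    @abstractmethod
    def create_node(self, name): ...
    @abstractmethod
    def list_nodes(self): ...

class FakeEC2Driver(NodeDriver):
    def __init__(self):
        self.nodes = []
    def create_node(self, name):
        self.nodes.append(f"ec2:{name}")        # would call the EC2 API here
    def list_nodes(self):
        return list(self.nodes)

class FakeRackspaceDriver(NodeDriver):
    def __init__(self):
        self.nodes = []
    def create_node(self, name):
        self.nodes.append(f"rackspace:{name}")  # would call the Rackspace API here
    def list_nodes(self):
        return list(self.nodes)

def provision(driver: NodeDriver):
    """Provider-agnostic code: works unchanged with any driver."""
    driver.create_node("web-1")
    return driver.list_nodes()

print(provision(FakeEC2Driver()))        # ['ec2:web-1']
print(provision(FakeRackspaceDriver()))  # ['rackspace:web-1']
```

Swapping providers then means swapping the driver object, not rewriting the provisioning logic, which is precisely the lock-in avoidance these libraries aim for.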
The Apache Way
The conference took place in Denver, CO, and this was the first time it was co-organized with the Linux Foundation. Rich Bowen, executive vice president of the Apache Software Foundation and ApacheCon North America 2014 chair, told InfoQ about this joint event.
We have been producing ApacheCon on mostly volunteer effort for 15 years, and doing a fine job of it. But when it is not your day job, things get forgotten, deadlines get missed, and attendees do not always get the best possible experience. Going with the Linux Foundation gave us a great conference, with only a tiny fraction of the stress that has been involved in past years.
Between big projects such as httpd, Hadoop, Tomcat or CloudStack and smaller libraries such as log4j or Commons, it is likely impossible to find any developer not using some code from the ASF. The Apache Software Foundation just ran another ApacheCon with a vast number of projects involved, as diverse as the over 3,800 ASF committers, and is already preparing the next one, ApacheCon Europe, which will take place November 17-21 in Budapest, Hungary. The call for papers is already open, with several projects looking at doing stand-alone events co-located with ApacheCon, including CloudStack, Traffic Server, and Apache OpenOffice.
The videos of the sessions are available at the ASF YouTube channel.