Corona Improves Hadoop Scalability At Facebook
Facebook has open-sourced Corona, an improvement to Hadoop's MapReduce scheduling software that it developed in-house.
Corona separates two critical tasks: cluster management and job tracking. This is very similar in concept to Apache YARN, which is also an improved MapReduce scheduler and resource manager. Facebook Engineering has published an article explaining Corona and its background, including the reason for not reusing YARN:
It’s worth noting that we considered Apache YARN as a possible alternative to Corona. However, after investigating the use of YARN on top of our version of HDFS (a strong requirement due to our many petabytes of archived data) we found numerous incompatibilities that would be time-prohibitive and risky to fix. Also, it is unknown when YARN would be ready to work at Facebook-scale workloads.
One of the main differences in Facebook’s version of Hadoop is AvatarNode, which provides a hot standby for the NameNode. This makes the NameNode highly available and even allows software upgrades to happen without downtime. This is critical for the company, as it currently processes hundreds of petabytes of data in its data warehouse, with half a petabyte of new data coming in every day.
Corona can currently run MapReduce jobs, but Facebook intends to use it to schedule jobs from other types of applications as well, such as Peregrine.
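To make the split between cluster management and job tracking more concrete, here is a minimal Python sketch. It is not Corona's code or API: the class names `ClusterManager` and `JobTracker`, their methods, and the node and task names are all hypothetical, chosen only to illustrate the idea that one component hands out slots while a separate tracker per job owns that job's task state.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterManager:
    """Hypothetical: tracks free slots per node and knows nothing about job progress."""
    free_slots: dict[str, int]

    def grant(self, wanted: int) -> list[str]:
        """Hand out up to `wanted` slots, returning one node name per granted slot."""
        granted = []
        for node in sorted(self.free_slots):
            while self.free_slots[node] > 0 and len(granted) < wanted:
                self.free_slots[node] -= 1
                granted.append(node)
        return granted

    def release(self, node: str) -> None:
        """Return one slot on `node` to the free pool."""
        self.free_slots[node] += 1


@dataclass
class JobTracker:
    """Hypothetical: one tracker per job; it owns task state and requests slots."""
    job_id: str
    pending: list[str]
    running: dict[str, str] = field(default_factory=dict)  # task -> node
    done: list[str] = field(default_factory=list)

    def schedule(self, cm: ClusterManager) -> None:
        """Ask the cluster manager for slots and start as many pending tasks as granted."""
        for node in cm.grant(len(self.pending)):
            task = self.pending.pop(0)
            self.running[task] = node

    def finish(self, task: str, cm: ClusterManager) -> None:
        """Mark a task complete and give its slot back to the cluster manager."""
        node = self.running.pop(task)
        cm.release(node)
        self.done.append(task)


# Two jobs share one cluster manager; each job has its own tracker.
cm = ClusterManager({"node-a": 2, "node-b": 1})
job1 = JobTracker("job-1", ["m0", "m1", "m2"])
job2 = JobTracker("job-2", ["m0"])

job1.schedule(cm)      # job-1 takes all three free slots
job2.schedule(cm)      # nothing is free, so job-2's task stays pending
job1.finish("m0", cm)  # a slot on node-a is released
job2.schedule(cm)      # job-2 can now start its task
```

In this toy version the cluster manager never inspects task state, so adding a tracker for another kind of application would not require changing it, which is the property the article attributes to the separation.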
Ronny Kohavi Dec 12, 2013