Netflix Log4J Optimizations Yield Logging at Massive Scale
Blitz4j, Netflix’s internally optimized version of log4j, has been released on GitHub. Blitz4j efficiently generates logs in a massively concurrent, heavy-traffic environment while consuming fewer resources than more traditional logging technologies. It achieves this by overriding the sections of log4j’s code where synchronization contention and deadlocks may occur.
Netflix’s changes to log4j include:
- Replacing all critical synchronization with concurrent data structures.
- Providing extreme configurability of the in-memory buffer and worker threads.
- Isolating application threads more fully from logging threads by replacing the wait-notify model with an executor pool model.
- Handling log messages better during log storms, with a configurable summary.
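To make the buffer, executor pool, and log-storm points above concrete, here is a minimal Java sketch of the general technique. It is not Blitz4j's code or API: the class `AsyncLogSketch`, its constructor parameters, and the wording of the summary line are all hypothetical, chosen only for illustration. Application threads hand messages to a bounded in-memory buffer without blocking, a fixed pool of worker threads drains it, and messages that do not fit during a storm are counted and reported as a single summary line.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical illustration only; these names do not come from Blitz4j.
final class AsyncLogSketch {
    private final BlockingQueue<String> buffer;   // bounded in-memory buffer
    private final ExecutorService workers;        // logging threads
    private final Consumer<String> sink;          // where lines end up (file, console, ...)
    private final AtomicLong dropped = new AtomicLong();
    private volatile boolean running = true;

    AsyncLogSketch(int bufferSize, int workerThreads, Consumer<String> sink) {
        this.buffer = new LinkedBlockingQueue<>(bufferSize);
        this.workers = Executors.newFixedThreadPool(workerThreads);
        this.sink = sink;
        for (int i = 0; i < workerThreads; i++) {
            workers.submit(this::drain);
        }
    }

    // Called on application threads. offer() returns immediately, so a full
    // buffer costs the caller a counter increment instead of a wait.
    boolean log(String message) {
        boolean accepted = buffer.offer(message);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    long droppedCount() {
        return dropped.get();
    }

    // Runs on each worker thread until close() is called and the buffer is empty.
    private void drain() {
        try {
            while (running || !buffer.isEmpty()) {
                String message = buffer.poll(50, TimeUnit.MILLISECONDS);
                if (message != null) {
                    sink.accept(message);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Flushes what is buffered, stops the workers, then emits one summary
    // line if anything was dropped. Messages logged after close() are not written.
    void close() {
        running = false;
        workers.shutdown();
        try {
            workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        long count = dropped.get();
        if (count > 0) {
            sink.accept("summary: dropped " + count + " messages");
        }
    }

    public static void main(String[] args) {
        AsyncLogSketch log = new AsyncLogSketch(1024, 2, System.out::println);
        for (int i = 0; i < 5; i++) {
            log.log("request " + i + " handled");
        }
        log.close();
    }
}
```

The design choice worth noting is that the caller never waits: under load the sketch trades completeness of the log for application latency, which is the trade-off the list above describes. With more than one worker thread, the order in which lines reach the sink is not guaranteed.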
Netflix reports that, with Blitz4j, the cost of logging 300-500 lines per second has dropped by at least 75%, and the spikes of processor activity associated with synchronization have disappeared completely. Applications are now able to respond within an acceptable timeframe even during periods of heavy usage and heightened logging.
As its traffic and per-instance logging volume increased, Netflix noticed that log4j consumed more and more resources and slowed the very processes it was logging. The company was hesitant to move to a different logging technology such as LogBack because of its heavy investment in log4j. It instead chose to override log4j and customize it for non-blocking, asynchronous logging. The log4j framework is largely unchanged; only the areas that affected scalability were altered.
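As an illustration of what altering one such area can look like, the sketch below swaps a lock-guarded lookup for a concurrent data structure. It is an assumption-laden example, not code from log4j or Blitz4j: `LoggerRegistrySketch` and `NamedLogger` are hypothetical names, and a name-to-logger lookup is used only because it is a path that many threads hit at once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration only; these names do not come from log4j or Blitz4j.
final class LoggerRegistrySketch {

    static final class NamedLogger {
        final String name;

        NamedLogger(String name) {
            this.name = name;
        }
    }

    // A ConcurrentHashMap takes the place of a map guarded by one shared lock.
    private final ConcurrentMap<String, NamedLogger> loggers = new ConcurrentHashMap<>();

    NamedLogger getLogger(String name) {
        // Fast path: get() on a ConcurrentHashMap does not block.
        NamedLogger existing = loggers.get(name);
        if (existing != null) {
            return existing;
        }
        // Slow path: created at most once per name, even when threads race.
        return loggers.computeIfAbsent(name, NamedLogger::new);
    }

    int size() {
        return loggers.size();
    }

    public static void main(String[] args) {
        LoggerRegistrySketch registry = new LoggerRegistrySketch();
        NamedLogger first = registry.getLogger("com.example.Service");
        NamedLogger second = registry.getLogger("com.example.Service");
        System.out.println(first == second); // prints true
    }
}
```

Threads that look up a logger that already exists never contend with one another here, whereas a single shared lock would serialize every lookup.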
Netflix’s Karthikeyan Ranganathan acknowledges that Blitz4j may not be the best choice for projects just getting off the ground. LogBack, a product from the team that delivered log4j, addresses many of the concerns raised by the Netflix team. Projects that have no investment in the traditional log4j framework, or that have been built against slf4j, should therefore consider using LogBack over Blitz4j. But for companies with a significant investment in log4j, Blitz4j is a valid option to enable logging at internet scale.
Craig Motlin Sep 01, 2014