Rendering Large Logs in the Browser for GitHub Actions

Rendering large logs in a browser can be a complex task if you want a rich UI that includes coloring, grouping, search, and permalinks, says GitHub engineer Alberto Gimeno. This is why, after testing both a React-based library and a plain JavaScript library, GitHub engineers opted to build their own.

The main design decision GitHub engineers had to make was opting for virtualization. Without virtualization, Gimeno explains, the risk was too high that the UI could become sluggish or unresponsive. Virtualization is possible at two different levels: data virtualization and UI virtualization.

UI virtualization consists of displaying only a subset of the log entries and updating the view as the user scrolls through the page. Data virtualization is a more complex approach in which the browser holds only a portion of the data at any time and fetches the rest on demand.
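To make the idea of UI virtualization more concrete, here is a minimal TypeScript sketch that computes which log lines need to be in the DOM for a given scroll position. It is not GitHub's implementation: the function name `visibleRange`, its parameters, and the `overscan` margin are hypothetical, and it assumes every line has the same pixel height, which GitHub's log view does not.

```typescript
// Hypothetical sketch of UI virtualization, not GitHub's code.
// Simplifying assumption: all log lines share one fixed pixel height.
interface VisibleRange {
  start: number; // index of the first line to render (inclusive)
  end: number;   // index just past the last line to render (exclusive)
}

function visibleRange(
  scrollTop: number,      // pixels scrolled from the top of the log
  viewportHeight: number, // height of the scrollable viewport in pixels
  lineHeight: number,     // assumed fixed height of one log line in pixels
  totalLines: number,     // total number of lines in the log
  overscan: number = 5,   // extra lines rendered above and below the viewport
): VisibleRange {
  const first = Math.floor(scrollTop / lineHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / lineHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(totalLines, last + overscan),
  };
}
```

With a 600-pixel viewport and 20-pixel lines, `visibleRange(2000, 600, 20, 50000)` returns lines 95 to 135, so only 40 of the 50,000 lines would be present in the DOM at that scroll position.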

In their initial implementation, GitHub engineers tested first with a React-based library and then with a plain JavaScript library. Neither fared well. The React-based library did not support a key feature for GitHub, namely the ability to handle variable-height lines, which is needed to split long log lines. The plain JavaScript solution, on the other hand, seemed a better fit, since it supported all the desired features, but it underperformed in terms of user experience. As several commenters on Hacker News explained, that initial implementation made the log view almost useless. In particular, scrolling and finding information on the log page did not work well.

For those reasons, GitHub engineers decided to reimplement their log experience from scratch, and leveraged usage data to better define it.

Our data showed us that 99.51% of existing jobs had less than 50k lines, but we knew that browsers start struggling with more than 20k log lines. We also found that even if there is a low number of log lines, it was possible that it could take up too much space in memory. With all that information, we decided that we didn’t need data virtualization but we did still require UI virtualization.

The requirements for the log view included the ability to render at least 50k lines, including on mobile; to enable text selection without restrictions; to reduce memory usage; and, of course, to guarantee smooth behaviour, even when searching or jumping through the log list.

One key factor in GitHub's new implementation was to structure the DOM so as to reduce both the number of nodes to be rendered and the number of DOM mutations while scrolling. Making too many DOM mutations while scrolling had a negative impact, especially on the mobile experience, says Gimeno.

We came up with the idea of grouping log lines in clusters, so instead of removing and adding individual lines, we put log lines in clusters of N lines and add or remove clusters instead of individual lines. After some tests, we now have an idea of how many lines a cluster would have: 50 lines per cluster.
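The clustering idea can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions, not GitHub's library: the names `clustersForRange` and `diffClusters` are hypothetical, and the only value taken from the article is the cluster size of 50 lines. The sketch maps a range of visible lines to cluster indices and then works out which clusters to add to or remove from the DOM.

```typescript
// Hypothetical sketch of cluster-based rendering, not GitHub's code.
const LINES_PER_CLUSTER = 50; // cluster size reported in the article

// Indices of the clusters that overlap the line range [startLine, endLine).
function clustersForRange(
  startLine: number,
  endLine: number,
  size: number = LINES_PER_CLUSTER,
): number[] {
  if (endLine <= startLine) return [];
  const first = Math.floor(startLine / size);
  const last = Math.floor((endLine - 1) / size);
  const result: number[] = [];
  for (let c = first; c <= last; c++) result.push(c);
  return result;
}

// Compare the clusters currently in the DOM with the ones now wanted,
// so that whole clusters are added or removed instead of single lines.
function diffClusters(
  mounted: number[],
  wanted: number[],
): { add: number[]; remove: number[] } {
  return {
    add: wanted.filter((c) => mounted.indexOf(c) === -1),
    remove: mounted.filter((c) => wanted.indexOf(c) === -1),
  };
}
```

For example, if lines 95 to 135 should be visible, `clustersForRange(95, 135)` yields clusters 1 and 2. If clusters 0 and 1 are currently mounted, `diffClusters([0, 1], [1, 2])` asks for cluster 2 to be added and cluster 0 to be removed, which amounts to two DOM mutations rather than up to 100 per-line mutations.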

The whole process took a week to get to an initial implementation, and a few more to reach production quality. This approach made it possible for GitHub to have the log UI performance completely under control, stresses Gimeno.

At the time of writing, this library has not been open-sourced, and GitHub has not yet confirmed whether it plans to do so. We will update this post when we have additional information.
