Ensuring Product Quality at Google
James Whittaker, a former Microsoft architect, author of several books in the “How to Break Software” series, and currently Director of Test Engineering at Google, has written a series of posts on how Google does testing. Google blends development with testing, employs relatively few testers, and moves each product through successive channels before it is ready for prime time.
The quest for quality at Google follows a different path from the one most organizations take. Google does not have a large testing department; instead, much of the testing is assigned to the developers, according to Whittaker:
Testing and development go hand in hand. Code a little and test what you built. Then code some more and test some more. Better yet, plan the tests while you code or even before. Test isn’t a separate practice, it’s part and parcel of the development process itself. Quality is not equal to test; it is achieved by putting development and testing into a blender and mixing them until one is indistinguishable from the other.
That is because Google holds that quality is ensured through “an act of prevention” rather than one of detection:
Quality is a development issue, not a testing issue. To the extent that we are able to embed testing practice inside development, we have created a process that is hyper incremental where mistakes can be rolled back if any one increment turns out to be too buggy. We’ve not only prevented a lot of customer issues, we have greatly reduced the number of testers necessary to ensure the absence of recall-class bugs.
So, at Google, testers do not test in the traditional sense; they “make sure they [developers] have the automation infrastructure and enabling processes” to do testing. Developers are responsible for the quality of their code and do the testing it needs. This places the “quality burden where it belongs: on the developers who are responsible for getting the product right.” To implement its quality philosophy, Google has three types of engineers, according to Whittaker:
- The SWE or Software Engineer is the traditional developer role. SWEs write functional code that ships to users. They create design documentation, design data structures and overall architecture and spend the vast majority of their time writing and reviewing code. SWEs write a lot of test code including test driven design, unit tests and, as we explain in future posts, participate in the construction of small, medium and large tests. SWEs own quality for everything they touch whether they wrote it, fixed it or modified it.
- The SET or Software Engineer in Test is also a developer role except their focus is on testability. They review designs and look closely at code quality and risk. They refactor code to make it more testable. SETs write unit testing frameworks and automation. They are a partner in the SWE code base but are more concerned with increasing quality and test coverage than adding new features or increasing performance.
- The TE or Test Engineer is the exact reverse of the SET. It is a role that puts testing first and development second. Many Google TEs spend a good deal of their time writing code in the form of automation scripts and code that drives usage scenarios and even mimics a user. They also organize the testing work of SWEs and SETs, interpret test results and drive test execution, particularly in the late stages of a project as the push toward release intensifies. TEs are product experts, quality advisers and analyzers of risk.
In other words, SWEs are responsible for software features and their quality. SETs provide support code that enables SWEs to test product features. TEs perform a quick double check to catch any major bug that development missed, and they handle user testing as well as performance, security, and similar testing.
On the organizational level, Google has several Focus Areas: search, ads, apps, mobile, operating systems, etc. One such Focus Area is Engineering Productivity (EP), which comprises several “horizontal and vertical engineering disciplines”, Test being the biggest. EP consists of:
- Product Team – creates productivity tools for all engineers across Google, including open source ones, such as “code analyzers, IDEs, test case management systems, automated testing tools, build systems, source control systems, code review schedulers, bug databases.”
- Services Team – provides expertise in reliability, security, internationalization, etc. to any Googler on topics such as “tools, documentation, testing, release management, training”, and others.
- Embedded Engineers Team – these are testers loaned to various product teams across Google. They can choose to stay with a team for many years, but they are encouraged to switch teams in order to keep a “healthy balance between product knowledge and fresh eyes.” These testers work inside the product teams of various Focus Areas, but they report to EP management. The reason is to provide “a forum for testers to share information. Good testing ideas migrate easily within EP giving all testers, no matter their product ties, access to the best technology within the company.”
This approach to testing leads to a relatively low number of testers. This is possible because “we rarely attempt to ship a large set of features at once. In fact, the exact opposite is often the goal: build the core of a product and release it the moment it is useful to as large a crowd as feasible, then get their feedback and iterate,” according to Whittaker. Another element in the process of ensuring quality is the use of multiple channels. Whittaker gives the example of Chrome, which has had four different channels:
- Canary Channel – for code that is not yet ready to see the light
- Development Channel – this is the channel used by developers
- Test Channel – this prepares the code for beta
- Beta or Release Channel – this channel readies a product for use either inside Google or for the general public
When a bug is found after a product has been released, a test is written and verified against all channel builds to see if the bug has already been fixed in one of the channels.
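The cross-channel check described above can be pictured with a small sketch. This is only an illustration, not Google's tooling: the channel names, the `channels_with_bug` helper, and the toy `trim` builds are all hypothetical, and each "build" is stood in for by a plain Python function.

```python
from typing import Callable, Dict, List

# Hypothetical channel names, ordered from least to most stable.
CHANNELS = ["canary", "dev", "test", "beta"]


def channels_with_bug(
    builds: Dict[str, Callable[[str], str]],
    regression_test: Callable[[Callable[[str], str]], bool],
) -> List[str]:
    """Return the channels whose build still fails the regression test."""
    return [
        name
        for name in CHANNELS
        if name in builds and not regression_test(builds[name])
    ]


# Toy stand-ins for two versions of the same feature (assumed for illustration).
def buggy_trim(text: str) -> str:
    return text.lstrip()  # bug: trailing whitespace is left in place


def fixed_trim(text: str) -> str:
    return text.strip()


# Assumption: the fix has already landed in the two earliest channels.
builds = {
    "canary": fixed_trim,
    "dev": fixed_trim,
    "test": buggy_trim,
    "beta": buggy_trim,
}


def regression_test(trim: Callable[[str], str]) -> bool:
    """The test written for the reported bug; True means the bug is absent."""
    return trim("  hi  ") == "hi"


still_buggy = channels_with_bug(builds, regression_test)
print(still_buggy)  # ['test', 'beta']
```

In this toy setup the result shows that the fix is already present in the earlier channels and only needs to reach the later ones.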
In short, this is the process and the organization Google uses to test its products and to ensure code quality.