Nassim Kammah, an engineer at Etsy, explained to Velocity Conference Europe 2014 attendees how Etsy does continuous integration for mobile apps. Nassim focused his presentation on iOS apps and spent less time on Android.
The release process for mobile apps, especially on iOS, is very different from that of a web application. Etsy is well known for releasing upwards of 50 times a day on the web stack, but Apple’s review process takes around 5 days on average. Compounding this issue, mobile users, unlike web users, get to decide when to update their apps. The cost of a mobile app error is therefore greater than that of a web app error, because a fix takes much longer to reach users. To make matters more complex, the number of different devices makes for an overwhelming test matrix.
Etsy faces these challenges by reusing, as much as possible, the same continuous integration principles and tools that it applies to the web stack. It tries to do continuous releases, not to its customers, as that is not possible, but to its internal teams. It has a fully automated build, right up to the point where the binaries are handed over for Apple’s review. Development processes that reinforce confidence in the quality of the mobile apps complement this automation.
Each commit triggers a build on the mainline, with Jenkins assuming the role of the CI server. Etsy considers that, within its context, Apple’s solution to continuous integration, Xcode Server and bots, is not the most adequate. When building an app with Xcode, the code signing process is automagically handled by the IDE, but that is not acceptable for a continuous integration process. Apple’s provisioning profiles, required for code signing, are stored in the GitHub repo (Etsy uses GitHub Enterprise). During a build on an integration machine, they are passed to Shenzhen to build the .ipa files, which store the iOS apps. Shenzhen is part of Nomad-cli, a set of command line tools for iOS development.
The integration (or build) machine infrastructure is composed of 25 Mac minis provisioned with Chef, with the help of Homebrew and rbenv. Highlighting the difficulties of automating Mac OS X provisioning, Xcode installation still requires a manual click, so the process is not yet fully automated.
Etsy’s engineers strive to commit every day, as they do on the web stack. Given that each commit triggers a build on the mainline, the same techniques that are applied to the web applications are also applied to the mobile apps. Branch by abstraction, which allows for continuous releasing even during ongoing structural changes, and feature toggles (a.k.a. feature flags or config flags) allow for branching in code and continuous commits on the mainline. These techniques can lead to complicated code, but Nassim said that the team, on balance, prefers this solution over version control branching, because it keeps the principles of continuous integration intact. Even so, there are some rough edges to be smoothed. For instance, feature toggles are only checked at app startup, removing some of the flexibility web applications enjoy. TryLib, which allows developers to test their changes in the continuous integration infrastructure before committing them, is also used for the mobile apps code.
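To illustrate the startup-only limitation of feature toggles, here is a minimal Python sketch. It is not Etsy's implementation: the `FeatureToggles` class, the `new_checkout` flag and the `checkout_screen` function are hypothetical names chosen for the example. It assumes the flags are copied into an immutable snapshot when the app launches, so a flag flipped later is only seen after a restart.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class FeatureToggles:
    """Immutable snapshot of flags, taken once when the app starts (hypothetical)."""

    flags: Mapping[str, bool]

    @classmethod
    def load_at_startup(cls, remote_config: Mapping[str, bool]) -> "FeatureToggles":
        # Copy the values so later changes to remote_config are not observed.
        return cls(MappingProxyType(dict(remote_config)))

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so unfinished code stays dark.
        return self.flags.get(name, False)


def checkout_screen(toggles: FeatureToggles) -> str:
    # Both code paths live on the mainline; the flag picks one at run time.
    if toggles.is_enabled("new_checkout"):
        return "new checkout flow"
    return "legacy checkout flow"


# Hypothetical configuration as it looks when the app launches.
remote = {"new_checkout": False}
toggles = FeatureToggles.load_at_startup(remote)

# The flag is flipped while the app is running...
remote["new_checkout"] = True

# ...but the running app still uses its startup snapshot.
print(checkout_screen(toggles))
# Only a fresh launch picks up the new value.
print(checkout_screen(FeatureToggles.load_at_startup(remote)))
```

A web application can re-read its configuration on every request, which is the flexibility the mobile apps give up in this model.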
At their inception, the mobile apps were not developed with unit tests, and retrofitting tests to the existing code base was a challenge. While some teams solve this issue by assigning a developer each sprint to build up the unit test coverage, Etsy preferred a different approach: Testing Dojos. Six engineers gather in a room with one computer, one projector and a simple testing objective. Each developer has the keyboard for three minutes and then hands it over to the next, turning the exercise into a sharing and learning experience.
Functional testing of mobile apps poses some unique challenges, due to the variety of mobile devices. Functional tests are written with Calabash, a Cucumber-based automated acceptance testing framework for iOS and Android, and must then be exercised on simulators and real devices. Etsy chose to run simulators in-house, but it outsourced the testing on real devices to AppThwack, an online service that runs automated tests on real devices. Nassim told the audience that they choose the devices in their testing pool by looking at Google Analytics data on their users’ devices. Functional tests are flaky, mostly due to timeouts according to Nassim, so Etsy uses the concept of a rolling window pass rate to assess its level of confidence in the apps’ quality. A rolling window pass rate maps to a gradient of colors from red to green, as opposed to the binary choice between those two colors.
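The talk summary does not say how Etsy computes the rolling window pass rate, so the following Python sketch only shows one plausible reading of the idea. The `RollingPassRate` class, the window size and the linear red-to-green mapping are all assumptions made for illustration.

```python
from collections import deque


class RollingPassRate:
    """Pass rate over the most recent test runs (hypothetical helper)."""

    def __init__(self, window_size: int = 20) -> None:
        # deque with maxlen silently drops the oldest result when full.
        self.results = deque(maxlen=window_size)

    def record(self, passed: bool) -> None:
        self.results.append(passed)

    def rate(self) -> float:
        # Assumption: with no runs recorded yet, report 0.0 (no confidence).
        if not self.results:
            return 0.0
        return sum(self.results) / len(self.results)

    def color(self) -> str:
        # Assumed mapping: interpolate linearly from red (0%) to green (100%).
        r = self.rate()
        red = round(255 * (1 - r))
        green = round(255 * r)
        return f"#{red:02x}{green:02x}00"


window = RollingPassRate(window_size=4)
for outcome in (True, True, False, True):
    window.record(outcome)

print(window.rate(), window.color())
```

A single flaky timeout then shifts the indicator slightly toward red instead of flipping the whole build from green to red.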
Besides dogfooding the mobile apps by rolling out continuous releases to its internal teams, Etsy regularly performs two additional layers of manual quality assurance. One is “app rotations”: 8 volunteers gather in a room, accompanied by a QA facilitator and a mix of devices. The goal is to find as many bugs as possible in a predefined timebox. The other is the Bug Hunting tool, which allows the team to report bugs directly from the device, attaching screenshots to help identify the bug. There is a level of gamification involved, with a leaderboard of the most prolific bug hunters.
The Android stack is less mature, so Nassim did not present a detailed explanation of Etsy’s processes for that technology.