How Microsoft Uses the .NET Core SDK Telemetry

Microsoft has released the raw data sets from the telemetry collected from users of the .NET Core SDK. The released data sets cover the third quarter of 2016 through the second quarter of 2017 and provide an interesting look at how developers currently use the SDK. According to Microsoft's Rich Lander, Microsoft will publish the telemetry data sets quarterly from now on, and all data sets will be licensed under the Open Data Commons Attribution License.

This data reflects use of .NET Core from the command line, so Visual Studio users of .NET Core are not included. The .NET Core SDK 1.X collects the following data points:

  • The command being used (for example: build, restore, etc.).
  • The ExitCode of the command.
  • The test runner being used, for test projects.
  • The timestamp of invocation.
  • Whether runtime IDs are present in the runtimes node.
  • The CLI version being used.
  • Operating system version.

For the .NET Core SDK 2.X series, Lander says that the following additional data points will be collected. Notably, these include an anonymous yet unique ID for each machine running the .NET Core SDK from the command line:

  • dotnet command arguments and options — Determine more detailed product usage. For example, for dotnet new, collect the template name. For dotnet build --framework netstandard2.0, collect the framework specified. Only known arguments and options will be collected (not arbitrary strings).
  • Containers — Determine if the SDK is running in a container.  Helps Microsoft determine if more work to support containers would be useful.
  • Command duration — Determine how long a command runs. Useful to identify performance problems that should be investigated.
  • Target .NET framework(s) — Determine which target frameworks are used and whether multiple are specified. Useful to understand which .NET Standard versions are the most popular and what usage guidance is needed.
  • Hashed MAC address — Determine a cryptographically (SHA256) anonymous and unique ID for a machine. Useful to determine the aggregate number of machines that use .NET Core.  In response to user feedback, Lander says that this data point will not be released to the public.
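To illustrate the hashed MAC address data point, here is a minimal Python sketch of deriving a stable, non-reversible machine ID with SHA-256. The function name `anonymize_mac`, the normalization step, and the UTF-8 encoding are assumptions made for illustration; the article does not describe how the SDK prepares the address before hashing.

```python
import hashlib


def anonymize_mac(mac: str) -> str:
    """Return a SHA-256 hex digest of a MAC address.

    Hypothetical helper: the normalization below (strip separators,
    uppercase) is an assumption, not the SDK's documented behavior.
    """
    normalized = mac.replace(":", "").replace("-", "").upper()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The same machine always maps to the same 64-character ID,
    # which is what makes counting distinct machines possible.
    print(anonymize_mac("00:1A:2B:3C:4D:5E"))
```

Because the same input always yields the same digest, the value can be used to count distinct machines in aggregate without storing the address itself.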

It bears repeating that participation in the .NET Core SDK telemetry program is optional, but it takes an opt-out approach: if you would rather not participate, you have to set an environment variable (DOTNET_CLI_TELEMETRY_OPTOUT) to disable collection. Lander reiterates that telemetry is not part of the .NET Core runtime, so this data collection is only possible for .NET Core SDK users.
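As a sketch of what opting out might look like in a build script, the Python helper below builds an environment for a child process with the opt-out variable set. The helper name `telemetry_opt_out_env` is hypothetical, and the value `"1"` is an assumption, since the article names only the variable and not the value it expects.

```python
import os


def telemetry_opt_out_env(base=None):
    """Return a copy of an environment with the telemetry opt-out set.

    Hypothetical helper; the value "1" is an assumption, as the article
    only names the DOTNET_CLI_TELEMETRY_OPTOUT variable.
    """
    env = dict(os.environ if base is None else base)
    env["DOTNET_CLI_TELEMETRY_OPTOUT"] = "1"
    return env


# Example use (not executed here; requires the dotnet CLI on PATH):
#   import subprocess
#   subprocess.run(["dotnet", "build"], env=telemetry_opt_out_env())
```

Copying the environment rather than mutating `os.environ` keeps the opt-out scoped to the commands the script launches.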

Beyond the expanded set of collected data points, Microsoft’s .NET Core team is making some changes to the .NET Core 2 SDK based on what it has learned. First, there will be a single Linux build rather than a distinct build for each supported distribution (Red Hat, Debian, etc.). Next, macOS users should be pleased to know that OpenSSL will no longer be required. Finally, improvements (as yet undescribed) are being made to building .NET Core 2 from source, so that it may more easily be included in a Linux distribution’s package infrastructure.

Interestingly, the most popular command varies depending on the operating system:

  • OS X (macOS) – restore is the most popular
  • Linux – run is the most popular (by a significant margin, 11M events versus the second place restore command at 3M)
  • Windows – build is the most popular

While the .NET Core SDK itself does not log a user’s IP address, the servers at Microsoft that receive the telemetry do. The client IP is truncated to its first three octets, which allows Microsoft to track usage of the .NET Core SDK around the world. Broken down by operating system, Windows is the most popular among .NET Core SDK developers at 71%, followed by Linux at 18% and macOS at 11%.
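The 3-octet truncation of the client IP can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `truncate_ipv4` is hypothetical, and zeroing the final octet is one plausible reading of "truncated"; the article does not say how Microsoft's servers store the result.

```python
import ipaddress


def truncate_ipv4(ip: str) -> str:
    """Keep the first three octets of an IPv4 address, zeroing the last.

    Hypothetical helper: zeroing (rather than dropping) the final octet
    is an assumption about what a "3-octet IP" looks like in storage.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    network = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    print(truncate_ipv4("203.0.113.77"))  # prints 203.0.113.0
```

A truncated address still identifies a network block, which is enough for coarse geographic statistics while no longer pointing at a single host.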

Those interested in the currently available data sets can obtain them directly from Microsoft.  (Be aware that these are large files, ranging in size from 188-516 megabytes.)
