Gil Tene: Understanding Hardware Transactional Memory

In his presentation "Understanding Hardware Transactional Memory" at QCon New York 2016, Gil Tene introduces hardware transactional memory (HTM). Although the concept of HTM is not new, it is now finally available in commodity hardware. For example, starting in Q2 2016 every Intel-based server supports HTM. The basic purpose of HTM is to write to multiple memory addresses atomically, so that no inconsistencies can arise when cooperating with other threads.

Tene starts by explaining the four states a cache line can be in:

  • invalid
  • shared
  • exclusive
  • modified

and points out that with regard to hardware transactional memory, there are now two additional states:

  • line was read during speculation
  • line was modified during speculation

A transaction has to be aborted if another CPU wants to write to a line that was read or modified during speculation, if another CPU wants to read a line that was modified during speculation, or if the CPU itself evicts such a line from its cache.
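The abort rule can be written down as a small decision table. The following Java sketch is a toy model only: the type names `SpecState`, `CacheEvent` and `AbortRule` are made up for illustration, and the rule follows the simplified description in this article rather than the behavior of any specific processor.

```java
// Toy model of the abort rule described above.
// All names are hypothetical; this is not a hardware or JVM API.
enum SpecState { NOT_SPECULATIVE, READ_DURING_SPECULATION, MODIFIED_DURING_SPECULATION }

enum CacheEvent { REMOTE_READ, REMOTE_WRITE, SELF_EVICT }

final class AbortRule {
    private AbortRule() {}

    /** Returns true if the event on a line in the given state forces an abort. */
    static boolean mustAbort(SpecState state, CacheEvent event) {
        switch (state) {
            case NOT_SPECULATIVE:
                // The line is not part of the transaction, so nothing to undo.
                return false;
            case READ_DURING_SPECULATION:
                // Another reader is harmless; a remote write or an eviction is not.
                return event != CacheEvent.REMOTE_READ;
            case MODIFIED_DURING_SPECULATION:
                // Any remote access or eviction would expose or lose speculative data.
                return true;
            default:
                throw new IllegalArgumentException("unknown state: " + state);
        }
    }

    public static void main(String[] args) {
        for (SpecState s : SpecState.values()) {
            for (CacheEvent e : CacheEvent.values()) {
                System.out.println(s + " + " + e + " -> abort=" + mustAbort(s, e));
            }
        }
    }
}
```

Treating the eviction of a line that was only read as an abort is an assumption taken from the summary above; real hardware may be more lenient in that case.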

According to Tene, the big advantage of hardware transactional memory is that it gets rid of serialized blocks. The goal is to run fully in parallel and to roll back only when an actual collision occurs while accessing a data item. This is directly related to Amdahl's law, which states that the serial fraction of a program limits the speed-up that additional cores can deliver, so each extra core contributes less. If ten percent of an application is serialized, ten CPUs will provide a speed-up of little more than a factor of five. Even around 100 CPUs yield a speed-up of only about nine, because with ten percent serialization a factor of ten is the upper limit.
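The numbers follow from the usual form of Amdahl's law, speed-up = 1 / (s + (1 - s) / n), where s is the serial fraction and n the number of CPUs. A minimal Java sketch of that arithmetic (the class and method names are chosen here for illustration):

```java
// Amdahl's law for a fixed serial fraction; names are illustrative only.
final class Amdahl {
    private Amdahl() {}

    /** Predicted speed-up on `cpus` CPUs when `serialFraction` of the work cannot be parallelized. */
    static double speedup(double serialFraction, int cpus) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / cpus);
    }

    public static void main(String[] args) {
        double serialFraction = 0.10; // ten percent serialization, as in the example above
        for (int cpus : new int[] {1, 10, 100, 1000}) {
            System.out.printf("%4d CPUs -> speed-up %.2f%n", cpus, speedup(serialFraction, cpus));
        }
    }
}
```

With ten percent serialization this gives roughly 5.26 for 10 CPUs, 9.17 for 100 CPUs and 9.91 for 1000 CPUs; the value approaches 10 but never reaches it.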

Tene goes on to introduce the difference between lock contention and data contention. With a large hash set, for example, accesses may touch completely different areas, yet the whole hash set still needs to be locked. Lock contention is usually much higher than data contention, but only data contention is a real obstacle for CPUs working in parallel. Aborting transactions only in cases of data contention would therefore drastically reduce the impact of Amdahl's law and speed up parallel computing.
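The hash set case can be illustrated with a small Java sketch. `CoarseLockedSet` is a hypothetical class written for this article, not code from the talk: a single monitor guards the whole set, so two threads looking up unrelated keys contend for the lock even though they never touch the same data.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical example: one lock for the whole set.
final class CoarseLockedSet {
    private final Set<Integer> items = new HashSet<>();

    boolean add(int value) {
        synchronized (this) { // every caller serializes here
            return items.add(value);
        }
    }

    boolean contains(int value) {
        synchronized (this) { // lock contention, even for unrelated keys
            return items.contains(value);
        }
    }

    int size() {
        synchronized (this) {
            return items.size();
        }
    }

    /** Counts how many of the 1000 keys starting at `start` are present. */
    static int countHits(CoarseLockedSet set, int start) {
        int hits = 0;
        for (int i = start; i < start + 1_000; i++) {
            if (set.contains(i)) hits++;
        }
        return hits;
    }

    public static void main(String[] args) throws InterruptedException {
        CoarseLockedSet set = new CoarseLockedSet();
        for (int i = 0; i < 1_000; i++) {
            set.add(i);             // "low" key range
            set.add(1_000_000 + i); // "high" key range
        }
        int[] hits = new int[2];
        // The two readers look at disjoint key ranges: lock contention, no data contention.
        Thread low = new Thread(() -> hits[0] = countHits(set, 0));
        Thread high = new Thread(() -> hits[1] = countHits(set, 1_000_000));
        low.start();
        high.start();
        low.join();
        high.join();
        System.out.println(hits[0] + hits[1]);
    }
}
```

In this sketch the lookups are the interesting part: they only read, so a transactional execution of the two `synchronized` blocks would have no reason to abort.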

With regard to Java synchronized blocks, Tene explains that uncontended blocks would be executed as fast as before. Only when actual data contention occurs does the transaction need to be rolled back and the code run again. For Java applications, this would be completely transparent, and no code changes are needed as soon as the Java virtual machine uses HTM. This is the case for HotSpot 8 JVMs from update 40 on. Tene also shows simple benchmarks that illustrate the positive effect of hardware transactional memory: even with five percent writes to a hash set, throughput scales linearly as more CPUs are added.

Gil Tene concludes by pointing out that although using HTM is transparent for developers, they now need to start thinking about data contention in their applications. Multiple threads should not modify a single variable, because that leads to data contention and thus to a loss of speed-up, since the advantages of HTM can then no longer be leveraged.
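As a rough illustration of that advice, the following Java sketch contrasts a counter that every thread writes with per-thread slots. `HitCounters` and its methods are hypothetical names, and the padding assumes 64-byte cache lines, which is an assumption about the hardware rather than something stated in the talk.

```java
// Hypothetical example: a single shared variable versus one slot per thread.
final class HitCounters {
    // Assumption: 64-byte cache lines, so 8 longs keep the slots on separate lines.
    private static final int STRIDE = 8;

    private final Object lock = new Object();
    private long sharedHits;      // written by every thread: data contention
    private final long[] perThreadHits;

    HitCounters(int threads) {
        perThreadHits = new long[threads * STRIDE];
    }

    void recordShared() {
        synchronized (lock) {
            sharedHits++; // all threads modify the same variable
        }
    }

    void recordPerThread(int threadIndex) {
        synchronized (lock) {
            perThreadHits[threadIndex * STRIDE]++; // each thread modifies only its own slot
        }
    }

    long total() {
        synchronized (lock) {
            long sum = sharedHits;
            for (int i = 0; i < perThreadHits.length; i += STRIDE) {
                sum += perThreadHits[i];
            }
            return sum;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 4;
        HitCounters counters = new HitCounters(threads);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int index = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    counters.recordPerThread(index);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(counters.total());
    }
}
```

Both methods take the same lock, so with ordinary locking they behave alike. The difference only matters when the synchronized blocks run as hardware transactions: `recordShared` makes every pair of concurrent transactions collide, whereas `recordPerThread` leaves them without shared writes.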

Please note that most QCon presentations will be made available for free on InfoQ in the weeks after the conference and slides are available for download on the conference web site. You can also watch an interview with Gil Tene on this topic on InfoQ.
