Interview and Book Review: Programming Concurrency on the JVM
In his latest book, Programming Concurrency on the JVM, author Venkat Subramaniam covers concurrency techniques in different JVM programming languages such as Java, Clojure, Groovy, JRuby and Scala. He also discusses topics like Software Transactional Memory (STM) and actor-based concurrency. InfoQ spoke with Venkat about strategies and design approaches for programming concurrency on the JVM and about the hardware capabilities available for achieving concurrency.
InfoQ: Hi Venkat, can you introduce yourself to our readers and tell us what you are currently working on?
Venkat: I share a great deal of enthusiasm and passion for programming. I help my clients, through training, mentoring, and consulting, adopt agile practices and various technologies.
InfoQ: There are other books on the concurrency topic. How is this book different from other books?
Venkat: If your interest is totally focused on the JDK solutions to concurrency, then I suggest that you look no further than "Java Concurrency in Practice" by Brian Goetz et al. My objective in the "Programming Concurrency on the JVM" book is to take the readers well beyond the solutions offered directly in the JDK.
The concurrency solution in the JDK has quite a bit of complexity baked in. Without understanding the Java Memory Model, it is impossible to get it right. However, once we understand it, we realize that achieving a correct working solution with the JDK concurrency API is incredibly difficult. Thankfully, there are some really good, simple, yet quite powerful, options available for programming concurrency on the JVM. That is what I mainly focus on in the book.
InfoQ: Can you discuss the concurrency strategies and design approaches you covered in the book using programming with immutability and how it differs from the mutability based approach we're so used to?
Venkat: Mutability in itself is not a bad thing, and sharing is a good thing, but shared mutability is purely evil. Once multiple threads start sharing mutable variables, we have to deal with several issues. So we can say avoiding mutability is the easiest way to avoid these problems. However, immutability, where no variables ever change value, is easier said than done.
Programming with mutability is the way of life for us in Java applications. Most of us can't fathom the idea that applications can be programmed with pure immutability. Rather than looking at variables as mutable vs. immutable, if we look at the overall design, it becomes easier to comprehend. Rather than focusing our attention on mutation of state, if we think about state transformation, it gets easier to see how we can create applications where we compose a series of functions that take data from one state to another as a series of transformations rather than through a set of mutations.
In functional programming, immutability is the way of life. In Java, however, it is quite the opposite. On one hand, we can try to achieve this perfect immutable style. However, there is a reasonable middle ground: an approach of isolated mutability, where a variable is mutable but is never accessed by more than one thread at a time. This is the essence of actor-based concurrency, which is one of the approaches discussed in the book. In my opinion this is easier for most Java programmers to achieve than a design based on pure immutability.
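To make the idea of state transformation concrete, here is a minimal Java sketch. It is not taken from the book; the Balance type and the applyAll helper are hypothetical names chosen for illustration. Each deposit returns a new object instead of changing an existing one, so the starting state is still intact after the computation.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: Balance and applyAll are hypothetical names, not from the book.
final class StateTransformation {
    // Takes a starting state and returns the state reached after all deposits.
    // The local variable is rebound, but no Balance object is ever modified.
    static Balance applyAll(Balance start, List<Long> deposits) {
        Balance current = start;
        for (long amount : deposits) {
            current = current.deposit(amount);
        }
        return current;
    }

    public static void main(String[] args) {
        Balance start = new Balance(100);
        Balance end = applyAll(start, Arrays.asList(50L, 25L));
        System.out.println(start.cents() + " -> " + end.cents());
    }
}

// An immutable value: the only field is final and there are no setters.
final class Balance {
    private final long cents;

    Balance(long cents) {
        this.cents = cents;
    }

    long cents() {
        return cents;
    }

    // Returns a NEW Balance; the receiver is left unchanged.
    Balance deposit(long amount) {
        return new Balance(cents + amount);
    }
}
```

Because a Balance can never change after construction, instances can be handed to any number of threads without synchronization.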
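Isolated mutability can be approximated with nothing more than the JDK. The sketch below is an assumption-laden illustration, not an actor library and not the book's code: IsolatedCounter is a hypothetical class that uses a single-threaded executor as a stand-in for an actor's mailbox, so the mutable field is only ever touched by that one thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only: a single-threaded executor plays the role of an actor's mailbox.
final class IsolatedCounter {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private int count = 0; // mutable, but read and written only on the mailbox thread

    // Callers send a "message"; the update itself runs later on the mailbox thread.
    void increment() {
        mailbox.execute(() -> count++);
    }

    // Reading is also a message, so it observes every update queued before it.
    int current() throws Exception {
        Future<Integer> reply = mailbox.submit(() -> count);
        return reply.get();
    }

    void shutdown() {
        mailbox.shutdown();
    }

    public static void main(String[] args) throws Exception {
        IsolatedCounter counter = new IsolatedCounter();
        counter.increment();
        counter.increment();
        System.out.println(counter.current());
        counter.shutdown();
    }
}
```

No lock or synchronized block appears anywhere, yet any number of threads can call increment safely, because they only enqueue work and never touch count directly.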
InfoQ: How does programming for an I/O based and CPU based application differ?
Venkat: You've decided to make your application concurrent, maybe to make it more responsive or faster, or to look ahead at certain actions the user may want to perform. The next question is how many threads you need. That is a simple question; however, the answer is not that easy. It depends on two things: the number of cores, and how much of its time your application spends using the processor(s) versus being blocked on I/O. Finding the number of cores is easy; we can use the JDK API to determine that. Finding the blocking factor, the time a task spends being blocked, requires more work. Once we have these two values, we can apply a simple formula to determine the meaningful number of threads to create. Having fewer threads than that will not give us the desired results. Having more threads will not help either, as we'd end up wasting resources and creating unnecessary contention.
We can conclude that for a computation-intensive application, you'd not want to have more threads than the number of cores. On the other hand, for an I/O-intensive application, you'd want to have many more threads, depending on the blocking factor. Using the blocking factor and the formula presented in the book, we can decide on the number of threads we'd need.
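The interview does not spell out the formula. The form usually quoted from the book is threads = cores / (1 - blocking coefficient), where the coefficient is the fraction of time a task is blocked (0 for purely computational work, approaching 1 for heavily I/O-bound work). Treat that as an assumption here. The sketch below applies it; PoolSizing and threadsFor are hypothetical names.

```java
// Illustrative sketch only: assumes threads = cores / (1 - blockingCoefficient).
final class PoolSizing {
    // blockingCoefficient is the fraction of time a task is blocked, in the range [0, 1).
    static int threadsFor(int cores, double blockingCoefficient) {
        if (cores < 1 || blockingCoefficient < 0.0 || blockingCoefficient >= 1.0) {
            throw new IllegalArgumentException("need cores >= 1 and 0 <= coefficient < 1");
        }
        // Rounded to the nearest whole thread.
        return (int) Math.round(cores / (1.0 - blockingCoefficient));
    }

    public static void main(String[] args) {
        // The JDK call for the number of cores visible to the JVM.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-bound pool size: " + threadsFor(cores, 0.0));
        System.out.println("Pool size if tasks block half the time: " + threadsFor(cores, 0.5));
    }
}
```

With a coefficient of 0 the result equals the number of cores, which matches the advice for computation-intensive work; as the coefficient grows, the suggested pool size grows with it. The blocking coefficient itself still has to be measured or estimated for the application at hand.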
InfoQ: You also covered the JDK concurrency and Java 7 Fork-Join API in the book. Can you talk about what Fork-Join API brings to the concurrency table and what can be improved in Java language concurrency API?
Venkat: Java 7 Fork-Join improves on the existing JDK solution and is certainly a welcome step for those who have to continue to program using the JDK concurrency solution. The work-stealing approach certainly improves scalability while easing the burden on the programmers. I use a set of examples throughout the book using different approaches, and so I've included a solution using the Fork-Join API also so we can compare and contrast that with other solutions.
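For readers who have not used the Fork-Join API, here is a minimal sketch of the divide-and-conquer style it supports. It is not one of the book's examples; ForkJoinSum and the threshold value are arbitrary choices for illustration. A task splits its range in two, forks one half so an idle worker can steal it, computes the other half itself, and then joins the results.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative sketch only: sums an array by recursively splitting it into halves.
final class ForkJoinSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // arbitrary cutoff for sequential work

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        ForkJoinSum left = new ForkJoinSum(data, from, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, to);
        left.fork();                        // schedule the left half; idle workers may steal it
        long rightSum = right.compute();    // compute the right half on the current thread
        return left.join() + rightSum;      // wait for the left half and combine
    }

    static long sum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new ForkJoinSum(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data));
    }
}
```

The tasks only read the shared array and return their partial sums, so no explicit locking is needed in this example.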
As far as what can be improved in the Java language concurrency API, that is a hard question. In the book I've called the current JDK solution the "synchronize and suffer model," so that pretty much says how I feel about it. My focus in the book is more about what Java programmers can do right now, rather than how the Java language or its API can change in the long term.
Java programmers are already used to using third-party libraries like Spring, Hibernate, and the list goes on. So, right now and from within the Java language, they can reach out to third-party solutions that have risen to prominence in other JVM languages. My focus in the book really is on how we can, in practical ways, help Java programmers benefit from these viable solutions right away. So, rather than asking how Java can change, I asked how Java programmers can change to benefit from these approaches while still programming in Java.
InfoQ: Can you talk about the different approaches to achieving concurrency especially between the software based options (like JDK concurrency API, STM, Actors etc) and hardware related options (GPUs)?
Venkat: The hardware capabilities have evolved quite significantly over the past eight years. In addition to GPUs, multicore processors are pretty much the standard on desktops and even laptops we can purchase today. Even smart devices are turning into multicores these days. However, when programming in Java, we program within a virtual machine which then helps us exploit the hardware capabilities.
So, we program on the JVM, so how do the improvements in hardware affect us?
Concurrency was once considered esoteric, something we could choose to avoid in the past. However, with the improved hardware capabilities, we may not have a choice here; it is a challenge we've been drawn into, whether we wanted it or not.
Programming with the JDK concurrency API is quite difficult; there are way too many opportunities to go wrong. In a way, we can view it as a low-level API, but to be effective, we need higher levels of abstraction and convenience. This is where Software Transactional Memory (STM) and actors come in, and these two take quite different approaches. STM takes the bold step of providing managed mutable variables, those that can be modified only within a controlled setting called a transaction. Actors, on the other hand, isolate mutable variables so they're accessed by no more than one thread at a time. Even though these two approaches are quite different, they have one thing in common: both eliminate the need to explicitly synchronize. By removing the explicit locks, they remove much of the burden from programmers' shoulders and let programmers focus on their application logic and capabilities rather than worry about the correctness of concurrency.
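The JDK does not ship an STM, so the following is only a loose analogy, not STM itself and not the book's code. ManagedBalance is a hypothetical class that shows the optimistic pattern a transaction relies on: read a snapshot, compute a new value from it, and commit only if nobody else changed the variable in the meantime, retrying otherwise. A real STM applies this idea across several managed variables at once; this sketch covers a single variable.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: a single-variable optimistic update, NOT a real STM.
final class ManagedBalance {
    // The reference is the only mutable thing; the Long values it points to are immutable.
    private final AtomicReference<Long> balance = new AtomicReference<>(0L);

    void deposit(long amount) {
        while (true) {
            Long snapshot = balance.get();       // read the current state
            Long updated = snapshot + amount;    // compute the new state from the snapshot
            if (balance.compareAndSet(snapshot, updated)) {
                return;                          // commit succeeded
            }
            // Another thread committed first; loop and retry against the fresh state.
        }
    }

    long get() {
        return balance.get();
    }

    public static void main(String[] args) {
        ManagedBalance account = new ManagedBalance();
        account.deposit(50);
        account.deposit(25);
        System.out.println(account.get());
    }
}
```

As with the approaches described above, the calling code contains no explicit lock; the retry loop keeps concurrent updates from being lost.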
InfoQ: What should the Java developers and architects keep in mind when they are working on making their applications concurrent?
Venkat: I have an entire chapter dedicated to this question in the book. Developers (programmers, architects, ...) are in desperate need of programming APIs that will help them reap the benefits of the powerful hardware they have on hand today. Fortunately, there are quite a few options to choose from, if they're willing to look beyond the facilities offered in the JDK.
Concurrency choice is orthogonal to the language choice. It is not about which programming language we use, but which library we select to use. There are some general guidelines we can follow—avoid shared mutable variables and design the application to avoid thread-safety issues rather than struggle to ensure thread-safety.
Often I am asked which language I prefer. From the concurrency point of view, the choice of language is not as important as the choice of the library. If the developers are going to use the same old JDK from a different language on the JVM, they would not gain the benefits of the different language from the concurrency point of view. They would gain a lot more by switching libraries than switching languages as far as concurrency goes. So, in addition to maintaining some basic design principles, developers can gain right away if they explore libraries and evaluate options that are available today.
InfoQ: You talk in the book about implementing concurrency using different languages like Java, Groovy, JRuby, Scala, and Clojure. How do you see this "polyglot concurrency" trend emerging in the future?
Venkat: The language space on the JVM is going to be quite split. I now believe that there will not be one prominent language on the JVM (I hope to be proven wrong on that belief). That is not a bad thing, however. I view languages as gateways or bridges to selecting a paradigm of programming, a way of designing and developing code.
While the modern languages on the JVM bring different capabilities to the platform, they also have quite a bit in common. For instance, they're all concise and highly expressive, and to varying extents they provide a functional style of coding. So, there's benefit to adopting any of these languages.
From the concurrency point of view, selecting one of these languages is not enough. We still have to evaluate and choose the approach to concurrency. I've written the book for polyglot programmers; I show how to use these libraries from the prominent languages on the JVM that you've mentioned.
Venkat also talked about how programming concurrency is different from programming in general.
Venkat: Programming concurrency is orders of magnitude more difficult than programming in general. In order to create scalable, high performing applications that function correctly we need simple, effective solutions. We simply can't afford to experiment and apply patchwork in fixing errors.
I view the JDK solution as the assembly language of concurrent programming in Java. Much like how most of us prefer to program in a higher-level language than assembly, we should focus on the higher levels of abstraction provided by libraries to tackle the complex problem of concurrency.
A few years ago I did not realize how effectively we can use these approaches from any language on the JVM, including Java. Once I discovered how easy, effective, and practical these solutions are, I wanted to present them so programmers can quickly put them to use today. I hope the book helps programmers learn how to make use of these wonderful libraries to program concurrency.
InfoQ: Thanks for your time, Venkat.
Venkat: Thank you for the opportunity to speak to you and reach out to your readers.
About the Author
Dr. Venkat Subramaniam is an award-winning author, founder of Agile Developer, Inc., and an adjunct faculty member at the University of Houston. He has trained and mentored thousands of software developers in the US, Canada, Europe, and Asia, and is a regularly invited speaker at several international conferences. Venkat helps his clients effectively apply and succeed with agile practices on their software projects.
Venkat is the author of ".NET Gotchas," the coauthor of the 2007 Jolt Productivity Award-winning "Practices of an Agile Developer," and the author of "Programming Groovy: Dynamic Productivity for the Java Developer" and "Programming Scala: Tackle Multi-Core Complexity on the Java Virtual Machine" (Pragmatic Bookshelf). His latest book is "Programming Concurrency on the JVM: Mastering synchronization, STM, and Actors."