InfoQ

Confusing unit-of-work with threads

Posted by Niclas Nilsson on Sep 14, 2007 03:00 AM

Community: Architecture, .NET, Java
Topics: Object Oriented Design, Programming
Tags: Concurrency, Frameworks, Design Patterns

Most server-side applications and many desktop applications contain data that is tied to a particular task being executed. A common solution is to keep that kind of data in thread-local storage, that is, in variables bound to the executing thread. This is convenient, but the practice rests on a faulty assumption.

Bob Martin wrote an article about the problem of assuming that a unit-of-work has a one-to-one relationship with a thread.

ThreadLocal variables are a wonderfully convenient way to associate data with a given thread. Indeed, frameworks like Hibernate take advantage of this to hold session information. However, the practice depends upon the assumption that a thread is equivalent to a unit-of-work. This is a faulty assumption.

ThreadLocal is the Java terminology, but the construct is common in multi-threaded environments. Bob remembers:

Thirteen years ago, while working on my first book, Jim Coplien and I were having a debate on the nature of threads and objects. He made a clarifying statement that has stuck with me since. He said: “An object is an abstraction of function. A thread is an abstraction of schedule.”

Mapping the data of a unit-of-work to one thread is a standard pattern today, found in many popular frameworks. The approach works most of the time, but there are situations where the faulty abstraction leaks.
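A minimal Java sketch of such a leak, using Java 8 lambdas for brevity. The class name and the `CURRENT_USER` variable are hypothetical and do not come from Bob's article. The thread that starts the work stores a value in a ThreadLocal, and a helper thread doing part of the same work cannot see it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeakDemo {
    // Hypothetical per-task state, stored the conventional way.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Sets the value on the calling thread and returns what a helper thread sees. */
    static String valueSeenByWorker() throws Exception {
        CURRENT_USER.set("alice");
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<String> read = CURRENT_USER::get;
            // The helper thread has its own, still empty, slot for CURRENT_USER.
            return worker.submit(read).get();
        } finally {
            worker.shutdown();
            CURRENT_USER.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("worker sees: " + valueSeenByWorker()); // prints "worker sees: null"
    }
}
```

Both threads work on the same task, yet only one of them has the task's data.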

It is not uncommon for different parts of a task to have different priorities, and there is no rule that a task must be single-threaded. Bob gives the example of a unit-of-work that needs responsive communication with an external service while performing a relatively long-running computation on the incoming data, a problem commonly solved by using two threads. He asks:

Where are the unit-of-work related variables? They can’t be kept in a ThreadLocal since each part of the task runs in a separate thread. They can’t be kept in static variables since there is more than one thread. The answer is that they have to be passed around between the threads as function arguments on the stack, and recorded in the data structures placed on the queue.
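What Bob describes might look roughly like the following Java sketch. The names `WorkContext`, `Job` and `compute` are assumptions made for illustration, not taken from his article. The context travels inside the object placed on the queue and reaches the computation as an ordinary argument.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ExplicitContextDemo {
    /** Hypothetical unit-of-work state; immutable so it can be shared safely. */
    static final class WorkContext {
        final String requestId;
        WorkContext(String requestId) { this.requestId = requestId; }
    }

    /** Queue entry: the input travels together with its context. */
    static final class Job {
        final WorkContext context;
        final int input;
        Job(WorkContext context, int input) { this.context = context; this.input = input; }
    }

    /** The long-running part; the context arrives as a plain argument. */
    static String compute(WorkContext context, int input) {
        return context.requestId + ":" + (input * input);
    }

    /** Hands one job to a second thread through a queue and returns that thread's result. */
    static String runOnSecondThread(String requestId, int input) throws InterruptedException {
        BlockingQueue<Job> jobs = new LinkedBlockingQueue<>();
        BlockingQueue<String> results = new LinkedBlockingQueue<>();

        Thread computeThread = new Thread(() -> {
            try {
                Job job = jobs.take();
                results.put(compute(job.context, job.input));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        computeThread.start();

        jobs.put(new Job(new WorkContext(requestId), input));
        String result = results.take();
        computeThread.join();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnSecondThread("req-42", 7)); // prints "req-42:49"
    }
}
```

The cost is visible in every signature: each method that needs the context must accept it as a parameter.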

TapsaKoo encountered a similar situation a while ago. When trying to work in a domain-driven way in WinForms, he described his problems finding a place to keep session-specific data.

If the application has only one form open at a time, I could save the session-object into the CallContext. What if the application has multiple forms open at a time and each of them wants to have a separate instance of my session-class? CallContext is out of the question. So are all thread-specific alternatives. What is left? Nothing? I’m not the first person pondering this issue. A solution probably exists but I can’t find it. Do I really have to inject the session-object into every object instance that might need it? Or should I refactor a lot of behavior from domain-classes into services and inject the session-object into them? I don’t like this approach because I want my classes to be more than data containers.

After reading Uncle Bob's post, TapsaKoo agreed that there are no easy solutions to the problem:

It doesn’t matter if you have 1 or 10 threads. The problem is always the same. UnitOfWork or SessionState should have a place that does not depend on threads. It’s a dangerous assumption that UnitOfWork is directly related to one single thread. That assumption seriously limits your other architectural choices.

Bob concludes that something seems to be missing:

So, though convenient, ThreadLocal variables confuse the issue of separating function from schedule. They tempt us to couple function and schedule together. This is unfortunate since the correspondence of function and schedule is weak and accidental.

What we’d really like is to be able to create UnitOfWorkLocal variables.
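No such construct exists in the Java standard library. The following is only one possible sketch of the idea, with hypothetical `UnitOfWork` and `UnitOfWorkLocal` classes. The values belong to a unit-of-work object, and each task is wrapped so that whichever thread runs it sees that unit-of-work as current. A ThreadLocal is still used internally, but only as a short-lived binding for the duration of one task, not as the home of the data.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class UnitOfWorkLocalDemo {
    /** Hypothetical unit-of-work: owns its variables, independent of any thread. */
    static final class UnitOfWork {
        private static final ThreadLocal<UnitOfWork> CURRENT = new ThreadLocal<>();
        private final Map<UnitOfWorkLocal<?>, Object> values = new ConcurrentHashMap<>();

        /** Wraps a task so it runs with this unit-of-work bound to the executing thread. */
        <T> Callable<T> wrap(Callable<T> task) {
            return () -> {
                UnitOfWork previous = CURRENT.get();
                CURRENT.set(this);
                try {
                    return task.call();
                } finally {
                    // Restore whatever was bound before, so pooled threads stay clean.
                    if (previous == null) CURRENT.remove(); else CURRENT.set(previous);
                }
            };
        }

        static UnitOfWork current() {
            UnitOfWork uow = CURRENT.get();
            if (uow == null) throw new IllegalStateException("no unit-of-work bound");
            return uow;
        }
    }

    /** Hypothetical variable whose value belongs to the unit-of-work, not to a thread. */
    static final class UnitOfWorkLocal<T> {
        // Null values are not supported because the backing map is a ConcurrentHashMap.
        void set(T value) { UnitOfWork.current().values.put(this, value); }

        @SuppressWarnings("unchecked")
        T get() { return (T) UnitOfWork.current().values.get(this); }
    }

    static final UnitOfWorkLocal<String> SESSION = new UnitOfWorkLocal<>();

    /** Sets the variable in one pooled task and reads it in another, within one unit-of-work. */
    static String setThenReadInAnotherTask(String value) throws Exception {
        UnitOfWork uow = new UnitOfWork();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Void> writer = () -> { SESSION.set(value); return null; };
            Callable<String> reader = SESSION::get;
            pool.submit(uow.wrap(writer)).get();
            return pool.submit(uow.wrap(reader)).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(setThenReadInAnotherTask("session-1")); // prints "session-1"
    }
}
```

The sketch only moves the problem: every task handed to another thread still has to be wrapped explicitly.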

7 comments

Right theoretical assumption, too much pain in practice by Julien Delfosse Posted Sep 14, 2007 5:05 AM
Re: Right theoretical assumption, too much pain in practice by Alex Miller Posted Sep 14, 2007 6:18 AM
Re: Right theoretical assumption, too much pain in practice by Steven Devijver Posted Sep 14, 2007 9:39 AM
Re: Right theoretical assumption, too much pain in practice by Cameron Purdy Posted Sep 17, 2007 2:45 PM
Sharing context between threads. by Peter Lawrey Posted Sep 14, 2007 4:43 PM
Java EE 5 solution: TransactionSynchronizationRegistry by Patrick Linskey Posted Sep 15, 2007 11:55 AM
fluid variables by Steven Shaw Posted Sep 18, 2007 6:42 AM
  1. Right theoretical assumption, too much pain in practice

    Sep 14, 2007 5:05 AM by Julien Delfosse

    While I agree with this point theoretically, I don't think multi-threaded units of work can be applied practically (in most applications).

    If you take transaction management, for example, it is clear that the transaction life must be bound to your unit of work. The problem is that, in application servers, JTA transactions (and associated resources such as JDBC connections obtained through JNDI lookup) are bound to a particular thread. The same goes for caller information and other useful, infrastructure-provided data.

    If you were to implement a multi-threaded unit of work, you'd have to abandon those low-level features. That simply seems like too much pain to me. I prefer using a logical unit of work (in the use case sense) that spans several technical units of work (in the ORM/tx sense of the term) that can be distributed among many threads. Mechanisms such as workflows, JMS asynchronism, sync points, ... exist to support this way of programming, and you can still use all the features provided by your application server.

  2. Re: Right theoretical assumption, too much pain in practice

    Sep 14, 2007 6:18 AM by Alex Miller

    JTA is particularly insidious in this regard (binding work to the current thread). Actually, that seems common in most transactional systems. I even think that's ok as long as you can strip the info off the thread at some point and move it to a different thread.

    When I was at MetaMatrix, we used the Atomikos transaction manager and we were able to move transactions between threads (and include sub-transactions on different VMs). But it certainly wasn't pretty and didn't seem to be the way people normally used it.

  3. Re: Right theoretical assumption, too much pain in practice

    Sep 14, 2007 9:39 AM by Steven Devijver

    Indeed, sharing connections and transactions between threads is unusual and impractical. Also, the transaction semantics of JTA and Spring, notably the propagation levels, have clear behavior when it comes to delimiting the association of "unit-of-work" objects with the threads.

  4. Sharing context between threads.

    Sep 14, 2007 4:43 PM by Peter Lawrey

    There are two approaches I have taken to share context between threads. Both required a pool of threads to share a ThreadGroup.
    1) Create a ThreadGroupLocal. This holds a value/values for each ThreadGroup shared between threads in that group.
    2) Create a custom ThreadGroup which holds a thread-safe Map of values.

    Where some of the work must be single-threaded, I associate a single-thread pool and a cached (variable-sized) pool with the thread group. Single-threaded processing is added as a task to the first pool, and anything that need not be single-threaded is added to the second pool.

    Below is the cut-down version. You add sub-tasks to either the single-threaded pool or the multi-threaded pool. A task in the single-threaded pool can kick off any number of tasks concurrently and then collect the results with the Future.get() method. Multi-threaded tasks can add sub-tasks that must run on the same thread to the single-threaded pool.
    This structure is used to run a large number of unrelated tasks at once (although the single-threaded portions have to wait for each other).


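    import java.util.Map;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.ThreadFactory;

    public class MyThreadGroup extends ThreadGroup {
        public MyThreadGroup(String name) {
            super(name);
        }

        // Single-threaded executor for work that must stay on one thread.
        private final ScheduledExecutorService executorService =
                Executors.newSingleThreadScheduledExecutor(new MyThreadFactory());
        // Cached (variable-sized) pool for work that may run concurrently.
        private final ExecutorService multiExecutorService =
                Executors.newCachedThreadPool(new MyThreadFactory());
        // Context values shared by every thread in this group.
        private final Map<String, Object> values = new ConcurrentHashMap<String, Object>();

        public static <T> Future<T> submitSingle(Callable<T> callable) {
            // TODO: add code to check that the caller is not already running in the single-threaded executor.
            return getMyThreadGroup().executorService.submit(callable);
        }

        public static <T> Future<T> submit(Callable<T> callable) {
            return getMyThreadGroup().multiExecutorService.submit(callable);
        }

        public static Object getValue(String key) {
            return getMyThreadGroup().values.get(key);
        }

        // Returns the previous value for the key, or null if there was none.
        public static Object setValue(String key, Object value) {
            return getMyThreadGroup().values.put(key, value);
        }

        // Throws ClassCastException if the calling thread does not belong to a MyThreadGroup.
        private static MyThreadGroup getMyThreadGroup() {
            return (MyThreadGroup) Thread.currentThread().getThreadGroup();
        }

        // Creates threads inside this group so that they all see the same values.
        class MyThreadFactory implements ThreadFactory {
            public Thread newThread(Runnable r) {
                return new Thread(MyThreadGroup.this, r, getName());
            }
        }
    }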
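The comment shows code only for the second approach. For the first one, a ThreadGroupLocal, a minimal sketch might look like the following. The class and its methods are assumptions for illustration, not Peter Lawrey's implementation. It keeps one value per ThreadGroup, so every thread in the group reads the same value.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ThreadGroupLocalDemo {
    /** Hypothetical ThreadGroupLocal: one value per ThreadGroup, shared by its threads. */
    static final class ThreadGroupLocal<T> {
        // Strong references keep groups alive until remove() is called;
        // a production version would need some cleanup strategy.
        private final Map<ThreadGroup, T> values = new ConcurrentHashMap<>();

        void set(T value) { values.put(Thread.currentThread().getThreadGroup(), value); }
        T get() { return values.get(Thread.currentThread().getThreadGroup()); }
        void remove() { values.remove(Thread.currentThread().getThreadGroup()); }
    }

    static final ThreadGroupLocal<String> CONTEXT = new ThreadGroupLocal<>();

    /** Sets a value from one thread of a group and reads it from another thread of that group. */
    static String shareWithinGroup(String value) throws InterruptedException {
        ThreadGroup group = new ThreadGroup("unit-of-work");
        String[] seen = new String[1];

        Thread writer = new Thread(group, () -> CONTEXT.set(value));
        writer.start();
        writer.join();

        Thread reader = new Thread(group, () -> {
            seen[0] = CONTEXT.get();
            CONTEXT.remove(); // drop the entry once the group is done with it
        });
        reader.start();
        reader.join();
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(shareWithinGroup("ctx-1")); // prints "ctx-1"
    }
}
```

Threads outside the group, such as the one calling `shareWithinGroup`, never see the value.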


  5. Java EE 5 solution: TransactionSynchronizationRegistry

    Sep 15, 2007 11:55 AM by Patrick Linskey

    In Java EE 5, this disparity has been addressed with the new TransactionSynchronizationRegistry interface. In particular, take a look at the putResource(Object,Object) and getResource(Object) methods.

    -Patrick

    --
    Patrick Linskey
    bea.com

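  6. Re: Right theoretical assumption, too much pain in practice

    Sep 17, 2007 2:45 PM by Cameron Purdy

    "JTA is particularly insidious in this regard (binding work to the current thread)."

    One can suspend() the transaction on one thread and resume() it on another.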

    Peace,

    Cameron Purdy
    Oracle Coherence: The Java Data Grid

  7. fluid variables

    Sep 18, 2007 6:42 AM by Steven Shaw

    Probably what you are missing is the idea of fluid variables (called special variables in Common Lisp).
