Mixins for C# and Visual Basic

by Jonathan Allen on Jun 29, 2011. Estimated reading time: 1 minute

Mixins are small bits of functionality that are useful to a wide variety of otherwise unrelated classes. In languages that support multiple inheritance, mixins are added as secondary base classes, while dynamically typed languages simply merge in the extra functionality. Since C# and VB don’t support either option, mixins are usually added through base classes that can become bloated, or through a lot of copy-and-paste. Composition alone isn’t much help here, as every mixed-in method and property would need to be delegated to the internal object.

The re-mix project offers an alternative. Using runtime code generation, simple classes are combined with one or more mixin classes. While it has the appearance of multiple inheritance, it doesn’t actually use it. Instead it uses a combination of object composition and a matching interface.

For example, say you want a mixin that adds deep cloning support to classes. You would create an interface called ICloneable and a matching mixin called CloneableMixin that implements it. CloneableMixin automatically gets a reference to its parent object through which it can perform the cloning operation.

At runtime one can then take arbitrary classes and mix them with the CloneableMixin to create new types. The new type will inherit from the base type and implement all of the interfaces that its mixins implemented. All of these new interface methods will be delegated to instances of the mixins.
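The mechanism described above can be sketched in a few lines. This is only an illustration in Python, not re-mix's actual API: `mix`, `_make_forwarder`, `CloneableMixin`, and `Customer` are hypothetical names, and Python's built-in `type()` stands in for .NET runtime code generation. The generated class inherits from the target class, holds one instance of each mixin, and forwards the mixin's public methods to that instance.

```python
import copy


class CloneableMixin:
    """Hypothetical mixin: adds deep cloning to whatever it is mixed into."""

    def __init__(self, target):
        # The mixin receives a reference to the object it is attached to.
        self.target = target

    def clone(self):
        return copy.deepcopy(self.target)


def _make_forwarder(mixin_cls, name):
    """Build a method that delegates to the matching mixin instance."""

    def forwarder(self, *args, **kwargs):
        return getattr(self._mixins[mixin_cls], name)(*args, **kwargs)

    forwarder.__name__ = name
    return forwarder


def mix(target_cls, *mixin_classes):
    """Generate a subclass of target_cls that delegates to mixin instances."""

    def __init__(self, *args, **kwargs):
        target_cls.__init__(self, *args, **kwargs)
        # Composition: one private mixin instance per mixin class.
        self._mixins = {m: m(self) for m in mixin_classes}

    namespace = {"__init__": __init__}
    for mixin_cls in mixin_classes:
        for name, member in vars(mixin_cls).items():
            if callable(member) and not name.startswith("_"):
                namespace[name] = _make_forwarder(mixin_cls, name)

    return type("Mixed" + target_cls.__name__, (target_cls,), namespace)


class Customer:
    def __init__(self, name, tags):
        self.name = name
        self.tags = tags


CloneableCustomer = mix(Customer, CloneableMixin)
original = CloneableCustomer("Ada", ["vip"])
duplicate = original.clone()
duplicate.tags.append("new")

print(original.tags)                    # ['vip']
print(isinstance(duplicate, Customer))  # True
```

The target class stays free of cloning code, and callers see a type that is still a `Customer`. In .NET the forwarded methods would additionally be declared through the mixin's interface, which this sketch leaves out.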

Another use of mixins is to override behavior of the base class. Under this model the methods in the mixin become overrides for base class methods in the generated class.
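A rough sketch of the override case, again in Python rather than through re-mix itself. The names `mix_override`, `ShoutingMixin`, and `Greeter` are hypothetical, and passing the base implementation into the mixin's constructor is an assumption made for this example; the article does not say how re-mix exposes the base method to a mixin.

```python
class Greeter:
    def greet(self, name):
        return "Hello, " + name


class ShoutingMixin:
    """Hypothetical mixin whose greet() replaces the target's greet()."""

    def __init__(self, target, base_greet):
        self.target = target
        # Handle to the original implementation, so the override can reuse it.
        self.base_greet = base_greet

    def greet(self, name):
        return self.base_greet(name).upper() + "!"


def mix_override(target_cls, mixin_cls, method_name):
    """Generate a subclass whose method_name is routed through the mixin."""

    def __init__(self, *args, **kwargs):
        target_cls.__init__(self, *args, **kwargs)
        # Bind the base class implementation to this instance.
        base_method = getattr(target_cls, method_name).__get__(self)
        self._mixin = mixin_cls(self, base_method)

    def override(self, *args, **kwargs):
        return getattr(self._mixin, method_name)(*args, **kwargs)

    return type(
        "Mixed" + target_cls.__name__,
        (target_cls,),
        {"__init__": __init__, method_name: override},
    )


ShoutingGreeter = mix_override(Greeter, ShoutingMixin, "greet")
print(ShoutingGreeter().greet("world"))  # HELLO, WORLD!
```

The generated subclass overrides `greet`, but the actual logic lives in the mixin, so the same mixin could be applied to other target classes that have a compatible method.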

You can learn more about re-mix and mixins in general from Stefan Wenig and Fabian Schmied’s Lang.NET presentation.

General thoughts, and then some by john zabroski

Jonathan,

I am always impressed by your ability to find all these different open source projects people are working on. You must read a ton, and especially focus your attention on .NET.

May I ask which libraries you've written about on InfoQ that you've found helpful? Do you have a blog where you discuss what your favorite libraries are and how you are using them?

Related: One concern I have over Re-Mix is how it handles thousands of mix-ins. What are the performance impacts? How does it dispose of mix-ins that are no longer being used? I am always awfully skeptical about these projects, for that reason. Heck, Microsoft cancelled IronRuby and JavaScript for the DLR, and that was with a full-time staff with a budget working on it. My cautious perspective has always been "If it's not something .NET supports natively and if it doesn't document performance characteristics that are different from .NET, then there are hidden dangers in using this for a large project."

Cheers!

Re: General thoughts, and then some by Stefan Wenig

John,

AFAIK, IronRuby was not cancelled, MS just passed stewardship on to an OSS community. And I've never heard of technical/performance reasons for this, I believe it was just a priority thing. Don't know about JavaScript for .NET, that was ages ago. Anyways: The DLR is shipped with .NET 4.

As for mixins: Here are a few numbers from a report we generated on our own product (created using svn.re-motion.org/svn/Remotion-Contrib/MixinXRef/):

Assemblies: 97
Mixins: 229
Mixed Mixins: 5
Non-Applied Mixins: 26
Target Classes: 720
Involved Interfaces: 173
Involved Attributes: 50

Works pretty well for us. We've never actually tried thousands of mixins though. It would be interesting to create a benchmark for various numbers of mixins and target types (the types that mixins are applied to).

There would be different performance indicators:

- on-demand generation of subclass proxies at runtime (affecting application startup)
- up-front generation of subclass proxies at compile-time (affecting build turnaround)
(you have a choice of either one, or you might use runtime generation for development and compile-time for deployment, especially with client apps)
- object creation (pretty fast, basically a hashtable lookup and a call to a generated delegate, but of course slower than 'new', which is ridiculously fast with empty ctors)
- object usage (methods, properties): just a level of indirection, nothing like the impact you'd expect from a dynamic language
- memory overhead (that one was actually a problem until we did a major optimization in this area)

So compile-time/app-startup impact is a whole different story than runtime impact. The latter one is rather negligible. Even more so if you use any other piece of infrastructure, like ORMs, that often have a much larger impact than the low-level thingy that subclass proxies are.

Mixins cannot be disposed of, unless you load them in their own AppDomains. That's true for any generated type in the CLR. However, re-mix is designed to create a static structure of types and mixins, so that's no more important than losing any other class. It is possible to apply mixins at runtime, in various configurations. The only reason to do this that we found was unit tests, and here type disposal is not an issue.

With all that said, your request for hard performance numbers is a valid one.

HTH,
Stefan

Thanks for the article! by Stefan Wenig

We were going to contact InfoQ about re-mix, maybe even provide an article. However, there are a few things we'd like to add in the documentation section first.

I'd like to add that there are basically two ways to look at the mixin story:

1) The multiple-inheritance way
I'd just like to have another base type for implementation inheritance. Mixins are arguably a more elegant solution that avoids some of the problems of MI. But you basically use them to achieve the same design goal. You nailed that point well.

2) Multi-dimensional separation of concerns
This is a much longer story. MDSOC was an academic movement of the 90s, that later faded for reasons I don't fully understand (my best guess is that they eventually went all-in on AOP, and then they died with it).

We're more interested in option 2. The basic idea here is that you have classes with just their intrinsic properties and functionality. Much like an anemic domain model, if you will. You then add functionality to classes via sets of mixins. (Such a set would be a hyperslice in MDSOC/HyperJ.)
These sets are then combined to form actual applications. What you win is separation of concerns: you no longer have the single dimension of single inheritance along which to structure your application's modules and source code, but any number you choose (units of change, features, different versions).

I highly recommend the paper "N Degrees of Separation: Multi-Dimensional Separation of Concerns" by Tarr et al.
www.computer.org/portal/web/csdl/doi/10.1109/IC...
It's a paid download, but many free resources are available online. Just ask Google.

It's a good read, and not too academic for anybody interested in such concepts. A more elaborate version is amzn.to/lUVmaJ

HyperJ was abandoned in favor of AspectJ. But AOP got out of fashion for anything other than cross-cutting concerns (logging, security etc.) for many reasons. Mixins (with the right set of features) could be just the right technology to get MDSOC into the mainstream. The biggest problem right now seems to be that nobody pursues the concept of MDSOC any more. (There is DCI by Trygve Reenskaug, who was one of the major influencers of MDSOC with his concept of role models. They were all about SoC too. DCI also uses roles and talks a lot about mixins, but it's rather vague on these things and I'm still searching for the connection.)

- Stefan

Re: Thanks for the article! by john zabroski

Stefan,

There is one very basic reason why multi-dimensional separation of concerns failed. Traditionally, the advancement OO offered over ordinary Structured Development was that it organized the problem domain around entities with roles and responsibilities associated with the problem domain.

This minor change in viewpoint was dramatic, but mathematically definable: The flaw in Structured Development was functional decomposition, which implied that the total solution was organized as a tree. This led to a lot of duplication of code, which in turn led to maintenance problems. To solve the duplication problem, people would create service routines that encapsulated repeated logic. However, the trade-off was dramatic and severe with respect to software quality: The programmer had just converted a directed acyclic graph (top-down composition) into a lattice. The result was spaghetti code.

OO, on the other hand, focused on a different way to "structure programs": using higher-order behavior! This formalism naturally maps to "message passing" and "asynchronous" programming, and as shown by the Actor Model can function at Internet scale with the introduction of simple feedback mechanisms like timeouts (as in Erlang).

Not only is composing programs using higher-order behavior more modular for many applications, in the sense that modules are defined in terms of behaviorful objects fulfilling roles and responsibilities in the problem domain, it also allows practitioners better ability to estimate how much effort writing something will take. To estimate, simply determine if a client's change order requires a change in class structure. If it does not, then the change order will be cheap to implement. If it does, then the programmer will be able to tell the client that the request drifts from the original requirements.

Multi-dimensional separation of concerns, on the other hand, did not really address problem domain concerns but rather re-use concerns. Moreover, because it did not address problem domain concerns, it lacked an obvious killer application, because re-use only dominates as the size of the application increases.

I can imagine why you might think MDSOC would be useful for Re-Motion, since I am an expert in Object-Relational Mapping technology and can easily see how you might apply it there. I almost tried applying MDSOC once to said problem, until I stopped, retraced my requirements, and realized Equational Logic is a much simpler foundation for rewriting expression trees! Indeed, four simple rules govern an academic theory called Rewriting Logic (RWL). And, as far as logics go, Equational Logic is one of the simplest: the logic of substituting equals for equals! Since it is really just term substitution, Equational Logic has been the foundation for term rewriting languages for decades, and works really well. Term rewriting languages like Maude also feature very, very strong module systems that can be parameterized by equations rather than just types. Its higher-kinded polymorphism is also something you cannot natively express in .NET today. Moreover, why would you want to write AST (or graph) transformations in C#, which makes reasoning about tree transformations very indirect due to the mismatch between OO and graph rewriting? Finally, in the most dynamic case, user input could direct the application to construct types on the fly, and that is a powerful use case I want to easily express. I have learned too many good use cases for it, since any modern user interface (Natural User Interface (NUI), with "rapid interaction" with the user) needs to provide for open-ended behavior directed by the user, not the programmer who encodes a fixed set of questions the user can ask.

At the same time as seeing a more straightforward way to solve my problems, programmers like me are skeptical about whether the re-use MDSOC provides is long-term or short-term, and whether my projects can make real use of MDSOC. Whenever the type system of your MDSOC language fails to capture a static constraint about your problem domain, you will have to resort to unit testing instead to provide a Programming-By-Contract definition!

Speaking of real problems I have...

You mention a severe shortcoming of .NET's alleged "Common Language Runtime": AppDomain objects. I think any hardcore C# engineer agrees that AppDomains were basically a mistake, and there was substantial refactoring in Silverlight to limit that mistake. What .NET really needed was true "processes", in the singly-threaded execution sense. At the same time, .NET needs better ways to allow user-configurable policies for scheduling resources, including computations themselves as resource consumers (eating CPU and Memory). The fact I cannot unload a type without unloading an AppDomain is frustrating, and points to a severe weakness in .NET's transparent memory management. For a killer application I work on, nothing would be nicer than specifying a set of types with a linear logic, and having the garbage collector take advantage of that linear logic to reclaim the types when the statements written in my logic can no longer make use of them. The Java Virtual Machine has the same problems, and Sun and now Oracle are both wasting their time with "kitchen sink" features bolted on after the fact to fix it.

The failure of these Virtual Machines to empower the programmer with first-class scheduling and resource accounting is a real barrier to the key earthmovers in our industry. Without first-class scheduling and resource accounting, Cloud-based Tier 1 Data Centers are NOT possible! Today, most people use Cloud-based services like S3 & EC2, Azure and SQL Azure, BigTable and AppEngine, but the ways they can use these services are limited to Tier 3 Data Centers, and not capable of managing core operations. In the future, this must change if companies like Oracle, Microsoft, IBM and Amazon wish to continue to grow, because the ideal scenario for all of these companies is to provide as much service to the rest of the world as possible! They can't do that without virtual machines with first-class scheduling and resource accounting. It just isn't possible. At the same time, programming language design is showing fascinating ways language designers can help, using techniques like continuations and delimited continuations or other equivalent control flow constructs necessary for first-class scheduling.

Re: General thoughts, and then some by Jonathan Allen

To be honest I don't really have an interest in most of this stuff. I find the frameworks such as ORMs, Dependency Injection, Aspect Oriented Programming, and the like are far more trouble than they are worth. If I were to write a blog, it would be filled with useless rants about how bad I think this kind of thing is.

That said, I’m quite aware of my own ignorance as to how this stuff works in the real world. I’m working on a project called Granite to serve as a foundation for testing out ideas. I don’t have a lot of free time unfortunately, but eventually I want to get it to the point where I can start doing side-by-side comparisons using what I would consider to be real code.

granite.codeplex.com/

Re: General thoughts, and then some by Jonathan Allen

IronPython and IronRuby are very much still alive. While Microsoft no longer works on the core language, they are working on some IDE support inside Visual Studio.

www.infoq.com/search.action?queryString=python+...

I have personally used IronPython for bulk file processing wherein each source file included hundreds of thousands of complex records. Much to my surprise the custom parsing logic done with IronPython didn’t even register on the profiler. The biggest bottleneck in terms of performance was simply the use of a System.Data.DataTable. Switching to a list of dictionaries resulted in a better than 50% speed improvement.

Re: Thanks for the article! by Jonathan Allen

I can't say I agree with you on the topic of AppDomains. I've found them to be quite useful for isolating flaky libraries that I've needed to work with. As far as I'm concerned they are just lightweight processes, which is actually quite important given how expensive processes are in Windows.

As for unloading classes, I never saw the point. With any attempt to unload a class, you have to deal with both potential race conditions and the performance hit of loading it back in the next time it is needed.

Re: Thanks for the article! by john zabroski

AppDomain has no strong isolation properties, as of .NET 2. An unhandled exception will bring down the entire application. Any library flaw can bring down an entire application. This demonstrates two major flaws with the way AppDomain objects work:

(1) Failure in one AppDomain can cascade to another AppDomain
(2) Lack of exception handling at AppDomain boundaries, which includes inability to handle exceptions raised from asynchronous calls (OutOfMemoryException, StackOverflowException)

To get around this design limitation, most people wrap AppDomain loading and unloading into reliable communication Channel objects, which approximate the features of Erlang.

Singularity, an OS written in C#, goes even further and removes the AppDomain concept altogether, since an AppDomain is an overly general concept required to satisfy requirements of many environments. Instead, Singularity uses a typed language to enforce memory safety and runs every "software isolated process" (SIP) in the kernel's ring 0 memory, bypassing the hardware's memory protection and also thereby avoiding the memory overhead associated with memory rings as a protection mechanism. SIPs are basically AppDomain objects done right, and are very similar to Channels in the Inferno OS created by Rob Pike et al in the 1990s at Bell Labs.

Cheers!

Re: General thoughts, and then some by john zabroski

Dependency Injection is simply the native execution model of .NET, and simply amounts to cutting out ambient authority from programs. This is a VERY good idea.

Aspect-Oriented Programming, in the style of AspectJ, is a bad idea IMHO. Friedrich Steimann [1] is a very good researcher who has a detailed rant about AOP [2] [3], so there is no need to call it a "useless rant" or to blog about it.

Object-Relational Mappers vary in quality, and all I have seen have very poor strategies for dealing with heterogeneous execution of code (converting an expression tree (or object graph) into SQL statement(s)). The SQL generated by most ORMs is naive, primarily because the SQL is not specialized based on the environment it will execute in. Important factors like DBMS Vendor, DBMS Version #, etc. all matter. These issues are not typically addressed, because designing internal subsystems of ORMs correctly requires more polish than open source developers tend to be capable of coordinating. Also, most individual contributors to open source projects do not actually expect to convert their database from one vendor to the next, and so the desire to see efficient query generation regardless of vendor is lacking market forces. There is simply no itch to scratch, or if there is an itch, it is negligible in comparison to the time and effort required to scratch it.

As far as I can tell, the Re-Motion project is one of the first efforts to actually challenge my controversial statements above about ORMs. But, as I understand it, they are doing it by adding another layer of abstraction, rather than modifying and refactoring the internal subsystem of their favorite ORM. Not that there is anything obviously wrong with that approach, but it does seem subverted on close inspection. For example, as I understand it, most ORMs do not do a good job exposing the query generator as a compiler architecture like LLVM, and so Re-Motion is fundamentally targeting a moving target. A quick dig into Hibernate reveals what a mighty hack grouping is there, and how it instantly breaks down when vendors support more advanced ways to group sets of data and provide roll-ups on those groups. Without access to query generation primitives, Re-Motion is simply rewriting the AST*, when really it should be re-writing and re-ordering query generation passes on a per query, per vendor basis. ORM Query Generators themselves face the same problem, not having detailed access to the black boxes that are Cost-Based Optimizers and Estimated/Actual Execution Plans. This is a combinatorially explosive problem.

* Apologies if I misrepresented Re-Motion. As Stefan Wenig posted above, there is a lot of code in that project and digging through the internals completely is time-consuming, so I made only a cursory review and handwaved the rest.

Cheers!

[1] www.kbs.uni-hannover.de/~steimann/
[2] F Steimann "Why most domain models are aspect free" in: 5th Aspect-Oriented Modeling Workshop AOM@UML (2004).
[3] F Steimann "Aspects are technical, and they are few" European Interactive Workshop on Aspects in Software EIWAS'04 (2004).

Re: General thoughts, and then some by john zabroski

Try the following mod:

"IronPython and IronRuby are dead, because there is no development team improving the language or Microsoft customer support for serious problems run into."

Re: General thoughts, and then some by Jonathan Allen

I am on the mailing list for those and I'm seeing pretty good response rates for help and change requests. The new owners are taking good care of it and I would certainly recommend it.

Re: Thanks for the article! by john zabroski

Stefan,

One last note:

I hope my comments above did not come across as arrogant, or that "I know better than you". On the contrary, I was really excited to see the Re-Motion project. When I first heard of it, I thought to myself, "Finally, somebody who thinks somewhat similarly to me, and sees the same problems!" My comments about Equational Logic were NOT intended to read as "I can do better", but rather "Why I don't see the need for MDSOC that you might." Ergo, if I don't see the need, others probably have come up with their own formal methods for solving the same problems.

Re: General thoughts, and then some by john zabroski

Thank you very much for the correction.

I didn't pick up on the fact Miguel de Icaza put his support behind IronPython and IronRuby.

Would be nice to see Pash get finished up so that PowerShell is portable to Linux (although the usefulness of that may be in doubt, given how Get-WMIObject is in most scripts people use).

Re: Thanks for the article! by Stefan Wenig

John, it's good that some people still show some interest in this stuff, but I think you've got a lot of incorrect assumptions here. That's probably because we've been quite silent on re-motion. Let me try to shed some light on it:

First, there are three projects that you're mixing in your comment:
- re-mix is just this: mixins. it caters to the larger vision, which has MDSOC at its core, but has been designed to provide just this basic language extension, no frills, no distraction, use it for anything you like and any architectural approach. some compile-time tooling (validation, pre-compilation, reports)
- re-linq is just infrastructure for any LINQ-provider, and an almost complete SQL backend. it has no connections to re-linq. it's lean and mean, designed to be included in 3rd party libs.
- re-motion is a large opinionated framework. it extends the concept of mixins to hyperslices, including ORM and facading capabilities. we use it as a foundation for our own product development. besides the design advantages of MDSOC (no tangling/scattering), there are two main advantages if you build a product with MDSOC:
-- you can easily assemble different editions in a product line (for different types of customers, basically)
-- the product can easily be customized using the very same mechanisms that we use for product development.
essentially, using MDSOC, you build not just a product, but a platform. say goodbye to brittle events, explicit user exits and database triggers. we believe that this is a whole new ballgame for LoB products like CRM or ERP.

All three are OSS, but only re-linq and re-mix are in a state where we can recommend general usage already. with re-motion, we have some documentation work to do (well, it does have reference documentation and a few hands-on labs, but we need to convey the vision too or it won't make sense.)

I'm on vacation for all of July, but I'd like to continue the discussion, especially as we unveil more of re-motion.

Re: General thoughts, and then some by Stefan Wenig

After a brief look at Steimann's papers, I think this just underlines my belief that the MDSOC community just gave in to the then-popular AOP movement, and eventually was torn down with it. Steimann seems to focus on cross-cutting concerns and eventually comes to the conclusion that there's very few of them, not enough to justify all the AOP hype. I agree, and I think MSFT did the right thing calling this policy injection and avoiding the AOP name tag.

But AOP was popular, and technically, it could also be used to implement MDSOC. I think it sucks at that. But more important was that by association, MDSOC eventually was dismissed together with cross-cutting concerns, although it set out to solve a very different problem.

Steimann himself recommends mixins and interfaces as a more viable alternative. He doesn't directly address MDSOC, and where his perception of AOP approaches it, he becomes vague.

I still think it's time to take MDSOC out of AOP's grave and realize that it could still be a great solution to problems that plague many of us. More so than code-generation based MDA which Steimann seems to like (how dead is that? models cannot express novel ideas, only what the modeling language foresees.)

As for misrepresenting re-motion, pls. read my other comment.

Re: Thanks for the article! by Stefan Wenig

typo in 2nd bullet point:
- re-linq is just infrastructure for any LINQ-provider, and an almost complete SQL backend. it has no connections to re-linq. (should be: no connections to re-mix)
