Posted by Jonathan Allen on Apr 07, 2008
With the addition of LINQ, extension methods, and improved anonymous delegates, many algorithms no longer need explicit loops. In a post titled "If you are using a loop, you're doing it wrong", Chuck Jazdzewski looks at a possible future for C#.
Chuck Jazdzewski opens with the source of his quote, a college professor teaching APL. APL is a language from the 1960s that focused heavily on vector and matrix operations. While loops existed, they were generally not needed. Chuck continues,
It is similar with LINQ, if you are using a loop you are doing it wrong. I find myself doing a lot of prototyping lately and I am forcing myself to use LINQ; not because I don't like it, far from it, I really like LINQ, but using loops is so ingrained into my psyche that I have to stop myself and force myself to think in LINQ. Every time I am tempted to write a loop that involves a collection or an array I ask myself, could I use LINQ instead? Programmers with more of a database background seem to take to LINQ like a duck to water. They think in sets and vectors, I don't, but I am getting there.
While Chuck relies heavily on LINQ expressions and extension methods, he does not eliminate loops entirely. In one case he has to move the loop into an IEnumerable extension method that hides the complexity from the calling function. This is done for his generic function Reduce, which takes a list of items and combines adjacent items that 'match'. Both the criteria for matching and the way matching items are combined are passed in as anonymous functions, so callers should never need to write a similar loop themselves.
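The idea of merging adjacent matching items can be sketched roughly as follows. This is a minimal illustration, not Chuck's actual code: the signature of Reduce, the class names, and the tuple-based token data are all assumptions made for the example, and it uses tuple syntax from later versions of C# than the one discussed in the post.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Hypothetical signature: the Reduce in the original post may differ.
    // Walks the sequence once, merging each item into the running item
    // while 'match' says the two are combinable.
    public static IEnumerable<T> Reduce<T>(
        this IEnumerable<T> source,
        Func<T, T, bool> match,
        Func<T, T, T> combine)
    {
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break;   // empty input, empty output
            T current = e.Current;
            while (e.MoveNext())
            {
                if (match(current, e.Current))
                {
                    current = combine(current, e.Current);
                }
                else
                {
                    yield return current;     // run ended, emit it
                    current = e.Current;
                }
            }
            yield return current;             // emit the final run
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative data only: classified text ranges as (name, start, length).
        var tokens = new[]
        {
            (Name: "Keyword",    Start: 0, Length: 3),
            (Name: "Whitespace", Start: 3, Length: 1),
            (Name: "Whitespace", Start: 4, Length: 2),
            (Name: "Identifier", Start: 6, Length: 5),
        };

        var merged = tokens.Reduce(
            (a, b) => a.Name == b.Name,
            (a, b) => (a.Name, a.Start, a.Length + b.Length));

        foreach (var t in merged)
            Console.WriteLine($"{t.Name} {t.Start} {t.Length}");
        // Prints:
        // Keyword 0 3
        // Whitespace 3 3
        // Identifier 6 5
    }
}
```

The loop still exists, but it lives in one tested place; the calling code only states what counts as a match and how to combine.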
The goal of the code Chuck shows is to create a function that takes a sequence of ranges, text and a name like "Whitespace" or "Keyword", and applies styles to them. In the end his function is reduced to three query statements:
While most commenters were positive, not everyone was convinced this is the right way to go. Holger Flick writes,
The headline should be "If you are using a loop, somebody might still be able to read your code without need to analyse it" :)
Don't get me wrong, I use LINQ quite often and love it. However, in this case it becomes too complex to grasp while reading the source code and thus hard to manage IMHO. I rather write a "multi-line" if e.g. instead of using the one-line ?? -approach.
Will C# code be mostly free of loops in the future? And for that matter, would it be a good thing?
Functional programming has been around forever... if it were superior in every way, we would have been using it in the mainstream sooner :)
That said, for certain things it definitely does help. .NET 3.5 doesn't have a ForEach extension method, so you have to add it. Once you do, the trivial loops are indeed simplified and much cleaner now. For more complex stuff, well... functional paradigms don't look quite as pretty in the debugger, for one.
It's like everything else. If it makes code cleaner and easier to understand, go for it. If not, use the classic way (and there are a LOT of cases for the classic way still...)
For loops are implicitly single threaded. As the multi-core revolution occurs, being able to do more than one thing at once is important. Vector-based languages can automatically parallelise the operations, as can operations where the looping is an implicit rather than explicit part.
One of the reasons Google's code queries can be scaled is that the mapreduce specifies a requirement that can be broken down and sent across multiple boxes, rather than a 'foreach page in pages' which would do one at a time. The more languages that adopt intrinsic parallelism, the better.
What I use in the "extension method over IEnumerable<T> vs loop" decision is whether item order matters:
For MapReduce-type operations, where each iteration has no side effects on the other iterations, this is a great idea.
On the other hand, in operations where each iteration causes side effects for the next one (i.e. local state), it is a trickier decision. Parallelism won't help you there, because the iterations are not independent, and the code is therefore inherently imperative.
That all said, I like the "write it once as an extension method" approach because once something is well tested, I should not worry about the *how*, just that it *does*.
Personally, I have quite a collection of useful IEnumerable extensions that I've been gathering... things like Shuffle (randomly shuffles the contents of a collection), SplitIntoNGroups(int n) (brings you back a set of nearly equally sized collections), Head(), Tail(), and so forth.
For loops inside of a method generally don't get reused outside their context. Generalizing things for all enumerables reduces the amount of code I ultimately have to write, so it is a net win for me.
Actually Parallel FX, which I believe is still in alpha, is going to add parallel for loops to C#. For the simple case, all you do is write "Parallel.For" instead of "for".
PLINQ is even more applicable here (the Task Parallel Library and PLINQ are both part of ParallelFX)...
Most of the time, it's as simple as adding .AsParallel() to your existing LINQ query to parallelize the computation.
Yes, but by that same logic, object oriented programming is not superior in "every way" and yet it is the mainstream paradigm.
Object oriented programming has been around forever too. It took a while before its time came. Maybe the same is true for functional programming (just on a longer time line). Maybe the functional planets are aligning and an environment is being created that will nurture the functional philosophy and bring it mainstream. Multi-core could be the mother of the new functional movement, or at least the Object-Functional movement.
Microsoft is putting F# in as an official part of Visual Studio for a reason.
It isn't best for everything. Woe to the person who thinks that writing a presentation layer in functional languages like F# is a good idea. That said, for certain kinds of problems, particularly cases where parallelism is important, functional is a good fit, be it through C#'s parallel support and lambdas or in F#.
Hah! Check out Cells, a lisp library by Kenny Tilton. It beats the hell out of any other presentation layer I have ever seen.
If you try an imperative paradigm over a functional language, that will hurt, of course. A declarative paradigm, though, is clear, easy to understand, very reliable (ie, bug-resistant) and completely apt for functional programming -- and, interestingly, a hell for imperative languages. :)
Check this article by Kenny. It doesn't describe what Cells is -- you can find that elsewhere in his blog -- but I have personally felt it is an interesting showcase.