Copious Data, the "Killer App" for Functional Programming
Summary
Dean Wampler supports using Functional Programming and its core operations to process large amounts of data, explaining why Java’s dominance in Hadoop is harming Big Data’s progress.
Bio
Dean Wampler is a contributor to several open-source projects and the founder of the Chicago-Area Scala Enthusiasts. He is the author of Functional Programming for Java Developers, the co-author of [Programming Scala](http://programmingscala.com/), and the co-author of Programming Hive, all from O'Reilly. He pontificates on Twitter, @deanwampler, and at polyglotprogramming.com.
About the conference
Lambda Jam is a new conference that can take your skills to the next level. This is not your traditional conference of sitting and listening - a significant portion of each day will be devoted to hands-on practice or workshops. We aim to stretch your skills and teach you something new!
Community comments
I should have mentioned LINQ
by Dean Wampler,
Someone privately mentioned to me how LINQ was his first entree into Functional Programming in the .NET world. I think LINQ is important enough that I should have mentioned it in the talk. It's a great demonstration of a unified view of data, in memory, on disk, or in a database, with SQL-like operators built into the programming language. LINQ has been copied in other languages, like Scala.
Re: I should have mentioned LINQ
by Andre Artus,
I enjoyed the talk. I've been programming in functional languages for some time (Haskell, F#), but often have to resort to C# and Java for my day jobs. I have noticed how many traditionally imperative languages are increasingly adopting ideas from FP. Even systems programming languages like D support immutable data, pure functions, limited lazy evaluation, and standard library support for the usual combinators.
LINQ came to mind a lot for me during the talk, most prominently when you mentioned joins, as it handles them quite well. There is no reason why a LINQ facade cannot be put on something like Hadoop.
An idea I find interesting is Nested Data-Parallelism (developed by Guy Blelloch), which can be [grossly] abstracted as keeping separate views over your data: one view supports the solution structure (it lets the programmer divide and conquer, and reason about the problem and solution), while another view (the physical/memory layout) is laid out for efficiency. While a lot of the current work seems [to my knowledge] to focus on multicore and GPGPU, I see no reason why it cannot expand to distributed systems for computations where the commutative property holds (for a given nesting level).
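A rough way to picture the two views described above is a nested list for reasoning and a flat array plus segment lengths for storage. The Python sketch below is only an illustration under that assumption: the names `flatten` and `segmented_sum` are hypothetical, and it is not taken from the talk or from any particular nested data-parallel library.

```python
from itertools import accumulate

def flatten(nested):
    """Turn the logical (nested) view into a flat value array plus segment lengths."""
    values = [x for segment in nested for x in segment]
    lengths = [len(segment) for segment in nested]
    return values, lengths

def segmented_sum(values, lengths):
    """Sum each segment of the flat array, locating boundaries from the lengths alone."""
    ends = list(accumulate(lengths))   # exclusive end offset of each segment
    starts = [0] + ends[:-1]           # start offset of each segment
    return [sum(values[s:e]) for s, e in zip(starts, ends)]

# Logical view: the shape the programmer reasons about.
nested = [[1, 2, 3], [], [4, 5]]

# Physical view: one contiguous array and a small descriptor.
values, lengths = flatten(nested)

logical_result = [sum(segment) for segment in nested]
physical_result = segmented_sum(values, lengths)
```

Both views give the same per-segment sums, but the flat form is the one that could in principle be split into contiguous chunks for parallel or distributed workers. How that partitioning would be done is left out of this sketch.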