InfoQ

Map-Reduce Content on InfoQ


Latest featured content about Map-Reduce

Approachable Concurrency for the JVM with Groovy Parallel Systems

Topics
Groovy,
Map-Reduce,
Java,
GOTO 2011,
JVM Languages,
Big Data,
Dynamic Languages,
Languages,
GOTO Conference,
Database Design,
Concurrency,
Actors,
Programming,
Database,
Conferences

Dierk König introduces GPars, Groovy’s library for concurrent programming, explaining a simpler and less error-prone way to use fork/join, map/reduce, actors, and dataflow in Java and Groovy.

News about Map-Reduce

MapReduce Patterns, Algorithms, and Use Cases

Topics
Map-Reduce,
Big Data,
Design Pattern,
Database Design,
Patterns,
Database,
Object Oriented Design,
Design,
Hadoop

In his new article “MapReduce Patterns, Algorithms, and Use Cases”, Ilya Katsov gives a systematic view of the different MapReduce patterns, algorithms, and techniques that can be found on the web or in scientific articles, along with several practical case studies.

Hortonworks Announces Hadoop Data Platform

Topics
Map-Reduce,
Big Data,
Open Source,
Operations,
PaaS,
Database Design,
Database,
Yahoo!,
Infrastructure,
Programming,
Cloud Computing,
Apache,
Hortonworks,
Hadoop

Hortonworks, a company created in June 2011 by Yahoo! and Benchmark Capital, has announced a Technical Preview Program for its data platform based on Hadoop. The company employs many of the core Hadoop contributors and intends to provide support and training.

MapR Releases Commercial Distributions based on Hadoop

Topics
Map-Reduce,
Big Data,
Database Design,
Architecture,
Database,
Announcements,
MapReduce,
MapR,
Hadoop

MapR Technologies released a big data toolkit based on Apache Hadoop, with its own distributed storage alternative to HDFS. The software is commercial, with both a free edition, M3, and a paid edition, M5. M5 includes snapshots and mirroring for data, Job Tracker recovery, and commercial support. MapR's M5 edition will form the basis of EMC Greenplum's upcoming HD Enterprise Edition.

Yahoo Hadoop Spinout Hortonworks Announces Plans

Topics
Map-Reduce,
Big Data,
HBase,
Apache,
Database Design,
Open Source,
Columnar Databases,
Architecture,
Web Servers,
Announcements,
Database,
Programming,
Hortonworks,
Hive,
Hadoop

Yahoo spun out its core Hadoop team, forming a new company, Hortonworks. CEO Eric Baldeschwieler presented the company's vision of easing adoption of Hadoop and making core engineering improvements for availability, performance, and manageability. Hortonworks will sell support, training, and certification, primarily indirectly through partners.

Scale-up or Scale-out? Or both?

Topics
Map-Reduce,
Big Data,
Database Design,
Database,
Architecture,
Apache

A prevalent trend in IT over the last twenty years has been scaling out rather than scaling up. Recent technological advances, however, offer a new option: scaling out scaled-up servers based on GPUs.

Hadoop Redesign for Upgrades and Other Programming Paradigms

Topics
Map-Reduce,
Java,
Big Data,
Database Design,
Languages,
Clustering & Caching,
Architecture,
Programming,
Performance & Scalability,
Database,
Infrastructure,
Announcements,
Mesos,
Yahoo!,
HPC,
Grid Computing,
Hadoop

Yahoo recently announced and presented a redesign of the core map-reduce architecture for Hadoop to allow for easier upgrades, larger clusters, and fast recovery, and to support programming paradigms in addition to Map-Reduce. The new design is quite similar to the open source Mesos cluster management project; both Yahoo and Mesos commented on the differences and opportunities.

Scalable System Design Patterns

Topics
Map-Reduce,
Big Data,
Caching,
Database Design,
Orchestration,
Load Balancing,
Clustering & Caching,
Scalability,
Architecture,
Performance & Scalability,
Infrastructure,
Database

Ricky Ho revisited his three-year-old post on scalable system design patterns and realized that a lot had changed since then.

Interviews about Map-Reduce

Ville Tuulos on Big Data and Map/Reduce in Erlang and Python with Disco

Topics
Map-Reduce,
Ruby,
Python,
Dynamic Languages,
Erlang,
Big Data,
Languages,
Fault Tolerance,
Parallel Programming,
Database Design,
Open Source,
Functional Programming,
Architecture,
Programming,
Infrastructure,
Erlang Factory 2011,
Database,
Language,
Performance & Scalability,
MapReduce,
Hadoop

Ville Tuulos talks about Disco, the Map/Reduce framework for Python and Erlang, real-world data mining with Python, the advantages of Erlang for distributed and fault tolerant software, and more.

Rob Pike on Parallelism and Concurrency in Programming Languages

Topics
Map-Reduce,
Ruby,
Java,
Big Data,
Dynamic Languages,
Languages,
.NET,
Parallel Programming,
Threading,
Database Design,
Linux,
Compilers,
Concurrency,
GOTO Conference,
Programming,
Architecture,
Database,
Operating Systems,
MapReduce,
Language,
Language Design,
JAOO Conference,
Google Go,
Thread,
Conferences

Rob Pike discusses concurrency in programming languages: CSP, channels, the role of coroutines, Plan 9, MapReduce and Sawzall, processes vs threads in Unix, and more programming language history.

Ron Bodkin on Big Data and Analytics

Topics
Map-Reduce,
Big Data,
QCon San Francisco 2010,
Database Design,
Operations,
QCon,
Data Analysis,
Architecture,
Database,
Infrastructure,
Machine Learning,
Business Intelligence,
Conferences,
Hadoop

Ron Bodkin discusses big data architecture, real-time analytics, batch processing, map-reduce, and data science.