Google Wave’s Architecture

Posted by Abel Avram on Jun 01, 2009


Google Wave is three things: a tool, a platform and a protocol. At the heart of its architecture is Operational Transformation (OT), a theoretical framework for concurrency control.

First of all, a definition is needed. Google Wave is:

a new communication and collaboration platform based on hosted XML documents (called waves) supporting concurrent modifications and low-latency updates.

The Tool

Google Wave combines an email program, an instant messenger, and a collaborative document sharing and editing tool. On the client side it uses JavaScript and HTML 5, running in browsers such as Chrome, Firefox and Safari, including on mobile platforms (iPhone, Android). On the server side it uses Java and Python, although the server side can be implemented with anything a customer wants. The tool was built with GWT and uses Google Gears to handle drag and drop, which is not yet included in HTML 5. It needs a dedicated server to handle concurrent communications, which matters especially for large teams. The server can be hosted in the cloud, inside a private enterprise or, simply, in someone’s home.

Google Wave was demoed during Google I/O last week.

The Platform

Google Wave comes with a public API, and the company promises to open source the entire platform before the product goes live. As a platform, Wave allows developers to modify the base code and extend it with gadgets and robots. Gadgets are small programs running inside a wave, while robots are “automated wave participants.” Wave can also be embedded in other media, such as blogs.

The Protocol

Data Model

The main elements of Google Wave’s data model are:

Wave - Each wave has a globally unique wave ID and consists of a set of wavelets.

Wavelet - A wavelet has an ID that is unique within its containing wave and is composed of a participant list and a set of documents. The wavelet is the entity to which Concurrency Control / Operational Transformations apply.

Participant - A participant is identified by a wave address, which is a text string in the same format as an email address (local-part@domain). A participant may be a user, a group or a robot. Each participant may occur at most once in the participant list.

Document - A document has an ID that is unique within its containing wavelet and is composed of an XML document and a set of "stand-off" annotations. Stand-off annotations are pointers into the XML document and are independent of the XML document structure. They are used to represent text formatting, spelling suggestions and hyperlinks. Documents form a tree within the wavelet.

Wave View - A wave view is the subset of wavelets in a wave that a particular user has access to. A user gains access to a wavelet either by being a participant on the wavelet or by being a member of a group that is a participant (groups may be nested).
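The relationships between these elements can be sketched in a few lines of Python. This is only an illustration of the definitions above, not Google's implementation: the class names, the tuple layout of the annotations, the `groups` mapping and the `is_member` helper are all assumptions made for the example, and the document tree is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str                       # unique within its wavelet
    xml: str = ""                     # the XML content
    # Stand-off annotations as (start, end, key, value) ranges into the XML.
    # This layout is an assumption made for illustration.
    annotations: list = field(default_factory=list)


@dataclass
class Wavelet:
    wavelet_id: str                   # unique within its wave
    participants: list = field(default_factory=list)
    documents: dict = field(default_factory=dict)

    def add_participant(self, address: str) -> None:
        local, sep, domain = address.partition("@")
        if not (local and sep and domain):
            raise ValueError(f"not a wave address: {address!r}")
        if address not in self.participants:   # at most once per wavelet
            self.participants.append(address)


def is_member(user, participant, groups, seen=None):
    """True if `user` is `participant` or belongs to it through nested groups."""
    if user == participant:
        return True
    if seen is None:
        seen = set()
    if participant in seen:           # guard against cyclic group definitions
        return False
    seen.add(participant)
    return any(is_member(user, m, groups, seen)
               for m in groups.get(participant, ()))


@dataclass
class Wave:
    wave_id: str                      # globally unique
    wavelets: dict = field(default_factory=dict)

    def view_for(self, user, groups):
        """The wave view: the wavelets this user has access to."""
        return [w for w in self.wavelets.values()
                if any(is_member(user, p, groups) for p in w.participants)]
```

Here `groups` is a hypothetical mapping from a group address to the addresses of its members, which may themselves be groups.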

Operational Transformation

This is the crucial part of Wave’s technology. Google Wave makes extensive use of Operational Transformations (OT), which are executed on the server. When a user edits a collaborative document that several users have open, the client program provides an optimistic UI: it immediately displays what the user types and, at the same time, sends the editing operation to the server for ratification. The client then waits for the server to evaluate the operation and caches any further operations until the server replies. After the server replies, all cached operations are sent from the client to the server in bulk. The server transforms each operation against the operations it has received from other clients and informs all clients of the result, and the clients update their UIs accordingly. Operations are sent to the server and propagated to each client character by character, unless they are bulk operations. The server is the keeper of the document, and its version is considered the “correct” version. In the end, each client is updated with the final version received from the server, which may be the result of many operational transformations. Recovery mechanisms are provided for communication failures and server or client crashes. All XML documents exchanged between the client and the server carry a checksum so that miscommunications can be identified quickly.
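To make the idea of transforming an operation concrete, here is a minimal sketch in Python. It assumes a plain-text document and insert-only operations, which is far simpler than Wave's operations on XML documents. The `Insert` type, the `transform` function and the tie-breaking rule on the client name are assumptions made for the example, not Wave's actual algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int      # character offset in the document
    text: str     # text to insert
    client: str   # used only to break ties between equal positions


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has been applied."""
    shifts = against.pos < op.pos or (
        against.pos == op.pos and against.client < op.client
    )
    if shifts:
        # `against` put text at or before our position: move right past it.
        return Insert(op.pos + len(against.text), op.text, op.client)
    return op


doc = "abc"
a = Insert(1, "X", "alice")
b = Insert(2, "Y", "bob")

# Two concurrent edits, applied in either order, converge on the same text.
order_ab = apply(apply(doc, a), transform(b, a))
order_ba = apply(apply(doc, b), transform(a, b))
print(order_ab, order_ba)
```

Whichever operation is applied first, the other is shifted just enough to preserve its author's intent, so both orderings produce the same document.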

Client-Server Protocol

Operations. Wavelets, the basic components of a wave, go through a series of changes called operations. These changes must be propagated to and applied by every client; otherwise a client gets out of sync.

Operation Sequencing. All operations applied to wavelets are sent in strict order. An operation is not sent until the server has responded to the previous one. The server orders the operations based on a version number. Each client needs to apply the operations respecting the proper order.

Opening a wavelet. To start communicating on a wavelet, a client sends an Open Request containing the Wave ID and the Wavelet ID to the server. The server responds with a snapshot - the serialized state of the wavelet - or a history hash of the corresponding version.

Server-client Communication. The server sends to the client one of the following: a delta (a sequence of one or more operations), a version number or a history hash.

Client-server Communication. The client sends: a delta or a version number.

Recovery. When communication fails, the client recovers by reopening the wavelet, sending a list of the history hashes it previously received from the server.
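The sequencing and recovery rules above can be sketched together in Python. This is a simplified illustration under several assumptions: the message dictionaries, the `send` callback, the class names and the SHA-256 hash chain are all invented for the example, a version here counts deltas, and the transformation of buffered operations against incoming server deltas is omitted.

```python
import hashlib


class SequencingClient:
    """Sends one delta at a time; edits made while waiting are buffered."""

    def __init__(self, send):
        self.send = send          # hypothetical transport callback
        self.version = 0          # last version acknowledged by the server
        self.in_flight = None     # delta awaiting the server's reply
        self.buffer = []          # operations made while waiting
        self.known_hashes = []    # history hashes received so far

    def local_edit(self, op):
        if self.in_flight is None:
            self._send([op])
        else:
            self.buffer.append(op)

    def on_ack(self, version, history_hash):
        self.version = version
        self.known_hashes.append(history_hash)
        self.in_flight = None
        if self.buffer:
            ops, self.buffer = self.buffer, []
            self._send(ops)       # buffered operations go out in bulk

    def _send(self, ops):
        self.in_flight = ops
        self.send({"version": self.version, "ops": ops})


class WaveletHistory:
    """Server-side delta log with a hash chain (the hash scheme is assumed)."""

    def __init__(self):
        self.deltas = []
        self.hashes = [b""]       # hashes[v] identifies the state at version v

    def append(self, delta: bytes) -> bytes:
        h = hashlib.sha256(self.hashes[-1] + delta).digest()
        self.deltas.append(delta)
        self.hashes.append(h)
        return h

    def reopen(self, known_hashes):
        """Resume from the newest hash both sides share: (version, missing deltas)."""
        known = set(known_hashes)
        for version in range(len(self.hashes) - 1, -1, -1):
            if self.hashes[version] in known:
                return version, self.deltas[version:]
        return 0, list(self.deltas)   # nothing in common: replay everything
```

A client that reconnects would pass its `known_hashes` to `reopen` and apply the returned deltas in order to catch up.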

Federation

The Google Wave Federation Protocol allows multiple entities (wave providers) to share waves with each other. A wave provider can be a server running in someone’s home that provides waves for a single user or a whole family, a corporation, or an ISP; Google is just another wave provider.

Useful links: Google Wave, Google Wave API, Wave Protocol.


  1. server-side

    by Balaji D Loganathan

    >and Java + Python on the server side,
    Hi Abel,
    Nice aggregation.
    Just want to confirm whether the Google Wave back end was based on Java + Python ?
    Regards
    Balaji D Loganathan, Spritle Software

  2. Re: server-side

    by Abel Avram

    Hi Balaji,
    I've read and watched about Wave too much to remember where I heard of the OT server being written in Java in order to provide a link.
    A hint is the fact that Google provides API for Java and Python for robots running on GAE on server's side. But Google wants to provide an additional lower-level API so one can use anything on the server. Check out: googlewavedev.blogspot.com/2009/05/introducing-...

    Regards,
    Abel

  3. Google IO notes

    by Richard L. Burton III

    You can find my Google IO Notes at

    tinyurl.com/googleionotes

    Cheers,
    Richard L. Burton III

  4. Re: Google IO notes

    by Abel Avram

    Hi Richard,
    thanks for sharing the notes. Very comprehensive and concise.
