Google Wave’s Architecture
First of all, a definition is needed. Google Wave is:
a new communication and collaboration platform based on hosted XML documents (called waves) supporting concurrent modifications and low-latency updates.
Google Wave was demoed during Google I/O last week.
Google Wave comes with a public API, and the company promises to open source the entire platform before the product goes live. As a platform, Wave allows developers to modify the base code and extend it with gadgets and robots. Gadgets are small programs running inside a wave, while robots are “automated wave participants.” Wave can also be embedded in other media, such as blogs.
The main elements of Google Wave’s data model are:
Wave - Each wave has a globally unique wave ID and consists of a set of wavelets.
Wavelet - A wavelet has an ID that is unique within its containing wave and is composed of a participant list and a set of documents. The wavelet is the entity to which Concurrency Control / Operational Transformations apply.
Participant - A participant is identified by a wave address, which is a text string in the same format as an email address (local-part@domain). A participant may be a user, a group or a robot. Each participant may occur at most once in the participant list.
Document - A document has an ID that is unique within its containing wavelet and is composed of an XML document and a set of "stand-off" annotations. Stand-off annotations are pointers into the XML document and are independent of the XML document structure. They are used to represent text formatting, spelling suggestions and hyper-links. Documents form a tree within the wavelet.
Wave View - A wave view is the subset of wavelets in a wave that a particular user has access to. A user gains access to a wavelet either by being a participant on the wavelet or by being a member of a group that is a participant (groups may be nested).
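The data model above can be sketched in a few Python classes. This is only an illustration of the relationships described in the list, not Google's implementation: the class names, field names, the `groups` mapping and the `wave_view` helper are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Participant:
    address: str          # same shape as an email address: local-part@domain
    kind: str = "user"    # "user", "group" or "robot"


@dataclass
class Document:
    doc_id: str                       # unique within its wavelet
    xml: str = ""
    parent_id: Optional[str] = None   # documents form a tree within the wavelet
    # Stand-off annotations: (start, end, key, value) ranges pointing into the XML.
    annotations: list = field(default_factory=list)


@dataclass
class Wavelet:
    wavelet_id: str                   # unique within its wave
    participants: list = field(default_factory=list)
    documents: dict = field(default_factory=dict)

    def add_participant(self, p):
        # Each participant may occur at most once in the list.
        if p not in self.participants:
            self.participants.append(p)


@dataclass
class Wave:
    wave_id: str                      # globally unique
    wavelets: dict = field(default_factory=dict)


def is_member(address, group, groups, seen=None):
    """True if `address` belongs to `group`, following nested groups."""
    seen = set() if seen is None else seen
    if group in seen:                 # guard against cyclic group definitions
        return False
    seen.add(group)
    for member in groups.get(group, ()):
        if member == address or is_member(address, member, groups, seen):
            return True
    return False


def wave_view(wave, address, groups):
    """Return the wavelets of `wave` that `address` can access."""
    view = []
    for wavelet in wave.wavelets.values():
        for p in wavelet.participants:
            if p.address == address or (
                p.kind == "group" and is_member(address, p.address, groups)
            ):
                view.append(wavelet)
                break
    return view
```

Here `groups` is a hypothetical mapping from a group address to its member addresses; a member may itself be a group, which is how nesting is followed.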
Concurrency control is the crucial part of Wave’s technology. Google Wave makes extensive use of Operational Transformations (OT), which are executed on the server. When a user edits a collaborative document that several users have open, the client program provides an optimistic UI: it displays what the user types immediately and sends the editing operation to the server for ratification. The client then waits for the server to evaluate the operation and caches any further operations until the server replies. After the reply arrives, all cached operations are sent from the client to the server in bulk. The server, taking into account operations received from other clients, transforms the operation accordingly and informs all clients about the transformation, and the clients update their UI to match. Operations are sent to the server and propagated to each client character by character, unless the operation is a bulk operation. The server is the keeper of the document, and its version is considered the “correct” version. In the end, each client is updated with the final version received from the server, which is the result of possibly many operational transformations. Recovery mechanisms are provided for communication failures and for server or client crashes. All XML documents exchanged between the client and the server carry a checksum for rapid identification of miscommunications.
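To make the idea of transforming one operation against another concrete, here is a minimal sketch for the simplest possible case: two users inserting single characters into plain text at the same time. Wave's real operations work on XML documents and are considerably richer; the `Insert` class, the `transform` function and the tie-breaking rule by client name are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int      # index in the text where the character goes
    char: str
    client: str   # used only to break ties deterministically


def apply(text, op):
    """Apply an insert operation to a string."""
    return text[:op.pos] + op.char + text[op.pos:]


def transform(op, against):
    """Rewrite `op` so it can be applied after `against` has been applied."""
    shift = against.pos < op.pos or (
        against.pos == op.pos and against.client < op.client
    )
    return Insert(op.pos + 1, op.char, op.client) if shift else op


# Two users edit "wave" concurrently at the same position.
a = Insert(2, "x", "alice")
b = Insert(2, "y", "bob")

# The server accepts a first, then applies b transformed against a.
server = apply(apply("wave", a), transform(b, a))

# Bob already shows his own edit, then applies a transformed against b.
bob = apply(apply("wave", b), transform(a, b))

print(server, bob)
```

Both sides end up with the same text even though they applied the two edits in a different order, which is the property the server-side transformation is meant to guarantee.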
Operations. Wavelets, the basic component of a wave, go through a series of changes called operations. These changes need to be propagated and applied to each client; otherwise, a client gets out of sync.
Operation Sequencing. All operations applied to wavelets are sent in strict order. An operation is not sent until the server has responded to the previous one. The server orders the operations based on a version number. Each client needs to apply the operations respecting the proper order.
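The client side of this rule can be sketched as a small queue: one delta is in flight at a time, and anything typed in the meantime is held back and sent as one batch once the server has answered. The class name `WaveletClient`, the `send` callback and the `on_ack` method are hypothetical, and the sketch leaves out the transformation of pending operations against deltas arriving from other clients.

```python
class WaveletClient:
    def __init__(self, send):
        self.send = send          # callable(delta, version); hypothetical transport
        self.version = 0          # last version acknowledged by the server
        self.in_flight = None     # delta sent but not yet acknowledged
        self.pending = []         # operations typed while waiting

    def local_edit(self, op):
        # The UI has already shown the edit; here we only decide when to send it.
        if self.in_flight is None:
            self._send([op])
        else:
            self.pending.append(op)

    def on_ack(self, new_version):
        # The server has ratified the in-flight delta at `new_version`.
        self.version = new_version
        self.in_flight = None
        if self.pending:
            batch, self.pending = self.pending, []
            self._send(batch)     # everything cached goes out in bulk

    def _send(self, delta):
        self.in_flight = delta
        self.send(delta, self.version)
```

With this shape, typing "a", "b" and "c" in quick succession produces two messages: the first carries only "a", and the second carries "b" and "c" together after the acknowledgement arrives.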
Opening a wavelet. To start communicating on a wavelet, a client sends an Open Request containing the Wave ID and the Wavelet ID to the server. The server responds with a snapshot - the serialized state of the wavelet - or a history hash of the corresponding version.
Server-client Communication. The server sends to the client one of the following: a delta (a sequence of one or more operations), a version number or a history hash.
Client-server Communication. The client sends: a delta or a version number.
Recovery. When the communication fails, the client starts by reopening the wavelet, sending a list of history hashes previously received from the server.
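One way to picture recovery is that the server keeps a history hash per version and, on reopen, looks for the newest version whose hash the client also holds, then replays only the deltas after it. The sketch below works under that assumption; the chained SHA-256 scheme in `next_hash`, the `WaveletServer` class and its `reopen` method are invented for illustration and are not taken from the Wave protocol.

```python
import hashlib


def next_hash(prev_hash, delta):
    """Chain a delta onto the previous history hash (assumed scheme)."""
    return hashlib.sha256((prev_hash + "|" + delta).encode("utf-8")).hexdigest()


class WaveletServer:
    def __init__(self):
        self.deltas = []
        self.hashes = ["0"]   # hashes[v] is the history hash at version v

    def submit(self, delta):
        """Append a delta and return the new version and its history hash."""
        self.deltas.append(delta)
        self.hashes.append(next_hash(self.hashes[-1], delta))
        return len(self.deltas), self.hashes[-1]

    def reopen(self, known_hashes):
        """Resume from the newest version whose hash the client also knows."""
        known = set(known_hashes)
        for version in range(len(self.hashes) - 1, -1, -1):
            if self.hashes[version] in known:
                return version, self.deltas[version:]
        return 0, list(self.deltas)   # nothing in common: replay everything
```

A client that last saw version 1 before the connection dropped would send the hash it received for that version and get back only the later deltas, rather than a full snapshot.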
The Google Wave Federation Protocol allows multiple entities (wave providers) to share waves with each other. A wave provider can be a server running in someone’s home that provides waves for a single user or a whole family, a corporation, or an ISP; Google is just another wave provider.
Balaji D Loganathan
Just want to confirm whether the Google Wave back end is based on Java + Python?
Balaji D Loganathan, Spritle Software
I've read and watched so much about Wave that I can't remember where I heard that the OT server is written in Java, so I can't provide a link.
One hint is that Google provides Java and Python APIs for robots running server-side on GAE. But Google wants to provide an additional lower-level API so that one can use anything on the server. Check out: googlewavedev.blogspot.com/2009/05/introducing-...
Google IO notes
Richard L. Burton III