InfoQ Case Study: NASDAQ Market Replay

NASDAQ Market Replay provides a NASDAQ-validated replay and analysis of the activity in the stock market. The application is built using the Adobe Flex and AIR platform, and utilizes the Amazon Simple Storage Service (S3) for persisting historical market data. The combination of S3 and AIR offers a powerful deployment model with little internal infrastructure required. The simple, yet robust, deployment is possible because the AIR runtime runs on the client machine. Amazon S3 removes the need for a traditional middle-tier server, as the data is accessed directly from the Amazon "cloud".

The Market Replay application enables users to view the best bids and offers at any point in time, replay the market in simulated real-time, and zoom to view events at the millisecond level. Investors can validate best-execution and Reg NMS compliance. Brokers and traders can review events at the time when their trades occurred to determine whether there was a problem or a missed opportunity. Brokers can send clients a NASDAQ-validated replay of the moment a trade occurred to validate their performance.

Figure 1. Market Replay Case Study

Problem Domain

A universal experience among investors and traders, both professional and non-professional, is the question "what happened?" Did the broker get best execution and comply with Reg NMS? Did the trader miss an opportunity? Why did a retail investor get a different price than expected? What investors, traders, and compliance officers need to answer these questions is a way to rewind and replay the market, slow it down, and zoom in to the second and even millisecond level to see exactly what happened.

The NASDAQ Market Replay application does this using "NASDAQ Official" validated data. There are a number of use cases where Market Replay is useful. For example:

  • A compliance officer receives a report indicating that trades may not be in compliance with Reg NMS or other best execution rules. The compliance officer uses NASDAQ Market Replay to reconstruct quotes at the time of the trades, move forward and backward to look for timestamp mismatches, and zoom to the millisecond level to see quotes that lasted less than a millisecond. After completing the analysis, the officer can show the NASDAQ-official replay or screenshots to customers or regulators to verify compliance.
  • A retail investor receives a trade confirmation from his broker and wants to know why the price differs from the one he expected. The retail investor uses NASDAQ Market Replay at a financial portal to replay the market at the time of his trade. The investor learns about his broker's performance and about the market as a whole.
  • A brokerage call-center operator receives a call from a client questioning the price for a trade. Rather than providing general assurances, the operator can provide the customer with a NASDAQ-branded screen shot from Market Replay that shows the instant the trade occurred. The operator can provide the customer with a link to Market Replay where the customer can view the replay of events leading up to and after the trade. Brokerages could include a link to the relevant replay on NASDAQ Market Replay with each trade confirmation. Market Replay will reduce call volume and reduce time spent on the phone for each customer inquiry.
  • A trader, active investor, or day trader makes a great trade that takes advantage of a series of events in the market. Or, the trader sees an interesting market event and wants to better understand what happened. The trader uses NASDAQ Market Replay to replay and review events to see exactly what happened. The trader shares the replay with other active investors to demonstrate his trading skill or ask others for their input.

NASDAQ Market Replay provides both casual and professional investors the information they need to understand what has happened in the market.

Solution Overview

The Market Replay application was launched in February, 2008, at the same time as Adobe officially released the Adobe AIR 1.0 platform for production use. With the power of the AIR and S3 platforms, it only took about 6 months to take the application from concept to a working production implementation. A team of 10 people worked on the application at different points throughout the implementation, working on both the backend data offloads and the user interface.

The client desktop application was built using Adobe Flex and runs in the AIR runtime. Adobe Flex is an application development framework that is used by developers to build applications that run in the Flash Player. Adobe AIR allows developers to create applications for the desktop using Web technologies such as HTML/CSS, Ajax, Flash, and Flex. In addition, AIR provides offline support and a simple deployment paradigm for delivering the client applications.

S3 provides a robust solution for storing the high volumes of data required to provide the replays. This allows for a unique deployment as the AIR application runs on the user's computer and data is stored in S3. Thus, no heavy server infrastructure was required to deploy the application into production.

The Market Replay implementation improves on similar solutions in a number of ways. For example:

  • Compared to existing tools for showing the status of the order book at a certain point in time, Market Replay provides a higher level of ease of use, visualization, and the replay functionality thanks to Adobe Flex and AIR technologies.
  • Compared to a single firm or vendor's internally stored market data, NASDAQ Market Replay provides NASDAQ-validated data coming straight from the source.
  • Compared to manually building the order book using an historical quote database and expensive analytical software, NASDAQ Market Replay is faster, less expensive, and less prone to error.
  • Compared to internal databases that typically offload data after 10 to 30 days because the size of the database makes it slow and expensive to operate, Market Replay never needs to offload data because it uses S3, an inexpensive file system that is extremely scalable.

The pairing of AIR and S3 makes it possible to retrieve and visualize the data quickly for the users. One of the application's main functions is allowing users to view the state of the consolidated order book at any point in time. The application is able to quickly load new order book files from S3, and the AIR application uses the processing power of the user's desktop computer to update the user interface. This enables users to move easily from one instant to another, without waiting for a server to recalculate and disseminate the next state of the order book.

Drill Down: Adobe Flex & AIR

An essential feature of the application is that it provides users a realistic replay of the market activity in the same way a trader would have seen it at their workstation in real time. This functionality requires sorting and aggregating the order update messages to create the consolidated order book at any point in time. The Flex interface provides two main displays to accomplish this:

  1. Time chart: the time chart displays the best bid and offer at each point in time calculated from the quote data.
  2. Order book: the order book displays the state of the full order book at each instant and updates it dynamically during a replay.
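The two displays can be thought of as two views over the same set of quote intervals. The following is a minimal sketch in Python (not the application's ActionScript) of deriving the order book state and the best bid and offer at one instant. The `Quote` type, the sample values, and the choice to treat each quote as active from its start time up to, but not including, its end time are all assumptions made for illustration.

```python
from typing import NamedTuple


class Quote(NamedTuple):
    exchange: str
    bid_size: int
    offer_size: int
    bid: float
    offer: float
    start_ms: int  # milliseconds since midnight
    end_ms: int    # milliseconds since midnight


def book_at(quotes, t_ms):
    """Return the quote in force on each exchange at time t_ms."""
    book = {}
    for q in quotes:
        # Assumption: a quote is active on the half-open interval [start, end).
        if q.start_ms <= t_ms < q.end_ms:
            book[q.exchange] = q
    return book


def best_bid_offer(book):
    """Highest bid and lowest offer across exchanges, skipping empty sides."""
    bids = [q.bid for q in book.values() if q.bid_size > 0]
    offers = [q.offer for q in book.values() if q.offer_size > 0]
    return (max(bids) if bids else None, min(offers) if offers else None)


# Hypothetical quotes for two exchanges.
quotes = [
    Quote("Q", 200, 200, 40.79, 40.83, 1000, 2000),
    Quote("P", 200, 300, 40.78, 40.84, 1200, 2500),
    Quote("Q", 200, 100, 40.80, 40.85, 2000, 3000),
]

print(best_bid_offer(book_at(quotes, 1500)))
print(best_bid_offer(book_at(quotes, 2200)))
```

Calling `book_at` for successive instants yields the order book view, and collecting the results of `best_bid_offer` over time yields the series a time chart would plot.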

It is key that the application be able to display every detail in the data. This requires zooming down to quote updates that could be as short as a fraction of a millisecond. The application must allow users to zoom into smaller and smaller time frames to view each element in the data. In addition, the replays must be slowed down so that those events can be understood by the human eye. These types of animations / visualizations of the data made Adobe Flex a natural choice.

One of the major advantages of deploying to the Flash runtime is the native animation support, which provides the foundation for the powerful visualizations in the Replay application. The Adobe Flex framework, which is built on top of the Flash APIs, provides a full set of charting components for displaying common visualizations of data. The Flex charting components, along with the other out-of-the-box Flex components, are implemented to provide rich functionality while remaining extensible. Thus, the standard Flex charting components were extended and customized for the Market Replay application. By beginning with the off-the-shelf Flex data visualization components, NASDAQ was able to develop its custom components more quickly than would have been possible starting from scratch.

Using the out-of-the-box Flex charting components is fairly straightforward. In the following example, fictitious data is hard coded into the MXML source file. MXML is a declarative XML markup provided as part of the Flex framework for coding parts of Flex applications. It is an abstraction on top of the core Flash Player programming language ActionScript.

<?xml version="1.0"?>
<mx:Application xmlns:mx="http://www.adobe.com/2006/mxml">

    <mx:Script>
        <![CDATA[
            import mx.collections.ArrayCollection;

            [Bindable]
            private var stockDataAC:ArrayCollection = new ArrayCollection( [
                { Date: "25-Jul", Open: 40.55, High: 40.75, Low: 40.24, Close: 40.31 },
                { Date: "26-Jul", Open: 40.15, High: 40.78, Low: 39.97, Close: 40.34 },
                { Date: "27-Jul", Open: 40.38, High: 40.66, Low: 40, Close: 40.63 },
                { Date: "28-Jul", Open: 40.49, High: 40.99, Low: 40.3, Close: 40.98 },
                { Date: "29-Jul", Open: 40.13, High: 40.4, Low: 39.65, Close: 39.95 },
                { Date: "1-Aug", Open: 39.00, High: 39.50, Low: 38.7, Close: 38.6 },
                { Date: "2-Aug", Open: 38.68, High: 39.34, Low: 37.75, Close: 38.84 },
                { Date: "3-Aug", Open: 38.76, High: 38.76, Low: 38.03, Close: 38.12 },
                { Date: "4-Aug", Open: 37.98, High: 37.98, Low: 36.56, Close: 36.69 },
                { Date: "5-Aug", Open: 36.61, High: 37, Low: 36.48, Close: 36.86 } ]);
        ]]>
    </mx:Script>

    <mx:Panel title="Sample Visualization" height="100%" width="100%">

        <mx:HLOCChart id="hlocchart" height="100%" width="100%"
            paddingRight="5" paddingLeft="5"
            showDataTips="true" dataProvider="{stockDataAC}">
            <mx:verticalAxis>
                <mx:LinearAxis baseAtZero="false" />
            </mx:verticalAxis>

            <mx:horizontalAxis>
                <mx:CategoryAxis categoryField="Date" title="Date"/>
            </mx:horizontalAxis>

            <mx:horizontalAxisRenderer>
                <mx:AxisRenderer canDropLabels="true"/>
            </mx:horizontalAxisRenderer>
            <mx:series>
                <mx:HLOCSeries openField="Open" highField="High"
                    lowField="Low" closeField="Close"/>
            </mx:series>
        </mx:HLOCChart>

    </mx:Panel>
</mx:Application>

In this example, data is hard coded in the source to keep the example self-contained. In the actual Market Replay implementation, real data is loaded from the Amazon S3 data store. The HLOCChart (High Low Open Close) Flex component then renders a visualization of the data, as shown in the following screenshot.

Figure 2. Sample Visualization

Adobe AIR provides an ideal runtime for the client application, as local resources allow calculations on large amounts of data to be done quickly. The application allows users to select a time range and calculate the minimum and maximum best bids and offers for each exchange, as well as for the national best bid and offer. These calculations typically involve a large amount of data and heavy processing. Processing user requests on the server would have required a powerful web server, and would have resulted in delays as customers waited for a round trip on each calculation. Adobe AIR allowed for minimal server infrastructure by pushing most of the work to the user's desktop.
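As a rough illustration of the kind of range calculation that runs on the client, the sketch below (in Python rather than ActionScript) computes the minimum and maximum bid and offer per exchange over a selected time range. The tuple layout, the sample values, and the overlap rule are assumptions for the example, and the national best bid and offer is left out for brevity.

```python
from collections import defaultdict


def range_stats(quotes, start_ms, end_ms):
    """Per-exchange min/max bid and offer for quotes overlapping [start_ms, end_ms)."""
    bids = defaultdict(list)
    offers = defaultdict(list)
    for exchange, bid, offer, q_start, q_end in quotes:
        # Keep any quote whose lifetime overlaps the selected range.
        if q_start < end_ms and q_end > start_ms:
            bids[exchange].append(bid)
            offers[exchange].append(offer)
    return {
        ex: {
            "bid_min": min(bids[ex]),
            "bid_max": max(bids[ex]),
            "offer_min": min(offers[ex]),
            "offer_max": max(offers[ex]),
        }
        for ex in bids
    }


# Hypothetical quotes: (exchange, bid, offer, start_ms, end_ms).
QUOTES = [
    ("Q", 40.79, 40.83, 1000, 2000),
    ("Q", 40.80, 40.85, 2000, 3000),
    ("P", 40.78, 40.84, 1200, 2500),
    ("Q", 40.70, 40.90, 3000, 4000),
]

stats = range_stats(QUOTES, 1500, 2600)
print(stats["Q"])
```

Because the whole file for the period is already in memory on the client, a pass like this needs no further request to a server.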

Another way AIR improved the user experience was by enabling replay and analysis functionality that does not depend on uninterrupted network access. Once the data file is retrieved from S3, the replay and calculations can be used with or without an internet connection. This is useful not only for NASDAQ's clients but also for NASDAQ's sales force when they're visiting customers and cannot count on having internet access when they demonstrate the application.

The power of Adobe AIR and Flex was a key factor in launching development efforts for the application. They provided the necessary tools to quickly show quality results, as the work began with off-the-shelf Flex components. The components were then customized as the application evolved, eventually leading to the completed application.

Drill Down: Amazon Simple Storage Service (S3)

Amazon S3 was selected because Market Replay requires that historical market data be stored in a way that is inexpensive and extremely scalable. Stock markets generate many gigabytes of trading data each day. The Market Replay application requires that every detail be stored and rapidly available so that the system can answer users' requests quickly.

S3 was also desirable because NASDAQ wanted to offer a data service that is able to keep many years of data online and immediately available at a reasonable cost to all involved. With S3, Market Replay can support billions of files for countless numbers of users without sacrificing performance.

Market Replay supports users who have regulatory, legal, and customer inquiries about trades that may have occurred many months or years ago. It is a key requirement that all historical replay data be immediately available. S3 has proven to be able to maintain consistently fast access speeds.

Before loading data into S3, NASDAQ uses staging servers to convert the data from the format that is distributed over real-time feeds, to a format that is optimized for replaying. This proprietary conversion process results in very simple and efficient text files that are optimized for quick upload and download. The files contain all the quote information required for the AIR desktop client application to build replays and analyses of the market in extreme detail. In full production, this means that Market Replay loads hundreds of thousands of files each day to S3.

Market Replay utilizes a simple and small comma-delimited flat file format for the data. S3 is designed to store and retrieve such huge numbers of files quickly and reliably. The files have a simple, human-readable format, as seen in the example below. The filename provides the stock ticker symbol, date, and start time of the period covered by the data in the file. The first few records provide the initial state of each exchange's bid and offer. Subsequent records show all the changes. The included fields are: exchange, sequence number, shares at bid, shares at offer, price of bid, price of offer, start time (milliseconds since midnight), and end time (milliseconds since midnight).

M,7838954,300,100,39.81,200,40136513,42919007
I,8557803,0,0,0,0,40838710,44256757
W,10814573,200,200,40.63,40.99,42896510,42901353
D,10816233,800,100,40.57,40.86,42897527,42900730
C,10816354,100,100,40.79,40.83,42897590,42900667
P,10817504,200,300,40.79,40.83,42898433,42900667
Q,10817505,200,200,40.79,40.83,42898437,42900657
Q,10819570,200,200,40.79,40.84,42900657,42900657
Q,10819576,200,100,40.79,40.87,42900657,42900657
Q,10819577,200,100,40.79,40.88,42900657,42900657
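To make the field layout concrete, here is a minimal parsing sketch in Python (the real client does this in ActionScript). It follows the field order listed above; the `QuoteRecord` type and the helper names are hypothetical, chosen only for this example.

```python
from typing import NamedTuple


class QuoteRecord(NamedTuple):
    exchange: str
    sequence: int
    bid_shares: int
    offer_shares: int
    bid_price: float
    offer_price: float
    start_ms: int  # milliseconds since midnight
    end_ms: int    # milliseconds since midnight


def parse_record(line):
    """Split one comma-delimited record into typed fields."""
    ex, seq, bid_sh, off_sh, bid_px, off_px, start, end = line.strip().split(",")
    return QuoteRecord(ex, int(seq), int(bid_sh), int(off_sh),
                       float(bid_px), float(off_px), int(start), int(end))


def ms_to_clock(ms):
    """Format milliseconds since midnight as HH:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


record = parse_record("W,10814573,200,200,40.63,40.99,42896510,42901353")
print(record.exchange, record.bid_price, record.offer_price)
print(ms_to_clock(record.start_ms), "to", ms_to_clock(record.end_ms))
```

For the record shown, the start time of 42896510 milliseconds corresponds to 11:54:56.510, and the quote lasted 4,843 milliseconds.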

Organizing and managing millions of files would seem problematic, but the Replay application fits well with the flat file model, as there is a limited amount of data needed for any particular replay or analysis. That amount of data fits in a manageable text file on the server. The application knows how to convert the user's replay request into a filename, request it from Amazon S3, and then parse it once it arrives. The application is not currently designed for open-ended queries across many stocks or across long time periods. It is designed to provide views and analysis for a single stock in extreme detail.

Amazon provides both REST and SOAP interfaces for accessing data stored in S3. The Flex framework provides support for working with both REST and SOAP interfaces. In addition, there is an open source API, called as3awss3lib, hosted on Google Code that provides comprehensive support for interacting with S3 in the AIR runtime using ActionScript.

An example of accessing files using the as3awss3lib follows:

//sample method that starts the download of a file
private function getFile():void {
    //creates as3awss3lib wrapper with auth parameters
    var s3Service:AWSS3 = new AWSS3(this.accessKey, this.secretAccessKey);

    //add event handlers for async calls
    s3Service.addEventListener(IOErrorEvent.IO_ERROR, onIOError);
    s3Service.addEventListener(AWSS3Event.ERROR, onError);
    s3Service.addEventListener(AWSS3Event.OBJECT_RETRIEVED, onFileDownloaded);

    //calls AWSS3 method to get file
    s3Service.getObject(fileName, key);
}

//event handler invoked when the file is returned
private function onFileDownloaded(e:AWSS3Event):void {

    //get file details; append an extension derived from the MIME type
    //when the key does not already contain one
    var currentObject:Object = downloadQueue.shift();
    var ext:String = mimeMap.getExtension(e.data.type);
    var fileName:String = (ext != null && currentObject.key.indexOf(".") == -1) ?
        currentObject.key + "." + ext : currentObject.key;

    //save file to specified downloadLocation
    var fs:FileStream = new FileStream();
    fs.open(downloadLocation.resolvePath(fileName), FileMode.WRITE);
    fs.writeBytes(e.data.bytes);
    fs.close();

    Alert.show("Your file(s) have been successfully downloaded.", "Success!", Alert.OK, null, null, null);
}

In this example, the "getFile" method creates the AWSS3 class provided by as3awss3lib and invokes the "getObject" method for retrieving the desired file. Flex accesses all remote services asynchronously. Thus, the getObject call happens asynchronously and uses the handler declared for the AWSS3Event.OBJECT_RETRIEVED event to process results when the call is returned. In this sample code, the file is saved and an Alert is shown to the user.

The data preparation process breaks the data into individual files. One file covers a single stock symbol, on a single day, for a single 10-minute time period (all time periods are standardized 9:25-9:35, 9:35-9:45, etc). The filename identifies the stock, date, and time-period of the data in the file. It is entirely sequential by date, stock symbol, and time. Users begin a replay by entering a stock symbol, date, and time. The client application translates that symbol, date, time information into a filename. The client then reaches out to S3 and asks for the file that it needs. S3 is very efficient at quickly retrieving files and returning the contents.
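The translation from a replay request to a file can be sketched as follows, in Python for illustration. The 10-minute periods aligned to 9:25, 9:35, and so on come from the description above; the actual NASDAQ filename layout is not given in this article, so the `SYMBOL_YYYYMMDD_HHMM.csv` pattern and the function names are purely hypothetical.

```python
from datetime import date, time

PERIOD_MINUTES = 10
PERIOD_OFFSET = 5  # periods start at :05, :15, :25, ... (e.g. 9:25, 9:35)


def period_start(t):
    """Round a time of day down to the start of its 10-minute period."""
    minutes = t.hour * 60 + t.minute
    start = (minutes - PERIOD_OFFSET) // PERIOD_MINUTES * PERIOD_MINUTES + PERIOD_OFFSET
    if start < 0:
        raise ValueError("time falls before the first period of the day")
    return time(start // 60, start % 60)


def replay_key(symbol, day, t):
    """Build a hypothetical file key from symbol, date, and time."""
    start = period_start(t)
    return f"{symbol.upper()}_{day:%Y%m%d}_{start:%H%M}.csv"


print(replay_key("msft", date(2008, 2, 25), time(9, 31, 12)))
print(period_start(time(11, 54, 56)))
```

Because the key is computed entirely on the client, no index or query service is needed: the client can request the object by name from S3 directly.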

Along with the technical benefits of Amazon S3, it was selected because of its pricing model. Its pricing is transparent and predictable. It allows for accurately forecasting cost of operations, and monitoring spending in real time.

Careful preparation of the files uploaded to S3 and the S3 cost profile are critical to enabling NASDAQ to commit to never offloading data. S3 is inexpensive and most of the expense occurs when data is uploaded or downloaded. The cost of maintaining the large number of files that are not touched in a typical month is very low.

S3 charges only for the capacity Market Replay uses. This greatly reduced the cost of developing and launching Market Replay because it did not require NASDAQ to purchase hardware that would take months or years to use at full capacity. S3's scalability and pricing model allow capacity to be scaled up as needed, without having to buy extra "headroom" in anticipation of a possible acceleration in user base growth.

Conclusion

NASDAQ Market Replay is nearing the release of a new version that will include trades as well as quotes. It required major changes to the application, including: synchronizing the quotes data and trades data, a complete rewrite of the chart interface, a new panel that includes a list of trades and their characteristics, and new calculations done on the trades data. The quality and power of the Market Replay application architecture and the AIR and S3 platforms have been demonstrated through the enhancement process, as these major updates were implemented in only about a month of development time. In addition, AIR makes it easy to roll out new versions of the application because the application checks for and installs updates each time it is launched.

It is an exciting time for the software industry, as the NASDAQ Market Replay implementation demonstrates that a powerful data-driven application can be brought to market quickly and deployed within a limited budget. Much of this is due to improvements in the platforms, with both the emergence of cloud computing and strong client-side runtimes.

Readers can get free trial access to NASDAQ's Market Replay application at NASDAQ's DataStore (https://data.nasdaq.com/MR.aspx). Those who would like full access should request the application from their regular stock market data provider such as a broker, financial web portal, or financial information vendor.

Community comments

  • Thanks

    by sunil d'monte,

    Really interesting case study, thanks for posting it. I wonder if you could elaborate more on your decision not to have a server application sitting between the client and S3? E.g. could you share some performance numbers if you did any prototyping? Also, are you storing any user state, and if so, where? E.g. what if you had a "Save this search as..." feature, which users could re-use at a later point of time...
