Building a Mars Rover Application with DynamoDB

Posted by Kenta Yasukawa and Daniela Miao on Jan 25, 2015. Estimated reading time: 15 minutes

 

DynamoDB is a fast and flexible NoSQL database service that can be easily managed, so you don't have to worry about administrative burdens such as operating and scaling your databases. Instead, you can focus on designing your application, and launch it on DynamoDB with a few simple steps.

In this article, we will show you how to build an application with Amazon DynamoDB.

Mars Rover Application

The sample application we discuss in this article demonstrates the capabilities of DynamoDB. The web application showcases data that NASA has made publicly available: images that the Curiosity Mars Rover has been sending back from the red planet, along with their metadata in JSON format. A short snippet of the NASA JSON data is shown below, along with a snapshot of the demo application; you can also try out the live demo.

Figure 1: A screenshot of the Mars Rover demo application

Here is the JSON data set with the image details.

Figure 2: A snippet of image JSON data from NASA

Prior to launching the demo application, we collected all NASA data as shown above and ingested all image JSON data into a DynamoDB table for later querying. After all data have been ingested into DynamoDB, we perform various queries and updates on the tables to generate the Mars Rover app which displays beautiful image galleries as shown in the demo.

The default view of the application is a timeline of all images received from one of Curiosity's cameras, or instruments, displayed in reverse chronological order. Users can vote for their favourite pictures, and a real-time vote count is maintained for each image. In addition, users can open the "Mission Control" side menu to change the instrument used, change the date range, or sort images by vote count instead. Finally, users can view the images they have voted for under the "My Favorites" option.

All features described are enabled through querying the DynamoDB table where the image data are stored. In order to build such an application, you would normally need to consider various functional components such as access control, user tracking, serializing/deserializing data etc. We want to show you how DynamoDB makes this all simple by explaining how we built the Mars Rover demo, and how you can build your own application with DynamoDB too! However, before we deep dive into the demo, let's go through a quick primer on DynamoDB.

Data Model

The DynamoDB data model concepts include tables, items and attributes. A table is a collection of items and each item is a collection of attributes.

Unlike a relational database, DynamoDB is a schema-less NoSQL database. Individual items in a DynamoDB table can have any number of attributes. Each attribute in an item is a name-value pair. An attribute can be single-valued or a multi-valued set; the data types are discussed in more detail below. In addition, the newly released JSON document support allows JSON objects to be stored directly as items in DynamoDB, up to 400KB per item. For example, NASA provides each image from the Mars Rover as a JSON object, so each image can be stored as a single item in DynamoDB, and attributes such as location and time can be imported directly.

Consider storing a set of photos from the Mars Rover in DynamoDB. You can create a table, marsDemoImages, with a unique imageid attribute assigned to each image; this attribute is the table's primary (hash) key: marsDemoImages ( imageid, ... )

Each item in this table could have several other attributes; a few examples are shown below:

Table 1: Example items of the marsDemoImages table

Note that "imageid" is the only required attribute in this case; all other attributes could be imported automatically from the NASA JSON image data set. In fact, in this case item 101 doesn't have a "camera_model" attribute. "Mission+InstrumentID" is a composite attribute, which will be explained in a later section.

Primary Key

When you create a table you must specify the primary key of the table. DynamoDB supports the following two types of primary keys:

  • Hash Type Primary Key: the primary key is made of one attribute, a hash attribute. In the preceding example, the hash attribute for the marsDemoImages table is "imageid", as shown below.

Table 2: Example items of the marsDemoImages table with primary hash key highlighted

  • Hash and Range Type Primary Key: the primary key is made of two attributes. The first attribute is the hash attribute and the second one is the range attribute. In the Mars Rover example, let's assume we want to group items by "imageid" first, then by "votes". In that case, the hash attribute will be "imageid" and the range attribute will be "votes".

Table 3: Example items of the marsDemoImages table with primary hash and range keys highlighted
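To make the two key types concrete, here is a minimal sketch of how each could be declared in the shape accepted by the DynamoDB CreateTable API. The attribute types ("N" for Number) are assumptions for illustration; only the attribute names come from the example above.

```javascript
// Key definitions in the shape used by the DynamoDB CreateTable API.
// Attribute types ("N" = Number) are assumed for illustration.

// Hash type primary key: a single hash attribute.
const hashOnlyKey = {
  AttributeDefinitions: [{ AttributeName: "imageid", AttributeType: "N" }],
  KeySchema: [{ AttributeName: "imageid", KeyType: "HASH" }]
};

// Hash and range type primary key: the hash attribute comes first,
// followed by the range attribute.
const hashAndRangeKey = {
  AttributeDefinitions: [
    { AttributeName: "imageid", AttributeType: "N" },
    { AttributeName: "votes", AttributeType: "N" }
  ],
  KeySchema: [
    { AttributeName: "imageid", KeyType: "HASH" },
    { AttributeName: "votes", KeyType: "RANGE" }
  ]
};

console.log(JSON.stringify(hashOnlyKey.KeySchema));
console.log(JSON.stringify(hashAndRangeKey.KeySchema));
```

Every attribute named in a KeySchema must also appear in AttributeDefinitions; non-key attributes are never declared up front.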

Queries, Updates and Scans

In addition to using primary keys to access and manipulate specified items, Amazon DynamoDB also provides several operations for finding and modifying data: Query, Update and Scan.

  • Query: a Query operation finds items in a table using only primary key attribute values. You must provide a hash key attribute-value pair and optionally a range key attribute-value pair. For instance, in the Mars Rover app, we can query for a specific picture by setting "imageid = 201".
  • Update: an Update operation identifies an item by its primary key, much as a Query does, and also modifies attributes of that item. A conditional update allows you to modify an item only when certain pre-specified conditions are met. We will see an example of this later when we update the vote count of images in the Mars Rover app.
  • Scan: A Scan operation examines every item in the table. By default, a Scan returns all of the data attributes for every item.
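As a rough illustration of the difference between a Query and a Scan, the sketch below builds request parameters for each in the low-level DynamoDB API format, where `{ N: "201" }` denotes the number 201. The helper names are hypothetical, and the Number type for imageid is an assumption; with the AWS SDK for JavaScript, objects like these would be passed to the client's `query` and `scan` calls.

```javascript
// Builds parameters for a Query on the table's hash key.
// Only items whose imageid equals the given value are read.
function buildImageQuery(imageId) {
  return {
    TableName: "marsDemoImages",
    KeyConditionExpression: "imageid = :id",
    ExpressionAttributeValues: { ":id": { N: String(imageId) } }
  };
}

// Builds parameters for a Scan, which examines every item in the table.
function buildFullScan() {
  return { TableName: "marsDemoImages" };
}

console.log(JSON.stringify(buildImageQuery(201)));
console.log(JSON.stringify(buildFullScan()));
```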

Secondary Indexes

Instead of scanning the entire table, which can be inefficient sometimes, we can create secondary indexes to help the querying process. Secondary indexes on a table will help optimize querying on non-key attributes. DynamoDB supports two kinds of secondary indexes:

  • Local secondary index: an index that has the same hash key as the table, but a different range key.
  • Global secondary index: an index with a hash and range key that can be different from those on the table.

Secondary indexes can be thought of as separate tables that are grouped by the index hash key first, then by the index range key. For example, in the marsDemoImages table, we might want to look up images from a specific mission and instrument, filtered by a time range, so we could create a secondary index grouped by the "Mission+InstrumentID" attribute first (hash key), then by the "TimeStamp" attribute (range key). A sample illustration is provided below, and we will go into more detail on secondary indexes for the marsDemoImages table in the next section.

Table 4: Sample of a secondary index for the marsDemoImages table

Data Types and JSON Support

Amazon DynamoDB supports a newly expanded set of data types:

  • Scalar types: Number, String, Binary, Boolean, and Null.
  • Multi-valued types: String Set, Number Set, and Binary Set.
  • Document types: List and Map.

For example, in the marsDemoImages table, imageid is a Number type attribute and camera_model is a String type attribute.

Most noteworthy here are the newly released List and Map data types, which are ideal for storing JSON documents. The List data type is similar to a JSON array, and the Map data type is similar to a JSON object. There are no restrictions on the data types that can be stored in List or Map elements, up to 400KB per item and up to 32 levels of nested attributes. In addition, DynamoDB lets you access individual elements within lists and maps, even if those elements are deeply nested. This is an exciting feature of DynamoDB that makes developing web applications with JSON data very easy and intuitive. We will now go over how the Mars Rover demo works behind the scenes.
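To show how a JSON document lines up with these data types, here is a small sketch of a converter from plain JavaScript values to DynamoDB's typed attribute-value format. The function name `toAttributeValue` and the sample document fields are hypothetical; document-level SDKs perform this kind of conversion for you, so this is only meant to illustrate the mapping (sets and binary values are left out).

```javascript
// Converts a plain JSON-compatible value into DynamoDB's typed format:
// S (String), N (Number, sent as a string), BOOL, NULL, L (List), M (Map).
function toAttributeValue(value) {
  if (value === null) return { NULL: true };
  if (typeof value === "string") return { S: value };
  if (typeof value === "number") return { N: String(value) };
  if (typeof value === "boolean") return { BOOL: value };
  if (Array.isArray(value)) return { L: value.map(toAttributeValue) };
  if (typeof value === "object") {
    const map = {};
    for (const key of Object.keys(value)) {
      map[key] = toAttributeValue(value[key]);
    }
    return { M: map };
  }
  throw new TypeError("Unsupported value type: " + typeof value);
}

// Hypothetical image document; the field names are for illustration only.
const sampleImage = {
  imageid: 201,
  camera_model: "example",
  tags: ["a", "b"],
  location: { site: 1, drive: null }
};

const marshalled = toAttributeValue(sampleImage);
console.log(JSON.stringify(marshalled, null, 2));
```

A JSON array becomes an `L` value and a JSON object becomes an `M` value, with nesting preserved at every level.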

How It Works - Under the Hood

How does the Mars Rover demo actually work? In this section, we explain how JSON document support for DynamoDB has made building such an application easy and intuitive. We built our application using AngularJS, a popular JavaScript web application framework, but the concepts mentioned in this post should apply to any other language. If you want to preview the source code, it is openly available under the awslabs GitHub account.

To understand how the application operates, let's take a look at the overall architecture of the Mars Rover application shown below. We will go through each component step by step.

Figure 3: Architecture of the Mars Rover application

Browser retrieves app code from Amazon S3

Whenever users visit the Mars Rover demo application website, the browser fetches the application code, which contains HTML, CSS, and JavaScript, from Amazon S3. Using DynamoDB and S3, we are able to run this application entirely on the client side, obviating the need for any servers that we would have to manage ourselves.

Application authenticates user via Amazon Cognito

During this step, the application grants users access to the DynamoDB table by using Amazon Cognito, a simple user identity and data synchronization service that helps establish unauthenticated guest connections to DynamoDB. This allows any user to query only the DynamoDB tables associated with the application and update a limited set of attributes in the tables. If you want to try launching the demo yourself, the entire application can be run locally on your machine via DynamoDB Local, for development or testing purposes. Instructions on launching the app locally can be found in the Mars Rover app source code README on GitHub.

Returning to authentication in the live demo, we used Amazon Cognito to manage guest access to our DynamoDB tables and to collect statistics such as the number of visitors. A sample screenshot is shown below:

Figure 4: Sample screenshot of the Cognito statistics interface

With Amazon Cognito, you can create unique user identifiers for accessing AWS cloud services by using public login providers such as Amazon, Facebook, and Google, or by using your own user identity system. Users can also start using your app as unauthenticated guests. We use the unauthenticated guest access feature to provide AWS credentials to web browsers and uniquely identify each visitor. We deployed our application into production by following the steps listed below, and you can do the same with your own application:

  1. Create an Amazon Cognito Identity Pool for the application. This can be done on the Amazon Cognito management console, using the default options while making sure only “Enable Access to Unauthenticated Identities” is checked.
  2. Configure AWS Identity and Access Management (IAM) to give the minimum permissions required for the demo application:
    • Read from marsDemoImages table
      • Query with date-gsi and votes-gsi
      • GetItem
    • Write to marsDemoImages table
      • Update the votes field
    • Read from userVotes table
      • Query on the user's own item, but not on others'
    • Write to userVotes table
      • PutItem on the user's own item, but not on others'
  3. Modify the configuration in the app to use Amazon Cognito. The Mars Rover demo application instantiates the DynamoDB client in viewer/app/scripts/services/AWS.js and provides it with AWS credentials according to pre-specified configurations. You can edit these configurations before you launch the demo application locally or before you create a distribution package with grunt. To switch to the Cognito Identity Pool, edit the configuration in viewer/lib/mynconf.coffee.
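The permissions listed in step 2 could be expressed as an IAM policy along the following lines. This is a sketch rather than the demo's actual policy: the region and account ID in the ARNs are placeholders, and it assumes that the hash key of userVotes holds the Cognito identity ID and that imageid and votes are the only attributes a vote update touches. The policy is written as a JavaScript object so it can be serialized to JSON.

```javascript
// Placeholder region and account ID; replace with your own.
const TABLE_ARN_PREFIX = "arn:aws:dynamodb:us-east-1:123456789012:table/";

const guestPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      // Read images from the table and both global secondary indexes.
      Effect: "Allow",
      Action: ["dynamodb:GetItem", "dynamodb:Query"],
      Resource: [
        TABLE_ARN_PREFIX + "marsDemoImages",
        TABLE_ARN_PREFIX + "marsDemoImages/index/date-gsi",
        TABLE_ARN_PREFIX + "marsDemoImages/index/votes-gsi"
      ]
    },
    {
      // Allow updates that touch only the key and the votes attribute
      // (assumed attribute list).
      Effect: "Allow",
      Action: ["dynamodb:UpdateItem"],
      Resource: [TABLE_ARN_PREFIX + "marsDemoImages"],
      Condition: {
        "ForAllValues:StringEquals": {
          "dynamodb:Attributes": ["imageid", "votes"]
        },
        "StringEqualsIfExists": {
          "dynamodb:ReturnValues": ["NONE", "UPDATED_OLD", "UPDATED_NEW"]
        }
      }
    },
    {
      // Restrict userVotes access to items whose hash key equals the
      // caller's Cognito identity ID (assumed key design).
      Effect: "Allow",
      Action: ["dynamodb:Query", "dynamodb:PutItem"],
      Resource: [TABLE_ARN_PREFIX + "userVotes"],
      Condition: {
        "ForAllValues:StringEquals": {
          "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
        }
      }
    }
  ]
};

console.log(JSON.stringify(guestPolicy, null, 2));
```

The `${cognito-identity.amazonaws.com:sub}` string is an IAM policy variable, so it is kept in an ordinary quoted string rather than a JavaScript template literal.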

Queries and updates to DynamoDB

Users can make custom image selections based on dates, votes or their favourites. All selections trigger queries to the DynamoDB tables and indexes. To understand how this process works, we need to deep dive into several aspects of DynamoDB:

  1. table schema and global secondary index (GSI) setup,
  2. query execution, and
  3. update execution

Table schema and GSI setup

Let's start by creating a DynamoDB table! You can do this via the AWS management console or one of the AWS SDKs. In our case, we used JavaScript, written in CoffeeScript, available under /viewer/lib/prepare_tables.coffee. The most important portion describes the schema and GSI setup for the DynamoDB table used to store the image data. The table representation is shown below:

Table 5: Table schema for marsDemoImages

We decided to combine the data fields "Mission" and "InstrumentID" to allow querying on multiple attributes at once. Since each view in the application is always specific to one instrument of one particular mission, it makes sense to concatenate "Mission" and "InstrumentID", use this combined attribute as the GSI hash key, then allow a third attribute to be the GSI range key. For instance, users can view all images from the "Front Hazcam" instrument of the Mars rover from the Curiosity mission, filtered by date. GSIs facilitate this type of querying. The GSIs for the table are shown below:

Table 6: Global secondary index schema for marsDemoImages
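A table and GSI layout of this kind could be declared with CreateTable parameters like the ones sketched below. The index names date-gsi and votes-gsi come from the IAM permissions listed earlier; the attribute types, the throughput values, and the choice to project all attributes are assumptions for illustration, and the function name is hypothetical.

```javascript
// Builds CreateTable parameters for marsDemoImages with two GSIs.
// Types, throughput, and projection settings are assumed values.
function buildImagesTableParams() {
  const throughput = { ReadCapacityUnits: 5, WriteCapacityUnits: 5 };

  // Both indexes share the composite hash key and differ in range key.
  const gsi = (indexName, rangeAttribute) => ({
    IndexName: indexName,
    KeySchema: [
      { AttributeName: "Mission+InstrumentID", KeyType: "HASH" },
      { AttributeName: rangeAttribute, KeyType: "RANGE" }
    ],
    Projection: { ProjectionType: "ALL" },
    ProvisionedThroughput: throughput
  });

  return {
    TableName: "marsDemoImages",
    // Every attribute used in any key schema must be declared here.
    AttributeDefinitions: [
      { AttributeName: "imageid", AttributeType: "N" },
      { AttributeName: "Mission+InstrumentID", AttributeType: "S" },
      { AttributeName: "TimeStamp", AttributeType: "N" },
      { AttributeName: "votes", AttributeType: "N" }
    ],
    KeySchema: [{ AttributeName: "imageid", KeyType: "HASH" }],
    GlobalSecondaryIndexes: [gsi("date-gsi", "TimeStamp"), gsi("votes-gsi", "votes")],
    ProvisionedThroughput: throughput
  };
}

console.log(JSON.stringify(buildImagesTableParams(), null, 2));
```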

The date GSI is created to allow users to filter images by photo creation date, based on a specific instrument and mission. A GSI groups items together by its index hash and range keys, which means the date GSI contains image data grouped by "Mission+Instrument" first, then by "Timestamp". This allows the application to quickly find images from a specific date, such as pictures taken on 10/04/2014 from the "Curiosity+Front Hazcam" mission-instrument combination.

Similarly, the vote GSI is created to enable the "Top Voted" view of the Mars Rover demo app. In this case, the index hash key is still "Mission+Instrument", while the range key is "votes". Remember this index will group items by "Mission+Instrument" first, followed by "votes", meaning it optimizes querying of images of a specific mission and instrument combination, sorted by their vote count.

Next, we need a separate table to keep track of which users have voted for which photos, preventing a user from voting for the same photo multiple times. This table has a simple schema and no GSIs:

Table 7: Table schema for userVotes

Finally, the createTable method is invoked to create all tables and secondary indexes in DynamoDB. This is completed as a part of the /viewer/lib/prepare_tables.coffee script, which is executed automatically when you follow the instructions in the source code README.

Query Execution

The Mars Rover application operates on the popular web development framework AngularJS. Essentially, each view of the web application is generated by its respective controller: the timeline view has a timeline controller, the favorites view has a favorites controller, and so on. These controllers all use a common Amazon DynamoDB service to communicate with the DynamoDB table. This MarsPhotoDBAccess service can be found in viewer/app/scripts/services and contains all query and update operations used in the application. In particular, the queryWithDateIndex function uses the document-level JavaScript SDK to make accessing items simple and intuitive:
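As a hedged sketch of what a date-index query can look like (not the demo's original code), the function below builds parameters with plain JavaScript values, in the shape accepted by document-style clients such as `AWS.DynamoDB.DocumentClient` in the AWS SDK for JavaScript. It assumes that "TimeStamp" is stored as a number of seconds since the epoch and that the date 10/04/2014 means October 4, 2014; the function name is hypothetical.

```javascript
// Builds a Query against the date GSI for one mission-instrument pair,
// restricted to a time range and returned newest first.
function buildDateIndexQuery(missionInstrument, startTime, endTime) {
  return {
    TableName: "marsDemoImages",
    IndexName: "date-gsi",
    // Name placeholders avoid problems with the "+" character and
    // with reserved words in expressions.
    KeyConditionExpression: "#mi = :mi AND #ts BETWEEN :start AND :end",
    ExpressionAttributeNames: {
      "#mi": "Mission+InstrumentID",
      "#ts": "TimeStamp"
    },
    ExpressionAttributeValues: {
      ":mi": missionInstrument,
      ":start": startTime,
      ":end": endTime
    },
    // false = descending range key order (reverse chronological).
    ScanIndexForward: false
  };
}

// One full UTC day, expressed in epoch seconds (assumed storage format).
const dayStart = Date.UTC(2014, 9, 4) / 1000;
const dayEnd = Date.UTC(2014, 9, 5) / 1000 - 1;
console.log(JSON.stringify(buildDateIndexQuery("Curiosity+Front Hazcam", dayStart, dayEnd)));
```

Because BETWEEN is inclusive on both ends, the end of the range is set to the last second of the day rather than the first second of the next one.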

Similarly, the vote GSI can be queried to allow users to view images sorted by number of votes received in descending order:
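A query on the vote index could be sketched in the same way. This is an illustration under assumptions rather than the demo's code: the function name and the `limit` parameter are hypothetical, and the values are plain JavaScript values as accepted by document-style clients.

```javascript
// Builds a Query against the votes GSI for one mission-instrument pair,
// returning the most-voted images first.
function buildTopVotedQuery(missionInstrument, limit) {
  return {
    TableName: "marsDemoImages",
    IndexName: "votes-gsi",
    KeyConditionExpression: "#mi = :mi",
    ExpressionAttributeNames: { "#mi": "Mission+InstrumentID" },
    ExpressionAttributeValues: { ":mi": missionInstrument },
    // The index range key is "votes", so descending order means
    // highest vote count first.
    ScanIndexForward: false,
    // Maximum number of items to evaluate for this page of results.
    Limit: limit
  };
}

console.log(JSON.stringify(buildTopVotedQuery("Curiosity+Front Hazcam", 20)));
```

No sorting happens in the browser: the ordering comes from the index range key, which is why the index is keyed on "votes".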

Update Execution

Voting for a photo works similarly to querying, except that we need to update existing items in the table instead. Before that, however, we need to check whether the user has already voted for the same photo. This can be done by performing a conditional write to the second DynamoDB table, userVotes, which we created to keep track of which users have voted for which photos. The condition can be set using the Expected parameter, as shown below.

In the above code snippet, we specify our expectation that the item with the given imageid and userid should not exist, since this should be the first time a user votes for this photo. Next, we try to put the item into the userVotes table, with the condition in place.
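A conditional write of this kind can be sketched as follows. The sketch assumes that userVotes uses "userid" as its hash key and "imageid" as its range key, and it shows both the Expected form and the equivalent ConditionExpression form; the function names are hypothetical, and the values are plain JavaScript values as accepted by document-style clients.

```javascript
// Conditional put using the Expected parameter: the write succeeds
// only if no item with this userid and imageid exists yet.
function buildVotePutWithExpected(userId, imageId) {
  return {
    TableName: "userVotes",
    Item: { userid: userId, imageid: imageId },
    Expected: { userid: { Exists: false } }
  };
}

// The same condition written as a condition expression.
function buildVotePutWithExpression(userId, imageId) {
  return {
    TableName: "userVotes",
    Item: { userid: userId, imageid: imageId },
    ConditionExpression: "attribute_not_exists(userid)"
  };
}

// If the item already exists, DynamoDB rejects the write with a
// ConditionalCheckFailedException, which the caller can treat as
// "this user has already voted for this photo".
console.log(JSON.stringify(buildVotePutWithExpected("guest-1", 201)));
console.log(JSON.stringify(buildVotePutWithExpression("guest-1", 201)));
```

The condition is evaluated against the item that has the same primary key as the one being written, so checking a single key attribute is enough to detect a duplicate vote.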

Once the checking process completes, we can update the total vote count in "marsDemoImages" table using the JSON document SDK, where individual JSON fields can be updated in a simple and intuitive way. Let's take a look at how the incrementVotesCount function works:

Note that the "UpdateExpression" and "ExpressionAttributeValues" parameters are introduced in the JSON document SDK, which provides much more support for JSON data access. For full details, please refer to the repo on the awslabs GitHub account and the documentation on modifying item attributes.
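To illustrate how these two parameters fit together, here is a minimal sketch of update parameters that increment a vote counter. It is not the demo's incrementVotesCount implementation; the function name is hypothetical, and the values are plain JavaScript values as accepted by document-style clients.

```javascript
// Builds UpdateItem parameters that add 1 to the "votes" attribute
// of a single image, identified by its hash key.
function buildIncrementVotesParams(imageId) {
  return {
    TableName: "marsDemoImages",
    Key: { imageid: imageId },
    // ADD increments a number in place; if "votes" does not exist yet,
    // it is treated as 0 before the increment.
    UpdateExpression: "ADD votes :one",
    ExpressionAttributeValues: { ":one": 1 },
    // Ask DynamoDB to return the new value of the updated attribute.
    ReturnValues: "UPDATED_NEW"
  };
}

console.log(JSON.stringify(buildIncrementVotesParams(201)));
```

Because the increment is applied by DynamoDB itself, two users voting at the same time do not overwrite each other's updates, as they could with a read-modify-write cycle in the browser.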

Retrieve thumbnail images from Amazon DynamoDB

Upon querying the DynamoDB table, JSON results are received by the browser, at which point the thumbnail images can be retrieved from Amazon S3. While this is the current implementation on the live demo website, we have also enabled the capability of storing all thumbnail binary data in DynamoDB, under the "data" attribute of each item. DynamoDB can store binary data without the need to specify a type, constraint or schema, as long as it is not a hash or range key attribute and meets the 400KB item size limit. We chose to load images from S3 rather than directly from DynamoDB in the public live demo to conserve read throughput costs. However, the DynamoDB Free Tier gives you 25GB of storage space, at up to 25 capacity units for both reads and writes. It's a great way to get some hands-on experience with your own web application and DynamoDB.

Conclusions

This article shows how to use Amazon DynamoDB to create a Mars Rover application. You can use the same concepts described in this post to build your own web application. Let's recap the process:

  1. Design your DynamoDB table, including the schema, the primary hash key (and optionally a range key), and secondary indexes.
  2. Create the table and indexes via the AWS management console or using one of our AWS SDKs. We used the JavaScript SDK in our demo.
  3. Choose the language and web development framework you want to use. We chose JavaScript and the AngularJS framework.
  4. Code up your application by writing functions that query or update your DynamoDB table. This is made especially easy if you use our document-level SDK when working with JSON.
  5. Launch your application!

About the Authors

Daniela Miao is currently a Software Development Engineer at Amazon Web Services, working on the DynamoDB developer ecosystem team. The team is dedicated to improving the customer experience of using DynamoDB by writing libraries and tools that make it easier to build applications on DynamoDB. She hopes to help lower the barrier to using DynamoDB through developer education via walkthrough examples, sample applications, blog posts and so on. If you have questions or suggestions, or are simply seeking more information on DynamoDB, please reach out to dynamodb-feedback@amazon.com.

Kenta Yasukawa is a Senior Solutions Architect for Amazon Web Services (AWS). He has mainly focused on designing cloud-based solutions for gaming customers, mobile application backends, social network services and so on. He is passionate about designing scalable and reliable architectures that fully leverage the capabilities of the AWS cloud. Amazon DynamoDB is a key component in his architecture designs, and he has seen many of his customers successfully build highly scalable and reliable architectures with Amazon DynamoDB. If you would like to hear such success stories, please feel free to reach out.


Great write-up! by Mathias Leppich

Without having tested it, I guess there is a security issue, as all users are granted the privilege to update the "votes" field in the "marsDemoImages" table.

I don't see any way to prevent users from updating the field to whatever they want, affecting the ranking in the "votes-gsi".

However, I can envision a fix for that issue by only granting the users access to their records in the "userVotes" table and using a service like AWS Lambda to increment the vote count in the "marsDemoImages" table in response to the DynamoDB changes events in "userVotes"... effectively using AWS Lambda as something known in the SQL World as Triggers and StoredProcedures.
