Teaching the Computer to Play the Chrome Dinosaur Game with TensorFlow.js Machine Learning Library

Fritz’s HeartBeat Medium publication recently published an article by Aayush Arora showing how Google’s TensorFlow.js machine learning library can be used in the browser to teach the computer to play the Chrome Dinosaur Game.

The Chrome Dinosaur Game (sometimes called the T-Rex Game) appeared five years ago in the Chrome browser, where it is shown whenever a user tries to visit a website while disconnected from the Internet. It is a simple infinite runner in which the player jumps over cacti and ducks under obstacles. Controls are basic: pressing the space bar makes the dinosaur jump, and pressing the down arrow makes it duck. The goal is to survive for as long as possible, and a timer measures how long the player manages to avoid obstacles.


The features chosen, given the nature of the game, are the speed of the game, the width of the oncoming obstacle, and its distance from the player’s T-Rex. The computer learns to map those three variables to two outputs, from which one of two decisions is taken: jump or don’t jump (the original game also lets the dinosaur crouch, which is not modeled here). The computer learns by trial and error, gathering training data every time it fails at the game, then restarting the game with the accumulated experience.
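As a rough illustration of the three-feature input described above, the sketch below packs a game state into a fixed-order vector. The state field names, the function name `stateToFeatures`, the scaling constants, and the scaling to the 0 to 1 range itself are all assumptions made for this example; the article does not specify them.

```javascript
// Hypothetical scaling constants; the article does not give these values.
const MAX_SPEED = 100;
const MAX_WIDTH = 100;
const MAX_DISTANCE = 600;

// Hypothetical state shape: { speed, obstacleWidth, obstacleDistance }.
// Returns [speed, width, distance], each scaled and clamped to [0, 1].
function stateToFeatures(state) {
  const clamp01 = (x) => Math.min(1, Math.max(0, x));
  return [
    clamp01(state.speed / MAX_SPEED),
    clamp01(state.obstacleWidth / MAX_WIDTH),
    clamp01(state.obstacleDistance / MAX_DISTANCE),
  ];
}

const features = stateToFeatures({ speed: 6, obstacleWidth: 50, obstacleDistance: 300 });
console.log(features);
```

Keeping the order of the three values fixed matters more than the exact scaling, since the network only sees positions in the vector, not names.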

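TensorFlow.js is used as the machine learning library. TensorFlow’s tutorials identify five steps for a machine learning implementation: loading the data, defining the model, preparing the data for training (converting it to tensors, with shuffling and normalization), training the model, and making predictions.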

In this particular example, we start with no training data, so the first step is essentially empty. For the second step, Arora uses a neural network, based on a sequential model, with two dense layers that both use a sigmoid activation function. The first layer takes as input the three predictive variables previously mentioned: game speed, width of the oncoming obstacle, and its distance from the T-Rex. It has six units, whose outputs serve as input for the second and final layer. The final layer has two outputs, whose values score, respectively, not jumping and jumping:

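import * as tf from '@tensorflow/tfjs';

// `dino` is the game's dinosaur object, assumed to be defined elsewhere.
dino.model = tf.sequential();

// First layer: 3 inputs (speed, obstacle width, obstacle distance) -> 6 units.
dino.model.add(tf.layers.dense({
  inputShape: [3],
  activation: 'sigmoid',
  units: 6
}));

// Final layer: 6 inputs -> 2 outputs (don't jump, jump).
dino.model.add(tf.layers.dense({
  inputShape: [6],
  activation: 'sigmoid',
  units: 2
}));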
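To make concrete what the two dense layers compute, here is a dependency-free sketch of the same 3 → 6 → 2 shape with sigmoid activations. It is an illustration only, not TensorFlow.js code: the helper names (`dense`, `forward`, `zeros`) are invented for this example, and the all-zero parameters are placeholders, whereas a real model starts from random weights and learns them.

```javascript
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// Dense layer: output[j] = sigmoid(sum over i of input[i] * weights[i][j] + bias[j]).
function dense(input, weights, bias) {
  return bias.map((b, j) =>
    sigmoid(input.reduce((sum, x, i) => sum + x * weights[i][j], b))
  );
}

// Placeholder parameters (all zeros), only to show the shapes: 3x6, then 6x2.
const zeros = (rows, cols) =>
  Array.from({ length: rows }, () => new Array(cols).fill(0));
const hidden = { weights: zeros(3, 6), bias: new Array(6).fill(0) };
const output = { weights: zeros(6, 2), bias: new Array(2).fill(0) };

function forward(features) {
  const h = dense(features, hidden.weights, hidden.bias); // 6 values
  return dense(h, output.weights, output.bias);           // 2 values
}

console.log(forward([0.06, 0.5, 0.5])); // [0.5, 0.5] with all-zero parameters
```

Because each output goes through its own sigmoid, the two values each lie between 0 and 1 but need not sum to 1; the decision only depends on which of the two is larger.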

The third step involves converting input data into tensors which TensorFlow.js can handle:

dino.model.fit(
  tf.tensor2d(dino.training.inputs), 
  tf.tensor2d(dino.training.labels)
);

No shuffling is implemented in the third step, because inputs are added incrementally to an initially empty training set each time the computer fails at the game. Normalization amounts here to keeping the output values in the training set between 0 and 1. When the T-Rex fails to avoid an obstacle, the corresponding input triple (game speed, width of the oncoming obstacle, and distance from the T-Rex) is mapped to either [1, 0] or [0, 1], which encodes the desired outputs of the second layer. If the T-Rex was jumping and still hit the obstacle, the appropriate decision was to not jump: [1, 0]. Conversely, if the T-Rex was not jumping and hit an obstacle, the appropriate decision was to jump: [0, 1].
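The labeling rule just described can be sketched in a few lines. The function name `recordCrash` and the `training` object below are assumptions for illustration, loosely mirroring the `dino.training.inputs` and `dino.training.labels` arrays passed to `fit` earlier; the feature values are made up.

```javascript
const training = { inputs: [], labels: [] };

// On a crash, store the input triple with the opposite of the action taken.
// Label order follows the article: [1, 0] = do not jump, [0, 1] = jump.
function recordCrash(trainingSet, features, wasJumping) {
  trainingSet.inputs.push(features);
  trainingSet.labels.push(wasJumping ? [1, 0] : [0, 1]);
}

recordCrash(training, [0.06, 0.5, 0.2], true);   // jumped and crashed
recordCrash(training, [0.07, 0.3, 0.05], false); // ran and crashed
console.log(training.labels); // [[1, 0], [0, 1]]
```

Each failure therefore adds exactly one example, which is why the training set starts empty and grows one game at a time.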

As a fourth step, once training data is available, the model is trained using a meanSquaredError loss function and the Adam optimizer with a learning rate of 0.1 (Adam generally works well in practice with little tuning). Both are set when the model is compiled:

dino.model.compile({
  loss:'meanSquaredError',
  optimizer: tf.train.adam(0.1)
})
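For intuition about the loss being minimized, the following sketch computes the mean squared error for a single example by hand. The function is written for this illustration and is not the TensorFlow.js implementation; the prediction values are made up.

```javascript
// Mean of the squared differences between predicted and expected outputs.
function meanSquaredError(predicted, expected) {
  const sumSq = predicted.reduce(
    (sum, p, i) => sum + (p - expected[i]) ** 2,
    0
  );
  return sumSq / predicted.length;
}

// An undecided prediction against the label "jump" ([0, 1]).
console.log(meanSquaredError([0.5, 0.5], [0, 1])); // 0.25
// A perfect prediction has zero loss.
console.log(meanSquaredError([0, 1], [0, 1]));     // 0
```

Training nudges the weights so that this value shrinks across the recorded crashes, pushing the two outputs toward the stored [1, 0] or [0, 1] labels.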

The fifth step occurs during each run of the game: as the game proceeds and new values of the input triple are computed, predictions are run and jump/no-jump decisions are taken whenever it makes sense to take them (i.e., when the T-Rex is not already jumping):

// Assumes this snippet runs inside a Promise executor that provides `resolve`.
if (!dino.jumping) {
  // Whenever the dino is not jumping, decide whether it needs to jump.
  let action = 0; // 1 for jump, 0 for no jump

  // Convert the state vector to a tensor2d and call model.predict on it.
  const prediction = dino.model.predict(tf.tensor2d([convertStateToVector(state)]));

  // predict returns a tensor; data() returns a promise that resolves to its values.
  const predictionPromise = prediction.data();

  predictionPromise.then((result) => {
    // Convert the prediction to an action.
    if (result[1] > result[0]) {
      // We want to jump.
      action = 1;
      // Set the last jumping state to the current state.
      dino.lastJumpingState = state;
    } else {
      // Set the last running state to the current state.
      dino.lastRunningState = state;
    }

    resolve(action);
  });
}

Fritz is a machine learning platform for iOS and Android developers. TensorFlow.js is open source software available under the Apache 2.0 license. Contributions and feedback are encouraged via TensorFlow’s GitHub project and should follow TensorFlow’s contribution guidelines.
