AWS Step Functions Gains Callback Patterns to Resume Paused Workflows


Amazon Web Services (AWS) recently announced that AWS Step Functions now supports callback patterns to "automate workflows for applications with human activities and custom integrations with third-party services". Workflow executions can be paused until an application returns a token via the Step Functions API, which removes the need for the polling and other workarounds that were previously required.

AWS Step Functions enables application developers to "coordinate multiple AWS services into serverless workflows" (previous coverage). Step Functions workflows are defined in the JSON-based Amazon States Language, which makes it possible to maintain state machines as code while also visualizing them as state machine diagrams so that they are "easy to understand, easy to explain to others, and easy to change". Workflows are composed of states and tasks. Tasks could initially use AWS Lambda functions and (potentially long-running) activities for the actual work, and since late 2018 they also support several other service integrations such as DynamoDB, ECS, SNS, and SQS (previous coverage).

To pause a workflow execution until an external activity had completed, Step Functions previously required activity workers that poll the Step Functions API. This approach requires upfront registration of the activity and extra care to avoid latency when polling for activity tasks. Implementing serverless approval steps additionally meant working around the polling architecture with a scheduled Lambda function.
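The polling approach described above can be sketched roughly as follows. This is a minimal illustration rather than AWS sample code: the function name `poll_activity_once` and the `do_work` callback are hypothetical, and `sfn_client` is assumed to be a boto3 Step Functions client (or any object that exposes the same `get_activity_task`, `send_task_success`, and `send_task_failure` methods).

```python
import json


def poll_activity_once(sfn_client, activity_arn, worker_name, do_work):
    """Poll once for an activity task and report its outcome.

    Returns True if a task was processed, False if no task was available.
    """
    # get_activity_task is a long poll; when no task is scheduled in time,
    # the response carries no usable task token.
    task = sfn_client.get_activity_task(
        activityArn=activity_arn, workerName=worker_name
    )
    token = task.get("taskToken")
    if not token:
        return False

    try:
        # do_work is a hypothetical callback that performs the actual work.
        result = do_work(json.loads(task["input"]))
    except Exception as exc:
        sfn_client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
    else:
        sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    return True
```

A worker would typically call this function in a loop with `boto3.client("stepfunctions")`, which is what keeps a process running (or a scheduled function firing) even while no task is pending.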

AWS has now added a dedicated "wait for a callback with a task token" service integration pattern. It complements the default "request response" pattern and the "wait for a job to complete" pattern, and it enables purely event-driven serverless workflows with manual steps and third-party integrations. The provided example workflow covers a scenario that needs to "integrate with an external microservice to perform a credit check as a part of an approval workflow":

Image: Wait for callback with task token example workflow (via Step Functions documentation)

A complete task definition using the Amazon SQS service integration might look as follows:

"Send message to SQS":{  
      "Message":"Hello from Step Functions!",

The .waitForTaskToken suffix on the task's resource selects the callback pattern, and the $$.Task.Token reference injects the value of the task token from the context object during workflow execution. A consumer of the resulting SQS message can then resume the paused workflow and report the outcome of an external process via the SendTaskSuccess or SendTaskFailure API actions. Because "a task that is waiting for a task token will wait until the execution reaches the one year service limit", AWS recommends configuring a heartbeat timeout. The heartbeat timeout clock can then be reset via the SendTaskHeartbeat API action to report that "the task represented by the specified taskToken is still making progress".
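To make the moving parts concrete, the following Python sketch builds a task state of the kind described here and shows how a consumer of the SQS message might resume the workflow. It is an illustration under stated assumptions, not the article's original listing: the helper names `build_callback_task` and `resume_workflow`, the `check` callback, the queue URL passed in, and the 300-second heartbeat default are all hypothetical, and `sfn_client` is assumed to be a boto3 Step Functions client.

```python
import json


def build_callback_task(queue_url, next_state, heartbeat_seconds=300):
    """Return an Amazon States Language task state that waits for a callback."""
    return {
        "Type": "Task",
        # The .waitForTaskToken suffix selects the callback pattern.
        "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
        # Fail the task if neither a heartbeat nor a result arrives in time.
        "HeartbeatSeconds": heartbeat_seconds,
        "Parameters": {
            "QueueUrl": queue_url,
            "MessageBody": {
                "Message": "Hello from Step Functions!",
                # "$$." reads from the context object; the ".$" key suffix
                # marks the value as a path instead of a literal string.
                "TaskToken.$": "$$.Task.Token",
            },
        },
        "Next": next_state,
    }


def resume_workflow(sfn_client, message_body, check):
    """Consume one SQS message body and resume the paused execution.

    check is a hypothetical callback that runs the external process and
    either returns a JSON-serializable result or raises an exception.
    """
    payload = json.loads(message_body)
    token = payload["TaskToken"]
    try:
        result = check(payload)
    except Exception as exc:
        sfn_client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
        return False
    sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    return True
```

A long-running consumer could additionally call `sfn_client.send_task_heartbeat(taskToken=token)` at intervals shorter than the configured heartbeat timeout to signal that work is still in progress.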

AWS Serverless Hero Ben Kehoe has just covered a common use case in a blog post on using callback URLs for approval emails with AWS Step Functions. His sfn-callback-urls solution is implemented as a serverless application and published to the Serverless Application Repository (previous coverage). It makes it possible to "generate one-time-use callback URLs" and is illustrated with a workflow that "sends an email containing approve/reject links, and later a confirmation email". Kehoe emphasizes that it is easy to "expand this state machine for your use cases", and that this workflow pattern can also be "used as part of a larger state machine".

Netflix has built the conceptually similar open-source workflow orchestration engine Conductor, which is based on a JSON DSL and features a wait task that can be updated via HTTP to implement callbacks from external triggers. Microsoft Azure provides an HTTP webhook action for pausing and resuming an app in its Logic Apps workflow orchestration service.

In related news, AWS Step Functions has recently added support for nested workflows and access to workflow metadata. It now also emits workflow execution events via CloudWatch Events and Amazon EventBridge (previous coverage), which eases building auxiliary serverless workflows that react to state machine events like start, success, or failure. Earlier this year, AWS Step Functions Local was made available as an executable JAR package and Docker image to enable implementing and testing state machines in development and build environments.
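As a rough illustration of reacting to execution events, the sketch below builds an event pattern that matches status changes of a single state machine. The helper name `execution_status_rule_pattern` is hypothetical, and the field names reflect the event format as commonly documented for Step Functions execution status changes; they should be verified against the current AWS documentation before use.

```python
import json


def execution_status_rule_pattern(state_machine_arn, statuses):
    """Return a JSON event pattern matching execution status changes.

    statuses is an iterable of status names such as "SUCCEEDED" or "FAILED".
    """
    pattern = {
        "source": ["aws.states"],
        "detail-type": ["Step Functions Execution Status Change"],
        "detail": {
            "status": list(statuses),
            "stateMachineArn": [state_machine_arn],
        },
    }
    # Event rules take the pattern as a JSON string, not as a dict.
    return json.dumps(pattern)
```

The resulting string could be passed as the `EventPattern` argument when creating a rule, for example with the boto3 `events` client's `put_rule` call, with a Lambda function or another state machine as the rule's target.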

The AWS Step Functions documentation features a developer guide, including a section on service integration patterns, the Amazon States Language, and the API reference. The AWS CLI Step Functions commands and the Amazon States Language specification are documented separately. Support is provided via the AWS Step Functions forum. Callback patterns are available at no additional charge beyond the regular usage-based AWS Step Functions pricing.
