Zato - Python-based ESB and Backend Application Server

Overview

Zato is an open-source ESB and application server written in Python. It is designed for integrating systems in a service-oriented architecture (SOA) and for building backend, API-only applications.

You can find the project’s documentation here and its GitHub page here.

Zato is meant for anyone who already has Python, or another dynamic language such as Ruby or PHP, in their toolset. It is also aimed at technical teams that have seen dynamic languages used elsewhere and would like to try one out in a non-frontend system.

The platform is lightweight, yet complete. Out of the box support includes HTTP, JSON, SOAP, SQL, AMQP, JMS WebSphere MQ, ZeroMQ, Redis NoSQL, FTP, browser-based GUI, CLI, API, security, statistics, job scheduling, load-balancing, hot-deployment and an extensive set of documentation, both guides and reference-style, to cover it all from the perspective of an architect, programmer or sysadmin.

The initial version was released on the 18th of May, 2013 and the latest release, 1.1, was published in early June.

Architecture

A Zato environment is a set of one or more clusters. Each cluster is composed of servers sharing a single SQL and Redis database. A cluster-specific HA HTTP load-balancer is in front of the servers.

All servers are always active and always run the same set of services. To achieve an active-standby set-up, the load-balancer can be used to take any server offline if necessary.

The load-balancer is an embedded HAProxy instance that admins control remotely from the command line or via the GUI, using XML-RPC over SSL underneath, with or without client certificates. It’s possible to assign weights to servers and use any other feature HAProxy itself offers, such as connection ACLs or rate limiting.

Servers are built on top of the gunicorn and gevent projects, a combination that uses libevent to select the best asynchronous event notification mechanism available on each supported platform, such as epoll on Linux.

To take advantage of as many CPUs as a single box can offer, Zato pre-forks a configured number of worker processes. Each worker uses the selected asynchronous networking library to handle connections, and all of them listen on the same socket. The load-balancer can be used to spread the load across multiple boxes and to provide HA.

One of the servers in a cluster assumes the role of starting the scheduler and the connectors to AMQP/JMS WebSphere MQ/ZeroMQ resources. A keep-alive ping mechanism ensures that another server takes over the role should that one special server unexpectedly go down.
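The takeover idea can be pictured with a small, self-contained Python 3 sketch. It is not Zato's implementation: the `LeaderLease` class, the timeout value and the in-memory object standing in for storage shared by all servers are assumptions made purely for illustration.

```python
import time


class LeaderLease:
    """In-memory stand-in for a shared keep-alive record (hypothetical)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout    # seconds without a ping before takeover is allowed
        self.clock = clock        # injectable so the sketch can be driven by a fake clock
        self.holder = None        # name of the server currently holding the role
        self.last_ping = None     # time of the holder's most recent ping

    def ping(self, server):
        """Called periodically by every server; True if `server` holds the role."""
        now = self.clock()
        expired = self.last_ping is None or now - self.last_ping > self.timeout
        if self.holder == server or expired:
            # Either refresh our own lease or take over an expired one.
            self.holder = server
            self.last_ping = now
        return self.holder == server


# Drive the sketch with a fake clock instead of sleeping.
fake_now = [0.0]
lease = LeaderLease(timeout=5, clock=lambda: fake_now[0])
print(lease.ping('server1'))   # server1 claims the role
fake_now[0] = 3.0
print(lease.ping('server2'))   # server1's ping is still fresh
fake_now[0] = 9.0
print(lease.ping('server2'))   # server1 went silent, so server2 takes over
```

Whichever server holds the lease would be the one to start the scheduler and connectors; all the others keep pinging and only take over once the holder's last ping is older than the timeout.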

Applications can be integrated using HTTP (with special support for JSON/SOAP and plain XML), FTP, AMQP, JMS WebSphere MQ (for seamless interoperability with existing MQ Java apps), Redis and SQL. HTTP is the only means through which Zato services can be invoked synchronously, with the requesting application waiting for a response in a blocking manner.

Programmers can use any Python library. If Zato doesn’t offer a feature itself yet, it’s still possible to use other technologies, for instance XML-RPC or SMTP, simply by importing one of Python’s built-in packages.

A browser-based GUI and CLI are used for cluster management. While the former has mostly to do with the management of already running clusters, the CLI is used for installing Zato components, such as servers, in an operating system.

Redis and an SQL operational database are used to store a cluster’s configuration. Redis is used for rapidly changing and frequently updated data such as statistics or user’s run-time information while an SQL ODB stores data that is easily mapped into relational structures.

Although the GUI is the main configuration tool, it’s also possible to export/import a cluster’s configuration to/from JSON and store it in an external config repository where it can be versioned, tagged or diffed.
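To show why a JSON export lends itself to versioning and diffing, here is a minimal Python 3 sketch using only the standard library. The config keys below are invented for illustration and do not reflect the schema of Zato's actual export format.

```python
import difflib
import json


def export_config(config):
    """Serialize a config dict to stable, diff-friendly JSON text."""
    return json.dumps(config, indent=2, sort_keys=True) + '\n'


def diff_configs(old_text, new_text):
    """Return a unified diff between two exported configs as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile='before.json', tofile='after.json', lineterm=''))


# Hypothetical config; the keys are illustrative only.
before = {'outgoing_http': [{'name': 'Google Finance', 'timeout': 10}]}
after = {'outgoing_http': [{'name': 'Google Finance', 'timeout': 30}]}

changes = diff_configs(export_config(before), export_config(after))
print('\n'.join(changes))
```

Sorting the keys and using a fixed indent keeps the exported text stable between runs, so a diff shows only real configuration changes.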

A built-in scheduler with a GUI can be used for one-off or recurring jobs (Cron syntax can also be used).

Servers and services never share any state except through Redis or SQL ODB. There are no custom protocols or data formats to keep servers in a consistent state.

Zato uses 160+ of its own admin services to manage itself and each of these is available through a public API over HTTP with JSON/SOAP or from the command line. The GUI and CLI tools are themselves all clients of these services.

For Python applications, a convenience client is available so that they can work purely with Python objects when communicating with services exposed by Zato.

Services

A Zato service is a Python class that implements a single specific method. Such a service can take input and produce output, both steps being optional.

Services can be installed either statically or hot-deployed from the GUI or the command line. Compilation to bytecode is performed automatically, on the fly.

Any data format can be used, but Zato has additional support for JSON, SOAP and regular XML. With any of these, (de-)serialization is done behind the scenes and developers work with plain Python objects using dot notation. It’s not necessary to create beans/models/stubs/classes out of a schema such as XSD, although this also means there will be no code completion.
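Dot-notation access without generated stubs can be approximated in a few lines of standard-library Python 3. The `DotDict` class and `loads_dotted` helper below are hypothetical names for this sketch; Zato itself relies on existing libraries for this rather than on this code.

```python
import json


class DotDict(dict):
    """A dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


def loads_dotted(text):
    """Parse JSON so that every JSON object becomes a DotDict."""
    return json.loads(text, object_hook=DotDict)


request = loads_dotted('{"customer": {"payment": {"date": "2013-06-20"}}}')
print(request.customer.payment.date)
```

Because attributes are resolved at run time from whatever the message contains, no schema is needed up front, which is also why an editor cannot offer code completion for them.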

The same service can be exposed over HTTP, AMQP, JMS WebSphere MQ, ZeroMQ or the scheduler without any changes to the code and without restarting servers, with the exception that only HTTP can be used for synchronous invocations.

Services can optionally take advantage of SimpleIO (SIO), a declarative syntax for expressing simple requests and responses, in order to expose a service through either JSON or XML/SOAP with no code changes. SIO cannot be used with complex documents: it won’t accept arbitrarily nested structures. Documents of any structure can be used with Zato, just not always with SIO.

Here’s what a basic service to fetch a company’s market capitalization value using Yahoo YQL/JSON and Google’s XML API could look like.
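The idea behind a declarative, flat request/response description can be sketched in standalone Python 3. This is an illustration of the concept only, not Zato's SimpleIO implementation: `SimpleIOSketch`, `check_input`, `to_json` and `to_xml` are hypothetical names, and the XML layout is an assumption.

```python
import json
import xml.etree.ElementTree as ET


class SimpleIOSketch:
    """Declares flat inputs and outputs, in the spirit of SimpleIO."""
    response_elem = 'market_cap'
    input_required = ('symbol',)
    output_required = ('provider', 'value')


def check_input(sio, request):
    """Raise ValueError if a required flat input field is missing."""
    missing = [name for name in sio.input_required if name not in request]
    if missing:
        raise ValueError('Missing required input: {}'.format(', '.join(missing)))


def to_json(sio, items):
    """Serialize the declared output fields of each item to JSON."""
    rows = [{name: item[name] for name in sio.output_required} for item in items]
    return json.dumps({sio.response_elem: rows})


def to_xml(sio, items):
    """Serialize the same items to XML, using the same declaration."""
    root = ET.Element(sio.response_elem)
    item_list = ET.SubElement(root, 'item_list')
    for item in items:
        elem = ET.SubElement(item_list, 'item')
        for name in sio.output_required:
            ET.SubElement(elem, name).text = str(item[name])
    return ET.tostring(root, encoding='unicode')


items = [{'provider': 'Yahoo!', 'value': '298.8'}]
check_input(SimpleIOSketch, {'symbol': 'GOOG'})
print(to_json(SimpleIOSketch, items))
print(to_xml(SimpleIOSketch, items))
```

Since the declaration lists only flat field names, one list of plain dicts can be rendered in either format. The same restriction explains why arbitrarily nested documents do not fit this model.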

The service accepts a ticker symbol (such as GOOG or RHT) and issues two HTTP requests, cleans up the responses and combines them into a common format that can be returned as either JSON or XML depending on what the request format was.

# anyjson
from anyjson import loads

# bunch
from bunch import bunchify

# decimal
from decimal import Decimal

# lxml
from lxml.objectify import fromstring

# Zato
from zato.server.service import Service

class GetMarketCap(Service):
    """ Returns market capitalization value in billion USD by a company's symbol.
    """
    class SimpleIO:
        response_elem = 'market_cap'
        input_required = ('symbol',)
        output_required = ('provider', 'value')

    def handle(self):

        # . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
        # Yahoo
        # . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

        # Fetch a connection to Y! by its name
        yahoo = self.outgoing.plain_http.get('Yahoo! YQL')

        # Build URL params to issue a YQL query with.
        q = 'select * from yahoo.finance.quotes where symbol="{}"'.format(
            self.request.input.symbol)
        url_params = {'q':q, 'format':'json', 'env':'http://datatables.org/alltables.env'}

        # Invoke Y! and create a bunch instance out of the JSON response so
        # we can reference the elements using dot notation.
        yahoo_response = bunchify(loads(yahoo.conn.get(self.cid, url_params).text))

        # Clean up the response from Y! - chop off the last character if there
        # was a business response at all. Assumes the response is always
        # in billions.
        if yahoo_response.query.results.quote:
            value1 = yahoo_response.query.results.quote.MarketCapitalization
            value1 = Decimal(value1[:-1]) if value1 else 'n/a'
        else:
            value1 = 'n/a'

        # A new response item is appended to the list of items Zato will
        # serialize to JSON or XML, depending on how the service was invoked.
        item1 = {'provider':'Yahoo!', 'value': str(value1)}
        self.response.payload.append(item1)

        # . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
        # Google
        # . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

        # Fetch a connection to Google by its name
        google = self.outgoing.plain_http.get('Google Finance')

        # Build URL params to invoke Google with
        url_params = {'stock':self.request.input.symbol}

        # Invoke Google and create an Objectify instance out of the XML response
        # so we can reference the elements using dot notation.
        google_response = fromstring(google.conn.get(self.cid, url_params).text)

        # Clean up the response from Google - convert from millions to billions
        # if there was a business response at all.
        if hasattr(google_response.finance, 'market_cap'):
            value2 = Decimal(google_response.finance.market_cap.get('data')) / 1000
        else:
            value2 = 'n/a'

        # Again, a plain Python dict (hashmap) is appended to the response object
        # and serialization will be done by Zato.
        item2 = {'provider':'Google', 'value': str(value2)}
        self.response.payload.append(item2)
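The two clean-up steps in the service can be tried in isolation. The sketch below runs on Python 3 with no dependencies; the helper names and the raw input values are assumptions chosen for illustration, not actual provider responses.

```python
from decimal import Decimal


def yahoo_to_billions(raw):
    """Strip the trailing unit character, e.g. '298.8B' becomes Decimal('298.8')."""
    return Decimal(raw[:-1]) if raw else 'n/a'


def google_to_billions(raw_millions):
    """Convert a value given in millions to billions."""
    return Decimal(raw_millions) / 1000


# Illustrative inputs only.
print(yahoo_to_billions('298.8B'))
print(google_to_billions('298815.05'))
```

Using Decimal rather than float keeps the figures exact, so str() on the result gives the digits that were actually received, shifted by three places where needed.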

This is a very simple integration example and not every scenario will let one use SIO. Regardless of any service's complexity, though, the point is that coding in Python should not be shunned. Many aspects will typically look almost like pseudo-code, except that it is executable - here are some more usage examples to illustrate it.

In fact, this is the reason why Zato was created using this language. Python hits the sweet spot: it is a sufficiently high-level language that one can stay focused on integrations with as few nuisances and language-specific quirks as possible, yet it is a real general-purpose programming language, not a limited, possibly graphical, domain-specific 4GL.

Back to the example: the GUI can be used to create the resources the service expects, which here are the 'Yahoo! YQL' and 'Google Finance' outgoing HTTP connections. This can also be done from the command line using a JSON config, but the GUI is shown here.

These particular APIs require no security, but had it been required, HTTP Basic Auth, WS-Security Username Tokens or technical accounts (similar to Basic Auth but without requiring BASE64) could have been used.

The service can now be hot-deployed using either the GUI or the command line. The latter will be used here:
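For readers unfamiliar with the BASE64 step that HTTP Basic Auth involves, here is a short Python 3 sketch of how a client builds the header. The `basic_auth_header` function name is made up for this example, and the credentials are the well-known sample pair from the HTTP Basic Auth specification.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Auth 'Authorization' header."""
    credentials = '{}:{}'.format(username, password).encode('utf-8')
    return 'Basic ' + base64.b64encode(credentials).decode('ascii')


print(basic_auth_header('Aladdin', 'open sesame'))
```

Note that BASE64 is an encoding, not encryption, which is why such credentials are normally sent over a secure HTTP channel.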

$ cp stockmarket.py /opt/server1/pickup-dir

A confirmation message will be written out to server logs:

2013-06-20 19:25:16,115 - INFO - Uploaded package id:[53], payload_name:[stockmarket.py]

The service can now be invoked using Zato's CLI or the GUI. Let's use the former, first with JSON and then with XML over SOAP:

$ zato service invoke /opt/server1 stockmarket.get-market-cap \
--payload '{"symbol":"GOOG"}'
{u'market_cap': [
{u'value': u'298.8', u'provider': u'Yahoo!'},
{u'value': u'298.81505', u'provider': u'Google'}
]}
$
$ zato service invoke /opt/server1/ stockmarket.get-market-cap --data-format xml \
--transport soap --payload '\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" \
xmlns:zato="https://zato.io/ns/20130518"> \
<soapenv:Body> \
<zato:request> \
<zato:symbol>IBM</zato:symbol> \
</zato:request> \
</soapenv:Body> \
</soapenv:Envelope>'
<market_cap>
  <zato_env>
    <cid>K255124128065587321859442392853212603320</cid>
    <result>ZATO_OK</result>
  </zato_env>
  <item_list>
    <item>
      <provider>Yahoo!</provider>
      <value>8.763</value>
    </item>
    <item>
      <provider>Google</provider>
      <value>8.76324</value>
    </item>
  </item_list>
</market_cap>
$
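A client receiving the XML response shown above could read it with nothing more than Python 3's standard library. This is only a sketch of one way a caller might consume the output; the response text is copied from the example and trimmed to the elements used here.

```python
import xml.etree.ElementTree as ET

# Response copied from the example above, trimmed to the relevant elements.
response = """<market_cap>
  <zato_env>
    <result>ZATO_OK</result>
  </zato_env>
  <item_list>
    <item><provider>Yahoo!</provider><value>8.763</value></item>
    <item><provider>Google</provider><value>8.76324</value></item>
  </item_list>
</market_cap>"""

root = ET.fromstring(response)

# Check the envelope's result code before touching the business data.
if root.findtext('zato_env/result') != 'ZATO_OK':
    raise RuntimeError('Service reported an error')

values = {item.findtext('provider'): item.findtext('value')
          for item in root.iter('item')}
print(values)
```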

The CLI is just a means for admins or developers to access services quickly. In practice, the service would now be mounted on a secure HTTP channel over which client applications would access it. No restarts would be needed.

A glance over other features in 1.1

Services

Services are written in Python though if necessary, it is also possible to create them in C or C++.

Unless a developer insists on digging into low-level details, a service is decoupled from the transport layer and can focus on data validation, enrichment, transformations, routing or invocations of other services. A browser-based GUI, CLI or JSON config are used for the management of services.

Services are also isolated from security - they can assume that if they were invoked it means previous layers took care of authentication/authorization.

A single service can be exposed through multiple channels simultaneously, each using different data format, security definition or transport.

If using JSON/XML/SOAP, a service receives on input a Python object that can be accessed using regular dot notation (e.g. request.customer.payment.date) without any manual (de-)serialization on the programmer's end. If using SimpleIO (SIO), the same object is produced regardless of the data format, so the same service can be exposed over multiple channels to client applications, each using any of the three formats.

Hooks can be used to influence a service's lifecycle and to implement code common across more than one service (this in addition to Python's class inheritance).
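How hooks and inheritance can combine to share code across services is sketched below in plain Python 3. The `ServiceSketch` base class and the `before_handle`/`after_handle` hook names are assumptions for this illustration and should not be read as Zato's actual hook API.

```python
class ServiceSketch:
    """Stand-in base class; the hook names here are hypothetical."""

    def before_handle(self):
        pass

    def after_handle(self):
        pass

    def handle(self):
        raise NotImplementedError

    def run(self):
        # The framework, not the service author, drives the lifecycle.
        self.before_handle()
        try:
            return self.handle()
        finally:
            self.after_handle()


class AuditedService(ServiceSketch):
    """Common behaviour shared through inheritance plus hooks."""

    def __init__(self):
        self.audit_log = []

    def before_handle(self):
        self.audit_log.append('start ' + type(self).__name__)

    def after_handle(self):
        self.audit_log.append('end ' + type(self).__name__)


class GetQuote(AuditedService):
    def handle(self):
        return {'symbol': 'GOOG'}


service = GetQuote()
print(service.run())
print(service.audit_log)
```

The concrete service only implements handle; the auditing lives in one place and applies to every subclass.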

Synchronous invocations always happen in the same OS-level process and thread, so if an exception is raised it can be caught and a live traceback (stacktrace) is available.

Asynchronous invocations are routed through Redis. A message is first published there and another service, possibly on another server, picks up the request. This is also how the scheduler works: a job (service) execution request is published on Redis and a target service receives it on input.

It should be noted that all programming abstractions around data sources, formats and transports are for a developer's convenience only. It's always possible to, for instance, directly access raw messages as bytes/strings should a need arise to implement a feature Zato doesn't offer itself.
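The practical benefit of staying in one process and thread can be seen in a tiny Python 3 sketch. The `invoke_sync` helper and the deliberately failing service are hypothetical; the point is only that the caller can catch the exception and inspect the full traceback on the spot.

```python
import traceback


def invoke_sync(service, payload):
    """Call a service in the current process and thread, capturing any failure."""
    try:
        return {'ok': True, 'response': service(payload)}
    except Exception:
        # The exception was raised in this very call stack, so the full
        # traceback is available right here.
        return {'ok': False, 'traceback': traceback.format_exc()}


def broken_service(payload):
    return 1 / payload['divisor']


outcome = invoke_sync(broken_service, {'divisor': 0})
print(outcome['traceback'])
```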
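The publish-then-pick-up flow can be mimicked in a single Python 3 process, with a `queue.Queue` standing in for Redis and a thread standing in for another server. All names below are assumptions for this sketch; it shows the message flow only, not Zato's broker protocol.

```python
import json
import queue
import threading

broker = queue.Queue()   # in-process stand-in for the Redis-based broker
results = []


def invoke_async(service_name, payload):
    """Publish a request and return immediately, without waiting for a reply."""
    broker.put(json.dumps({'service': service_name, 'payload': payload}))


def worker(services):
    """Pick up published requests and run the target service."""
    while True:
        message = broker.get()
        if message is None:   # sentinel used only to stop the sketch
            break
        request = json.loads(message)
        results.append(services[request['service']](request['payload']))


services = {'stockmarket.get-market-cap': lambda payload: payload['symbol'].lower()}
thread = threading.Thread(target=worker, args=(services,))
thread.start()

invoke_async('stockmarket.get-market-cap', {'symbol': 'GOOG'})
broker.put(None)
thread.join()
print(results)
```

The caller never sees the result directly, which mirrors why only HTTP, and not the asynchronous route, supports blocking request/response calls.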

Almost all operations on live clusters, such as deployment or reconfiguration, can be done without server restarts and without disrupting the message flow.

GUI

The GUI is written in Django using hand-written HTML/CSS/jQuery and was developed with guidelines postulated by information design and usability experts such as Stephen Few in mind.

Its main features are:

  • Cluster management - quick links to key components, the ability to add/remove servers to a load-balancer's config and to check whether a server is up and running from both a server's and load-balancer's point of view.
  • Load balancer - GUI to update basic data and a HAProxy's config source code view. It's also possible to remotely execute HAProxy commands and access its statistics.
  • Services
    • Hot-deploying a service
    • Listing services along with basic statistics: request rate and mean response time (also as a chart)
    • Inspecting a particular service: what channels it's exposed over, on what servers, and what its basic statistics are
    • Service invoker to execute services from the browser
    • Source-code browser with syntax highlighting to see what exactly is installed on a server (source code is always available because this is what is deployed; bytecode is generated on the fly)
    • Uploading/downloading a WSDL (although it's not used by Zato itself to validate requests) and making it available at a publicly accessible URL
    • Storing and accessing sample requests/responses (1 in N requests)
    • Storing and accessing slow requests/responses (exceeding a given threshold)
  • Security - adding and removing definitions that can be reused by channels and outgoing connections
  • Channels and outgoing connections - the former are AMQP, JMS WebSphere MQ, plain HTTP, SOAP or ZeroMQ; the latter include AMQP, FTP, JMS WebSphere MQ, plain HTTP, SOAP, SQL or ZeroMQ. Creating new objects of these types or updating existing ones almost never requires a restart and needs no coding.
  • Redis NoSQL - a GUI for executing Redis commands remotely. Also, the ability to specify data dictionaries, i.e. mappings expressing that, for instance, an American dollar is USD in ISO 4217 but has a numerical code of 840.
  • Scheduler for creating, updating or manually executing jobs. Running a job means executing a service that can be optionally given a static payload on input.
  • Option to mark clusters with colors in the GUI, e.g. production is always blue, tests are green and development is grey. This is to prevent the 'fat finger' syndrome.
  • Statistics to quickly answer two questions - what were the slowest services and what were the most commonly used ones in a given time frame (hour/day/month/year or an arbitrary period). The data can be compared in the browser or exported to CSV (also available through API). The load balancer provides its own run-time statistics too.
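The two statistics questions from the list above, slowest and most commonly used services, plus a CSV export, can be illustrated with a small Python 3 sketch. The sample data, service names and column names are invented; Zato's own statistics storage and export format are not shown here.

```python
import csv
import io
from collections import defaultdict


def summarize(samples):
    """Aggregate (service_name, response_time_ms) samples per service."""
    totals = defaultdict(lambda: {'usage': 0, 'total_ms': 0.0})
    for name, elapsed_ms in samples:
        totals[name]['usage'] += 1
        totals[name]['total_ms'] += elapsed_ms
    return [{'service': name, 'usage': t['usage'], 'mean_ms': t['total_ms'] / t['usage']}
            for name, t in totals.items()]


def top_n(rows, key, n):
    """Return the n rows with the highest value under `key`."""
    return sorted(rows, key=lambda row: row[key], reverse=True)[:n]


def to_csv(rows):
    """Render the summary rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=['service', 'usage', 'mean_ms'])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical samples: (service name, response time in milliseconds).
samples = [('crm.get-customer', 40.0), ('crm.get-customer', 60.0),
           ('billing.create-invoice', 250.0)]
rows = summarize(samples)
slowest = top_n(rows, 'mean_ms', 1)
most_used = top_n(rows, 'usage', 1)
print(slowest[0]['service'], most_used[0]['service'])
print(to_csv(rows))
```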

CLI

The command-line interface can be used to perform a range of admin operations, such as creating cluster components, running sanity checks against them, starting, stopping or updating components, or managing the crypto material.

Two commands will be shown below. The first one fetches OS-level information regarding a running component. The other one can be used to create a fully working environment consisting of two servers, a web admin and a load-balancer, each using randomly generated crypto material, all configured and set up to work.

$ zato info /opt/z1/load-balancer
+--------------------------------+--------------------------------+
| Key                            | Value                          |
+================================+================================+
| component_details              | {"created_user_host":          |
|                                | "dev1@box1", "version":        |
|                                | "1.1", "component":            |
|                                | "LOAD_BALANCER", "created_ts": |
|                                | "2013-06-19T14:55:42.027946"}  |
+--------------------------------+--------------------------------+
| component_full_path            | /opt/z1/load-balancer          |
+--------------------------------+--------------------------------+
| component_host                 | box1/box1                      |
+--------------------------------+--------------------------------+
| component_running              | True                           |
+--------------------------------+--------------------------------+
| current_time                   | 2013-06-20T15:05:12.078273     |
+--------------------------------+--------------------------------+
| current_time_utc               | 2013-06-20T13:05:12.078289     |
+--------------------------------+--------------------------------+
| master_proc_connections        | [connection(fd=4, family=2,    |
|                                | type=1,                        |
|                                | local_address=('127.0.0.1',    |
|                                | 20151), remote_address=(),     |
|                                | status='LISTEN')]              |
+--------------------------------+--------------------------------+
| master_proc_create_time        | 2013-06-20T13:04:15.440000     |
+--------------------------------+--------------------------------+
| master_proc_create_time_utc    | 2013-06-20T11:04:15.440000+00: |
|                                | 00                             |
+--------------------------------+--------------------------------+
| master_proc_name               | python                         |
+--------------------------------+--------------------------------+
| master_proc_pid                | 10793                          |
+--------------------------------+--------------------------------+
| master_proc_username           | dev1                           |
+--------------------------------+--------------------------------+
| master_proc_workers_no         | 0                              |
+--------------------------------+--------------------------------+
| master_proc_workers_pids       | []                             |
+--------------------------------+--------------------------------+
$ zato quickstart create /opt/qs-1 postgresql localhost 5432 zato1 zato1 localhost 6379

ODB database password (will not be echoed):
Enter the odb_password again (will not be echoed):
Key/value database password (will not be echoed):
Enter the kvdb_password again (will not be echoed):
[1/8] Certificate authority created
[2/8] ODB schema created
[3/8] ODB initial data created
[4/8] server1 created
[5/8] server2 created
[6/8] Load-balancer created
Superuser created successfully.
[7/8] Web admin created
[8/8] Management scripts created
Quickstart cluster quickstart-309837 created
Web admin user:[admin], password:[hita-yabe-yenb-ounm]
Start the cluster by issuing the /opt/qs-1/zato-qs-start.sh command
Visit https://zato.io/support for more information and support options
$

API

Zato's own admin services are documented and available to client applications in both JSON and SOAP, with the goal of enabling alternative tools or GUIs for developers or administrators. In fact, the Django-based web admin Zato provides is such an application: everything is performed through the API and the web console never directly accesses any config data store.

Summary

Zato 1.1 is lightweight, yet complete, and can already be used for many tasks. More features will be added with time. In particular, apart from a set of additions in the tooling and GUI areas, the next few releases will focus on providing out-of-the-box business APIs for connecting to concrete systems or applications, along with a development kit for users to create their own. More work will also be done to codify and support other integration patterns.

About the Author

Dariusz Suchojad has 12 years’ experience in enterprise architecture and software engineering, including 8 years in EAI/ESB/SOA/BPM/SSO in telecommunications and banking. He is equally comfortable dissecting proprietary protocols, developing systems of systems, talking with businesses and anything in between. Having spent far too many nights putting out fires caused by the poor quality of solutions he had to use daily, he quit his job and spent a total of 16 months, spread over a couple of years, creating Zato, a Python-based platform for integrations and backend servers.
