Evaluating Hyperledger Composer

Posted by Glenn Engstrand, reviewed by Thomas Betts, on Aug 06, 2018. Estimated reading time: 12 minutes

Key Takeaways

  • Only a very narrow subset of use cases is appropriate for this technology.
  • The way that Hyperledger Fabric uses MVCC (multiversion concurrency control) when validating write batches is safe enough for financial applications in a decentralized ledger but may not scale efficiently enough to be attractive to B2C startups.
  • Avoid this technology if you can make all of your transactions idempotent.
  • This technology is still somewhat immature.
  • Even though this is an open-source project, there are currently some limitations in choice of cloud provider when moving to a production environment (that could change).

I have been following the three-year-old Hyperledger Fabric open-source project since its code base moved to GitHub about two years ago. The Hyperledger projects are hosted by the Linux Foundation and sponsored mostly by IBM. They promote the use of what are known as private, or permissioned, blockchains. With a public blockchain, the first anonymous miner who solves a cryptographic puzzle gets to commit the next block of ledger entries to the chain. Private blockchains solve the consensus problem among authenticated peers using algorithms such as Raft or Paxos.

With a blockchain, you get CRUD-style access to the ledger. You also get the ability to store mini programs known as smart contracts (called chaincode in Hyperledger Fabric) on the ledger. When a transaction is submitted to a smart contract, all ledger state-mutating operations executed within the chaincode are atomic: either all operations get committed or none of them do. If the underlying ledger data accessed by the chaincode has changed by the time the chaincode's operations are to be committed, then the transaction is aborted. This happens automatically and is a big part of the value of writing smart contracts.
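
To make the abort rule concrete, here is a toy sketch of optimistic, version-checked commits. It is not Fabric code: VersionedStore, simulate, and commit are hypothetical names, and the model ignores endorsement, ordering, and blocks entirely. It only shows the idea that a transaction records the versions it read and is rejected at commit time if any of them have changed.

```javascript
// Toy model of optimistic (MVCC-style) validation. All names are illustrative,
// not Hyperledger Fabric or Composer APIs.
class VersionedStore {
  constructor() {
    this.data = new Map(); // key -> { value, version }
  }

  read(key) {
    return this.data.get(key) || { value: undefined, version: 0 };
  }

  // Run a contract against the current state, recording what it read
  // (with versions) and what it wants to write. Nothing is applied yet.
  simulate(contract) {
    const readSet = new Map();  // key -> version seen at read time
    const writeSet = new Map(); // key -> proposed new value
    const ctx = {
      get: (key) => {
        if (writeSet.has(key)) return writeSet.get(key);
        const { value, version } = this.read(key);
        if (!readSet.has(key)) readSet.set(key, version);
        return value;
      },
      put: (key, value) => { writeSet.set(key, value); },
    };
    contract(ctx);
    return { readSet, writeSet };
  }

  // Apply all writes only if every key that was read is still at the
  // version that was seen; otherwise apply none of them.
  commit({ readSet, writeSet }) {
    for (const [key, version] of readSet) {
      if (this.read(key).version !== version) return false; // read conflict
    }
    for (const [key, value] of writeSet) {
      this.data.set(key, { value, version: this.read(key).version + 1 });
    }
    return true;
  }
}

const store = new VersionedStore();
const addOne = (ctx) => ctx.put('counter', (ctx.get('counter') || 0) + 1);

const first = store.simulate(addOne);
const second = store.simulate(addOne); // simulated before `first` commits
console.log(store.commit(first));          // true
console.log(store.commit(second));         // false: 'counter' changed after it was read
console.log(store.read('counter').value);  // 1
```

The second transaction is rejected rather than silently overwriting the first, which is the property that matters for double-spend protection and also the reason a caller has to be prepared to retry.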

The same folks who released Hyperledger Fabric also released another open-source project, Hyperledger Composer, which makes it easy for developers to write chaincode for Hyperledger Fabric and the DApps (decentralized applications) that call it.

Why now?

Thoughtworks is a technology consulting company (acquired by Apax Partners) that markets itself as "a community of passionate individuals whose purpose is to revolutionize software design, creation and delivery". Twice a year, they release a Technology Radar report that recommends certain technologies be put on hold, assessed, put on trial, or adopted. Volume 18 of this Technology Radar (pdf), published in May 2018, placed Hyperledger Composer in the trial ring, which they define as "Worth pursuing. It is important to understand how to build up this capability. Enterprises should try this technology on a project that can handle the risk."

In my role as a software architect, I evaluate emerging technology, and Hyperledger Composer made it onto my personal radar. Every time I evaluate an emerging technology, I use it to implement a rudimentary news feed microservice. Each of these microservices is feature-identical, and all are load-tested in exactly the same manner. In this way, I can make reasonable statements about the performance characteristics of any specific technology in comparison with the other technologies. I chose the problem domain of a news feed because of its familiarity and ubiquity in social networks and because it is complicated enough to require non-trivial solutions yet simple enough to understand without getting lost in implementation minutiae. I publish the source code for these microservices, along with the source code needed to load-test them and to collect and analyze the performance results, in a freely available repository on GitHub. In the spirit of scientific peer review, I encourage you to clone the repo and reproduce the results for yourself.

Building the test microservice

Hyperledger Composer allows you to write smart contracts in server-side JavaScript. It also provides a native client library through which Node.js applications can access the ledger and submit transactions to these smart contracts. For the purposes of this experiment, I used an already developed Node.js microservice (see server/feed4 in the repo) as the control. I copied the source code for that microservice to a new folder (see server/feed7/micro-service in the repo) and then replaced all references to MySQL, Redis, and Cassandra with calls to the Hyperledger Composer client API. The feed7 project serves as the test in this experiment. Both projects use Elasticsearch because one of the requirements of each news-feed service is keyword-based search, and a blockchain is not appropriate for that. Like most of the other microservices in this repo, the feed7 microservice uses Swagger to define its REST API. The specification can be found in the server/swagger/news.yaml file.

With Hyperledger Composer, you create a business network that consists of a data model, a set of transactions that manipulate the data model, and a set of queries by which those transactions can access data within the model. Hyperledger Composer works with Hyperledger Fabric, whose basic network consists of CouchDB, the default peer, the business network peer, a certificate authority service, and an orderer. The feed7 microservice accesses Hyperledger Fabric in the context of this business network, which you can find in the server/feed7/business-network folder.

Figure 1: Feed7 components (the test).

In the model for this business network, the broadcaster is the participant. There are friendship, inbound, and outbound assets. The friendship asset captures the friend relationship between two broadcasters. Each inbound asset is a news-feed item meant for the associated broadcaster. The outbound asset is a news-feed item that was sent by the associated broadcaster. There are two transactions in this business network: broadcasters can friend each other and a broadcaster can broadcast a news-feed item to its friends. The only query needed inside the business network is for the broadcast transaction to access the broadcaster’s friends.

/**
 * Broadcast a news-feed item from the sender to each of the sender's friends.
 * @param {info.glennengstrand.Broadcast} tx the broadcast transaction
 * @transaction
 */
async function broadcastParticipants(tx) {
  const factory = getFactory();
  const created = Date.now();
  const now = new Date();
  // Key prefix shared by the outbound item and each inbound copy.
  const k = tx.sender.participantId + '|' + created + '|';
  const outboundRegistry = await getAssetRegistry('info.glennengstrand.Outbound');
  const inboundRegistry = await getAssetRegistry('info.glennengstrand.Inbound');

  // Record the item once as sent by the broadcaster. The random suffix keeps the key unique.
  const ok = 'Outbound:' + k + Math.random();
  const o = factory.newResource('info.glennengstrand', 'Outbound', ok);
  o.created = now;
  o.subject = tx.subject;
  o.story = tx.story;
  o.sender = tx.sender;
  await outboundRegistry.add(o);

  // Look up the sender's friends and add one inbound copy per friend.
  const friends = await query('broadcasterFriends', { broadcaster: 'resource:info.glennengstrand.Broadcaster#' + tx.sender.participantId });
  for (const friend of friends) {
    const ik = 'Inbound:' + k + Math.random();
    const inb = factory.newResource('info.glennengstrand', 'Inbound', ik);
    inb.created = now;
    inb.subject = tx.subject;
    inb.story = tx.story;
    inb.recipient = friend.to;
    await inboundRegistry.add(inb);
  }
}

Code Sample 1: A smart contract.

The Hyperledger Composer API that is intended to be called within a smart contract closely resembles the API that is intended to be called by the Node.js DApp, but there were some interesting differences. Within the smart contract, you had to use the async/await mechanism, but within the DApp, you had to use promises. Smart contracts always had to use predefined queries, whereas DApp code could build a query dynamically and run it. When querying or retrieving either a participant or an asset from the DApp, you had to include the constant "PID:" as part of the key, but not when accessing the same data from chaincode.

const elastic = require('../repositories/elastic');
const BusinessNetworkConnection = require('composer-client').BusinessNetworkConnection;

// Submit the transaction. On failure (typically an MVCC read conflict), wait for
// the current delay plus up to one second of random jitter, then retry with the
// delay doubled. Retries are unbounded.
function submitTransaction(bizNetworkConnection, transaction, from, subject, story, callback, retry) {
  bizNetworkConnection.submitTransaction(transaction)
    .then(() => {
      const retVal = {
        from: from,
        occurred: Date.now(),
        subject: subject,
        story: story
      };
      // Index the story in Elasticsearch so that keyword search can find it.
      elastic.index(from, story);
      callback(null, retVal);
    })
    .catch(() => {
      setTimeout(() => {
        submitTransaction(bizNetworkConnection, transaction, from, subject, story, callback, 2 * retry);
      }, retry + Math.floor(Math.random() * 1000));
    });
}

exports.addOutbound = function(args, callback) {
  const bizNetworkConnection = new BusinessNetworkConnection();
  bizNetworkConnection.connect(process.env.CARD_NAME)
    .then((bizNetworkDefinition) => {
      const factory = bizNetworkDefinition.getFactory();
      const transaction = factory.newTransaction('info.glennengstrand', 'Broadcast');
      // From the DApp, participant keys carry the "PID:" prefix.
      transaction.sender = factory.newRelationship('info.glennengstrand', 'Broadcaster', 'PID:' + args.body.value.from);
      transaction.subject = args.body.value.subject;
      transaction.story = args.body.value.story;
      // Start with a two-second retry delay.
      submitTransaction(bizNetworkConnection, transaction, args.body.value.from, args.body.value.subject, args.body.value.story, callback, 2000);
    })
    .catch((err) => callback(err)); // report connection failures to the caller
};

Code Sample 2: A DApp calling a smart contract.

In the DApp source code, you may notice the retry logic around submitting a transaction. It is there because Hyperledger Fabric uses MVCC (multiversion concurrency control) when validating write batches and readily reports a read conflict error when concurrent transactions touch the same keys. The remedy is to sleep for a slightly randomized amount of time and then retry the transaction.
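
The sleep-then-retry pattern can be factored out of the submission code. The following is a minimal sketch in plain Node.js, not part of the feed7 source: retryWithBackoff and its options (maxAttempts, baseDelayMs, jitterMs, sleep) are hypothetical names. Unlike the DApp code, it gives up after a fixed number of attempts, which is an assumption about what a production caller would want.

```javascript
// Retry an async operation, doubling the delay after each failure and adding
// random jitter so that competing clients do not retry in lockstep.
// All option names are illustrative.
async function retryWithBackoff(operation, options = {}) {
  const {
    maxAttempts = 5,
    baseDelayMs = 2000,
    jitterMs = 1000,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  let delay = baseDelayMs;
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // give up and surface the last error
      await sleep(delay + Math.floor(Math.random() * jitterMs));
      delay *= 2;
    }
  }
}

// Hypothetical usage: `submit` stands in for a call that fails twice with a
// simulated read conflict before succeeding.
let calls = 0;
const submit = async () => {
  calls += 1;
  if (calls < 3) throw new Error('simulated read conflict');
  return 'committed';
};

retryWithBackoff(submit, { baseDelayMs: 10, jitterMs: 5 })
  .then((result) => console.log(result, 'after', calls, 'attempts'));
// prints: committed after 3 attempts
```

Making sleep injectable keeps the helper easy to test without real delays. The jitter matters because two clients that conflict once are likely to conflict again if they both wait exactly the same amount of time.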

Testing the microservice under load

Both the control and the test use the same load-test application, which you can find in the client/load folder of the repo. In an endless loop, the load test creates 10 participants and gives each participant anywhere from two to four friends. It then has each participant broadcast 10 news-feed items, each of which consists of 150 randomly generated numbers. The load-test app spins up three threads, each of which spends 90% of its time on this process and the other 10% testing the search functionality.

Instead of calling the news-feed microservice directly, the load-test application calls an open-source API gateway called Kong, which proxies each request from the load-test application to the news-feed microservice. Kong is configured to use the http-log plugin in order to send request and response logs for each call to another microservice, which in turn sends the performance-related parts to Elasticsearch in batches. You can find the source code for the Kong logger microservice in the client/perf4 folder.

I used Kibana to visualize the performance data, including throughput, average latency, and percentile latency. Whenever possible, I collected summaries of performance metrics from two hours of data.
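
For readers who want to reproduce such summaries outside of Kibana, the three metrics can be computed directly from raw request records. This is a minimal sketch, not the perf4 code: the record shape ({ timestampMs, latencyMs }) and the function names are assumptions, and the percentile uses the simple nearest-rank method, which may differ slightly from the approximation that Elasticsearch uses.

```javascript
// Nearest-rank percentile over an ascending-sorted array:
// the smallest value with at least p percent of the samples at or below it.
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sortedValues.length);
  return sortedValues[Math.max(rank, 1) - 1];
}

// Summarize request records into throughput and latency figures.
// The record shape { timestampMs, latencyMs } is an assumption for illustration.
function summarize(records) {
  const perMinute = new Map(); // minute index -> request count
  const latencies = [];
  for (const { timestampMs, latencyMs } of records) {
    const minute = Math.floor(timestampMs / 60000);
    perMinute.set(minute, (perMinute.get(minute) || 0) + 1);
    latencies.push(latencyMs);
  }
  latencies.sort((a, b) => a - b);
  const total = latencies.reduce((sum, v) => sum + v, 0);
  return {
    // Average over the minutes that saw at least one request.
    requestsPerMinute: records.length / Math.max(perMinute.size, 1),
    meanLatencyMs: latencies.length ? total / latencies.length : NaN,
    p99LatencyMs: percentile(latencies, 99),
  };
}

console.log(summarize([
  { timestampMs: 0, latencyMs: 10 },
  { timestampMs: 1000, latencyMs: 20 },
  { timestampMs: 61000, latencyMs: 30 },
  { timestampMs: 62000, latencyMs: 40 },
]));
// { requestsPerMinute: 2, meanLatencyMs: 25, p99LatencyMs: 40 }
```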

Figure 2: Per-minute throughput of outbound post requests for a test (i.e., Hyperledger Composer and Fabric) experiment.

Figure 3: Per-minute average of latency of outbound post requests for a test (i.e., Hyperledger Composer and Fabric) experiment.

I deployed the control twice, both times in EC2 using m4.xlarge instances; one time when the feed4 service was running in a Docker container and another time when it wasn’t. The Docker version experienced 6% less throughput and almost no difference in latency. I also deployed the test twice, both times in EC2 using m4.xlarge instances for Kong, Cassandra (used by Kong), Elasticsearch, and the load-test application. The first test deployed Hyperledger Fabric, Composer, and the feed7 business network and microservice on an m4.xlarge and the second test used an m4.2xlarge, to see the performance differences in scaling up.

Figure 4: Feed7 deployment (the test).

To allow for valid comparative analysis, and because production configurations were not readily available, both the control and the test used developer configurations for everything. There is an AWS CloudFormation template for Hyperledger Fabric, but it deploys the basic network, which is Hyperledger-speak for a developer configuration. The only online documentation for a production configuration that I could find, outside of advertisements for IBM Cloud, was a couple of blog posts on Hacker Noon by some folks from VMware. Those posts claimed a production configuration, and a diagram indicated that the orderer was backed by Kafka, but the configtx.yaml file in the referenced GitHub repo shows an OrdererType of solo, not Kafka. That suggests a developer configuration. A comment in the source code says, "The solo consensus scheme is very simple, and allows only one consenter for a given chain. It accepts messages being delivered via Order/Configure, orders them, and then uses the blockcutter to form the messages into blocks before writing to the given ledger."

Performance results

There is both good and bad news for Hyperledger when it comes to performance under load. Here is the bad news: the Hyperledger version of the news feed demonstrated over 300 times less throughput and was about three orders of magnitude slower than the traditional version. The good news is that doubling the hardware capacity for the Hyperledger version yielded about 30% more throughput (29 to 38 requests per minute) and roughly halved the 99th percentile latency.

The control sustained over 13,000 outbound post requests (i.e., news-feed broadcasts) per minute (RPM). Average latency was 4 ms and the 99th percentile was 9 ms. The test experienced on average 29 outbound post requests per minute for the m4.xlarge and 38 for the m4.2xlarge. The average latency was 4.7 s for the m4.xlarge and 3.2 s for the m4.2xlarge. The 99th percentile latency was 10.2 s for the m4.xlarge and 4.9 s for the m4.2xlarge.
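
The ratios behind these comparisons can be worked out from the figures just quoted. The short sketch below does only that arithmetic. The variable names are illustrative, and the control throughput is taken as exactly 13,000 RPM even though the text says "over 13,000", so the throughput ratios are lower bounds.

```javascript
// Figures as reported in the text, with latencies converted to milliseconds.
const control = { rpm: 13000, meanMs: 4, p99Ms: 9 };
const xlarge = { rpm: 29, meanMs: 4700, p99Ms: 10200 };     // m4.xlarge
const twoXlarge = { rpm: 38, meanMs: 3200, p99Ms: 4900 };   // m4.2xlarge

const ratio = (a, b) => a / b;
const pctChange = (from, to) => ((to - from) / from) * 100;

// Control versus test
console.log(ratio(control.rpm, xlarge.rpm).toFixed(0) + 'x');        // 448x throughput
console.log(ratio(control.rpm, twoXlarge.rpm).toFixed(0) + 'x');     // 342x throughput
console.log(ratio(xlarge.meanMs, control.meanMs).toFixed(0) + 'x');    // 1175x mean latency
console.log(ratio(twoXlarge.meanMs, control.meanMs).toFixed(0) + 'x'); // 800x mean latency

// Effect of scaling the test from m4.xlarge to m4.2xlarge
console.log(pctChange(xlarge.rpm, twoXlarge.rpm).toFixed(1) + '%');       // 31.0% throughput
console.log(pctChange(xlarge.meanMs, twoXlarge.meanMs).toFixed(1) + '%'); // -31.9% mean latency
console.log(pctChange(xlarge.p99Ms, twoXlarge.p99Ms).toFixed(1) + '%');   // -52.0% p99 latency
```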

Figure 5: Outbound-post comparative performance summary. Latency numbers are in milliseconds.

There are some other inefficiencies that I need to cover here. The CPU and performance-related metrics on the control quickly reached steady state while the same metrics on the test got worse and worse over time. The biggest offender with CPU was the default peer process in Fabric. This was surprising because the microservice always accessed the news-feed business network whose corresponding peer container was not as CPU intensive. Perhaps the default peer is used to endorse transactions? I could find no way to remove it from the configuration. In a production configuration, you would have multiple peers, otherwise the ledger would not be decentralized.

For both the test and the control, the microservice will eventually crash once the SSD for the database runs out of available storage. For the control, that happens in the Cassandra database after almost 30 million outbound posts have occurred. For the test, that happens in the CouchDB database after about 4,000 outbound posts have occurred. The SSD storage for the control and for the test has the same capacity, which is 20 GB. Clearly, storage efficiency is not currently a primary concern of the developers contributing to the Hyperledger Fabric project.
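
A rough per-post storage footprint follows from those numbers. The sketch below is back-of-the-envelope arithmetic only, under two assumptions that are not in the measurements: that 20 GB means decimal gigabytes, and that the entire disk was consumed by database data. Each outbound post also fans out into inbound items for the sender's friends, so these are upper bounds per broadcast, not per stored record.

```javascript
// Upper-bound storage consumed per outbound post, assuming the whole
// 20 GB disk (decimal gigabytes, an assumption) was filled by the database.
const diskBytes = 20 * 1e9;
const bytesPerPost = (posts) => diskBytes / posts;

const controlPosts = 30e6; // "almost 30 million" outbound posts (Cassandra)
const testPosts = 4000;    // "about 4,000" outbound posts (CouchDB under Fabric)

console.log(bytesPerPost(controlPosts).toFixed(0) + ' bytes per post (control)'); // 667 bytes
console.log((bytesPerPost(testPosts) / 1e6).toFixed(1) + ' MB per post (test)');  // 5.0 MB
console.log((controlPosts / testPosts).toFixed(0) + 'x more posts before the disk fills'); // 7500x
```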

Conclusion

Originally, I believed that the news-feed application would be a good use case for a blockchain. The primary action of the load-test application is appending friends and appending news-feed items, which sounds very similar to appending to a ledger. Now, however, I believe that analogy to be superficial. The major concern for blockchains is to prevent what is known as the double-spend problem. What good is a blockchain that cannot prevent participants from spending the same money twice? For public blockchains such as Bitcoin, that problem is handled using unspent transaction outputs, or UTXO. Hyperledger Fabric addresses the problem via MVCC on the read sets when validating write batches. Fabric does have inefficiencies that can be addressed as it matures, but I believe that this use of MVCC to prevent double spending is an inherent cause of the low throughput and high latency. For all intents and purposes, the news-feed transactions are idempotent. There are no significant consequences if two participants friend each other in a different order or multiple times, or broadcast items to each other in a different order or multiple times. Fabric is allocating a lot of CPU time and memory to prevent a problem that has no significant impact on a news feed.

This evaluation leads me to believe that the future of software development will not be eaten by blockchains. There is only a very narrow subset of use cases that justify the high computing costs inherent in automatic, guaranteed, distributed concurrency control and validation. Basically, you need a consortium marketplace in which transactions cannot be made idempotent. While evaluating Hyperledger Composer at this time has some merit, the current level of maturity makes committing to a production release in the near future highly problematic. The Hyperledger projects are all open source, but at the time of this writing there appear to be limitations in the choice of cloud provider when progressing to a production environment.

About the Author

Glenn Engstrand is a Software Architect at Adobe Systems, Inc. His focus is working with engineers to deliver scalable, server-side, 12-factor-compliant application architectures. Glenn was a breakout speaker at Adobe's internal Advertising Cloud developers' conferences in 2018 and 2017 and at the 2012 Lucene Revolution conference in Boston. He specializes in breaking monolithic applications up into microservices and in deep integration with Real-Time Communications infrastructure.

Tell us what you think

Hyperledger Fabric Use Cases by Baiju Jacob

Hello Glenn, Thank You very much for an informative assessment of the performance of Hyperledger network. Composer is a very good tool for building POCs but if you are working on Production grade network, it is best to use raw Fabric which will give you more flexibility. Endorsing peers in Fabric Network are those where the smart contracts are deployed. Based on the endorsement policy, it will run on each organisation's endorsing peers required in the policy. This is a critical decision while designing and sizing the peer nodes in the network.

There is a significant difference in the use cases for public chains and private/permissioned chains. Permissioned chains are used between organisations where there is a trust issue between them which is causing enormous delays. This could be reporting of a serious event in a clinical trial for pharmas or traceability of a defective perishable lot in a food supply chain. These are not necessarily high throughput scenarios, but the complexity of reconciliation in these processes means that it takes weeks and months today, which has financial and public consequences.

In short many of the use cases for blockchain cannot and may not be equated or compared with current use cases and cannot be tested with similar benchmarks. It is to be used where automating trust gives big dividends and eliminates months of audits/reconciliations or when the truth of the data is very critical for a right outcome. There are plenty of use cases for this and many of them may just be an add on to existing applications which will greatly optimise your business process

Re: Hyperledger Fabric Use Cases by Glenn Engstrand

Thank you, Baiju, for your insights into blockchain use cases. It never occurred to me that permissioned blockchains could help with workflow traceability involving high value assets whose responsibility is shared by multiple organizations. I would imagine that some challenges might present themselves when reconciling that traceability with privacy focused regulations such as HIPAA and GDPR but I suspect that those are solvable problems.

I wonder what other broad use cases could be more easily addressed by blockchains? Scenarios where high throughput and low latency is not required. Perhaps other readers here could share their experiences.

Re: Hyperledger Data Privacy & Throughput by Baiju Jacob

Hello Glenn,
As you rightly pointed out, the immutability of Blockchain has raised concerns for Data Privacy and has run into challenges with privacy centered regulations like GDPR. The good news is that Hyperledger Fabric V1.2 released in July 2018 introduces 'private data collection' which we believe could solve most of the challenges. Private Data Collection if properly designed and implemented could be used for private sensitive data which are required to build the trust and finalise the outcome at the time of the transaction, which can be removed or erased depending on the need at a later date without affecting the integrity of the chain. We have test implemented it and will apply this in some of the upcoming real use cases and will keep you posted.

Regarding throughput, it will be directly related to the use case. If you have a use case where you have to collect endorsements from multiple organisations across the globe, endorsement and network latency could be high compared to a transaction which has a leaner endorsement policy. Similarly, your commit latency could possibly be improved by a better design of the key-value store, and ordering latency could be improved using an adequately sized Kafka based ordering service.

So I won't rule out all high throughput use cases, but some may have to wait till more improvement happens in the later releases. Our experience shows that many of the real use cases currently being worked on are not for replacing existing apps but to do something which they could never do before, and many of them are integrating with existing apps.

Cloud Restrictions Clarification by Baiju Jacob

Another point I missed in my reply is on the conclusion that it is locked into one cloud provider. This is not accurate. All the components of Hyperledger are available as docker images and you can build up your network on your choice of cloud or on premise servers. It could even include multiple clouds and in house datacenters. Obviously you need the expertise to build/connect this network. But it is possible and has been done. Obviously the commercial cloud providers are not going to advertise this :). There is definitely a need to have better management tools to make this easier for all. Hyperledger Cello is a project under the foundation which is working on this and should address this soon.

Re: Cloud Restrictions Clarification by Glenn Engstrand

Permit me to clarify, Baiju. This article never made the "conclusion that it is locked into one cloud provider." What I said was "at the time of this writing there appear to be limitations in choice of cloud provider when progressing to a production environment." There is a lot of activity in this space and cloud vendors are rapidly making positioning statements when it comes to enterprise blockchains. Who knows what the future may hold?

I'm glad that you believe it to be possible to build your own production worthy networks on premises or manage them yourself in the cloud. It's a pretty large barrier to entry if companies are supposed to hire or train employees to become Hyperledger production operations gurus. That doesn't happen with the more traditional data storage technologies. If I want to stand up my own production worthy MySQL or Cassandra cluster, I don't take two years off to study the source code. I sign a Percona or DataStax support contract. Those two companies have impeccable reputations that you could safely bet on. What companies could I engage with that would give me that same peace of mind with running production worthy Fabric? Care to name some names?

Re: Cloud Restrictions Clarification by Baiju Jacob

Hello Glenn,
My point was to clarify that there is no need to wait for cloud providers to set up the network if an enterprise wants to. What the cloud providers provide today is a Fabric template on top of an Ubuntu/Linux instance. They don't change any source code. You can have the same with any cloud of your choice. The source is the same, which is open to all like any Linux offering. Enterprises have been doing this for many years, either with their in-house team or with their system integrators. The cloud subscription model is a new option which has come up in the last five years. For smaller organisations or those who don't have SIs or in-house expertise, you can check out the CDPaaS offering from ChainDigit (www.chaindigit.com/CDPaaS). And the expertise required is basic system administration and operations expertise. All Fabric components run in Docker containers. With most cloud providers, unless you sign up for their services, organisations still have to do it themselves (e.g., who is permissioned into your network, deployment of smart contracts (chaincode) relevant to your network, etc.). We are building decentralised applications here. Forcing a decentralised consortium of organisations onto one centralised cloud provider is itself contrary to the basic concepts of blockchain.

Re: Cloud Restrictions Clarification by Glenn Engstrand

I got a 404 on that chaindigit address. Good luck with your endeavors, Baiju. I mean no offense, sir, but I am beginning to suspect that you don't understand your target audience at Optum Government Solutions.
