How Alibaba Catered To $3 Billion Sales In A Day

by Roopesh Shenoy on Dec 26, 2012

Chinese e-commerce giant Alibaba recently sold $3 billion worth of products in a single 24-hour period. InfoQ had a chance to put a few questions to Zhuang Zhuoran and Youtan, architects from Tmall and Taobao, about the challenges of handling such loads and how they meet them.

Tmall, China’s leading B2C e-commerce site, and Taobao, the largest C2C online shopping platform in China, are both subsidiaries of Alibaba Group, with a total of more than 500 million registered users. This year was the fourth consecutive year of the Taobao “Double Sticks Promotion”, which recorded a Gross Merchandise Volume of RMB 19.1 billion (roughly $3 billion) from a total of 147 million user visits.

On the challenges of making e-commerce work at "China Scale":

On 11 November 2012 (the Double Sticks promotion day), Tmall and Taobao saw 147 million user visits, purchases by 30 million people and nearly 100 million paid orders. At 0:00, more than 10 million users were online concurrently. The technical team faced several major challenges: how to satisfy the various functional needs of Double Sticks, how to make a complete and accurate assessment of the system during preparation, how to implement the various optimization and disaster recovery plans effectively, how to make the right decisions in an emergency, and how to ensure the stability, performance and user experience of the network under the impact of massive traffic.

The Tmall transaction system hit its processing peak in the first hour, when it successfully processed 13,000 order requests per second. The system peak was 40,000 QPS (queries per second) with an average response time of 200ms. The Tmall Product Details Page received up to 1.6 billion system visits, with peak throughput reaching 69,000 visits/sec and the response time holding at 12ms at peak. Tmall's page views rose to 590 million, with peak throughput reaching 14,000 visits/sec.

Zhuang explains that at the application level, the Tmall and Taobao applications are all built on a self-developed service-oriented architecture along with an MVC framework and Spring. This is supported by a distributed file system, distributed caching, messaging middleware and CDN network bandwidth. The core database is accessed through self-developed data-access middleware, and the horizontal splitting and data transportation of the underlying database are completely transparent to the applications.
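
To make "transparent horizontal splitting" concrete, here is a minimal sketch of how a data-access layer might route rows to shards by key while the application only sees a plain repository. It is not Taobao's actual middleware: the class names, the modulo routing rule and the in-memory "databases" are all assumptions made for illustration.

```python
import zlib


class ShardRouter:
    """Maps a sharding key to one of N physical databases (hypothetical names)."""

    def __init__(self, shard_names):
        if not shard_names:
            raise ValueError("at least one shard is required")
        self.shard_names = list(shard_names)

    def shard_for(self, key):
        # crc32 is stable across processes, unlike Python's built-in hash() for str.
        digest = zlib.crc32(str(key).encode("utf-8"))
        return self.shard_names[digest % len(self.shard_names)]


class OrderRepository:
    """What the application sees: no shard names appear in its API."""

    def __init__(self, router):
        self.router = router
        # In-memory dicts stand in for real database connections (assumption).
        self.stores = {name: {} for name in router.shard_names}

    def save(self, user_id, order_id, order):
        shard = self.router.shard_for(user_id)
        self.stores[shard][(user_id, order_id)] = order

    def find(self, user_id, order_id):
        shard = self.router.shard_for(user_id)
        return self.stores[shard].get((user_id, order_id))
```

The application calls `save` and `find` without knowing how many shards exist. With simple modulo routing like this, changing the shard count moves keys between shards, which is one reason a real middleware also has to handle moving data.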

Based on this scale-out architecture, the Tmall and Taobao systems can flexibly add machines to cope with the traffic pressure caused by promotion activities.

We spend a lot of time calculating capacity. We analyze in depth the dependencies between all applications of the website, the proportion of traffic distribution and the call links within applications, and we measure the QPS of individual machines through online pressure tests in the early stages, so as to make an objective judgment about the cluster's processing capacity. This process is really challenging to run, because the Tmall and Taobao systems are essentially not loosely coupled, and a pressure test of a single system cannot reveal the system bottleneck effectively. Meanwhile, we cannot completely copy the online environment and configuration to build a full environment for pressure testing. Instead, we have to rely more on online pressure tests to truly reflect the system's shortcomings.

Finally, we estimate the expected business target based on the site's natural growth trend and historical Double Sticks data, and then calculate the expansion goal of each system from that estimated target.
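
The arithmetic behind such an expansion goal can be sketched roughly as follows. The source does not describe the team's actual formula, so the function names, the `headroom` safety margin and any numbers plugged in are hypothetical.

```python
import math


def machines_needed(target_peak_qps, single_machine_qps, headroom=0.25):
    """Machines required so peak load uses at most (1 - headroom) of capacity.

    single_machine_qps is the per-machine figure measured by a pressure test;
    headroom is an assumed safety margin, not a number from the article.
    """
    if single_machine_qps <= 0:
        raise ValueError("single_machine_qps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_qps = single_machine_qps * (1 - headroom)
    return math.ceil(target_peak_qps / usable_qps)


def expansion_goal(target_peak_qps, single_machine_qps, current_machines,
                   headroom=0.25):
    """Extra machines to add on top of the current cluster (never negative)."""
    needed = machines_needed(target_peak_qps, single_machine_qps, headroom)
    return max(0, needed - current_machines)
```

For example, with an assumed target of 30,000 QPS, 400 QPS per machine and 25% headroom, `machines_needed` returns 100, so a cluster that already has 60 machines would need 40 more.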

Relying on horizontal expansion alone reduces machine utilization after the sales peak and greatly increases the dependence on the operations and maintenance staff's ability to reallocate machines flexibly. Therefore, this year we tried an elastic computing framework for some applications, such as cloud.tmall.com, in which different applications of different merchants share the system resources of one cluster. On November 11, 2012, its bandwidth, VM and storage resources were upgraded flexibly. Many of our internal applications also adopt this mechanism, which marks a technical breakthrough in our preparation for this year's Double Sticks promotion.

In preparing for the Double Sticks promotion, the Taobao and Tmall teams carried out targeted optimizations of the system, including optimizing SQL and the cache hit rate, adjusting database connection and application server parameters, configuring JVM parameters, and reviewing and inspecting code. They also use a large number of solid state drives (SSDs) to improve the overall performance of database storage.

The teams also have a business-downgrading and traffic restriction plan for shutting down non-core operations if the load increases beyond what is expected.

Business downgrading means cutting non-core business functions to ensure the stable operation of the core functions. To downgrade gracefully, we have to split functions into relatively separate code units and isolate them by priority. We can then switch off some non-core business functions from the back office, which reduces system dependencies and performance loss and raises the overall throughput of the cluster.
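
A minimal sketch of priority-based downgrading, assuming a simple in-process switchboard: the class, the feature names and the convention that priority 0 means "core" are hypothetical, and the real system is controlled from a back-office console rather than from code like this.

```python
class DowngradeSwitchboard:
    """Registry of optional features; a higher priority number means less important."""

    def __init__(self):
        self._priority = {}
        self._disabled = set()

    def register(self, feature, priority):
        self._priority[feature] = priority

    def set_downgrade_level(self, level):
        """Disable features with priority >= level; level=None restores everything."""
        if level is None:
            self._disabled = set()
        else:
            self._disabled = {f for f, p in self._priority.items() if p >= level}

    def is_enabled(self, feature):
        # Unregistered features are treated as disabled (a conservative default).
        return feature in self._priority and feature not in self._disabled


def render_product_page(switches, product):
    # Core content is always rendered and never depends on a switch.
    page = {"title": product["title"], "price": product["price"]}
    # Non-core sections are separate units guarded by their own switch.
    if switches.is_enabled("reviews"):
        page["reviews"] = ["(call to a review service)"]
    if switches.is_enabled("recommendations"):
        page["recommendations"] = ["(call to a recommendation service)"]
    return page
```

Because each non-core section sits behind its own switch, raising the downgrade level removes the corresponding back-end calls without touching the core rendering path.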

If downgrading is not sufficient, we need to restrict the traffic flow. First, we control the application flow by queuing requests to the web applications at the front end: a custom module of the web server enforces a QPS limit set according to the maximum pressure that the protected web server can withstand, and users beyond that limit are sent to a waiting page. Second, to prevent a traffic surge in a front-end web application from causing an avalanche effect in the back-end services, we restrict the traffic of low-priority business in the back-end services. This ensures that the back-end services are not overwhelmed by pressure from different business sources and guarantees access to the core business.
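
The front-end QPS restriction described above can be illustrated with a small fixed-window limiter. This is only a Python sketch of the idea, not the custom web server module itself; the class name, the one-second window and the `"WAITING_PAGE"` marker are assumptions, and the code is not thread-safe.

```python
import time


class QpsLimiter:
    """Fixed-window counter: admit at most max_qps requests per one-second window."""

    def __init__(self, max_qps, clock=time.monotonic):
        self.max_qps = max_qps
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.window = None
        self.count = 0

    def try_acquire(self):
        window = int(self.clock())
        if window != self.window:
            # A new one-second window has started: reset the counter.
            self.window = window
            self.count = 0
        if self.count < self.max_qps:
            self.count += 1
            return True
        return False


def handle_request(limiter, serve):
    """Serve the request if under the limit, otherwise send the user to a waiting page."""
    if limiter.try_acquire():
        return serve()
    return "WAITING_PAGE"
```

A fixed window is the simplest scheme to reason about; a token bucket or sliding window would smooth out bursts at window boundaries at the cost of a little more state.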

Tmall and Taobao prepared a total of more than 400 system downgrading plans for the 2012 Double Sticks Promotion.

To ensure that all downgrading and flow restriction plans could be carried out accurately, we conducted several drills during preparation. We hope never to use these emergency plans, but we must make sure that each plan is accurate and easy to execute.

On the emergency decision-making process:

On November 11, more than 400 engineers worked together to ensure the smooth functioning of the whole event. To keep the decision-making process short, we first established a field intelligence sorting centre responsible for collecting and consolidating customer feedback and eliminating duplicate and invalid feedback from different information sources, including customer service, operations, safety, product and merchants. This ensured that the technical team was not overloaded with information.

Secondly, although we have a field headquarters, responsibility for decisions in case of emergencies lies with the front-line development engineers. The roles and responsibilities of all engineers working together are clearly defined. Each application is assigned one or two core owners, who make emergency decisions based on changes in the system indicators shown in the monitoring center, so as to ensure a timely response. An emergency decision is escalated to headquarters only when it involves a large business impact or serious damage to the user experience.

Taobao and Tmall also have an effective open source strategy in place, with a lot of code being open sourced at code.taobao.org. Several frameworks, such as the remote communication framework HSF, the messaging middleware Notify and the data access middleware TDDL, have been open sourced.

Alibaba Open Source Project by 温 少

Hosted on github.com under four organizations:
github.com/alibabatech
github.com/alibaba
github.com/taobao
github.com/alipay

Four projects are very popular in China:
1) github.com/taobao/tengine (a distribution of Nginx with some advanced features)
2) github.com/AlibabaTech/druid (JDBC connection pool)
3) github.com/AlibabaTech/fastjson (fast JSON processor)
4) github.com/alibaba/dubbo (Dubbo is a distributed service framework that empowers applications with service import/export capability and high-performance RPC)

Re: Alibaba Open Source Project by Roopesh Shenoy

Thanks for those links, I wasn't aware Alibaba uses GitHub too! My fault, I didn't really search there!

The total number of unique visitors on that day by 章 文嵩

The total number of unique visitors on that day should be 213 million, which includes visitors to taobao.com and tmall.com.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.