One to Many: The Story of Sharding at Box

41:00

Summary

Tamar Bercovici presents Box’s transition from a single MySQL database to a fully sharded MySQL architecture, all the while serving 2 billion queries per day.

Bio

Tamar Bercovici is a Staff Software Engineer at Box, where she leads the Data Access Team in scaling Box’s database architecture and ORM layer. Prior to Box, Tamar was an early-stage employee at XMPie (now a Xerox company), where she drove the development of the award-winning uImage product. Tamar holds a Ph.D. in Computer Science from the Technion – Israel Institute of Technology.

About the conference

Cloud Tech is the largest gathering of cloud technologists and engineers in the Bay Area. Our speakers include the top cloud computing entrepreneurs and experts.

Recorded at:

Nov 17, 2013

Community comments

  • Funny how websites built with PHP end up re-inventing sharding

    by peter lin,

    It's 2013 and people are still rolling their own sharding.

  • re-inventing sharding

    by Robert Sullivan,

    She didn't dwell on why Box's requirements ruled out existing implementations, but the 2 billion-plus queries per day probably have something to do with it. When you are talking about Facebook, Twitter, Amazon, etc., scaling is everything. Sure, many might ask whether Facebook's Thrift looks exactly like CORBA, or why reinvent a PHP compiler, or whether it wouldn't have been easier to run PHP on the JVM than to build their own VM, but they'd also say we should be coding in assembly language on the mainframe. As one example, here's what Facebook has to say about CORBA:

    CORBA: Relatively comprehensive, debatably overdesigned and heavyweight. Comparably cumbersome software installation.

    When you've got a few smart folks on your staff, or even a few PhDs, you've probably got the talent available to do things like this that give you a competitive advantage. And why not?

  • Re: re-inventing sharding

    by peter lin,

    Proper sharding, which used to be called partitioned databases, isn't new. There are literally dozens of papers on how to properly shard, manage, and scale partitioned databases. From DB2 mainframe's database partitions to federated partitions, there's just far too much prior art to ignore. Partitions will eventually get unbalanced, especially if the partitioning scheme is something like username. Random partitioning tends to require less management, but at some point the partitions need to be rebalanced when nodes are added or removed.

    Having a few smart PhDs is NOT enough to build a robust, scalable, and easy-to-manage partitioned database. There are many lessons that only come from first-hand experience using and building partitioned databases.

    My point isn't "don't re-invent". My point is to look at existing prior art and learn what has been done, to avoid making mistakes others have already made. Looking at how many PHP shops have re-invented sharding poorly gives me the impression that PHP developers don't like to spend time reading prior art or making sure they avoid known issues with naive implementations.

  • Re: re-inventing sharding

    by Den Samo,

    Hi Peter,
    What papers on sharding would you recommend?

    Thank you!
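The rebalancing concern raised in the thread (hashed or "random" partitioning spreads keys evenly, but adding or removing a node forces data to move) can be illustrated with a small sketch. This is not Box's implementation or anything described in the talk; the function names, the `user<N>` keys, and the choice of MD5 with a simple modulo are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with a stable hash and a modulo (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribution(keys, num_shards):
    """Count how many keys land on each shard."""
    return Counter(shard_for(k, num_shards) for k in keys)


def moved_fraction(keys, old_n, new_n):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)


# Synthetic keys, purely illustrative.
keys = [f"user{i}" for i in range(10000)]

# Hashing spreads the keys roughly evenly across 4 shards.
print(distribution(keys, 4))

# Going from 4 to 5 shards with plain modulo relocates most keys.
print(moved_fraction(keys, 4, 5))
```

With plain modulo hashing, a key stays put only when its hash gives the same remainder under both shard counts, so growing from 4 to 5 shards should relocate roughly four keys out of five. That is the kind of cost that schemes such as consistent hashing or a directory of logical shards are designed to reduce.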
