
Big Data in Real Time at Twitter



Nick Kallen discusses how Twitter handles large amounts of data in real time by identifying four data types and their query patterns (tweets, timelines, social graphs, and search indices) and describing the databases that store them.


Nick Kallen is a Systems Engineer at Twitter. He is the author of Arel, NamedScope, Cache Money, and Screw.Unit, and co-creator of FlockDB, Twitter's distributed graph database.

About the conference

QCon is a conference that is organized by the community, for the community. The result is a high-quality conference experience where a tremendous amount of attention and investment has gone into having the best content on the most important topics presented by the leaders in our community. QCon is designed with the technical depth and enterprise focus of interest to technical team leads, architects, and project managers.


Recorded at:

Aug 03, 2011


Community comments

  • I have a question

    by james xu,


  • Wrong query?

    by José González D'Amico,


    Around minute 22, Mr. Kallen shows the query for the timeline executed in the original implementation.
    In your timeline you see your friends' (AKA followees) tweets, not your followers'. And if I remember correctly, in the Twitter API source_id is a field of Tweets, not of Friends.
    Perhaps the query should be this?

    SELECT * FROM tweets
    WHERE source_id IN
      (SELECT user_id
       FROM friends
       WHERE destination_id = ?)
    ORDER BY created_at DESC
    LIMIT 20
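
The distinction the comment draws (a timeline shows tweets from the accounts you follow, not from your followers) can be checked on a toy dataset. The following is a minimal sketch using Python's built-in sqlite3 module. The schema is an assumption made for illustration and is not Twitter's: the hypothetical `follows` table uses explicit `follower_id` and `followee_id` columns so the direction of the relationship is unambiguous, and `created_at` is a plain integer.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tweets (
            id INTEGER PRIMARY KEY,
            source_id INTEGER,   -- author of the tweet
            text TEXT,
            created_at INTEGER   -- simplified timestamp
        );
        CREATE TABLE follows (
            follower_id INTEGER, -- the user who follows
            followee_id INTEGER  -- the user being followed
        );
        """
    )
    # User 1 follows users 2 and 3; user 4 follows user 1.
    conn.executemany("INSERT INTO follows VALUES (?, ?)", [(1, 2), (1, 3), (4, 1)])
    conn.executemany(
        "INSERT INTO tweets (source_id, text, created_at) VALUES (?, ?, ?)",
        [
            (2, "from 2, older", 100),
            (3, "from 3", 200),
            (4, "from 4", 300),
            (2, "from 2, newer", 400),
        ],
    )
    return conn


def home_timeline(conn, user_id, limit=20):
    """Return the newest tweets written by the accounts that user_id follows."""
    rows = conn.execute(
        """
        SELECT text FROM tweets
        WHERE source_id IN
          (SELECT followee_id FROM follows WHERE follower_id = ?)
        ORDER BY created_at DESC
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(home_timeline(demo, 1))
```

In this toy data, user 4 follows user 1, yet the tweet written by user 4 does not appear in user 1's timeline, because the subquery selects followees rather than followers.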
