Counting is Hard: Probabilistic Algorithms for View Counting at Reddit
Summary
Krishnan Chandra explains the challenges of building a view counting system at scale, and how Reddit used probabilistic counting algorithms to make scaling easier.
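The summary does not say which probabilistic algorithm the talk covers. As an illustration only, here is a minimal sketch of HyperLogLog, one widely used estimator for counting distinct items (such as unique viewers) in fixed memory. The class name, the `p` parameter, the SHA-1 hash, and the `user-N` identifiers are assumptions made for this example, not details from the presentation.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch; not the implementation from the talk."""

    def __init__(self, p: int = 12) -> None:
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m
        # Bias-correction constant; this formula is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Take a 64-bit hash of the item.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # The top p bits pick a register.
        idx = h >> (64 - self.p)
        # The remaining bits determine the rank: the position of the
        # leftmost 1-bit, counting from 1.
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        # Each register keeps the largest rank it has seen.
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> float:
        # Raw estimate from the harmonic mean of 2^register.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting while
        # some registers are still empty.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


hll = HyperLogLog()
for i in range(10000):
    hll.add(f"user-{i}")   # first view by each user
    hll.add(f"user-{i}")   # repeat views do not change the registers
estimate = hll.count()     # approximate number of distinct users
```

With 4096 registers the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%, and memory use stays constant no matter how many views arrive. Repeat views from the same user leave the estimate unchanged, which is the property a unique-view counter needs.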
Bio
Krishnan Chandra is a data engineer at Reddit, and has been working in data engineering for 4 years. Before joining Reddit, Krishnan worked on backend engineering at Optimizely and LinkedIn. He holds bachelor's degrees in computer science and math from the University of Illinois at Urbana-Champaign.
About the conference
QCon.ai is an AI and Machine Learning conference held in San Francisco for developers, architects & technical managers focused on applied AI/ML.
Community comments
Flow question
by Deneb Garza,
Looking at the diagram, I'm curious why you chose this flow:
client --> app server --> kafka --> counter --> spam filter --> counter --> redis/c*
versus this one:
client --> app server --> kafka --> spam filter --> counter --> redis/c*
That is, what was the rationale for not placing the spam filter upstream of the counter? It seems like you could save a step by doing that.