Twitter uses replicated logs for high-performance data collection and analysis of its systems, and it developed DistributedLog for this purpose. Twitter has also developed Manhattan, a distributed key-value database that can trade consistency for lower read latency by following an eventually consistent data model. We examine Twitter's design and tradeoffs for DistributedLog.
Twitter has open sourced Diffy, an automated testing tool used in production for discovering potential bugs in new code running on Apache Thrift or HTTP-based services.
Twitter has replaced Storm with Heron, which provides up to 14 times more throughput and up to 10 times lower latency on a word count topology, and has helped the company reduce the hardware it needs to a third.
Twitter has officially released Digits Login for Web, the latest iteration of Digits, which extends the SMS-based login system to the websites of mobile apps powered by Digits.
Twitter recently announced that it has open sourced an anomaly detection package in R. Anomaly detection is a major field of study because an anomaly can denote different things. A major spike in followers or favorites around a topic can happen because something major is happening, and this may be something that needs to be broadcast around the network. But the same spike can also happen because of bots and spammers...
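To make the idea of a "spike" concrete, here is a minimal sketch in Python of one generic way to flag unusually high values in a series of counts, using the median and the median absolute deviation (MAD). It is not the method used by Twitter's R package; the function name `find_spikes`, the threshold of 3.5, and the sample data are all assumptions chosen for illustration. Note that a detector like this only says that a spike occurred, not whether it came from real news or from bots.

```python
from statistics import median


def find_spikes(counts, threshold=3.5):
    """Return indexes of values that sit far above the typical level.

    Hypothetical helper: uses a robust z-score based on the median and
    the median absolute deviation (MAD), so a single large spike does
    not distort the baseline the way a mean and standard deviation would.
    """
    med = median(counts)
    mad = median(abs(x - med) for x in counts)
    if mad == 0:
        # At least half the values equal the median: no usable scale.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data.
    return [
        i for i, x in enumerate(counts)
        if 0.6745 * (x - med) / mad > threshold
    ]


# Made-up hourly counts of new followers for one account.
hourly_followers = [10, 12, 11, 13, 12, 95, 11, 12]
print(find_spikes(hourly_followers))
```

Only upward deviations are flagged here, since the passage above is concerned with spikes rather than drops.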
The second day of the Web Summit in Dublin, Ireland, concluded yesterday. We look at what happened and what was new on the latest day of the event.
Twitter’s engineering group, known for various contributions to open source, from streaming MapReduce to the front-end framework Bootstrap, recently announced that it has open sourced an algorithm that can efficiently recommend content. LinkedIn also open sourced a machine learning library of its own, ml-ease. In this article we present the algorithms and what they mean for the open source community.
Twitter Engineering has released details about Manhattan, its real-time, multi-tenant distributed database.
Facebook, Google, LinkedIn, and Twitter have decided to make sure that a relational database is “web-scale”, so they have put their efforts behind WebScaleSQL, a branch of MySQL 5.6 Community Edition.
Twitter has open sourced their MapReduce streaming framework, called Summingbird. Available under the Apache 2 license, Summingbird is a large-scale data processing system enabling developers to uniformly execute code in either batch-mode (Hadoop/MapReduce-based) or stream-mode (Storm-based) or a combination thereof, called hybrid mode.
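The idea of writing the job logic once and running it in batch, stream, or hybrid mode can be sketched in a few lines of plain Python. This is only an illustration of the concept under simple assumptions, not Summingbird's API (Summingbird itself is a Scala library); the function names `word_counts`, `run_batch`, `run_stream`, and `run_hybrid` are hypothetical, and in-memory lists stand in for Hadoop and Storm.

```python
from collections import Counter
from typing import Iterable, Iterator


def word_counts(line: str) -> Counter:
    """The job logic, written once: turn a line into partial word counts."""
    return Counter(line.lower().split())


def run_batch(lines: Iterable[str]) -> Counter:
    """Batch mode: process a complete dataset and return one final result."""
    total = Counter()
    for line in lines:
        total += word_counts(line)
    return total


def run_stream(lines: Iterable[str], state: Counter) -> Iterator[Counter]:
    """Stream mode: update running state per event and emit a snapshot."""
    for line in lines:
        state.update(word_counts(line))
        yield Counter(state)  # copy, so later updates do not alter it


def run_hybrid(historical: Iterable[str], live: Iterable[str]) -> Counter:
    """Hybrid mode: merge the batch view with the latest stream view."""
    batch_view = run_batch(historical)
    stream_view = Counter()
    for stream_view in run_stream(live, Counter()):
        pass  # keep only the most recent snapshot
    return batch_view + stream_view


historical = ["to be or not to be"]
live = ["to stream or to batch"]
print(run_hybrid(historical, live))
```

The merge works because adding counts is associative and commutative: partial results computed separately over old and new data combine into the same answer as a single pass over everything.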
The Ajax Control Toolkit has been updated to support jQuery and includes a new Twitter control that takes advantage of the new Twitter API. It also includes improved documentation describing the usage of ToolkitScriptManager.
For many of us, Twitter has become an essential communications utility. Since experiencing scalability problems in 2010, Twitter has moved to a loosely coupled, service-oriented architecture based on the JVM, giving it new levels of scalability and feature agility. Twitter engineering recently reported a new record throughput and took time out to describe the new architecture.
Twitter has open sourced its Effective Scala guide. The document is on GitHub and is licensed under CC-BY 3.0. Scala is one of the primary programming languages used at Twitter, and most of the Twitter infrastructure is written in Scala. The Effective Scala guide is a series of short essays, a set of "best practices" learned from using Scala inside Twitter.
Twitter and Azul Systems have been elected to serve on the JCP Executive Committee for Java SE/EE, on voting percentages of 32% and 19% respectively. Both firms have also joined the OpenJDK project. VMware is no longer represented.
Twitter has open-sourced Storm, its distributed, fault-tolerant, real-time computation system, at GitHub under the Eclipse Public License 1.0. Storm is the real-time processing system developed by BackType, which is now under the Twitter umbrella.