With Facebook recently releasing Presto as open source, the already crowded SQL-in-Hadoop market just became a tad more intricate. A number of open source tools are competing for the attention of developers: Hortonworks' Stinger initiative around Hive, Apache Drill, Apache Tajo, Cloudera’s Impala, Salesforce’s Phoenix (for HBase) and now Facebook’s Presto.
Opserver is an open source monitoring solution released by Stack Exchange, of Stack Overflow fame. Opserver provides a quick overall view of each monitored system's health, while allowing the user to drill down into the details. InfoQ talked with Nick Craver, one of Opserver’s creators, for additional insight.
The new version of Cascading, released this week, incorporates Hadoop 2 support and includes Cascading Lingual, an open source project that provides a comprehensive ANSI SQL interface for accessing Hadoop-based data.
Facebook has open-sourced Presto, its distributed SQL query engine. Presto uses a pipelined architecture rather than the Map/Reduce design found elsewhere. Presto has been in production since early this year, and Facebook has since “deployed in multiple geographical regions and [they] have successfully scaled a single cluster to 1,000 nodes”.
New database developments indicate a return to SQL, but not by running the traditional relational stores on bigger and better hardware, not even on sharded architectures, but through NewSQL solutions.
NuoDB has announced version 2.0 of their NewSQL database, now a globally distributed database that can run in the cloud or on premises with real-time replication.
One of the biggest challenges when researching a new technology is determining where to start. A typical SQL Server installation could easily have hundreds of tables. Examining each one by hand to determine which would benefit from conversion to memory optimized tables is a daunting task. This is where the AMR Tool comes into play.
In this report we look at the internals of SQL Server’s In-Memory OLTP to see how it uses timestamp-like transaction IDs in lieu of locks.
SQL Server 2014 will offer Clustered Columnstore Indexes. These will offer the performance and compression benefits of column-oriented storage without the need to restrict the underlying table to read-only access.
Originally this report was titled “Natively Compiled Queries”, but that doesn’t do justice to how deep this runs. When a memory optimized table is created, SQL Server will create a DLL specifically for that table. All data access for the table, including indexes, occurs through this DLL.
SQL Server 2014’s Memory Optimized Tables handle indexes very differently than traditional tables. First and foremost, you must have at least one index and cannot have more than eight indexes. Only the primary key can be marked as unique, and don’t even think about foreign keys or filtered indexes.
In SQL Server 2014, Microsoft will be unveiling its lock-free technology known as Memory Optimized Tables. Using a new storage and query subsystem, these tables represent a radical departure from traditional database design.
Microsoft has released a release candidate of Entity Framework 6 with support for interception, SQL logging, and testability improvements. It also includes substantial changes to the API and IntelliSense documentation.
Being clever about system architecture in advance is hard. Scaling successfully is more about being clever with metrics and introspection, creating efficient build and provisioning processes and being comfortable with radical change. These are some of the keys to scaling at Dropbox according to Rajiv Eranki in his recent presentation at the 2013 RAMP Conference.
Google is making MySQL available in the cloud as a fully managed service, including a JSON API for programmatic management.