Basho Technologies has released Riak TS, a distributed NoSQL database for storing and analyzing time series data. The Basho team recently announced at the AWS re:Invent event the availability of Riak TS, which is optimized for reads and writes of time series data.
LinkedIn has open-sourced PalDB, an embeddable read-only key-value store that is 8 times faster than LevelDB and takes several times less memory than a hashset.
The latest version of the document NoSQL database Couchbase supports multi-dimensional scaling, geospatial indexes and a new query language called N1QL. The Couchbase team announced earlier this month the general availability of Couchbase Server 4.0, which also supports new filtering capabilities for Cross Datacenter Replication (XDCR) and enhanced security.
Twitter uses replicated logs for high-performance data collection and analysis of its systems; DistributedLog is the system it developed for this purpose. Twitter has also developed a distributed key-value database, Manhattan, which can trade consistency for lower read latency by following an eventually consistent data model. We examine Twitter's design and tradeoffs for DistributedLog.
Recently released SQLite 3.9 provides a number of new features and enhancements, including support for JSON encoding/decoding, full text search version 5, indexes on expressions, eponymous virtual tables and more.
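As a rough illustration of two of these features, the following Python sketch uses the standard sqlite3 module to create an index on an expression and to probe for the JSON functions. The table, column and index names are hypothetical. It assumes the SQLite library linked into Python is version 3.9 or later, and it treats JSON support as optional, because the JSON functions are only present when the build includes them.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with an index on an expression."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("Alice@Example.com",), ("bob@example.com",)],
    )
    # Index on an expression (SQLite 3.9+): the index stores lower(email),
    # so case-insensitive lookups can avoid a full table scan.
    conn.execute("CREATE INDEX idx_users_email_lower ON users (lower(email))")
    return conn


def find_user_ids(conn, email):
    """Look up users by email, ignoring case."""
    rows = conn.execute(
        "SELECT id FROM users WHERE lower(email) = ?", (email.lower(),)
    )
    return [row[0] for row in rows]


def query_uses_index(conn):
    """Check the query plan to see whether the expression index is chosen."""
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM users WHERE lower(email) = ?",
        ("alice@example.com",),
    ).fetchall()
    # The human-readable plan detail is the last column of each row.
    return any("idx_users_email_lower" in str(row[-1]) for row in plan)


def json_supported(conn):
    """Return True if this SQLite build exposes the JSON functions."""
    try:
        conn.execute("SELECT json_extract('{\"a\": 1}', '$.a')").fetchone()
        return True
    except sqlite3.OperationalError:
        return False


conn = build_demo_db()
print(find_user_ids(conn, "ALICE@example.com"))
print("expression index used:", query_uses_index(conn))
print("JSON functions available:", json_supported(conn))
```

The WHERE clause has to repeat the indexed expression, here lower(email), for the planner to consider the index.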
At its re:Invent conference, Amazon Web Services unveiled two new instance types for its EC2 service: X1, sporting 2TB of memory, and T2.Nano, aimed at the lower end of compute requirements.
Amazon has announced QuickSight at its AWS re:Invent conference. QuickSight is a complete Business Intelligence solution to help customers gain insights from the data they have stored in AWS.
At Salesforce’s recent Dreamforce conference, the company announced an upcoming IoT platform that will ingest real-time data and turn it into actionable tasks across its suite of cloud-based services.
Stephen Colebourne and OpenGamma released v1.1 of ElSql, a library and DSL for managing SQL in external files. Colebourne is well known for his work as the spec lead of Java Time, a cornerstone of last year's Java 8 release, and for his creation of the Joda Time and Joda Money APIs.
Hortonworks has quietly made available the DataFlow platform, which is based on Apache NiFi and attempts to solve the processing needs of the Internet of Anything (IoAT).
Pivotal announced a complete re-design of Spring XD, its big data offering, during last week’s SpringOne2GX conference, with a corresponding re-brand from Spring XD to Spring Cloud Data Flow. The new product is focussed on orchestration.
The DBA’s primary job is to ensure that the business’s information is always available, with performance coming in a close second. We’ve already talked about optimizing distributed queries in Splunk and map-reduce queries in Hunk. In this report we expand upon that with more information that a DBA needs to know about Splunk databases.
Optimizing queries in Splunk’s Search Processing Language is similar to optimizing queries in SQL. The two core tenets are the same: change the physics and reduce the amount of work done. Added to that are two precepts that apply to any distributed query.
A surprisingly common theme at the Splunk Conference is the architectural question, “Should I push, pull, or search in place?”
If you could handle all of the data you need to work with on one machine, then there is no reason to use big data techniques. So clustering is pretty much assumed for any installation larger than a basic proof of concept. In Splunk Enterprise, the most common type of cluster you’ll be dealing with is the Indexer Cluster.