AI, ML & Data Engineering Content on InfoQ
-
SQLite 3.9 Supports JSON, Indexes on Expressions and More
Recently released SQLite 3.9 provides a number of new features and enhancements, including support for JSON encoding/decoding, full text search version 5, indexes on expressions, eponymous virtual tables and more.
-
Amazon Announces QuickSight - Business Intelligence for Big Data on AWS
Amazon has announced QuickSight at the AWS re:Invent conference. QuickSight is a complete Business Intelligence solution that helps customers gain insights from the data they have stored in AWS.
-
Salesforce Enters IoT Market
At Salesforce’s recent Dreamforce conference, the company announced an upcoming IoT platform that will ingest real-time data and turn it into actionable tasks across its suite of cloud-based services.
-
Hortonworks Addresses the IoAT with DataFlow Based on NiFi
Hortonworks has quietly made available the DataFlow platform, which is based on Apache NiFi and attempts to meet the processing needs of the IoAT.
-
SpringXD being Re-architected and Re-branded to Spring Cloud Data Flow
Pivotal announced a complete re-design of Spring XD, its big data offering, during last week’s SpringOne2GX conference, with a corresponding re-brand from Spring XD to Spring Cloud Data Flow. The new product is focussed on orchestration.
-
Splunk ITSI: Adaptive Thresholds and Anomaly Detection
In theory, the operations team determines what the thresholds for warnings and alerts should be. In practice, the operations team often has no idea what these values should be. Using machine learning techniques such as adaptive thresholds, Splunk ITSI solves this problem.
-
Splunk for DBAs
The DBA’s primary job is to ensure that the business’s information is always available, with performance coming in a close second. We’ve already talked about optimizing distributed queries in Splunk and map-reduce queries in Hunk. In this report we expand upon that with more information that a DBA needs to know about Splunk databases.
-
Optimizing Distributed Queries in Splunk
Optimizing queries in Splunk’s Search Processing Language is similar to optimizing queries in SQL. The two core tenets are the same: change the physics and reduce the amount of work done. Added to that are two precepts that apply to any distributed query.
-
Big Data Architecture: Push, Pull, or Search in Place?
A surprisingly common theme at the Splunk Conference is the architectural question, “Should I push, pull, or search in place?”
-
Architecture, Tuning, and Troubleshooting a Splunk Indexer Cluster
If you could handle all of the data you need to work with on one machine, then there is no reason to use big data techniques. So clustering is pretty much assumed for any installation larger than a basic proof of concept. In Splunk Enterprise, the most common type of cluster you’ll be dealing with is the Indexer Cluster.
-
Hunk/Hadoop: Performance Best Practices
When working with Hadoop, with or without Hunk, there are a number of ways you can accidentally kill performance. While some of the fixes require more hardware, sometimes the problems can be solved simply by changing the way you name your files.
-
Introducing Splunk IT Service Intelligence
Splunk is jumping into the service-monitoring sector with a new visualization called IT Service Intelligence.
-
Using Hunk+Hadoop as a Backend for Splunk
Splunk can now store archived indexes on Hadoop. At the cost of performance, this offers a 75% reduction in storage costs without losing the ability to search the data. And with the new adapters, Hadoop tools such as Hive and Pig can process the Splunk-formatted data.
-
Splunk .conf 2015 Keynote
Splunk opened their big data conference with an emphasis on “making machine data accessible, usable, and valuable to everyone”. This is a shift from their original focus: indexing arbitrary big data sources. Reasonably happy with their ability to process data, they want to ensure that developers, IT staff, and normal people have a way to actually use all of the data their company is collecting.
-
Luca Olivari on Multi-Model NoSQL Database OrientDB 2.1 New Features
Multi-model NoSQL database OrientDB supports storing and managing document and graph data sets. Orient Technologies, the company behind OrientDB, announced last month the general availability of version 2.1 of the database.