Database Content on InfoQ
-
Orchestrating Chaos: Applying Database Research in the Wild
Peter Alvaro describes the theoretical roots of LDFI (Lineage-driven Fault Injection) in database research, presenting early results from the field and opportunities for near- and long-term future research.
-
Managing Thousands of Data Services @Heroku
Gabriel Enslein discusses the evolution of fleet orchestration, immutable infrastructure, and security auditing for managing data services for many Salesforce customers.
-
Scaling with Apache Spark
Holden Karau looks at Apache Spark from a performance/scaling point of view and what’s needed to handle large datasets.
-
Managing Data in Microservices
Randy Shoup shares patterns for managing data in microservices from Google, eBay, and Stitch Fix, discussing the need to access data only through a microservice's interface and to communicate through events.
-
Serverless Design Patterns with AWS Lambda: Big Data with Little Effort
Tim Wagner discusses Big Data on serverless, showing working examples and how to set up a CI/CD pipeline, demonstrating AWS Lambda with the Serverless Application Model (SAM).
-
Power of the Log: LSM & Append Only Data Structures
Ben Stopford talks about the beauty of sequential access and append only data structures in the context of “Log Structured Merge Trees”.
-
Applied Distributed Research in Apache Cassandra
Jonathan Ellis explains the challenges and successes Cassandra has had in creating transactions, materialized views, and a strongly consistent cluster membership within this peer-to-peer paradigm.
-
Scio: Moving Big Data to Google Cloud, a Spotify Story
Neville Li tells Spotify’s story of migrating their big data infrastructure to Google Cloud, replacing Hive and Scalding with BigQuery and Scio, which helped them iterate faster.
-
In-Memory Caching: Curb Tail Latency with Pelikan
Yao Yue introduces Pelikan, a framework for implementing distributed caches such as Memcached and Redis. She discusses the system aspects that are important to the performance of such services.
-
Data Preparation for Data Science: A Field Guide
Casey Stella presents a utility written with Apache Spark to automate data preparation by discovering missing values, values with skewed distributions, and likely errors within data.
-
Building Reliability in an Unreliable World
Greg Murphy describes how GameSparks has designed their platform to be tolerant of many things: unreliable and slow internet connectivity, cloud resources that can fail without warning, and more.
-
AI from an Investment Perspective
The panelists discuss AI from an investment perspective, the challenges, the risks, trends, the role of Deep Learning, successful AI use cases, and more.