DevOps

The Evolution of Uber’s 100+ Petabyte Big Data Platform

by Hrishikesh Barua on Nov 10, 2018

Uber’s engineering team wrote about how their big data platform evolved from traditional ETL jobs with relational databases to one based on Hadoop and Spark. A scalable ingestion model, a standard transfer format and a custom library for incremental updates are the key components of the platform.

AI, ML & Data Engineering

Data Lakes and Modern Data Architecture in Clinical Research and Healthcare

by Srini Penchikala on Nov 08, 2018

Dr. Prakriteswar Santikary, chief data officer at ERT, spoke at the Data Architecture Summit 2018 Conference last month about the data lake architecture his team developed at their clinical research organization. He discussed the data platform deployed in the cloud to streamline data collection, aggregation, clinical reporting and analytics, using concepts like serverless computing and data services.

AI, ML & Data Engineering

Event Sourcing to the Cloud at HomeAway

by Srini Penchikala on Nov 05, 2018

Adam Haines, Data Architect at HomeAway, recently spoke at the Data Architecture Summit 2018 Conference about how his team uses the event sourcing cloud design pattern to accelerate big data initiatives in their organization.

Development

GitHub Incident Analysis Shows How to Improve Service Reliability

by Sergio De Simone on Nov 01, 2018

On October 21, 2018, GitHub users experienced degraded service for 24 hours following an incident caused by routine maintenance work. This led to the display of outdated and inconsistent information and to the unavailability of webhooks and other internal services. GitHub's post-incident report shows where things failed and suggests how to improve site reliability.

Cloud

Cloudera and Hortonworks Merge with Goal to Increase Competition with Cloud Offerings

by Alex Giamas on Oct 31, 2018

Earlier this month, Cloudera and Hortonworks announced an all-stock merger at a combined value of around $5.2 billion. Analysts have argued that the merger is a response to the increased competition both companies face from cloud vendors like Amazon, Google and Microsoft. In this article we collect reactions from analysts and the industry, and look at the implications for current customers.

AI, ML & Data Engineering

Agile Data Modeling for NoSQL Databases

by Srini Penchikala on Oct 30, 2018

Pascal Desmarets recently spoke at the Data Architecture Summit 2018 Conference about agile modeling and best practices for NoSQL databases.

DevOps

Scaling Global Traffic at Dropbox with Edge Locations and GSLB

by Hrishikesh Barua on Oct 27, 2018

The Dropbox engineering team shared their experience of architecting and scaling their global network of edge locations. Located around the globe, these run a custom stack of nginx and IPVS and connect to the Dropbox backend servers over their backbone network. A combination of GeoDNS and BGP Anycast ensures availability and low latency for end users.

Cloud

Redis 5.0 Released with New "Streams" Data Type

by Alex Giamas on Oct 26, 2018

Redis recently announced version 5 of its popular database, 15 months after the release of Redis 4. Probably the most important feature of this version is support for a new data type, Streams. Sorted set functionality has also improved, and the Redis modules API has been expanded with the introduction of the Cluster and Timers APIs. LOLWUT and other improvements are reviewed in the article...

Cloud

Google's Apigee API Platform Enhanced with API Monitoring and "Extensions" to Connect GCP Services

by Steef-Jan Wiggers on Oct 20, 2018

Google Cloud's full-lifecycle API management platform Apigee gives customers control over, and visibility into, the APIs that connect applications and data across their enterprises and clouds. Recently, Google announced the general availability of several new Apigee capabilities, including Apigee API monitoring, Apigee extensions, and Apigee hosted targets.

JavaScript

Tim Berners-Lee Introduces "Solid" Decentralized Identity Platform

by Dylan Schiemann on Oct 16, 2018

Solid is a new decentralized identity platform from WWW creator Tim Berners-Lee. Solid provides a mechanism for users to own their data and better control how it is used.

AI, ML & Data Engineering

William McKnight on Data Platforms and Creating a Modern Data Architecture

by Srini Penchikala on Oct 15, 2018

William McKnight gave a keynote presentation last week at the Data Architecture Summit 2018 Conference on creating a modern data architecture using different data platforms.

DevOps

High Volume Space Exploration Time-Series Data Storage in PostgreSQL

by Hrishikesh Barua on Oct 13, 2018

The European Space Agency Science Data Center (ESDC) switched to PostgreSQL with the TimescaleDB extension for their data storage. ESDC’s diverse data includes structured, unstructured and time-series metrics running to hundreds of terabytes, and it needs to support querying across datasets with open source tools.

Cloud

Azure Content Delivery Network Is Now Generally Available

by Steef-Jan Wiggers on Oct 13, 2018

Microsoft announced the general availability (GA) of the Azure CDN, allowing customers to deliver content from Microsoft’s global CDN. The release follows the public preview last May.

DevOps

Gremlin Releases Application Level Fault Injection (ALFI) Platform for Targeted Chaos Experiments

by Daniel Bryant on Oct 07, 2018

Gremlin Inc has released their second product offering in the “Failure-as-a-Service” domain: Application-Level Fault Injection (ALFI). Building upon their initial platform, which helped engineers create and run chaos experiments at the infrastructure level, ALFI enables failure injection at the application level via a native language library.

Cloud

Netflix Keystone Real-Time Stream Processing Platform

by Alex Giamas on Sep 30, 2018

Netflix recently published a post on their tech blog discussing the design considerations and insights of Keystone, their real-time stream processing platform. Keystone has been operational since December 2015 and has grown significantly as Netflix subscribers grew from 65 million to over 130 million in the past 3 years. This article follows on the latest state of Keystone platform...