AI, ML & Data Engineering Content on InfoQ
-
MySQL to NoSQL: Data Modeling Challenges in Supporting Scalability
Kenneth M. Anderson shares some of the data modeling issues encountered while transitioning from a relational database to NoSQL.
-
Apache Cassandra Anti Patterns
Matthew Dennis covers the most common mistakes he has seen made with Cassandra, in both deployment and code.
-
Transactions: Over Used or Just Misunderstood?
Mark Little advises on when to avoid transactions and how to use them with Web Services, NoSQL, REST, and mobile infrastructures.
-
Deconstructing the Database
Rich Hickey deconstructs the monolithic database into separate services (transactions, storage, and query), combining them with a data model based on atomic facts to provide new capabilities and tradeoffs.
-
How to Build Big Data Pipelines for Hadoop Using OSS
Costin Leau discusses Big Data, the tools currently available for dealing with it, and how Spring can be used to create Big Data pipelines.
-
Reverend Bayes, Meet Countess Lovelace: Machine Learning and Programming
Andy Gordon discusses machine learning using functional programming, explaining how Infer.NET Fun turns the succinct syntax of F# into an executable modeling language for Bayesian machine learning.
-
F# Big Data Scripting
Matthew Moloney shares some of the F# tools built at Microsoft Research for dealing with Big Data.
-
The Evolving Panorama of Data
Rebecca Parsons proposes taking a different look at data, using different approaches and tools, then looks at some of the ways social data is used these days.
-
Scaling Scalability: Evolving Twitter Analytics
Dmitriy Ryaboy shares some of the lessons learned scaling Twitter’s analytics infrastructure: Data loves a schema, Make data sources discoverable, and Make costs visible.
-
Lean Data Architecture: Minimize Investment, Maximize Value
Manvir Singh Grewal and Brandon Byars propose a business intelligence workflow along with Lean principles and practices for implementing a data warehouse and reporting capability.
-
Storm: Distributed and Fault-Tolerant Real-time Computation
Nathan Marz introduces Twitter Storm, outlining its architecture and use cases, and looks at features planned for future releases.
-
Postgres Demystified
Craig Kerstiens presents the history of Postgres, the basics of developing with Postgres, notes on its performance, and tips on querying it.