Chris Mattmann explains the type and magnitude of data produced in scientific projects like the Square Kilometer Array Telescope, the tools to use for scientific data processing and much more.
Dean Wampler explains Scalding and the other Hadoop support libraries, the return of SQL, how (big) data is the killer application for functional programming, Java 8 vs Scala, and much more.
Emil Eifrem looks back at the history of Neo4j, an open-source, NoSQL graph database supported by Neo Technology. He describes some real-world applications of graphs, domain modelling with graphs, and compares the performance of graph and relational databases. He also examines how Neo4j differs from other NoSQL and graph databases in the market and describes various Neo4j licensing options.
Ian Robinson discusses Neo4j's design choices for data storage and retrieval, CRUD operations, transactions, graph traversal and searches, and HA deployment strategies. He also shares his thoughts on hypermedia controls and the concept of consumer-driven contracts for continuous evolution of services.
In this interview, Michael Hunger talks about the evolution of persistence technologies over the last decade, the emergence of NoSQL databases, and looks at where graph databases fit in. He describes the goals behind the Spring Data Neo4j project and its latest developments, and examines Cypher, a humane and declarative query language for graphs.
Rich Hickey and Justin Sheehy talk about the scalability and transactional capabilities of datastores. They explain the tradeoffs involved in achieving read and/or write scalability on top of Datomic and Riak.
Rich Hickey explains the ideas behind the Datomic database: why Datalog is used as the query language, the functional programming concepts at its core, the role of time in the DB and much more.
Attila Szegedi talks about performance tuning Java and Scala programs at Twitter: how to approach GC problems, the importance of asynchronous I/O, when to use MySQL/Cassandra/Redis, and much more.
Hilary Mason, interviewed by Ryan Slobojan, discusses the engineering behind bit.ly and their use of machine learning in their system architecture. Hilary also talks about their use of MySQL and MongoDB to manage terabytes of information about users and clicks, and the implications for performing real-time analysis of anthropology on the human condition.
Adrian Cole discusses his jclouds project, an open-source library that helps Java developers get started in the cloud and reuse their Java development skills. Cole also talks about some of the challenges of creating a cloud-agnostic library, such as the use of different hypervisors and the fact that various cloud implementations are written in different languages, including VB, Python, and Ruby.
Emil Eifrem explains graph databases, what domains they fit well, and the state of Neo4j. Also: how graph databases stack up against RDBMSs.