Big Data Content on InfoQ
-
Developing Deep Learning Systems Using Institutional Incremental Learning
Institutional incremental learning promises to achieve collaborative learning. This form of learning can address data sharing and security issues without bringing in the complexities of federated learning. This article describes practical approaches to building an object detection system.
-
Accelerating Deep Learning on the JVM with Apache Spark and NVIDIA GPUs
In this article, the authors discuss how to combine Deep Java Library (DJL), Apache Spark v3, and NVIDIA GPU computing to simplify deep learning pipelines while improving performance and reducing costs. They also compare the performance of this solution on GPU and CPU hardware, using Amazon EMR and the NVIDIA RAPIDS Accelerator.
-
Evolution of Azure Synapse: Apache Spark 3.0, GPU Acceleration, Delta Lake, Dataverse Support
At Microsoft Build 2021, Azure Synapse announced significant improvements to its Apache Spark pool, its performance, and its data querying and integration capabilities. This article outlines the improvements and provides context.
-
Indestructible Storage in the Cloud with Apache BookKeeper
At Salesforce, we required a storage system that could work with two kinds of streams: one for write-ahead logs and one for data. The two streams place competing requirements on the system. As pioneers in cloud computing, we also required our storage system to be cloud-aware, as the requirements for availability and durability keep increasing.
-
The Evolution of Precomputation Technology and its Role in Data Analytics
In this article, author Yang Li discusses the importance of precomputation techniques in databases, OLAP and data cubes, and some of the trends in using precomputation in big data analytics.
-
Performance Tuning Techniques of Hive Big Data Table
In this article, author Sudhish Koloth discusses how to tackle performance problems when using Hive Big Data tables.
-
The Brain is Neither a Neural Network Nor a Computer: Book Review of The Biological Mind
Underlying much of artificial intelligence research is the idea that the essence of an individual resides in the brain. This is contrary to neuroscience, which has discovered that a brain cannot work independently of the body and its environment. Understanding this enables us to see what is reasonable to expect from artificial intelligence, as well as from technology designed to improve human life.
-
Overcoming Data Scarcity and Privacy Challenges with Synthetic Data
In this article, the author discusses the importance of using synthetic data in data analytics projects, especially in financial institutions, to solve the problems of data scarcity and, more importantly, data privacy.
-
Beyond the Database, and beyond the Stream Processor: What's the Next Step for Data Management?
Databases have been around forever with the same shape: you make a request to your data and then you receive an answer. Stream processors came along with a different approach: data isn’t locked up, it is in motion. Understand how stream processors and databases relate, and why there is an emerging category of databases that focus on data that stays in place as well as data that moves.
-
The End of the Privacy Shield Agreement Could Lead to Disaster for Hyperscale Cloud Providers
The recent ending of the Privacy Shield agreement by the European Court of Justice (ECJ) might impact cloud adoption. This article looks at the demise of this agreement, and possible solutions.
-
COVID-19 and Mining Social Media - Enabling Machine Learning Workloads with Big Data
In this article, author Adi Pollock discusses how to enable machine learning workloads with big data to query and analyze COVID-19 tweets to understand social sentiment towards COVID-19.
-
From Cloud to Cloudlets: a New Approach to Data Processing?
The growing popularity of small, distributed clouds, or “cloudlets”, is an implicit recognition of the limitations of the “traditional” cloud model, and could signal a major shift in the way that data is collected, stored, and processed.