Machine Learning Content on InfoQ
-
Meta Announces Generative AI Models Emu Video and Emu Edit
Meta AI Research announced two new generative AI models: Emu Video, which can generate short videos given a text prompt, and Emu Edit, which can edit images given text-based instructions. Both models are based on Meta's Emu foundation model and exhibit state-of-the-art performance on several benchmarks.
-
Anthropic Announces Claude 2.1 LLM with Wider Context Window and Support for AI Tools
According to Anthropic, the newest version of Claude delivers many “advancements in key capabilities for enterprises—including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts and our new beta feature: tool use.” Anthropic also announced reduced pricing to improve cost efficiency for its customers across models.
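For context, a minimal sketch of how a system prompt and a long input might be sent to Claude 2.1 through the Anthropic Python SDK's Messages API; the model identifier, prompt text, and token limit are illustrative assumptions, and the beta tool-use feature is not shown.

```python
# Sketch only: assumes the anthropic Python SDK is installed and the
# ANTHROPIC_API_KEY environment variable is set; prompts are illustrative.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-2.1",
    max_tokens=512,
    # System prompts are one of the capabilities highlighted in the release.
    system="You are a concise assistant that summarizes long legal documents.",
    messages=[
        {"role": "user", "content": "Summarize the key obligations in the contract above in five bullet points."},
    ],
)
print(response.content[0].text)
```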
-
Spotify Open-Sources Voyager Nearest-Neighbor Search Library
Spotify Engineering recently open-sourced Voyager, an approximate nearest-neighbor (ANN) search library. Voyager is based on the hierarchical navigable small world (HNSW) algorithm and is 10 times faster than Spotify's previous ANN library, Annoy.
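A small example in the spirit of Voyager's published Python usage is sketched below; the vectors and dimensionality are made up for illustration.

```python
# Illustrative use of the voyager package (pip install voyager);
# the vectors and dimensions below are arbitrary example values.
from voyager import Index, Space

# Create an empty HNSW-backed index for 5-dimensional vectors.
index = Index(Space.Euclidean, num_dimensions=5)

# Add a couple of vectors; each call returns the item's integer ID.
id_a = index.add_item([1, 2, 3, 4, 5])
id_b = index.add_item([6, 7, 8, 9, 10])

# Query for the 2 approximate nearest neighbors of a vector.
neighbors, distances = index.query([1, 2, 3, 4, 5], k=2)
print(neighbors, distances)
```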
-
xAI Introduces Large Language Model Grok
xAI, the AI company founded by Elon Musk, recently announced Grok, a large language model. Grok can access current knowledge of the world via the X platform and outperforms other LLMs of comparable size, including GPT-3.5, on several benchmarks.
-
Grafana Cloud Kubernetes Monitoring with Machine Learning Predictions
Managing cloud costs can be challenging as Kubernetes fleets scale. To address this issue, Grafana Cloud has introduced a cost-monitoring feature within Kubernetes Monitoring. In particular, Kubernetes Monitoring now offers machine-learning predictions for CPU and memory usage.
-
AWS Unveils Gemini, a Distributed Training System for Swift Failure Recovery in Large Model Training
AWS and Rice University have introduced Gemini, a distributed training system designed to speed up failure recovery when training large-scale deep learning models. According to the research paper, Gemini checkpoints training state to CPU memory, achieving significantly faster failure recovery and overcoming the high recovery costs and constrained checkpoint storage capacity of existing approaches.
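The core idea of keeping checkpoints in host (CPU) memory, so that recovery does not depend on slow persistent storage, can be sketched roughly in PyTorch. This is a simplified illustration of the concept only, not Gemini's implementation; the model and helper function are made up.

```python
# Rough sketch of checkpointing to CPU memory rather than persistent storage.
# This illustrates the general idea only; it is NOT Gemini's implementation.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(1024, 1024).to(device)

def snapshot_to_cpu(module: nn.Module) -> dict:
    """Copy all parameters and buffers into host RAM for fast restoration."""
    return {name: tensor.to("cpu", copy=True)
            for name, tensor in module.state_dict().items()}

cpu_checkpoint = snapshot_to_cpu(model)   # fast, in-memory checkpoint

# ...if a later training step fails, restore from the in-memory copy...
model.load_state_dict(cpu_checkpoint)
```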
-
Mojo Language SDK Available: Mojo Driver, VS Code Extension, and Jupyter Kernel
The Mojo SDK is now available to developers. It contains the Mojo driver, the Visual Studio Code extension, and the Jupyter kernel. For now, the SDK is available for macOS and Linux.
-
OpenAI Announces New Models and APIs at First Developer Day Conference
OpenAI announced additions and price reductions across its platform at its first Developer Day. The updates include the introduction of a new GPT-4 Turbo model, an Assistants API, and multimodal capabilities, among others.
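As a rough illustration, calling the new model through the v1.x OpenAI Python SDK announced at Developer Day might look like the sketch below; the model identifier "gpt-4-1106-preview" and the prompts are assumptions for the example, and the Assistants API is not shown.

```python
# Sketch using the OpenAI Python SDK (v1.x); assumes OPENAI_API_KEY is set
# and that the GPT-4 Turbo preview is exposed as "gpt-4-1106-preview".
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "In one sentence, what changed in GPT-4 Turbo?"},
    ],
)
print(completion.choices[0].message.content)
```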
-
Microsoft Releases DeepSpeed-FastGen for High-Throughput Text Generation
Microsoft has announced the alpha release of DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen is a synergistic composition of DeepSpeed-MII and DeepSpeed-Inference, and it is built on the Dynamic SplitFuse technique. The system currently supports several model architectures.
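DeepSpeed-FastGen is exposed through DeepSpeed-MII; a minimal non-persistent pipeline, in the spirit of the project's published examples, might look like the following sketch. The model checkpoint and generation parameters are illustrative assumptions, and a CUDA-capable GPU is required.

```python
# Minimal DeepSpeed-MII (FastGen) sketch; requires pip install deepspeed-mii
# and a CUDA GPU. The model checkpoint is an illustrative choice.
import mii

# Build a non-persistent text-generation pipeline backed by FastGen.
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

# Generate continuations for a small batch of prompts.
responses = pipe(["DeepSpeed is", "Seattle is"], max_new_tokens=64)
for response in responses:
    print(response)
```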
-
Seven Essential Tracks at QCon London 2024: GenAI, FinTech, Platform Engineering & More!
InfoQ’s international software development conference, QCon London, returns on April 8-10, 2024. The conference will feature 15 carefully curated tracks and 60 technical talks over 3 days.
-
Ethical Machine Learning with Explainable AI and Impact Analysis
As more decisions are made or influenced by machines, there’s a growing need for a code of ethics for artificial intelligence. The main question is, “I can build it, but should I?” Explainable AI can provide checks and balances for fairness and explainability, and engineers can analyze the systems' impact on people's lives and mental health.
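As one concrete illustration of explainability tooling (not a technique prescribed by the article), a library such as SHAP can attribute a model's prediction to individual input features; the dataset and model below are arbitrary choices for the sketch.

```python
# Illustrative only: SHAP feature attributions for a scikit-learn model.
# The dataset and model are arbitrary choices for this example.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes per-feature contributions to each prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:50])  # shape: (50, n_features)

# Attribution of each feature for the first sample.
print(dict(zip(X.columns, shap_values[0])))
```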
-
PyTorch 2.1 Release Adds Automatic Dynamic Shape Support and Distributed Training Enhancements
PyTorch Conference 2023 presented an overview of PyTorch 2.1. ExecuTorch was introduced to enhance PyTorch's performance on mobile and edge devices. The conference also had a focus on community with new members added to the PyTorch Foundation and a Docathon announced.
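The automatic dynamic shape support highlighted in the release can be exercised with a toy torch.compile example; the function below is made up and only demonstrates that a shape change triggers a recompilation that then generalizes across batch sizes.

```python
# Toy example for PyTorch 2.1's torch.compile: after the first recompilation
# caused by a shape change, the batch dimension is treated as dynamic
# automatically. The function itself is illustrative.
import torch

@torch.compile
def scaled_sum(x: torch.Tensor) -> torch.Tensor:
    return (x * 2.0).sum(dim=-1)

print(scaled_sum(torch.randn(4, 16)))   # first call compiles for this shape
print(scaled_sum(torch.randn(8, 16)))   # shape change: recompiled with a dynamic batch dim
print(scaled_sum(torch.randn(32, 16)))  # reuses the dynamic-shape compilation
```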
-
Google Cloud Ops Agent Can Now Monitor Nvidia GPUs
Google Cloud announced that Ops Agent, the agent for collecting telemetry from Compute Engine instances, can now collect and aggregate metrics from NVIDIA GPUs on VMs.
-
TorchServe Potentially Exposed to Remote Code Execution
Israel-based security company Oligo has uncovered multiple vulnerabilities in TorchServe, the tool used to serve PyTorch models, that could allow an attacker to run arbitrary code on vulnerable systems. The vulnerabilities have been promptly fixed in TorchServe version 0.8.2.
-
Stability AI Releases Generative Audio Model Stable Audio
Harmonai, the audio research lab of Stability AI, has released Stable Audio, a diffusion model for text-controlled audio generation. Stable Audio is trained on 19,500 hours of audio data and can generate 44.1 kHz quality audio in real time using a single NVIDIA A100 GPU.