Machine Learning Content on InfoQ
-
OpenAI Announces New Models and APIs at First Developer Day Conference
OpenAI announced additions and price reductions across its platform at its first Developer Day. The updates include a new GPT-4 Turbo model, an Assistants API, and multimodal capabilities, among other features.
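As an illustration, here is a minimal sketch of calling the new model through the OpenAI Python SDK (v1 and later); the model identifier and prompt are assumptions based on the launch-time naming, not taken from the announcement itself:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4-1106-preview",  # GPT-4 Turbo preview name at launch; may change
        messages=[
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": "Summarize this announcement in one sentence."},
        ],
    )
    print(response.choices[0].message.content)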
-
Microsoft Releases DeepSpeed-FastGen for High-Throughput Text Generation
Microsoft has announced the alpha release of DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen combines DeepSpeed-MII and DeepSpeed-Inference and is built on the Dynamic SplitFuse technique. The system currently supports several model architectures.
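A minimal sketch of the non-persistent pipeline interface DeepSpeed-MII exposes for FastGen; the model name and arguments are illustrative and may differ from the alpha release:

    import mii

    # Load a supported model into a FastGen serving pipeline (model name is illustrative).
    pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

    # Generate text for a batch of prompts.
    responses = pipe(["DeepSpeed is", "Seattle is"], max_new_tokens=64)
    for r in responses:
        print(r)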
-
Seven Essential Tracks at QCon London 2024: GenAI, FinTech, Platform Engineering & More!
InfoQ’s international software development conference, QCon London, returns on April 8-10, 2024. The conference will feature 15 carefully curated tracks and 60 technical talks over 3 days.
-
Ethical Machine Learning with Explainable AI and Impact Analysis
As more decisions are made or influenced by machines, there’s a growing need for a code of ethics for artificial intelligence. The main question is, “I can build it, but should I?” Explainable AI can provide checks and balances for fairness and explainability, and engineers can analyze the systems' impact on people's lives and mental health.
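For instance, one widely used explainability technique is computing per-feature attributions for individual predictions; the sketch below uses the SHAP library purely as an illustration and is not something the article prescribes:

    import shap
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    model = RandomForestClassifier(random_state=0).fit(X, y)

    # Explain individual predictions: which features pushed each decision, and by how much.
    explainer = shap.Explainer(model, X)
    shap_values = explainer(X.iloc[:50])
    print(shap_values.values.shape)  # per-feature contributions for the first 50 predictions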
-
PyTorch 2.1 Release Adds Automatic Dynamic Shape Support and Distributed Training Enhancements
PyTorch Conference 2023 presented an overview of PyTorch 2.1, and ExecuTorch was introduced to enhance PyTorch's performance on mobile and edge devices. The conference also focused on community, with new members added to the PyTorch Foundation and a Docathon announced.
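As a small illustration of the automatic dynamic shape support in the release, torch.compile now detects when input shapes vary and recompiles with dynamic shapes instead of specializing on every new shape; the function below is a toy example:

    import torch

    def scale_and_sum(x: torch.Tensor) -> torch.Tensor:
        return (x * 2.0).sum(dim=-1)

    compiled = torch.compile(scale_and_sum)

    print(compiled(torch.randn(4, 8)))    # first call specializes on this shape
    print(compiled(torch.randn(16, 8)))   # a changed shape triggers a dynamic-shape recompile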
-
Google Cloud Ops Agent Can Now Monitor Nvidia GPUs
Google Cloud announced that Ops Agent, the agent for collecting telemetry from Compute Engine instances, can now collect and aggregate metrics from NVIDIA GPUs on VMs.
-
Defensible Moats: Unlocking Enterprise Value with Large Language Models at QCon San Francisco
In a recent presentation at QCon San Francisco, Nischal HP discussed the challenges enterprises face when building LLM-powered applications using APIs alone. These challenges include data fragmentation, the absence of a shared business vocabulary, privacy concerns regarding data, and diverse objectives among stakeholders.
-
Canonical Launches Charmed MLFlow to Simplify Management and Maintenance of ML Workflows
Based on the open-source MLflow platform, Canonical Charmed MLFlow aims to simplify the task of managing machine learning workflows and artifacts by using an alternative packaging system and orchestration engine.
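Since Charmed MLFlow is based on the open-source MLflow platform, experiment tracking should look the same as with any MLflow deployment; a minimal sketch, with a hypothetical tracking URI standing in for wherever the server is exposed:

    import mlflow

    mlflow.set_tracking_uri("http://mlflow.example.internal:5000")  # hypothetical endpoint
    mlflow.set_experiment("demo-experiment")

    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)
        mlflow.log_metric("accuracy", 0.93)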
-
Unpacking How Ads Ranking Works @ Pinterest: Aayush Mudgal at QCon San Francisco
At QCon San Francisco, Aayush Mudgal gave a talk on Pinterest's ad ranking strategy. Pinterest performs both candidate retrieval and ranking, informed by user interaction data and the content a user is currently viewing. Neural networks create embeddings for ads and users, so that ads whose embeddings are close to a user's embedding are likely to be relevant. Models are trained and deployed on a daily basis.
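A toy sketch of the embedding idea described in the talk: a user tower and an ad tower map features into a shared space, and dot-product similarity serves as a relevance signal. The layer sizes and names are illustrative, not Pinterest's actual architecture:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Tower(nn.Module):
        """Maps raw features into a shared embedding space."""
        def __init__(self, in_dim: int, emb_dim: int = 64):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return F.normalize(self.net(x), dim=-1)

    user_tower, ad_tower = Tower(in_dim=32), Tower(in_dim=48)
    user_emb = user_tower(torch.randn(1, 32))   # one user's features
    ad_embs = ad_tower(torch.randn(100, 48))    # 100 candidate ads

    scores = (ad_embs @ user_emb.T).squeeze(1)  # closeness in embedding space ~ relevance
    top_ads = scores.topk(10).indices           # rank the closest ads first
    print(top_ads)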
-
Grafana Introduces ML Tool Sift to Improve Incident Response
Grafana Labs has introduced "Sift," a feature for Grafana Cloud designed to enhance incident response management (IRM) by automating system checks and expediting issue resolution. Sift automates various aspects of incident investigation and surfaces insights into potential issues within Kubernetes environments, helping engineers focus on resolving incidents.
-
AI a “Must-Have” in GitLab’s 2023 Global DevSecOps Report
GitLab has released their 2023 Global DevSecOps AI report, with the key finding that AI and ML use is evolving from a "nice-to-have" to a "must-have". The report shows that 23% of organizations are already using AI in software development, and of those, 60% are using it daily. Furthermore, 65% of respondents said they are using AI and ML for testing now, or would be within the next three years.
-
AWS Unveils Multi-Model Endpoints for PyTorch on SageMaker
AWS has introduced multi-model endpoints for PyTorch on Amazon SageMaker, which let users host and serve multiple PyTorch models from a single endpoint, offering more flexibility and efficiency when deploying machine learning models.
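A hedged sketch of what invoking such an endpoint looks like with boto3, where TargetModel selects one of the model archives hosted behind the endpoint; the endpoint and archive names are placeholders:

    import json
    import boto3

    runtime = boto3.client("sagemaker-runtime")

    response = runtime.invoke_endpoint(
        EndpointName="pytorch-mme-demo",   # placeholder endpoint name
        TargetModel="model-a.tar.gz",      # selects one model hosted on the multi-model endpoint
        ContentType="application/json",
        Body=json.dumps({"inputs": [[1.0, 2.0, 3.0]]}),
    )
    print(response["Body"].read())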
-
AI, ML, Data Engineering News Roundup: Stable Chat, Vertex AI, ChatGPT and Code Llama
The most recent update, covering developments through September 4, 2023, highlights significant announcements and accomplishments in artificial intelligence, machine learning, and data science. Developments from Stability AI, Google, OpenAI, and Meta were among the week's significant stories.
-
Weekly Update on Large Language Models: PointLLM, WALL-E, AskIt, and Jais
The most recent compilation of advanced research, inventive applications, and notable releases in large language models (LLMs) covers the week starting September 4, 2023.
-
Google Announces Ray Support for Vertex AI to Boost Machine Learning Workflows
Google has announced that it is expanding its open-source support for Vertex AI, its machine learning platform, by adding support for Ray, an open-source unified compute framework. This move is aimed at efficiently scaling AI workloads and enhancing the productivity and operational efficiency of data science teams.
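To illustrate the kind of workload this targets, here is a minimal Ray sketch; connecting to a Vertex AI-managed Ray cluster goes through Google's client libraries, which are omitted here, so ray.init() simply starts a local cluster for illustration:

    import ray

    ray.init()  # local cluster for illustration; on Vertex AI this would point at the managed cluster

    @ray.remote
    def score_batch(batch):
        """A stand-in for a parallelizable ML task such as batch scoring."""
        return sum(batch) / len(batch)

    futures = [score_batch.remote(list(range(i, i + 10))) for i in range(0, 100, 10)]
    print(ray.get(futures))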