InfoQ News
-
Microsoft Launches Azure Confidential VMs with NVIDIA Tensor Core GPUs for Enhanced Secure Workloads
Microsoft Azure has launched the NCC H100 v5 virtual machines, which are equipped with NVIDIA Tensor Core GPUs and enhance secure computing for high-performance workloads. These VMs leverage AMD EPYC processors for robust data protection, making them ideal for tasks like AI model training and inferencing, while ensuring a trusted execution environment for sensitive applications.
-
Distill Your LLMs and Surpass Their Performance: spaCy's Creator at InfoQ DevSummit Munich
In her presentation at the inaugural edition of InfoQ Dev Summit Munich, Ines Montani built on the talk she gave earlier this year at QCon London and offered the audience practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller, faster components that can be run and maintained in-house.
-
University Researchers Publish Analysis of Chain-of-Thought Reasoning in LLMs
Researchers from Princeton University and Yale University published a case study of Chain-of-Thought (CoT) reasoning in LLMs which shows evidence of both memorization and true reasoning. They also found that CoT can work even when examples given in the prompt are incorrect.
-
Java News Roundup: WildFly 34, Stream Gatherers, Oracle CPU, Quarkiverse Release Process
This week's Java roundup for October 14th, 2024, features news highlighting: the release of WildFly 34; JEP 485, Stream Gatherers, proposed to target for JDK 24; Oracle Critical Patch Update for October 2024; and a potential leak in the SmallRye and Quarkiverse release processes.
-
Microsoft and Tsinghua University Present DIFF Transformer for LLMs
Researchers from Microsoft AI and Tsinghua University have introduced a new architecture called the Differential Transformer (DIFF Transformer), aimed at improving the performance of large language models. This model enhances attention mechanisms by refining how models handle context and minimizing distractions from irrelevant information.
-
OpenAI Releases Swarm, an Experimental Open-Source Framework for Multi-Agent Orchestration
Recently released as an experimental tool, Swarm lets developers explore how multiple agents can coordinate with one another to execute tasks using routines and handoffs.
-
General-Purpose and Compute-Intensive Amazon EC2 Graviton4 Instances Now Available
AWS has recently released the EC2 C8g and M8g instances, powered by the latest Graviton4 processors. The general-purpose M8g and compute-intensive C8g instances are designed to deliver up to 30% better performance compared to Graviton3-based instances, with a cost increase of approximately 10% over the previous M7g and C7g generations.
-
Google Cloud Adds Scalable Vector Search to Memorystore for Valkey & Redis Cluster
Google Cloud has introduced scalable vector-search capabilities to its Memorystore for Valkey and Redis Cluster. This update allows developers to perform vector searches at ultra-low latencies over billions of vectors.
-
Podman Desktop 1.13 Launches with Hyper-V Support and Additional Enhancements
Podman Desktop 1.13 introduces key updates, including Hyper-V support for managing Podman machines on Windows, an integrated image search feature, and redesigned empty state pages for containers, images, pods, and Kubernetes. The release also includes a reorganized Kubernetes navigation and an Image Layer Explorer extension.
-
Microsoft Releases Preview of AI Integration Libraries for .NET
Last week, Microsoft announced the preview release of two libraries: Microsoft.Extensions.AI.Abstractions and Microsoft.Extensions.AI. These packages, referred to as Unified AI Building Blocks, provide the .NET ecosystem with essential abstractions for integrating artificial intelligence (AI) services into .NET applications and libraries, along with middleware to enhance key capabilities.
-
Microsoft Introduces Drasi: Open-Source System for Real-Time Event Processing and Automation
Microsoft’s Azure Incubations team introduced Drasi, an open-source system that simplifies detecting critical events in complex infrastructures. Drasi offers real-time monitoring and automated responses, eliminating the need for manual event handling. With flexible components and integrations, it streamlines change detection across various data sources.
-
Vertex AI in Firebase Aims to Simplify the Creation of Gemini-powered Mobile Apps
Currently available in beta, the Vertex AI SDK for Firebase enables the creation of apps that go beyond the simple chat model and text prompting. Google has just published a Colab to guide developers through the steps required to integrate the SDK into their apps.
-
No EC2 or Kubernetes Allowed: Insights from Building Serverless-Only Architecture at PostNL
PostNL shared insights and guidance from its transition from outsourced IT project delivery to an in-house product delivery capability. By embracing cloud-native technologies, with an emphasis on serverless services, the company achieved significant gains in productivity and market responsiveness while reducing operational costs.
-
How a Sustainable Mindset in Software Engineering Can Increase Team Performance and Prevent Burnout
A sustainable mindset in software engineering matters because software is still primarily built by humans, and we must prioritize their well-being, Marion Løken said at NDC Oslo. Integrating the team more deeply into discovery work, discussing feedback collectively, and fostering a culture of psychological safety helped to engage her team and mitigate burnout.
-
Challenges and Lessons Porting Code from C to Rust
In a two-installment series, Stephen Crane and Khyber Sen, software engineers at Immunant, recount how they ported the VideoLAN and FFmpeg AV1 decoder from C to Rust for the Internet Security Research Group (ISRG). The series goes into detail on how they avoided breaking existing behavior and how they optimized performance.