Neural Networks Content on InfoQ

News

AI, ML & Data Engineering

University Researchers Create New Type of Interpretable Neural Network

Researchers from Massachusetts Institute of Technology, California Institute of Technology, and Northeastern University created a new type of neural network: Kolmogorov–Arnold Networks (KAN). KAN models outperform larger perceptron-based models on physics modeling tasks and provide a more interpretable visualization.

Anthony Alford
on Aug 20, 2024
AI, ML & Data Engineering

University of Pennsylvania Researchers Develop Processorless Learning Circuitry

Researchers from the University of Pennsylvania have designed an electrical circuit, similar to a neural network, that can learn tasks such as nonlinear regression. The circuit operates at low power levels and can be trained without a computer.

Anthony Alford
on Aug 13, 2024
AI, ML & Data Engineering

Google Open Sources 27B Parameter Gemma 2 Language Model

Google DeepMind recently open-sourced Gemma 2, the next generation of their family of small language models. Google made several improvements to the Gemma architecture and used knowledge distillation to give the models state-of-the-art performance: Gemma 2 outperforms other models of comparable size and is competitive with models 2x larger.

Anthony Alford
on Jul 16, 2024
AI, ML & Data Engineering

OpenAI's CriticGPT Catches Errors in Code Generated by ChatGPT

OpenAI recently published a paper about CriticGPT, a version of GPT-4 fine-tuned to critique code generated by ChatGPT. When compared with human evaluators, CriticGPT catches more bugs and produces better critiques. OpenAI plans to use CriticGPT to improve future versions of their models.

Anthony Alford
on Jul 09, 2024
AI, ML & Data Engineering

Meta's Chameleon AI Model Outperforms GPT-4 on Mixed Image-Text Tasks

The Fundamental AI Research (FAIR) team at Meta recently released Chameleon, a mixed-modal AI model that can understand and generate mixed text and image content. In experiments rated by human judges, Chameleon's generated output was preferred over GPT-4 in 51.6% of trials, and over Gemini Pro in 60.4%.

Anthony Alford
on Jun 25, 2024
AI, ML & Data Engineering

Apple WWDC: iOS 18 and Apple Intelligence Announcements

At WWDC 2024, Apple unveiled "Apple Intelligence," a suite of AI features coming to iOS 18, iPadOS 18, and macOS Sequoia. Apple’s aim with Apple Intelligence is to seamlessly integrate AI into the core of the iPhone, iPad, and Mac experience.

Andrew Hoblitzell
on Jun 16, 2024
AI, ML & Data Engineering

Meta Open-Sources MEGALODON LLM for Efficient Long Sequence Modeling

Researchers from Meta, University of Southern California, Carnegie Mellon University, and University of California San Diego recently open-sourced MEGALODON, a large language model (LLM) with an unlimited context length. MEGALODON has linear computational complexity and outperforms a similarly sized Llama 2 model on a range of benchmarks.

Anthony Alford
on Jun 11, 2024
AI, ML & Data Engineering

OpenAI Publishes GPT Model Specification for Fine-Tuning Behavior

OpenAI recently published their Model Spec, a document that describes rules and objectives for the behavior of their GPT models. The spec is intended for use by data labelers and AI researchers when creating data for fine-tuning the models.

Anthony Alford
on Jun 04, 2024
AI, ML & Data Engineering

Stanford AI Index 2024 Report: Growth of AI Regulations and Generative AI Investment

Stanford University’s Institute for Human-Centered Artificial Intelligence (HAI) has published its 2024 AI Index annual report. The report identifies top trends in AI, such as 8x growth in Generative AI investment since 2022.

Anthony Alford
on May 28, 2024
AI, ML & Data Engineering

OpenAI Announces New Flagship Model GPT-4o

OpenAI recently announced the latest version of their GPT AI foundation model, GPT-4o. GPT-4o is faster than the previous version of GPT-4 and has improved capabilities in handling speech, vision, and multilingual tasks, outperforming all models except Google's Gemini on several benchmarks.

Anthony Alford
on May 21, 2024
AI, ML & Data Engineering

Apple Open-Sources One Billion Parameter Language Model OpenELM

Apple released OpenELM, a Transformer-based language model. OpenELM uses a scaled-attention mechanism for more efficient parameter allocation and outperforms similarly sized models while requiring fewer tokens for training.

Anthony Alford
on May 14, 2024
AI, ML & Data Engineering

Meta Releases Llama 3 Open-Source LLM

Meta AI released Llama 3, the latest generation of their open-source large language model (LLM) family. The model is available in 8B and 70B parameter sizes, each with a base and instruction-tuned variant. Llama 3 outperforms other LLMs of the same parameter size on standard LLM benchmarks.

Anthony Alford
on May 07, 2024
AI, ML & Data Engineering

OpenAI Releases New Fine-Tuning API Features

OpenAI announced the release of new features in their fine-tuning API. The features will give model developers more control over the fine-tuning process and better insight into their model performance.

Anthony Alford
on Apr 30, 2024
AI, ML & Data Engineering

Stability AI Releases 3D Model Generation AI Stable Video 3D

Stability AI recently released Stable Video 3D (SV3D), an AI model that can generate 3D mesh object models from a single 2D image. SV3D is based on the Stable Video Diffusion model and produces state-of-the-art results on 3D object generation benchmarks.

Anthony Alford
on Apr 23, 2024
AI, ML & Data Engineering

Google Trains User Interface and Infographics Understanding AI Model ScreenAI

Google Research recently developed ScreenAI, a multimodal AI model for understanding infographics and user interfaces. ScreenAI is based on the PaLI architecture and achieves state-of-the-art performance on several tasks.

Anthony Alford
on Apr 16, 2024
