Caltech Open-Sources AI for Solving Partial Differential Equations

Researchers from Caltech's DOLCIT group have open-sourced Fourier Neural Operator (FNO), a deep-learning method for solving partial differential equations (PDEs). FNO outperforms other existing deep-learning techniques for solving PDEs and is up to three orders of magnitude faster than traditional solvers.

The team described their model and experiments in a paper published on arXiv. By learning a mapping from one function to another, the neural network can represent solutions to a family of PDEs in a way that does not depend on mesh resolution. The model applies a Fourier transform to efficiently calculate a global convolution. When compared to other deep-learning models for solving PDEs, FNO achieves error rates that are 30% lower on Navier-Stokes equations and 60% lower on Darcy flow. FNO can be applied to speed up calculations for predicting weather patterns; in one experiment, the model achieved an inference time of 0.005s compared with a traditional solver's 2.2s, making it a good candidate for research in climate models.

PDEs are used in many areas of physics and engineering to describe a wide variety of phenomena, including heat transfer, fluid dynamics, and quantum mechanics. The solution of a PDE is a function, often of both space and time. For example, the solution to the Navier-Stokes equations, a set of nonlinear PDEs that describe the motion of fluids, is a function that takes a point in space and an instant in time and outputs a vector indicating the flow of the fluid there.

However, in most practical situations there is no closed-form PDE solution, so scientists and engineers resort to numerical approximations, using finite element methods (FEM) or finite difference methods (FDM). These techniques work by creating a fine-grained mesh of discrete points of interest and analyzing the PDE's behavior in a small neighborhood around each mesh point for a short period of time. Within these constraints, the PDE can often be approximated by a simpler system of equations that can be solved by iterative numerical methods. While these methods are the mainstay of many technical fields, they have some disadvantages: the process is time-consuming and must be rerun if any changes are made to the grid definition or parameters of the problem, as when validating different designs of an airfoil, for example.
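The finite difference idea can be sketched on one of the simplest PDEs, the one-dimensional heat equation u_t = alpha * u_xx. The example below is only an illustration and is not taken from the FNO paper or code: the function name `solve_heat_1d`, its parameters, the zero boundary values, and the sin(pi * x) initial condition are all assumptions chosen so that the result can be compared with a known exact solution.

```python
import math


def solve_heat_1d(n_points=51, alpha=1.0, t_end=0.1, safety=0.4):
    """Explicit finite-difference solve of u_t = alpha * u_xx on [0, 1].

    Hypothetical example: boundary values are held at 0 and the initial
    condition is sin(pi * x). All names and defaults are illustrative.
    """
    dx = 1.0 / (n_points - 1)
    # This explicit scheme is only stable if alpha * dt / dx**2 <= 0.5,
    # so the time step is tied to the mesh spacing.
    dt = safety * dx * dx / alpha
    n_steps = math.ceil(t_end / dt)
    dt = t_end / n_steps  # shrink dt slightly so the last step lands on t_end
    r = alpha * dt / (dx * dx)

    # Mesh of discrete points and the initial condition sampled on it.
    x = [i * dx for i in range(n_points)]
    u = [math.sin(math.pi * xi) for xi in x]
    u[0] = u[-1] = 0.0

    for _ in range(n_steps):
        new_u = u[:]
        # Update each interior point from its immediate neighbors only.
        for i in range(1, n_points - 1):
            new_u[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        u = new_u
    return x, u


x, u = solve_heat_1d()
# For this initial condition the exact solution is
# exp(-alpha * pi**2 * t) * sin(pi * x), which gives a reference value.
exact_mid = math.exp(-math.pi ** 2 * 0.1)
print(f"numerical u(0.5, 0.1) = {u[25]:.5f}, exact = {exact_mid:.5f}")
```

Changing `alpha`, the initial condition, or `n_points` means running the whole time-stepping loop again, and a finer mesh also forces a smaller time step. That is the cost the article refers to when it says the process must be rerun for every change to the grid or the problem parameters.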

Neural networks and deep learning have shown promise in speeding up scientific simulations. For solving PDEs, the goal is often to produce a model which can be used to quickly generate sample data for statistical analysis; this is often applied in the inverse problem of trying to determine a system's initial conditions, given later observations. There are two main earlier deep-learning approaches: finite-dimensional operators and neural FEM. The former use convolutional neural networks (CNN) to produce a parameterized approximation of a solution; however, the models are not mesh-independent. Neural FEM models, on the other hand, are mesh-independent but only represent the solution for a specific instance of a PDE, and must be re-trained if the parameters of the PDE are changed.

The Caltech team's approach is to build a neural network that can learn a solution operator; that is, it learns the mapping from a function that defines a PDE instance, such as its coefficients or initial condition, to that instance's solution. According to lead author Zongyi Li, the problem is analogous to an image-to-image model; instead of CNN layers, the network consists of a series of Fourier layers. These layers apply a fast Fourier transform (FFT) to their input data, then a linear transform, followed by an inverse FFT. The FFT gives the layer quasi-linear computational complexity and makes the model invariant to the spatial resolution of the data; however, it does require a uniform mesh.
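The FFT, linear transform, inverse FFT sequence can be sketched in a few lines of NumPy. This is a minimal one-dimensional illustration and not the released FNO code: the function name `fourier_layer`, the choice to keep only the lowest `n_modes` frequencies, the random weights, and the test input are all assumptions, and the sketch leaves out everything else a full network needs, such as nonlinear activations and training.

```python
import numpy as np


def fourier_layer(v, weights, n_modes):
    """FFT -> per-frequency linear transform -> inverse FFT (illustrative).

    v:       real array, shape (n_points, in_channels), on a uniform grid
    weights: complex array, shape (n_modes, in_channels, out_channels)
    """
    n_points = v.shape[0]
    # Dividing by n_points makes the coefficients approximate Fourier-series
    # coefficients, which do not depend on how finely v was sampled.
    v_hat = np.fft.rfft(v, axis=0) / n_points
    out_hat = np.zeros((v_hat.shape[0], weights.shape[2]), dtype=complex)
    # Mix channels separately for each retained low-frequency mode;
    # all higher frequencies are left at zero.
    out_hat[:n_modes] = np.einsum("kio,ki->ko", weights, v_hat[:n_modes])
    # Undo the scaling and return to physical space on the same grid.
    return np.fft.irfft(out_hat * n_points, n=n_points, axis=0)


def sample_input(n_points):
    """Hypothetical two-channel test signal on a uniform periodic grid."""
    x = np.arange(n_points) / n_points
    return np.stack([np.sin(2 * np.pi * x), 0.5 * np.cos(4 * np.pi * x)], axis=1)


rng = np.random.default_rng(0)
channels, n_modes = 2, 4
shape = (n_modes, channels, channels)
weights = rng.normal(size=shape) + 1j * rng.normal(size=shape)

# The same weights are applied to the same function at two resolutions.
out_coarse = fourier_layer(sample_input(64), weights, n_modes)
out_fine = fourier_layer(sample_input(128), weights, n_modes)
print(np.allclose(out_fine[::2], out_coarse))
```

Because the weights are attached to frequencies rather than to grid points, the same `weights` array works for 64 or 128 samples, and for this band-limited input the two outputs agree at the shared grid points. The sketch also shows why a uniform mesh is needed: the FFT assumes evenly spaced samples.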

The FNO work builds on the team's previous graph kernel network (GKN) paper presented at the recent NeurIPS conference. In a discussion on Twitter, Li pointed out that the slower GKN can be used in situations where a uniform mesh isn't applicable. He also noted that:

Models trained on one geometry may not directly generalize to another geometry. But we may use it as a pre-trained model or a preconditioner to help to get the solution on a different geometry.

The FNO code and pre-trained models are available on GitHub.
