University of Washington Open-Sources AI Fine-Tuning Algorithm WiSE-FT

A team of researchers from the University of Washington (UW), Google Brain, and Columbia University has open-sourced weight-space ensembles for fine-tuning (WiSE-FT), an algorithm for fine-tuning AI models that improves robustness under distribution shift. Experiments on several computer vision (CV) benchmarks show that WiSE-FT improves accuracy by up to 6 percentage points.

The algorithm and several experiments were described in a paper accepted at the upcoming Conference on Computer Vision and Pattern Recognition (CVPR). WiSE-FT combines the weights of a fine-tuned model with the original model's weights. The resulting ensemble model shows better accuracy under distribution shift (that is, when the patterns of input data differ from the training data) while still maintaining high accuracy on in-distribution data. In a set of experiments using shifted versions of the ImageNet benchmark dataset, a CLIP-based image classifier fine-tuned using WiSE-FT outperformed other strong models. According to the researchers,

We view WiSE-FT as a first step towards more sophisticated fine-tuning schemes and anticipate that future work will continue to leverage the robustness of zero-shot models for building more reliable neural networks.

Because training deep learning models from scratch requires large datasets and considerable compute resources, many developers have begun to use pre-trained models such as CLIP or GPT-3 as a starting point. While these models can be used in a zero-shot or few-shot setting, which requires no updates to the model weights, they are often fine-tuned instead: the model weights receive additional training updates on a task-specific dataset. However, the fine-tuned model can perform quite well on in-distribution data while performing poorly on out-of-distribution data, that is, data whose statistics do not match those of the training data.

Because distribution shift occurs quite frequently in production settings, the UW team investigated ways to improve the robustness of fine-tuned models. The resulting algorithm, which can be implemented "in a few lines of PyTorch," is a linear interpolation between the weights of the original model and those of the fine-tuned one. A mixing coefficient can be used to give one of the two a stronger influence on the final result, but the researchers determined in a wide range of experiments that a neutral mixture, which weights both models equally, "yields close to optimal performance." In addition to the robustness benefits, WiSE-FT requires no additional computation during the fine-tuning process or during inference.
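The interpolation described above can be sketched in a few lines. This is a minimal illustration and not the researchers' released code: it uses plain Python dictionaries of floats as a hypothetical stand-in for a model's parameters, and the names `interpolate_weights`, `zero_shot`, `fine_tuned`, and `alpha` are assumptions made for this example. With PyTorch, the same arithmetic would presumably be applied to the tensors returned by each model's `state_dict()`.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's parameters: name -> flat list of weights.
StateDict = Dict[str, List[float]]


def interpolate_weights(
    zero_shot: StateDict, fine_tuned: StateDict, alpha: float = 0.5
) -> StateDict:
    """Return (1 - alpha) * zero_shot + alpha * fine_tuned, parameter by parameter.

    alpha = 0.0 recovers the original model, alpha = 1.0 the fine-tuned one,
    and alpha = 0.5 is the neutral mixture.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("both models must share the same parameter names")

    merged: StateDict = {}
    for name, w_zs in zero_shot.items():
        w_ft = fine_tuned[name]
        if len(w_zs) != len(w_ft):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise linear interpolation between the two weight vectors.
        merged[name] = [(1.0 - alpha) * a + alpha * b for a, b in zip(w_zs, w_ft)]
    return merged


# Toy parameters, invented purely for illustration.
zero_shot = {"linear.weight": [0.0, 2.0], "linear.bias": [1.0]}
fine_tuned = {"linear.weight": [1.0, 4.0], "linear.bias": [3.0]}
merged = interpolate_weights(zero_shot, fine_tuned, alpha=0.5)
```

Because the result is a single set of weights with the same shape as either input, the merged model costs no more to run at inference time than the fine-tuned model alone.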

To test the algorithm, the team built an image classifier model based on CLIP, with a final linear layer added to produce the output. The model was fine-tuned using the ImageNet dataset, then evaluated on five different distribution-shifted datasets derived from ImageNet: ImageNet-V2, ImageNet-R, ImageNet Sketch, ObjectNet, and ImageNet-A. Using WiSE-FT, the resulting model outperformed previous fine-tuned CLIP classifiers on both the reference ImageNet test data and the shifted datasets.

Co-author Gabriel Ilharco, a PhD student at UW, answered several questions about the work on Twitter. One commenter asked about using ensembles of several fine-tuned models, instead of including the original model. Ilharco replied,

We...find that you can substantially improve the robustness of standard models if you ensemble them (in output-space) with a robust model. If you ensemble two non-robust models, you get no gains in effective robustness.
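For contrast with weight-space interpolation, the output-space ensembling mentioned in the reply can be sketched as follows. This is an illustrative assumption rather than the method used in the paper: it averages the class probabilities of two models, which is one common way to ensemble in output space, and the names `softmax`, `output_space_ensemble`, and the toy logits are invented for this example.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def output_space_ensemble(
    logits_a: List[float], logits_b: List[float], alpha: float = 0.5
) -> List[float]:
    """Mix the predictions of two models instead of mixing their weights.

    Assumption: the mix is taken over softmax probabilities; other choices,
    such as mixing logits, are also possible.
    """
    if len(logits_a) != len(logits_b):
        raise ValueError("both models must predict over the same classes")
    probs_a = softmax(logits_a)
    probs_b = softmax(logits_b)
    return [(1.0 - alpha) * pa + alpha * pb for pa, pb in zip(probs_a, probs_b)]


# Toy logits for a three-class problem, invented purely for illustration.
robust_logits = [2.0, 0.0, 0.0]
fine_tuned_logits = [0.0, 1.0, 0.0]
ensemble_probs = output_space_ensemble(robust_logits, fine_tuned_logits)
predicted_class = max(range(len(ensemble_probs)), key=ensemble_probs.__getitem__)
```

Unlike the weight-space approach, an output-space ensemble has to run both models on every input, so its inference cost is roughly the sum of the two.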

The code for WiSE-FT and the paper's experiments are available on GitHub.
