PyTorch and TensorFlow: Which ML Framework is More Popular in Academia and Industry

Horace He recently published an article titled The State of Machine Learning Frameworks in 2019. The article uses several metrics to argue that PyTorch is quickly becoming the dominant framework for research, whereas TensorFlow remains the dominant framework for applications deployed in commercial and industrial settings.

He, a research student at Cornell University, counted the number of papers discussing either PyTorch or TensorFlow that were presented at a series of well-known machine learning conferences, namely ECCV, NIPS, ACL, NAACL, ICML, CVPR, ICLR, ICCV and EMNLP. In summary, the majority of papers at every major conference in 2019 were implemented in PyTorch. PyTorch outnumbered TensorFlow by 2:1 at vision-related conferences and 3:1 at language-related conferences. PyTorch was also referenced more often in papers published at more general machine learning conferences such as ICLR and ICML.

He argued that PyTorch is gaining ground because of its simplicity, its intuitive and easy-to-use API, and performance that is at least acceptable when compared with TensorFlow.

On the other hand, the author's metrics for measuring industry adoption show that TensorFlow is still the leader. The metrics used included job listings, GitHub popularity, and the number of Medium articles. He gave three reasons for the disparity between academia and industry. First, the overhead of a Python runtime is something that many companies try to avoid where possible. Second, PyTorch offered no support for mobile "edge" ML. Coincidentally, Facebook added mobile support to PyTorch in version 1.3, which was released earlier this month. Third, PyTorch lacks features around serving, which makes PyTorch systems harder to put into production than equivalent systems developed using TensorFlow.

In the past year, PyTorch and TensorFlow have been converging in several ways. PyTorch introduced "TorchScript" and a JIT compiler, whereas TensorFlow announced that it would be moving to an "eager mode" of execution starting from version 2.0. TorchScript is essentially a graph representation of PyTorch code. Obtaining a graph from the code means that the model can be deployed in C++ and optimized. TensorFlow's eager mode provides an imperative programming environment that evaluates operations immediately, without building graphs. This is similar to PyTorch's eager mode in both its advantages and its shortcomings: it helps with debugging, but models in eager mode cannot be exported outside of Python, optimized, run on mobile, and so on.
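
The difference between eager execution and a captured graph can be illustrated with a toy sketch in plain Python. This is not the PyTorch or TensorFlow API; the names `Node`, `trace`, `run_graph` and `model` are hypothetical and exist only for illustration. The same `model` function is first run eagerly on numbers, and then run once on symbolic inputs to record a graph that can be evaluated later without calling the Python function again.

```python
# Toy illustration only: this is NOT the PyTorch or TensorFlow API.
# All names here (Node, trace, run_graph, model) are hypothetical.

class Node:
    """Symbolic value that records operations instead of computing them."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "input", "const", "add", or "mul"
        self.inputs = inputs  # parent nodes, or the wrapped constant value
        self.name = name

    @staticmethod
    def _wrap(value):
        return value if isinstance(value, Node) else Node("const", (value,))

    def __add__(self, other):
        return Node("add", (self, Node._wrap(other)))
    __radd__ = __add__

    def __mul__(self, other):
        return Node("mul", (self, Node._wrap(other)))
    __rmul__ = __mul__


def model(x, w, b):
    # Ordinary Python code; works on numbers (eager) or Nodes (traced).
    return x * w + b


def trace(fn, *arg_names):
    """Run fn once on symbolic inputs to capture its graph."""
    return fn(*(Node("input", name=n) for n in arg_names))


def run_graph(node, env):
    """Evaluate a captured graph without re-running the Python function."""
    if node.op == "input":
        return env[node.name]
    if node.op == "const":
        return node.inputs[0]
    left, right = (run_graph(n, env) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Eager: every operation runs immediately and returns a concrete value.
eager_result = model(2.0, 3.0, 1.0)

# Graph: the structure is captured once, then evaluated separately.
graph = trace(model, "x", "w", "b")
graph_result = run_graph(graph, {"x": 2.0, "w": 3.0, "b": 1.0})
```

Because the captured graph is plain data, a separate runtime could in principle optimize it or evaluate it without a Python interpreter, which is the property the article attributes to graph representations. The eager call is easier to step through in a debugger, but leaves nothing behind to export.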

In the future, both frameworks are expected to move closer together than they are today. New contenders may challenge them in areas like code generation or higher-order differentiation. He identified JAX as a potential contender. It is built by the same people who worked on the popular Autograd project, and features both forward- and reverse-mode auto-differentiation. This allows computation of higher-order derivatives "orders of magnitude faster than what PyTorch/TensorFlow can offer".
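
As a rough illustration of what forward-mode auto-differentiation means, the sketch below uses dual numbers in plain Python. It is not JAX code and says nothing about JAX's performance; the names `Dual`, `derivative` and `cube` are hypothetical. It only shows the mechanism: a value and its derivative are carried together through each operation, and nesting the transformation yields a higher-order derivative. The sketch assumes a single-variable function built from addition and multiplication.

```python
# Minimal forward-mode differentiation sketch using dual numbers.
# Illustrative only: Dual, derivative and cube are hypothetical names,
# and only + and * are supported.

class Dual:
    """Number of the form val + eps * d, where d * d == 0."""

    def __init__(self, val, eps=0.0):
        self.val = val  # value of the expression
        self.eps = eps  # derivative carried alongside the value

    @staticmethod
    def _lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.eps + other.eps)
    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (a * b)' = a * b' + a' * b
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)
    __rmul__ = __mul__


def derivative(f):
    """Return a function that computes f'(x) in forward mode."""
    def df(x):
        out = Dual._lift(f(Dual(x, 1.0)))  # seed dx/dx = 1
        return out.eps
    return df


def cube(x):
    return x * x * x


first = derivative(cube)(3.0)               # 3 * x**2 at x = 3
second = derivative(derivative(cube))(3.0)  # 6 * x at x = 3
```

Nesting `derivative` works here because the inner dual number stores another dual number as its value, so the outer pass differentiates the inner derivative. Reverse mode, which the article also mentions, instead records the computation and propagates derivatives backwards from the output.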

Horace He, the author of the article, can be contacted via Twitter; he has published both the code used to generate the datasets and interactive charts from the article.
