From Research to Production with PyTorch

Summary

Jeff Smith covers some of the latest features from PyTorch - the TorchScript JIT compiler, distributed data parallel training, TensorBoard integration, new APIs, and more. He discusses some projects coming out of the PyTorch ecosystem like BoTorch, Ax, and PyTorch BigGraph. He digs into some of the use cases and industries where people are successfully taking PyTorch models to production.

Bio

Jeff Smith is an engineering manager at Facebook AI where he supports the PyTorch team. He’s the author of Machine Learning Systems and Exploring Deep Learning for Language. While working at the intersection of functional programming, distributed systems, and machine learning, he coined the term reactive machine learning to describe an ideal ML architecture and associated set of techniques.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Smith: Let's talk about taking deep learning models from research to production. I am Jeff [Smith], I work at Facebook where we developed PyTorch as a tool to solve our problems but we did it in the open, and now we have this great open source project that I want to share with you. It's a big and a complex space to understand what all can you do within deep learning and how you can navigate it. Here's my idea of a map to how we can get started in this conversation. I really want to be focused on your productivity as a developer, and I'm going to presume that some of you have never used PyTorch before and have possibly never used a deep learning framework or done machine learning before. I really want to dive into what are the tools and parts of an ecosystem that you can take advantage of to help you be productive. I really want to focus on some of the concerns that a mature software engineer with relevant experience is going to have.

Here's my map to what we're going to do. The theme here is around that journey from research mode all the way out to production, and I'll talk about what exactly I mean by that a little bit later.

To begin, I want to just give a brief introduction to why we at Facebook invest in AI. You can see it all over our products, whether or not you recognize it as AI or not. These are things like translation, which help you connect with people who you don't share a language with, some of our AR effects and Spark AR, our ability to build computer vision models for virtual reality and Oculus products, and all sorts of other ways in which we use different forms of AI technology. The example there from blood donations is a social good initiative that is powered by natural language understanding technology. To do that, we need to invest in some technology that really can handle our scale.

What Is PyTorch

Our AI platform runs over 400 trillion predictions a day, and that number is climbing rapidly. It's also deployed on over a billion phones around the world. This is every time a neural network on your device is using part of our technology to perform a prediction operation. What's the technology underneath that? Well, that's PyTorch. If you've not encountered PyTorch before, I want to be a little bit more specific about who we are and what our opinions are and the places we invest in building technology to make you productive. I want to start with eager and graph-based execution. This has to do with how you, as a user, write your code. Eager mode means Python as you would normally write it, returning back a result just as soon as you invoke a function. I'll show you a little bit more about some of our investment in graph-based execution as a project, and how that works for you, and what that workflow looks like.

Historically, we've had a lot of innovative techniques come out of work done in PyTorch due to our ability to support dynamic neural networks, by which I mean neural networks which contain control flow, such as if statements, that are defined by you in regular Python. Also, one of the reasons why you might want to use a deep learning framework to implement your solution to a particular problem is that you need to operate over a distributed system. You need a cluster of machines somewhere. We have a very powerful distributed training library called c10d that can handle doing really awesome stuff across clusters of GPU-enabled servers.

Other things you might need out of a deep learning framework include hardware-accelerated inference. This means when you're performing a prediction in real-time, depending on what you're trying to do, you may really care about taking the best advantage of the CPU or GPU underlying your program. Doing that really well involves a lot of open collaboration between organizations like Facebook, Intel, Nvidia, and so forth.

Finally, just to ground you in how we think about writing software within the PyTorch project, we really prefer simplicity over complexity. We want you to be able to write your code the way you were already naturally going to write it. Write Python as you would write Python, look like NumPy when you're doing something that you would otherwise be doing in NumPy, and be very modular and opt-in.

Getting a little bit more concrete, what are some of those pieces when we start to look at PyTorch at a developer level? This is just a little bit of a high-level map to some of the pieces you might use within the PyTorch library. Most people start within PyTorch because they're interested in solving some problem with deep learning, and so that's where all of our neural network capabilities in the nn module come into play. I'm going to spend a lot of time today talking about our JIT, it's called TorchScript. This is really one of our major investments in terms of our engineering focus in building a path from research to production. There's a lot more built into PyTorch, whether it's within the core or within other libraries, and in the ecosystem you can see some of them there. I'll talk a bit more about some of them later. It's designed to be a very big toolbox to support a broad range of things.

I want to get some code up though just to give you a feel for what does PyTorch code look like. If you've not done deep learning before, I'm going to be fairly quick in just showing what are the basics of the code. First, we need to define what is a neural network here using the nn module. Let's get this started right out of the gate. We need to initiate our module. We're defining a forward pass, so that is the computation we want to do in our inference operation. Here, you can see the use of some pre-built layers, ReLU, dropout, sigmoid and so forth. Showing different parts of how you work with data, some more out-of-the-box functionality. Here, you can see a data loader here which is an abstraction which allows you to manage various data sets, and you can see a pre-built data set being brought in from the torchvision package, which is built to serve computer vision use cases - in this case, MNIST.

You can see an optimizer as well, stochastic gradient descent, all of that library-standard functionality. Then here's the training loop. What we're going to do is, we're going to iterate over each of our instances, and then you can see the steps that we have to go through there. We're going to have to zero out our gradients. We're going to apply the forward pass within our neural network. We're going to apply our loss function, call backward on that, and then move forward in our optimization. Then you can see some checkpointing functionality there with torch.save. It's a really simple high-level example of common neural network code, not a lot of PyTorch specifics here. I'll show a little bit more of an in-depth view of what the specifics of how you work with PyTorch are in an example a little bit later.
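The slides themselves are not reproduced in this transcript, so what follows is only a minimal sketch of the kind of code described above, not the code shown in the talk. It assumes a small synthetic dataset in place of torchvision's MNIST (to avoid a download), and the class name `Net`, the layer sizes, and the learning rate are illustrative choices.

```python
import io

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)


class Net(nn.Module):
    """A tiny network built from pre-built layers (hypothetical sizes)."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.drop = nn.Dropout(p=0.1)
        self.fc2 = nn.Linear(16, 1)

    def forward(self, x):
        # The forward pass: the computation performed at inference time.
        x = torch.relu(self.fc1(x))
        x = self.drop(x)
        return torch.sigmoid(self.fc2(x))


# Synthetic stand-in for a real dataset such as torchvision's MNIST.
features = torch.randn(64, 8)
labels = (features.sum(dim=1, keepdim=True) > 0).float()
loader = DataLoader(TensorDataset(features, labels), batch_size=16, shuffle=True)

model = Net()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.BCELoss()

losses = []
for epoch in range(5):
    for x, y in loader:
        optimizer.zero_grad()        # zero out the gradients
        loss = loss_fn(model(x), y)  # forward pass, then loss function
        loss.backward()              # backpropagate
        optimizer.step()             # move forward in the optimization
        losses.append(loss.item())

# Checkpoint the learned parameters (an in-memory buffer stands in for a file).
checkpoint = io.BytesIO()
torch.save(model.state_dict(), checkpoint)
```

In a real project the buffer would typically be a file path, and the dataset would come from a package such as torchvision.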

Research to Production

I want to frame that example within the context of the journey from research to production. If you're not experienced within the field of machine learning, parts of this workflow may not be something that you've seen in quite this variant before, but if you do have experience in building ML projects and products, you will recognize a lot of the commonalities to what I want to show here.

This is the very high-level view of how do we take something from an idea all the way out to real use case. First up, we would start off with some high-level plan. We're going to determine our approach to a given problem we're trying to solve with deep learning. Our next, and sometimes hardest, problem is preparing the data. This is where it's nice to have at least pre-built data to get started if you're trying to establish a technique or a data loader to manage the existing data that you may have. Then we're going to do the part that everyone thinks about when they think about machine learning, deep learning in general, that's build and train model. That's that example code I showed just previously. That's an important step, but this next step is one that people don't often talk about a lot, and this is transferring a model out to production.

Just because we've trained a model, all we have right now is an object in memory, maybe an artifact on disk. There's some additional step that we need to take to be able to publish that out, to get it out into the real world of production because we need to deploy this in some live mode and be able to scale it, whatever that means for your application.

This high-level conceptual view is common to a lot of ML workflows. What I want to call out is that the fourth step is really a very difficult one, and that's very much been our experience inside Facebook AI. We've seen a ton of pain in trying to build a toolchain that supports a productive workflow to go from beginning to end, getting stuck at that transfer from model to production. We want to do everything we can to remove that step by creating an end-to-end toolchain that supports being able to author a model in the same framework that you can then deploy it out to production.

I want to be more concrete about what I mean by production. We as a project think of production in terms of a couple of these properties here. One of them is hardware efficiency. There are some very non-trivial aspects of getting truly state-of-the-art performance on CPUs, GPUs, and domain-specific architectures like TPUs and ASICs.

Scalability - this is really being able to run across a whole cluster and being able to operate at extremely high throughput rates. There are some really fun and hard engineering problems here as well. Platform constraints - we want to deploy neural networks to more than a billion phones. We also want to deploy them to state-of-the-art servers. There are different challenges there, and if your production spans both of those things, neural networks running at massive scale on a server as well as running on tiny devices, then you need a toolchain that supports those things.

Finally, reliability - and this gets into the scalability and reliability component here. Large scale machine learning is an extraordinarily compute-intensive job and involves potentially thousands of GPUs operating at the same time, which has real world cost in terms of energy and in terms of spend for your organization. These are the things we think about when building on our production toolchain.

I want to talk about one piece of that production toolchain, and that's TorchScript, or torch.jit. What we want to do with TorchScript is to be able to power your transition from going from research to production. We want you to be able to experiment and then extract out TorchScript from your Python program, which can then be optimized and deployed into production. I'm going to show you a fairly-detailed working code here.

The key point here is, this is about extracting out that information from code you authored yourself, not adapting your programming model to fit ours. What this looks like is in what we call eager mode, this is PyTorch as it has always been, immediately returning out results as you would do in a normal Python program. You can prototype, you can train your model, you can run these experiments. Then you have these two paths to make this transition out to script mode.

One way that I'm going to show you is to use the script annotation there. You also have the ability to trace. They have sort of different properties, and the only way to work through it is just to have a live code example, so we'll do that next. I'm going to do that on Colab. Quick call out to Colab, Colab is a service provided by Google Cloud. We in the PyTorch project have collaborated with that team to bring the best of PyTorch to Google Cloud on Colab. I'm going to show you that live in a browser window right now.

Colab Demo

This is Colab. If you've never seen it, it's a hosted notebook service provided by Google Cloud. It gives you free access to CPUs, GPUs, and TPUs. I just hit "Connect," I'm going to hang out at the cloud. It's going to initialize, and you can see it's allocated me a server. I have the ability to choose the different properties there. I'm going to go ahead and run all of this code here live. As you can see, none of it's been executed yet, but I'm just going to run it all to speed our walkthrough of what this code is. One of the great parts of working with the Google team on this is, this just has this incredible impact on your ability to get started.

You can import torch right away in an environment that supports it out of the box, running the latest version of PyTorch, and get started. That succeeded, we just ran "import torch." Let's get started learning about TorchScript and the path to production.

Here's a very simple example of a cell. What we're using here is the nn module capabilities, and we are defining a simple initialization function, and we're going to define what is our forward pass. In this case, it's just a simple tanh, you can see the result of that. This is the basics of what we might do here, and we're just showing our results. It looks a lot like the code I showed you before. I want to iterate on this a little bit, we'll use this function here, we're going to use a linear construct. Linear allows us to hierarchically compose up our neural network. Again, we can do that in eager mode, step by step, we can just add in new elements to our neural architecture.

Let's get really fun, this is the heart of the thing here. This new thing is a decision gate, and so this is conditional logic. We want to say, "If in some cases the sum of this operation is greater than zero, we want to do one thing. Otherwise, we want to do something else." This is where we get into your flexibility as a developer and your ability to have dynamic neural networks that do different things based on the input they see. This is a really powerful capability, and it's a foundational technique to a lot of the most exciting work going on in deep learning today. Now, can we use that within our linear structure? The answer is yes, we can, we're able to do it. You can see the representation that that's produced here. Our linear contains this new function here.
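The notebook cells are not included in the transcript. As a rough sketch of what a decision gate composed into a cell could look like in eager mode, the following uses the hypothetical names `MyDecisionGate` and `MyCell` and assumed tensor shapes; the actual demo code may have differed.

```python
import torch


class MyDecisionGate(torch.nn.Module):
    def forward(self, x):
        # Ordinary Python control flow that depends on the input data.
        if x.sum() > 0:
            return x
        return -x


class MyCell(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dg = MyDecisionGate()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        # The gate is applied to the output of the linear layer.
        new_h = torch.tanh(self.dg(self.linear(x)) + h)
        return new_h, new_h


torch.manual_seed(0)
my_cell = MyCell()
x, h = torch.rand(3, 4), torch.rand(3, 4)

print(my_cell)            # the module tree lists both the gate and the linear layer
new_h, _ = my_cell(x, h)  # runs immediately, like any other Python call
```

Because this runs eagerly, the branch taken inside the gate is decided anew on every call, based on the values flowing through the network.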

It’s fully in eager mode here, so this is just Python as you would normally do it, with the result returned to you. You were able to get that immediately. Here's a little graphic calling out why that's important. It's because we're not requiring a full program to be able to do useful things, like define the backward function and perform automatic differentiation. We're building that graph on the fly every time you, in eager mode, execute a new operation of some sort, and we can compute those gradients with whatever graph we have. The technique here goes by the name of "tape-based auto diff," by which we mean we're not building the same sort of symbolic representation that you would in a graph-only mode. In fact, we're just replaying each individual operation so that we can perform differentiation at any point. This is what powers eager mode, and this is what makes eager mode work so powerfully in your workflows.
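A small sketch may make the idea of recording operations as they run more concrete. The function `gated` below is a made-up example, not something from the talk; it simply shows that autograd differentiates whichever operations were actually executed on a given call.

```python
import torch


def gated(x):
    # Python control flow decides which operations get recorded.
    if x.sum() > 0:
        return (x * 2).sum()
    return (x ** 2).sum()


pos = torch.tensor([1.0, 2.0], requires_grad=True)
neg = torch.tensor([-1.0, -2.0], requires_grad=True)

gated(pos).backward()  # recorded operations: multiply by 2, then sum
gated(neg).backward()  # recorded operations: square, then sum

print(pos.grad)  # gradient of sum(2x) is 2 for every element
print(neg.grad)  # gradient of sum(x^2) is 2x, here -2 and -4
```

No graph had to be declared ahead of time: each call to `backward()` walks back through the operations recorded during that particular forward pass.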

Let's talk about why we would want to jump over into TorchScript mode. Eager mode is great for being able to author models, but we want to be able to get the power of our production toolchain. We're ready to make that transition, that's where we're at in our development workflow. Here, we can see how we can trace the model here. This is the simplified one, this is just my cell, it doesn't have any of that conditional logic in it, it's a simple pass through. This could be done in a sort of eager mode, or it could be done in graph mode. What we've done here is, you can see this call to torch.jit.trace, that's what allowed us to extract out what is the computation graph of this particular neural network by putting in a given input, seeing what the neural network did, and then recording that out as a graph using TorchScript.

That works great if, in fact, you have no dynamic control flow, but what PyTorch has always promised is that you do get access to that dynamic control flow. Before we get to that topic, I want to show you just briefly what we extract. Here's traced_cell.graph, where you can see the actual internal representation. This is the TorchScript IR here, the intermediate representation. It shows us what our computation graph is as the runtime understands it. It's not the most readable thing in the world in a presentation like this, so I'll show you what it looks like if we map it back out to a more Pythonic code representation. You can see, it's pretty simple here, we can see what the operations are. We call the addmm operation, and you can see the tanh application. Pretty straightforward, readable Python code if you want to understand what TorchScript is extracting from your program. It did all that without requiring you to do anything other than say, "Trace it."

It's probably a good time to talk about why would you want to extract out TorchScript. Briefly, the thing that we care about is that by extracting out that graph, we now have a language-independent representation of what is the computation you want to perform. We can do certain things with it like write optimization passes that can help your performance. It also allows us to export out to deployment environments that don't have Python in them.

It's worth showing that when we do that, our trace representation, our TorchScript-extracted version, does precisely the same thing on the same inputs as your original program did. We've extracted out a new program from that, the computational graph, which you can see here. When you call my_cell, you've invoked in eager mode. When you call traced_cell, you're operating on the computation graph that has been extracted. Exactly the same results.
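As a hedged sketch of the tracing workflow just described, the snippet below traces a simple cell with `torch.jit.trace`, prints the extracted graph and its Python-like rendering, and compares eager and traced outputs. The class name `MyCell` and the tensor shapes are assumptions, and the exact text printed for the graph and code varies between PyTorch versions.

```python
import torch


class MyCell(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.linear(x) + h)
        return new_h, new_h


torch.manual_seed(0)
my_cell = MyCell()
x, h = torch.rand(3, 4), torch.rand(3, 4)

# Run the module once on example inputs and record the operations it performs.
traced_cell = torch.jit.trace(my_cell, (x, h))

print(traced_cell.graph)  # TorchScript intermediate representation
print(traced_cell.code)   # the same graph rendered as Python-like code

eager_out, _ = my_cell(x, h)       # eager mode
traced_out, _ = traced_cell(x, h)  # extracted computation graph
print(torch.allclose(eager_out, traced_out))
```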

Let's do the fun part, I've been eager to get to this part. Let's actually deal with control flow. If we have control flow like this and we try to perform tracing, PyTorch will do its best but, in fact, we're in a more dynamic form, so this is not going to work right. It's not going to be what we really want. Here, we can say we've got that same control flow, we've got our decision gate function, but when we trace it, we've only traced it on a single instance, just one input. The trace of that is the program without the control flow. It has no ability to capture that in a pure trace because it only saw one instance. That's not what you wanted, and so your code is dropped out. That's why we provide you with yet more powerful tools than simple tracing.
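To illustrate how a pure trace loses the branch, here is a minimal sketch with hypothetical module names. The cell is traced on an input that takes the positive branch, then called on an input that should take the other one. PyTorch is expected to emit a TracerWarning during tracing because a tensor is converted to a Python boolean.

```python
import torch


class MyDecisionGate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:
            return x
        return -x


class MyCell(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dg = MyDecisionGate()

    def forward(self, x, h):
        return torch.tanh(self.dg(x) + h)


cell = MyCell()
h = torch.zeros(3)
positive = torch.ones(3)
negative = -torch.ones(3)

# Tracing only sees the branch taken for this one example input.
traced_cell = torch.jit.trace(cell, (positive, h))
print(traced_cell.code)  # no "if" appears in the recorded program

eager_out = cell(negative, h)          # takes the else branch: tanh(+1)
traced_out = traced_cell(negative, h)  # replays the recorded branch: tanh(-1)
print(eager_out, traced_out)
```

The two outputs disagree, which is exactly the failure mode described above: the trace captured one path through the program rather than the program itself.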

What are those more powerful tools? That's the script method you can see here. Here's what I've had to change to my code. By doing that, we can say that this is actually something that needs to be scripted, that we're not going to use pure tracing, that we actually want to use script mode here. What that script does is, it tells the TorchScript JIT compiler to extract out what is this part of our computation graph, don't just trace it but actually understand what it is, map it to TorchScript, and give me the fuller representation of that computation graph. If you can see there, we have new code extracted as a result there. That's this "If" statement here. You can read them in if Boolean and so forth, down in this section here. We have accurately mapped from our representation in plain Python out to a computation graph which can take advantage of all of those optimized static graph components which are built for production mode.
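The exact annotation used in the demo is not visible in the transcript. In current PyTorch releases, one way to get the same effect is `torch.jit.script`, which compiles the module's Python source, including its control flow, instead of recording a single run. The sketch below assumes that entry point and uses hypothetical module names.

```python
import torch


class MyDecisionGate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:
            return x
        else:
            return -x


class MyCell(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dg = MyDecisionGate()

    def forward(self, x, h):
        return torch.tanh(self.dg(x) + h)


cell = MyCell()
# Compile the module and its submodules from source, keeping the branch.
scripted_cell = torch.jit.script(cell)

print(scripted_cell.dg.code)  # the extracted code still contains the "if"

h = torch.zeros(3)
positive, negative = torch.ones(3), -torch.ones(3)
same_on_positive = torch.allclose(cell(positive, h), scripted_cell(positive, h))
same_on_negative = torch.allclose(cell(negative, h), scripted_cell(negative, h))
print(same_on_positive, same_on_negative)
```

Unlike the traced version, the scripted module agrees with eager mode on inputs that take either branch.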

As we can see, it works just the same way, whether it's been run in eager mode or extracted by a TorchScript. That is, at a code level, what that transition from research to production looks like using the power of TorchScript.

Something that may not be obvious from seeing that at a code level is that this is a really powerful breakthrough for folks who have had to live in this research and production divide for a long time, which matches my personal experience and the experience of Facebook AI. That code that you just saw allows you to operate within a shared codebase for a given domain on top of a common technology so that a research team and a production team can be using the exact same tools and have a path from research to production. This is really important for us at Facebook because we often have really deep research things going on within Facebook AI Research, our fundamental research function. That needs to be connected up to the ways in which we deploy PyTorch models to production at scale and take advantage of all those capabilities we have.

Here's a concrete example of that, this is the PyText library. This was developed at Facebook, really focused on natural language understanding and some of the specific problems of working with text. We've done some great work in being able to use the technology coming out of research and putting it into production with PyText. It has real-world production requirements, it needs to operate in real time. A common use case of this library is that we're going to be performing recommendation predictions inside a Messenger session, inside Facebook Messenger. It needs to scale: Messenger operates in hundreds of languages, and that takes that many models. It needs to operate around the world on billions of devices, and so forth. There's a lot of complexity to the end-to-end picture of that.

I want to put that in an architecture diagram of what the workflow looks like. In research mode, we need to be constantly finding new ideas, running experiments, and developing new techniques, which requires this great flexibility. That flexibility is great, and at some point, we need to evaluate those things, maybe sweep through some parameters, but then we probably have something we want to put to use on some level. We're going to export a PyTorch model at that point. We're just going to do this in eager mode, we're going to stay in Python land, and we're going to deploy to a simple Python service which allows us to get a little bit of small-scale production metrics. This isn't a full production deployment, this is for us to feel comfortable that things are in good shape.

When we do feel like we have found a successful new technique, we can now make that step that allows us to deploy from research to production, still using the exact same toolchain by just doing the same sorts of steps we just saw, by exporting the TorchScript that allows us to take those follow-on steps with that optimized model. That PyTorch TorchScript model is now exported in a Python-free way so that it can be used inside of our highly optimized massive scale C++ inference service that can serve billions of people. Folks who are working on both ends of this problem are within the exact same codebase and using the same toolchain.
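The export step can be sketched with `torch.jit.save` and `torch.jit.load`. This is only an illustration with a hypothetical module and an in-memory buffer; a real deployment would normally write a file and load it from a C++ service (for example through libtorch's `torch::jit::load`), which is not shown here.

```python
import io

import torch


class MyCell(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        y = self.linear(x)
        if y.sum() <= 0:  # data-dependent branch, kept by scripting
            y = -y
        return torch.tanh(y + h)


torch.manual_seed(0)
scripted_cell = torch.jit.script(MyCell())

# Serialize code and weights together; the archive does not need the
# original Python class definition to be loaded again.
buffer = io.BytesIO()
torch.jit.save(scripted_cell, buffer)

buffer.seek(0)
loaded_cell = torch.jit.load(buffer)

x, h = torch.rand(3, 4), torch.rand(3, 4)
matches = torch.allclose(scripted_cell(x, h), loaded_cell(x, h))
print(matches)
```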

Personalized Cancer Vaccines

I've shown you some fair bit of specifics around how we used PyTorch at Facebook. I want to show you just a glimpse into the larger PyTorch community, a couple of production examples here. The first one is from Genentech, Genentech's working on a pretty important problem, which is personalized cancer therapy.

The biology of cancer is such that each individual cancer is unique. Your body's response is going to be unique, and there's a lot of deep data problems in there that they're attacking with AI. The approach that they're working on is they want to leverage your own immune system using AI, built in PyTorch, to fight cancer. Specifically, they're doing things like identifying peptides which can be bound to expose some part of a molecule that looks like the specific cancer within your body out to your own immune system, to teach your immune system how to fight against it, a sort of personalized cancer vaccine. They've seen some great results.

Predictive Driver Assistance

Moving on to a very production-oriented example, I want to talk about Toyota. Within their research group, Toyota Research Institute, they've been able to have this really amazing journey from research to production using PyTorch. They're concerned with driver safety: more than a million people die in traffic accidents every year. The statistics are staggering, and Toyota, as the largest car manufacturer in the world, really wants to have an impact on this. They're investing in technologies like autonomous driving cars, but what we can deliver today, potentially to cars, is driver-assistive technology, so predictive driver assistance that takes advantage of some of the capabilities that we would be building inside of an autonomous car, inside of a car that's driven by a human, to prevent crashes.

They've been able to collect real-world examples of crashes and they've been able to map them into a digital world. What we're going to see here is a crash being avoided by a car that accelerates out of the way of other drivers who are losing control of their cars behind it. This is a simulation they built from real-world data. We're able to model it into this simulated world using PyTorch. In the world of cars, when we shift to production, we don't just mean simulated worlds, we mean cars on the road. Once they've validated this model, they need to then actually get cars out on the road, power them, and try to replicate the exact same examples, see if the machine learning model is able to determine that the cars behind it are about to lose control and accelerate out of the way, using new intelligence capabilities but nothing more than the same capabilities built into the car today to accelerate. It’s a really interesting look at how we can put AI to use in driver safety.

Libraries: Optimization

I want to talk a little bit more about the broader ecosystem. I'm going to talk about a few libraries that came out of Facebook that are focused on optimization problems, and they're both focused on techniques derived from Bayesian optimization. Bayesian optimization is a statistical technique distinct from deep learning. The example problem I'm going to talk about here is hyperparameter optimization. Inside building deep learning models, there are all of these magic numbers like the learning rate, and various other hyperparameters, the number of epochs, and so forth, that determine whether or not you're going to be successful in training out a model that is performant and does the right thing. They're magic numbers, they're established by heuristics, previous experience, looking at someone else's paper, hoping for the best, running too many jobs. There are better ways of doing this using Bayesian statistics, and so we built out a toolchain to do that using PyTorch.

The first layer is the BoTorch library. BoTorch is really about pure Bayesian optimization. It's built on PyTorch and it uses some of the probabilistic modeling capabilities exposed by GPyTorch, a Gaussian process library that is also open source, but it's really a very un-framework, highly modular way of exploring techniques that allow you to perform Bayesian optimization.

Moving up the stack, one of the ways in which we put that to use is our adaptive experimentation framework called Ax. Ax tries to generalize some of those concepts of how we can use Bayesian optimization techniques to develop domain-agnostic abstractions around trying to optimize for particular goals, deploy that all out, and really make that happen very autonomously.

Here's an example of what this looks like, this example is showing News Feed ranking. In this case, there are all sorts of models that we want to deploy out as well as other components of data that we call the configuration. In sum, all these pieces of data determine the basis on which items are ranked within your News Feed. We want to make changes to this all the time, we want to continually make this better. Online experimentation is the gold standard for getting good labels and feedback that our machine learning model is doing well.

Unfortunately, online experimentation is a scarce resource. We only have so many users working at a given time, and we don't want to expose them to untested models. The way that Ax plays a role in this is that we can build a multitask model which unifies real-life testing data from users with a much much larger amount of offline simulation data that wasn't exposed out to real users. It gives us an ability to understand whether or not we're accurately statistically modeling the properties of the system and choose which new model to deploy.

Both of these libraries, Ax and BoTorch, are open source. We just released them about a month ago at F8, and I'd like to call out that they're part of a larger ecosystem, which I'm going to show you a few more examples of. This part of the ecosystem in particular includes BoTorch, PyText, our translation platform built on Fairseq, and our Horizon reinforcement learning platform. These are all things that we built to solve our own problems at Facebook, and we really want to make sure to give back to the wider community of developers to allow you to work with these things and use them to solve your own problems.

Developer Resources

I want to talk specifically about some more of those developer resources that can help you get started in becoming someone who can take models from research to production and use them to solve real-world problems. From the perspective of PyTorch and Facebook AI, we really care about the whole stack of things. We want all of these to exist in open source, and that means everything from our Open Compute platform project where we're open-sourcing our data center hardware designs, all the way up to our compiler investments where we're collaborating in open source on things like Glow and TVM, PyTorch itself, and then higher level frameworks. As you get closer to having more specific machine learning needs, you probably care about things like pre-trained models, and we publish a lot of those, I'll show an example of that soon, and then even data sets to allow you to develop new techniques yourself because this is part of the larger PyTorch ecosystem.

There's a specific part of pytorch.org, the ecosystem page, I would encourage you to go directly to if you want to see what are some of the projects that you can get started with on some of the domain-specific libraries. Some of these are going to be for things like NLP, some of these are going to be for things like vision, robotics. There's a rich community of people collaborating today in PyTorch. I'm going to do one more bounce across screens just to show you a good example of something I really like there.

An example of some of the great things you'll find in the ecosystem here is Papers With Code. This records computer science papers which develop new techniques and it links them up to actual implementations so that you can get started here. You can see the link here, we can go up to the PyTorch Hub for this paper. PyTorch Hub is our collection of implementations of models in PyTorch, and you can see not just the code of this thing. We can see an explanation and some of the resources, and this is all directly loadable within your own code. Here, we can just click the button, launch it up on Colab, and we can see that we have the ability, with a single line of code, to bring in this specific model from the PyTorch Hub. People are adding new models all the time. This is an open collaboration, it's not just models coming out of Facebook. It's all sorts of PyTorch users sharing their work and helping others be productive.

I think you've heard me say a lot about developer productivity. To call it out here, I've shown a few examples of some of that tooling today, but there's a lot more. We work very closely with Amazon, Microsoft, and Google to make sure that, if you're a developer who wants access to the best tools the cloud can provide, they support PyTorch out-of-the-box, things like being able to "import torch" or having it supported within VS Code.

This is a fun example of something that launched about six weeks ago. This is called an AI Platform Notebook, it comes from Google Cloud, and this is a really exciting new toolchain that is a notebook-like environment connected up to the most sophisticated production deployment technologies that Google Cloud is developing. As with my Colab example, you can just "import torch" and get running, and they have examples built into it, out-of-the-box.

Continuing the love, this is another collaboration we've done with Google. This is TensorBoard, probably the state-of-the-art, everyone's-favorite visualization tool for training deep neural networks. This is an embeddings view, visualizing an embedding space, but there are all sorts of other views, like learning rates, and various other ways to understand what is going on inside your deep learning program. TensorBoard itself is open source, and it supports PyTorch out-of-the-box and inside Colab as well.

Education

Part of being productive is knowing what you want to do in the first place, and so, we care a lot about developer education. We've been overwhelmed by the enormous growth of the community. Last year we were the second fastest growing project in all of open source. We've had to think a lot about how to bring new people up to speed, and also to point to leading lights in the community and support them when they try to teach others. Here are two books written by folks who are in the PyTorch community. They're not Facebook employees, and they're really great: "Natural Language Processing with PyTorch" and "Deep Learning with PyTorch." Both of these books are great places to start in understanding how you can solve real-world problems using the best of PyTorch technology.

I think I mentioned this a little bit in the panel, but we have some great courses as well. Udacity has worked with us on a whole bunch of courses. You can see I'm scrolling there through all the different topics you can study on Udacity using PyTorch. We care a lot about this. One of the more recent steps we've taken is collaborating with Andrew Trask, who is the developer of OpenMined and PySyft, on the development of a privacy and AI course deployed on Udacity. We're also funding scholarships for people to continue their studies on Udacity in learning more about deep learning. To call out that privacy course, the library used in it is PySyft, a PyTorch library that contains a lot of powerful techniques for privacy-preserving ML, across the range of things known within computer science.

Bouncing over to another great educational partner, fast.ai is an online AI school that has extraordinary stats around the numbers of people they reach, and their global reach, and their ability to reach people all over the world and teach them the absolute latest in artificial intelligence techniques. They've just launched new courses in PyTorch, and they've done some great development of having easy-to-use beginner libraries that help someone become productive as a learner. Those courses and those libraries right now, the latest ones, are around audio and vision, major areas of deep learning activity.

What does it mean if you walk away from this talk and want to get started? I pointed to the "Get Started" page on PyTorch. One of the ways that you can get started is to just click the button on the docs, and that will launch you out to a live Colab instance that has the code from the examples there. You can also get started in the cloud, on Azure, on Amazon, on Google Cloud. You can also install locally, just hit install and start playing around with it on a laptop. You don't need a massive GPU-backed server to start writing code, but once you do want to go ahead and move up to that scale, you know that you have that support there, both within the PyTorch technology itself and within the larger ecosystem of open source libraries and cloud partners that help you be productive in moving your models from research to production.

That's all I really have to say. I just want to invite you, if you are interested in this kind of open AI collaboration, to join in the community. We live out here in open source, we want to talk. We want to learn more about what you're doing, and understand what we can invest in and spend our time building to make you successful in taking your deep learning models from research to production.

Questions and Answers

Participant 1: Great talk. You used deep learning and machine learning interchangeably. Do you see any distinction between those two, and if so, what is it?

Smith: I think that's probably just me rushing through things a little bit. For folks who are unfamiliar with how you would normally break down the terminology, I would say, deep learning can usually be classified as a subset of all available machine learning techniques. Part of that fluidity is that PyTorch, as a technology, embraces more problems than simply deep learning. Deep learning is one of the ones that we work on, but we also do a lot of work within scientific computing that can sometimes use other techniques. For example, in the discussion of Ax and BoTorch, those are non-deep learning-based techniques, but they are built on PyTorch. We work at the intersection of both.

Participant 2: For someone who has a lot of experience in Python and has a lot of interest in machine learning but no experience with PyTorch, what would be the one definite source that you would say is the first that someone should look at?

Smith: It depends on how much time you have. Some common answers are, you can do the 60 Minute Blitz tutorial if you've got an hour, that's on the PyTorch docs. If you want to commit to a deeper learning process, the Udacity and fast.ai courses are great. I also like the books as well for people who feel like they have enough grounding to be able to be productive in working through a book. I think the depth of the examples and the ability to dig deep, and some of the things you can do, specifically, in a book are nice as well. Those would be my three main starting points, the tutorials, the courses, or the books.

Participant 3: Just quick question on how this platform, PyTorch, is compatible for something like Edge devices like Raspberry Pi. Is there a lighter version of PyTorch?

Smith: This is, I think, a moving target within the field of deep learning in saying, how can we take something that's huge and how can we make it small and work really well? The story of what works today and works well is really focused around mobile, for us specifically. There are paths to take a PyTorch model and export via ONNX, which allows you access to various ONNX runtimes. I didn't talk about it in this talk, but ONNX is an open source standard collaboration that we and Microsoft and others created around the Open Neural Network Exchange format. There are some toolchains for that, and there are other ways as well to take PyTorch models and run them on things like the Caffe2go mobile runtime. It's an important area. Without saying anything super specific, obviously we care and we will keep doing more stuff. This is, I think, a moving target for the whole field.

Some other pieces of the PyTorch stack worth calling out, if you happen to be working in an area that specialized, are FBGEMM and QNNPACK, which are quantization libraries. Quantization is a necessary technique in shrinking down a very large deep learning model and making it possible to run either efficiently on a server or efficiently on a mobile device, and both of those are open source projects that we released last year to support the larger PyTorch ecosystem.
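To make the idea of quantization concrete, here is a minimal sketch of 8-bit affine quantization in plain Python. It is an illustration of the arithmetic only and is not the API of FBGEMM or QNNPACK; the function names `quantize` and `dequantize` and the sample weights are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using one scale and one zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The representable range is widened to include 0.0 so that zero maps exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    q = [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the integer representation."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # hypothetical float32 weights
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored in one byte instead of four, at the cost of a rounding error on the order of one quantization step (`scale`). Production libraries add per-channel scales and optimized integer kernels on top of this basic mapping.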

Participant 4: Something that we saw earlier today in the TensorFlow talk was that they had functionality in their new API for taking pre-trained published models and using those as input layers in models that you derive from them. You talked about PyTorch models that Facebook provides and other researchers provide. What's PyTorch's support for things like, if I wanted to use a pre-trained language classifier or image classifier, and use that for some domain-specific or proprietary modeling on top of that?

Smith: This is an area that I think is emerging, where people really want something that helps with their productivity. In the old days there used to be things like static models, where people published, "Here are some things from these papers," and they just sat in a folder and no one ever did anything with them. These days, what we care about doing within PyTorch is the PyTorch Hub. I bounced over and showed a little bit of that. PyTorch Hub is where we would like people to share their pre-trained models and do that sort of connection up to, "Where is the paper this is derived from? How does someone use this?"

It's pretty easy to get started. We've even collaborated with a bunch of small startups that have been able to put up their work and get it onto the hub. It really helps with the reuse of a model because it's a single one-liner. You just call it in as "hub download" or something like that.
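As a rough sketch of what the question describes, the snippet below puts a new task-specific layer on top of a frozen pre-trained model. The hub entry point in PyTorch is `torch.hub.load`; the repository and model names in the comment are only an example, and the small stand-in backbone, the layer sizes, and `num_classes` are assumptions chosen so the sketch runs without network access.

```python
import torch
from torch import nn

# In practice the backbone would come from the hub, for example:
#   backbone = torch.hub.load("pytorch/vision", "resnet18", pretrained=True)
# (this needs network access, and newer torchvision releases take a
# `weights=` argument instead). A small stand-in keeps this runnable offline.
backbone = nn.Sequential(nn.Linear(32, 16), nn.ReLU())

# Freeze the pre-trained weights so that only the new head is trained.
for param in backbone.parameters():
    param.requires_grad = False

num_classes = 3  # hypothetical number of domain-specific labels
head = nn.Linear(16, num_classes)
model = nn.Sequential(backbone, head)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)

# Random stand-in data: 8 examples with 32 features each.
inputs = torch.randn(8, 32)
labels = torch.randint(0, num_classes, (8,))

# One training step on the new head.
loss = nn.functional.cross_entropy(model(inputs), labels)
loss.backward()
optimizer.step()
```

Freezing the backbone is one common choice when the domain-specific dataset is small; unfreezing some or all of it later for fine-tuning is another option.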

Participant 5: I do have some machine learning projects going on, but they are more traditional machine learning things, like k-means and SVM. I don't know if I can get benefits if I port all this kind of stuff to PyTorch. Can I do this, and can I get any kind of benefit from porting k-means or SVM?

Smith: I think I understand the question. If I've got an existing implementation of something that uses a non-neural network-based technique, is there any benefit to trying to work with PyTorch? I think it depends. Some of the things that I would typically be looking at are, do you have the ability to take advantage of sufficiently large datasets that you would like access to things like GPU acceleration? We do have a portion of the community of users that I would generally group in scientific computing, working on problems that are not deep learning, and they take advantage of PyTorch as a GPU-accelerated tensor library, so NumPy on GPUs. If that sounds useful to you, then maybe we could talk and find whether or not there's a way to put that to use for your use case. It tends to be a little bit domain-specific though.
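To illustrate the "NumPy on GPUs" usage, here is a minimal sketch of k-means written with plain PyTorch tensor operations. It is not an official PyTorch k-means implementation; the synthetic data, the choice of two clusters, and the fixed ten iterations are assumptions for the example. The same code runs on a GPU when one is available and falls back to the CPU otherwise.

```python
import torch

# Use the GPU if one is present; everything below is device-agnostic.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Two synthetic 2-D blobs centered near (-5, -5) and (5, 5).
torch.manual_seed(0)
points = torch.cat([torch.randn(50, 2) - 5.0, torch.randn(50, 2) + 5.0]).to(device)

# Start with one centroid taken from each blob.
centroids = points[[0, 50]].clone()

for _ in range(10):
    # Pairwise distances between every point and every centroid: shape (100, 2).
    distances = torch.cdist(points, centroids)
    # Assign each point to its nearest centroid.
    assignments = distances.argmin(dim=1)
    # Move each centroid to the mean of its assigned points.
    centroids = torch.stack([points[assignments == k].mean(dim=0) for k in range(2)])
```

Whether this is worth porting depends, as noted above, on whether the dataset is large enough for the distance computation to benefit from GPU acceleration.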

Participant 6: I have a very short, light-hearted question. I wonder what happened to the old logo?

Smith: You have the old logo on your stickers. As of PyTorch 1.0, which was announced at F8 of 2018, and then delivered at PyTorch Dev Con of 2018, PyTorch 1.0 now reflects the union of the PyTorch technology, the ONNX technology, and the Caffe2 technology. Caffe2, briefly, for anyone who's unfamiliar with it, is a deep learning framework developed at Facebook, also open source, and deployed to production inside Facebook. We chose to unify those, really, to achieve that research to production story, being able to have all those broad capabilities. When we did that, we created the slightly more futuristic logo that you see now.

Participant 7: I wanted to ask the opposite question of the talk that was on TensorFlow. Why would someone use PyTorch over TensorFlow?

Smith: I did have a slide where I tried to be fairly specific about our philosophy as a project, and so I'll speak in a largely affirmative mode. I wanted to paint a picture for someone who wants to take something, and if that sounds like the sort of journey that you want to do, and the ways in which you want to do it, there's a very specific approach at the code level, which is why I try to make that as clear as possible, around how little you need to adapt your programming model to match the capabilities we're trying to provide out to you. If you take a look at the tutorials, you open them up, you start running things, and that is the way in which you want to work, we would love to collaborate with you on that. We want to support people who appreciate this flexible and highly modular approach that allows you to opt into the pieces that you care about. I've got nothing bad to say about TensorFlow at all, they're great guys as well. I appreciated Brad's [Miro] talk, it was very worthwhile.

Recorded at:

Aug 20, 2019

Jeff Smith
