Key Takeaways
- GPT-3 is the largest language model trained to date.
- The basic operation mode of GPT-3 is to generate text responses based on the input text, for example, to answer a question or to write an essay based on a title.
- OpenAI now provides a developer API to interact with GPT-3 and build applications on top of it.
- GPT-3 is a few-shot learner. It requires priming with a few examples to work in a specific context.
- Once primed correctly, GPT-3 could perform math calculations and generate answers in programming languages, although it has not learned either explicitly.
The first wave of GPT-3-enabled applications has stunned "developer Twitter". These applications offer a glimpse of our AI future.
GPT-3 (Generative Pre-trained Transformer 3) is OpenAI's latest and greatest natural language prediction model. Simply put, it generates text in response to any input text. It is a program that responds to questions or statements.
GPT-3 is pre-trained with a large amount of natural language text from the Internet (45TB of training text with 499 billion words). Training it on GPUs cost at least 4.6 million US dollars, with some estimates as high as $12 million. The resulting model has 175 billion parameters.
InfoQ covered OpenAI’s GPT-3 announcement back in June. With 175 billion parameters, GPT-3 is more than 100x bigger than its predecessor, GPT-2, and about 10x bigger than any previous non-sparse language model. In the official GPT-3 research study, the OpenAI team demonstrated that GPT-3 achieves state-of-the-art performance out-of-the-box without any fine-tuning. But how does it work in the real world? Is it just another toy or a serious threat to humanity? Well, a month after its initial release, the first GPT-3 powered applications are emerging. Now we can see for ourselves.
We feel most developers can absolutely build projects using GPT-3 very quickly. - Yash Dani & George Saad
For this article, we interviewed many of these creators and entrepreneurs and reviewed some of their applications. Developers and journalists alike have described GPT-3 as shockingly good and mind-blowing.
How it works
The GPT-3 model generates text one word at a time. As a hypothetical example, let's say that a developer gives it the following words as input.
"Answer to the Ultimate Question of Life, the Universe, and Everything is"
The AI model could generate the word "forty" as the response. And then, the developer appends the generated word to the input and runs the model again.
"Answer to the Ultimate Question of Life, the Universe, and Everything is forty"
This time, the AI model could generate the word "two" as the response. Repeat the process once more, and the next response should be a period, which completes the sentence.
"Answer to the Ultimate Question of Life, the Universe, and Everything is forty-two."
GPT-3 can do this because it has seen this particular pop culture reference many times in its training text. So, its neural network can guess the "next word" with a high degree of statistical certainty.
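The loop described above can be sketched in a few lines of Python. This is only an illustration: the lookup table `NEXT_TOKEN` and the function `predict_next` are hypothetical stand-ins for the neural network, which in reality scores every entry in a large vocabulary rather than consulting a fixed table.

```python
# Hypothetical stand-in for the model: maps the last word seen to the next
# piece of text. A real model computes probabilities over a vocabulary.
NEXT_TOKEN = {"is": " forty", "forty": "-two", "two": "."}


def predict_next(text):
    """Return the next piece of text, or None when there is nothing to add."""
    last_word = text.replace("-", " ").split()[-1]
    return NEXT_TOKEN.get(last_word)


def generate(prompt, max_steps=10):
    """Append one predicted piece at a time and feed the result back in."""
    text = prompt
    for _ in range(max_steps):
        token = predict_next(text)
        if token is None:  # the toy model has no continuation: stop
            break
        text += token
    return text


PROMPT = "Answer to the Ultimate Question of Life, the Universe, and Everything is"
completed = generate(PROMPT)
print(completed)
```

Each pass through the loop corresponds to one "run the model again" step in the text: the output of the previous step becomes part of the next input.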
However, in natural language, predictions are not always so clear-cut. The word that follows an input often depends on the context. That is where GPT-3's strength as a few-shot learner shows. Few-shot learning means priming GPT-3 with a few examples and then asking it to make predictions. That allows the user to give the AI model a language context and dramatically improve accuracy. Figure 1 shows examples of zero-shot, one-shot, and few-shot learning to prime an AI model to generate foreign language translations.
Figure 1. Three types of learning for an AI translator. Image courtesy Language Models are Few-Shot Learners, Fig 2.1
Few-shot learning is remarkably similar to how human babies learn languages. The learner learns from language examples, not from grammar rules. As we shall see, by priming GPT-3 with different examples, developers can create very different applications.
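In practice, priming amounts to building the input text. The sketch below assembles zero-shot and few-shot translation prompts in the style of Figure 1. The function name `build_prompt` and the exact "source => target" layout are assumptions made for illustration, not a format required by GPT-3.

```python
def build_prompt(task, examples, query):
    """Assemble a prompt: task description, worked examples, then the query."""
    lines = [task]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    # The final line is left open so the model fills in the answer.
    lines.append(f"{query} =>")
    return "\n".join(lines)


TASK = "Translate English to French:"
EXAMPLES = [
    ("sea otter", "loutre de mer"),
    ("peppermint", "menthe poivrée"),
    ("plush giraffe", "girafe peluche"),
]

zero_shot = build_prompt(TASK, [], "cheese")            # no examples
one_shot = build_prompt(TASK, EXAMPLES[:1], "cheese")   # a single example
few_shot = build_prompt(TASK, EXAMPLES, "cheese")       # several examples

print(few_shot)
```

The three prompts differ only in how many worked examples precede the query. The model's weights stay the same in all three cases.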
Jay Alammar wrote a great article with visual animations to show how GPT-3 works. Check it out to learn more.
Does it pass the Turing test?
One of the first questions people ask about a language AI is whether it can pass the Turing test and fool humans into thinking that it is human. Well, some argue that GPT-3 can already fool humans. Figure 2 shows an essay generated by GPT-3. According to the GPT-3 team, fewer than 12% of human readers could tell that it was written by a machine.
Figure 2. An original article written by GPT-3. Image courtesy Language Models are Few-Shot Learners, Fig 3.14
With a little priming, GPT-3 can mimic the writing styles of famous people. The Learn from anyone project allows users to pick a famous person and then provide a topic. It primes GPT-3 with known writings of this person and then uses the topic text as input. It returns a 200-word essay generated by GPT-3. The results speak for themselves. One tweet shows how easily it imitates Elon Musk talking about rockets. Now imagine Thomas Jefferson commenting on Mars exploration! Or a generated Dr. Fauci quote about COVID-19 and mask-wearing. Can we trust anything on the Internet in the age of GPT-3? We will come back to this point later in this article.
Of course, besides celebrities, GPT-3 can also emulate anyone! Developer Ravi Vadrevu created a service to write a business email snippet for any user. The application primes GPT-3 using the user's past email writings. With a specific intent as input, such as recruitment, networking, or sales, GPT-3 writes an email on behalf of the user. Services like this bet on GPT-3 passing the Turing test.
Translators and lawyers
Some of the 499 billion words GPT-3 learned from are not English. By associating words with context, GPT-3 appears to be able to do translations (also see Figure 1). In the GPT-3 paper, the authors gave examples of how to prime GPT-3 to do English to French and Spanish translations.
The ability to understand English words in context brings interesting possibilities. For example, Revtheo is a GPT-3 based dictionary that gives users the meaning of a word based on its context.
But perhaps more interesting is for GPT-3 to do paragraph-based English to English "translation", that is, to rephrase a paragraph of English text to make it simpler or more rigorous. Legal tech entrepreneur Francis Jervis primed GPT-3 to "write like a lawyer" and gave it everyday English statements to translate into legalese. The results are quite promising. Obviously, it is difficult to take machine-generated legal language at face value, but even legal experts note that GPT-3 could be an assistant to attorneys and increase attorney productivity. On the flip side, investor Michael Tefula primed GPT-3 to translate complicated legalese to plain English. In both cases, only 2-3 examples were needed to prime GPT-3. The results are not perfect but quite close. Remember that GPT-3 is not trained on legalese. It was just primed to do so with a few simple examples.
Accountants and designers
One of the fascinating findings in the GPT-3 paper is the ability of the AI to "learn math" from language. The AI is never taught the underlying structure and theorems of math. Yet, it can generate the correct answers to math questions. For simple two-number additions, GPT-3 is almost 100% accurate even though it has never been taught what numbers mean. Figure 3 shows some examples from the GPT-3 paper.
Figure 3. GPT-3 does math. Image courtesy Language Models are Few-Shot Learners, Figs G.42 to G.48
Given this math capability and the fact that GPT-3 has seen a lot of structured data in its training, it seems possible to prime the AI to respond to English inputs with structured data output such as JSON or XML.
Developers Yash Dani & George Saad primed GPT-3 with eight examples to turn the English description of a transaction into a Python data object. Here is an example of their training data.
- Input: I bought an additional $1200 worth of inventory which I paid for immediately.
- Output: [["add", 1200, "Inventory"], ["remove", 1200, "Cash"]]
They then wrote a Python program to process this object and insert its content into an Excel spreadsheet. The result is an automated accountant who can update financial statements based on casual descriptions of transactions.
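To make the second step concrete, here is a minimal sketch of how a program might consume the output shown above. It is not the developers' actual code: it uses an in-memory dictionary of account balances in place of an Excel spreadsheet, and the function name `apply_transactions` and the starting balances are hypothetical.

```python
import json


def apply_transactions(balances, model_output):
    """Apply a list of [action, amount, account] entries to account balances."""
    entries = json.loads(model_output)  # the example output is valid JSON
    updated = dict(balances)            # leave the caller's dictionary untouched
    for action, amount, account in entries:
        if action == "add":
            delta = amount
        elif action == "remove":
            delta = -amount
        else:
            # Generated text is not guaranteed to follow the primed format.
            raise ValueError(f"Unknown action: {action!r}")
        updated[account] = updated.get(account, 0) + delta
    return updated


# Hypothetical starting balances, for illustration only.
balances = {"Cash": 5000, "Inventory": 0}
gpt3_output = '[["add", 1200, "Inventory"], ["remove", 1200, "Cash"]]'

updated = apply_transactions(balances, gpt3_output)
print(updated)
```

Rejecting unknown actions matters here because the model's output is generated text, so the downstream program cannot assume that it is always well formed.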
If we can use natural language to manipulate and edit Excel files, maybe we can do the same for PowerPoint? Sure enough, here is a demo of GPT-3 generating PowerPoint slides from Twitter user nutanc.
Developer Jordan Singer took a similar approach to build a Figma plugin. This plugin allows users to describe the user interface (UI) in English, use GPT-3 to generate a structured JSON representation of the UI, and then use a computer program to render the JSON content in Figma. Read more about Jordan's developer experience.
In these examples, GPT-3 outputs structured data, which is then processed by another computer program to complete the task. This seems to be a very promising modality of natural language AI applications.
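Because the downstream program depends on the structure of the generated text, it is prudent to validate that text before acting on it. The sketch below checks a JSON description of UI elements. The schema (a list of objects with `type` and `label` keys) and the function name `parse_ui_spec` are assumptions for illustration and do not reflect the format used by any of the projects above.

```python
import json

# Hypothetical schema: every UI element must carry these keys.
REQUIRED_KEYS = {"type", "label"}


def parse_ui_spec(model_output):
    """Return a list of UI element dicts, or None if the output is unusable."""
    try:
        spec = json.loads(model_output)
    except json.JSONDecodeError:
        return None  # truncated or malformed JSON
    if not isinstance(spec, list):
        return None
    for element in spec:
        if not isinstance(element, dict) or not REQUIRED_KEYS.issubset(element):
            return None
    return spec


good = parse_ui_spec('[{"type": "button", "label": "Sign up"}]')
truncated = parse_ui_spec('[{"type": "button", "label": "Sign up"')
wrong_shape = parse_ui_spec('{"type": "button"}')
print(good, truncated, wrong_shape)
```

When validation fails, an application could ask the model again or fall back to a manual workflow instead of rendering a broken result.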
The no-code programmer
If GPT-3 can generate structured data used by computer programs, maybe it could go one step further and generate computer programs directly. Is this possible? The answer appears to be yes!
- Generate LaTeX equations from English descriptions by Shreya Shankar.
- Generate SQL queries that work on actual databases by Faraaz Nishtar.
- Generate a 3D scene in JavaScript based on a description by Antonio Gomez.
- Generate AWS CLI commands for managing servers by Suhail CS.
- Generate a complete React application UI by Sharif Shameem.
Several developers told InfoQ.com that priming is critical for the performance of GPT-3 in generating structured output. The system needs to be primed with just the right examples. Like other deep neural networks, GPT-3 is mostly a black box to humans. That has made it challenging to come up with the correct examples to prime it for exact outputs. It is a trial and error process that could take days.
It requires some examples to catch the pattern and once that is done. It works like magic. - Tushar Khattar
Developing GPT-3 apps is not about writing algorithms in traditional programming languages, but about coming up with natural language examples to prime the AI. It requires a new type of no-code skill that will create new jobs in software development.
Human describes, AI builds, human debugs. - Vitalik Buterin, Creator of Ethereum
Machine-generated code could be a fascinating (and profitable) area of research going forward. We could even purposefully build AI systems that generate programming language outputs, such as Facebook Research's TransCoder project, which uses unsupervised learning to translate from one programming language (e.g., Python) to another (e.g., C++).
An AI we can understand
Despite the OpenAI name, GPT-3 is neither open source nor open access. OpenAI provides a simple web services API for developers to prime the model and then send in text to get a response. The API is simple, but there is currently a waiting list.
It has been amazing so far. Being able to see how quickly GPT-3 learns is incredible. The API couldn't be simpler, with one core endpoint you interact with. And the idea of text-in, text-out allows for you to do anything your mind can come up with. - Jordan Singer
The barrier to GPT-3 access is deliberate. As we have seen, GPT-3 is a powerful piece of software. However, since it is a black box, we cannot easily predict or control the text it generates. As we discussed, priming for the exact output is mostly a trial and error process. Given the amount of racism, misogyny, and other hateful content that exists on the Internet, and hence probably in GPT-3's 499-billion-word training data, an unsupervised GPT-3 could generate text that is biased or hurtful. For example, just think about the kind of convincing-sounding fake news articles GPT-3 could generate.
The developer community must use powerful AI systems responsibly. Doing so will probably require a deeper understanding of how language models work, as opposed to just putting up banned word lists.
People get mad when AIs do or say politically incorrect things. What if it's hard to prevent them from drawing such conclusions, and the easiest way to fix this is to teach them to hide what they think? That seems a scary skill to start teaching AIs. - Paul Graham on Twitter
While it is tough for humans to understand, much less to explain and control, the reasoning inside the AI black box, could the AI explain itself to us? Software developer Daniel Bigham asked GPT-3 to correct grammar mistakes in English sentences and to come up with reasons for the correction. The results are quite surprising, as GPT-3 has never been explicitly taught grammar rules. It derived understanding from examples. Here is an example. It is a tiny, yet tantalizing, step toward AI explainability!
- Input: Its time to get cracking!
- Output correction: It's time to get cracking!
- Output explanation: The contraction "it's" should be used instead of the possessive "its".
We anticipate developers will keep pushing boundaries on what's possible for GPT-3 to create and explain!
A black box
While GPT-3 has shown great promise, it still exhibits some issues that have long plagued neural network AIs. Specifically, it does some tasks amazingly well while failing miserably on other common-sense tasks. Does the AI really understand the tasks given to it?
At the philosophical level, it might not matter. After all, the AI can do math, translation, and grammar checks. Does it matter that the AI was never taught the concepts of math and grammar? GPT-3 was able to derive math and grammar rules and apply them. But for developers who build GPT-3 applications, it is troubling not to know the boundary of the AI’s "knowledge" and to have to watch out for cases the AI cannot handle.
This article showcased some impressive examples of GPT-3 applications. But developers also told InfoQ about the need for carefully "priming" the system in order to generate those results. So, in a sense, those results are cherry-picked evidence. Worse still, since GPT-3 is a black box, it is hard for developers to understand why certain priming works while other approaches fail. As mentioned in the previous section, the lack of explainability is probably one of the most important limitations to overcome before GPT-3 can be widely adopted.
An AI for everything
GPT-3 demonstrates that AI performance increases with the model size in a power-law relationship. The ever-growing model size will produce more powerful and more accurate AI. Could this be the Moore's law of our time?
Deep learning pioneer Dr. Geoffrey Hinton extrapolated from GPT-3 and joked that an AI that can answer the ultimate question of the universe will need not 42 but 4.398 trillion parameters. That is only about 25x the size of GPT-3.
Extrapolating the spectacular performance of GPT3 into the future suggests that the answer to life, the universe, and everything is just 4.398 trillion parameters. - Geoffrey Hinton on Twitter
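The arithmetic behind the comparison is easy to check. The short sketch below divides the figure in the quote by GPT-3's 175 billion parameters. It also notes that 4.398 trillion is very close to 2 to the power of 42. Reading that as the source of the number is our interpretation, not something stated in the tweet.

```python
GPT3_PARAMETERS = 175e9      # 175 billion, from the GPT-3 paper
HINTON_ESTIMATE = 4.398e12   # 4.398 trillion, from the quote above

# How many times larger than GPT-3 would such a model be?
ratio = HINTON_ESTIMATE / GPT3_PARAMETERS
print(f"{ratio:.1f}x the size of GPT-3")

# 2**42 is 4,398,046,511,104, which rounds to 4.398 trillion.
power_of_two = 2 ** 42
relative_gap = abs(power_of_two - HINTON_ESTIMATE) / HINTON_ESTIMATE
print(power_of_two, relative_gap)
```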
Now recall the hypothetical question we asked at the beginning of this article. The answer 4.398 trillion is now on the Internet and will be part of the training data for GPT-4. How would GPT-4 answer the ultimate question of the universe then?
Disclaimer: This article is written by Vivian Hu, a human being. It is not written by GPT-3.
About the Author
Vivian Hu is an open source enthusiast and developer advocate in Asia. She is a product manager at Second State. She cares deeply about improving developer experience and productivity through better tools, documentation, and tutorials. Vivian writes a weekly newsletter for WebAssembly, Rust, and serverless at WebAssembly Today.