Transcript
Morgan: I'm going to talk to you today about building AGI. So not just AGI as science fiction but actually what it might take to build it. If I do nothing else, I hope I'll spark your imagination a little bit. Basically the first question we have to ask is, what is intelligence? Who here has actually asked themselves, what is intelligence? It's a very, very deep question. Yes, a few. Well, that's good. So we're starting to think about it as a society, I guess.
And then I'll move on and we'll talk about the hardware basically, looking to biology for inspiration because that's our existence proof. It's us. We're intelligent, supposedly, some of us. And then we'll say, "Well, how come we haven't built them yet? If all it takes is three pounds of wet matter in our skull, what's so clever about nature? What are we missing? How far along that road are we?" And that's really the crux of my talk.
Who here has heard of deep learning? Hopefully, all the hands go up. This is what people think of as AI today, and I'm going to show you it's not AI at all but it's narrow. But that's okay. Maybe some people are claiming it might get us there, but we'll have a look at what we're all doing. Maybe at work, we're using TensorFlow, that kind of stuff. Will that get us there? Is that part of the journey? Is it nothing to do with the journey? Then finally the heart of the talk will look at AGI developments, what's being done, what's being built, who's out there doing it, and how far along the road are we actually. Have Google built AGI, have DeepMind, have I, who's doing it and where are we?
The TLDR - too long, didn't read it. I basically believe that we can build AGI systems in the next few years. That's a bold claim so I'm going to have to spend the rest of my 40 minutes backing that up, and we're going to use a particular theory to do that. Why do we want to build AGI in the first place? Who the hell would want to build the Terminator that's just going to turn around and kill us? Why would we want to do that? Well, that's one scenario. That's the gloomy one. That's a dystopic one. But basically, humans are driven by curiosity. Even if we said it's going to kill us, somebody in their bedroom would keep doing it, because we have this insatiable drive, it's called curiosity.
But also, we're always pushing ahead. We're still here. We haven't wiped ourselves out, so there's some sort of innate good in us. The reason is if we build these intelligent systems like ourselves, generally intelligent, then we can use them to solve cancer, to accelerate the pace of discovery in science, cure Alzheimer's. Then if we get the algorithms right, you can see quite quickly how we will scale, just fill data centers full of these general intelligence algorithms on the right kind of hardware. And suddenly we've got super intelligence. It sounds a bit "Star Trek", but this is kind of the road we're on. I'm not sure if anyone's thought about that or maybe just watched a dystopic movie. They all seem pretty dystopic. But this is really the road we're on. It's the reason DeepMind was set up in the first place. They have about 800 people now. So yes, we're sort of going for it. It's sort of unstoppable, if you like.
Who here thinks that AlphaGo, speaking of DeepMind, is general intelligence? Not many. I won't even bother asking you the rest. We've got all these amazing achievements, AlphaGo, AlphaStar at StarCraft, Deep Blue at chess, IBM Watson. They're superhuman. They're exceeding human level intelligence already. Unfortunately, they're all narrow AI. So as incredible as these things are and they're well televised and they're big deals, they're just narrow AI. Look at the algorithms they're using: if we take the algorithm that beat Go, we can't turn around and use it to play StarCraft.
What is Intelligence?
What are we missing? What we're missing is the big picture, which is what is intelligence? I'm going to answer that question for you now. It's not really a mystery if you think about it. There are nine types of intelligence. This guy, Howard Gardner at Harvard, I think in the '90s or '80s, has made it his whole life's career. He's written about seven books, you can buy them on Amazon, and he's come up with this definition I quite like. Basically, he categorized intelligence into nine types. What we see with AlphaGo and Deep Blue, the chess, the Go, the StarCraft, and the Jeopardy, are all examples of logical/mathematical reasoning. Jeopardy, you could argue, is language as well, but that's only two parts of intelligence.
This thing up here does a lot more than that. We have spatial, we can navigate. I'll just go through them quickly, because this is important to bear in mind for the whole talk really. Bodily, we move around, can we build robots like us? Not yet. How does that work? We have linguistic, we have very creative writing, reading, writing, that type of stuff, Shakespeare. Interpersonal, I think we've got that one cracked. So, how we relate to one another, how we relate to animals, all that kind of stuff, the sort of between agent to agent communication. Musical, creating great symphonies, operas, rock music, whatever type of music we like. We can create that. How does that work? Is that the same as language? Is that the same as math? Or is it something completely different going on? Or is it the same?
These are the questions we have to ask and solve and then build. This may be the first time you're thinking about it at this level, but that's good. Because this is what people are doing who are working in this field. Existential, why are we here? I believe any researcher even from DeepMind if they came along, or Google or Amazon or whoever, Microsoft, IBM, who told me, "You know what? I've built a system that asked why it's here." That hasn't happened yet as far as I know. So, what's going on there? What part of the brain is doing that? What is that? And why? Why can we do that?
Have I missed any out? I'm just having a quick look. Yes, so that's it. This is all we have to do. Small problem. But maybe there's something in common about all those nine things to unify them together. Maybe there's one theory. We don't know yet. But by the end of the talk maybe I'll have answered that question as well.
How far have we come? Well, I've talked a little bit about that. I put these figures down, sort of put my finger in the air, waved it around a little bit and said, "The logical/mathematical, I think we're about 50% there." Because when it comes to calculations, computational systems can easily out-compute us, so they can beat us at Go, they can beat us at chess, the world's best. No doubt about it. But they're not that creative. Can they come up with a new game? No. The answer is no. I mean, tops, I think, out of all those nine, we're about 50%, and that goes down to 0% with the existential bit I just talked about. We haven't even started on that journey. So clearly, we're missing things. If you go through the list, that's kind of where we are.
How are we going to get there? It takes a village to raise a child. Hopefully, we've all heard of that. It takes a village to create an AGI. So we're going to need computer scientists, software engineers, like you guys in this room, but we're also going to need physics, we're going to need neuroscience, we're going to need psychology. We need the whole shebang, because we are intelligent. How do we understand ourselves? We have psychologists. How do we understand what's going on in the brain? We have neuroscientists. How do we understand how the neurons work and fire together? We need physicists. And we need computer scientists to build the things. This isn't one tiny, little group of people, a bunch of physicists, neuroscientists, computer scientists. This is everybody in the room talking together. It's a big problem.
It's basically the challenge, isn't it, of the 21st century or of the human race, in fact, I think. Obviously, worth a lot of money. If you do solve this and start building it and you build super intelligent systems, it's kind of like, that's the GDP of the world, because we replaced ourselves. That's kind of game over in a sense, in a nice sense. What do we do? Well, we have to be creative and think, "What do we do? Who are we? What do we want to do to fulfill ourselves?" That's a question for another talk, but that's what we'll be thinking about in the years to come.
Intelligence in Physical Systems: Biological
Where are we with the hardware, the feet back on the ground, nuts, and bolts? We're going to look at biology for inspiration, because that's what we've got, and then we'll go on to the non-biological hardware. But what's biology done? Biology has been very clever. It's come up. It's had a little bit of time. It had a headstart on us 4.5 billion years. One could argue we could go back to the Big Bang and trace the evolution. But, amoeba, I think, were about a million years ago, single-celled organisms that could wiggle around and move. I'm arguing they're intelligent because intelligent agents survive and reproduce. These things can do it. They might not be able to come up with the general theory of relativity, they haven't got a neocortex, they've only got one cell, they haven't even got a central nervous system, but they're still intelligent at some level. There's intelligence here.
The first creature or system with a central nervous system is C. elegans, 302 neurons, and that's the simplest system we have. We have actually recreated that in a lab. We've got 300 neurons. The human brain has like 100 billion, so we're not quite there, but the point is, the principles might be similar. So we've passed that landmark, only a few years ago, but we have passed it. The bumblebees have got about a million neurons and we've got, like I say, 100 billion.
What can we learn from nature? The first thing we learn is that nature is hierarchical. We start off as molecules, neurons, collections of neurons. It's a hierarchical scale. We've got about seven layers here up to the central nervous system. The Human Brain Project, I think, they look at six layers of a hierarchy and they have different teams at each level, and the trick is to integrate all these levels together into a system.
With computer architecture, as we talked about distributed systems here, maybe the CNS is like all the Amazon data centers connected together, but within each data center you've got clusters, within each cluster, you've got server racks, within each server, you've got CPU, GPU, ASICs. And within them you've got loads of sub-processors, like the Intel has many, 32, whatever it is, processors on a chip. Nvidia has something like 1,000, 5,000, 8,000 processors on a chip. Again, it's hierarchical, similar thing but not generally intelligent.
What are we missing there? I pretty much covered the next few slides, I think, with what I just said. So it's hierarchical, and if you look at the individual neurons, they're super complex. Do we have to go down to that level? We're not quite sure yet. Nature has done some pretty tricky, clever things, but we don't know if we really need to go down to that level. We'll see.
Next level up. This is a level we probably will have to get right, is the neurons themselves. We have sensory neurons that detect information coming in because after all, I could say "all we are," in quotation, we are just information processing systems. We process information just like any computer we've ever built, any processor, von Neumann architecture, non-von Neumann architecture with the memory and the CPU separated or together. These are information processing systems. My iPhone's an information processing system. Similarly our brain, and similarly the brain of the bumblebee, the rat, these are all just information processing systems.
So how do we build a generally intelligent information processing system? Well, here's how nature does it. It has neurons. Essentially, they basically process information coming in, whether that's visible light, sound waves on our eardrums, vision, light on our retinas, taste and smell, chemicals on our tongue and on our nose, the olfactory system, and touch, the sense of touch. So that information is processed by some sort of interneuron and then output into action, muscular action, and that's the bodily part.
That's all we have to do, really. It sounds simple, doesn't it? So how does nature do it? Can we do it? What are we missing? Because clearly, we're not there. The other thing to notice about neurons is that there are about a thousand different types. Then also, there's some substructure here. They seem to repeat about 2 million times, these cortical columns. We've come a long way with neuroscience measurements, fMRI scans, all the different types of scans, EEG, MEG, fMRI. So a lot of this, like any science, even the semiconductor transistor, all the physics, chemistry, biology, it's observation. Where do we start? We use the scientific method, we look with our eyes. We measure stuff.
We're a little bit dependent on our measurement resolution becoming better and better, so we can see the neurons, what they're actually doing. We're down so we can see single neurons firing and what they're doing. We can see thoughts. We can see people dreaming. We can put them in CAT scans. We're getting down to the single neuron level, but we still can't build a generally intelligent system, or no one's done it. We are missing the theory, you might say. The observations are getting to the point where we can trace individual thoughts, but we also need a theory, and eventually, to build and to guide.
Where are we there? We'll look at that at the very end of the talk. Final scale, a connectome, and then after that, the central nervous system, the whole body. Society's social systems are hierarchical, as well. If we're looking at computation, building it in silicon, we're thinking about swarm robotics, cloud robotics, but that's the last layer of the hierarchy.
Intelligence in Physical Systems: Non-Biological Hardware
Let's see. We've covered biology, let's look at digital. Moving right along, there's a timeline. As software engineers, computer scientists, I'm sure we're kind of familiar with the timeline so I won't go into too much detail. Computation isn't a new idea. It's not just a hundred years old, it's about 5,000 years old, from when we started building abacuses to compute stuff, to help us count grain, and trade, whatever we used to do 5,000 years ago. The abacus, funnily enough, was in all different regions of the world. It didn't just happen in one place and spread. This was at about the same time, over centuries, in China, in Africa, and the Middle East. We all were building computation around the same millennium.
This is part of the track that I talked about, we're just on this journey. It's like this force that is bigger than us. Curiosity. We need to compute, we need to understand nature as a world, we need to build things that help us do that. So we've come from there. Charles Babbage, "The Difference Engine." You can go and see it at the London Science Museum. "The Analytical Engine," Ada Lovelace, supposedly one of the first programmers. This was around 1830s-ish. These dates are approximate. Vacuum tubes around 1900s. Alan Turing laid down the theory of computation basically in the '30s. Alonzo Church was at Princeton, Turing went over to Princeton, and they worked together, hence the Church-Turing thesis. So that work's kind of been done, the computational theoretical work. Von Neumann also during the war. And after that, ENIAC, the first big electronic computer, really. Some might consider that the start of the computer age. What's that, about 70 years ago?
So we haven't been doing it long, from 5,000 to 70. We've only been doing this 70 years. So maybe we shouldn't be too hard on ourselves that we haven't solved this problem of general intelligence in 70 years. We've come a long way. We've got the ENIAC and the iPhone times a billion or something. So we've done well in that respect. We've done well at narrow AI, the transistor, Bell Labs, a Nobel Prize for that. Intel, '68, that's only 50 years ago. It's not long at all, right? A half-century or so. 1990, that's very recent history, less than 30 years. Nvidia, and ASICs, Google TPU which is used for deep learning, machine learning, they just built that a few years ago.
That's bringing us right up to date. We still haven't built AGI though, so what are we missing? Again, we've just got different types of processors for different types of information processing. Word processing, the CPUs are good at. GPUs are good for graphics and matrix multiplication, and it just so happens that deep learning and these machine learning algorithms are basically just multiplying massive matrices. GPUs are exceptionally good there. But it's not general intelligence. That's what we use to play Go, and also Google's TPU. Just checking on the time. Google's TPU, that is just narrow AI, but the point here is that we build the processor for the actual use case. I think I have to speed up a little bit.
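To make the "multiplying massive matrices" point concrete, here is a minimal sketch of a single dense neural-network layer written as a plain matrix multiply. It is an illustration only: the function names `matmul` and `dense_layer`, the tiny shapes, and the ReLU nonlinearity are assumptions chosen for the example, not anything shown in the talk.

```python
# Toy sketch: a dense layer is a matrix multiply plus a simple nonlinearity.
# All names and numbers here are hypothetical, chosen for illustration.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def dense_layer(inputs, weights):
    """Apply the weights, then ReLU (clip negative values to zero)."""
    return [[max(0.0, v) for v in row] for row in matmul(inputs, weights)]

# A batch of two input examples with three features each.
inputs = [[1.0, 2.0, 3.0],
          [0.0, -1.0, 1.0]]
# Weights mapping three features to two outputs.
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [-1.0, 1.0]]

outputs = dense_layer(inputs, weights)
print(outputs)
```

Real networks do the same thing with matrices that have thousands of rows and columns, which is the workload GPUs and TPUs are built to run in parallel.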
This is a bit of a whistle-stop historic journey. There's the Cray 40 years ago, 1976, 160 megaflops. Today, state of the art, the i9 series, the TPU, as I mentioned, the Graphcore IPU, which is basically just a matrix multiplier, and the Nvidia V100. So this is all digital technology. Then there's state of the art, 100 petaflops, supercomputer. Those are just 8 racks of TPUs. That's where we've come, state of the art. It won't give us general intelligence, but it's really good at playing Go, and other things, StarCraft, that type of stuff. Exceptionally good, superhuman level, but it doesn't ask itself, "Why am I here? Why am I sitting in this data center? Why am I a Google TPU core? Why? Who made me?"
The state of the art is three exaflops, so that's where we're up to, way more than what the brain does, which is about a petaflop. We've gone to 1,000x that, so more horsepower, that brute computation, isn't going to get us there either. What are we missing? The brain's about three pounds, about a petaflop, about 30 watts. Like a light bulb. That thing there, that's 30 million watts. That's a million x. That's a million light bulbs. We do all of that - well, we don't do all that. We do different stuff on 30 watts, so it's very energy-efficient and we do all these other things too.
So, perhaps we should be looking at biology, looking at bioplausible architectures. It seems obvious now I'm saying it. But maybe not so, if you haven't really thought about it so much.
Introducing neuromorphic computing. Now this is a type of computing that is bio-inspired and it's available on the cloud. It's called the SpiNNaker Project, started by Steve Furber 20 years ago when he left ARM. He's one of the co-founders of ARM, based at the University of Manchester. These SpiNNaker systems are based up in Manchester. They've been folded into the Human Brain Project. I suggest you log in. You go home during the break and log in and set up an account and use them. So they exist. They're not these weird, far off in the future things. They've been available to the public for about three years now. You have to present a case, whether it's an academic case or a business case, but they want people to use them. They want to test the hardware, because they've spent 20 years building it. They want people to use it.
Now, these things are literally a thousand times more energy efficient. So they use 30,000 watts instead of 30 million, and they're definitely bio-inspired. Instead of having a von Neumann architecture, where you have the compute and the memory apart, you have them together just like the brain does. The neurons are both compute systems and memory systems together.
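The power comparison can be checked with a few lines of arithmetic. The sketch below uses only the round figures quoted in the talk (30 watts, 30,000 watts, 30 million watts); the constant and variable names are made up for illustration, and the figures are the speaker's order-of-magnitude estimates rather than measured values.

```python
# Back-of-the-envelope power ratios using the round figures quoted in the talk.
# Constant names are hypothetical; values are the speaker's rough estimates.
BRAIN_WATTS = 30                   # "about 30 watt", like a light bulb
SUPERCOMPUTER_WATTS = 30_000_000   # "that's 30 million"
NEUROMORPHIC_WATTS = 30_000        # "30,000 watts instead of 30 million"

super_vs_brain = SUPERCOMPUTER_WATTS / BRAIN_WATTS
super_vs_neuro = SUPERCOMPUTER_WATTS / NEUROMORPHIC_WATTS
neuro_vs_brain = NEUROMORPHIC_WATTS / BRAIN_WATTS

print(f"supercomputer vs brain:        {super_vs_brain:,.0f}x")
print(f"supercomputer vs neuromorphic: {super_vs_neuro:,.0f}x")
print(f"neuromorphic vs brain:         {neuro_vs_brain:,.0f}x")
```

On these numbers the neuromorphic machine closes three of the six orders of magnitude between the supercomputer and the brain.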
This might get us to general intelligence if we put the right algorithms on them. So that's what they look like. It's five cabinets in a data center in Manchester. They're real. That's what they look like. They've been built. They've been processed. They've been put off to fabs, built, brought back to Manchester, put together, they work. We're up to about a billion neurons which is about a mouse brain. Not quite human level, but they can do things that mice do. They can navigate through mazes, they can play Sudoku like we do. They do crossword puzzles. They do things that AlphaGo cannot do and never can do and never will do. So this architecture is very important.
That's how you scale them. There's the BrainScaleS effort, that's a German effort. I forget which university they are, it might be Munich, but there are several of these. IBM have one called TrueNorth, you may have heard of that, and there's a rack of Google TPUs. One's really good at playing Go, the other one can do a lot of different things like humans do. And it's about a thousand times more energy-efficient. So yes, the hardware is important.
What about quantum computing? Probably not. The brain is wet, and mushy, and warm, and it's at room temperature. That's there, so we'll skip to that. Impressive. I'd love to spend the rest of my talk on them. I am giving a talk tomorrow on quantum computing, by the way. Come and see it at 3:00. IBM, 50 qubits, Google, Bristlecone, 72 qubits. That's what a quantum circuit looks like. It looks a little bit like a classical one but it uses the laws of quantum mechanics.
Probably not going to be needed, although, quantum intelligence, what does that even mean? The point really is that, in summary, there’s not just one stack that's there. There are four. There's neuromorphic, there's classical, which is digital, there's quantum, and there's biological. So different stacks, different types of processors, which ones are we going to need, which ones won't we need? That's what they look like. There's a 7 qubit quantum computer processor from IBM. That's what biology looks like. That's what neuromorphic looks like at the microscopic level. That's what digital looks like.
There's three stacks and that's what the data center of the future will look like. It'll be filled with quantum computers. You can log into a quantum computer today at IBM. Go home and log in. You can log into neuromorphic computers. In 5 or 10 years, the data centers won't just be full of racks of CPU, GPU, TPUs. They'll be full of neuromorphics, full of quantum chips.
AGI Overview
Deep learning, I'm going to argue we don't need it, so we'll skip through these slides in the interest of time. The Fourth Industrial Revolution which we're in, very exciting times, will be open source. Which is great. That's accelerating progress too. Fantastic. And so AGI, what is it? What do we need? I mentioned active inference. What the hell is that? Are we building it yet? Have we gone beyond the SpiNNaker neuromorphic? Where are we on that journey?
I am going to argue, right here and now, on this stage in London, today, that we need physics. Machine learning, or whatever statistical stuff, isn't going to get us anywhere. It's just statistical. AlphaGo. Fantastic, very clever, I'm not undermining it or taking anything away, but it's not based on physics at all. Zero physics. It's just statistical. It's very damn clever. Superhuman level but only narrow. And that's why - it's just using statistics on a very narrow data set. We're going to need physics. What is physics? I did a PhD in physics and worked at postdoc level, so I know what physics is. And what physics is, is that we use something called the principle of least action and that will be the last part of my talk. By minimizing this thing, it's just, here's what it is. It's the integral of a Lagrangian. Who's heard of a Lagrangian? So it's basically that.
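For readers who haven't met it, the standard textbook statement of the principle of least action is sketched below. The slide itself is not reproduced in this transcript, so the symbols (the action $S$, the Lagrangian $L$, a generalized coordinate $q$, and the end times $t_1$, $t_2$) are the conventional ones, assumed here rather than taken from the talk.

```latex
S[q] = \int_{t_1}^{t_2} L\bigl(q(t), \dot{q}(t), t\bigr)\, dt,
\qquad
\delta S = 0
\;\Longrightarrow\;
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0 .
```

As a familiar example, a particle of mass $m$ in a potential $V(q)$ has $L = \tfrac{1}{2} m \dot{q}^2 - V(q)$, and the condition above reduces to Newton's law, $m\ddot{q} = -V'(q)$.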
Now, the tricky part is to come up with the right Lagrangian. We basically want the Lagrangian of the brain. Now, what the heck does that look like? Well, that's where we have to turn to neuroscience. Remember when I said, "It takes a village?" Well, we have to ask the neuroscientists how does the brain work. And so this is exactly what we do. So the pinnacle of human achievement, we've written down the laws of physics for everything, except for dark matter, which came along unexpectedly. What the hell is that? But there it is. That's the whole history of physics in one equation. That long pinnacle of, I would argue, the pinnacle of human achievement.
How do we apply that to the brain? Well, there are nine types of intelligence. So we need a system that can explain and understand the world, it can imagine things, it can problem-solve, and plan actions, and it can build models of its environment. That's what intelligence is, true intelligence. Something that can model its environment, not just a game of Go, but anything. Climb a mountain, and understand that, get in the car and understand that, go shopping, come to a conference, write some code, write a book, draw a picture. It needs to understand its environment. Do all those things.
We're going to have to look at neuroscience, we're going to have to get the theory from there, the observation from there. We're probably going to have to use neuromorphic processing, which we have, and we're waiting for them to scale. There are a bunch of theoretical approaches, so these are all super clever groups and super clever people, and they've all been spending at least 30 years on this stuff, Bialek's at Princeton, Friston's at UCL up the road, Tishby's in Israel, Schmidhuber's in Switzerland, same as Hutter but he went to Australia. So all these super intelligent groups. It's not just DeepMind working on this, it is a whole bunch of people. It's extraordinary. For me, it's the most interesting question you can ask. It honestly is. The field's attracted a lot of very bright people already.
Active Inference
Professor Friston up the road at UCL has come up with something called active inference, and he uses physics and information theory as a starting point. See, that's why I'm choosing his theory in particular, but you can look at other ones. You can go look at Schmidhuber, Bialek, you can Google that on the internet. They're all fascinating, what work they're doing. But I think Professor Friston has probably got the right theory in my opinion.
It's called active inference. It gets quite technical, and I've only got a few minutes left, but basically the physics is entropy, free energy, action, perception, that's how we set it up. We have our internal states, which are inside the information processing system, whether that's a chip or a neuromorphic processor or our brain, the biological brain. We have the external world, which is what we call the external state. So we have what's inside us doing the processing, and the information coming in from the external world. If anyone's ever done reinforcement learning, it's very similar, agent action in the world, except reinforcement learning's based on statistics. This is based on physics. In the middle, we have the Markov blanket, and that separates the internal states from the external states.
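The internal state, external state, and Markov blanket picture can be sketched as a toy perception-action loop. This is not Professor Friston's formulation, only a heavily simplified illustration: it assumes a single scalar external state, a noiseless sensor, Gaussian beliefs with unit variance (so free energy reduces to a sum of squared prediction errors), and plain gradient descent. In the active inference literature the blanket is made up of sensory and active states, which is how the comments label them. All function and variable names are hypothetical.

```python
# Toy perception-action loop, loosely in the spirit of active inference.
# Assumptions (not from the talk): scalar states, noiseless sensing,
# unit-variance Gaussian beliefs, simple gradient descent.

def free_energy(s, mu, prior):
    """Squared prediction errors: sensation vs belief, belief vs prior."""
    return 0.5 * (s - mu) ** 2 + 0.5 * (mu - prior) ** 2

def run(x=5.0, mu=0.0, prior=1.0, rate=0.1, steps=500):
    """x: external state, mu: internal state, prior: the state the agent expects."""
    history = []
    for _ in range(steps):
        s = x                              # sensory state (blanket): world -> agent
        history.append(free_energy(s, mu, prior))
        d_mu = -(s - mu) + (mu - prior)    # gradient of free energy w.r.t. mu
        d_s = s - mu                       # gradient of free energy w.r.t. s
        mu -= rate * d_mu                  # perception: revise the internal state
        a = -rate * d_s                    # active state (blanket): agent -> world
        x += a                             # action changes the external state
    return x, mu, history

final_x, final_mu, history = run()
print(f"external state: {final_x:.4f}, internal state: {final_mu:.4f}")
print(f"free energy: {history[0]:.2f} -> {history[-1]:.2e}")
```

Both perception and action push the same quantity downhill: the belief moves toward what is sensed, and the world is moved toward what is expected, so the external state and the internal state both settle at the prior.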
The math gets ugly real quick, as you might expect. The math does get ugly real quick. Who here has read a machine learning, deep learning, or reinforcement learning paper? Yes, the math gets like that. It looks like that, except that's based on statistics. This has got physics terms in it, it's got free energy, it's got entropy, all that good stuff. You drill down, you unpack that, and you get loads of integrals and yes, statistical functions, and stochastic differential equations, and all that good stuff. But if you look at the theory of general relativity, on the surface it's like one beautiful equation, but if you dig down, it gets messy real quick. So this has to be expected. You can't just expect someone to walk off the street and go, "Yes, here's how the brain works and here's the equations." It's hard.
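The slide's equations are not reproduced in this transcript. As a hedged pointer to where the "free energy" and "entropy" terms come from, a standard way of writing variational free energy is given below. The notation is an assumption: $x$ stands for external states, $s$ for sensory states, $q(x)$ for the belief encoded by the internal states, and $p$ for the system's generative model.

```latex
F[q, s]
= \mathbb{E}_{q(x)}\bigl[\ln q(x) - \ln p(s, x)\bigr]
= \underbrace{\mathbb{E}_{q(x)}\bigl[-\ln p(s, x)\bigr]}_{\text{energy}}
  - \underbrace{H[q]}_{\text{entropy}}
= D_{\mathrm{KL}}\bigl[q(x) \,\|\, p(x \mid s)\bigr] - \ln p(s)
\;\ge\; -\ln p(s).
```

Because the KL divergence is never negative, minimizing $F$ both pulls the belief $q(x)$ toward the true posterior $p(x \mid s)$ and bounds the surprise $-\ln p(s)$ of the sensations.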
Let's see. This is it, applied to the brain, that's the last equation of the day, I promise. But the big point here is that it works from cells to the brain. You have internal state of a cell, external states, Markov blanket separating the two. Internal states of the brain, external states, Markov blanket separating the two. So it's very general. It works at all scales, all time scales, all length scales, societal scales. It's a very general theory. It's like the general theory of relativity. It works on all scales, all time scales, all distance scales.
Building AGI
So can we build AGI? Yes. We have the theories, we have all these very smart people, we have great theories, that one in particular. We have algorithms and software, we have a lot of math. You saw the math, there's code to go along with that math. We have hardware, neuromorphics and ASICs, and we have data sets. We have the internet, we have all the data we need. So we're getting there. We're not that far away. This is the reason for my talk and probably why Martin invited me. It's that we're not that far away, not as far away as you might think. It's a process, it's a procedure, but I believe we have the key elements, the key understanding to do it.
When? Let's see. Well, I think we're at the beginning now. There are a whole bunch of projects. There's some, you can go and Google them. So, in conclusion - there are a few more slides, but almost there. It's obvious to most that deep learning is lacking the foundations needed for a general theory of intelligence. So don't let anyone tell you that deep learning is any part of intelligence, it's not. It's statistics - it's based on statistics, not physics. Active inference is such a theory of general intelligence. I'm getting you to sort of start thinking through this process. You choose your favorite theory but it's the same process.
Deep learning research groups are now finally turning to biology for inspiration. Bioplausible models are starting to appear. Some people were working on this 30 years ago, but for the deep learning community in general, the S-curve's flattening. It's like the mileage just ran out, basically. So what do we do now? This huge community, you go to NIPS, there's 10,000, 20,000 at this conference where there used to be a few hundred. It's like, "We're going to have to look for something else." The community is starting to think along these terms, bioplausible systems, and so that's where we are.
The TLDR again. We're going to build them soon. I'll just finish off with Geoff Hinton who is the godfather of deep learning, has been doing this for 30 years. He used to work with Karl Friston at UCL, then he went to the U.S., did a postdoc there, then he helped set up the Gatsby lab where DeepMind came from, along with Friston and a few others. Then he ended up in Canada. Let's see what he has to say.
Geoffrey Hinton: So what's kind of interesting at present is that it looks as if we might be in a phase of normal science where business as usual is going to make a whole lot of progress, you know? Assuming the computer industry can keep producing better hardware and keep doing computations, burning less energy. If they can keep doing that with Moore's law for another 10 years or 20 years, I think business as usual is going to take us a huge way. Obviously, if we get big breakthroughs, big conceptual breakthroughs that'll take us further.
I think one of the big breakthroughs that's going to come is we're going to understand the brain. This is just a bet. My personal belief is we're going to get close enough to how the brain really does do these things, that suddenly it all begins to click and we kind of fall into a minimum where it's just obvious how the brain is actually doing this stuff. I think we might be quite close to that, and that will be another revolution because that would affect all sorts of things like education, and our sense of who we are. That would be very exciting. I'm hoping I get to see that.
Morgan: Great. He said this back in 2016, that's three years ago. So yes, we were close back then, we're closer now, trust me. I wrote a report for O'Reilly, there it is. I am trying to commercialize this stuff with Professor Friston. There's my disclaimer. I have a company called Turing AI. And come and see me at the AMA at 10:35. We can talk more about this very fascinating subject. Thank you very much.