Restoring Confidence in Microservices: Tracing That's More Than Traces



Ben Sigelman talks about rethinking distributed tracing in terms of the most vital organizational problems that microservices introduced, and makes the case that distributed tracing should be about much more than distributed traces.


Ben Sigelman is the CEO and co-founder of LightStep, co-creator of Dapper (Google’s distributed tracing tool that helps developers make sense of their large-scale distributed systems), and co-creator of the open-source OpenTracing API standard (a project within the CNCF).

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.


Sigelman: I'm pleased to be here. I have to say, I feel somewhat frustrated at this conference because there's so much stuff I want to see. There's a person who's much smarter than me doing quantum computing. If we can find a way to have my particles spread out into five rooms at a time, then possibly I could consume all the content that I want during this conference, but I'll have to watch some on the way home.

I'm here to talk about confidence, microservices, tracing, and traces. I'll start off by talking about confidence. So that's interesting. Let me make that go away. This is a picture of a maniac. This is all Alex Honnold. Some of you may have watched this film. I'm afraid of heights, so this literally makes me nervous to even look at this image. But as you can see, he's very confident. He's able to scale these giant walls without ropes or equipment beyond his body and some modest clothing. It turns out part of the reason he is so confident is that his amygdala literally doesn't work correctly. He has been put in fMRIs and it doesn't work. It doesn't have the normal blood flow. He's a very good rock climber. He's also incapable of feeling fear, which I think is helpful if you want to feel confident. But let's just say he's very skilled. He's special. He's very special, and I admire him.

One way to achieve confidence is to practice rock climbing a lot, remove your amygdala, and then climb up giant walls and don't fall. One of the interesting things about rock climbing as well is that if you do become nervous, the natural inclination is to pull yourself closer to the wall, which actually reduces your leverage and then you fall. So, you really don't want to get scared when you're up there. Another way to achieve confidence is to use tools. So you combine some level of skill with tooling, and this woman is using a rope, crampons, and ice axes.

This seems very pragmatic to me. I would not attempt anything like this, with or without this equipment, but if I were going to do it, I would definitely want the axes and the ropes. I'm a big tool guy when it comes to feeling confident, and I think in general in this industry, and I will eventually talk about software very soon, we celebrate genius a little bit too much. If we're relying on people to be brilliant all the time, especially in an emergency, we're in for a rough ride. I think it's important to feel confident both through our abilities and our training, and also through the tools that we have.

There are many tools that we use of course. These are things that make me feel confident when I'm a programmer. I haven't been a programmer for longer than I'd like to admit. It's a very sad thing about my job, but I remember looking at green tests and feeling confident. When I see this screen, I literally feel confident. I added a green background to make it even more confidence inspiring, but I think you understand what I'm saying. Another thing that makes me feel confident - so this is when I'm writing software, when I was writing software. If I'm operating software, here's a graph on a dashboard showing everything looking very healthy. That also inspires confidence in me. I'm curious, does this make people literally feel confident? It feels good. It's like, okay, things are working, it's great. It's confident. So that's confidence in software.

Leverage and Complexity

Now I'm going to step back for a second and talk about the history of software development in three slides, and very low fidelity, but high order bits should be correct. I think about it in terms of leverage and complexity. This is some software I wrote last night. I don't mean Excel, but I mean the actual spreadsheet. I wrote some software to compute how many people are coming to my party. Unfortunately I have no friends, so these are all made up names. But they have a party. Then as you can see, I've managed to accumulate these numbers with the aggregation and I can get my party size.

In my mind, this is very simple software that I've written. But it is software, and this is the simplest kind of software. Excel or Google Sheets, whatever: this is the simplest type of software development that people do. There's a lot of it out there and it's important, but it has small leverage. Last night, when I was leaving the speaker dinner to get back to my hotel - I'd never been to London before. Lovely city, I'm into it except that you need to flag the buses down, which is very weird to me. I don't flag buses down in America. So the bus didn't stop for me twice. Then I was like, “This is …”- I won't swear on camera - but I was frustrated, like, “F it, I'm going to open up Lyft” and then I realized Lyft doesn't work here either but I used Uber. But it was a very good experience. I said I want a ride and the person came and picked me up. They stopped, which is nice. Then I got in the car and got to my hotel.

So that was a very high leverage thing for me. I was tired and I had to prepare for this talk. I obviously have modified my talk out of gratitude for car-hailing services. It was very high leverage. There's really a continuum here, in my mind. On the left, we have things like programming, simple spreadsheets, small leverage. The UX of the person who is going to use my software, in this case, maybe my wife and I are collaborating and we're trying to figure out who's coming to our party. The UX is pretty clumsy in that you could hit the wrong button and literally destroy the software. But it works and it takes one developer.

Then on the right, we have software that's very high leverage. It's affecting the real world. I press a button and someone, who I don't know but whom I can trust because of the star ratings, comes and picks me up and brings me to my hotel and explains on the way how you actually do get on a bus in London, which I now know. So I learned that as well. And then the UX is just amazing. Totally amazing, mind-blowingly amazing. And it takes a lot of developers. This is the trick. This is the whole software eating the world thing. Everyone here knows this, but it's really true. What we're talking about is delivering magical experiences to people that have very high leverage and take a huge number of people. It's like this total iceberg situation. The more magical, the more developers. Job security for everyone in the room. Congratulations. We got lucky with our career choices, total fluke.

This is a dependency graph for a monolith. It doesn't matter what it was, but this is real. These are not function calls. These are packages. Scary, scary stuff. This is one of Sarah Wells's. Sarah Wells's presentation, her keynote on Monday, was so good. I've thanked her in person because this is much better data than I had. I'm just going to delete my slides and use hers. This is her slide showing their release tracker spreadsheet. I remember these very well from monolith days. We're doing a monthly release. It's like this nightmare of coordination, and you're good, your team is good, and then some other team screws up and you have to roll back and QA. And then in the interim, someone commits something to your piece of the code base and then you delay this whole deal. It's just ridiculous.

So, there's a crisis of confidence, and so that's the thing. We've added a lot of developers and now there's a crisis of confidence. I know people at Lyft, and when they were scaling up very quickly, they still had a monolith. So they had this crisis where they couldn't do releases anymore. That was a big problem because we need to feel confident in our jobs. And people lost confidence in their ability to deliver software quickly.

This is a slide that I used in a different presentation earlier this week. I apologize for the duplicate. But this is this idea that you're going to ship your org chart. I think it's inevitable. There's nothing you can do about it. You need to think about that, realize it's inevitable, and then structure your organization and your software to facilitate this instead of running from it. If you're going back to this problem of having lots of developers, what you really need to do is find a way for each developer to communicate with a smaller group of people. So you structure your organization and your software in the same way. Crystal took a picture of me presenting this slide earlier. If people are having déjà vu, I apologize, so lame.

This is where we started. What we need to do is break this giant group of developers into smaller teams and they can each operate independently. And we have restored our velocity by delivering microservices. This is real, this is very real. Again, to steal some of Sarah's slides, this was their chart of monthly releases with their monolith. If you recall from Monday, this is the monthly releases with microservices. So it really worked. It was real. That's why people are adopting this stuff. It actually delivers on the promise.

A Microservices Architecture

That's great. Let's look at microservices. Here's a diagram of microservices. You can see that there are, in this case, a dozen or so of them. In reality, there are many more in most architectures. Let's focus on a particular service which has a team associated with it. Per the org chart mapping to your services, let's think about whom they need to communicate with. They need to spend a lot of time communicating within their team. That's great. That's the whole point. You might also communicate with the people who are adjacent to you in the service graph. If you were going to make a change, you'd probably want to let them know. Hopefully, they would do the same for you. You have a narrow scope of understanding and it's proportional to the number of services you work with and your own team size, not proportional to the entire organization.

We've solved the velocity problem, and everything seems good. We have faster releases, less friction, and that's the thing. The trouble is that you do depend on the rest of the system. You really do. These arrows are transitive and the application wouldn't work if you didn't depend on them. And that's the problem. So, when things are slowing down in service E, that's quite problematic. You depend on service E and so you slow down too. But the very design of the system, literally the organizational design, it was designed to make it impossible to figure out what the F is going on. You have no idea because you can't communicate with them. You don't even know who they are. So this is a serious problem. Show of hands, people who actually have microservices at their organizations? Yes. Is this familiar as a problem? Yes, so this is ubiquitous.

And it's worse than this. It's actually quite a bit worse than this. If E is slow or having an issue, it might be that E had a release. It's possible. But I would argue that if it's a downstream dependency, those things are usually pretty stable in terms of releases. It's more likely that someone else deployed software that's overloading E. So really your issue is that service Z did a release, it's generating 100x as much traffic, that's hitting E, creating a bottleneck, and that's your problem. Now your challenge is to understand this diagram with a communication pattern that is totally local. So we have a new crisis of confidence, which is what I would call hypothesis overload.

There are so many different ways that your system can fail, that you are overwhelmed checking them. This is a significant crisis. I think the transition to microservices, which I do think is inevitable because I think the big software, high leverage software thing is inevitable. That requires a lot of developers and a lot of developers require microservices or serverless, doesn't matter what you call it. This is a significant problem for our industry in the next decade or so.

I would also argue that the transition to microservices is much larger than the transition to cloud, because architecturally, with the transition to cloud, the box diagram for your system didn't change that much. It was kind of client-server before and client-server afterwards. I think it's more like the transition from software in the mid-1990s, when you would buy a CD-ROM at a store and bring it home and install it, to client-server software. It's a big transition. This diagram in my mind is a huge problem for our industry as microservices become more layered and proliferated.

How are we going to resolve this? This is a diagram of a bunch of microservices. This is a transaction in that microservice architecture. This is a distributed trace. I think people in the audience- this is a great conference, people are very sophisticated. I think everyone knows what this is. It's just the idea that a single distributed trace is just a record of how one user transaction- or it doesn't have to be one - how one transaction propagated from service to service and the causality graph along the way.

Tracing Conundrums

There are a couple of problems with tracing. One is the volume of data. We start with the transaction rate at the top of the stack. So this is your business. This is hopefully going up and to the right, if your business is doing well. You multiply that roughly times the number of microservices involved in these transactions. That's growing with the size of your organization. You multiply that times the cost of centralizing this data and then you multiply that times a couple of weeks of retention. What you're left with is just simply way too much money. You can't afford to store every transaction trace for every service for many weeks. There's no solution to this problem that I know of, at least for applications running at scale. It's just kind of a fundamental thing about tracing. So this is the first big problem with tracing.
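As a rough illustration of the multiplication described above, here is a back-of-envelope sketch in Python. Every input number is a made-up assumption chosen only to show how quickly the product grows; none of them comes from the talk, and the function name is hypothetical.

```python
# Back-of-envelope estimate of unsampled trace storage.
# All inputs are hypothetical assumptions, not figures from the talk.

def trace_storage_bytes(tx_per_second, services_per_tx, bytes_per_span, retention_days):
    """Bytes needed to keep every span of every transaction for the retention window."""
    seconds = retention_days * 24 * 60 * 60
    return tx_per_second * services_per_tx * bytes_per_span * seconds

total = trace_storage_bytes(
    tx_per_second=10_000,   # assumed transaction rate at the top of the stack
    services_per_tx=20,     # assumed number of services touched per transaction
    bytes_per_span=500,     # assumed size of one span record
    retention_days=14,      # "a couple of weeks"
)
print(f"{total / 1e12:.1f} TB")  # prints 121.0 TB
```

The storage and network cost scales linearly with each factor, so doubling either the transaction rate or the number of services per transaction doubles the bill.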

In Dapper, we addressed this by sampling very aggressively and immediately. As soon as the transaction was initiated, before you even knew if it was slow or fast or what it was going to do, we would flip a coin and throw out all but 1 in 1,000 transactions. We thought that was good. Even with 1/1,000th of the data, when we tried to centralize that globally, it was still too expensive, so we did another 1-in-10 sample before we centralized globally. So ultimately in Dapper, 1 in 10,000 transactions was retained in a global repository that we could run MapReduces on and things like that, which is pretty crippling in terms of what you can do with the data, frankly. There are other approaches where you can retain more of the data for longer, which has some benefits. But you still have to contend with sampling at some point. Hopefully, you can be smart about it, but it's just part of the deal with tracing.
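The two-stage coin flip can be sketched as follows. This is only an illustration of the sampling rates quoted above, not Dapper's actual implementation; the function names and the simulation are assumptions.

```python
import random

# Rates quoted in the talk: keep 1 in 1,000 at the start of the transaction,
# then 1 in 10 of the survivors before global centralization.
HEAD_RATE = 1 / 1000
CENTRAL_RATE = 1 / 10

def sample_at_root(rng):
    """Decide at transaction start, before anything is known about its latency."""
    return rng.random() < HEAD_RATE

def sample_for_central_store(rng):
    """Second coin flip, applied only to traces that survived the first one."""
    return rng.random() < CENTRAL_RATE

def simulate(n_transactions, seed=0):
    """Count how many transactions survive each sampling stage."""
    rng = random.Random(seed)
    kept_locally = kept_globally = 0
    for _ in range(n_transactions):
        if sample_at_root(rng):
            kept_locally += 1
            if sample_for_central_store(rng):
                kept_globally += 1
    return kept_locally, kept_globally

effective_rate = HEAD_RATE * CENTRAL_RATE  # roughly 1 in 10,000 overall
```

Because the decision is made before the outcome is known, a rare slow or failing transaction is no more likely to be kept than a healthy one, which is the limitation described above.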

The other big problem is the amount of signal. I thought it would be helpful for people to see. So this is a single distributed trace. There's nothing fancy about this. This is showing a web application that's calling a proxy, calling a server. You can see there's a shared timing diagram. You can expand and collapse things. This is just like Dapper or Jaeger or Zipkin or whatever. Nothing particularly interesting about this visualization of a distributed trace, but what I would like to point out is that there's kind of a lot of data.

There are all these things, these rectangles, these timing elements are called spans. Each span has tags on it, which are key-value tags. It also has all these other details which are kind of debugging information. It's possible for it to have little logging type things attached to it. All told, the trace, this is actually just demo data. A trace in a production distributed system, it's like order of a megabyte or two of data often, for a single trace that goes across hundreds of services. So it's kind of a lot of data and it's not just a storage issue, it's a human perception issue. It's difficult for human beings to process that much information. That's a really serious problem with tracing. We can't expect an operator, especially in an emergency, but even not an emergency, to look through all that data and make sense of it. So that's the other problem with tracing.

In summary, one trace per transaction, you cross microservice boundaries, you can cross into the client. They're necessary. I think they're absolutely necessary. If we're thinking about the diagram of the services being broken, you can't expect to understand that without distributed traces being in the picture in some capacity. And yet, they're really not sufficient on their own. They're too big for our brains. I don't think that gets talked about enough. People talk about sampling and blah, blah, blah, blah, blah, which is the second point. But there is a perceptual problem that we aren't smart enough to understand them fast enough to actually make sense of our systems.

This brings up traces and tracing, which is kind of part of the subject of my talk. Traces are just structs. That's all they are. I mean, they're structs that are recursive, but they're just structs. There's nothing very exciting about them on their own. In my mind, they are totally table stakes for microservices. You have to kind of bite the bullet and get some instrumentation going early on. But people have recognized that's necessary. Having distributed traces is helpful, but they're really not that helpful. Distributed tracing as a practice is where you take that raw data and do something with it.
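To make "traces are just recursive structs" concrete, here is a minimal sketch of what such a struct might look like. The field names, the example services, and the tag values are illustrative assumptions and do not reflect the data model of any particular tracing system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Span:
    """One timed operation in one service. Field names are illustrative."""
    service: str
    operation: str
    start_ms: float
    end_ms: float
    tags: Dict[str, str] = field(default_factory=dict)           # key-value tags
    logs: List[Tuple[float, str]] = field(default_factory=list)  # (timestamp, message)
    children: List["Span"] = field(default_factory=list)         # the recursive part

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms

def count_spans(span: Span) -> int:
    """Walk the tree: a trace is its root span plus all descendants."""
    return 1 + sum(count_spans(child) for child in span.children)

# A made-up trace: a web application calling a proxy calling a server.
trace = Span("web", "GET /checkout", 0, 120, tags={"region": "us-west-1"}, children=[
    Span("proxy", "forward", 5, 115, children=[
        Span("api-server", "charge", 10, 110, tags={"cache": "miss"}),
    ]),
])
```

On its own this structure answers very little. The value comes from whatever analysis is run over many of them.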

I think today, we as an industry have confused the traces themselves with tracing. Tracing is any way of applying the traces to some actual problem you're having. The traces are just the raw data. And because most tracing is just the raw data, it's basically indexing traces and then displaying them. We're confusing these two things. They're very different. That's something I'm trying to explain in this talk, that the traces themselves, not that useful. Tracing is something that can actually restore confidence, given the crisis that we're having around complexity.

Traces, Tracing, and Confidence

How do we make them valuable? We use them to restore confidence. And that brings up the last portion of my talk - I need to talk about SLIs for a minute. This has come up a number of times at this conference, so I bet everyone here is familiar with it. But it's the idea that if you're running a service, your consumers probably only care about a couple of things. You probably have one to five API calls that people really hammer and depend on. Those are the things you need SLIs around, and they should be pretty straightforward.

Another Sarah Wells slide that I loved. I was able to actually grab a shot of this one. So this is a picture of me presenting a slide, that was a picture of her presenting her slide for Monday. I really encourage someone to take a picture of this and put it on Twitter and mention me. Hopefully, you can get me in the shot too and then we can have a doubly recursive shot of me presenting Sarah's much better slide. Unfortunately, at this point, it's already illegible, so I did grab it for us. This is what her slide said. This was her slide about SLIs, which was perfect in my mind. You should just keep it simple. This is sometimes called the golden signals. You need to monitor the request rate, you need to monitor the latency, and you need to monitor the error rate, probably as a ratio against the request rate, in order to understand the health of your service. And that's what SLIs are.
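The three signals named above can be computed from a window of request records in a few lines. This is a minimal sketch under assumed names (`Request`, `golden_signals`) with made-up sample data, not a description of any monitoring product.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Request:
    latency_ms: float
    is_error: bool

def golden_signals(requests: List[Request], window_seconds: float) -> dict:
    """Request rate, error ratio, and median latency over one time window."""
    if not requests:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "p50_ms": None}
    latencies = sorted(r.latency_ms for r in requests)
    errors = sum(1 for r in requests if r.is_error)
    return {
        "rate_per_s": len(requests) / window_seconds,
        "error_ratio": errors / len(requests),     # errors as a ratio of requests
        "p50_ms": latencies[len(latencies) // 2],  # upper median for even counts
    }

# Hypothetical two-second window with one slow, failing request.
window = [Request(12, False), Request(15, False), Request(900, True), Request(14, False)]
signals = golden_signals(window, window_seconds=2)
```

In practice latency would be tracked as a distribution rather than a single median, which is the subject of the next section.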

In my mind, confidence, at least for someone operating microservices, is about controlling your SLIs. It's about having control of your SLIs. If you can control your SLIs, that means you are doing your job as a service and you should be able to sleep well at night, almost by definition. There are two ways that we control them. One is to gradually improve them over time. So this is, I have a week, I have a month to make a change and improve latency, resource utilization, error rates, whatever. So yes, this takes days or weeks. Or we want to rapidly restore an SLI, when we're in more of a rush. Unfortunately, it seems like this always happens at night, and your SLI is very far from healthy and you want to fix it as quickly as possible. This typically involves finding some other service that has done a release and rolling it back. That's usually what this boils down to. But they actually have similar needs in terms of the data. Obviously, a very different level of urgency, but confidence is about having control over these two things.

We do this by measuring SLIs very precisely, and I'll talk about that. Then we do it by explaining variance. At this point, I need to take an aside. I do feel somewhat uncomfortable with the remainder of the presentation but I've labored over this and I think it's the right thing to do. So someone sent this tweet a few days ago, just coincidentally. "I've heard your hesitancy on self-promotion." It's true. I despise self-promotion. I hate it, especially when I'm doing it. “But as CEO I should make it more fluid.” This is totally an unsolicited tweet that I saw. And I was like, okay. So I wrote back and I said, "Thank you for the feedback. So LightStep will make all your wildest dreams come true." Okay, I did it. So that's the end of my sales pitch.

I'm only showing this because it's actually a problem we've had with our designers. I'm friends with a lot of other vendors, even our competitors. And I think a lot of us have trouble mocking up fake data. The fake data, it just doesn't feel right. I don't know how else to say it. If you have fake time series data, fake latency data, it looks too smooth. So we've had a really hard time showing compelling stories about what we're trying to build without getting real data into the mix. I will at times show things from LightStep but this is absolutely not a sales pitch. This is meant to be nothing but an explanation of what can be done with trace data if we have a distributed tracing system. I'm sure that there are other things in the world that will do these things as well. And I hope that's fine. I really do hope it's okay.

Confidence about Variance

This is going back to the service-centric approach. We have some service and we want to start with an SLI. We want to find variance in the SLI, which I'll talk about, and then we want to explain that variance. That's basically what we're trying to do. Let's start with variance. So we need to feel confident about the variance. The conventional best practice is to measure high percentile latency to understand latency. You don't want to just optimize the median, you want to optimize the outliers. Here are two actual services where I measured in real time the median latency and then high percentile latency.

I'll give you a few seconds to look at this chart and try to think about, what do these numbers mean? What do these distributions feel like? What do they represent? Do you feel confident about your understanding of these two services' latencies? I will offer that I feel much more confident than if I just had P50, or average, similar idea. And I feel somewhat confident that there's more variance on A. I feel confident that B is faster, that sort of thing, but I don't really know that much about the shape.

These are the histograms for those two things. The same distributions. I think histograms are a really nice way to represent latency. You can actually see visually some things that are pretty obvious. On service B, there's actually this totally distinct mode that's very fast. That's off on the left. If it's not clear, the things on the right are slow, the things to the left are fast. This is not a time series, this is a histogram and we're showing the frequency distribution. This is a log scale, I should have said - this is all real data from real systems. This is not like a demo data set. Most, in fact almost every, service I've ever seen that does anything interesting has these sorts of multimodal latency distributions. It's not the exception, it's the rule. If they're doing something interesting, you'll see many different modes. And by modes, I mean the little bumps in the histogram.

I find those to be very puzzling and interesting. If I'm trying to understand latency, I want to characterize what they are. It's difficult to do this if you're looking at percentile latency on its own because it simply doesn't convey the information. So part of confidence is feeling confident about what the variance actually is. Here is another service. I just literally last night took some screenshots of service latencies. These are completely at random. I wasn't able to find one that was a bell curve or a power law distribution. They're all kind of weird in one way or another.
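The gap between percentiles and histograms is easy to reproduce with synthetic data. The sketch below generates a made-up bimodal latency distribution (the modes near 2 ms and 200 ms are assumptions for illustration), then reports both P50/P99 and a log-scale histogram. The helper names are hypothetical.

```python
import math
import random

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def log_histogram(values, buckets_per_decade=4):
    """Count values into buckets whose edges are evenly spaced on a log10 scale."""
    counts = {}
    for v in values:
        bucket = math.floor(math.log10(v) * buckets_per_decade)
        counts[bucket] = counts.get(bucket, 0) + 1
    return dict(sorted(counts.items()))

# Synthetic data only: a fast mode near 2 ms and a slow mode near 200 ms.
rng = random.Random(1)
latencies = sorted(
    [rng.lognormvariate(math.log(2), 0.2) for _ in range(700)]
    + [rng.lognormvariate(math.log(200), 0.2) for _ in range(300)]
)

p50, p99 = percentile(latencies, 50), percentile(latencies, 99)
hist = log_histogram(latencies)

print(f"P50={p50:.1f} ms  P99={p99:.1f} ms")
for bucket, count in hist.items():
    lower_edge_ms = 10 ** (bucket / 4)  # 4 buckets per decade, matching the default
    print(f">= {lower_edge_ms:8.1f} ms  {'#' * (count // 10)}")
```

The two percentile numbers say only that something is fast and something is slow. The histogram shows two separate bumps with nothing in between, which is the kind of shape the percentiles cannot convey.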

Let's talk about latency variance versus temporal variance. Here is one of those histograms. We also want to know whether things are varied in time. So there's variance across the latency spectrum. There's also variance across time. We can represent that by looking at the shape of the histogram now, versus how it was a day ago, an hour ago, a week ago, that sort of thing. To be clear, there are many different products that do this. This is not, again, an ad for anything in particular, but I think it's really important if we're trying to understand latency to think about it as a distribution, because you can actually sometimes see specific modes moving around, and that's something that you should be curious about as well.
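One simple way to see a mode "moving around" is to compare the share of traffic in each latency bucket between two time windows. The sketch below is an assumption about how that comparison might be done, with made-up data in which the slow mode moves from about 150 ms to about 400 ms.

```python
from collections import Counter
import math

def bucket_shares(latencies_ms, buckets_per_decade=2):
    """Fraction of requests in each log-scale latency bucket."""
    counts = Counter(math.floor(math.log10(v) * buckets_per_decade) for v in latencies_ms)
    total = len(latencies_ms)
    return {bucket: count / total for bucket, count in counts.items()}

def histogram_shift(baseline_ms, current_ms, threshold=0.05):
    """Buckets whose share of traffic moved by more than `threshold` between windows."""
    before, after = bucket_shares(baseline_ms), bucket_shares(current_ms)
    shifts = {}
    for bucket in sorted(set(before) | set(after)):
        delta = after.get(bucket, 0.0) - before.get(bucket, 0.0)
        if abs(delta) > threshold:
            shifts[bucket] = round(delta, 3)
    return shifts

# Hypothetical windows: the fast mode stays put, the slow mode gets slower.
an_hour_ago = [2.0] * 80 + [150.0] * 20
now = [2.0] * 80 + [400.0] * 20
shift = histogram_shift(an_hour_ago, now)
```

Here the result shows 20% of traffic leaving one bucket and arriving in a slower one, while the fast bucket does not appear at all because its share is unchanged.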

So far, we're just talking about SLIs. We haven't talked about understanding them, we're just trying to represent them with precision. So variance over what exactly? These are all SLIs, but the thing I want to emphasize is that those are distributed traces. That was all variance over distributed traces. It's specifically the spans in the distributed traces and they are filtered by tags, but whatever. So when we think of distributed tracing, a lot of us think of those waterfall diagrams. Actually, distributed tracing can include this sort of information, which I think is actually a lot more useful in terms of understanding aggregate performance than the traces themselves.

Confidence about Bottlenecks

This is a different topic, confidence about bottlenecks in the system. I've anonymized this person. This is a presentation at re:Invent, which blew my mind, about Netflix's architecture. This is supposed to be, I think, at some level, bragging rights about how complicated their system was. I agree it feels complicated. I don't think this is a good thing. This is a really difficult way to understand a system. Granted, this presenter wasn't trying to make a point that this was an easy way to represent the system, but we need to get something that's a little bit more digestible if we want to understand bottlenecks in a distributed system. A full system diagram is actually very overwhelming if we have dozens or hundreds of services.

I wanted to show bottlenecks. The way that we would typically resolve bottlenecks in a distributed system would be to open, let's try to do it just with traces on their own. So I'll just sort these chronologically. I'll open a fast one and a slow one. Here's a fast one. What we want to do is understand the critical path. That is to say, we track through this thing and try to figure out where it was slow. So it looks like it was slow here, it was slow here, and we continue to do this until we've come up with our hypothesis. This was one that took almost five seconds. Again, we're looking to see where the request bottoms out and we can find, okay, it seems like this API server was slow.

This is valuable. I would actually say that I have high confidence about the bottleneck for this specific transaction. I also have high confidence about the bottleneck for this specific transaction. That's really something. That's valuable. What I don't have confidence about is the bottleneck for the system in general. That's an important thing. If we're thinking about this distribution, what I think would be more useful is to say, well, let's look at the high throughput data, so the bottom 50% of this distribution, and we want to see what’s slow in the high throughput area of this particular aspect of my system. And we can use distributed traces in the aggregate. We can analyze them all at once and we can annotate a diagram with where the latency is coming from. This is not difficult to do. It's actually very easy to do. You can do an automated critical path analysis of these traces. You can take a couple of hundred of them and then you can draw an aggregate diagram that shows where the critical path is in your system.

The key thing I want to emphasize is that this is the critical path specifically for these requests. So here we have an SLI, we've restricted it to something that's high throughput and fast and then we're able to understand where the bottlenecks are in our system. We're also able to understand where they are not. We know that if we make an optimization to this service or that service or this service, it will not affect our performance at all. At Google, we did see Dapper used effectively to save a lot of cycles. People used to spend multiple quarters making their services faster. Those services were not on the critical path. As a result, nothing happened. If you make an optimization off the critical path, it will not affect anyone's life with your product. It just wastes your time. So that's helpful.
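An automated critical path analysis over many traces can be sketched in a few dozen lines. The version below is a simplified illustration and not the algorithm used by any specific product: it represents spans as plain dicts with assumed keys (`service`, `start`, `end`, `children`), assumes child spans fall within their parent's time range, and ignores clock skew.

```python
from collections import defaultdict

def critical_path(span, out=None):
    """Attribute a span's wall time to the services on its critical path.

    Walks backwards from the span's end: the child that finished last is on
    the path, then the latest child that ended before that one started, and
    so on. Time not covered by such a child counts as the span's own time.
    """
    out = defaultdict(float) if out is None else out
    cursor = span["end"]
    own_time = 0.0
    for child in sorted(span.get("children", []), key=lambda c: c["end"], reverse=True):
        if child["end"] > cursor:
            continue  # ran in the shadow of a child that is already on the path
        own_time += cursor - child["end"]
        critical_path(child, out)
        cursor = child["start"]
    own_time += cursor - span["start"]
    out[span["service"]] += own_time
    return out

def aggregate_critical_path(traces):
    """Share of total critical-path time per service across many traces."""
    totals = defaultdict(float)
    for root in traces:
        for service, ms in critical_path(root).items():
            totals[service] += ms
    grand_total = sum(totals.values())
    return {service: ms / grand_total for service, ms in totals.items()}

# Two made-up traces; times are in milliseconds.
traces = [
    {"service": "web", "start": 0, "end": 100, "children": [
        {"service": "auth", "start": 5, "end": 20},
        {"service": "db", "start": 25, "end": 95},
    ]},
    {"service": "web", "start": 0, "end": 50, "children": [
        {"service": "auth", "start": 5, "end": 15},
        {"service": "db", "start": 20, "end": 45},
    ]},
]
shares = aggregate_critical_path(traces)
```

Running the same aggregation over only the fast traces or only the outliers would give the per-population bottleneck diagrams discussed here, and services absent from the result are the ones where optimization would not change end-to-end latency.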

The thing that's interesting is that if we instead focus on outlier latency over here, so these are just the 98th percentile and above, we can get a very different diagram. If you recall the diagram was previously a bottleneck on the database. This conference Wi-Fi situation is not so great, really not so great. I could use my phone. Well, I had thought this might happen, so I have the screenshot.

If we're looking at outlier latency, then we actually have a very different bottleneck. It's the same thing we were just looking at. The bottleneck, in this case, is in the service that we were querying itself. And what I would like to emphasize is that if we'd gone and optimized the fast path, it would have actually had no effect whatsoever on the slow path. This is the sort of thing that we can waste a lot of cycles on if we're not careful.

Confidence about Hypotheses

The last thing I want to talk about is hypotheses. Finding bottlenecks is definitely a challenging thing. It's actually not as hard as narrowing the set of hypotheses. I was talking earlier about how, in this slide, that traces have too much signal. It's megabytes of data. As a human being, we don't have enough brain power to really process that. I guess to make this kind of real, I would maybe pose this as an exercise. So we've restricted our analysis to outliers here. I'll open up one in a new tab just so we can see it.

The challenge is we need to look through everything, this entire trace. So all of the tags, all of the spans, and figure out what's correlated with high latency. It's an overwhelming exercise statistically, totally overwhelming. The thing is it's not that overwhelming for a computer. With the raw trace data, a tracing system should be able to take hundreds of traces, do statistics on all of them, and then surface attributes of the larger distributed traces that are correlated with slowness.

We have an objective function with an SLI. It's a very clear objective function, and so we should be able to correlate against that. It's not difficult. It's not a difficult thing. I think we should ask more of distributed tracing. If we do that analysis, we find things like this. We're now overlaying just the set of traces that go through US west 1. You can see there's a very high correlation with slowness for US west 1. We can also see that if a particular span downstream is involved, there's also a lot of correlation with latency. We can also do anticorrelation analysis and say, if we touch this caching service, which means we hit the cache, then things were actually quite a bit faster. So there's an anti-correlation with slowness.
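The kind of statistic described above can be approximated very simply. The sketch below uses "lift" (how over- or under-represented a tag value is among slow traces) as a stand-in for whatever correlation measure a real tracing system would use. The data, the tag names, and the `us-east-1` region are made up for illustration.

```python
from collections import defaultdict

def tag_latency_lift(traces, slow_threshold_ms):
    """Compare how often each tag value appears in slow traces vs. all traces.

    Returns {(key, value): lift}. A lift above 1 means the tag value is
    over-represented among slow traces; below 1 means it is anti-correlated.
    """
    slow = [t for t in traces if t["latency_ms"] >= slow_threshold_ms]
    if not slow:
        return {}
    overall, in_slow = defaultdict(int), defaultdict(int)
    for t in traces:
        for item in t["tags"].items():
            overall[item] += 1
    for t in slow:
        for item in t["tags"].items():
            in_slow[item] += 1
    return {
        item: (in_slow[item] / len(slow)) / (overall[item] / len(traces))
        for item in overall
    }

# Hypothetical traces, flattened to one latency and one tag set each.
traces = [
    {"latency_ms": 900, "tags": {"region": "us-west-1", "cache": "miss"}},
    {"latency_ms": 850, "tags": {"region": "us-west-1", "cache": "miss"}},
    {"latency_ms": 40,  "tags": {"region": "us-east-1", "cache": "hit"}},
    {"latency_ms": 35,  "tags": {"region": "us-east-1", "cache": "hit"}},
    {"latency_ms": 60,  "tags": {"region": "us-east-1", "cache": "miss"}},
    {"latency_ms": 45,  "tags": {"region": "us-west-1", "cache": "hit"}},
]
lift = tag_latency_lift(traces, slow_threshold_ms=500)
```

With this toy data, `us-west-1` and cache misses come out over-represented among slow traces, while cache hits never appear in a slow trace. A real system would also need enough samples per tag value for the ratios to be meaningful.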

This sort of analysis is not difficult to do, given the distributed traces and some simple statistics. And yet most of us don't have this stuff. So I think my challenge to people is to think about distributed tracing in terms of confidence. To me, I feel much more confident looking at that sort of analysis than I do looking at individual traces. I think we're actually getting closer to understanding what the systems do in general, instead of what specific transactions do in isolation. So those were my screenshots in case that didn't work.

Wrapping up, I guess I hope I've made my overall point clear. We are in an industry that's undergoing an architectural sea change, for good reason, so that when I don't get my bus, I can get a Lyft or an Uber within a few minutes. It's very valuable to me that I was able to get home quickly last night. But the cost of that is tremendous complexity, tremendous complexity. Distributed traces, I think, are necessary to understand that complexity, but we will never, ever actually get to the bottom of these issues, at least not in short order, unless we employ much more intelligent analyses of that raw data than we're doing today. So that's really my thesis. I hope it makes sense to people. We want to be like Alex Honnold but with ice axes. That would be amazing. And thank you very much.




Recorded at:

Apr 03, 2019