Security & Psychology: Demotivating Persistent Threats



Jarrod Overson breaks down the workflow for effective threat mitigation of sophisticated attackers into four distinct stages: 1) Classification; 2) Research and generate an actor profile; 3) Counter attack, and 4) Rinse & repeat until all threats are cleared.


Jarrod Overson has been developing on the web for over 15 years in both startups and global companies and currently works at Shape Security. Previously at Riot Games and Napster, he has worked in every corner of web technology and is an active proponent and contributor to open source, creator of Plato and co-author of Developing Web Components.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.


Overson: Thank you, everyone, for coming to the first talk in the security track. I know in a general developer conference, the security track is not always the highest attended track, so thank you very much for coming. My name is Jarrod Overson. I work at Shape Security. Historically, we've been a very secretive company and we're now starting to pull back the curtain a little bit, and I'm excited to talk about the stuff that I'm talking about. This is only the second time I've given this talk, though. I would love some feedback towards the end as to things I should dive into more, things I should dive into less, or anything else.

A lot of the conversations revolve around one question. How do you engage with attackers while you are under attack? There's a lot to unpack in that question. What are your rules of engagement? What teams get involved? Who gets notified? What is the priority of the attack that you are currently going through? When we started asking those questions, we learned that we actually needed to ask teams one question first. How do you know you are under attack in the first place? Many teams, when presented with that first question, don't have an answer, because they don't know when they are under attack. What are the signs that you look for to know that an attack is currently going on, what kind of attack it is, and whether you should be dealing with it in a way you've dealt with it before or in a completely new way? There's a lot there.

We're going to go through a bunch of things here. We're going to start off with attacker sophistication, their trajectory, some example attacks. Then I want to jump into the economics of attacks, the cost versus value, and then how to flip those in your favor and then go through some case studies.

Imitation Attacks and Attacker Sophistication

Imitation attacks, if that's a term that you haven't heard before, it's probably because in the early days you just called them HTTP requests. When you had a request come into your server, you served up the content, because that's what you were designed to do. That's a good internet citizen. Then as we found that we were being exploited, taken advantage of, we started putting hurdles and walls in place. We started adding duct tape and Band-Aids to the web and tried to make things a little bit safer for everyone. All that was doing was training attackers to look more and more like legitimate traffic, which we still wanted on our websites.

Some of those hurdles were really bad. IP rate limits and IP blocks. Has anyone here actually blocked an IP address for any reason? It's the stupidest thing to do. It's like if somebody is on your front lawn and you yell at them and say, "Hey, you stop," and they look at you and say, "Okay." They'd move a little bit and then you lose visibility on them completely. It's ridiculous. We had those for years and we would just block and block and block and block and block and block. Over and over again, this whack-a-mole got us nowhere and just taught attackers to look more and more like legitimate traffic.

So just like with kids and dogs and whatever else, if you want to change the behavior, you have to change the incentive. If you have a dog jumping up on your counter and eating your loaves of bread, you move the loaves of bread and you immediately solve the bad behavior that you were frustrated with. This is the type of tactical operation that we don't have embedded into a lot of security culture. When I'm talking about cost and value, when I talk about incentive, it's really a cost versus value equation. There are multiple ways of cutting that. The way that I'm going to cut it for the purpose of this talk is slicing it between manual and automated attacks. Manual attacks are sufficient when the value is extremely high for you. Oh man, I picked a very long name. Sorry.

If I were attacking you and I was able to get $10,000 worth of value out of you in two hours' worth of attacks, I'm going to do that all day every day because I'm going to be a rich, rich man after a few weeks. But if that value drops to a dollar, I'm not going to leave you because there's still value to be extracted. I just need to figure out how to scale my attack 10,000X in order to get that value out that I had previously. Now, a 10,000X scale sounds like a big job, but computers do that really well. When you're going from manual to automated anything, you scale up brilliantly, very quickly.

The secret to defeating all attackers is to just drop the value to as near zero as possible while raising the cost to an astronomical level, and then you solve all your problems. The problem there, though, is that a lot of businesses involve many other teams, teams that inevitably want to make some money and have users, and this equation is the exact opposite of what they're actually thinking about. They want to make sure that the value that you're able to deliver is as high as possible and the cost of using the service is as low as possible. And then you get all the users in the world and then everything and everyone is happy. Now, working with teams like that from a security perspective is difficult. You have to figure out where on the gradient you want to live in order to make sure that you have justified everything you need to justify.

The Economics of Attacks

Jumping into the economics of attacks, it's a little bit difficult without diving into a specific example. I'm going to be diving into credential stuffing specifically. I'll go into a little bit more detail about what credential stuffing is in a minute, but this is the path a data breach takes to real-world fraud through the path of account takeover. A lot of companies see this in terms of exploitation, damage, whatever it looks like in a company. It often looks different from company to company, industry to industry, but it's all fraud that comes from an account being taken over, and mass account takeovers occur from credential stuffing, which is enabled by credential spills, which is just a type of data breach that spills credentials.

There are a bunch of other types of data breaches that spill different things. I'd love to talk about those. I could talk about anything I'm talking about here for hours if anyone is interested later. But I'm going to be talking about the path here because this is on the tip of the tongue for a lot of security professionals, because it is causing so much pain for us on a day-to-day basis.

It's important to know that there is a line through this path of things that you can control and can't control. You will never be able to control some stupid startup in Silicon Valley losing control of all of their databases because they thought security was a problem you dealt with when you scaled. But what you can control is when an attacker lays their finger on one of your services or servers for the very first time. It's important to know when that actually happens so you know what to look for. It's too late by the time fraud occurs. You have no chance at mitigating anything at that point. All you can do is clean things up.

What we're seeing now is that web applications and APIs are the softest targets that a lot of attackers are targeting because historically, web security has been really bad. We have lots of really bad technologies pasted on top of each other that don't do as much as we want and are annoying to use. We've largely ignored a lot of major security problems on the web and APIs, because we just deal with them after the fact on the service side. What we're seeing now is that with so many attackers trying to emulate legitimate traffic and going through the same path the legitimate traffic is going through, we need to tighten up the security a lot there in order to start to regain control.

Credential stuffing, for anyone who's not intimately familiar with it, is the automated replay of breached credentials across dozens, hundreds, or thousands of other sites to find out who is reusing their passwords. Spoiler, if you're reusing passwords, stop today. It's a serious, serious problem. The steps to go through in order to engage in a successful credential stuffing attack are: you get credentials, you automate the login, you defeat whatever defenses already exist, and then you distribute globally.

Now, getting credentials in the first place is a lot easier than a lot of people probably think. You can get your first credential list right now by picking up your phone or computer and Googling "combo list." Combo list is the term that criminals on forums use to refer to a list of username and password pairs. If you do that and start digging deeply into a bunch of these sites, I would recommend doing it in a trusted environment, like a virtual machine or something that's not your primary surfing device. As you browse websites that are clearly tailored towards the criminal end of the spectrum, you'd probably want to take care of yourself a little bit more.

Next step is to automate the login. Now, there are many, many tools out there that will do this for you if you're not dev-minded and can't do it yourself. They come in many shapes and sizes. This is Sentry MBA, which is a pretty old tool at this point now, very common. It's relatively basic in its core capabilities, but it is very extensible and plugs into a lot of services and enables these attacks to be generated relatively quickly.

There are things like FraudFox, which you can see there. It's actually got a very beautiful website, if you do check it out right now. These services are clearly lucrative enough that they can pay designers in order to have very fancy websites. What you see up here is a virtual machine-based solution to defeat browser fingerprinting. So that tells you at the start it's very scalable immediately. You just get your virtual machine. You configure it as you need. You put the scripts in it that you want and then you deploy it to whatever virtualization service you want and you are immediately scaled to the clouds.

This is another tool, the Antidetect browser. This is an older version that's got a bunch of different things that you can configure in order to randomize the properties of a browser environment, in order to obscure the fingerprint. And that's what a lot of these tools have in common. When you're dealing with a large-scale attack, one of the first things that you have to do is classify and identify what the attack is and what campaign it belongs to. And if the attacker is obscuring the fingerprint of all their traffic, it's very difficult to see the pattern in that attack. You might know you're under attack, but are you under attack by one skilled person or a thousand? Those are important questions to answer in order to figure out how to actually defend yourself.

Next step, you have to defeat whatever defenses already exist. You've probably seen this before a million times, this is Google's reCAPTCHA, Version 2 of their reCAPTCHA, and it attempts to automatically assess how human-ish you are. If it can't do that, then it falls back to a challenge in order for you to show Google how human-ish you are. The fallback challenge is selecting pictures that look like road signs or cars or railroad tracks or things like that. And that obviously feeds back into Google's massive, legendary AI systems in order to improve whatever other services they have.

The problem with solutions like this, especially generic, commoditized, widespread solutions, is that they incentivize services that bypass them, like Death by Captcha. This is a captcha solver, one of dozens. You can Google "captcha solvers" in order to get a whole list of captcha solvers that you might want to try. This one is a very cheap option. It's a thousand solved captchas for $1.39. It's got APIs and libraries for all the popular languages that you might want to use. Ruby, Python, Java, JavaScript, anything you might want. It shows right up here the average time to solve a captcha, which at the time of the screenshot was about 10 seconds. The average accuracy rate was 90%. And if you do visit that site and you stay long enough, you also get a customer service window. So if you're having trouble, they'll be able to walk you through the attacks you're trying to execute.

Google, of course, is incentivized to continually iterate on their product. They have just released reCAPTCHA V3, which I think was actually released October 29th, last week. This, of course, is a revolutionary change in bot detection and mitigation. This shows a screenshot of 2Captcha, which is another captcha solver service. This is their blog post from about six months ago, I think it was June, talking about how they're already planning on bypassing reCAPTCHA V3. What they do is farm out the score that reCAPTCHA V3 is getting to a bunch of workers, so that you can actually get the score that you desire in order to bypass Version 3 just by requesting it from 2Captcha's API. Now inevitably, of course, these things will evolve over time and get better, but there is incentive to bypass these and there will always be means to bypass them.

And then your fourth step is to distribute globally. That is because of the horrible hurdle we placed in front of bad traffic years and years ago, where if you're making hundreds of thousands of requests from a single IP, the first step is to of course block or limit that IP. Then you just have to distribute that across a bunch of IPs. Over the course of the past 5 to 10 years, we've seen an explosion in cloud services. Now, it is extremely easy to get as many IPs as you want extremely cheaply. If those services aren't enough and you need some better IP addresses, there are billions upon billions of poorly secured Internet of Things devices that are all part of these lovely botnets sitting in residences with residential IP addresses, which are just so valuable to criminals because that traffic is mixed in with all the legitimate traffic, and not just hosted on some server somewhere.

Talking about the cost of these things. You can get a combo list starting at $0. It might not be the freshest combo list. It might be a little bit old, but it still has email addresses and passwords. If you're just going to start exploring the world of criminality, you might as well start cheaply. How many developers do we have here? Oh, this is good. Okay, well you could write your own tool. You could configure it yourself. You're probably all very smart people. But since you're a criminal and you want to get to value really quickly, you might as well pay somebody to do this for you. Fifty bucks is what we see as the going rate to configure tools like Sentry MBA and some others.

To do a proper, good credential stuffing attack, you're probably going to want about 100,000 base requests. What we see is a success rate of 0.2 to 2% for these combo lists. People change their passwords over time, or for whatever reason they don't reuse passwords, so not all the passwords are going to hit. It's going to be a very small subset. So in order to get about 1000 accounts, we're going to want to make 100,000 requests. So we spend about $139 at Death by Captcha and then we can get 1000 global IP addresses for about $2, which gives us the ability to space out these attacks, about a hundred per IP over the course of a few days. Very hard to actually identify with traditional means. And then we get 1000 accounts out of it. So it costs us about $200 to get 1000 accounts. And the value out of that is dependent on the accounts that you have actually taken over.

We've seen low-value pop star forum accounts sell for very low fractions of a cent, and other services for around 20 to 50 cents. And then bank accounts and other accounts sell for upwards of $8.50 to $10 or higher, depending on the types of accounts that are being sold. So for $200, you can get probably between $500 and $10,000 worth of return. That's pretty good. Especially if you're just exploring this on a weekend and you just want to see whether or not the criminal life is suitable for you, you can get started easily and see where it goes.
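The arithmetic behind those figures can be written out as a short sketch. The prices are the ones quoted in the talk; the function names are illustrative, and the 1% hit rate is an assumption chosen from inside the quoted 0.2 to 2% range because it yields the 1,000 accounts used above.

```python
# Figures quoted in the talk; names and the 1% hit rate are illustrative assumptions.
COMBO_LIST_COST = 0.00        # an older combo list can be free
TOOL_CONFIG_COST = 50.00      # going rate to have a tool configured
CAPTCHA_COST_PER_1000 = 1.39  # captcha solver price per 1,000 solves
PROXY_COST = 2.00             # roughly 1,000 global IP addresses


def attack_cost(requests: int) -> float:
    """Total spend for a campaign of `requests` login attempts."""
    captcha = requests / 1000 * CAPTCHA_COST_PER_1000
    return round(COMBO_LIST_COST + TOOL_CONFIG_COST + captcha + PROXY_COST, 2)


def accounts_taken(requests: int, success_rate: float) -> int:
    """Expected number of working logins at a given hit rate."""
    return round(requests * success_rate)


def resale_value(accounts: int, price_per_account: float) -> float:
    """Value of selling the accounts on at a flat per-account price."""
    return accounts * price_per_account


if __name__ == "__main__":
    requests = 100_000
    cost = attack_cost(requests)
    accounts = accounts_taken(requests, 0.01)
    low = resale_value(accounts, 0.50)    # cheaper consumer accounts
    high = resale_value(accounts, 10.00)  # bank-grade accounts
    print(f"cost ${cost:.2f}, accounts {accounts}, value ${low:.0f} to ${high:.0f}")
```

Raising any single cost line by a few dollars barely moves the ratio, which is why the rest of the talk focuses on the attacker's time rather than their tooling budget.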

Now, this is also just the accounts, not the damage you can do with those accounts after the fact. Criminals are generally good at about one or two things. So you get the breachers and then you get the people who write the tools and then you get people who perform the attacks. Then you get the people who perform the targeted fraud. There's a big economy around this and they all sell things to each other. So this is just getting the accounts and then selling them off to somebody else.

Flipping the Economics in Your Favor

How do you actually make that more expensive or reduce the value? This is hard and complicated and I'd love your feedback on this section to know how deep or shallow I should go. At the very least, you're going to want to separate detection and mitigation. Your ability to detect and gain visibility on your attacker is critical. If you lose that, then you lose all ability to track this attacker throughout your services over time and to reconcile actual account takeovers with a particular campaign. Your mitigation must never compromise your ability to detect.

Again, back to the IP rate limit example. If somebody is performing an attack from an IP address, it is very easy to see everything they do with regards to that attack, because as long as they want responses, they have to give you a routable IP address and you can just look at that. As soon as you start blocking that, you're teaching them to distribute, which ruins your ability to detect. Obviously, this is a very simplified example and this is much more nuanced depending on how the attacks are occurring, but that detection and mitigation needs to be separate and not compromise one another.
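One way to picture that separation is a handler that evaluates and records every signal on every request, while only a deliberately chosen subset is allowed to change the response. This is a minimal sketch, not Shape Security's implementation: the `Request` fields, the signal names, and the placeholder checks are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    ip: str
    user_agent: str
    headless_hint: bool  # hypothetical client-side signal


# Detection: every signal is evaluated and logged for every request.
# The checks are toy placeholders, not real detection logic.
SIGNALS = {
    "known_tool_user_agent": lambda r: "example-bot" in r.user_agent.lower(),
    "headless_hint": lambda r: r.headless_hint,
    "datacenter_ip": lambda r: r.ip.startswith("10."),
}

# Mitigation: only signals listed here may affect the response.
# Everything else stays invisible to the attacker.
MITIGATE_ON = frozenset({"known_tool_user_agent"})

detection_log = []


def handle(request: Request) -> str:
    fired = {name for name, check in SIGNALS.items() if check(request)}
    detection_log.append((request.ip, sorted(fired)))  # always recorded
    return "block" if fired & MITIGATE_ON else "allow"
```

A request that trips only `headless_hint` and `datacenter_ip` is still allowed, but it is logged with both signals, so the campaign remains trackable without teaching the attacker what was seen.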

Now, your ability to detect should have lots of backups. When you're dealing with websites and APIs, you'll probably want to look at a bunch of different things in order to assess what is most appropriate for your business in order to classify and track your traffic over time. Now, I'm not necessarily talking about ad tech tracking or anything like that. This is not necessarily per user. This is classes of attacks and patterns in traffic. You can look at browser properties, which then get randomized by the tools we're talking about. But if you're good and your applications help you and your product development teams are working in concert with you, you should be able to develop some stuff that'll allow you to track traffic over time. You can profile hardware. You can look into anything, any clues the OS gives you in order to have these abilities to track patterns in your traffic.

Also, one internal aspect that you have that attackers will never have is your own understanding of how your traffic interacts with your site on a day-to-day, week-by-week and month-by-month basis. How quickly do your users upgrade their browsers? Who uses Chrome? What time of day? Which regions of the country use what types of browsers? All of these things, if you're able to separate the malicious traffic and good traffic, give you clues as to when things are out of line, and aid your anomaly detection systems in letting you know that something is up and you need to deal with it now.
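As a rough illustration of using that internal baseline, the sketch below compares the share of traffic per browser family against a historical window and reports the families whose share moved by more than a threshold. The function names, the browser labels, and the 10-point threshold are assumptions for illustration; a real system would account for seasonality and sample size.

```python
from collections import Counter


def shares(counts: Counter) -> dict:
    """Convert raw request counts into each key's fraction of total traffic."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: value / total for key, value in counts.items()}


def drifted(baseline: Counter, current: Counter, threshold: float = 0.10) -> list:
    """Keys whose traffic share moved more than `threshold` (absolute) from baseline."""
    base, cur = shares(baseline), shares(current)
    keys = set(base) | set(cur)
    return sorted(
        key for key in keys
        if abs(cur.get(key, 0.0) - base.get(key, 0.0)) > threshold
    )


if __name__ == "__main__":
    # Hypothetical counts: an outdated browser version suddenly triples its share.
    last_month = Counter({"chrome-70": 600, "firefox-63": 300, "chrome-60": 100})
    this_hour = Counter({"chrome-70": 450, "firefox-63": 250, "chrome-60": 300})
    print(drifted(last_month, this_hour))
```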

Once you start to get a little bit more advanced and start to build some of these things into your product, one thing you want is alerts for when attackers are trying to pry into your systems. Again, you don't want to mitigate on these things, but this is a clear indication that you are dealing with a more sophisticated attacker who is on to some things that you're doing, and you need clues in order to alert you early enough, so that you can deal with them during the process of retooling and investigation before they've actually done damage.

Now, when you think about the counter-attack and actually dealing with an attacker while they are actually trying to attack you, you need to target different parts, depending on the type of attacker. For novices and the script kiddie-type class of attackers, you can do pretty well just targeting their tool. Probably they're not sophisticated enough to develop their own tools, to configure their own tools, and that adds a lot of cost to them, because they have to go back and figure out what's wrong and probably pay or get help from somebody else in order to do it. When you're dealing with more sophisticated attackers, though, that's probably not going to be sufficient. These are people who are dealing with their own software to attack you or are experts at configuring them or manipulating open source software, so you need to target them differently.

It might be their architecture, their infrastructure, the services that they're using, or depending on the class of attacker, their reputation in the dark web or otherwise. Now, this ends up being more complicated and requires teams to actually research and identify who these people are and track them outside of their attack. We'll get into that in just a minute. It's a lot easier to think about all these problems though if you associate all of them with your standard software development lifecycle. A lot of us are developers. If you're dealing with attackers who are developers, they go through the same stuff that we do.

The first stage is the planning and requirements gathering phase. You generate your user stories and you put them up on your Jira board. In this case, the attackers are going through: what URLs do we need to attack? What tools will work? What tools won't work? What prior art exists in the space? What data am I going to be using? Then you go through the development phase, which is your standard development phase. Now, a lot of you are developers right now. When you are developing, chances are you're using some sort of a framework. If you're dealing with Java, it might be Spring or something like that. If JavaScript, it might be React or something like that. I'm going to refer to that in a few slides, but get the developer mindset in your head, because that's very relevant here.

Then you go through testing. You want to make sure that you're accounting for all the edge cases that your services provide, any errors that they're giving you. If you're dealing with an attack that spans over the course of weeks or months, you want it to run by itself. You don't want to have to babysit it. You don't want it to break and have you be notified in a couple of weeks. You want it to run reliably. Integration on the attacker side is actually getting everything ready, working with your botnets, proxies, services, making sure that production API keys are actually being used. Then you engage in the attack, which is the release phase for an attacker. Everything is all done. You execute your attack. You walk away and you'll reap the value later. Now, why is this important?

These first four stages are all the cost-incurring stages. The last stage is where value is recognized. We know all that. We get yelled at by executives and product managers when things aren't done yet because as long as it's in the first four stages, there's no value being received whatsoever. So in order to damage attackers who are going through this process with you, you need to reduce the value generation stage to as close to zero as possible and prolong the cost-incurring stages to as long as possible.

Now, think back to when I was talking about the developer mindset. So you're developing in Spring or React or whatever else. On the attacker side, it might be Sentry MBA or Selenium or Headless Chrome or anything like that. Now, you might have 8,000 different ways of detecting and blocking that, but if all those are deployed at the front, all you've done during the requirements gathering phase is tell an attacker, "Hey, don't worry about it. Don't use these tools. We've got them covered." That saves them so much time and cost and you've just boosted their process. What is much more valuable is to hold on to everything that you have until it's necessary, and only throw it out when the value generation phase has started.

In this case, they go through it. They're using Selenium. They go through the development phase. They deploy an attack. You block on just the bare minimum of what you need to do to block it, and they're like, "Oh, it broke." They go through it. They see what you did and it's like, "I should have accounted for that." Go through, they deploy it again, you block. It goes through it over and over and over again, and you start to piss people off.

This is really what you need to do. You need to professionally piss people off as a service. And that gets to the heart of how you de-motivate these attackers. You think about what pisses you off as a developer and then you start to attack those nuances and you drive the cost up, the value down for long enough and then there's no value to continue the attack.
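The idea of holding defenses back and releasing only the bare minimum can be sketched as a small reserve of detection signals, where each escalation turns exactly one of them into a blocking rule. The class and signal names are hypothetical; this is an illustration of the strategy described above, not a description of any product.

```python
class DefenseReserve:
    """Keeps known detection signals in reserve and activates one at a time."""

    def __init__(self, signals):
        self._held = list(signals)  # known to us, not yet used for blocking
        self.active = []            # currently allowed to block traffic

    def escalate(self):
        """Activate exactly one held signal; return its name, or None if empty."""
        if not self._held:
            return None
        signal = self._held.pop(0)
        self.active.append(signal)
        return signal

    def held_count(self) -> int:
        return len(self._held)

    def should_block(self, fired) -> bool:
        """Block only when a fired signal has already been activated."""
        return bool(set(fired) & set(self.active))


if __name__ == "__main__":
    reserve = DefenseReserve(["webdriver_flag", "timing_profile", "font_probe"])
    # Nothing blocks during the attacker's planning and development phases.
    print(reserve.should_block(["webdriver_flag"]))
    # Once the attack is released, spend one signal and keep the rest hidden.
    print(reserve.escalate(), reserve.held_count())
```

Each retooling cycle then costs the attacker a full pass through development and testing while costing the defender a single call to `escalate()`.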

Case Studies

Now I'm talking about a lot of very fuzzy things. What I've learned is that it's all irrelevant unless I can actually give you some examples of how this is actually done in the real world. The damaging reputation one is one of the more complicated ones to actually wrap your head around. I'm going to go through one example where we went through this with a big U.S. bank and a well-funded scraper. By scraper, I mean somebody who's accessing an API or website and gathering data that either they shouldn't have access to, or that the owner of that data thinks they shouldn't have access to, for whatever reason.

We saw this particular attacker going through multiple properties, which is what we see when we're protecting mobile and web. We'll see an actor go through whatever is more comfortable for them at the start. Then as defenses ramp up, they'll cycle off to the next softest target, and then they'll go through there until they circle back to where they started, try to dive a little bit deeper, then repeat until they settle on one.

We saw this particular actor settle on iOS due to some version lag with the SDK that we have deployed as a company, due to whatever reasons, politics, teams, holidays or whatever else. So defenses were not quite at the premier level there. We saw the actor stick around there and cause us quite a bit of pain. We went into a cycle where it felt like we were whack-a-moling just like back in the IP rate limit days. We would block. They would get past. We would notice it. We would block. They would get past, over and over and over and over again. This was clearly a very sophisticated attacker who was not dissuaded by this constant back and forth. You get these scenarios when you have these actors funded. Their livelihood is not dependent on this. Their mortgages, their dinners at home for their family, are not dependent on the success of these attacks. They have jobs to do this.

One of the things we noticed is that the cycle started to follow a traditional work schedule. So we noticed this over the course of several weeks and it became obvious that we were dealing with somebody who is not going to be dissuaded very easily because it was job security to go through this on a day-by-day basis.
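A simple way to check for that kind of pattern is to bucket the timestamps of observed retooling events by weekday and hour. The sketch below is an assumption about how one might do it, not the method used in this case study: the function names and sample timestamps are hypothetical, and it presumes the timestamps have already been shifted to a candidate time zone for the actor.

```python
from collections import Counter
from datetime import datetime


def working_hours_share(events, start_hour: int = 9, end_hour: int = 17) -> float:
    """Fraction of events on Monday to Friday with start_hour <= hour < end_hour."""
    if not events:
        return 0.0
    inside = sum(
        1 for t in events
        if t.weekday() < 5 and start_hour <= t.hour < end_hour
    )
    return inside / len(events)


def hour_histogram(events) -> Counter:
    """Count events per hour of day, to eyeball where activity clusters."""
    return Counter(t.hour for t in events)


if __name__ == "__main__":
    # Hypothetical times at which a bypass of the current defense was first seen.
    retooling_events = [
        datetime(2018, 11, 5, 10, 15),   # Monday
        datetime(2018, 11, 6, 14, 40),   # Tuesday
        datetime(2018, 11, 7, 16, 5),    # Wednesday
        datetime(2018, 11, 10, 23, 30),  # Saturday
    ]
    print(working_hours_share(retooling_events))
    print(hour_histogram(retooling_events).most_common(3))
```

A share close to 1.0 sustained over several weeks would be consistent with an actor who does this as a day job rather than opportunistically.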

We saw there was a regular working schedule. We also were tracking the success of this particular actor, and we were witnessing their public reactions to success or failure on the internet. We recognized that right after being successful, they would not announce their success until they were sure that the success was going to last. Then when that success was announced and downstream consumers were consuming that scraper's data, if the service was disrupted, that was more painful. This pain provoked a lot of distress that we were able to witness publicly via posts that we were finding.

What we did, of course, was target our offense outside of their working schedule. As a developer, that would piss you off. We'd turn on defenses specifically when we knew the most eyes were going to be on them, when the downstream consumers were using the app and the most eyes were on the actual data. We would turn off access to the data. We would engage our defenses. And then when we knew that the attacker was back on their working schedule or had started to engage with us again, we would turn off those primary defenses and then cycle through a series of secondary defenses. Even before the attacker had fully retooled and got around us, we would also take off some of our defenses.

Now, as a developer, variable feedback is poison. If you're trying to troubleshoot something and all of a sudden it works and you didn't do anything, it's like, "Okay, I guess I'll go do something else now," and then it stops working again, this is the type of stuff that causes extreme frustration. After a couple of weeks of going through this, we saw that they gave up. In this particular example, this particular scraper was not necessarily doing anything criminal, but it was a security problem for the customer we were dealing with. They weren't acting appropriately. They ended up engaging in a contract with each other in order to make sure that this was manageable over time. The same strategies work though, with any sophisticated actor, because we're all developers and we all think the same way.

The second case study is with an actual criminal actor. This is somebody who was engaging in a credential stuffing attack and then was actively taking over accounts for a big U.S. retailer. This was early in our days, this is about a couple of years ago now. It's still useful as a case study, even though we've evolved. This is at a time, though, where we were so determined to do everything possible for our customers that we were shooting ourselves in the foot without necessarily fully realizing it.

We were using a lot of what we're using to detect in order to mitigate. We thought we had enough, that we would just block on some of these signals we were finding, and the attacker would move on just like all the older ones would do. But this one didn't. This one kept on coming back, kept on retooling, and we were down to a point where we didn't really have that much left, but we knew we had to hold onto it and we couldn't mitigate on it. So we were in a position where we could see the bad traffic, but we couldn't do anything about it.

Now, when you're a vendor and you're talking with the customer, that's a strange and awkward conversation. It's like you see the attackers, but you can't do anything about it. We had to learn how to talk our way through these things. And now we've evolved a lot. In order to adapt quickly to these attackers, we built our systems to be supremely flexible and adaptable. We have our reverse proxy that's fully programmable. The JavaScript that we send down is all composed and generated and bundled on the fly. So this enables us to essentially build product features without new deployments. We can develop all these things very quickly. So what we did in this case, we started to flex that to degrees that we hadn't before and we developed a feature that enabled us to send down custom JavaScript to this particular actor.

Now, when you have the ability to segment the traffic to that point and deliver a custom payload there, the possibilities are intoxicating. This was a bad person who was doing bad things and they were just taking your code and running it. That's wild. So what we did is we delivered a payload that interrogated their systems to a degree that allowed us to see exactly the code that they were using in order to get around us. For people who deal with JavaScript, Function.prototype.toString is a way to inspect the code inside of a function. We were essentially just delivering that back up to us when it was changing, so that we could see how the actor was actually rewriting JavaScript in order to get around us.
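To illustrate the inspection idea described above, here is a minimal JavaScript sketch. It is not Shape Security's payload; the helper names (`sourceOf`, `snapshot`, `findRewritten`) and the `client` object are hypothetical, and only `Function.prototype.toString` comes from the talk. It snapshots a function's source text and later reports any function whose source has changed.

```javascript
// Hypothetical sketch: detect when a function has been rewritten by
// comparing its current source text with a snapshot taken earlier.

// Read the source text of a function. Calling through Function.prototype
// avoids trusting a toString property overridden on the function itself.
function sourceOf(fn) {
  return Function.prototype.toString.call(fn);
}

// Record the source of each named function on a target object.
function snapshot(target, names) {
  const seen = new Map();
  for (const name of names) {
    seen.set(name, sourceOf(target[name]));
  }
  return seen;
}

// Return an entry for every function whose source no longer matches.
function findRewritten(target, seen) {
  const changed = [];
  for (const [name, original] of seen) {
    const current = sourceOf(target[name]);
    if (current !== original) {
      changed.push({ name, current });
    }
  }
  return changed;
}

// Stand-in for code shipped to the client (made up for this example).
const client = {
  collectSignals() { return { touch: false }; },
};
const baseline = snapshot(client, ["collectSignals"]);

// Someone patches the function to lie about a signal.
client.collectSignals = function () { return { touch: true }; };

// Each entry carries the new source text, which could be sent back for review.
const rewritten = findRewritten(client, baseline);
```

In this sketch the `current` field holds the rewritten source, which is the kind of text where a telltale comment or typo might show up.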

This gave us the ability to see what they were trying, how they were working, how sophisticated they were, what they were poking at in order to get around us. Actually, one of the things that we found was that one of the functions that we had inspected had a commented-out line that had a typo in it. Typos are very good to Google or search on GitHub for because you're going to get very few hits. Googling for this typo led us to an open source plugin used by an attack tool that we had seen before and that we did have ways to mitigate.

What we found is that this particular actor was a competent developer, but what they were doing was largely just trial and error and just getting extremely lucky. Now, that's good for us because it made us more confident, because we weren't entirely sure exactly what we were dealing with. But now we had supreme visibility on precisely what we were dealing with and we knew how to respond.

We've certainly built up the defenses around that particular tool just to make sure we had enough backed up in case it got to the point where we needed to deploy more defenses. We also were able to now provide variable feedback during literally the retooling phase. Now, as a developer, that is nightmarish. What if something you're writing works when there are three log statements in it, but not two and not four? That starts to get crazy. We can now do that because we have this ability to inspect what's going on. And this is all stuff that you could do. We've had to build a lot of defenses in order to protect our code in order to make it less obvious what we're doing. But this is the type of stuff that you can do in order to just really bug the crap out of people.
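As a rough illustration of variable feedback, the following JavaScript sketch keeps the response decision separate from detection. Everything in it is an assumption made for the example: the `flagged` field, the outcome names, and the `successRate` parameter are hypothetical and are not described in the talk.

```javascript
// Hypothetical sketch of variable feedback. Detection happens elsewhere and
// only sets request.flagged; this function decides what the client sees.
const OUTCOMES = ["serve", "empty-result", "slow-error"];

function chooseResponse(request, successRate, random = Math.random) {
  // Unflagged traffic is never touched.
  if (!request.flagged) return "serve";
  // Flagged traffic sometimes works, so a failure does not line up
  // with any single change the attacker just made.
  if (random() < successRate) return "serve";
  // Otherwise pick one of several failure modes.
  const failures = OUTCOMES.slice(1);
  return failures[Math.floor(random() * failures.length)];
}

// Five consecutive requests from a flagged client; results vary per run.
const sample = [];
for (let i = 0; i < 5; i++) {
  sample.push(chooseResponse({ flagged: true }, 0.4));
}
console.log(sample);
```

The `random` parameter is injectable only so the behavior can be checked deterministically; a real deployment would more likely tie the variation to something the attacker cannot observe.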

We continued to build out this system of adaptability that can be built in any company. I work at a company, we're a vendor, we try to do this for people because it's complicated. This adaptability is important and this requires a lot of collaboration from the security and the product development teams.


The recap. If you take one thing away from this: detection and mitigation are separate, they need to have separate strategies, and you can never compromise detection for mitigation. It's a shortsighted play and it just shoots you in the foot. You need to protect the data that you use to detect, so you need to obscure what you're looking for. Even if somebody is able to see exactly what it is you're looking for, you need to make it less clear what parts you are actually acting on. So you need to over-collect or under-collect. Let's say you're accessing properties in the browser: maybe you just look for everything that starts with a P and not particular properties. Then whatever you get back, it's not necessarily obvious what it is that you're using to detect an attacker.
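The over-collection idea in the recap can be sketched in JavaScript as follows. This is an illustration under assumptions: `fakeNavigator` is a made-up stand-in for a browser object, and `collectByPrefix` and `serverSideSignal` are hypothetical names, not anything the talk describes in detail.

```javascript
// Hypothetical sketch of over-collection: gather every property whose name
// starts with a prefix, so the payload does not reveal which ones matter.
function collectByPrefix(source, prefix) {
  const collected = {};
  // for...in also visits inherited enumerable properties, which is where
  // many browser objects expose their values.
  for (const key in source) {
    if (key.startsWith(prefix)) {
      collected[key] = String(source[key]);
    }
  }
  return collected;
}

// Stand-in for a browser object such as navigator (values are made up).
const fakeNavigator = {
  platform: "Linux x86_64",
  product: "Gecko",
  pdfViewerEnabled: true,
  language: "en-US",
};

// The client sends back everything starting with "p".
const payload = collectByPrefix(fakeNavigator, "p");

// Server side (hypothetical): only one collected field feeds detection,
// and that choice never appears in the client code.
function serverSideSignal(data) {
  return data.platform === "Linux x86_64";
}
```

Someone reading the client code sees only the prefix filter, so it is not apparent from the payload which of the collected fields the server acts on.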

You need to understand what is incentivizing your attackers. So this certainly means working with fraud teams, working with marketing teams, working with product development teams, understanding how these accounts are being taken over and what people are doing with them in order for you to craft better defenses. And you need to work with product. I know it sucks, but it's something that is necessary in order to make sure that you are able to react when you need to react. It's hard because there's a big UX trade-off with security, and it requires different conversations to be had in the companies that we’re really not used to having. We are oftentimes the people who come in and just piss off others. We say no to things. We say, "If you want to be secure, you have to go be way over here." And they are like, "No, we're over here." We can meet in the middle and ramp things up and down as necessary. They can be at perfect UX until there's an attack and you might need to ramp it up or down as necessary.

This is a quote that I love because it defines so much of what I feel about security right now. "Tactics is knowing what to do when there's stuff to do. Strategy is knowing what to do when there's nothing to do." Security is very strategic right now. We have no end of penetration testing, vulnerability testing, exploit databases, dependency scans, all patches, RSS feeds that tell us everything. We are planning for stuff that we don't know is ever going to happen, but we have sacrificed our ability to react when something is actually happening. We need to ramp that way up because this is not stopping. The past five years, we've seen a dramatic rise in sophistication and that is not stopping. Thank you.

I did also head down to my office while I was here in Mountain View and I stole a bunch of webcam covers if anyone is interested. And there's a free whiskey tasting tomorrow. If anyone is around, please come on up. For the webcam covers, if you're asking, "Do I really need those?" If you ever ask a security professional, "Do I really need to do something?" the answer is almost always yes. Questions?

Questions & Answers

Participant 1: It seems like these tactics are equally applicable to the attacker and the defender. Have you actually seen that in practice?

Overson: You know what? No. I think it might be a little bit early in the game for us to have seen things like that. I think as a company - I'm biased, of course - we're relatively ahead of the game. Even just the ability to segment traffic and deliver our custom payloads down to an individual in order to do anything on their particular computer, I think that is something that we've now been doing for long enough that we feel comfortable talking about it in public, because there are likely people who actually understand that that is what we're doing right now. So it's no longer necessarily something we need to keep so secret, but it's something that I think will happen. I think everything horrible that you think will happen with security will happen with security. It's just a matter of how well it scales and whether or not there are easier ways for people to get what they would get out of that.

Participant 2: How do you see machine learning and blockchain playing their role in improving the state of security? Any takes on trends that you might see in the future where we might have more effective tools?

Overson: We will always have more effective tools, but so will attackers. Similar to what was just mentioned, I think right now it's starting to be on a decline, but over the course of the past few years, we saw so many security companies pop up with the magical silver bullet AI or blockchain solution. And what seemed to escape a lot of the media is that if we are using AI, the attackers will just use AI. What we are starting to see now, especially on mobile, is malware being installed on mobile devices and learning the behavior of the people who use those devices in order to deliver learned behavior to attack targets. Both sides will be using both of those, or, well, mostly AI rather than blockchain.

Participant 3: How do you cope with scenarios where the attacker starts using the variable feedback tactic back on you?

Overson: If I'm not mistaken, that's kind of what the question was earlier. You know what, I might need to think about that more, because I'm thinking about it in a very specific way. I think the variable feedback actually is already happening in a naive way, and what will happen is a more sophisticated version. So what some of those tools I mentioned are doing is, I guess, variable feedback. It's obscuring the fingerprint by changing the behavior of these tools and making them act differently on every request in order to make it less clear what's happening. There are a bunch of naive ways of doing that now. What we will inevitably see are ways that further emulate real human behavior, and do all the weird things that humans do.

Participant 4: Great talk, by the way. Thank you very much. You mentioned two things to do to make the attack not so effective, right? You elaborated more on how to drive the cost of the attack up. So it is not so clear for me personally what is applicable to drive the costs of those stolen data or the breach, whatever, down. Can you elaborate a little bit more on that? I understand that it's more on the application or production side of the business maybe, but at least a couple of examples to make it easier for the developers to understand.

Overson: That's a big question. One way that we're targeting this as a company, and I don't want to get too sales-pitchy, but there's a product that we have. Do you know "Have I Been Pwned?", Troy Hunt's website, and Pwned Passwords? You can go find out whether or not your data has been breached. You can find out whether or not your password has been used in these breaches. We have a service that's similar, but we tie it into our automation detection and deflection service. So we are able to see early cases where new credential pairs are being used and they aren't on the dark web sources. That's how we're trying to devalue credentials in total, because if they're made invalid on the first attack, then the value drops substantially and they become less valuable for any other attack.

Again, I mean, that's largely just within our network as a company and that's where I probably should stop because it gets kind of pitchy after that point. But I think internally to a company, that's a question that's probably worth having over a beer or two. I think.

Participant 5: In terms of analyzing the attacker, I've heard some interesting tales of malware scans like VirusTotal being used as a testing tool by malware developers. Can you remark on that or is that something you've seen?

Overson: Can you rephrase the question?

Participant 5: There are some tools available now that will run - I guess maybe this is more in the antivirus space - but that will run different malware scans on sample files. In some cases, these tools are used or have been observed being used as part of the SDLC for a malware developer.

Overson: That is exactly what I was talking about with keeping detection and mitigation separate. A lot of those antivirus tools are looking for fingerprints in malware, and if you use that to detect and to block, then you allow the attackers to figure out what you're using to detect. Then once they've gotten around the detection, they've gotten around the mitigation. That's certainly something that is done. I focus mostly on the web, where that is, I guess, less relevant. But that strategy in general, the fingerprint whack-a-mole, is not a long-term strategy.

Participant 6: Great talk and very interesting to hear the variable feedback perspective. One question I have on that is don't you have to babysit this to provide the variable feedback, or is this something that is baked into the tool that you were mentioning?

Overson: Internally to the company, and I think this is actually probably a good practice that anyone should apply, when you find something generalizable, you generalize it and make it automated. And then as you find new edge cases, you tackle those by hand until you find a way to generalize them, and then you generalize them. So babysitting is not a term that I would use, but you do need humans who are analyzing anomalies in traffic in order to identify when someone is getting around you. That's really, I think, one of the failures of AI as a silver bullet right now. Machine learning does a good job at finding similarities in the things that we have seen similarities in before. In adversarial situations like what we're talking about, adversaries will always try to get around that, and you can have anomaly detection systems that alert you that something is up. It's hard to have those systems adapt automatically, quickly, and in a foolproof way. So if having humans involved is babysitting, then yes, babysitting. But I think it's more just fleshing out a team that should exist and doesn't exist.
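As one simple illustration of an anomaly detection system that alerts a human that something is up, here is a JavaScript sketch using a plain z-score against a recent baseline. The z-score is a technique chosen for this example and is not something the talk specifies; the function names, the threshold of 3, and the traffic numbers are all made up.

```javascript
// Hypothetical sketch: flag a traffic measurement that sits far from a
// recent baseline so a human can investigate. Uses a plain z-score.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function standardDeviation(values) {
  const m = mean(values);
  const variance = mean(values.map((v) => (v - m) ** 2));
  return Math.sqrt(variance);
}

// Returns true when `value` is more than `threshold` standard deviations
// away from the baseline mean.
function isAnomalous(baseline, value, threshold = 3) {
  const sd = standardDeviation(baseline);
  // A perfectly flat baseline has no spread, so any deviation counts.
  if (sd === 0) return value !== mean(baseline);
  return Math.abs(value - mean(baseline)) / sd > threshold;
}

// Made-up login attempts per minute for the last five minutes.
const recentLogins = [100, 104, 98, 102, 96];
const alertOnSpike = isAnomalous(recentLogins, 130);  // far above baseline
const alertOnNormal = isAnomalous(recentLogins, 101); // within normal range
```

A check like this only raises the alert; deciding what the anomaly means and how to respond is the part that still needs people.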

Participant 7: Awesome talk. Thank you so much. I was curious about the aspect that you mentioned whereby monitoring forums and seeing what's going on in some of those spaces, you could kind of track the psychological impact of what you were doing on the attacker. I was curious if that's something that you do proactively, like monitoring forums, keeping tabs on big players, if you will, or if that's something reactive where you identify who you think might be the target, and then go searching for that information?

Overson: We try to be proactive, and we have teams that monitor dark web things in general for a bunch of different purposes. What we sometimes find are posts that turn out to be referencing our technology in one way or another, and then we start to look into those people, those posts, their interactions, where else they link, and then start to build out that network there.

It's not something that is easy. It is hard. One thing that I've seen a lot of companies do nowadays is proactively scan dark web forums and marketplaces in order to look for their own data. That is an easier way to automate this process, because you can have a variety of canary-like entries in your databases. So you can just automatically search for whether or not those entries exist in dark web data. If they do, then you know you've been breached or something has happened and you can go back. To do this stuff that we're talking about, we can do it because we're a service and that's kind of wrapped up into the service. It's hard and cumbersome and fraught with dead ends.
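The canary approach mentioned above can be sketched in JavaScript like this. It is a simplified illustration under assumptions: the canary naming scheme, the `example.invalid` domain, and the dump format are all invented for the example, and real canaries would need to be unguessable rather than numbered.

```javascript
// Hypothetical sketch: plant canary records, then scan leaked dumps for them.
function makeCanaries(count, domain) {
  const canaries = [];
  for (let i = 0; i < count; i++) {
    // A counter keeps the example repeatable; real canaries should be random.
    canaries.push(`canary-${i}@${domain}`);
  }
  return canaries;
}

// Return every canary that appears anywhere in the dump, ignoring case.
function findCanaries(dumpText, canaries) {
  const haystack = dumpText.toLowerCase();
  return canaries.filter((c) => haystack.includes(c.toLowerCase()));
}

const canaries = makeCanaries(3, "example.invalid");

// Made-up dump contents in an email:password format.
const dump = [
  "alice@example.com:hunter2",
  "Canary-1@example.invalid:x7!kq",
  "bob@example.org:password1",
].join("\n");

// Any hit means data from the database holding that canary has leaked.
const hits = findCanaries(dump, canaries);
```

Planting different canaries in different databases or tables would also indicate which store the leaked data came from.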


Recorded at:

Apr 06, 2019