Johnny Xmas on Web Security & the Anatomy of a Hack

On this podcast, Wes talks to John Xmas. Johnny works for Kasada, a company that offers a security platform to help ensure only your users are logging into your web applications. Johnny is a well-known figure in the security space. The two discuss common attack vectors, the OWASP Top 10, and then walk through what hackers commonly do when attempting to compromise a system. The show is full of advice on protecting your systems, including topics around Defense in Depth, Time-Based Security, two-factor authentication, logging/alerting, security layers, and much more.

Key Takeaways

  • While there are sophisticated web attacks out there that use things like PhantomJS or Headless Chrome, the vast majority of web application attacks are the same unsophisticated scripted attacks that you always hear about. These are simple scripts using tools like curl and BurpSuite with Python or JavaScript. These simple scripts are still incredibly effective.
  • OWASP Top 10 really hasn’t changed all that much in the last ten years. For example, despite being the number one approach used to educate defensive engineers on how to protect their apps, SQLI (SQL Injection) is still the most common attack. We continue to repeat the same mistakes that have exposed systems for a decade now.
  • Phishing is by far the quickest way to compromise a system. Defense in Depth, security boundaries, and limiting local admin rights are all things that corporations can implement to minimize the blast radius.
  • Attackers have hundreds of gigs of actual username/password combinations that have been exposed from all the breaches over the past few years. These are often a first step when attempting to compromise a system. More often, they will figure out a valid email pattern for a company and then feed actual names into that pattern to go after the username. From there, brute force attacks with those usernames against libraries of passwords are a common approach.
  • A common approach is to go after an email login. While the email can be a treasure trove of information, it’s more about using those credentials in other places. It’s pretty common, for example, to use those credentials to get into a network with a VPN.
  • Captcha/reCaptcha is not very effective at preventing these brute force attacks. There are a large number of bypasses and even Mechanical Turk companies that are available to bypass these tools. What can be effective is Time Based Security because it slows the attackers down. If you can slow them down, you can make the attack take so long to succeed that they’ll go somewhere else.
  • Once inside the network, most companies often have little security on internal systems. Multi-factor authentication, not just on the front door, but on internal systems is a huge step in the right direction. Monitoring not only for failed login attempts but, in some situations, valid login attempts (such as when a domain admin logs into a domain controller) should absolutely be used.
  • When it comes to application security between services within a network, the best advice is to make sure developers really understand what is trying to be accomplished by something like JWT (JSON Web Tokens). Often it’s the lack of understanding of what they’re actually doing that leads to system vulnerabilities.

Show Notes

What does "ensuring only humans access web apps" mean? -

  • 01:15 We used to talk a lot about bots, and I found this gave the mentality of botnets of compromised computers in the cloud.
  • 01:35 What we've developed is something beyond that - targets the use of automation as an attack tool of any form.
  • 01:45 When you're being attacked by a human being, they are likely to be writing a lot of scripts and automated tools to do a lot of the monotonous work.
  • 01:55 The main flaw with humans is that there are a lot of monotonous tasks that need to be done within a certain amount of time
  • 02:05 Humans get bored very quickly in comparison to computers - that's why we use computers to do those monotonous things.
  • 02:15 It's not just the botnets that are out there launching one-off vulnerabilities, but also the intelligent attackers who have a complex toolkit built out.
  • 02:35 While we ensure that these botnets are keeping their hands off of your stuff, we're also ensuring that these one-off bots are also staying out of the way.
  • 02:50 As a result, only humans should have access to the web apps.

Do you deploy as part of a CDN or a reverse proxy? -

  • 03:05 We have our own cloud infrastructure that can be dropped in at any part of the stack.
  • 03:10 We do recommend going behind the CDN which allows you to control URL specific rules for filtering.
  • 03:20 You don't necessarily want to process every HTTP request that comes in to determine if it's a bot or not.
  • 03:25 The reason you have a CDN is to offload a lot of your asset delivery (images, videos etc) - so if you've got the CDN doing that anyway, you can let that handle the attackers.
  • 03:40 If you're dealing with an intelligent attacker brute-forcing on a login form, they're often not downloading assets anyway.
  • 03:50 If you put us behind your CDN, or we can operate the CDN, then we get those requests and figure out what's good and bad - the good stuff gets forwarded to you and the bad stuff dropped.

What are the sophisticated attacks today you are seeing? -

  • 04:20 It's almost misleading to use "sophistication" - there are some extremely sophisticated attacks, but the vast majority of them are still the same old scripted curl/Python/NodeJS hammering.
  • 04:55 Nearly no-one is defending against that sort of thing, and that predominantly comes from ancient technology.
  • 05:10 Relying on IP blacklisting is completely useless - IP addresses are freely and easily rotatable; you can get thousands of them cheaply.
  • 05:20 There's been nothing on the defensive side to cause the attackers to up their sophistication - if there's no reason to do the work, they're not going to do it.
  • 05:30 That being said, there are sophisticated tools out there, such as PhantomJS or Zombie JS – most people will be familiar with Puppeteer combined with headless Chrome.
  • 05:40 That's actually controlling a real browser that's able to do everything - handle cookies, process JavaScript - that's extremely devastating and defeats most detection systems out there.
  • 05:55 At Kasada, we have developed a great defence from Puppeteer and other tools as well - but for the most part, companies don't have that sophistication in defence.
  • 06:05 The upside is that most attackers don't resort to that unless they need to - and when they need to, it is surprising how many attackers aren't more aware of those kind of sophisticated tools.

It doesn't seem like the top ten OWASP vulnerabilities have changed much in the last decade. -

  • 06:30 A good friend of mine, Jason Street, likes to use the phrase that SQL injection is the most common vulnerability on the internet today.
  • 06:45 It's incredible where SQL injection still works, given that it is the number one tool used in training people how attack methodologies work.
  • 07:05 It's prolific - I can't believe it still works.
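
To make the SQL injection point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, the function names, and the payload are all illustrative assumptions, not anything discussed on the podcast. It contrasts a query built by string concatenation with a parameterized query.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    # Plain-text password only to keep the demo short; real systems
    # should store salted password hashes.
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def login_vulnerable(conn, name, password) -> bool:
    # String formatting: attacker-controlled text becomes part of the SQL.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_parameterized(conn, name, password) -> bool:
    # Placeholders: the driver passes the values as data, never as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR '1'='1"
    print(login_vulnerable(conn, "alice", payload))     # injection succeeds
    print(login_parameterized(conn, "alice", payload))  # injection fails
```

The only difference between the two functions is whether user input is spliced into the SQL text or bound as a parameter, which is why this class of bug is usually cheap to fix once it has been found.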

What does an anatomy of an attack look like? -

  • 07:20 The most common methodology is phishing - I'll skip over that one, because it's almost cheating.
  • 07:30 A lot of our clients won't let us use phishing as an attack methodology because it always works.

So the weakest link is the people that operate the system? -

  • 07:45 Absolutely - phishing is incredibly effective; it's so effective, that many of my clients wouldn't want me to use it because they know they are susceptible to it.
  • 08:00 As a result, they don't look at it as worth their money to use it when there are other things which need to be investigated.
  • 08:05 A lot of people in companies are getting tired of the awareness training that involves the anti-phishing stuff.
  • 08:10 Everybody knows what phishing is and how it works - in IT, in my family - everyone has been subject to the security awareness training.
  • 08:25 We are in an age where everyone is given an e-mail address, including the cashier at a retail store, so you get the awareness training.
  • 08:30 The bottom line is that most of the awareness training isn't very good: they're not training people properly, often because the training wasn't designed by people who know psychology.
  • 08:40 In the end, most people don't care about the attack, and they are going to fall for these phishing attacks because their mind is elsewhere.
  • 09:00 If you try long and hard enough, you're going to be able to get someone to open that e-mail, click on that link, download that executable - protections aren't inside and the executable is going to give you an in.
  • 09:10 The reason I call this cheating (as part of discussing the anatomy of a hack) is that you're just targeting the people with the highest level of access - IT admins or the helpdesk.
  • 09:20 The helpdesk is my favourite - it's often very inexperienced people with very high levels of access within the company.
  • 09:30 Once you can get one of them, you're in - and you mostly have the keys to the kingdom, and it's a short hop, skip, and a jump to whatever you have defined as the goal for that engagement.

How do you limit the blast radius? -

  • 09:55 In a huge corporation with tens of thousands of endpoints, possibly in place since the dotcom boom, it's very difficult to remediate some of these things.
  • 10:10 I would go in and observe that everyone in the corporation has local admin rights on their Windows machines.
  • 10:15 That's absolutely devastating - that's a critical thing that you need to stop right now.
  • 10:20 Then it turns out that they can't just turn it off, because so many people have built workflows around requiring local admin and it's going to take years to find out who really needs it.
  • 10:30 Even when they need local admin, you can put controls in to make sure they're not constantly using it - but then you get pushback from the teams, then managers, then the IT policy doesn't change.
  • 10:45 If the people who have access to the most things within the company also have local admin, and I can compromise one of those people, then it's game over.

What is the next most common attack vector? -

  • 11:05 The second most common attack from couch to compromise methodology is fingerprinting of external environments, and you can test login credentials.
  • 11:20 Login credentials are something that every attacker out there has a gargantuan amount of - hundreds of gigs, potentially username and password combinations (in some cases, confirmed).
  • 11:30 These come from breaches that you hear about on the news on a daily basis.
  • 11:40 If you hear about a company getting breached, with X number of customer records leaked, then that often means a database was found that contains that many records.
  • 11:50 It often contains usernames and passwords, often plain text - sometimes the passwords are not plain text.
  • 11:55 In the non-plain text case, those researchers will go through a lot of effort to convert those hashed passwords into plain text.
  • 12:00 There are competitions for such things, and then they are shared around different users - there's now a terabyte of password lists.
  • 12:15 The trick is: figure out a place where I can see who is using what passwords.
  • 12:20 I then use a username and password list to see if I can break in with those.
  • 12:25 If we're breaking into a web login, then the usernames will often be e-mail addresses.
  • 12:35 Using some tools I'm going to figure out what the e-mail address format is going to be for your employees - all I have to do is find a few e-mail addresses of people who work for your company, and I'll be able to figure that out.
  • 12:45 The chances are that your company will use something like firstname.lastname or firstinitial.lastname, so I'll take the people names that I have and will build usernames and e-mails based on that logic.
  • 13:05 Once I've got those, I'll start firing those off in combinations with the password lists that I've got, starting with the most common passwords first.
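
The address-building step described above is simple enough to sketch. The helper below is purely illustrative (the function name, the pattern placeholders, and the example domain are assumptions) and is the kind of thing a tester on an authorised engagement, or a defender auditing how predictable their own addresses are, might write.

```python
def candidate_addresses(full_names, pattern, domain):
    """Expand people's names into e-mail addresses for a given pattern.

    `pattern` is a format string using the hypothetical placeholders
    {first}, {last} and {f} (first initial), e.g. "{first}.{last}".
    """
    addresses = []
    for full_name in full_names:
        parts = full_name.lower().split()
        if len(parts) < 2:
            continue  # need at least a first and a last name
        first, last = parts[0], parts[-1]
        local_part = pattern.format(first=first, last=last, f=first[0])
        addresses.append(f"{local_part}@{domain}")
    return addresses


if __name__ == "__main__":
    names = ["Ada Lovelace", "Alan Turing"]
    print(candidate_addresses(names, "{first}.{last}", "example.com"))
    print(candidate_addresses(names, "{f}.{last}", "example.com"))
```

Because the whole step is a few lines of string formatting, keeping the address format secret is not a meaningful defence; the protections discussed later (throttling, multi-factor authentication, alerting) matter far more.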

How can you slow those things down; backoffs? -

  • 13:30 Captchas aren't very effective - you can go to GitHub and search "captcha bypass" and "recaptcha bypass" - there's an ocean of results for this.
  • 13:40 Those bypasses are all very effective and they work.
  • 13:45 Captcha doesn't work at slowing down automated attacks, but it does annoy your users.
  • 13:50 It works to keep the casual attacker at bay, who doesn't want to put in any work into developing a targeted attack on your site.
  • 14:00 It does keep the botnets at bay - the ones that were crafted for one type of attack against one type of application and move on when it doesn't work.
  • 14:15 The really motivated attackers, who have the desire to compromise your specific website or company, those will be able to get past really easily.
  • 14:30 Failing automated tools, if you have a form of a captcha that has not had any type of bypass tool created for it, there are countless mechanical turk companies that will sell you human beings that will solve the captchas for you. [Obligatory XKCD: - ed]
  • 14:45 They have APIs and howtos and instructions, and you can drop in your script that you are writing, and it will offshore your captcha solving.
  • 14:55 I was looking at one in Indonesia, where for 50 US cents you can get a thousand captchas solved.
  • 14:50 The payout was something like 50 cents per five thousand captchas solved - but you can work from home, and it goes a long way in those areas.
  • 15:20 In the end, captchas don't work, because I can pay a human being to solve the captcha for fractions of a dollar and get them to solve it for me.
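
The economics quoted above can be worked through in a couple of lines. This is back-of-the-envelope arithmetic only: the 50 cents per thousand price comes from the notes, while the number of attempts is an assumed figure for illustration.

```python
def solving_cost_usd(attempts: int, price_per_thousand_usd: float = 0.50) -> float:
    """Cost of outsourcing one captcha solve per login attempt."""
    return attempts / 1000 * price_per_thousand_usd


if __name__ == "__main__":
    # A hypothetical campaign of one million login attempts.
    print(solving_cost_usd(1_000_000))  # 500.0
```

At that price a million captcha-gated attempts costs the attacker about 500 US dollars, which is why a captcha on its own does not change the attacker's calculation much.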

What about exponential back-off? -

  • 15:35 Those are effective, because you're slowing the attacker down - anything you can do to slow down the attacker or make it take more time is going to be an effective form of defence.
  • 15:45 We call it "time-based security" - it's based off of the popular book by Winn Schwartau [Free download from the author's site at - ed]
  • 15:55 It's excellent for everyone who is working in technology.
  • 16:00 The bottom line is you will never have perfect security - the trick is to make the attack so long to succeed that either: you're going to catch the attacker; or they are going to give up.
  • 16:25 If you can limit the number of attempts that can be made over time, then that's going to be more effective.
  • 16:35 It's not a devastating defence, because as I mentioned, computers have infinite patience and can try as long as the system lets them try.
  • 16:45 They can try while the human is sleeping, they can try when the human is eating - they just keep on going.
  • 16:50 If the human figures out what the algorithm is behind how many attempts you can make before you get locked out, which is often trivial, they write that into the script.
  • 17:00 For example: if you get locked out with five failed attempts in a minute, then try four attempts, wait for a minute, try another four attempts and so on.
  • 17:05 They can randomise the delay between attempts to avoid being fingerprinted as a bot - instead of trying exactly every 25s, the attempts may come in at 10s, 21s, 14s - so the pattern looks erratic.
  • 17:30 Randomisations can help avoid detection as a bot.
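
As a rough illustration of the exponential back-off idea, the sketch below doubles the lockout after each consecutive failed login for an account. The class name, parameters, and in-memory storage are assumptions made for the example; a real deployment would need shared storage and would also have to consider that per-account lockouts can be abused to lock out legitimate users.

```python
import time


class LoginThrottle:
    """Per-account exponential back-off after failed logins (illustrative)."""

    def __init__(self, base_delay=1.0, max_delay=3600.0, clock=time.monotonic):
        self.base_delay = base_delay  # seconds of lockout after the 1st failure
        self.max_delay = max_delay    # upper bound on any single lockout
        self.clock = clock            # injectable so the logic can be tested
        self._failures = {}           # username -> consecutive failure count
        self._blocked_until = {}      # username -> earliest next attempt time

    def allowed(self, username) -> bool:
        """Return True if this account may attempt a login right now."""
        return self.clock() >= self._blocked_until.get(username, 0.0)

    def record_failure(self, username) -> float:
        """Register a failed login and return the new lockout in seconds."""
        count = self._failures.get(username, 0) + 1
        self._failures[username] = count
        delay = min(self.base_delay * 2 ** (count - 1), self.max_delay)
        self._blocked_until[username] = self.clock() + delay
        return delay

    def record_success(self, username) -> None:
        """Reset the counters once the account logs in successfully."""
        self._failures.pop(username, None)
        self._blocked_until.pop(username, None)
```

With the defaults, consecutive failures produce lockouts of 1, 2, 4, 8 seconds and so on up to the cap. As the notes point out, a script can learn the schedule and wait it out, so this slows an attack down without stopping it.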

What happens after you have a way in? -

  • 17:45 The whole point of validating these credentials is not necessarily trying to gain access to someone's inbox - although an inbox can be a treasure trove of information.
  • 17:55 You're looking for valid credentials, because they can be used in a lot of other places.
  • 18:00 One of my favourite places is VPNs.
  • 18:05 I'll be pegging your inbox to get a valid set of credentials, because maybe the mail server is the least protected against brute force attempts.
  • 18:15 Once I've got your credentials, I'll hop onto your VPN, and now I have access to your network.
  • 18:20 This sounds like I'm making it easy for the sake of the explanation, but I'm not.
  • 18:30 This is an extremely common attack vector that I've used: brute-force the mailbox, then use those credentials on a VPN and then you're in.
  • 18:40 These are things you can defend against: multi-factor authentication goes a huge step of the way, especially if you're doing Active Directory logins.
  • 18:50 Most companies don't implement that - it's getting better now though.
  • 18:55 It's expensive and difficult to implement - especially with Fortune 1000 with the number of computers and services that they have to implement this on.
  • 19:10 In the end, it's the most successful defence against these sort of attacks.
  • 19:15 Once I've got these credentials, and I'm able to VPN in, I start looking around again.
  • 19:20 I see what access I have from the VPN that I've been dropped in to.
  • 19:25 In most cases, there's no network segmentation - I'm able to fingerprint the entire network, see all the servers that are running out there, and what they are running on.
  • 19:35 I'll see what I can use valid login credentials on, especially since I've gained valid Active Directory credentials, because I can use those on every Windows (and potentially Unix) system.
  • 19:50 I'll try to identify server groups and subnets (usually based on DNS names) because those are the ones that likely have the most useful information or access.
  • 20:05 It might possibly be the end point where you have the customer information or credit cards.
  • 20:15 I'm going to hop the network again, doing what I just did - trying the logins that I have.
  • 20:20 Once I'm in, the defences that people have externally against brute force don't exist internally.
  • 20:30 No-one is throttling login attempts internally on a web application, and that blows my mind.
  • 20:35 Often, especially in the medical realm, databases contain private patient data.
  • 20:45 They think because this web application isn't exposed to the internet, I don't have to put any protections on there.
  • 21:00 Nobody thinks about the malicious attacker being inside the network, but we hear about the APT (advanced persistent threat) that has been inside for years and you didn't know.
  • 21:10 At the same time, they don't attempt to do anything about it.
  • 21:15 Adding multi-factor authentication to your internal apps is a huge step in the right direction.
  • 21:20 Most of those web apps are protecting something you don't want leaked out of your company.
  • 21:25 Monitoring is important for not only failed login attempts, but also valid login attempts in certain situations.
  • 21:35 If you have an application which shouldn't be logged in on a regular basis, or never at all unless maintenance is done, you should have monitoring on.
  • 21:45 If you have domain controllers, you should have an alert go off every time your domain controller is logged into for your security team to review.
  • 21:55 Every time a domain administrator logs in as a domain administrator anywhere, an alert should go off - no-one should log in as a domain administrator unless they're administering a domain controller - those would be both very rare and very suspicious.
  • 22:10 Most of the time you'll find it is someone doing legitimate work for legitimate reasons, and you mark it off.
  • 22:20 I recommend that any time a DC is logged into, there's a change control ticket that is associated with it.
  • 22:25 Alerting on good logins goes a huge step of the way.
  • 22:30 I'm hopping round your network with a valid username and valid credentials.
  • 22:40 Normally, if you're just alerting on 20 failed logins in a 60 second period, that's good - but it's not going to stop me.
  • 22:50 If you alert when a domain admin logs in even once, which should never happen, that would stop me.
  • 23:00 If I've created a domain admin account, an alert should go off.
  • 23:05 It's things that are easy to implement but we don't, because we have this mentality that the internet and intranet are separate, and that the internet is where the bad guys are.
  • 23:15 For some reason, we don't take into account that it only takes one person's account to be compromised and now the internet and intranet are one and the same.
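
The "alert on good logins" advice above can be sketched as a small filter over login events. Everything here is hypothetical (the event shape, the field names, the alert wording); in practice this logic would live in whatever log pipeline or SIEM an organisation already runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoginEvent:
    user: str
    host: str
    success: bool
    change_ticket: Optional[str] = None  # hypothetical change-control reference


def alerts_for(events, domain_admins, domain_controllers):
    """Return alert messages for successful logins that deserve review."""
    alerts = []
    for event in events:
        if not event.success:
            continue  # failed logins are covered by separate threshold alerts
        if event.host in domain_controllers:
            # Every DC login is reviewed; a missing ticket stands out.
            ticket = event.change_ticket or "NO CHANGE TICKET"
            alerts.append(f"DC login: {event.user} on {event.host} ({ticket})")
        elif event.user in domain_admins:
            # Domain admin credentials used anywhere other than a DC.
            alerts.append(
                f"Domain admin login off-DC: {event.user} on {event.host}"
            )
    return alerts
```

Ordinary users logging in to ordinary hosts produce no alerts, so the review queue stays small; the two cases that do alert are the ones described in the notes as rare and suspicious.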

So it's defence in depth, layered security, subnets, bastions, 2FA? -

  • 23:40 To coincide with time-based security is defence in depth - you put so many layers in the way that eventually they are going to hit one that they can't get past, or they are going to take too long over.
  • 24:00 Time is the one thing that humans cannot generate more of for themselves.
  • 24:05 It goes beyond patience - perhaps the data has a limited useful timespan, and if it takes longer than that, then it isn't worth it any more.
  • 24:15 Anything that increases the amount of time an attack takes, even if you're dealing with an automated attack, that's going to help you.
  • 24:25 Defence in depth is critical - effectively, it is answering the question: "what happens when that fails?"
  • 24:40 Eventually you'll hit a point of "well then, we're screwed" - but you bounce that off the risk assessment.
  • 24:50 It means that in order for this to have failed, this amount of time has passed, and the chances of all these things failing is part of your risk assessment.
  • 25:10 You get your defence in depth so that your risk assessment is acceptable.

How do you build defence in depth with micro-service architectures? -

  • 25:35 When you're dealing with SHA and hash validation, the most common failure I see is people designing these systems without understanding the underlying functionality.
  • 25:50 They might understand that they have to do it this way, but they might not understand why they have to do it this way.
  • 26:00 You end up with situations in which the same SHA can be generated for every single transaction; it comes down to understanding why you're doing what you're doing.
  • 26:15 You don't have to understand how the SHA algorithm generates the hash, but you do need to understand why you're generating the hash.
  • 26:25 At that point, you can understand why that point is important and why you should do it.
  • 26:30 I find too many instances where we're able to get around JWT based on guessing encryption keys.
  • 26:35 Maybe someone used a garbage encryption key, like 'puppy', to base things on in dev, and it now exists in prod.
  • 26:55 I've found situations where I'm able to directly locate or guess passwords, encryption keys just based on comments in a web application.
  • 27:05 I'm able to find keys left in git repositories - things like that.
  • 27:15 It's not so much "Is there something really critical when we're implementing JWT" or "Is there a whole school of thought about micro-services"
  • 27:30 The problem right now is the problem with OWASP top ten: people still are not wrapping their heads around the basic concepts of security with what they're doing.
  • 27:40 Talking about advanced topics isn't what's necessary right now.
  • 27:50 Even three or four years ago, you wouldn't hear micro-services being discussed - that was only used by huge companies - and now it's something everyone's looking at doing.
  • 28:05 It's like Kubernetes - now that it seems like everyone is using it, people are jumping on the bandwagon.
  • 28:20 The bottom line is that stuff is coming out so fast that we're not learning the security of it because of the same problem that has always been: security is always an afterthought.
  • 28:30 We're being pushed so hard to get these projects done on time and in budget that security is the easiest thing to ignore.
  • 28:40 The mentality is: ship it!, and we'll get the pen-testers in later after it's shipped - and that is the second worst possible scenario.
  • 28:55 The worst scenario is, of course, to never have it tested.
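
The guessable-key problem mentioned at 26:30 can be demonstrated with the standard library alone. The sketch below assumes HS256-style tokens, where the "key" is an HMAC signing secret; the notes do not say which JWT algorithm was involved, and all function names and the wordlist are hypothetical. It is framed as a check to run against your own tokens.

```python
import base64
import hashlib
import hmac
import json
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _signature(signing_input: str, key: bytes) -> str:
    digest = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def sign_hs256(payload: dict, key: bytes) -> str:
    """Build a minimal HS256-signed token (header.payload.signature)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_signature(signing_input, key)}"


def find_weak_key(token: str, wordlist):
    """Return the first wordlist entry that reproduces the signature, if any."""
    signing_input, _, signature = token.rpartition(".")
    for word in wordlist:
        candidate = _signature(signing_input, word.encode("utf-8"))
        if hmac.compare_digest(candidate, signature):
            return word
    return None


def new_signing_key() -> bytes:
    """Generate 32 random bytes from the OS CSPRNG to use as a secret."""
    return secrets.token_bytes(32)
```

A token signed with 'puppy' is recovered by the first wordlist that happens to contain that word, after which an attacker can mint tokens of their own. A randomly generated secret that is kept out of source comments and git history closes off that particular route.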

Any final thoughts? -

  • 29:15 Security isn't complicated - we may think that it is, but it's not.
  • 29:20 The most basic problems that we have - many of which we had in the early 1990s when we were first connecting computers together - still exist.
  • 29:30 Learn about these things as a developer, as an architect - learn about the OWASP top ten; not just reading it, but be the attacker.
  • 29:45 You learn so much more by doing it - that's the way most developers learnt their trade.
  • 30:05 Learn the attacks - there's no end of information on the internet that teaches these attacks.
  • 30:10 That's one of the reasons that these basic attacks are so common - it's because they're the ones people learn first.
  • 30:15 Learn how to do SQL injection, learn how to brute force passwords, go try it (against a system that you're authorised to!)
  • 30:25 It's so much easier to stand up a virtual network in your own home for literally no money to launch these attacks against.
  • 30:35 Learn these attacks - that will give you such a better insight into how to defend against them versus just the six sentences you were taught from a text book in college.

More about our podcasts

You can keep up-to-date with the podcasts via our RSS Feed, and they are available via SoundCloud, Apple Podcasts, Spotify, Overcast and the Google Podcast. From this page you also have access to our recorded show notes. They all have clickable links that will take you directly to that part of the audio.
