Despite not yet being an official standard, HTML5 has grown rapidly in adoption and influence over the past few years. Whether it's the Web, mobile, or even SOA, everything seems to have a strategy for integrating with HTML5. However, HTML5 is more than just an update to the original markup language: it encompasses related technologies such as JavaScript APIs and WebSockets. We have heard a lot about WebSockets recently, including introductions to the technology and discussions of whether or not it has any impact on REST. Now Lori MacVittie has made the argument that WebSockets may result in a less secure Web, as people trade off security for performance. As she points out, referring to a report from 2011, many people are used to making that trade-off already; the survey found that ...
... while 91 percent of the respondents were not only making tradeoffs between security and performance, a full 81 percent were actually disabling security features.
But what has this to do with WebSockets? Well, according to Lori, because WebSockets removes HTTP headers, amongst other things, it strips out information that existing anti-virus and malware checkers rely on, opening up new vulnerabilities:
You know, things like CONTENT-TYPE. You know, the header that tells the endpoint what kind of content is being transferred, such as text/html and video/avi. One of the things anti-virus and malware scanning solutions are very good at is detecting anomalies in specific types of content. The problem is that without a MIME type, the ability to correctly identify a given object gets a bit iffy.
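To see what that loss means in concrete terms, consider what a WebSocket frame actually carries once the connection is established. The following is a minimal sketch (in Python, following the RFC 6455 frame layout) of parsing a frame header; the function name is illustrative, not from any particular library. The only metadata about the payload is a 4-bit opcode saying "text" or "binary"; there is no MIME type anywhere for a scanner to key on.

```python
import struct

def parse_frame_header(data: bytes):
    """Parse the header of an RFC 6455 WebSocket frame.

    Returns (fin, opcode, masked, payload_len, payload_offset).
    Note that no content-type field exists anywhere in the frame:
    the only hint about the payload is the 4-bit opcode
    (0x1 = text, 0x2 = binary).
    """
    first, second = data[0], data[1]
    fin = bool(first & 0x80)        # final fragment of the message?
    opcode = first & 0x0F           # 0x1 text, 0x2 binary, 0x8 close, ...
    masked = bool(second & 0x80)
    length = second & 0x7F
    offset = 2
    if length == 126:               # 16-bit extended payload length
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:             # 64-bit extended payload length
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    if masked:                      # client-to-server frames add a 4-byte mask
        offset += 4
    return fin, opcode, masked, length, offset

# RFC 6455's example of a single-frame, unmasked text message "Hello":
frame = b"\x81\x05Hello"
fin, opcode, masked, length, offset = parse_frame_header(frame)
payload = frame[offset:offset + length]
```

Contrast this with an HTTP response, where `Content-Type: text/html` and `Content-Length` sit in plain view before the body even begins.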
Of course, relying on HTTP headers is no guarantee against malicious content, but as Lori mentions:
[...] generally speaking the application serving the [data] doesn’t lie about the type of data and it is a rare vulnerability that attempts to manipulate that value. After all, you want a malicious payload delivered via a specific medium, because that’s the cornerstone upon which many exploits are based – execution of a specific operation against a specific manipulated payload. That means you really need the endpoint to believe the content is of the type it thinks it is.
And Lori goes on to say that the extensibility aspect of WebSockets (the subprotocol extension), which allows additional wire formats and protocols to be defined, presents further challenges that prevent firewalls from alleviating the problem:
[...] there’s no way to confidently know what is being passed over a WebSocket unless you “speak” the language used, which you may or may not have access to. The result of all this confusion is that security software designed to scan for specific signatures or anomalies within specific types of content can’t. They can’t extract the object flowing through a WebSocket because there’s no indication of where it begins or ends, or even what it is. The loss of HTTP headers that indicate not only type but length is problematic for any software – or hardware for that matter – that uses the information contained within to extract and process the data.
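The asymmetry Lori describes is visible in the protocol itself: the opening handshake is still ordinary HTTP, including the optional `Sec-WebSocket-Protocol` header where a subprotocol is negotiated, but once the server answers `101 Switching Protocols` everything afterwards is opaque frames in whatever format that subprotocol dictates. A minimal sketch of the server's side of the handshake, using the fixed GUID and the example values from RFC 6455 (the subprotocol name `chat` is the RFC's own illustration):

```python
import base64
import hashlib

# This GUID is a constant fixed by RFC 6455.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# The handshake is the last point at which an intermediary sees
# self-describing HTTP headers; after the 101 response, only the
# negotiated subprotocol defines what the frames contain.
client_key = "dGhlIHNhbXBsZSBub25jZQ=="   # sample key from RFC 6455
response = (
    "HTTP/1.1 101 Switching Protocols\r\n"
    "Upgrade: websocket\r\n"
    "Connection: Upgrade\r\n"
    f"Sec-WebSocket-Accept: {accept_key(client_key)}\r\n"
    "Sec-WebSocket-Protocol: chat\r\n"
    "\r\n"
)
```

A firewall or scanner that does not "speak" `chat` (or whatever subprotocol the endpoints agreed on) has nothing further to parse.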
There have been reports of security flaws in WebSocket implementations and in the protocol itself before, just as there have been with other Web protocols over the years. And of course security in distributed systems predates HTML by several decades, especially over binary protocols, so it is certainly possible to secure binary systems. But Lori's point seems to be that while we appear to be driving headlong towards a higher-performing Web, we should not ignore the fact that the vast majority of the Web's infrastructure is based on HTTP, and removing or replacing it should not be taken lightly, as things will invariably break, or worse. Since adopting WebSockets seems inevitable, is it time to step back and consider what a world based on it should look like, at least as far as security is concerned?