
New HTML Parsing Rules in IE 10

One of the major changes in HTML 5 was standardizing how browsers handle non-standard or, more specifically, malformed HTML. Browsers are notoriously lenient when it comes to accepting HTML that contains flaws such as missing end tags. This leniency is widely credited for the continued success of HTML in the face of rival standards such as XHTML.

The HTML Living Standard reads,

This specification defines the parsing rules for HTML documents, whether they are syntactically correct or not. Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined: user agents must either act as described below when encountering such problems, or must abort processing at the first error that they encounter for which they do not wish to apply the rules described below.
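To make the idea of well-defined error handling concrete, here is a minimal sketch in Python. It is not the HTML5 parsing algorithm and has nothing to do with IE's implementation. It uses the standard library's `html.parser` as a tokenizer and applies one simplified recovery rule in the spirit of the specification: a new `<p>` implicitly closes a `<p>` that is still open. The class name `ToyTreeBuilder`, the helper `parse_events`, and the event labels are hypothetical names chosen for this illustration.

```python
from html.parser import HTMLParser

# A few void elements that never take an end tag (deliberately incomplete).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class ToyTreeBuilder(HTMLParser):
    """Toy recovery logic for malformed markup; an illustration only."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open tag names
        self.events = []         # flat log of what the builder decided

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            self.events.append(("void", tag))
            return
        # Simplified rule: a new <p> ends a <p> that is still open.
        # (The real rule is scoped more carefully than this check.)
        if tag == "p" and "p" in self.open_elements:
            self._close("p", implied=True)
        self.open_elements.append(tag)
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        # A stray end tag with no matching open element is ignored.
        if tag in self.open_elements:
            self._close(tag, implied=False)

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))

    def _close(self, tag, implied):
        # Pop down to and including the matching element; anything
        # popped along the way was missing its own end tag.
        while self.open_elements:
            current = self.open_elements.pop()
            kind = "implied-close" if (implied or current != tag) else "close"
            self.events.append((kind, current))
            if current == tag:
                break

    def close(self):
        super().close()  # flush any buffered text first
        # At end of input, every element still open is closed implicitly.
        while self.open_elements:
            self.events.append(("implied-close", self.open_elements.pop()))


def parse_events(markup):
    """Return the list of builder events for a markup string."""
    builder = ToyTreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.events
```

Feeding this sketch `<p>one<p>two`, which has no end tags at all, raises no error. The log shows both paragraphs being opened and then closed implicitly, the first when the second `<p>` arrives and the second at end of input. The point is only that every parser following the same recovery rules arrives at the same structure for the same broken input.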

According to Tony Ross of Microsoft, Internet Explorer starts abiding by these new parsing rules in the recently released IE 10 Platform Preview 2. While it is always best to use valid HTML, sites that cannot be fixed before IE 10 is released will need to run in a legacy mode.

Another change is the removal of the following features:

Again, this only applies when not running in a legacy mode.

