What is the Web?

by Mark Little on Dec 07, 2014. Estimated reading time: 5 minutes

Recently Mark Nottingham, chair of the IETF HTTP Working Group, tried to answer the question of What is the Web? As he says, it should be relatively straightforward to do, since we all use it, and yet it turns out to be fairly contentious. Mark begins with:

For the longest time, the most accepted definition of the Web has been anything that has a URL. This is the “web as information system” view. It’s long been observed that of the three pillars of the Web — identifiers, formats and protocols — URLs are the most universal and stable

Unfortunately, as Mark then shows, this view has its limitations. For instance, by this definition WS-* was on the Web, yet many people would dispute this. Likewise, most RESTful services would fall under this definition, as they all have URLs and benefit from HTTP. However, neither of these approaches can typically be used within a browser in a meaningful way:

[...] at best you’d get a blob of XML or JSON back, without using links well. While there are exceptions (such as the folks industriously trying to do hypermedia APIs), they’re the exception.
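The difference between a "blob of JSON" and a response that uses links can be sketched in a few lines. This is only an illustration: the `_links` key follows the HAL convention, which the article does not mention, and the function name and `example.org` URLs are hypothetical.

```python
import json


def extract_links(document):
    """Return {relation: href} for a HAL-style document, or {} if it has no links."""
    links = document.get("_links", {})
    return {
        rel: target["href"]
        for rel, target in links.items()
        if isinstance(target, dict) and "href" in target
    }


# A plain "blob of JSON": data only, nothing for a client to follow.
plain = json.loads('{"id": 42, "title": "What is the Web?"}')

# The same record with hypermedia controls (hypothetical URLs).
hypermedia = json.loads("""
{
  "id": 42,
  "title": "What is the Web?",
  "_links": {
    "self": {"href": "https://example.org/articles/42"},
    "author": {"href": "https://example.org/people/7"}
  }
}
""")

print(extract_links(plain))
print(extract_links(hypermedia))
```

A client given the first document has to know out of band where to go next; a client given the second can discover the related resources from the response itself.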

This then leads to a more restricted definition of what it means to be on the Web: anything that you can do with a browser. Obviously this includes anything that has a URL definition, but it excludes things that use URLs, HTTP and HTML if the browser is not involved.

The “what you can do in a Web browser” view has evolved into what’s now called the Web Platform  [...] and it’s a central plank of what’s going on in the W3C. There, the focus is very much on turning the Web into a platform that can compete with those that threaten the Web — namely, iOS and Android

However, as Mark points out this definition does not sit well with everyone:

Whilst Mark agrees with Roy to a degree, and accepts that there will always be useful non-browser users, he finds the Web Platform view interesting, especially when talking about interoperability and standards.

That’s because if you assume that browsers define the Web as a “platform” lots of questions that we thought were settled get thrown up in the air again.

For instance, the definition of extension points in standards typically requires some form of coordination, such as the IETF's use of registries. However, if browsers are the central component of the Web Platform then ...

[...] you don’t need registries or namespaces; you just edit the spec — provided that the spec is faithfully reflecting what the browsers implement. The argument goes that in a browser-ruled Web, other software using the specification doesn’t want to diverge from the behaviour of a Web browser, because doing so would cause interoperability problems and thereby reduce that software’s value. So, just make sure the browsers are walking in lockstep and document what they do in the specs; you don’t need no stinking registry.

As he points out, this has been used in the WebCrypto specification for documenting which algorithms are available. Because browser implementers want to be compatible and interoperable, it is in their interests to make sure they implement the specification faithfully and completely. Since this approach has worked well here, it is likely to work for other browser-oriented specifications and APIs. However, there are situations where it is not so clear-cut, for instance:

in HTTP-land, for example, we acknowledge that other user-agents like Web crawlers and testing tools want to stay close to browsers, while other uses of the protocol don’t involve a browser. Link relations are too broadly-based to make editing the HTML5 specification practical, so it uses a wiki, and we’re talking about doing the same for the IETF spec.
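The two extension styles discussed above, a closed list that lives in the spec versus an open registry, can be contrasted in a minimal sketch. The algorithm names below do appear in the Web Cryptography API, but the selection is illustrative rather than complete; the `Registry` class, the `example-rel` relation, and the reference strings are hypothetical and do not mirror any real registry's process.

```python
# Style 1: the spec itself is the list. Adding a name means editing the spec,
# which works when a small group of implementers moves in lockstep.
SPEC_ALGORITHMS = frozenset({"SHA-256", "AES-GCM", "RSA-OAEP", "ECDSA", "HMAC"})


def spec_supports(name):
    """True if the (illustrative) spec lists this algorithm name."""
    return name in SPEC_ALGORITHMS


# Style 2: an open registry. Anyone can add an entry without touching the
# spec, which suits broadly used extension points such as link relations.
class Registry:
    def __init__(self):
        self._entries = {}

    def register(self, name, reference):
        """Record a new name; refuse duplicates so names stay unambiguous."""
        if name in self._entries:
            raise ValueError(f"{name!r} is already registered")
        self._entries[name] = reference

    def lookup(self, name):
        """Return the reference for a name, or None if it is unregistered."""
        return self._entries.get(name)


link_relations = Registry()
link_relations.register("next", "hypothetical reference A")
link_relations.register("example-rel", "hypothetical reference B")

print(spec_supports("SHA-256"), spec_supports("MY-CIPHER"))
print(link_relations.lookup("example-rel"), link_relations.lookup("unknown"))
```

The closed set is simpler and cannot drift from what is implemented, but every addition goes through the spec editors; the registry trades that simplicity for the ability to grow without coordination among a single group of implementers.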

According to Mark, another area where a browser-focused approach to the Web produces an interesting solution is versioning. Many browser implementations have moved to much shorter release cycles than existed in the past, coupled with automatic updates. Therefore, releasing specific versions of HTML etc. does not help because what is important is what is implemented currently.

This leads to specs becoming “Living Standards,” [...] — i.e., constantly updated documents, based upon not only natural evolution of what they document (whether it be an API, format or concept), but also incorporating bug fixes, improved examples, and better alignment with the reality of what the Web actually is.

The problem, though, is that this approach is unlikely to work well for many non-browser stakeholders:

For example, people who want to define conformance criteria for a government will find this maddening. It also makes referring to documentation problematic

As usual, one size rarely fits all, and Mark believes that versioning "on the Web" will be handled on a case-by-case basis, with careful versioning for some things, such as HTTP, yet not for others, such as HTML. Interestingly, it appears as though the W3C is moving towards publishing snapshots of Living Standards for those users who need to refer to some specific "interim" version of a document. Of course, this does raise the problem of how to handle breaking (incompatible) changes:

The answer in the case of HTML is that they either a) won’t, or b) they’ll coordinate it and eat the interop problems (presumably because they already had an interop issue, and it was judged the lesser evil). For other things like API changes, releasing things under new names (even if it does end in a digit) allows things to be rolled out incrementally — keeping in mind that the spec for that new thing is still likely to be “living.”

What does this definition of the Web mean for the humble browser, and for the approach of following the browser as a definition of the Web? Well, as Mark discusses, it has been a long time since the browser was the dominant client "window" on to (or in to) the Web.

[...] we now have phones, TVs, cars and much more becoming part of the Web. This pressure to include more kinds of devices — along with the emerging non-traditional browsers out of places like China and India — are, I suspect, going to put the “follow the browsers” model under a certain amount of stress

As Mark originally stated in his article, the question of What is the Web? may seem simple, and we all believe we know the answer, but the reality is somewhat different. As the Web continues to evolve, with further changes to protocols such as HTTP and to the browser, and with more influence from mobile and the internet of things, what it means to be "on the Web", and the answer to "What is the Web?", may well change too.

I like Wikipedia's definition... by Jean-Jacques Dubray

"The World Wide Web is a system of interlinked hypertext documents that are accessed via the Internet [by User Agents]"

It happens that lots of people tend to reify all kinds of "interesting" concepts on top of otherwise perfectly well defined, albeit conceptually limited, semantics, but that would not change its definition.

As a matter of fact, as soon as the User Agent disappeared from the architecture (Native Mobile Apps), the HyperText disappeared, even when you call a bunch of data records a "document".

HyperText is the keyword, if not the keystone, of the Web. We seem to forget it (HyperText Transfer Protocol, HyperText Markup Language). Anything that is not "hypertext document oriented" is pretty much wishful thinking and should not be calling itself Web XXX (that includes Web Services, Web APIs, ...).

What is the World Wide Web? by Steven Eastman

The World Wide Web is an interconnected collection of servers, routers, and PCs. Servers publish copies of documents and send them to the PCs over the web of routers. The entities that receive these documents own that copy, not a right to copy them again; that is, they don't hold the copyright.
