Future of HTML and the Web, part 1
By glazou on Tuesday 8 June 2004, 14:50 - Computing
My friend Tantek has told me I should spend a few cycles reading the minutes of the recent Workshop on Web Applications and Compound Documents. I read what Hixie wrote about it, I read his position paper, and I read a few private comments some friends shared with me. But I did not read all the position papers. So let me first summarize what I think should be the future of the Web here, without being influenced by these papers. Warning, if you are a regular visitor of this blog, you'll see below a few things you have already read.
First, I firmly, deeply, strongly believe the future of the Web must be based on its present, and that it's totally impossible to just say "let's forget about it and build a new better Web". I'll detail later.
The present of the Web is
- (on client side) built upon
- a wide subset of HTML 4.01; only an infinitesimal part of the Web as we see it today is made of XHTML; only a small part of the Web as we see it today is made of clean HTML; no browser, since a late first version of HotJava, shows a warning when the document is invalid.
- a subset of CSS 2; some CSS features are implemented so differently by the common browsers that writing a complex web page with CSS is a burden if you want similar renderings on IE, Gecko and Opera.
- JavaScript; despite what some famous voices say, JavaScript is one of the main reasons why the Web became successful. People who think the Web could have succeeded without it, or that the Web should get rid of JavaScript, should come back down to Earth, in my humble opinion.
- the DOM; even if some parts of the DOM specs are too complex, it's one of the bricks we could not work without.
- mostly Microsoft Internet Explorer, Gecko-based browsers, and Opera. And MSIE has here a weight that just cannot be forgotten.
But where do we come from?
- HTML 4 was released 18-Dec-1997, revised 24-Apr-1998 and HTML 4.01 was released 24-Dec-1999. So the ground for HTML 4.x is now five and a half years old.
- in 1998, the W3C organized a Workshop about the Future of HTML in a big hotel (Hyatt Regency IIRC) close to San Francisco airport. Big room, big crowd. A lot of people, a lot of ideas. Here's what I personally recorded:
- we needed a much wider set of HTML form elements; at that time, I was working for Electricité de France and I stood up to say we needed gauges, sliders, text fields with "a priori" data control, native style widgets and so on
- we needed cleaner markup, with the help of the XMLization of HTML
- we needed compound documents, to integrate vector graphics and maths, and to be able to mix documents based on different document models.
- we needed XML-on-the-web
- some wanted the modularization of the XMLized version of HTML
- we needed super-easy extensibility so people could write their own extensions to HTML
- CSS 2 was released 12-May-1998
- 60% MSIE, 40% Netscape
And where are we now?
- During the last 5.5 years, almost nothing has happened to the markup language of the Web. Yes, I know, one may object that XHTML 1 was released and allows switching to XML. Yeaaaaah, a major step. Most people think of XHTML as "add the <?xml line, change the doctype, add a trailing slash to empty elements, close all elements", period. Well, to be honest, some don't even close all elements, and we can see invalid XHTML documents on the web.
- the Future of HTML Workshop had mixed results
- HTML 4 was placed in maintenance mode, despite the fact that it soon represented 100% of the Web; a small number of errata were integrated into revised specs more than four years ago.
- Form elements are still what they used to be in 1997. Despite the need, we're still there, and W3C's work on XHTML and XForms does not help because the Web is still based on HTML4.
- compound documents are still a pain to browse because the browsers often need a plugin that is not available by default in the system. Some users just don't have the system privileges to install a plugin. Mixing arbitrary valid XML styled with CSS is still an unreachable dream for the vast majority of browser users.
- IE is still unable to display styled XML but IE still represents the vast majority of browsers
- who, outside of standards' folks, has ever used XHTML modularization?
- extensibility? where? really ???
- CSS 3 is on its way, parts of it being already implemented by browsers, and CSS 2.1 was recently released
- 80% MSIE, 11% Gecko, others are Opera, Safari, Konqueror, ...
I can draw a few personal first-level conclusions:
- we sucked; the Web (the thing that your neighbor or mine browse to read news, buy stuff, contact the Administration...) as it is today is almost exactly equal, on the client's side, to what it was back in 1998 when CSS 2 was released. We did not suck a little bit, we sucked a lot. We sucked because our world of techies is too far away from the User, from the Web author, because we think our implementors' expectations are the market's expectations. In other terms, we sucked because we are unable to listen, and that's a very serious observation.
- placing HTML 4 in maintenance mode was a major mistake. It has largely blocked innovation on the Web for more than five long years and that's an awfully expensive price to pay. Yes, I know, the HTML Working Group has worked on XHTML 2 in the meantime. But we'll probably never see a widely deployed XHTML 2 browser (I mean > 20% of the market share) because I can hardly see Opera, Mozilla and Microsoft jumping on XHTML 2.
- Fortunately, the server side improved a lot, filling more or less the holes of the client side.
- the world's not going to change tomorrow. The landscape is easy to observe:
- MSIE6 is here and will remain for at least three more years
- XHTML 2 is certainly not going to replace HTML 4 because it's not backwards compatible, because existing browsers won't be able to handle it
- authors are still waiting for extensions to HTML 4
- easy interoperable extensibility mechanisms are just not here. Mozilla has the powerful XBL, IE has the powerful but totally different HTC, period.
- JavaScript is here to stay too.
Second-level conclusions:
- we need a successor to HTML 4
- it should be backwards compatible with HTML 4, or compatible at so low a price that any webmaster can afford it. Dropping backwards compatibility with HTML 4 is so tragic an error I just can't think of an equivalent in our Web world.
- it should drastically improve the set of form elements, and offer native widgets or native widget styles. It is amazing, and quite disappointing I must say, to see that Microsoft has been using HTML for MSIE's dialogs for so many years, that Mozilla has been using XUL for its UI for so many years, and that we have no standard for Application UI.
- similarly, that successor should offer a wide variety of new features, because that's what web authors are waiting for.
- it should offer an almost trivial extensibility mechanism
- it should trivially allow the integration of external marked-up data like SVG or MathML or whatever.
- access to the Document Object Model through JavaScript should be simplified when needed, and enhanced.
- that backwards compatibility or near-compatibility will not leave current browser users by the side of the road but will be, for them, a good incentive to upgrade to newer versions
- we, standards people, have focused on a large number of things that are not immediately useful to the Web, and some of them will probably never be; it's urgent to focus again on what's really important. If you are not an implementor, the following is for you: the makers of the "product" Web need to focus again on the "customers", i.e. users and authors.
- the HTML WG has constantly refused over the years to touch HTML 4 other than for errata purposes. In particular, even when Netscape warned about the lack of an equivalent on LINK/STYLE elements to the disabled attribute on DOM stylesheets - a blocker for our editor at that time and an inconsistency between two RECs - the Group refused to add a new attribute to HTML 4, even though it was totally harmless to existing web pages. So the HTML Working Group, and possibly the W3C with it, is perhaps not the best place to make a successor to HTML 4 happen. Don't misunderstand me, I am deeply sad to write that, but it's just factual.
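To make the LINK/STYLE inconsistency mentioned in the last item a bit more concrete, here is a minimal sketch of the DOM side of it. The DOM stylesheet interface exposes a disabled flag that a script can flip, but HTML 4 has no matching attribute on LINK or STYLE, so an editor has no markup in which to save that state. The SheetLike interface, the selectSheet function and the "print-preview" title are hypothetical names chosen for this illustration only; they are not taken from Netscape's editor or from any spec.

```typescript
// Structural stand-in for the DOM's stylesheet objects (assumption: only the
// two members used here), so the sketch also runs outside a browser.
interface SheetLike {
  disabled: boolean;
  title: string | null;
}

// Hypothetical helper: enable the titled sheet named `keep` and disable every
// other titled sheet. Untitled sheets are left untouched.
// Returns how many sheets had their disabled flag changed.
function selectSheet(sheets: ArrayLike<SheetLike>, keep: string): number {
  let changed = 0;
  for (let i = 0; i < sheets.length; i++) {
    const sheet = sheets[i];
    if (sheet.title === null || sheet.title === "") continue;
    const shouldDisable = sheet.title !== keep;
    if (sheet.disabled !== shouldDisable) {
      sheet.disabled = shouldDisable;
      changed++;
    }
  }
  return changed;
}

// In a browser this could be called as:
//   selectSheet(document.styleSheets, "print-preview");
// The resulting state lives only in the DOM: HTML 4 offers no attribute on
// LINK or STYLE in which it could be written back to the document.
```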
Remarks:
- are extension mechanisms like XBL or HTC the right solution? I am not sure. Not because of XBL itself but because of the binding mechanism using a CSS rule. I do appreciate the fact it can be based on a selector and takes advantage of the CSS cascade. But disabling CSS disables the behavior. One may say that a mechanism like XBL or HTC should not be used for semantics. In theory. But if we start speaking of the future of the Web in the terms I used above, what really matters here is not only theory but common practice. So there will be people using behaviors for semantics. And I have personally no problem with that.
- I still don't fully understand what the WHAT is (but I admit I have little spare time these days and I did not fully read everything there) and, to be honest, the first 22 messages sent to its public mailing list worry me more than they comfort me.
Part two will contain my comments on the minutes of Web Apps and Compound Docs Workshop. Update: Comments now allowed.
Comments
> Not because of XBL itself but because of the binding mechanism using a CSS rule.
This is a well known problem, rest assured.