Microformats 2.0
By glazou on Friday 14 March 2008, 11:58 - Standards
Summary : current microformats are a "quick and dirty" way of doing things. We can do better.
Microformats, hAtom, Webslices... A set of nice extensions to
the HTML4/XHTML1 language that all have the same problem : they tamper
with the value set of the class attribute. Before Microformats, the
value set of the class attribute was unaltered. A web author could pick
up ANY class for his document. Including hfeed, entry-title or
entry-content... It's not possible any more. I said it in the past and
say it again : despite my huge interest in microformats, that's bad
design. As the HTML5 spec says "Authors may use any value in the class
attribute".
That's bad design because a web site author - let's take the example of the hAtom-enabled AOL Sports - has to tweak his content template to enable microformats even though the template often already contains everything needed for hAtom, but with different class names. Tweaking the content template is always dangerous and can lead to errors that make the web site's data unavailable to visitors/customers.
A much much better solution is to tweak only the metadata in the document, namely the contents of the HEAD element. So what do we have here to enable some sort of Microformats 2.0 ?
- the META element is all we need to declare a CSS selector
targeting, say, hfeeds or hentrys. For instance:
<meta scheme="hAtom" name="hentry" content="body .hentry"/>
I think this is clear enough... We're declaring that for the hAtom microformat, hentry elements are elements matching the CSS3 selector in the content attribute, namely elements included in the BODY of the document and carrying class hentry.
One could just as well imagine a much simpler declaration where all entries are DL elements, the first DT element represents the entry title and all following DD elements represent the entry contents... No need to use a class here.
If you really want to be clear and valid, you need to add a profile attribute to the HEAD element of your document. Oh well.
- the content attribute above can also be scoped:
<meta scheme="hAtom" name="hentry" content="body .hentry"/>
<meta scheme="hAtom" name="entry-title" content=".entry-title"/>
In that case, the hAtom format could say the entry-title definition is valid only in the scope of elements matching the hentry definition. In other words, entry-title elements are here elements matching the selector
body .hentry .entry-title
- it's trivial for a web page to retrieve the elements specified by such a selector using the W3C Selectors API spec. I urge all browser vendors to implement that spec as soon as possible. It's also trivial to style them dynamically: just reuse the selector in the content attribute to form a style rule that you add to the document through the DOM.
- since it's CSS-based, it's also trivial to add XBL or XBL 2.0 bindings to those elements, from chrome or page JS.
- microformats do NOT have to care about each other any more... Each new microformat using the class attribute only introduces new "predefined" and meaningful values for the class attribute. hAtom uses hentry. No other microformat can use it and that's bad design.
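The points above can be sketched in JavaScript roughly as follows. This is only an illustration under assumptions: the scheme/name/content convention on META is the proposal made in this post, not an existing standard; the helper names (readDeclarations, scopeSelector, styleRuleText, applyStyle) are hypothetical; and it presumes a browser that implements querySelectorAll from the Selectors API.

```javascript
// Hypothetical helpers: none of these names come from a spec or a library.

// Collect the declarations of one microformat from the document's META
// elements, e.g. { "hentry": "body .hentry", "entry-title": ".entry-title" }.
function readDeclarations(doc, scheme) {
  const declarations = {};
  const metas = doc.querySelectorAll("meta[scheme][name][content]");
  for (const meta of Array.from(metas)) {
    if (meta.getAttribute("scheme") === scheme) {
      declarations[meta.getAttribute("name")] = meta.getAttribute("content");
    }
  }
  return declarations;
}

// Scope a child definition inside its parent with a descendant combinator.
// Assumption: neither selector is a comma-separated group of selectors.
function scopeSelector(parentSelector, childSelector) {
  return parentSelector + " " + childSelector;
}

// Reuse a selector to form the text of a style rule.
function styleRuleText(selector, cssDeclarations) {
  return selector + " { " + cssDeclarations + " }";
}

// Add the rule to the document through the DOM.
function applyStyle(doc, selector, cssDeclarations) {
  const style = doc.createElement("style");
  style.textContent = styleRuleText(selector, cssDeclarations);
  doc.head.appendChild(style);
  return style;
}
```

In a page, one would call readDeclarations(document, "hAtom"), pass the scoped selector to document.querySelectorAll to retrieve the entry titles, and hand the same selector to applyStyle to highlight them. Plain string concatenation is the simplest possible scoping; a real consumer would have to handle selector groups as well.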
Let's take a page using Webslices as an example:
<html>
<head>
<title>my own in-page webslice example</title>
</head>
<body>
<h1>some dummy title we don't want to see in the webslices</h1>
<div id="item1345" class="hslice">
<div class="entry-title">Canon EOS 40D</div>
<div class="endtime" title="2008-03-15T11:34:00">Expires tomorrow same time</div>
<dl class="entry-content">
<dt>pixmania.fr</dt> <dd>989,00 EUR</dd>
<dt>ldlc.fr</dt> <dd>1147,20 EUR</dd>
</dl>
</div>
<div id="item7398" class="hslice">
<div class="entry-title">Canon EOS 5D</div>
<div class="endtime" title="2008-03-15T11:34:00">Expires tomorrow same time</div>
<dl class="entry-content">
<dt>pixmania.fr</dt> <dd>2099,00 EUR</dd>
<dt>ldlc.fr</dt> <dd>2399,00 EUR</dd>
</dl>
</div>
</body>
</html>
Please note how the title attribute is used on endtime entries. I hate it. Browsers will show that ISO date as a tooltip when the pointer hovers over the element.
Anyway, the code above could become something like
<html>
<head>
<title>my own in-page webslice example</title>
<meta scheme="MS-webslice" name="hslice" content="div[id^='item']"/>
<meta scheme="MS-webslice" name="entry-title" content="div:first-of-type"/>
<meta scheme="MS-webslice" name="endtime" content=".expiration"/>
<meta scheme="MS-webslice" name="entry-content" content="dl"/>
</head>
<body>
<h1>some dummy title we don't want to see in the webslices</h1>
<div id="item1345">
<div>Canon EOS 40D</div>
<div><span class="expiration" title="2008-03-15T11:34:00"/>Expires tomorrow same time</div>
<dl>
<dt>pixmania.fr</dt> <dd>989,00 EUR</dd>
<dt>ldlc.fr</dt> <dd>1147,20 EUR</dd>
</dl>
</div>
<div id="item7398">
<div>Canon EOS 5D</div>
<div class="expiration" title="2008-03-15T11:34:00">Expires tomorrow same time</div>
<dl>
<dt>pixmania.fr</dt> <dd>2099,00 EUR</dd>
<dt>ldlc.fr</dt> <dd>2399,00 EUR</dd>
</dl>
</div>
</body>
</html>
It's totally different and it's still the same thing. It's much cleaner and more powerful. It's safer for the web author, who probably has nothing to change in his content template and keeps total control over his class attribute's values and the structure of the document. Multiple entry-content elements don't all need to be tagged with a class; they can be identified by context. It's also easier to mix multiple microformats in the same document.
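To make the consumer side concrete, here is a hedged JavaScript sketch of how a tool might read the slices out of the second document. It assumes that the META declarations have already been collected into a plain object, that every property is looked up inside its hslice element, that the id values really start with item as the hslice selector requires, and that the browser supports querySelectorAll and querySelector on both documents and elements. The name extractSlices and the shape of its result are hypothetical.

```javascript
// Declarations copied from the META elements of the example above.
const websliceDeclarations = {
  "hslice": "div[id^='item']",
  "entry-title": "div:first-of-type",
  "endtime": ".expiration",
  "entry-content": "dl",
};

// Hypothetical consumer: returns one plain object per slice found in doc.
function extractSlices(doc, declarations) {
  const slices = [];
  const sliceElements = doc.querySelectorAll(declarations["hslice"]);
  for (const slice of Array.from(sliceElements)) {
    // Element-rooted queries only return descendants of the slice,
    // which gives the scoping described earlier without any class names.
    const title = slice.querySelector(declarations["entry-title"]);
    const endtime = slice.querySelector(declarations["endtime"]);
    const content = slice.querySelector(declarations["entry-content"]);
    slices.push({
      id: slice.getAttribute("id"),
      title: title ? title.textContent.trim() : null,
      // The machine-readable date lives in the title attribute.
      endtime: endtime ? endtime.getAttribute("title") : null,
      // Keep the element itself; how to serialize it is left open here.
      content: content,
    });
  }
  return slices;
}
```

Calling extractSlices(document, websliceDeclarations) in the page would return one object per matching DIV. Nothing in the function refers to a class name: swapping the selectors in the declarations object is enough to adapt it to a different template.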
I am calling for a Microformats 2.0 effort based on the existence of the W3C Selectors API, and the META element.
Comments
If you want flexibility and clean separation from the content, there is GRDDL: http://www.w3.org/TR/grddl/#grddl-x... .
Couldn't get past "Authors may use any value in the class attribute"
With microformats, authors still can. class is space-separated, you can have as many values as you want.
@Stephen Paul Weber: Yes, you may still use other classes, but you can't use 'hentry' if you don't want the associated microformat's semantics. And ANY class value you use may be redefined to mean something in the future and break your site.
Sounds like a good idea to me. The need to opt-in for microformats will prevent ill-formed data coming from websites that could have used a class in a new microformat.
However, this may make it impossible for some websites to offer microformats if the HEAD can't be modified. Currently, wikis can show examples, but if they don't allow adding META elements to the HEAD, microformats will not be available. (Perhaps plug-ins could be developed that would allow this for wikis, but other websites may not be so lucky.)
Yes, you can, parsers have to be smart enough to see invalid data and not parse it.