Tuesday 5 March 2013

Five years..

Peter Linss and I were appointed co-chairs of the CSS Working Group exactly five years ago :-)

Tuesday 26 February 2013


Following the W3C Workshop on electronic books in NYC two weeks ago, Dave Cramer (Hachette), Hadrien Gardeur (Feedbooks) and I (Disruptive Innovations) have started a new Google Group called EPUB NG. Don't misunderstand us, it's called EPUB New Generation only because we needed a name and we start from what's available on the market right now, EPUB3. We're not forking, we're not doing a secret thing, we only needed a space where we could start discussions about the largest issues I found in current specs and what Dave recently called EPUB Zero.

So if you're interested in throwing around ideas about a new, simpler, lighter format for electronic books, more in line with W3C standards and Web habits, start reading us and ping one of us to request an invite. Please detail your affiliation and background in the electronic books space. Thanks!

Wednesday 20 February 2013


Twenty years ago, while working at Grif, I was ironing out the very first implementation of CALS tables (that eventually gave us HTML tables) in a Wysiwyg editor. Time flies, and I'm still working on content editors :-)

Thursday 14 February 2013


I just read Daring Fireball's short so-called « analysis » of the Opera switch to WebKit. Even though I know perfectly well that guy is almost only an Apple PR guy, I'm again surprised by his limited ability to analyse a situation. The only question worth asking is the following one: whatever the strategic rationale that led to that choice, it's obvious Opera had the choice between open-sourcing Presto to build a larger community around it and ditching it in favor of an already open-sourced rendering engine. So why did they choose the latter?

And as for WebKit being better than Presto, well, Opera has always been a better player with respect to standards than Apple. As many people have already said, a test failing in Presto was often the sign the test was wrong or the spec had a problem, given their extreme adherence to specifications.

So as usual, you can avoid reading Daring Fireball. No hyperlink from here. Nothing to see there.

Wednesday 13 February 2013

Strange day for the Open Web

It's a really strange day... The announcement that Opera is dropping the Presto engine came at European hours, of course. Fortunately, the city of New York woke me up at 4am with road construction and lots of noise from construction engines. I found my iPad silently piling up tons of notifications from friends about Opera. I should not have been surprised by the news, since the rumors started to percolate two weeks ago...

Opera-the-company is still here while Opera-the-rendering-engine is no more. It clearly reminds me of the last moments of Netscape :-( I can't help but think this is not a new beginning but the end of an era, and most certainly a bad omen.

The Web wakes up less fragmented today but this is a sad moment because fragmentation and competition are good for innovation. Just one year ago, Opera was one of the advocates for one of the strangest decisions ever requested in the CSS Working Group, the authorization for a rendering engine to implement the CSS prefix of another rendering engine. It never happened but what happened today is of another magnitude, unfortunately.

Oh, it's not the market share of Opera that makes the difference. Their self-proclaimed 300 million users are a drop in the ocean and are mostly related to low-end phones, still a huge market in some parts of the world. No, it's the loss of an independent innovation center. Opera engineers will discover the power of a r- you can't control... They aim at an iOS browser. Wait, based, like the others, on the slow HTML control that all but Safari use? Seriously????

I can't see Opera still having a huge differentiating factor now, unless they drastically reinvent themselves and almost change markets. If Opera were a smaller company, I would say they're looking to leverage their browser implementation skills to be acquired by one of the roughly ten big players currently desperately looking for WebKit expertise. In other words, an investor's perspective, not an industrial one. Oh, wait, did I say it? Oh crap...

For the CSS Working Group, that's an earthquake. One less testing environment, one less opportunity to discover bugs and issues. Let me summarize the new situation of the main contributors to the CSS Working Group:

  • Microsoft: Trident
  • Apple: WebKit
  • Google: WebKit
  • Opera: WebKit
  • Adobe: WebKit and their own Adobe Digital Editions rendering engine found in many ebook readers
  • Mozilla: Gecko
  • Disruptive Innovations: Gecko
  • HP: has delivered WebKit-based products in the past but is pretty browser-agnostic IMO
  • Rakuten: ADE and probably WebKit
  • Kozea: WeasyPrint
  • Qihoo 360 Technology Co: both Trident and WebKit
  • other Members of the Group: I don't know

One CSS prefix is gone and -webkit-* increases its power. Last night, I was telling Håkon Lie (Opera CTO) I could imagine him in the amazing NYC Marriott Marquis elevators looking down at Lars-Erik Bolstad (Opera VP Core Technology) on the 8th floor (at the bar with us, obviously) and saying « I am your father », Lars-Erik answering « Noooooo... ». Today, I can feel the power of the dark side of the Force.

Opera, do us two favors please:

  1. first, don't trash Presto, open its source !
  2. second, tell us the fate of Opera-the-desktop-browser, not mentioned at all in the press release

Thanks and good luck.

Luke, Luke...

Friday 7 December 2012

À la xkcd...

The daily xkcd is once again excellent. I forked it into:

<p style="background-color: -webkit-gradient(...)">
How do you annoy a Web Standards' author?

Thursday 6 December 2012

W3C Workshop on Electronic Books and the Open Web Platform

Reminder: a W3C/IDPF/BISG Workshop on Electronic Books and the Open Web Platform will take place in NYC, USA, on the 11th and 12th of February. The deadline for the submission of position papers is the 10th of December, so hurry up if you plan to attend the Workshop!

Saturday 24 November 2012


Method-Draw is a superb fork of the popular web-based SVG editor SVG Edit. You can try it here and the project is hosted on GitHub here.

As a reminder, BlueGriffon embeds SVG Edit. As soon as I have time for that, I'll look more closely at Method-Draw to see if it could replace SVG Edit in BlueGriffon since I find its UI and the UX it induces quite nice.

Tuesday 30 October 2012

EPUB3 fun #12

The EPUB 3 Content Documents spec reads:

The EPUB 3 CSS Profile includes @media and @import rules with media queries as defined in the Media Queries [MediaQueries] specification.

But it says nothing about stylesheets linked through a <link> element and Media Queries ! So, normatively per EPUB 3.0, <link> elements with Media Queries are currently forbidden...

Friday 14 September 2012

EPUB3 fun #11, not all metadata are metadata

Excerpt from EPUB 3 Publications section 4.3.2 (strong emphasis mine):

Cardinality: In the metadata section: zero or more; Attached to other metadata: zero or one

Other metadata? Outside of <metadata>?!? I don't understand. At all.

It can't be related to the META-INF/metadata.xml file since section 2.5.4 of EPUB3 Container Format reads (strong emphasis mine):

This file, if present, must be used for container-level metadata. This version of the OCF specification does not specify any container-level metadata.

And the other sections of the OPF file do not contain metadata but a manifest, a spine and other stuff but no metadata...

I think - but I can't be sure - it in fact means:

Cardinality: as a primary expression (i.e. applies to the Publication): zero or more; as a subexpression (i.e. refining a primary expression or another subexpression): zero or one for the refined expression or subexpression

Anyway, the original prose is very badly worded, to say the least. Even after carefully reading the spec, I have no idea what "other metadata" are, so there is at least one important definition (sic) here that remains undefined...

EPUB3 fun #10, all your chains of ID/IDREF are belong to us

Remember when I said the following?

any mechanism based on ID/IDREF introduces an extra keyword, the ID of the element ("mainauthor" here)... Asking a non-techie user to provide an ID is clearly suboptimal in terms of UX. That means the editing environment should pick one ID for the author, at the risk of using a human language not understood by the package's author or even a meaningless random ID.

Seems the situation is even worse than I expected... The EPUB3 Publications specification explicitly allows chains of subexpressions through ID/IDREF mechanisms and refines attributes. I guess it'll be clearer with this example:

<dc:identifier id="bookID">
<!-- meta element below refines the dc:identifier element above -->
<meta refines="#bookID"
id="meta-authority1">My metadata Authority</meta>
<!-- link element below refines the meta element above... -->
<link refines="#meta-authority1"

One ID/IDREF is already hard enough to deal with in UI but a chain?!? I can find only one word to describe such a mechanism in a Standard oriented towards creation of visual products, i.e. oriented towards Wysiwyg-ness and nicely composed UIs: ridiculous. EPUB3 metadata are clearly a deep weakness in the EPUB3 format because of the incredible complexity they introduce in editing environments. Honestly, I am not sure I will implement this; it would drastically uglify a UI that is already complex enough.
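To give an idea of what an editing environment has to do with such a chain, here is a minimal Python sketch, not my validator's code. It assumes a completed version of the fragment above: the closing tags, the identifier's text content and the empty link element are all hypothetical, added only so the snippet parses. The helper name refinement_chain is also mine.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical, completed version of the fragment above: closing tags and the
# identifier's text content are assumptions made so that the XML parses.
OPF_METADATA = """
<metadata xmlns="http://www.idpf.org/2007/opf"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier id="bookID">urn:example:book</dc:identifier>
  <meta refines="#bookID" id="meta-authority1">My metadata Authority</meta>
  <link refines="#meta-authority1"/>
</metadata>
"""

def refinement_chain(root, element):
    """Follow refines="#id" references from element up to the primary expression."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    chain = [element]
    seen = {id(element)}
    while True:
        target = chain[-1].get("refines")
        if not target or not target.startswith("#"):
            return chain  # no further refines: this is the primary expression
        refined = by_id.get(target[1:])
        if refined is None or id(refined) in seen:
            return chain  # dangling reference or cycle: stop here
        seen.add(id(refined))
        chain.append(refined)

root = ET.fromstring(OPF_METADATA)
link = root.find("{%s}link" % OPF_NS)
chain = refinement_chain(root, link)
# Local names of the elements in the chain, from the link up to the identifier.
chain_tags = [el.tag.split("}")[1] for el in chain]
print(chain_tags)
```

Walking the chain is the easy part; presenting every level of it to a non-techie user in a UI is where it hurts.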

Tuesday 4 September 2012

BlueGriffon EPUB3 Validator

I am glad to report my EPUB3 Validator add-on to Firefox is now available. A few details about it:

  • JavaScript only
  • not based on epubcheck's code at all, no Java code, does not rely on Java at all
  • my own xml content model validator
  • my own CSS parser and CSS Profile validator
  • Firefox >=12
  • Mac OS X, Windows, Linux
  • Free upgrades for life after purchase.

More information there.

Wednesday 29 August 2012


I have taken all packages available from the epub-samples EPUB3 repository and pushed them through my own EPUB3 validator. Some of these "clean" packages are not that clean, so here are the validation logs. Please note my validator also checks all stylesheets in the package according to the EPUB3 CSS Profile (AFAIK, no other validator is able to do that).

Monday 27 August 2012

EPUB3 fun #8, update

I said a few days ago that validation of EPUB3 Content Documents is impossible. That's actually incorrect and I apologize for that. The EPUB3 Content Documents spec normatively cites a RelaxNG schema for XHTML5 markup and datatypes. Still, I have no idea from reading the spec what flavor of xhtml5 it represents... I'm left with a RNG to understand if a given html5 element is implemented or not, and how, in EPUB3. Hum.

Saturday 25 August 2012

EPUB3 fun #9

Validation of meta elements in EPUB3 triggers a few surprises... In fact, that's not the meta elements themselves, it's the properties they carry. Let me explain:

  • EPUB3 meta elements carry a property attribute
  • a property is a keyword optionally preceded by a property vocabulary prefix and a colon
  • EPUB3 Media Overlays use the reserved media prefix mapped to namespace http://www.idpf.org/epub/vocab/overlays/#
  • the Media Overlays spec says the media:duration property has a cardinality of "exactly one for the Publication and for each Media Overlay"

In fact, that's quite painful and costly to test in the following case:

  1. suppose the namespace above is declared (yes, I know it's already reserved but it's not forbidden to do it !!!) on the package's prefix attribute with prefix foo
  2. suppose we have one <meta property="media:duration"> in the OPF for the Publication
  3. but also a <meta property="foo:duration"> for the same value...
  4. that's invalid per spec, see the last item of the list above

You can't compare the prefixes, that's not enough. You really need to rely on URIs and that's where validation is expensive.
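Here is a rough Python sketch of the case above, under a few assumptions: the prefix attribute is a whitespace-separated list of "prefix: URI" pairs, only the reserved media prefix is preloaded, and the function names are mine, not taken from any existing validator.

```python
MEDIA_OVERLAYS_NS = "http://www.idpf.org/epub/vocab/overlays/#"

# Only the reserved prefix needed for this example; a real validator
# would preload all the reserved prefixes.
RESERVED_PREFIXES = {"media": MEDIA_OVERLAYS_NS}

def parse_prefix_attribute(value):
    """Parse 'foo: http://a/ bar: http://b/' into {'foo': 'http://a/', ...}."""
    tokens = value.split()
    return {name.rstrip(":"): uri for name, uri in zip(tokens[0::2], tokens[1::2])}

def expand(prop, prefixes):
    """Expand 'prefix:name' into a full URI, or None if it cannot be resolved."""
    prefix, sep, name = prop.partition(":")
    if not sep:
        return None  # prefixless properties (default vocabulary) are not handled here
    uri = prefixes.get(prefix)
    return uri + name if uri else None

# The package declares the reserved namespace again, with prefix foo.
prefixes = dict(RESERVED_PREFIXES)
prefixes.update(parse_prefix_attribute("foo: http://www.idpf.org/epub/vocab/overlays/#"))

# Property values of the two <meta> elements applying to the Publication.
publication_properties = ["media:duration", "foo:duration"]

# Comparing the prefixed strings finds a single duration...
count_by_prefix = publication_properties.count("media:duration")

# ...while comparing expanded URIs finds two, which breaks "exactly one".
expanded = [expand(p, prefixes) for p in publication_properties]
count_by_uri = expanded.count(MEDIA_OVERLAYS_NS + "duration")
print(count_by_prefix, count_by_uri)
```

The expansion itself is cheap; the cost is that every cardinality check on every property has to go through it.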

Conclusion: Media Overlays 3.0 have been released with EPUB3. The associated media prefix should be dropped and its related properties should be prefixless in EPUB3. If the issue is the media:narrator property too close to a value in another property vocabulary, it should be changed.

Thursday 23 August 2012


"So Twitter just joined W3C. Should we cut our specs into modules < 140 chars?"

Tuesday 21 August 2012

EPUB3 fun #8

EPUB3 is based on the xml serialization of HTML5. When I say HTML5, I really mean the W3C version. But there is a problem ; that also happens with many other specs normatively referenced by EPUB3 but let's focus on HTML5 right now: the link to the HTML5 normative reference has for href http://www.w3.org/TR/html5/ . In other terms, the last version (WD, LCWD, CR, PR, REC) published by W3C.

What's the version to consider for the implementation of an EPUB3 reader or editor? I have no idea. What has changed between the time the HTML5 WD was normatively referenced and now? I have no idea. What if for instance an "at-risk" feature is dropped in a future WD of the spec? I have no idea.

This is not only a theoretical issue, it has a deep and immediate practical impact: validation of the Content Documents inside an EPUB3 ebook is impossible. More globally, full validation of an EPUB3 package is then impossible.

EPUB3 fun #7

Hum, yet another bug in an EPUB3 spec... Excerpt (verbatim, nothing added or removed) from section of EPUB Content Documents 3.0:

The prefix attribute

The prefix attribute definition is unchanged, but the attribute is defined to be in the namespace http://www.idpf.org/2007/ops when used in Content Documents.

Unchanged from what?!?

Friday 10 August 2012

EPUB3 fun #6

In the EPUB ebook world, there's one sentence I heard so many times from so many different sources that I did not even feel the need to verify it:

"Validation of EPUB is extremely important and we heavily rely on the EPUB Validator"

First thought: "excellent!". People need to distribute validated packages because they don't want to suffer big issues on some of the many ebook readers on the market. Excellent.

Unfortunately, after a closer look, the situation seems to me a bit different from that idealistic (I could even say utopian) view... Let me explain (please note I have spent time carefully reading the epubcheck source code for instance):

  • if you count the number of major Standards (de jure, de facto, proprietary) involved in the validation of a given *.epub EPUB3 file, you'll find a few dozen of them. And again, only the major ones, the ones a serious industrial validator must absolutely validate against.
  • some validations are complex and expensive, for instance a serious validation of encryption.xml or signatures.xml will involve much more than just a RNG-based validation of the XML instance... Validation of external property vocabularies can be extremely tricky or painful/expensive to implement.
  • the first step of validation is related to the ZIP package itself, and the epub30-ocf spec has a few very technical requirements there. I sincerely doubt all of them are validated, I sincerely doubt EPUB3 packages all around the world currently pass or will pass all the conformance requirements there.
  • in the W3C space alone, an EPUB3 validator must at least validate against a dozen specs.
  • the complexity of some of these specs is huge, drastically impacting users (ebook authors) either on a learning curve's basis or on a financial one. Or both.
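To illustrate the ZIP-level item of the list above, here is a minimal Python sketch of only the checks on the mimetype entry of the container (first entry, stored uncompressed, no extra field, exact content). It is not epubcheck's logic; the function names and the in-memory sample packages are assumptions made for the example.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"

def check_mimetype_entry(zf):
    """Return a list of problems found with the 'mimetype' entry of an open ZipFile."""
    infos = zf.infolist()
    if not infos or infos[0].filename != "mimetype":
        return ["mimetype is not the first entry"]
    first = infos[0]
    problems = []
    if first.compress_type != zipfile.ZIP_STORED:
        problems.append("mimetype is compressed")
    # ZipInfo.extra comes from the central directory record; a stricter check
    # would also parse the local file header. This is an approximation.
    if first.extra:
        problems.append("mimetype has an extra field")
    if zf.read(first) != EPUB_MIMETYPE:
        problems.append("mimetype has unexpected content")
    return problems

def build_sample(mimetype_first):
    """Build a tiny in-memory package (hypothetical content) for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        if mimetype_first:
            zf.writestr("mimetype", EPUB_MIMETYPE)
        zf.writestr("META-INF/container.xml", "<container/>")
        if not mimetype_first:
            zf.writestr("mimetype", EPUB_MIMETYPE)
    buf.seek(0)
    return zipfile.ZipFile(buf)

good_report = check_mimetype_entry(build_sample(mimetype_first=True))
bad_report = check_mimetype_entry(build_sample(mimetype_first=False))
print(good_report, bad_report)
```

And that covers one single file of the package; everything else in the list above comes on top of it.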

EPUB3 is probably too complex as a spec. In EPUB2, most people did not understand the difference between spine, ncx and guide. The most common question was "why do we have multiple tables of contents and which one is the right one?". In EPUB3, it's a bit better but only a bit. We have landmarks, multiple tables of contents and still a spine. Personally I still wonder why there is a manifest of files; probably only because of the MIME types. Hey, even OSes rely on file extensions to infer a MIME type!!!

I'd love to see an EPUB4 appear that is strictly xhtml5/svg/mathml-based. No other XML namespace allowed. No more OCF, OPF, NCX. Manifest of files coming from the ZIP list of entries. Direct inclusion of non-conflicting property vocabularies. No need to have an OCF, we can have a nice index.xhtml. Rely on a vocabulary of classes/IDs/roles (not the ARIA role...) expressing extra constraints/behaviours on existing html5 elements. That would be enough for most ebook authors and publishers and would drastically decrease the complexity of the publishing chain, IMHO.
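A manifest derived from the ZIP list of entries could be as small as the Python sketch below. It is only an illustration of the idea: the function name, the sample file names and the fallback media type are assumptions, and the .xhtml mapping is registered explicitly because I would not rely on every platform knowing it.

```python
import io
import mimetypes
import zipfile

# Register the mapping explicitly rather than depend on the platform's tables.
mimetypes.add_type("application/xhtml+xml", ".xhtml")

def infer_manifest(zf):
    """Map each file entry of the ZIP to a media type guessed from its extension."""
    manifest = {}
    for info in zf.infolist():
        if info.is_dir() or info.filename == "mimetype":
            continue
        media_type, _encoding = mimetypes.guess_type(info.filename)
        manifest[info.filename] = media_type or "application/octet-stream"
    return manifest

# Tiny in-memory package with hypothetical file names.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("mimetype", "application/epub+zip")
    zf.writestr("index.xhtml", "<html xmlns='http://www.w3.org/1999/xhtml'/>")
    zf.writestr("css/style.css", "body { margin: 0 }")
buf.seek(0)

manifest = infer_manifest(zipfile.ZipFile(buf))
print(manifest)
```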

Wednesday 8 August 2012

EPUB3 fun #5

Dates and time... Dates and time are painful, dates and time are complex, "dates and time are the most complex objects we use on a daily basis" used to say Reuters' Misha Wolf in the HTML WG in the good ol'days.

Excerpt from section 2.2.7 of OPF 2.01:

The date element has one optional OPF event attribute. The set of values for event are not defined by this specification; possible values may include: creation, publication, and modification.

Excerpt from section 3.4.6 of EPUB Publications 3.0:

The date element must only be used to define the publication date of the EPUB Publication.
Only one date element is allowed.

Ahem... So EPUB3 allows only two dates for a package: the publication date and the last-modification date (through a <meta property="dcterms:modified"> element). It's impossible to preserve the date of creation or various modification dates. Is it only me, or is this change from EPUB2 to EPUB3 weird?

Please note that to insert a publication date into an OPF document ready for distribution, you must modify it and rezip the whole package. So if an OPF document does have a <dc:date> publication element, its content value must be equal to the last-modified date!!! In other terms, this is totally useless. Wow.
