By glazou on Thursday 1 February 2018, 10:31 - Standards

LibreOffice 6.0 is now available. And it's through the inevitable Korben that I discovered this morning it has a built-in EPUB export. So let's take a closer look at that new beast and evaluate how it deals with that painful task. Conformant EPUB? And which version of EPUB? Reusable XHTML and CSS? We'll see.

After installation (on a Mac), I created a new trivial text document; it contains a paragraph, a level 1 header, an image, a table, and an unordered list of three items. I did not touch fonts, styles, margins, etc. at all.

Then I discovered LibreOffice now has two new menu items, both under File > Export As...: "Export directly as EPUB" and "Export as EPUB...".

Export directly as EPUB

It directly opens a filepicker to select a destination *.epub file. Let's unzip the saved package and take a look at its guts:

the mimetype file is correctly placed as first file in the package and it's correctly stored without compression

other files are correctly stored using Deflate

the META-INF/container.xml is stored in last position in the zip, which is probably a mistake

the OPF file says it's an EPUB 3.0 package and its metadata are clean; AFAICT, the OPF file is conformant to the spec

XML and XHTML files in the package are serialized without carriage returns (except for one after the XML prolog) or indentation...

an NCX is present

the Navigation Document (called toc.xhtml) and the NCX live side by side in an OEBPS folder (sigh)

there is an empty OEBPS/styles/stylesheet.css file

the content files are in an OEBPS/sections folder

that folder contains 2 files (!): section001.xhtml and section002.xhtml

and looking at these files, LibreOffice seems to have split the original document at section breaks, hence the two sections found in the EPUB package

there is no title element in these files

there is clearly a problem with exported CSS styles, the body of each generated document having no margins or paddings. And since there is no CSS reset either...

the set of LibreOffice styles (the leftmost dropdown in the toolbar) is not exported to CSS; the whole export relies on CSS inline styles (style attributes) and not on classes

the original document uses the "Liberation Serif" font, which is not registered under that name in the OS X Font Book (old issue, well known in the OOXML world...). The final rendition in a browser is then buggy, font-wise. The font-family declarations in the document don't use a fallback to serif.

there is a very weird font-effect: outline property serialized on all paragraphs in table cells

strangely again, all these paragraphs have text-decoration: overline; text-shadow: 1px 1px 1px #666666; while the original text is not overlined nor shadowed

when a paragraph (a p in terms of OOXML) contains one single run of text (an r in OOXML), the output could be optimized by getting rid of the span and adding its inline styles to the parent paragraph. The output is too verbose and will trigger issues in HTML editors, WYSIWYG or not.

the margin values in the document use a mix of inches and pixels, which is kind of weird

the image in the original document is lost in the EPUB package

headers are not generated as h1, h2, ... but as p elements with styles.

the EPUB version does not correctly deal with the unordered list and all list items become regular paragraphs. No ol or ul, no bullet, no counter, no list-style-type. Semantics are lost.
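For readers who want to redo the ZIP-level checks from the top of that list on their own export, here is a minimal sketch in Python using only the standard zipfile module. It is not the procedure I followed (I just unzipped the package and looked at it). The function names inspect_epub_packaging and build_sample_epub are hypothetical, and the sample archive is a toy built in memory for illustration, not the file LibreOffice produced.

```python
import io
import zipfile


def inspect_epub_packaging(epub_file):
    """Report ZIP-level facts about an EPUB (path or binary file object)."""
    with zipfile.ZipFile(epub_file) as archive:
        # Sort by header offset to get the physical order of the entries,
        # rather than trusting the central directory order.
        entries = sorted(archive.infolist(), key=lambda e: e.header_offset)
        names = [entry.filename for entry in entries]
        first = entries[0] if entries else None
        mimetype_first = first is not None and first.filename == "mimetype"
        return {
            # Is the mimetype file the first entry of the package?
            "mimetype_first": mimetype_first,
            # Is it stored without compression?
            "mimetype_stored": mimetype_first
            and first.compress_type == zipfile.ZIP_STORED,
            "mimetype_value": archive.read("mimetype").decode("ascii").strip()
            if "mimetype" in names
            else None,
            # 0-based position of container.xml among the entries, if present.
            "container_position": names.index("META-INF/container.xml")
            if "META-INF/container.xml" in names
            else None,
            "entry_count": len(names),
            # Entries other than mimetype that are not stored with Deflate.
            "not_deflated": [
                e.filename
                for e in entries
                if e.filename != "mimetype"
                and e.compress_type != zipfile.ZIP_DEFLATED
            ],
        }


def build_sample_epub():
    """Build a toy three-entry package in memory (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("OEBPS/content.opf", "<package/>",
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("META-INF/container.xml", "<container/>",
                         compress_type=zipfile.ZIP_DEFLATED)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for key, value in inspect_epub_packaging(build_sample_epub()).items():
        print(key, "=", value)
```

To run it against a real export, pass the path of the saved *.epub file to inspect_epub_packaging instead of the in-memory sample.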

Firefox Quantum viewing the resulting section002.xhtml file. You can clearly see where the html+CSS export is buggy:

How iBooks sees that EPUB:

Export as EPUB...

Aaah, that one is quite different since it first opens the following dialog:

The dialog offers the following choices:

export as EPUB2 or EPUB3 (nice!)

Split at Page Breaks or Headings (very nice feature, but why not also a "Don't split" option?)

and validating the dialog goes to the aforementioned *.epub filepicker.