Why html5 elements INS and DEL suck
By glazou on Saturday 25 June 2011, 12:26 - Standards - Permalink
I have said it multiple times here, on W3C mailing lists and in public between 1998 and now, but apparently it must be said again and again: the current HTML5 Last Call Working Draft (which does not come close to the quality of other LCWDs at the W3C and did not meet the basic requirements for a LCWD in the W3C Process) still has not addressed that erratum. So let me repeat it: the html5 ins and del elements suck and should be dropped in favor of a better solution.
- ins and del are, by definition, both inline-level and block-level elements. If, in a Wysiwyg editor, you select the textual contents of a paragraph, turn on a "Visible Modification Marks" feature and hit the Delete or Backspace key, the editor has to choose between <del><p>....</p></del> and <p><del>...</del></p>. The user has no way to tell the two apart, but they are NOT strictly equivalent. In the latter case, it is still theoretically possible to place the caret in the paragraph but BEFORE or AFTER the del element and insert new characters. In the former case, the whole paragraph is deleted and the user can't insert anything inside it any more.
- In the latter case just above, it's impossible for the user to know whether a caret placed at the beginning of the paragraph is before the paragraph, inside the paragraph but before the del element, or at the beginning of the del element.
- Much more importantly, ins and del cannot cover one trivial case: since there is no equivalent to SGML inclusions (see for instance this link for a rather clean explanation) in XML, the following is impossible: <ul><del><li>a</li></del><li>b</li></ul>. It is for instance totally impossible to mark an element as entirely deleted if the parent container's model does not allow the del element...
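To make these cases concrete, here they are side by side as markup. This is only an illustrative sketch: the text content is made up, and only the element nesting comes from the points above.

```html
<!-- Case 1: del wraps the whole paragraph (block-level use).
     The paragraph itself is marked as deleted. -->
<del><p>Some deleted text</p></del>

<!-- Case 2: del sits inside the paragraph (inline-level use).
     The paragraph survives; only its contents are marked as deleted.
     Both cases look the same to the user of a Wysiwyg editor. -->
<p><del>Some deleted text</del></p>

<!-- Case 3: NOT allowed. The content model of ul does not accept
     del as a child, so a whole list item cannot be marked as deleted. -->
<ul><del><li>a</li></del><li>b</li></ul>
```

The closest conforming alternative to case 3 would be to put the del inside the li, which marks the item's contents as deleted but not the item itself.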
The situation is unfortunately very clear: the ins and del elements as they exist now in the various html specs are unable to provide editing environments with a workable and predictable solution for Visible Modification Marks, the primary reason why the elements were originally introduced in HTML 4. As a matter of fact, almost no Wysiwyg editor implements them.
For the n-th time in 13 years, I strongly recommend dropping the ins and del elements in favor of the following attributes. All elements inside the body element should be able to carry them.
- change attribute; possible values: inserted, deleted, optionally followed by whitespace and one of the keywords reviewed or to-be-reviewed.
- review-by attribute; an arbitrary value, meaningful only when the change attribute contains the to-be-reviewed value and meant to be displayed for human consumption; it can be for instance a name, an email address, a Twitter id, etc.
- reviewed-by attribute; an arbitrary value, meaningful only when the change attribute contains the reviewed value and meant to be displayed for human consumption; it can be for instance a name, an email address, a Twitter id, etc.
- the cite and datetime attributes as currently defined in the html5 spec
This is the minimum set of attributes needed to resolve the issue. Another attribute "tagging" the potential reviews of the proposed change could also be added.
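Here is a rough sketch of what the proposed attributes could look like on the list case that del cannot express. None of these attributes exists in any html spec; change, review-by and reviewed-by are only the proposal made in this post, and the reviewer names, the cite URL and the dates are invented placeholders.

```html
<ul>
  <!-- The whole list item is marked as deleted and awaits review.
       No wrapper element is needed, so the ul content model is respected. -->
  <li change="deleted to-be-reviewed"
      review-by="jane@example.org"
      datetime="2011-06-25T12:26Z">a</li>
  <li>b</li>
</ul>

<!-- An inserted paragraph whose insertion has already been reviewed.
     The cite URL is hypothetical. -->
<p change="inserted reviewed"
   reviewed-by="John Doe"
   cite="http://example.org/changes/42"
   datetime="2011-06-25T12:26Z">A newly inserted paragraph.</p>

<!-- Inline changes still work by putting the attribute on a span. -->
<p>Some <span change="deleted">old</span> <span change="inserted">new</span> text.</p>
```

Because the mark lives on the element itself, there is only one way to say "this paragraph is deleted", which removes the ambiguity between the block-level and inline-level forms.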
I really hope this change is going to happen. Again, the current ins and del html elements are totally hopeless.
Comments
You're absolutely right that ins and del need to be sorted out, although I've never really thought about it. I would suggest that some of these attributes have a prefix like the ARIA attributes do. For example: "mod-change" or "mod-reviewer".
@Josh T.: I have no opinion here; it works for me as long as we focus on attributes and not on both-block-and-inline elements.
A lot more complicated, I guess, would be a system that has no notion of nested markup, because inserting and erasing content has nothing to do with its markup structure.
We can imagine something like this. I use [fndel] here so as not to imply any specific markup; I don't know what it should be.
<p class="1stpar">balabla [fndel] abablabla</p>
<p class="2ndpar"> glglgl ajhk [/fndel] djhfjkhk dsjhdh jh</p>
This would be useful. I remember an XTech talk about something like this. Maybe by Jeni Tennison. Not sure anymore.
Found the original article that the presentation was based on:
http://academic.research.microsoft....
The a element has the same content model. Same issues, I guess?
Maybe you should create a microdata format with your suggested features and use those instead of ins and del.
Two remarks:
- (With my minimalist hat screwed firmly on my head) It seems to me that a single "reviewer" attribute would do the job of both "review-by" and "reviewed-by", which are mutually exclusive. So the proposed set is not minimal, or did I miss something? (Takes off hat, it's hot!)
- I'll note in passing that the XML editor XMetal uses processing instructions to implement change marks; this is therefore parallel to the markup structure, as Karl suggests. But I'll also note that in a publishing system that processes content coming from XMetal, we transform these PIs into attributes (introducing synthetic <span> elements where needed), for later consumption by XSLT.
HTML sits somewhat in between two domains (or layers, if you want): one that would be latex or all the markup languages used in today's CMSs (markmin and the like), and one that would be a kind of postscript or tex equivalent.
Not necessarily a major issue, but meanwhile "IT" in general (the people practising it) still totally shies away from its huge need for identifiers in order to get written, thinking it is different from other domains (or that identifiers could work like words), which of course isn't the case. Why not a simple abstract-syntax-only format with numerical IDs (distributed the way GS1 bar codes are) used as operator identifiers, for instance?
Discussion around this:
http://groups.google.com/group/comp...