WYSIWYG editing is easy with innerHTML.

I heard this a few days ago. It is so blatantly false that I need to write something about it here on my blog. In fact, innerHTML is probably the last API on earth you want to use when writing a DOM-based content editor for the web. Let's suppose you're editing an HTML document whose body contains this:

<p>This is paragraf</p>

and let's suppose you're going to apply two changes to the document:

  1. you're going to fix the spelling mistake,
  2. you're going to switch to some sort of source mode and turn the whole paragraph into <pre>some code</pre>

The first change causes the transaction manager to record the mutation of the content of a text node. That's totally undoable. The transaction is defined by the document, the text node itself, the original textual content and the final textual content. Undoing or redoing the transaction is trivial.
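Here is a minimal sketch of such a text-node transaction. It is written in Python with xml.dom.minidom standing in for the browser DOM, and the class name TextChangeTransaction and its do/undo methods are made up for illustration; a real editor's transaction manager will look different.

```python
from xml.dom.minidom import parseString


class TextChangeTransaction:
    """Hypothetical transaction: one change to the content of one text node."""

    def __init__(self, document, text_node, new_text):
        self.document = document
        self.text_node = text_node      # live reference into the document tree
        self.old_text = text_node.data  # original textual content
        self.new_text = new_text        # final textual content

    def do(self):
        self.text_node.data = self.new_text

    def undo(self):
        self.text_node.data = self.old_text


doc = parseString("<body><p>This is paragraf</p></body>")
text_node = doc.documentElement.firstChild.firstChild

fix_spelling = TextChangeTransaction(doc, text_node, "This is paragraph")
fix_spelling.do()
after_do = doc.documentElement.toxml()
fix_spelling.undo()
after_undo = doc.documentElement.toxml()
```

Note that the transaction only works as long as the stored text node is still the one sitting in the document.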

Your code wants to apply the second change using a transaction based on innerHTML (supposedly simpler than DOM-based transactions). That's undoable too. The transaction is defined by the document, the parent node of the modified element, the original innerHTML and the final innerHTML. No problem. Trivial to write.
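A serialization-based transaction could be sketched like this. minidom has no innerHTML, so the helpers get_inner_markup and set_inner_markup below are assumptions that imitate it by serializing the children and by re-parsing markup with an XML parser; the class name InnerMarkupTransaction is hypothetical too.

```python
from xml.dom.minidom import parseString


def get_inner_markup(element):
    """Serialize the children of element, roughly like reading innerHTML."""
    return "".join(child.toxml() for child in element.childNodes)


def set_inner_markup(element, markup):
    """Replace the children of element with freshly parsed nodes,
    roughly like assigning to innerHTML (XML parser, not an HTML one)."""
    while element.firstChild:
        element.removeChild(element.firstChild)
    wrapper = parseString("<wrapper>" + markup + "</wrapper>").documentElement
    for child in list(wrapper.childNodes):
        # importNode makes a deep copy owned by the target document
        element.appendChild(element.ownerDocument.importNode(child, True))


class InnerMarkupTransaction:
    """Hypothetical transaction storing two serializations of a parent node."""

    def __init__(self, document, parent, new_markup):
        self.document = document
        self.parent = parent
        self.old_markup = get_inner_markup(parent)  # original serialization
        self.new_markup = new_markup                # final serialization

    def do(self):
        set_inner_markup(self.parent, self.new_markup)

    def undo(self):
        set_inner_markup(self.parent, self.old_markup)


doc = parseString("<body><p>This is paragraph</p></body>")
body = doc.documentElement

to_pre = InnerMarkupTransaction(doc, body, "<pre>some code</pre>")
to_pre.do()
after_do = body.toxml()
to_pre.undo()
after_undo = body.toxml()
```

Taken on its own, the transaction round-trips the markup without trouble, which is exactly why it looks so tempting.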

Wait... I said it's a bad solution but I also said both transactions are undoable. That's correct: they are individually undoable. But the second transaction harms the transaction manager and undoing the second transaction makes the first one not undoable. Before the second transaction, your document is:

<p>This is paragraph</p>

and the markup is exactly the same after performing the second transaction and undoing it. The markup is the same, but the document tree is not! The original <p> cannot be recreated with the same internal reference, since setting innerHTML parses the markup into a document fragment made of brand-new nodes. So even if, seen from outside, you have the same document, it's not the same document at all, and the first transaction now holds a reference to a text node that no longer exists in the document tree. In other words, the first transaction, the one fixing the spelling mistake, now refers to a non-existing node; it's obviously not undoable and your whole Undo stack is horked...
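The whole scenario can be replayed in a few lines. This is only a sketch under the same assumptions as before: Python's xml.dom.minidom stands in for the browser DOM, and the hypothetical helper set_inner_markup imitates an innerHTML assignment by re-parsing markup.

```python
from xml.dom.minidom import parseString


def set_inner_markup(element, markup):
    """Stand-in for assigning innerHTML: drop the children, parse new ones."""
    while element.firstChild:
        element.removeChild(element.firstChild)
    wrapper = parseString("<wrapper>" + markup + "</wrapper>").documentElement
    for child in list(wrapper.childNodes):
        element.appendChild(element.ownerDocument.importNode(child, True))


doc = parseString("<body><p>This is paragraf</p></body>")
body = doc.documentElement
original_p = body.firstChild
text_node = original_p.firstChild

# Transaction 1: fix the spelling, keeping a reference to the text node.
old_text = text_node.data
text_node.data = "This is paragraph"

# Transaction 2: replace the content through serializations, then undo it.
markup_before = "".join(child.toxml() for child in body.childNodes)
set_inner_markup(body, "<pre>some code</pre>")
set_inner_markup(body, markup_before)
markup_after = "".join(child.toxml() for child in body.childNodes)

same_markup = markup_after == markup_before  # the serializations match
same_node = body.firstChild is original_p    # but the <p> is a new object

# Undo transaction 1 through its stored reference: it edits an orphan node.
text_node.data = old_text
visible = body.toxml()
```

Here same_markup is true and same_node is false, and the final undo changes a text node that hangs off a detached <p>, so the visible document still reads "This is paragraph".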

Conclusion: do NOT use innerHTML or any method parsing a fragment if you don't want to harm your undo stack. If you really need to rely on it, don't store the original and final serializations; listen to the document's DOM mutation events and record those instead. That will be undoable AND preserve the Undo stack's integrity.
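To illustrate why recording the mutations themselves helps, here is a rough sketch of a transaction that stores the removed and inserted node objects, which is the kind of information a mutation listener would hand you. minidom has no mutation events, so the nodes are collected by hand here, and the class name ReplaceChildrenTransaction is an assumption made for the example.

```python
from xml.dom.minidom import parseString


class ReplaceChildrenTransaction:
    """Hypothetical transaction that keeps the node objects themselves."""

    def __init__(self, parent, new_children):
        self.parent = parent
        self.old_children = list(parent.childNodes)  # same objects, not copies
        self.new_children = list(new_children)

    def _swap(self, outgoing, incoming):
        for node in outgoing:
            self.parent.removeChild(node)
        for node in incoming:
            self.parent.appendChild(node)

    def do(self):
        self._swap(self.old_children, self.new_children)

    def undo(self):
        self._swap(self.new_children, self.old_children)


doc = parseString("<body><p>This is paragraf</p></body>")
body = doc.documentElement
text_node = body.firstChild.firstChild

# Transaction 1: fix the spelling, keeping a reference to the text node.
old_text = text_node.data
text_node.data = "This is paragraph"

# Transaction 2: swap in a <pre>, then undo by putting the old nodes back.
pre = doc.createElement("pre")
pre.appendChild(doc.createTextNode("some code"))
to_pre = ReplaceChildrenTransaction(body, [pre])
to_pre.do()
during = body.toxml()
to_pre.undo()

# The text node from transaction 1 is back in the tree, so undo still works.
still_attached = text_node.parentNode.parentNode is body
text_node.data = old_text
restored = body.toxml()
```

Because the original <p> object itself is reinserted, the older transaction's reference stays valid and the document ends up back at "This is paragraf".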

Update. There is only one way an editor could relatively simply use innerHTML inside its Undo stack's transactions, but it requires changing all other transactions: instead of keeping a reference to a node inside a transaction, the editor could keep an XPath to the right target node... That would work. But computing the XPath and querying it has a cost; you lose in performance what you gain in refcount on the document nodes. All in all, I'm not sure it's worth it unless innerHTML is absolutely needed for your project.
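A sketch of the path-based idea, with the usual caveats: minidom has no XPath support, so a list of child indexes plays the role of the XPath here, and the helpers path_to, resolve and set_inner_markup are hypothetical names invented for this example.

```python
from xml.dom.minidom import parseString


def path_to(node, root):
    """Child indexes from root down to node (a crude stand-in for an XPath)."""
    path = []
    while node is not root:
        parent = node.parentNode
        path.append(parent.childNodes.index(node))
        node = parent
    return list(reversed(path))


def resolve(path, root):
    """Walk the stored indexes to find whatever node sits there now."""
    node = root
    for index in path:
        node = node.childNodes[index]
    return node


def set_inner_markup(element, markup):
    """Stand-in for assigning innerHTML: drop the children, parse new ones."""
    while element.firstChild:
        element.removeChild(element.firstChild)
    wrapper = parseString("<wrapper>" + markup + "</wrapper>").documentElement
    for child in list(wrapper.childNodes):
        element.appendChild(element.ownerDocument.importNode(child, True))


doc = parseString("<body><p>This is paragraf</p></body>")
body = doc.documentElement

# Transaction 1 stores a path to its target instead of the node itself.
text_node = body.firstChild.firstChild
target_path = path_to(text_node, body)
old_text = text_node.data
text_node.data = "This is paragraph"

# Transaction 2: serialization-based change, done and then undone.
markup_before = "".join(child.toxml() for child in body.childNodes)
set_inner_markup(body, "<pre>some code</pre>")
set_inner_markup(body, markup_before)

# Undo transaction 1 by resolving the path against the current tree.
resolve(target_path, body).data = old_text
restored = body.toxml()
```

The undo now lands on the freshly parsed text node, at the price of walking the tree on every do and undo.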