Whenever you introduce a new name into an existing system, you have to worry about name collisions. What if someone else is already using this custom tag in their web page? No matter how obscure and unusably complicated you make your custom tag, the risk of name collision is always non-zero.
So, to mitigate the risk that someone else is already using (or may someday start using) “contactscontrol” as a custom HTML element tag for their own purposes, in conflict with our use of that tag, we should use a namespace to disambiguate our use from anybody else’s. HTML pages can define a namespace identifier like devlive and associate it with a URI in a domain that we control, dev.live.com. Namespaces can be defined in the HTML tag using the xmlns attribute, like this:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:devlive="http://dev.live.com/contactscontrol">
With that definition in place, we can fully qualify our custom HTML tag like this:
<devlive:contactscontrol class="ContactsControl" devlive:privacyStatementURL="privacyPolicy.html" devlive:dataDesired="name,email"> ... </devlive:contactscontrol>
Notice that we can also use the namespace to prefix the attributes within the element, to eliminate the chance that our “dataDesired” attribute collides with another provided by the DOM implementer/browser or other unknown source.
That’s what we would like to use namespaces for in HTML. However, as far as Gallo and I have been able to determine, it doesn’t actually work. (If I’ve overlooked something major, please let me know!)
Everything looks fine at the HTML level. None of the browsers complain too much about the custom tag, with or without the namespace qualifier.
Namespace support was added in the W3C DOM level 2 spec, along with several new methods that add a namespace parameter alongside tagname. For example, element.getElementsByTagNameNS(ns, tagname) is the sibling of the DOM 1 element.getElementsByTagName(tagname). The old methods still work in DOM Level 2, but since they don’t recognize namespaces they may return different results than the newer namespace functions. If you’re writing code to support namespaces, use only the namespace functions. Don’t mix calls to NS and non-NS functions.
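To make the NS-only advice concrete, here is a minimal sketch of a lookup that sticks to the namespace-aware call. The helper name countElementsNS and the DEVLIVE_NS constant are mine, made up for illustration; only getElementsByTagNameNS itself comes from the DOM level 2 spec.

```javascript
// Namespace URI taken from the xmlns:devlive declaration shown earlier.
var DEVLIVE_NS = "http://dev.live.com/contactscontrol";

// Hypothetical helper: counts matching elements using only the
// namespace-aware DOM level 2 call. Returns null when the document
// object does not offer getElementsByTagNameNS at all.
function countElementsNS(doc, namespaceURI, localName) {
  if (typeof doc.getElementsByTagNameNS !== "function") {
    return null; // no DOM level 2 namespace functions available
  }
  return doc.getElementsByTagNameNS(namespaceURI, localName).length;
}
```

In a browser you would call it as countElementsNS(document, DEVLIVE_NS, "contactscontrol"). A null result means the namespace functions are missing entirely, which is a different failure from a count of zero.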
Things start to unravel when you start working with the DOM in code.
First off, IE doesn’t support the DOM level 2 namespace functions at all (neither IE6 nor IE7). That’s not such a huge problem, since I can write my own rudimentary NS functions to backfill in IE.
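A rudimentary backfill along those lines might look like the sketch below. It is not the code we actually shipped; the function name getElementsByPrefix is hypothetical, and it matches on the namespace prefix rather than the namespace URI, which is a real limitation. It assumes a browser reports a prefixed element either as a tagName of "prefix:local" or, as I understand IE to do through its proprietary scopeName property, as a bare tagName plus a separate prefix.

```javascript
// Hypothetical backfill: find elements by namespace prefix and local name
// using only the DOM 1 getElementsByTagName call.
// Assumption: prefixed elements show up either with tagName "prefix:local"
// or with a bare tagName plus a scopeName property holding the prefix.
function getElementsByPrefix(root, prefix, localName) {
  var wantedPrefix = prefix.toLowerCase();
  var wantedLocal = localName.toLowerCase();
  var all = root.getElementsByTagName("*");
  var matches = [];
  for (var i = 0; i < all.length; i++) {
    var el = all[i];
    var tag = String(el.tagName || "").toLowerCase();
    var scope = String(el.scopeName || "").toLowerCase();
    var colon = tag.indexOf(":");
    if (colon >= 0) {
      // Tag reported as "prefix:local": split it and compare both halves.
      if (tag.substring(0, colon) === wantedPrefix &&
          tag.substring(colon + 1) === wantedLocal) {
        matches.push(el);
      }
    } else if (scope === wantedPrefix && tag === wantedLocal) {
      // Tag reported without a prefix: rely on scopeName for the prefix.
      matches.push(el);
    }
  }
  return matches;
}
```

Matching on the prefix means two pages that bind different prefixes to the same URI will not be treated alike, so this is a stopgap rather than real namespace support.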
Next, Firefox claims support for the DOM level 2 NS functions, but when you give them a spin in your HTML page, they don’t work. The functions are acknowledged to exist, but they don’t recognize your namespace. The sample provided in the Mozilla online documentation for element.getElementsByTagNameNS fails to locate any of the <P> elements if you copy the sample into an .html file on your own HTTP server.
A clue to where things are breaking down is in the section title of the W3C DOM level 2 spec cited above: “1.1.8 XML Namespaces”. Namespaces are an artifact of looking at HTML through an XML lens. Firefox effectively disables its DOM level 2 namespace functions unless the document is an XML document. And the icing on the cake is that there is no way for a document to self-declare that it is a valid XML document. The only way to tell the browser to parse a page as XML is to modify the web server to send the page with a different Content-Type in the HTTP header.
Ok, so to test this idea, copy the .html file on the web server to .xml. Just as all web servers have a preconfigured MIME type mapping the .html file extension to a Content-Type of text/html, many also have a MIME type mapping .xml to text/xml.
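The extension-to-Content-Type mapping amounts to a table lookup on the server. Here is a small sketch of the idea; the function name contentTypeFor and the application/octet-stream fallback are my own assumptions, not taken from any particular web server’s configuration.

```javascript
// The two mappings mentioned above; real servers carry many more.
var CONTENT_TYPES = {
  html: "text/html",
  xml: "text/xml"
};

// Hypothetical helper: pick the Content-Type a server would send
// based purely on the file extension.
function contentTypeFor(fileName) {
  var dot = fileName.lastIndexOf(".");
  var ext = dot >= 0 ? fileName.substring(dot + 1).toLowerCase() : "";
  if (Object.prototype.hasOwnProperty.call(CONTENT_TYPES, ext)) {
    return CONTENT_TYPES[ext];
  }
  return "application/octet-stream"; // assumed fallback for unknown extensions
}
```

The point is that the browser never sees the file extension logic. It only sees the resulting Content-Type header, and that header alone decides whether the page is parsed as HTML or as XML.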
I had the pleasure of sharing more than a few lunches with Ian “hixie” Hickson during my stint at Google and talking about absolutely everything except work. Every time I try to put a label on what Ian does I get it wrong, so let’s just say he’s deeply involved with the evolution and formulation of “web standards”. Who he works for, I don’t know (he sat in our group but didn’t report to our manager). What standards bodies, I don’t know. What standards, I don’t know. But he’s really good at it!
After Gallo and I figured out what was going on with Firefox and the NS functions (that it was an HTML vs XHTML/XML document distinction), I was suddenly aware that one little tiny brain cell had been jumping up and down the whole time, trying to tell me to quit farting around with all this empirical crap and just go ask hixie. Or at least, hixie’s web site.
Sure enough, hixie has the topic covered in spades. All the reasons you shouldn’t let XML or XHTML loose on the web as HTML content, why you shouldn’t mix XHTML and HTML, and why we’re all basically screwed until DOM level 2 browsers reach ubiquity.
Hixie also makes a strong argument that XHTML is not HTML, and that numerous subtle differences exist in the semantics of each. Even if we could return content-type = application/xhtml+xml only for browsers that support it, we would still have to support non-namespaced HTML for the browsers that don’t. Oh, and don’t forget that the web server that has to return application/xhtml+xml is the third-party host page’s server, not the Windows Live server. All pain, no gain.
The net of all this is that I’m now advocating internally that we step back from using namespaces as the recommended practice for HTML markup for third parties using our control. We’ll just bob around in the HTML tag soup for the time being. Namespaces are the right thing to do, but the requirements are too high and the coverage too spotty to build upon them in a broad-reach platform right now.
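For the curious, serving application/xhtml+xml only to browsers that support it would come down to inspecting the request’s Accept header on the host page’s server. The sketch below shows one way that check could look. The function name chooseContentType is hypothetical, and the parsing is deliberately simplified: it ignores wildcards and only honors an explicit application/xhtml+xml entry with a non-zero q value.

```javascript
// Hypothetical helper: decide which Content-Type to send for a page,
// given the raw value of the request's Accept header.
function chooseContentType(acceptHeader) {
  var parts = String(acceptHeader || "").split(",");
  for (var i = 0; i < parts.length; i++) {
    var fields = parts[i].split(";");
    var mediaType = fields[0].replace(/^\s+|\s+$/g, "").toLowerCase();
    if (mediaType !== "application/xhtml+xml") {
      continue;
    }
    // Default quality is 1; an explicit q=0 means "not acceptable".
    var q = 1;
    for (var j = 1; j < fields.length; j++) {
      var m = /^\s*q\s*=\s*([0-9.]+)\s*$/i.exec(fields[j]);
      if (m) {
        q = parseFloat(m[1]);
      }
    }
    if (q > 0) {
      return "application/xhtml+xml";
    }
  }
  return "text/html"; // everyone else gets plain HTML
}
```

Even with a check like this in place, the text/html branch still has to produce markup that works without namespaces, which is exactly the duplicated effort described above.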
This post was originally published on my MSDN blog while I was at Microsoft.