Monday, December 3, 2007

Display is Data

Traditionally, the judgment has been that HTML is entirely about display, is generally throwaway, and at best is something for graphic designers to lovingly handcraft into skillful, brittle, baroque shapes.

Or to put it more bluntly: “HTML is the crap the server has to spit out and we don’t care what it is, just as long as it makes the browser do what the spec says.”

But what about a world where HTML is data returned from web services, display is the domain of CSS only, and JavaScript runs on the margins of the page rather than being embedded in the HTML?

For example:

Say you need to build a page that is going to display a table of data. Let’s say it’s a table of charges on an account that includes name, amount, and date. In standard web development practice, this page would be given to a graphic designer and then to a developer who will plug data into the designer’s display, and you might end up with something like this:

<table class="dataTable">
  <tr>
    <td class="tableHeaderLeftSelected">Name</td>
    <td class="tableHeader">Amount</td>
    <td class="tableHeaderRight">Date</td>
  </tr>
  <tr>
    <td class="tableDataLeft"><span class="bold">Bob Smith</span></td>
    <td class="tableData">57.50</td>
    <td class="tableDataRight"><span class="highlight">December 1st, 2007</span></td>
  </tr>
</table>

The above example was driven by the following requirements:

  1. the first column needs to be left aligned
  2. the last column needs to be right aligned
  3. all other columns need to be center aligned
  4. dates need to be highlighted
  5. names need to be bold

What’s so bad about this? Nothing terrible. In fact, this is a big improvement over a lot of markup I have seen (no inline styles and relatively clear class names), but there are a number of ways it could be improved:

  1. use the HTML structure as data (hence the “Cascading” in CSS)
  2. break up the classes into semantic tags
  3. tag data, not styles

Here is the transformed HTML:

<table class="data accounts">
  <tr>
    <td class="header first selected">Name</td>
    <td class="header">Amount</td>
    <td class="header last">Date</td>
  </tr>
  <tr>
    <td class="data first name">Bob Smith</td>
    <td class="data currency dollars">57.50</td>
    <td class="data last date">December 1st, 2007</td>
  </tr>
</table>

This not only lets you build cleaner CSS, but also loads the markup with latent capabilities. For example, if months down the road you want to make all ‘dollars’ data green, you only need to make one change in the CSS. It also provides clean hooks for plugging functionality into this HTML using JavaScript without putting any code inline. This has a lot of advantages in browser mashups, where the content of a page is built on the fly and we want to defer the definition of most behaviors to the controlling page. For example, if the above HTML were returned as the result of an AJAX request, a querying utility such as dojo.query could be used to pick out the headers and attach event handlers to control sorting in a way appropriate to the particular page:

var headersArray = dojo.query(".header");

for (var i = 0; i < headersArray.length; i++) {
  // attach event handler to each header
}
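To make the comment in that loop a little more concrete, here is a minimal sketch of what a sort handler might do once a header is clicked. It is not Dojo-specific and is not taken from any real page: hasClass, text, comparatorFor, and sortRows are hypothetical helper names, and a cell is assumed to be any object that exposes className and textContent, as DOM elements do. The point is that the comparator is picked from the data classes (such as "currency") rather than from anything presentational.

```javascript
// Hypothetical helpers for illustration; not part of Dojo or the original page.

// True if the space-separated class string on the cell contains the token.
function hasClass(cell, token) {
  return (" " + cell.className + " ").indexOf(" " + token + " ") !== -1;
}

// The cell's text with leading and trailing whitespace removed.
function text(cell) {
  return String(cell.textContent).replace(/^\s+|\s+$/g, "");
}

// Choose a comparator from the kind of data the cell is tagged as.
function comparatorFor(cell) {
  if (hasClass(cell, "currency")) {
    // Amounts are compared as numbers, so "9.25" sorts before "57.50".
    return function (a, b) { return parseFloat(a) - parseFloat(b); };
  }
  // Anything else (for example "name") falls back to plain text comparison.
  return function (a, b) { return a < b ? -1 : (a > b ? 1 : 0); };
}

// Return a sorted copy of rows (each row is an array of cells),
// ordered by the column at columnIndex. The input array is not modified.
function sortRows(rows, columnIndex) {
  if (rows.length === 0) {
    return [];
  }
  var compare = comparatorFor(rows[0][columnIndex]);
  return rows.slice().sort(function (rowA, rowB) {
    return compare(text(rowA[columnIndex]), text(rowB[columnIndex]));
  });
}
```

In a browser, the handler attached to each header could collect the data rows, call something like sortRows with that header's column index, and re-append the rows in the new order. A comparator for the "date" class could be added in the same way.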

Of course, HTML is not an alternative to XML or other pure data formats, but it can provide a useful middle ground in many content syndication scenarios. And overall, data is a good paradigm to start with when implementing HTML. The W3C sums this up well in the Working Draft for HTML 5:

HTML should allow separation of content and presentation. For this reason, markup that expresses structure is usually preferred to purely presentational markup. However, structural markup is a means to an end such as media independence. Profound and detailed semantic encoding is not necessary if the end can be reached otherwise. Defining reasonable default presentation for different media may be sufficient. HTML strikes a balance between semantic expressiveness and practical usefulness.
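As a rough illustration of the "middle ground" idea, the sketch below reads one row of the transformed table back out as a plain record, using only the data classes on each cell. Everything named here is an assumption for illustration: rowToRecord is a hypothetical helper, KNOWN_FIELDS is an assumed vocabulary of data classes, and a cell is any object with className and textContent, as a DOM element has.

```javascript
// Assumed vocabulary of data classes; a real page would define its own.
var KNOWN_FIELDS = ["name", "currency", "date"];

// Hypothetical helper: turn one row of tagged cells into a plain record.
function rowToRecord(cells) {
  var record = {};
  for (var i = 0; i < cells.length; i++) {
    var tokens = cells[i].className.split(/\s+/);
    var value = cells[i].textContent.replace(/^\s+|\s+$/g, "");
    for (var j = 0; j < KNOWN_FIELDS.length; j++) {
      var field = KNOWN_FIELDS[j];
      if (tokens.indexOf(field) !== -1) {
        // Currency values become numbers; everything else stays as text.
        record[field] = field === "currency" ? parseFloat(value) : value;
      }
    }
  }
  return record;
}
```

Presentational classes such as "first" and "last" are simply ignored, which is the practical payoff of tagging data rather than styles: a consumer of the markup can pick out what it understands and skip the rest.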

2 comments:

robinb said...

th for table header cells
td for table data cells

Otherwise screenreaders won't like you very much!

;-)

n kolba said...

yes, you are right. Should be "th", which of course eliminates the need for the "header" class. Even more semantic.