Sensible RSS feeds for "link logs"

A few days ago I set myself up with another weblog just for storing links to pages (RSS feed is here). Sometimes it’s enough to point people at a site without discussing it. And sometimes I just want to store a link in case I need it again.

If you don’t care about RSS feeds, you can switch off here. I thought an RSS feed for such a simple weblog would be easy. Taking a previous post as an example, I thought an RSS 1.0 item might look something like this:

<item rdf:about="http://www.gyford.com/phil/links/2003/09/index.php#n1371">
<title>Gore Vidal on why America is hated</title>
<link>http://www.meaus.com/gore-vidal-interview.htm</link>
<description>From July 2002.</description>
</item>

I’ve left off some of the other elements so we can focus on the important stuff. The rdf:about attribute on the item points to where my weblog entry lives. The <link> points to the page I’m referring to, and the <title> describes it. The <description> then holds any further information. I didn’t expect to need the description often, but it might be handy, as here.

This was all simple enough until I started looking at the similar feeds of others, so I could include them in a new, and currently slightly flaky, section on Haddock Blogs. Danny O’Brien’s ObLinks feed is a nice and simple v0.91 one, but does things very differently:

<item>
<title>&lt;a href=http://labs.google.com/location&gt;google: search by location&lt;/a&gt;</title>
<link>http://www.commonhouse.net/blog/oblinks/2003/09/22#auto3f6fb5565cf45</link>
<description>&lt;a href=http://labs.google.com/location&gt;google: search by location&lt;/a&gt;</description>
</item>

Both the title and description contain the encoded HTML for generating a linked piece of text, while the link element is a link to his original weblog entry. Although more complicated, Matt Webb’s feed (which I’ve simplified here) is quite similar:

<item rdf:about="http://interconnected.org/home/mini/archive/2003/09/26/#106459281494049104">
<title>Colour scheme explorer, with many axes...</title>
<link>http://interconnected.org/home/mini/archive/2003/09/26/#106459281494049104</link>
<description><![CDATA[<a href="http://www.pixy.cz/apps/barvy/index-en.html">Colour scheme explorer, with many axes and combination styles</a>]]></description>
</item>

The difference here is that the title only contains text, with the HTML link and text combination only in the description. But the title is actually just a truncated version of the text in the description. Tom Coates comes up with another variation in his feed:

<item rdf:about="http://www.test.org.uk/archives/000979.html">
<title>Matt Locke on must-have book, &quot;New Media: 1740-1915&quot;</title>
<link>http://www.test.org.uk/archives/000979.html</link>
<description><![CDATA[<a href="http://www.test.org.uk/archives/000979.html">Matt Locke on must-have book, &quot;New Media: 1740-1915&quot;</a>]]></description>
</item>

This is similar to Matt’s, except that Tom uses the URL he’s pointing to in both the link element and the item’s rdf:about attribute. While not needed for Haddock Blogs, I also took a look at Jason Kottke’s remaindered links feed to see how he did things. Another slight twist:

<item rdf:about="http://jameswagner.com/mt_archives/003571.html">
<title>ON TRIAL! NOT ON STAGE! LIKE GOEBBELS AND LORD HAW HAW</title>
<link>http://jameswagner.com/mt_archives/003571.html</link>
<description>ON TRIAL! NOT ON STAGE! LIKE GOEBBELS AND LORD HAW HAW (Dissent at the Jeffrey Goldberg and Paul Wolfowitz New Yorker Festival event)</description>
</item>

This differs from Tom’s by dropping the HTML link-and-text combination entirely and repeating the contents of the title element in the description (occasionally, as here, with extra text appended).
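To give a feel for what a consumer of these feeds has to do, here is a rough Python sketch that guesses the target URL, the display text and the permalink from any of the item shapes above. It assumes the XML has already been parsed, so the title, link and description arrive as decoded strings. The function name normalise and the rule "an embedded anchor wins over the link element" are my own inventions for illustration, not anything the feeds themselves promise.

```python
import re
from html import unescape

# First <a href=...> in a string. Quotes around the URL are optional because
# some feeds (the ObLinks sample, for one) leave them out.
ANCHOR = re.compile(r'<a\s+href=["\']?([^"\'\s>]+)["\']?[^>]*>(.*?)</a>',
                    re.I | re.S)

def normalise(title, link, description="", about=None):
    """Guess (target_url, text, permalink) from decoded element text."""
    for field in (description, title):
        match = ANCHOR.search(field or "")
        if match:
            # Assumption: an embedded anchor is the real target, and the
            # link element is then the weblog permalink.
            return match.group(1), unescape(match.group(2)), link
    # No embedded HTML: assume link is the target and rdf:about the permalink.
    return link, title, about
```

Even this much guessing gets things wrong for some feeds: when the permalink and the target share one URL, as in Tom’s and Jason’s items, there is no way to recover whichever one is missing.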

Altogether that makes five different ways to do feeds for perhaps the simplest kind of weblog. All are valid, but the variety makes parsing, storing and displaying data from all of them a minor nightmare. One never knows whether a particular element will contain text that’s repeated in another element, or HTML that may itself contain repeated data. Obviously, I think my method makes most sense, and, without wishing to offend anyone or claiming any great RSS expertise, here’s why:

  • An element’s tag should reflect its contents. Danny puts an encoded HTML link and text within the title. This obviously isn’t a “title”, and makes it hard to guess what to do with a feed’s contents.
  • Don’t repeat data unnecessarily. It seems pointless to fill up description elements with either encoded HTML containing repeated links/text (Tom and Matt) or with the same text as in the title element (most, but not all, of Jason’s). Perhaps a content:encoded element, as in Ben Hammersley’s main blog feed, would be a suitable element for satisfying any need to create pre-formed HTML links?
  • Use the link element to point to the page you want to show people (unlike Danny and Matt). This is different from normal weblog feeds, where you’re pointing people to your weblog entry. In brief “link logs”, there’s nothing to be gained by the user visiting your website, so send them straight to the object of your post.
  • On the other hand, the item’s rdf:about attribute should point to your weblog entry. The link element points to the object of your post, but the item as a whole is data about your weblog post. So the rdf:about value should be a unique pointer to your post (unlike in Jason’s and Tom’s).
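The four points above can be turned into a crude checker. This is only a sketch in Python, assuming the element text has already been decoded by an XML parser. The function name check_item and the exact tests are my own shorthand for the criteria, and they are deliberately simple-minded string comparisons.

```python
def check_item(title, link, description="", about=None):
    """Return a list of complaints about one item; an empty list means it passes."""
    problems = []
    if "<" in title:
        problems.append("title contains markup")
    if "<a " in description.lower():
        problems.append("description contains an HTML link")
    stem = title.rstrip(".")  # tolerate titles truncated with "..."
    if stem and stem in description:
        problems.append("description repeats the title")
    # Feeds without rdf:about (RSS 0.91, say) skip this check.
    if about is not None and about == link:
        problems.append("rdf:about and link are identical")
    return problems
```

The last test can only notice that the two URLs are the same. It can’t say whether the shared URL is the weblog entry or the page being pointed at, which is rather the point of keeping them separate.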

So, to repeat, here’s my brief description of an item that satisfies these criteria and, to me, makes semantic sense:

<item rdf:about="[URL of weblog post]">
<title>[Brief text describing what you're pointing at]</title>
<link>[URL you want to point people to]</link>
<description>[Optional additional text about what you're pointing at]</description>
</item>
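For anyone generating such a feed by script, here is a minimal Python sketch that builds one item in this shape using the standard library’s ElementTree. The function name build_item is made up for illustration. The two namespace URIs are the standard RSS 1.0 and RDF ones, and the sample values are taken from my own item earlier in this post.

```python
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ET.register_namespace("", RSS)      # serialise RSS elements without a prefix
ET.register_namespace("rdf", RDF)   # serialise the attribute as rdf:about

def build_item(post_url, title, target_url, description=None):
    """Build one item: rdf:about is the weblog post, link is the target page."""
    item = ET.Element(f"{{{RSS}}}item", {f"{{{RDF}}}about": post_url})
    ET.SubElement(item, f"{{{RSS}}}title").text = title
    ET.SubElement(item, f"{{{RSS}}}link").text = target_url
    if description:  # the description is optional, so omit it when empty
        ET.SubElement(item, f"{{{RSS}}}description").text = description
    return item

item = build_item(
    "http://www.gyford.com/phil/links/2003/09/index.php#n1371",
    "Gore Vidal on why America is hated",
    "http://www.meaus.com/gore-vidal-interview.htm",
    "From July 2002.",
)
print(ET.tostring(item, encoding="unicode"))
```

Letting the library write the text also takes care of escaping characters like & and <, which is one less reason to put hand-encoded HTML in the title.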

This could be expanded with dc:date, content:encoded, etc., but it seems like a sensible framework from which to start. I’d appreciate any comments at all, because I’m feeling my way through this and would like to know where I might be wrong.