Using XSLT to create a nice rendering of RSS feeds

Here’s a nifty little usage of XSLT. Check out my RSS feed at https://marc-abramowitz.com/feed/ – with your browser – yep, that’s right with a browser. I know that you’ve already added my feed to My Yahoo! or added my feed to Bloglines or used USM (feed:// URL), or whatever feed reader you use :-).

I’m using XSLT to transform RSS 2.0 into good ol’ HTML and CSS. Here’s the XSL file that I used. Tip of the hat to jr, from whom I borrowed some ideas and code.

After getting the basic idea working, I was annoyed that there were “<p>”s, “&#8217;”s, etc. showing up in the output. At first, I thought that it was not valid to have HTML tags inside of the <description> tag, but this turned out to be wrong. The RSS spec says that it’s OK for HTML to go inside the <description>, as long as it’s escaped or enclosed in a CDATA section. So it’s legal for HTML to appear in there; it’s just that the XSLT engine treats it as plain text and escapes it on output. This escaping can be disabled with the disable-output-escaping feature of XSLT. Unfortunately, this is an optional feature in the XSLT spec, and it is supported by IE but not Firefox (see the Mozilla XSLT FAQ, question 3). Since I, and probably most of my readers, use Firefox, I opted to just strip out and decode the nastiness.

I hacked around and got it to look fairly nice. I had to make some hacks to wp-rss2.php, like adding filters to strip out the <p> tags that the Markdown plugin was adding and to decode “&#8217;”s and the like (since they’re already in a CDATA). Stripping out the <p> tags was easy:

<?php
add_filter('the_excerpt_rss', 'strip_tags', 9);
?>
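
To see what that filter does outside of WordPress, here is a minimal sketch that applies strip_tags to a made-up excerpt (the sample string is an assumption for illustration, not taken from the real feed). The third argument to add_filter is the priority, and lower numbers run first, so a filter registered at 9 runs before anything registered at the default of 10.

```php
<?php
// Illustrative only: $excerpt is a made-up sample, not real feed content.
$excerpt = "<p>Here&#8217;s a <em>nifty</em> little usage of XSLT.</p>";

// strip_tags() removes the markup but leaves character references untouched.
$stripped = strip_tags($excerpt);

echo $stripped, "\n";
```

Note that the tags are gone but the character reference is still there.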

Now I had things like “&#8217;” and such inside my CDATA, which made them render as exactly that, which looked very strange. These numeric character references needed to be decoded. My first instinct was to use the PHP function html_entity_decode, but the manual page and some experimentation showed that this doesn’t work with Unicode yet. Then I poked around the PHP manual and tried to use some things involving iconv, but Dreamhost’s PHP doesn’t have that built in. Not wanting to make a major project out of a little novelty, I just wrote some quick and dirty code to decode the particular characters that were showing up:

<?php
add_filter('the_excerpt_rss', 'strip_tags', 9);
add_filter('the_excerpt_rss', 'decode_entities', 10);

function decode_entities($text) {
   $text = preg_replace('/&#8217;/m', "'", $text); # decimal notation
   $text = preg_replace('/&#8230;/m', "...", $text); # decimal notation
   $text = preg_replace('/&#8243;/m', "\"", $text); # decimal notation
   $text = preg_replace('/&#60;/m', "<", $text); # decimal notation
   $text = preg_replace('/&#62;/m', ">", $text); # decimal notation
   $text = preg_replace('/&#38;/m', "&", $text); # decimal notation

   return $text;
}
?>
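
The list above only covers the characters that happened to show up in my feed. A more general version could decode any decimal reference without needing iconv. The following is only a sketch under some assumptions: it needs PHP 5.3 or later for the anonymous function, it assumes the feed is served as UTF-8, and the function names codepoint_to_utf8 and decode_numeric_entities are hypothetical. Unlike the quick and dirty version, it emits the real Unicode character (a curly quote for &#8217;) instead of an ASCII stand-in.

```php
<?php
// Hypothetical helper: encode a Unicode code point as UTF-8 bytes
// using only core PHP (no iconv, no mbstring).
function codepoint_to_utf8($cp) {
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical generic variant of decode_entities(): replaces every
// decimal reference such as &#8217; with the character it names.
function decode_numeric_entities($text) {
    return preg_replace_callback(
        '/&#(\d+);/',
        function ($m) {
            $cp = (int) $m[1];
            // Leave NUL, surrogates, and out-of-range values untouched.
            if ($cp === 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
                return $m[0];
            }
            return codepoint_to_utf8($cp);
        },
        $text
    );
}

echo decode_numeric_entities('It&#8217;s a &#60;test&#62;&#8230;'), "\n";
```

It could be hooked up the same way as before, with add_filter('the_excerpt_rss', 'decode_numeric_entities', 10). On current PHP versions, html_entity_decode with a UTF-8 charset argument may well do the same job, so it is worth trying that first.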

End result: https://marc-abramowitz.com/feed/

Tip: While hacking around with your RSS feed, it’s a good idea to check it with Feed Validator, to make sure that you’re not breaking it. After all, it’s cool to make your feed look nice in a browser, but feeds are primarily for syndication, so if you break the feed’s well-formedness, you risk feed readers choking on it.

Update 2005-05-23: Here’s an in-depth tutorial from someone who did the same thing.

Update 2005-05-24: The RSS Blog linked to this post, although they misunderstood what I did and spelled my name wrong. 🙂

Update 2005-06-29: Here’s a patch file, which makes it easy to repatch your files after a WordPress upgrade (all too often lately :-)).
