January 1, 2008
A blog engine I used to use, e107, created links like “http://iandouglas.com/page?3.0” (my old SpamAssassin trainer tutorial page), so I decided I should at least attempt to toss in a few mod_rewrite rules to even up the playing field, so that search engines and people with bookmarks could still get to the tutorial text, which I’ve started adding to the new CMS.
Trouble is, something simple like this doesn’t work in mod_rewrite:
RewriteRule ^page.php?3.0$ /spamassassin-trainer/ [R=301,L]
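It doesn’t work for two reasons. First, a RewriteRule pattern is only matched against the path part of the URL, so the query string (everything after the “?”) never reaches it. Second, the “?” is a reserved regular expression (regexp) character meaning “whether or not the preceding character or group existed”. For example: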
RewriteRule ^abc?def.html$ ...
… would match “abcdef.html” and “abdef.html” because the “?” means “whether or not the ‘c’ was part of the filename.”
For my use, where my old URLs look like “page.php?3.2” (my biggest beef with e107 was its SEO-unfriendly URLs), I needed to find a way to tackle the “?” issue.
I tried escaping the question mark, but that didn’t work:
RewriteRule ^page.php\?3.0$ /spamassassin-trainer/ [R=301,L]
I tried putting (.) in its place, which tells the regexp engine to match any single character, but that didn’t work either:
RewriteRule ^page.php(.)3.0$ /spamassassin-trainer/ [R=301,L]
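For what it’s worth, the usual mod_rewrite way to match a query string is a RewriteCond on %{QUERY_STRING} placed in front of the rule, since the RewriteRule pattern itself only ever sees the path. The following is a minimal sketch rather than something I ran on this site. It assumes the rules live in an .htaccess file in the document root with mod_rewrite enabled, and it uses the page IDs from my old e107 site:

```apache
RewriteEngine On

# Old URL: /page.php?3 or /page.php?3.0
# The condition tests the query string; the rule pattern only tests the path.
RewriteCond %{QUERY_STRING} ^3(\.0)?$
# The trailing "?" on the target drops the old query string from the redirect.
RewriteRule ^page\.php$ /spamassassin-trainer/? [R=301,L]

# Old URL: /page.php?3.1
RewriteCond %{QUERY_STRING} ^3\.1$
RewriteRule ^page\.php$ /sa-trainer-assumptions-and-terminology/? [R=301,L]
```

Each RewriteCond only applies to the RewriteRule that immediately follows it, so every old page needs its own condition/rule pair. In server or virtual-host context (instead of .htaccess) the pattern would need a leading slash, as in ^/page\.php$.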
In the end, since the SilverStripe CMS that I moved to didn’t include any script called page.php, I just created my own page.php, and it looks like this (edited for brevity):
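<?php
// Everything after the "?" in the requested URL, e.g. "3.0" for page.php?3.0
$querystring = $_SERVER['QUERY_STRING'] ;
// Default destination for any old page that has no match below
$newpage = "http://iandouglas.com/news/" ;
switch($querystring) {
    case "3" :
    case "3.0" :
        $newpage = "/spamassassin-trainer/" ;
        break ;
    case "3.1" :
        $newpage = "/sa-trainer-assumptions-and-terminology/" ;
        break ;
}
// Send the 301 status line first, then the new location
header("HTTP/1.1 301 Moved Permanently") ;
header("Location: ".$newpage) ;
// Stop here so nothing else is sent after the redirect headers
exit ;
?>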
The key here was just looking at the incoming query string, doing a switch() check on it, and picking a new URL for each match. Then, to cover what [R=301] does for a “permanently moved” redirect in mod_rewrite, you simply make a header() call in PHP with the HTTP/1.1 status line before the header() call that bounces the browser to the new page.
This worked really well at the time, but it certainly wasn’t very scalable in the long run to maintain such a list of every page I had ever written. I debated doing something similar for my old blog posts until I got them all moved, but in the meantime a mod_rewrite redirect for the news.php script works just fine to jump to iandouglas.com/oldsite/news.php.
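The news.php redirect mentioned above needs no query-string handling at all, because only the path changes. A rough sketch of what such a rule could look like, again assuming an .htaccess file in the document root; the R=301 flag is an assumption, and a temporary redirect (R=302) may be the better fit while posts are still being moved:

```apache
RewriteEngine On

# Send /news.php to its copy under /oldsite/.
# The target has no query string of its own, so mod_rewrite passes the
# original one (if any) through unchanged.
RewriteRule ^news\.php$ /oldsite/news.php [R=301,L]
```

The pattern is anchored at the start, so a request for /oldsite/news.php does not match it again and the redirect does not loop.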