
Alden Bates' Weblog

Website Management Archives


October 26, 2004

It never ceases to amaze me

People can get into heated discussions about almost anything, including whether links should be underlined on web pages. Personally I tend to have little trouble distinguishing links, whether they're underlined or not.

Tactical faux pas I've seen today on the web: black text on black background, black text on bright-green background (MY EYES!!!!)

Mind you this is mostly because I've been voting weblogs into industries on BlogShares.

Posted at 9:08 PM | Comments (0)

October 21, 2004

#1 authority on Pease Pottage

A search on Google reveals they rank me as the net's foremost authority on Pease Pottage. Oddly enough, the nearest I've been to England is Chicago, Illinois.

Posted at 5:59 PM | Comments (0)

October 12, 2004

Nudging the Proxy

As I've mentioned before, I've been having trouble with Paradise's proxy server in that it tends to serve up old versions of documents from tetrap.com without consulting the server. I discovered today that you can prevent this by setting a header when the web server sends the page out. By setting the "Cache-Control" header to "proxy-revalidate", you instruct the proxy server to recheck with the server before reusing a cached copy that has gone stale, rather than serving up the old version regardless.

Setting this for the NZDWFC message board was trivial. Since it's a CGI script, I just had it write "Cache-Control: private, proxy-revalidate" as one of the headers. (The private argument specifies that the proxy server should not return a copy it cached for one user to a second user. This is essential if you do any sort of personalisation.)
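For illustration, here is a minimal sketch of a CGI-style script writing that header. It is in Python rather than whatever the message board is actually written in, and the function name and page body are made up for the example.

```python
import sys


def build_response(body: str) -> str:
    """Return a CGI response: header lines, a blank line, then the page body."""
    headers = [
        "Content-Type: text/html; charset=utf-8",
        # "private": a shared cache must not hand one user's copy to another.
        # "proxy-revalidate": a proxy must revalidate a stale copy with the
        # server before reusing it.
        "Cache-Control: private, proxy-revalidate",
    ]
    # CGI headers end with an empty line; everything after it is the document.
    return "\r\n".join(headers) + "\r\n\r\n" + body


if __name__ == "__main__":
    # Hypothetical page body, purely for demonstration.
    sys.stdout.write(build_response("<p>Hello from the message board</p>"))
```

The only part that matters is that the Cache-Control line goes out before the blank line that separates headers from content.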

However I wanted to set the header on the NZDWFC index page as well. Here's where it gets tricky. To set the "Cache-Control" header for the index document, I'd have to use the Header directive in my .htaccess file. Unfortunately when I attempted this, it broke the site. I suspect this may mean that the server is not compiled with mod_headers, which it would need to be in order for me to use the Header directive...

I've dropped a question in on the HostForWeb forums to find out if mod_headers is compiled in or not. If it turns out it is, I'm probably just doing something wrong. :) If not, I'll have to find another solution.

Posted at 8:09 PM | Comments (0)

October 3, 2004

Tetrapyriarbus Zeitgeist for September

Because I can't think of anything interesting to say...

Popular DiscContinuity Guide entries

  1. The Natural History of Fear
  2. Creed of the Kromon
  3. Land of the Dead (No, I don't know why)
  4. Davros
  5. Zagreus

Popular musicians

  1. Enigma
  2. Enya
  3. Mike Oldfield
  4. Dead Can Dance
  5. Era

Top archived TSV artwork

  1. Davros (TSV 19 cover)
  2. Daleks playing computer games (cartoon)
  3. Sixth Doctor and Peri cartoon
  4. Peanuts cartoon
  5. Cyberman cartoon

Posted at 5:53 PM | Comments (0)

September 27, 2004

Updates on previous entries

My strategy for dealing with Yahoo's redirect problem seems to be working. Yahoo has dropped over one hundred redirected pages for www.tetrap.com from its index and is starting to index the proper URLs at nzdwfc.tetrap.com. This should hopefully reduce the number of unnecessary redirects and lessen server load.


It occurred to me that the reason BlogSpot is over-represented at BlogShares is that Blogger's free accounts ping weblogs.com by default, and that's where BS picks up new blogs. Add to this the fact that Google pimps Blogger via the Google Toolbar, and it's little wonder many of the listed blogs are BlogSpot blogs.

OTOH, LiveJournal is poorly represented (around 3000 listed) because only paid LJ accounts (accounting for about 4% of all LJ accounts) can ping weblogs.com, and the pinging is turned off by default (ISTR). LiveJournal doesn't do much in the way of working with the rest of the blogosphere.

There are even fewer Xangas (about a hundred or so), so I'm guessing Xanga doesn't ping weblogs.com at all.


I also ran Ad-Aware today for the first time in a while, and it reported that a cookie my site generates is a tracking cookie. I suspect this may be because it contains the user's email address. Pah.

Posted at 8:25 PM | Comments (0)

September 15, 2004

The Right Tool

Today's lesson: get the right tool for the job and the job will become much easier.

A long while ago now I opted to mirror Paul Harman's Web Guide to Doctor Who in order to avoid having to maintain my own directory of Doctor Who links. This didn't quite work for two reasons:

  1. People still emailed me to ask me to add a link to their web site, despite the word "Mirror" in large letters and a link to the page for adding sites to the Web Guide, and
  2. I ended up volunteering to assist Paul in maintaining the Web Guide.

The Web Guide has 21 sections and almost 700 sites listed in it. It's pretty kick ass and I don't think there's another comparable Doctor Who directory in existence. Most other sites mirroring the Guide would simply use a copy of the HTML version Paul sends out on a mailing list every update, but I had to be different and split the guide into sections, thus making my job slightly more difficult.

My first technique for doing the update was as follows:

  1. Take the HTML web guide and...
  2. Replace all the links to graphics to point to the same graphics on my server, avoiding the sticky issue of stealing bandwidth.
  3. Copy and paste each section separately into the individual pages.
  4. Process the pages into real HTML.
  5. Upload all the pages.

But this took too long, and I decided to write a program to make things easier. The technique now became:

  1. Copy and paste the HTML web guide into a file with my site format.
  2. Feed the resulting page into my Windows-based program, which produces some CSV files to be used to generate the section pages dynamically on the site.
  3. Copy and paste the compile date into the mirror index.html.
  4. Copy and paste the changes into the changes.html page.
  5. Process the pages into real HTML.
  6. Upload the three pages and CSV files.

This resulted in a process which was not actually a hell of a lot faster than the first process. Tonight I wrote a completely new Perl-based program to turn the process into the following:

  1. Upload the HTML web guide.
  2. Run the script to process it into all the separate files.

Leaving me more time to play Unreal Tournament 2004, swim the Amazon, invent cold fusion, achieve world peace, etc. Teh skillz.
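As a rough idea of what step 2 involves, here is a minimal sketch in Python (the real script is in Perl and isn't reproduced here). It assumes, purely for illustration, that every section of the guide starts with an <h2> heading, and the output file names are invented.

```python
import re
from pathlib import Path
from typing import List

# Hypothetical assumption: each section of the guide begins with an <h2> tag.
SECTION_START = re.compile(r"(?=<h2\b)", re.IGNORECASE)


def split_sections(html: str) -> List[str]:
    """Split the guide into one chunk per <h2> section, dropping the preamble."""
    chunks = SECTION_START.split(html)
    return [c.strip() for c in chunks if c.lstrip().lower().startswith("<h2")]


def write_sections(html: str, out_dir: Path) -> List[Path]:
    """Write each section to its own file (section01.html, section02.html, ...)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for number, section in enumerate(split_sections(html), start=1):
        path = out_dir / f"section{number:02d}.html"
        path.write_text(section, encoding="utf-8")
        written.append(path)
    return written
```

Calling write_sections(guide_html, Path("mirror")) on the uploaded guide would leave one file per section in a "mirror" directory, which is the whole point: upload once, run once.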

Posted at 10:27 PM | Comments (0)

September 12, 2004

TSV 26

I spent some of this weekend getting TSV 26 ready to go online. There is some pretty good material this issue, including:

In addition to a bunch of other cool stuff.

The Doctor's Dilemmas are difficult to do without them looking like just a bunch of paragraphs of text. I have bolded the first three sentences of the paragraphs in which a question is begun, but it still doesn't help much...

There is also the problem created where the article going online has been revised and no longer matches the description in the index (e.g. the Doctor Who in Advertising piece is described as containing the transcripts of two Prime Computer adverts, which is accurate for the copy which appears in the print issue, but the revised version has transcripts for all four adverts). I have opted to leave the descriptions as is, purely to keep it consistent with the rest of the index, but this leaves the description not quite matching the archived piece.

...And that completes the 1991 set of TSV issues, or Volume 5 as it is referred to on the front cover.

Posted at 7:29 PM | Comments (0)

September 10, 2004

Yahoo's search bot and your web site

A year or so ago now I moved the NZDWFC page from http://www.tetrap.com/drwho/nzdwfc/ to nzdwfc.tetrap.com. To ease the transition, I placed a permanent 301 redirection from the old URL to the new one. Anyone going to the old URL gets bounced to the new URL without having to do anything.

Yahoo uses a spider called "Yahoo Slurp" to crawl the web looking for pages to add to the search index. Slurp hits http://www.tetrap.com/drwho/nzdwfc/ and gets redirected to nzdwfc.tetrap.com like everyone else. Unfortunately Slurp has a bug in it, and adds the page to Yahoo's search index under the old URL.

Most of the NZDWFC page is indexed in Yahoo under the old URL, even pages I've added since the move (There are, in fact, only 5 pages in the Yahoo index for the nzdwfc.tetrap.com subdomain). This means that if the NZDWFC page comes up in a search and the user clicks on it, my server has to redirect them to the new page.

Last month the main domain www.tetrap.com got 4633 hits which resulted in redirects, a good number of those caused by people coming from Yahoo search results. Obviously I want to reduce this number so my web server has less work to do - the trouble is how to tell Yahoo Slurp not to index the old URL without breaking the redirection for users who surf in.

To do this I use Apache's rewrite engine like so:

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} help\.yahoo\.com [NC]
RewriteCond %{REQUEST_URI} ^/drwho/nzdwfc/.*$ [NC]
RewriteRule ^.* - [G,L]

This all goes in the .htaccess file which sits in my root directory. The lines work as follows:

  1. Turns the rewrite engine on. Kinda essential.
  2. Checks the user-agent of the bot for the string "help.yahoo.com". Slurp uses this in its user agent.
  3. Matches any file they request in the /drwho/nzdwfc/ directory or below.
  4. Tells Apache to send back a 410 response. 410 means "it's gone, matey, and it ain't coming back". Additionally the L indicates to Apache not to process any more Rewrite stuff because we're finished.

So put them together, and the server tells Slurp that the file it is requesting is gone, but lets anyone else through to hit the redirection. There are still a lot of redirections happening, but hopefully Yahoo will gradually drop the old URLs in favour of the new ones, and the redirections will decrease.
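To make the combined effect concrete, here is a small Python sketch of the same decision. It is not how Apache evaluates the rules internally, just the logic restated; the function name is made up, and the 200 for everything else is an assumption about the rest of the site.

```python
import re

# Same patterns as the two RewriteCond lines; [NC] becomes re.IGNORECASE.
YAHOO_UA = re.compile(r"help\.yahoo\.com", re.IGNORECASE)
OLD_PATH = re.compile(r"^/drwho/nzdwfc/.*$", re.IGNORECASE)


def status_for(user_agent: str, request_uri: str) -> int:
    """Return the status code the old host would answer with (sketch only)."""
    in_old_dir = OLD_PATH.search(request_uri) is not None
    if in_old_dir and YAHOO_UA.search(user_agent):
        return 410  # both conditions matched: tell Slurp the page is gone
    if in_old_dir:
        return 301  # everyone else still gets the permanent redirect
    return 200  # assumption: anything outside the old directory is served as usual
```

So a request for /drwho/nzdwfc/index.html from a user agent containing "help.yahoo.com" comes back 410, while the same request from an ordinary browser comes back 301.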

That's the theory, anyway. I'll update this weblog with the results in a few months' time, hopefully...

The Apache rewrite engine is a great and powerful thing, but also a dangerous thing.

Posted at 9:18 PM | Comments (0)

September 9, 2004

Messing with the look

I've mucked about with the template a bit to make it look more like the rest of my site. I started doing a complete template to make it look like my LiveJournal but then realised how long it would take. LiveJournal's method of creating journal styles, named S2, may be tricky to learn, but it's much easier to customise every page without duplicating a lot of effort.

I will have to fiddle with it a bit more to put in cool stuff like a blogroll and some links, and maybe install some plugins, but that should be fairly easy.

Posted at 10:13 PM | Comments (0)
