Alden Bates' Weblog

Website Management Archives

April 9, 2007

Experimenting with Adsense

Out of interest, I slapped Google Adsense on here a month or so ago, to see what would happen. It didn't make enough money to counteract the deep, deep shame I felt in having adverts on my weblog, so now they're gone again.

I suspect the problem was twofold: 1) the contextual adverts weren't relevant enough, and 2) users tend to ignore most adverts these days anyway. Oh well. :)

Posted at 8:26 PM | Comments (1)

March 25, 2007

Hooray for vanity domains! (and new cameras)

This weblog is now located at www.aldenbates.com, because I can. And also because it's easier to remember than 'abates.tetrap.com' (the old subdomain redirects here, of course). Please update your links and bookmarks. :)

So this post isn't a total wash, here's a picture I took last weekend from the top of Cannon Point:
[High shot!]

In the foreground is Upper Hutt, with the Hutt River winding its way up the right-hand side. The bridge at lower right is the one I mentioned in this post last year. Beyond the hill in middle centre is Lower Hutt (It's actually quite a lot bigger than it appears in that photo. Yeah, as if we haven't heard that one before), beyond which is Wellington Harbour, on the other side of which is Wellington (yes, where Peter Jackson lives). In the far far background, you may be able to make out the very dim outline of the South Island.

This photo was taken with my shiny new Canon Powershot digital camera. :)

Posted at 7:51 PM | Comments (4)

March 19, 2007

Accidentally bad search results

Sometimes the most unexpected things happen on this site. For instance, so far this month the five most popular archived TSV items are:

  1. What's really inside a Dalek
  2. Who Killed Kennedy eBook
  3. Cyberman shot by an arrow
  4. Is that a Sonic Screwdriver in your Pocket, Doctor?
  5. The Doctor and Rose standing over a Slitheen

Why is a crummy cartoon I drew back in 1993 the most popular? Because when you do an image search on Google for "inside a dalek", the first result looks like this:

[image results]

...which looks deceptively like something the searcher would be looking for. I feel vaguely guilty now!

Posted at 11:32 PM | Comments (1)

February 20, 2007

Crawl Stats

I was looking at Google Webmaster Tools before and noticed that after years of languishing at around 50th hit for the query "Alden", I'm now at 12th position. I'm headin' for the top, baby!

I'm still confused over something on the "Crawl stats" page though: For months and months now, it's reported that over half the pages on tetrap.com have "PageRank not yet assigned" like so:
[Crawl stats image]

I can't tell if this is a glitch in the system (it's counting pages which are blocked from being indexed or something?) or simply that it's not being updated.

Posted at 8:56 PM | Comments (2)

February 17, 2007

Totally XHTML

I've just knocked a long-standing item off my todo list: Every page on tetrap.com is now validated strict XHTML. This means that the pages will all be a lot more compatible with browsers other than Internet Explorer.

The last section that needed to be converted was The DiscContinuity Guide, which was unfortunately last updated more than two years ago. The HTML code it contained was very old (and buggy in a couple of places) but is now all fixed up and shiny.

This also marks the first time in a few years that tetrap.com has had a uniform look to it (except the NZDWFC site which has a look of its own).

My good mood is only slightly spoiled by the fact that TV2 have resumed playing new episodes of Stargate SG-1 at 1pm on Saturdays and DIDN'T BOTHER TO TELL ANYONE! It appears I've only missed the first episode of season 9 (yes, NZ is two years behind) though...

Posted at 3:08 PM | Comments (5)

November 6, 2006

Implementing If-Modified-Since

The amount of bandwidth the NZDWFC site's been using has been steadily increasing recently, so I've been looking at what I can do to reduce the amount of data sent, hopefully without impacting on anyone's browsing experience. The top ten pages in terms of hits last month were:

  1. the index page
  2. the new series message board
  3. the forums index page
  4. the general message board
  5. the page of series 2 images
  6. a piece of Cyberman artwork
  7. The Traders' Corner message board
  8. the Andrew Cartmel interview
  9. the artwork from the cover of TSV 72
  10. Pr1me Computers

I suspect that the main culprit is the fifth item there, since it's basically a page of thumbnails, but aside from that seven of the pages there have something in common - they're dynamically generated. When someone hits the forum index, a script grabs the last ten posts on each message board and constructs an HTML page which is sent to the browser.

When a user visits a static page, which is stored as a .html file on the server, the web server sends a "Last-Modified" header telling their browser when the file was last changed. The next time they visit it, the browser sends an "If-Modified-Since" header to the web server to say "send me the page if it's been updated since X date/time". The web server checks against the .html file and will only send it to the browser if it has been changed. This saves a bit of bandwidth by not sending unnecessary data.

If a web page is generated dynamically by a Perl script (or a script in any other programming language, for that matter), the web server has no way of knowing whether the contents of the page have changed since the user last looked at it, so it sends it again. Support for "Last-Modified" and "If-Modified-Since" has to be implemented in the script itself. So last night I implemented it in the script which generates the forum index page.
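A minimal sketch of how a CGI script might handle this, not the actual forum index script. The function names, the placeholder page body and the timestamp standing in for "time of the newest post" are all assumptions made up for illustration; only the header names and the HTTP_IF_MODIFIED_SINCE environment variable come from HTTP and CGI themselves.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Format an epoch time as an HTTP date (always GMT).
sub http_date {
  my ($epoch) = @_;
  my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
  return sprintf("%s, %02d %s %04d %02d:%02d:%02d GMT",
                 $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900,
                 $hour, $min, $sec);
}

# Parse the usual "Thu, 01 Jan 1970 00:00:00 GMT" form back to an epoch
# time. Returns undef for a missing header or any other format.
sub parse_http_date {
  my ($text) = @_;
  return undef unless defined $text;
  return undef unless $text =~
    /^\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT$/;
  my ($mday, $mon_name, $year, $hour, $min, $sec) = ($1, $2, $3, $4, $5, $6);
  my ($mon) = grep { $MONTHS[$_] eq $mon_name } 0 .. $#MONTHS;
  return undef unless defined $mon;
  # timegm dies on impossible dates, so trap that and return undef.
  return eval { timegm($sec, $min, $hour, $mday, $mon, $year) };
}

# Build the whole CGI response: a bare 304 if the browser's copy is
# still current, otherwise the page with a Last-Modified header.
sub respond {
  my ($last_modified, $if_modified_since, $body) = @_;
  my $since = parse_http_date($if_modified_since);
  if (defined $since && $last_modified <= $since) {
    return "Status: 304 Not Modified\n\n";
  }
  return "Last-Modified: " . http_date($last_modified) . "\n"
       . "Content-type: text/html\n\n"
       . $body;
}

# Hypothetical value: in a real script this would be the time of the
# newest post on any message board.
my $newest_post = 1162789080;
print respond($newest_post, $ENV{HTTP_IF_MODIFIED_SINCE}, "<html>...</html>\n");
```

The saving comes from the 304 branch: the script still runs on every hit, but it can skip building and sending the page body.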

The problem with this, as I discovered, was that the forum index also has controls on it to expand and shrink the lists for each message board. These affect the way that the script generates the HTML page, so if the script is only checking for changes to the message boards and not changes to these controls, the controls stop being persistent between visits. I probably would have found this out last night if Xtra's broadband wasn't so crap - at one point it completely dropped my connection for about ten minutes...

So I think the answer is to use an ETag header instead. ETags work in a similar way, but you're not limited to a date/time value, so the tag can include whatever other settings affect the generated page as well. One question I haven't been able to find an answer to: the If-None-Match header which a browser sends can contain more than one entity-tag value, so how does the browser know when an entity-tag value is no longer valid? The RFC doesn't make it clear what the client should do. Does that mean browsers could eventually be sending hundreds of entity-tag values?
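A rough sketch of the ETag idea, under some loud assumptions: the setting names (general, traders), the timestamp and the choice of an MD5 hash are invented for illustration and are not taken from the real forum script. On the server side, at least, the multiple-value case is simple: the script only has to check whether its current tag appears anywhere in the list the browser sends.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build an entity tag from everything that affects the generated page:
# the newest post time plus the expand/shrink setting of each board.
sub make_etag {
  my ($newest_post, %settings) = @_;
  my @parts = map { "$_=$settings{$_}" } sort keys %settings;
  return '"' . md5_hex(join(";", $newest_post, @parts)) . '"';
}

# True if any entity tag in the If-None-Match header equals ours.
sub etag_matches {
  my ($if_none_match, $etag) = @_;
  return 0 unless defined $if_none_match;
  for my $candidate (split /,/, $if_none_match) {
    $candidate =~ s/^\s+//;
    $candidate =~ s/\s+$//;
    return 1 if $candidate eq '*';   # "*" matches any current version
    $candidate =~ s{^W/}{};          # ignore the weak-validator prefix
    return 1 if $candidate eq $etag;
  }
  return 0;
}

# Build the whole CGI response as a string.
sub respond {
  my ($etag, $if_none_match, $body) = @_;
  if (etag_matches($if_none_match, $etag)) {
    return "Status: 304 Not Modified\nETag: $etag\n\n";
  }
  return "ETag: $etag\nContent-type: text/html\n\n$body";
}

# Hypothetical inputs: newest post time and the current control settings.
my $etag = make_etag(1162789080, general => "expanded", traders => "collapsed");
print respond($etag, $ENV{HTTP_IF_NONE_MATCH}, "<html>...</html>\n");
```

Because the control settings are part of the hashed string, changing a control changes the tag, so the browser's cached copy stops matching and a fresh page is sent.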

Posted at 5:58 PM | Comments (0)

September 16, 2006

The Trials of Shifting Webhost

Last week I took part in the shifting of A Teaspoon and an Open Mind from one web host to HostForWeb. I struck only two major problems.

1/ The backup formats for MySQL databases which HostForWeb and the old web host used were completely different. The old host generated a gzipped tar of FRM, MYD, and MYI files, while HostForWeb expects a file in some format I didn't recognise. Fortunately I've struck a similar problem before, so I generated a file full of SQL queries and away I went.

2/ I discovered that if you tar and gz a directory structure in Windows XP, then decompress it again on Linux, all the directories are created without the executable attribute (on UNIX systems, directories must be set executable so you can use them, while Windows doesn't even have an executable attribute). I wrote a Perl script to run through all the directories and set the executable bit.
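A sketch of what such a script could look like, not the one actually used for the move. The function names are made up, and the rule for choosing which bits to set (add execute wherever read is already set, so 0644 becomes 0755) is an assumption about what is wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Add the execute (search) bit for every class that can already read
# the directory. Returns 1 if the mode was changed, 0 otherwise.
sub add_search_bits {
  my ($path) = @_;
  my @st = stat($path) or return 0;
  my $mode   = $st[2] & 07777;
  my $wanted = $mode | (($mode & 0444) >> 2);
  return 0 if $wanted == $mode;
  chmod($wanted, $path) or warn "chmod $path: $!\n";
  return 1;
}

# Walk the tree under $root and fix every directory found.
# Returns the number of directories whose mode was changed.
sub fix_directories {
  my ($root) = @_;
  # The top directory must be usable before find() can enter it.
  my $changed = add_search_bits($root);
  find(sub {
    # find() calls this for a directory before descending into it,
    # so each level is fixed just in time. Symlinks are left alone.
    return unless -d $_ && !-l $_;
    $changed += add_search_bits($_);
  }, $root);
  return $changed;
}

# Usage: perl fixdirs.pl /path/to/unpacked/site
if (@ARGV) {
  my $count = fix_directories($ARGV[0]);
  print "Fixed $count directories\n";
}
```

Files are deliberately left untouched, since only directories need the bit in order to be entered.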

Hopefully we won't need to do that again soon.

Posted at 11:11 AM | Comments (2)

July 28, 2006

Frequently Hotlinked Images

I've ranted on about hotlinking images on my site before. I find it rude because they're using my bandwidth to decorate their own site with pretty images. Although I've put some measure of protection on, sometimes people will try it and not bother to remove the link when it doesn't work. I thought it might be useful to list the six most hotlinked images:

  1. The V for Vendetta comic image from the Comic Connection article from TSV 26, for obvious reasons.
  2. A convention photo of Tobey Maguire, predating the release of Spider-Man.
  3. The Greg the Bunny icons, particularly the ones of Tardy and Count Blah.
  4. Turdy, the alien from The Outer Limits. I'm still not sure what Turdy's appeal is. Maybe it's his smile.
  5. Thing, from The Tomorrow People.
  6. An old Dead Can Dance image. Just a picture of some arched windows which I no longer use on my music site - the one currently gracing the DCD page is a DVD screenshot.

I believe the most effective way to prevent undesired image hotlinking is to prevent Google Image search from indexing the pictures on a site; the vast majority of hotlinkers find the images via Google image search.

As a side note, myspace recently added a mechanism for reporting offensive images - there's a link at the bottom of each profile page - so it may be that the most effective way to deal with myspace hotlinkers is to redirect the image to a copy of tubgirl or similar, then report it. I do notice that hotlinking from myspace appears to have reduced drastically on my site since they've introduced this.

Posted at 9:23 PM | Comments (0)

July 1, 2006

More on "Even better hotlink protection"

I was asked to share the .htaccess and Perl code I used to achieve my new hotlink protection method, so, first of all, from my .htaccess file for tetrap.com:

ErrorDocument 403 /cgi-bin/err403.cgi

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} .*jpg$|.*gif$|.*png$ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !tetrap\.com [NC]
RewriteRule (.*) - [F,L]

The first line sets the Perl script I'm using as my error 403 document, so whenever anyone gets an error 403, that script is executed and the output sent to their browser. The next line starts processing with mod_rewrite. Line 3 matches if the request is for a filename corresponding to an image file - if your images are named differently, you should change this line to suit. The next line will halt if there is no referrer present in their request, because many people have referrer reporting turned off. Line 5 halts if the referrer contains the text tetrap.com. Should all the tests succeed (the user is requesting an image, and the referrer is set to another site) they will get a 403 error and the script will execute.
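To make the logic of those three conditions easier to follow (and to test), here is the same decision written as a small Perl function. This is only a model of the rules for illustration: the function name and the example URLs are made up, and Apache evaluates the real rules itself without running any Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns 1 if a request would be refused by the rules above, else 0.
sub is_hotlink {
  my ($path, $referer, $own_domain) = @_;
  # Condition 1: only image requests are checked (case-insensitive).
  return 0 unless $path =~ /jpg$|gif$|png$/i;
  # Condition 2: an empty or missing referrer is always allowed.
  return 0 if !defined $referer || $referer eq '';
  # Condition 3: a referrer containing our own domain is allowed.
  return 0 if $referer =~ /\Q$own_domain\E/i;
  return 1;
}

# Hypothetical requests, printed as "referrer => blocked/allowed".
for my $referer ("", "http://www.tetrap.com/page.html", "http://example.com/forum") {
  my $verdict = is_hotlink("/images/dalek.jpg", $referer, "tetrap.com")
              ? "blocked" : "allowed";
  print "'$referer' => $verdict\n";
}
```

Note that the third test is a plain "contains" check, so any referrer with the domain text somewhere in its URL gets through.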

And now the Perl script:

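#!/usr/bin/perl
# Error 403 script by Alden Bates (www.tetrap.com)
use strict;
use warnings;

# Work out which path the user originally asked for. Apache's error
# redirect supplies it in REDIRECT_URL; if that only names this script,
# use REQUEST_URI instead.
my $theurl = defined $ENV{REDIRECT_URL} ? $ENV{REDIRECT_URL} : "";
if ($theurl eq "/cgi-bin/err403.cgi") {
  $theurl = defined $ENV{REQUEST_URI} ? $ENV{REQUEST_URI} : "";
}

# Case-insensitive, to match the [NC] flag in the .htaccess rules.
if ($theurl =~ /jpg$|gif$|png$/i) {
  # Image request: send the error graphic as raw binary data.
  open(GFX, "<", "error403.gif") or die "Can't open error403.gif: $!";
  binmode(GFX);
  binmode(STDOUT);
  print "Content-type: image/gif\n\n";
  local $/;    # read the whole file in one go
  print <GFX>;
  close(GFX);
} else {
  # Anything else: send the HTML error page line by line.
  open(HTML, "<", "error403.html") or die "Can't open error403.html: $!";
  print "Content-type: text/html\n\n";
  while (<HTML>) {
    print;
  }
  close(HTML);
}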

Here, the first clump of code fetches the path to the file that the user was trying to load. The rest of the code looks at the path to see if it is an image. If so, the script opens error403.gif and sends it to the user. If not, it opens error403.html (which is an error page) and sends that to the user. Note that, because the script is sending the file directly, any server-side includes or code will not be executed, so this would not be suitable for, say, a PHP script.

So that's basically it!

Posted at 10:26 AM | Comments (1)

June 22, 2006

10th Birthday

Not only is it TSV's 19th birthday this month, but according to my hosting history post it's also (roughly!) the 10th birthday of this web site. In net terms, that's positively ancient. I hate to think how many hours of time I've devoted to this beast...

Suddenly I feel very old. :)

Here, have a piece of Dalek cake:
[CAKE!]

Posted at 8:26 PM | Comments (2)

May 25, 2006

Even better hotlink protection

I haven't been happy with previous methods I've been using for hotlink prevention, because usually they result in a broken graphic on the other site which, depending on the browser, may not be visible.

Method 1: If a user hits a graphic with a referrer from another site, they get a 403 error and an HTML error page. Drawback: this results in a broken image on the other page.

Method 2: If a user hits a graphic with a referrer from another site, they get an HTML page which includes the actual graphic in an <img> tag, and a "hosted by tetrap.com" message. Drawbacks: results in a broken image on the other page, and the hits are recorded as traffic in my server statistics.

So I decided to try a new method: If a user hits a graphic with a referrer from another site, they get a 100x100 black and white image which looks like this*:
[Error 403 graphic]

That allows them to see instantly what the problem is instead of giving them a broken graphic with no indication as to why, and it still registers as a 403 error in my server statistics. I've achieved this by using a Perl script for my error 403 page. It detects whether the user is trying to load a web page (in which case it gives them an HTML error page) or a graphic (in which case it gives them the graphic shown above). I think it's nifty. :)

* Except myspace.com users, who still get tubgirl. Bwahaha.

Posted at 9:26 PM | Comments (2)

May 8, 2006

Optimising mod_rewrite Part 1

Mod_rewrite is a great tool in Apache for doing fancy stuff with URLs and redirections and blocking spammers. That said, while using it you should remember that the conditions and rules you add are checked by Apache on every single hit, be it for an HTML file, an image, or a style-sheet. So, here are some suggestions for reducing the amount of work Apache does for each hit:

Continue reading "Optimising mod_rewrite Part 1"

Posted at 11:54 PM | Comments (0)

April 20, 2006

Strange FireFox prefetch problem

I've noticed a couple of times an odd bug occurring when navigating forward through the issue pages on the NZDWFC site. Because they include a "next" metalink, FireFox will always prefetch the next issue's index page. This means if you click on the link to the next issue, FireFox already has the page and just needs to download the images.

However, a couple of times this has somehow gone wrong. When I've clicked on the link to the next issue, nothing has happened. This seems to be a relatively rare occurrence (it's only happened to me twice, as I say) but I'm mystified as to what's happening.

Because I have the access logs, I can see roughly what's happening but it doesn't help:

  1. after loading the issue's index page, in this case TSV 20, FireFox loads the index page for TSV 21. This appears from the server end to have gone as normal.
  2. FireFox re-requests the page and gets back a status of 206 (partial content) indicating that FireFox requested a small piece of the file. Total transferred: 15490 bytes (the TSV 21 index page is only 4978 bytes in size).
  3. Whenever I click on the link or try to load the page from other links, FireFox repeats step 2 and doesn't load the page.

The only way to fix this appears to be to clear the cache. I presume the cached page has somehow become corrupted, but it's hard to tell what's happening.

Posted at 11:17 PM | Comments (0)

March 28, 2006

V for Remote Linking

Sometimes events can conspire to give your web site a sudden spike in traffic:

Back in TSV 26, an article called The Comic Connections was published. Amongst other things, this article made mention of the comic book V for Vendetta. The article, including a comic frame from aforementioned comic book, is currently archived on the NZDWFC site.

Released this month is the movie version of V for Vendetta. I haven't seen it yet, but it looks pretty good from the previews. Unsurprisingly, a lot of people are doing web searches for it, and the comic frame from The Comic Connections happens to lie on the second page of Google image search results.

Last month, the archived version of the article got a total of around 260 hits. So far this month, the total is around 2800.

Of course, there's a price to fame: lots of people (mostly on forums) have remote-linked to the comic book frame or, as I've now redirected it to, tubgirl.jpg. Is there no end to the amusement I can get from that jpg?

Posted at 9:30 PM | Comments (0)

March 22, 2006

Yahoo!'s Excellent Error Reporting Mechanism

While reading my Yahoo! Mail, I got this error:

There was a problem accessing your account.

We recommend clicking the "Refresh" or "Reload" button on your browser's toolbar. If this doesn't correct the problem, re-login to your account. If the problem persists, click here to send us a report.

We sincerely apologise for the inconvenience and thank you for your patience.

So I clicked on "click here" (great link text there, Yahoo!) and got "Sorry, the page you requested was not found."

Brilliant!

Posted at 12:09 AM | Comments (0)
