One week left for Siteseers.Net

I’m retiring my long-time domain, siteseers.net, next week. I’ve had it since 1997, back when I used ISDN to access the Internet. While it will be sad to see it go, I don’t really use it anymore. I’ve got more than a dozen other domain names and this one is needlessly adding to the cost of annual domain renewals.

I’m going to park it at Sedo in case anyone is interested in buying it.

Returning from another break

MT.Net was down for the last couple of days due to more strangeness seen on the server. I took it down and rebuilt everything.

Should you experience any quirkiness here (outside of the stuff I post already, ha ha), let me know!

How to rewrite a hacked URL?

Hey Lazyweb,

When my WordPress site got compromised, The Google began indexing links that have a “?y%” in the middle of the URLs:

http://www.markturner.net/2007/04/page/4/?y%/you-are-what-you-grow/

This turns the second half of the URL into a query string, which complicates fixing it a bit. I’ve tried a few RewriteCond rules but haven’t figured out this voodoo well enough yet:

RewriteCond %{QUERY_STRING} y\%/(.*) [NC]
RewriteRule (.*) $1 [R=302,L]

Anyone have any pointers on how to turn the above URL into this?

http://www.markturner.net/2007/04/page/4/you-are-what-you-grow/
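
For what it's worth, the transformation itself is plain string surgery: drop the "?y%/" and glue whatever follows back onto the path. Here is a minimal Python sketch of that idea. It is not an Apache rule, just an illustration of what a working rule would have to do. The names `repair_hacked_url` and `MARKER` are made up for this example, and the sketch assumes the injected junk is always exactly "y%/" at the very start of the query string.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the hack always injects exactly this text right after the "?".
MARKER = "y%/"


def repair_hacked_url(url: str) -> str:
    """Move the text after '?y%/' back into the path; leave other URLs alone."""
    parts = urlsplit(url)
    # Case-insensitive check, mirroring the [NC] flag on the RewriteCond above.
    if not parts.query.lower().startswith(MARKER):
        return url
    tail = parts.query[len(MARKER):]
    # Make sure the path ends with a slash before appending the tail.
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    # Rebuild the URL with an empty query string and the original fragment.
    return urlunsplit((parts.scheme, parts.netloc, path + tail, "", parts.fragment))


if __name__ == "__main__":
    print(repair_hacked_url(
        "http://www.markturner.net/2007/04/page/4/?y%/you-are-what-you-grow/"
    ))
    # prints http://www.markturner.net/2007/04/page/4/you-are-what-you-grow/
```

A function like this could be used to build a list of old-to-new URL pairs for redirects. URLs without the marker pass through unchanged.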

P.S. WordPress 2.8 is now out. Time to upgrade!

Rankcrawler update

I received an email this evening from Philippe Martin at RankCrawler, apologizing for the bad bot behavior:

Dear Mark Turner,

I apologize for not properly identifying our crawler (RankCrawler) by using the user agent. Our reverse-dns go to rankcrawler.com but we don’t use our own user agent. We will fix this problem soon. We have stopped to crawl your website as soon as I read your message.

We DO NOT crawl with the IP 94.23.51.159 as you claim in your second blog post about Rancrawler. It should be another company that we don’t know and that uses the same ISP (OVH is a very large ISP). We uses at this time only 5 IP that goes to rankcrawler.com.

I apologize again for this problem and I hope you will let our crawler access your website once we properly identify our crawler with our own user agent.

Thank you for your message,

Philippe Martin
http://rancrawler.com

I’m pleased that Mr. Martin chose to respond to my complaint and as such, I will allow RankCrawler to access MT.Net once again.

Rankcrawler bot update

Sheesh. Just after I finished blocking Rankcrawler from accessing my site, I found yet another connection attempt from them – this time from a totally new IP address:

94.23.51.159 - - [31/May/2009:07:14:02 -0400] "GET /2009/05/30/conn-clusion/ HTTP/1.1" 200 5574 "http://real-url.org" "Mozilla/4.0 (compatible;MSIE 5.01; Windows -NT 5.0 - real-url.org)"
94.23.51.159 - - [31/May/2009:07:15:25 -0400] "GET /2009/05/30/conn-clusion/ HTTP/1.0" 200 5574 "-" "-"
94.23.51.159 - - [31/May/2009:07:15:25 -0400] "POST /xmlrpc.php HTTP/1.0" 200 473 "-" "XML-RPC for PHP 2.2.2"

This IP resolves to rps6637.ovh.net. OVH.Net is the same ISP that Rankcrawler uses. They just can’t take no for an answer.

[Update: 1 June 2009] Rankcrawler says this isn’t them. Duly noted.

Bad bot alert: Rankcrawler

Looks like a bot has been scouring my website without properly identifying itself. I noticed that my older posts were getting a lot of unexplained hits. I checked the logs, looked up the IPs, and discovered the visitors were bots from the rankcrawler.com domain. The bots don’t identify themselves in their user agent field, as good bots should.

Some of the bots came from these IPs (though there may be others):
87.98.249.75
87.98.133.249
91.121.26.45
94.23.152.34
94.23.153.8

As the log entries below show, Rankcrawler prefers to disguise itself as a regular browser. This is a no-no.

87.98.249.75 - - [29/May/2009:23:56:09 -0400] "GET /page/2/ HTTP/1.0" 200 34160 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"
87.98.249.75 - - [30/May/2009:00:11:16 -0400] "GET /2006/07/ HTTP/1.0" 200 41171 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"

91.121.26.45 - - [29/May/2009:20:47:22 -0400] "GET / HTTP/1.0" 200 34467 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
91.121.26.45 - - [30/May/2009:00:01:23 -0400] "GET /2008/05/ HTTP/1.0" 200 27858 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
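
The check described above (read the logs, group the hits by IP, then look up each IP) can be sketched in Python. This is only a rough illustration, not the exact process used here: the names `LOG_RE`, `agents_by_ip`, and `reverse_dns` are hypothetical, and the pattern assumes Apache's combined log format with plain ASCII quotes.

```python
import re
import socket
from collections import defaultdict

# Assumption: Apache "combined" log format, one request per line.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def agents_by_ip(lines):
    """Map each client IP to the set of user agent strings it sent."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_RE.match(line)
        if match:  # silently skip lines that do not fit the format
            seen[match.group("ip")].add(match.group("agent"))
    return dict(seen)


def reverse_dns(ip):
    """Return the reverse-DNS hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


if __name__ == "__main__":
    sample = [
        '91.121.26.45 - - [29/May/2009:20:47:22 -0400] "GET / HTTP/1.0" '
        '200 34467 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    ]
    for ip, agents in agents_by_ip(sample).items():
        # A crawler's hostname paired with a browser-style agent is the red flag.
        print(ip, reverse_dns(ip), sorted(agents))
```

The reverse lookup needs network access and only reports what the IP's owner has published, so a hostname match is a hint rather than proof of who is crawling.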

MT.Net recovers from another hack

MT.Net has been down for about 28 hours due to my WordPress installation being hacked. Fortunately, I had a copy of the database from the day before (yay, backups!). I am still not sure how it happened, as my code was all up to date, but the WordPress folks are now looking into it. I suspect an xmlrpc.php attack but do not know for sure.

Yesterday morning, my friend Scott reported that my comments links were simply refreshing the main page rather than taking him to the comments. I studied the links my WP site was now spitting out:

http://www.markturner.net/2009/05/?y%/credit-cards/#more-6422

4,000+ posts

This is the 4,008th post on MT.Net. I had hoped to mark the 4,000th post, but it came and went without my realizing it.

Thanks for reading!