A blog by Gary Bernhardt, Creator & Destroyer of Software

Zero to Slashdot in Three Days

06 Mar 2007

The Genesis

A few days before PyCon, Brian suggested that we build a web app in one night. It took a little longer than that to polish it up, but we launched sucks-rocks.com on Tuesday. Since then, it's had over 40,000 page views and been slashdotted (OK... it was the Japanese Slashdot, but it's still a Slashdot).

Sucks/rocks rates the terms you enter by doing web searches and counting results. For example, if you search for "Windows sucks" using Google, you'll get many more results than for "Windows rocks". The opposite is true for FreeBSD. From this, we can infer that people probably like FreeBSD more than Windows. The actual searches that are done by sucks/rocks are more complex than this, but they follow a similar pattern.
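The basic idea can be sketched in a few lines of Python. This is only an illustration of the simple "sucks" versus "rocks" comparison described above, not the site's actual code: the real queries are more complex, and `rate_term`, `count_results`, `fake_count`, and the hit counts in `FAKE_COUNTS` are all made up for the example.

```python
from typing import Callable, Optional

def rate_term(term: str, count_results: Callable[[str], int]) -> Optional[float]:
    """Return the share of hits that are positive (0.0 to 1.0), or None if there are no hits.

    count_results is a hypothetical stand-in for a search API call that
    returns the number of results reported for a query.
    """
    sucks = count_results(f'"{term} sucks"')
    rocks = count_results(f'"{term} rocks"')
    total = sucks + rocks
    if total == 0:
        return None  # nothing to compare, so no rating can be given
    return rocks / total

# Made-up hit counts, used only so the sketch runs without a network.
FAKE_COUNTS = {
    '"Windows sucks"': 900,
    '"Windows rocks"': 100,
    '"FreeBSD sucks"': 20,
    '"FreeBSD rocks"': 80,
}

def fake_count(query: str) -> int:
    """Look up a query in the made-up table; unknown queries get 0 hits."""
    return FAKE_COUNTS.get(query, 0)

if __name__ == "__main__":
    for term in ("Windows", "FreeBSD", "lord of the rings"):
        print(term, rate_term(term, fake_count))
```

A term with no hits at all gets `None` rather than a score, since a ratio over zero results would be meaningless.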

The Search Engine Arms Race

Once we started getting a lot of traffic, it was very hard for us to keep sucks/rocks going because we kept running out of searches. Here are the search APIs we used, in the order that we added them:

Search Engine   Queries/Day   Interface   Suckiness of Results
Google          1,000         SOAP        Low
Yahoo           5,000         REST        Pretty low
live.com        10,000        SOAP        IMMEASURABLY HIGH!

We started with Google, but ran out of queries before we even launched. We then used Yahoo, but ran out when 100shiki.com linked to us, forcing me to add support for live.com. Unfortunately, live.com's search results are terrible. Terrible! If you search sucks/rocks for "lord of the rings", you'll get a "?" back. This means that the engine whose results are cached (which is live.com, of course) reported that there were 0 "total results available". Great.
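The fallback pattern described above (use the best engine until its daily quota runs out, then move to the next, and cache whatever comes back) might look roughly like this. It is a sketch under assumptions, not the site's code: `Engine`, `QuotaExceeded`, and `cached_count` are hypothetical names, the search functions are stand-ins for the real SOAP and REST calls, and resetting the quota each day is left out.

```python
from typing import Callable, Dict, List, Tuple

class QuotaExceeded(Exception):
    """Raised when an engine (or every engine) has no queries left today."""

class Engine:
    """One search API with a daily query limit (hypothetical wrapper)."""

    def __init__(self, name: str, daily_quota: int, search_fn: Callable[[str], int]):
        self.name = name
        self.daily_quota = daily_quota
        self.search_fn = search_fn  # stand-in for the real API call
        self.used = 0               # queries spent today; daily reset omitted

    def count(self, query: str) -> int:
        if self.used >= self.daily_quota:
            raise QuotaExceeded(self.name)
        self.used += 1
        return self.search_fn(query)

# Cache maps a query to (name of the engine that answered, result count).
Cache = Dict[str, Tuple[str, int]]

def cached_count(query: str, engines: List[Engine], cache: Cache) -> Tuple[str, int]:
    """Answer from the cache if possible, else from the first engine with quota left."""
    if query in cache:
        return cache[query]
    for engine in engines:  # engines are listed in order of preference
        try:
            hits = engine.count(query)
        except QuotaExceeded:
            continue  # this engine is spent; try the next one
        cache[query] = (engine.name, hits)
        return cache[query]
    raise QuotaExceeded("all engines")
```

Storing the engine name next to each count is a design choice of the sketch: it makes it possible to tell later which cached entries came from an engine whose results are not trusted.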

Now we have a cache of almost 60,000 searches, most of which are from live.com. Many of those are totally wrong, of course. My next task is to add a background thread that slowly replaces all of the cached live.com results with Yahoo results.
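A background refresher along those lines could be sketched as follows. This is an assumption about how such a thread might be structured, not the planned implementation: it assumes the cache is a dict mapping each query to an `(engine_name, count)` pair, and `refresh_cache`, `start_refresher`, and `better_count` are hypothetical names. Error handling for an exhausted quota is left out.

```python
import threading
import time
from typing import Callable, Dict, Tuple

Cache = Dict[str, Tuple[str, int]]  # query -> (engine name, result count)

def refresh_cache(cache: Cache, lock: threading.Lock,
                  better_count: Callable[[str], int],
                  bad_engine: str = "live.com", good_engine: str = "yahoo",
                  delay: float = 1.0) -> None:
    """Re-run every query cached from bad_engine and store the new count."""
    # Snapshot the stale keys under the lock so the dict is never
    # iterated while request handlers are writing to it.
    with lock:
        stale = [q for q, (engine, _) in cache.items() if engine == bad_engine]
    for query in stale:
        count = better_count(query)  # stand-in for the real search API call
        with lock:
            cache[query] = (good_engine, count)
        time.sleep(delay)  # pace the requests to stay inside the daily quota

def start_refresher(cache: Cache, lock: threading.Lock,
                    better_count: Callable[[str], int],
                    delay: float = 1.0) -> threading.Thread:
    """Start refresh_cache in a daemon thread and return the thread."""
    worker = threading.Thread(
        target=refresh_cache,
        args=(cache, lock, better_count),
        kwargs={"delay": delay},
        daemon=True,  # do not keep the server process alive on shutdown
    )
    worker.start()
    return worker
```

The `delay` is what makes the replacement slow. As a rough guide, spreading 5,000 queries evenly over a day works out to one query about every 17 seconds, and the refresher would need to go slower than that to leave quota for live traffic.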

The Code

Sucks/rocks runs on top of web.py, but only uses it for URL dispatching. Paste does the HTTP serving, with WebFaction's Apache instance on the front end (disclaimer: the WebFaction link is an affiliate link). This simple setup handled about a million HTTP requests in four days, using less than 5% of the CPU almost all of the time (except when it was at the top of slashdot.jp).

Traffic Statistics

Easy Come, Easy Go

With our slashdotting over, we've gone from 10,000+ pageviews per day to about 1,000. Slashdot giveth, and Slashdot taketh away. That's OK, because I need some time to push all of the crappy live.com results out of the cache anyway.

(Brian has also posted about sucks/rocks: 1, 2.)