The mad ramblings of a scientist
August 2019

Saturday, August 25, 2012

Holy Spam, Batman!

I am not interested in promoting my web site, partly because it generates no income (no advertisements), but mostly because I am not into self-promotion in general. So I was happy for these articles to sit in quiet obscurity. Even the comment forms I developed for this blog were really a programming exercise.

Of course, there is no such thing as obscurity on the web for any published site. And my site is published, given that I have a registered domain name. In fact, I noticed that several web robots (some of which were fabulously obscure) trawled my web site the very first night I put it up. So, even though my web site is basically irrelevant, it is not completely unknown!

And so the spam bots have found this blog. I suppose they did many months ago, but the code I used for the comment forms is unique (I wrote it myself, after all), and so somewhat hardened against spam robots. Not hardened enough, though, since it was recently broken. This morning I deleted several hundred spam messages (fortunately I know how to edit my MySQL database by hand on the command line to delete comments en masse). For old time's sake, I left the very first spam message, which you can find a couple of articles below.
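
A bulk delete of this sort can be sketched as follows. This is only an illustration, not the actual commands used: it relies on Python's built-in sqlite3 module as a stand-in for MySQL, and the `comments` table, its columns, and the `is_spam` flag are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the blog's MySQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical comment table; a real schema would differ.
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, author TEXT, body TEXT, is_spam INTEGER)"
)
conn.executemany(
    "INSERT INTO comments (author, body, is_spam) VALUES (?, ?, ?)",
    [
        ("bot1", "buy pills", 1),
        ("alice", "Nice article!", 0),
        ("bot2", "cheap watches", 1),
        ("bot3", "click here", 1),
    ],
)

# Find the earliest spam comment so it can be kept as a souvenir.
first_spam_id = conn.execute(
    "SELECT MIN(id) FROM comments WHERE is_spam = 1"
).fetchone()[0]

# Delete every other spam comment in one statement.
cursor = conn.execute(
    "DELETE FROM comments WHERE is_spam = 1 AND id <> ?",
    (first_spam_id,),
)
conn.commit()

deleted = cursor.rowcount
remaining = [row[0] for row in conn.execute("SELECT author FROM comments ORDER BY id")]
print(deleted, remaining)
```

With real spam there is no convenient flag, so the WHERE clause would have to match on something like a date range or a pattern in the comment text.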

Time to escalate the war! Today I install a CAPTCHA scheme!

Posted at: 8:45 AM
Categories: Diary, Projects

Sunday, September 18, 2011

Cookies, yum!

I have been working on polishing up my new web site in my spare time this week, and I have been uncovering little details here and there that need work. One detail I bumped into was the look of my web error pages. The web server software I am using has built-in error pages, of course, but they are standard, boring pages, and don't fit into my web site style at all.

Web site designers often go to some trouble to personalize their error pages with their own special look. Because error web pages don't come up very often (or at least, aren't supposed to), to put a lot of effort into them is a sign of thoroughness. And being the perfectionist that I am, I had to do the same.

I started by constructing a prototype using a CSS layout that nicely matched the style of my web site. I then quickly composed some "suggestions" for users who might have mistyped a web page URL, including a suggestion that was a joke about getting a cookie (everyone likes cookies). I was quite happy with how the web page layout turned out, especially considering how quickly it was coded. But it was entirely in text, and it seemed something was missing. Maybe a picture.

I then had the inspiration to add a picture of a cookie (to play off the cookie joke in the text). So I peeked at some pictures of cookies in the public domain. Some were okay (others weren't), but none had the right composition. So I felt at a bit of a loss. Until I had another inspiration!

She Who Must Be Obeyed is an excellent baker, and a prolific photographer. So I asked her to make some chocolate-chip cookies to my specifications: just the right size and shape. And they were perfect! (And, incidentally, really, really delicious.) I then experimented with various materials to photograph cookies on, in order to reproduce the approximate shade of the background of my web pages ('#333'). A black piece of paper worked fairly well. I found that natural lighting looked far better than flash, and then set up an impromptu studio in the bedroom (with a simple tripod made from old boxes). After some trial and error (and getting just the right bite in the right cookie), I had the perfect picture.

The image I had in my mind was a chocolate-chip cookie sitting on top of the web page. To get that image right, I had to convert the background of the picture into the HTML color #333. That is where GIMP comes to the rescue. GIMP is a well-known open source image manipulation program with capabilities similar to Adobe Photoshop. As an amateur user, I needed a few tries to get a natural look, but I think it turned out pretty well in the end.

The moral of this story is that if you are looking for just the right look in your web page design, and can't find a good picture in the public domain, consider making one yourself. It can be a bit of fun, and educational. And in some cases, you can get added benefits like an entire batch of yummy chocolate-chip cookies.

Posted at: 2:53 PM
Categories: Diary, Projects

Sunday, September 11, 2011

Extracting Articles from Thingamablog
[Figure: HyperSQL browser]

I was committed to recreating my web site using Django, but I wanted to preserve all of my old Thingamablog entries. I no longer had Thingamablog on my computer, but I was pretty sure that I had all of my original user files. So I poked around a bit to see if something obvious stood out.

What I found on my computer was a directory called "database" in my "Documents" directory that was filled with a bunch of files with names that started, funnily enough, with "database". One file called "database.script" had SQL commands for creating tables like "AUTH_TABLE" and "FEED_TABLE", which definitely identified the files as associated with my blog. Another file called "database.properties" looked like a bunch of configuration parameters and began with "#HSQL Database Engine". So I checked Google for a SQL-capable database called "HSQL" and found a Java-based database engine called HyperSQL that looked like a match.

Before going any further, I copied all of the database files to another location (to keep the originals safe and sound). I then downloaded an ancient version of HyperSQL to match the version indicated in my files. After a bit more research, I figured out that the following magical invocation in a terminal window (on my Mac OS machine) brought up a database browser:

java -cp hsqldb/lib/hsqldb.jar org.hsqldb.util.DatabaseManager

(If you are using a newer version of HyperSQL, the command might be a little different—check the documentation for your version.) The first thing the browser shows when it starts is an utterly confusing dialog box. After more research, and a lot of trial and error, I figured out the following settings opened my old blog database file:

Parameter   Setting
Type        HSQL Database Engine Standalone
Driver      org.hsqldb.jdbcDriver
URL         jdbc:hsqldb:file:/xxxxxx/database/database
User        SA
Password    (blank)

where "xxxxxx" is the absolute path to the directory holding the copied database directory.

Once I used the correct set of magical parameters, the database browser presented me a list of four tables. The one of primary interest had the cryptic name "ENTRY_TABLE_1145640371555" (I assume Thingamablog associated "1145640371555" with a blog name). I then asked the browser to run a simple query:


Boom! The contents of all of my old articles appeared in the browser (see the figure above). Now all I had to do was use the menu item "Save result csv" to make a csv text file of all of my old blog articles.

Now that I had all of my old stories in text format, I only needed to write a Python script to convert and insert them into the MySQL database of my blog. But that is a subject for another day.
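
In the meantime, the general shape of such a script can be sketched as follows. This is a minimal sketch under several assumptions rather than the actual conversion script: it uses Python's built-in sqlite3 module as a stand-in for MySQL, it assumes the exported CSV file has a header row, and the column names (`TITLE`, `CONTENT`, `TIMESTAMP`) and the `blog_entry` table are hypothetical.

```python
import csv
import io
import sqlite3


def import_entries(csv_file, conn):
    """Copy rows from a CSV export into a hypothetical blog_entry table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blog_entry ("
        "id INTEGER PRIMARY KEY, title TEXT, body TEXT, posted TEXT)"
    )
    # DictReader takes its keys from the header row of the CSV file.
    reader = csv.DictReader(csv_file)
    rows = [(r["TITLE"], r["CONTENT"], r["TIMESTAMP"]) for r in reader]
    # Parameterized inserts avoid quoting problems in article text.
    conn.executemany(
        "INSERT INTO blog_entry (title, body, posted) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Tiny made-up export standing in for the real CSV file.
sample = io.StringIO(
    "TITLE,CONTENT,TIMESTAMP\n"
    "My site,First post.,2006-05-01 11:50:00\n"
)
conn = sqlite3.connect(":memory:")
count = import_entries(sample, conn)
print(count)
```

For a real export, the `io.StringIO` sample would be replaced by a file opened with `open(path, newline="")`, and the sqlite3 connection by a connection to the blog's own database.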

Posted at: 9:58 PM (Edited: September 11, 2011, 10:27 PM)
Categories: Projects

Saturday, September 10, 2011

Testing 1, 2, 3

I started this blog more than four years ago, using the web services provided by my Apple account. Because Apple's web hosting did not provide scripting of any sort, I implemented my blog using Thingamablog. I then happily blogged at various levels of activity (or inactivity) for several years.

Then I got a new job and decided my online presence wasn't so important anymore.

Recently Apple announced the end of those services (now renamed to MobileMe and soon changing to iCloud). If I wanted to keep my home page (as neglected as it was), I had to do something. I decided to properly host my web pages elsewhere (WebFaction), convert them to something more modern (Django), and set up my own domain name (slashdave). And this is where we are.

Django permits a more complete web site, including a blog that creates pages dynamically (this scales far better), allows for comments, and fully integrates multimedia. And, although I still have that nice job, I figured it was time to get back to some writing.

Posted at: 11:06 PM (Edited: September 11, 2011, 5:36 PM)
Categories: Diary, Projects

Monday, May 01, 2006

My site

I suppose it is appropriate that I spend my first free "working" day finishing up my new personal web site.

Posted at: 11:50 AM
Categories: Projects