Archive for July, 2010

Be Careful What You Index

Thursday, July 22nd, 2010

In an attempt to speed up parts of Amethyst, I have been adding indexes for every column that is searched on.  For example, to find old, unused word/word-pairs, the following SQL statement is used.

DELETE FROM tokens WHERE 162129586585337856 <= fnv AND fnv < 166633186212708352 AND updated_at<'2010-06-22 13:43:06' AND occurrences=0

(fnv is a hash of the word(s) and the range is used to spread the deletes out over a week.) In a misguided attempt to speed it up, I added indexes on updated_at and occurrences. The result was that adding a single feed could take as much as 15 minutes.
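As a rough sketch of how such a range could be produced (not necessarily how Amethyst does it): the two bounds in the statement above are exactly 36 × 2^52 and 37 × 2^52, which is consistent with cutting the hash space into equal slices 2^52 wide and deleting from one slice per run. If fnv is an unsigned 64-bit value, that would give 4096 slices. The function names and the one-slice-per-run idea are assumptions for illustration only.

```python
SLICE_BITS = 52  # assumption: each slice is 2**52 wide, matching the range above


def fnv_slice(index, slice_bits=SLICE_BITS):
    """Return the half-open range [low, high) covered by slice number `index`."""
    width = 1 << slice_bits
    low = index * width
    return low, low + width


def delete_statement(index, cutoff):
    """Build a parameterized DELETE for one slice (hypothetical helper)."""
    low, high = fnv_slice(index)
    sql = (
        "DELETE FROM tokens WHERE %s <= fnv AND fnv < %s "
        "AND updated_at < %s AND occurrences = 0"
    )
    return sql, (low, high, cutoff)


low, high = fnv_slice(36)
print(low, high)  # 162129586585337856 166633186212708352
```

Stepping the slice number on each run walks through the whole table in small pieces, so no single delete has to touch all 16 million rows at once.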

Digging into the MySQL docs, asking questions in the Ruby on Rails forums, and using MySQL’s EXPLAIN SELECT ... command revealed that generally only one index is used in a query (fnv in this case). For Amethyst, all actions on this table used the same index and the other indexes were just slowing down writes to the table. I’ve read it several times: indexes can speed up reads, but they slow down writes. Deleting all the indexes on the tokens table and adding back the one index that is actually used as the primary key cut the add time for the feed mentioned above from 15 minutes to 2 minutes.
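The same kind of check can be tried on a scratch database before touching production. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, so the planner may well pick a different index than MySQL did. The column types and the index names are guesses, not Amethyst's real schema. It prints the query plan for the SELECT form of the statement and lists the secondary indexes that every write to the table has to maintain.

```python
import sqlite3

# In-memory scratch database; SQLite stands in for MySQL here (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tokens (
        fnv INTEGER PRIMARY KEY,
        updated_at TEXT NOT NULL,
        occurrences INTEGER NOT NULL
    );
    -- Hypothetical names for the two extra indexes described in the post.
    CREATE INDEX idx_tokens_updated_at ON tokens (updated_at);
    CREATE INDEX idx_tokens_occurrences ON tokens (occurrences);
    """
)

query = (
    "SELECT fnv FROM tokens WHERE ? <= fnv AND fnv < ? "
    "AND updated_at < ? AND occurrences = 0"
)
params = (162129586585337856, 166633186212708352, "2010-06-22 13:43:06")

# The last column of each plan row is the human-readable description,
# including which index (if any) the planner chose.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
for row in plan:
    print(row[-1])

# Every index listed here must be updated on each INSERT, UPDATE, or DELETE.
index_names = [row[1] for row in conn.execute("PRAGMA index_list(tokens)")]
print(sorted(index_names))
```

Any index that never shows up in the plans for the queries you actually run is a candidate for removal, since it adds write cost without helping reads.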

Before you rush off and do something similar, check it out on your development machine. The tokens table has 16 million rows. The delete and add index changes took 6 hours. 6 hours where the table was in read-only mode and the application essentially unusable.

Setting Goals with a Deadline and Exit Criteria

Monday, July 19th, 2010

I’ve been looking for a todo list manager for my Android smartphone to replace Life Balance on an aging Handspring Visor (PalmOS).  There are lots of candidates, many claiming to implement David Allen’s Getting Things Done (GTD) ranging in cost from free on up.  Working with them has pointed out that I’ve become sloppy and let my projects have no end point.

So I’m adding goal dates (probably too aggressive) and clear-cut success criteria (e.g., at least one additional user and referrals from search engines).

Projects need to be much smaller than I’m used to. (Life Balance handles sub-projects to arbitrary depth, leading to top-level projects that span years with no end in sight.)

The longest revised project is due in 6 months and has an unambiguous success criterion: at least one paying customer.  Most are in the 1-3 month range.  This keeps my working time much better focused.

From Minimum Viable Product to Landing Pages

Thursday, July 8th, 2010

This morning I read Ash Maurya’s blog post, From Minimum Viable Product to Landing Pages. Really nice, detailed look at his experience bootstrapping several products, pivoting an interesting but not economically viable product, and experimenting with and polishing the landing page and marketing. Much of it matches what I am hearing from other sources and none of it is contradicted by my own experience.

But if he’s correct, Amethyst is quite a way from being ready to attract significant numbers of users, paying or otherwise.  I need to spend both time and some cash on user testing the appeal.  The example he talks about generalizes well to apps like mine and he lists several useful and affordable resources.

Quantitative Models and Change

Saturday, July 3rd, 2010

For a time such as this. This is a long post and only partially about Amethyst.  It pulls together threads from several areas of my life so I need to spend some words on setting the context.

Amethyst uses a quantitative model to measure how close a particular post is to posts a user has read in the past (along with up and down votes).  If the model were really good, users would see more and more of a smaller and smaller part of the Web.  The up side presumably is that they see the posts that have the most value.  That’s what Amethyst promises.  The down side is that they would only see posts that agree with what they have already read.  Amethyst isn’t there yet.

An obvious way around this is to glance frequently at the latest or the bottom scoring posts.  Of course, those pages only contain posts from the feeds you have added, but they give a wider view than the top scoring posts.  And you might just throw in some blogs or news sites with opposing views.

Many quantitative models are like this: they winnow through a mountain of possibilities to find more like a particular example or examples, i.e., more of the same.  When the world isn’t changing much, this works fine.  I own several mutual funds that use quantitative models to winnow through thousands of stocks to find more like some model of a good stock.  They own hundreds of stocks and have delivered good performance through the good times and through the recovery.  But times are changing and their performance is dropping.  One is up 22% for the last 12 months and down 9% for the last 3 months.  I think it is time to change.

Many economic prediction models (e.g., the US Bureau of Labor Statistics jobs birth/death model) behave fine through economic booms and busts, and badly at the changes from one to the other.  As the fine print at the bottom of every investment brochure or newsletter says, “Past performance is not indicative of future results.”  Nothing goes up or down forever.  Most are “mean reverting,” meaning they swing back and forth around an average.  Often a big swing in one direction will be followed by a big swing in the other direction.

A species too specialized for a specific ecological niche is vulnerable to changes in that niche.  In economics, unused capacity is not good.  In life, you had better keep some reserves to deal with emergencies.

So maybe I don’t want Amethyst to become too good at showing me more of the same.  It isn’t there yet, but I am seeing a pretty narrow view of things in the top scoring page.