06 September 2008

Boggle Solver

I've been working on a pet project with App Engine to try and get a better feel for it. It’s a Boggle puzzle solver that does some AJAXy tricks to multithread the solving work. The source code is online as well for anyone that’s curious. (There is a popular knockoff called “Scramble” on Facebook that is either different enough to keep Hasbro from filing a lawsuit or Hasbro’s lawyers are waiting around for them to make some money before bothering.)

In the case of App Engine it's particularly sensitive to requests that take “too long” to process. This was a particular hassle when I was importing the dictionary that is used. In order to solve a Boggle puzzle fairly quickly you want to have all the possible words arranged in a trie. This way you can stop quickly if you’re following letters that will never spell anything as you traverse the puzzle board. It took splitting the dictionary into 5000 separate pieces to get it to load without pushing me over my quota for “long” requests. Luckily we only have to do that once.

Next came the challenge of the puzzle solving itself. Again, solving the whole board in one request takes a while to process, apparently more than is allowed without running into that “long” request quota. Even in an optimized form (see the links to Dan Vanderkam’s work at the end of this entry), the full dictionary trie is 3MB and that takes a non-trivial amount of time to load in when you’re trying to handle requests within a few hundred milliseconds.

The solution was to reload the dictionary again but this time to break it up by initial trigrams. For every initial three letters, I store the appropriate shard of the dictionary (all the words that start with that combination) as a trie. There is also a blacklist of trigrams that form no words (for instance “frw”).

The javascript calls in with a copy of the puzzle and a point on the board to solve from. The server code then finds all the trigrams starting from the specified point and loads the appropriate dictionary shards. Since we're only solving from one point on the board there won't be more than 72 shards to load for any javascript call. (9 directions from the point and then 8 directions from each of those points because we're not allowed to backtrack.) The server then traverses the board using the dictionary tries hunting for words. When it finds them it stores the word and the places on the board where the word was found.

This information is all reduced into JSON and returned back to the browser that made the javascript call. The javascript on the browser is then responsible for taking all the found words and locations and sorting them in a sane way and displaying them for the end user.

Dan Vanderkam has written some interesting code and blog posts about optimizations in solving Boggle puzzles.

No comments: